Is there a way to eliminate the unnecessary time delay in the Calamari-OCR engine?

I have been using the Calamari-OCR engine (Calamari OCR repo). What I have noticed is that it prints a lot of messages when making a prediction, and although it reports that the prediction itself took about 1 second, it takes me roughly 5 seconds to actually get that prediction. In other words, if I time the predict command, it takes about 5 seconds even though the tool claims the model only spent 1 second predicting.

My investigation led me to this script in the repo, which performs the prediction, and I narrowed the responsibility down to the MultiPredictor class (line 92). I then looked up where that class is defined and found its definition here. However, I could not find what triggers the flood of messages and warnings, which seem to be what is causing the delay for me.

Here is the code snippet I use to run and time the prediction:

import subprocess
import time

cmd = ['calamari-predict', '--checkpoint', '/opt/working/base_model/model_Calamari_1.ckpt.json', '--files', '/opt/working/data/*.png']
start_time = time.time()
subprocess.run(cmd)
print(time.time()-start_time)
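
Incidentally, most of the console noise comes from TensorFlow itself rather than from Calamari. A minimal variation of the timing snippet (my own sketch; TF_CPP_MIN_LOG_LEVEL is a standard TensorFlow environment variable) silences those messages, although I assume it only hides them and does not remove the underlying startup cost:

import os
import subprocess
import time

# Run calamari-predict in an environment where TensorFlow's C++ startup chatter
# is suppressed. TF_CPP_MIN_LOG_LEVEL='3' hides INFO/WARNING/ERROR messages; as
# far as I can tell it does not change how long the checkpoint/graph takes to load.
env = dict(os.environ, TF_CPP_MIN_LOG_LEVEL='3')

cmd = ['calamari-predict', '--checkpoint', '/opt/working/base_model/model_Calamari_1.ckpt.json', '--files', '/opt/working/data/*.png']
start_time = time.time()
subprocess.run(cmd, env=env)
print(time.time() - start_time)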

Note that I have installed the calamari package and the dependencies required according to the repo's instructions.

This is the output produced when I run the prediction:

Found 1 files in the dataset
Checkpoint version 2 is up-to-date.
2020-09-02 14:50:49.243030: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-09-02 14:50:49.243090: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 14:50:50.390703: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-09-02 14:50:50.390770: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 14:50:50.390814: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (6d0f92664ca6): /proc/driver/nvidia/version does not exist
2020-09-02 14:50:50.391069: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-02 14:50:50.396503: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1896005000 Hz
2020-09-02 14:50:50.396883: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ca0736dbb0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 14:50:50.396937: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_data (InputLayer)         [(None, None, 48, 1) 0                                            
__________________________________________________________________________________________________
conv2d_0 (Conv2D)               (None, None, 48, 40) 400         input_data[0][0]                 
__________________________________________________________________________________________________
pool2d_1 (MaxPooling2D)         (None, None, 24, 40) 0           conv2d_0[0][0]                   
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, None, 24, 60) 21660       pool2d_1[0][0]                   
__________________________________________________________________________________________________
pool2d_3 (MaxPooling2D)         (None, None, 12, 60) 0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
reshape (Reshape)               (None, None, 720)    0           pool2d_3[0][0]                   
__________________________________________________________________________________________________
bidirectional (Bidirectional)   (None, None, 400)    1473600     reshape[0][0]                    
__________________________________________________________________________________________________
input_sequence_length (InputLay [(None, 1)]          0                                            
__________________________________________________________________________________________________
dropout (Dropout)               (None, None, 400)    0           bidirectional[0][0]              
__________________________________________________________________________________________________
tf_op_layer_FloorDiv (TensorFlo [(None, 1)]          0           input_sequence_length[0][0]      
__________________________________________________________________________________________________
logits (Dense)                  (None, None, 29)     11629       dropout[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_FloorDiv_1 (TensorF [(None, 1)]          0           tf_op_layer_FloorDiv[0][0]       
__________________________________________________________________________________________________
softmax (Softmax)               (None, None, 29)     0           logits[0][0]                     
__________________________________________________________________________________________________
input_data_params (InputLayer)  [(None, 1)]          0                                            
__________________________________________________________________________________________________
tf_op_layer_Cast (TensorFlowOpL [(None, 1)]          0           tf_op_layer_FloorDiv_1[0][0]     
==================================================================================================
Total params: 1,507,289
Trainable params: 1,507,289
Non-trainable params: 0
__________________________________________________________________________________________________
None
Prediction:   0%|                                                                                       | 0/1 [00:00<?,?it/s]
barcode_0: '‪CLK0SU0M((141707‬'
Prediction: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,1.52s/it]
Prediction of 1 models took 1.5640497207641602s
Average sentence confidence: 85.41%
All files written
4.881764888763428

You can see that the tool reports the prediction took only 1.5640497207641602 seconds, yet the timing at the bottom shows the whole run actually took 4.881764888763428 seconds.
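
I suspect part of that gap is simply interpreter startup plus importing TensorFlow before any prediction happens. A quick, stand-alone check along those lines (my own sketch, not part of Calamari) would be:

import time

start = time.time()
import tensorflow as tf  # deliberately imported inside the timed region

print("tensorflow {} imported in {:.2f}s".format(tf.__version__, time.time() - start))

Whatever that prints is time that the reported "prediction took 1.5s" figure never accounts for.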

To rule out the possibility that running the prediction as a subprocess was the culprit behind the delay, I refactored the prediction code and ran it directly:

from bidi.algorithm import get_base_level

from google.protobuf.json_format import MessageToJson

from calamari_ocr import __version__
from calamari_ocr.utils.glob import glob_all
from calamari_ocr.ocr.datasets import DataSetType, create_dataset, DataSetMode
from calamari_ocr.ocr import MultiPredictor
from calamari_ocr.ocr.voting import voter_from_proto
from calamari_ocr.proto import VoterParams, Predictions, CTCDecoderParams

import os
import cv2
import time
import logging

logger = logging.getLogger()


class PredictionAttrs:
    def __init__(self):
        """Atributes:
            List all image files that shall be processed
            --files

            Optional list of additional text files. E.g. when updating Abbyy prediction, this parameter must be used for the xml files.
            --text_files

            Used to identify the DatasetType
            --dataset

            Define the prediction extension. This parameter can be used to override ground truth files.
            --extension

            Path to the checkpoint without file extension
            --checkpoint

            Number of processes to use
            -j --processes

            The batch size during the prediction (number of lines to process in parallel)
            --batch_size

            Print additional information
            --verbose

            The voting algorithm to use. Possible values: confidence_voter_default_ctc (default), confidence_voter_fuzzy_ctc, sequence_voter
            --voter

            By default the prediction files will be written to the same directory as the given files. You can use this argument to specify a specific output dir for the prediction files.
            --output_dir

            Write: Predicted string, labels; position, probabilities and alternatives of chars to a .pred (protobuf) file
            --extended_prediction_data

            Extension format: Either pred or json. Note that json will not print logits.
            --extended_prediction_data_format

            Do not show any progress bars
            --no_progress_bars

            List of text files that will be used to create a dictionary
            --dictionary

            Number of beams when using the CTCWordBeamSearch decoder
            --beam_width

            # dataset extra args
            --dataset_pad
            --pagexml_text_index
        """
        self.files = []
        self.checkpoint = []
        self.processes = 1
        self.batch_size = 1
        self.verbose = True
        self.voter = "confidence_voter_default_ctc"
        self.output_dir = None
        self.extended_prediction_data = False
        self.extended_prediction_data_format = "json"
        self.no_progress_bars = True
        self.extension = None
        self.dataset = DataSetType.FILE
        self.text_files = None
        self.pagexml_text_index = 0
        self.beam_width = 25
        self.dictionary = []
        self.dataset_pad = None


class CalamariPredict(object):
    def __init__(self, checkpoint_path, file):
        """This class utilises the attributes of PredictionAttrs class and
         uses the specified checkpoint to predict the text for the input file.
         For predicting the method run() is called.

        Arguments:
            checkpoint_path: Path to the checkpoint files
            file: Image file(s) or glob pattern to run the prediction on
        """
        self.args = PredictionAttrs()
        self.args.checkpoint = checkpoint_path
        self.args.files = file

        # add json as extension, resolve wildcard, expand user, ... and remove .json again
        self.args.checkpoint = [(cp if cp.endswith(".json") else cp + ".json") for cp in self.args.checkpoint]
        self.args.checkpoint = glob_all(self.args.checkpoint)
        self.args.checkpoint = [cp[:-5] for cp in self.args.checkpoint]

        self.args.extension = (
            self.args.extension if self.args.extension else DataSetType.pred_extension(self.args.dataset)
        )

        # create ctc decoder
        ctc_decoder_params = self.create_ctc_decoder_params(self.args)

        # create voter
        voter_params = VoterParams()
        voter_params.type = VoterParams.Type.Value(self.args.voter.upper())
        self.voter = voter_from_proto(voter_params)

        # predict for all models
        self.predictor = MultiPredictor(
            checkpoints=self.args.checkpoint,
            batch_size=self.args.batch_size,
            processes=self.args.processes,
            ctc_decoder_params=ctc_decoder_params,
        )
        # checks
        if self.args.extended_prediction_data_format not in ["pred", "json"]:
            raise Exception("Only 'pred' and 'json' are allowed extended prediction data formats")
        # load files
        if self.args.dataset == DataSetType.FILE:
            input_image_files = glob_all(self.args.files)
            if self.args.text_files:
                self.args.text_files = glob_all(self.args.text_files)

        elif self.args.dataset == DataSetType.RAW:
            input_image_files = file

        # skip invalid files and remove them; there won't be predictions for invalid files
        self.dataset = create_dataset(
            self.args.dataset,
            DataSetMode.PREDICT,
            input_image_files,
            self.args.text_files,
            skip_invalid=True,
            remove_invalid=True,
            args={"text_index": self.args.pagexml_text_index, "pad": self.args.dataset_pad},
        )

        if len(self.dataset) == 0:
            raise Exception(
                "Empty dataset provided. Check your files argument (got {})!".format(self.args.files)
            )

        do_prediction = self.predictor.predict_dataset(
            self.dataset,progress_bar=not self.args.no_progress_bars
        )

        avg_sentence_confidence = 0
        n_predictions = 0

        self.dataset.prepare_store()

        # output the voted results to the appropriate files
        for result, sample in do_prediction:
            n_predictions += 1
            for i, p in enumerate(result):
                p.prediction.id = "fold_{}".format(i)

            # vote the results (if only one model is given, this will just return the sentences)
            prediction = self.voter.vote_prediction_result(result)
            prediction.id = "voted"
            sentence = prediction.sentence
            avg_sentence_confidence += prediction.avg_char_probability

        print(sentence)

    def create_ctc_decoder_params(self, args):
        """Method for returning the ctc decoder params

        Arguments:
            args: args from an instance of class PredictionAttrs

        Returns:
            params -- ctc decoder params.
        """
        params = CTCDecoderParams()
        params.beam_width = args.beam_width
        params.word_separator = " "

        if args.dictionary and len(args.dictionary) > 0:
            dictionary = set()
            logger.info(f"Creating dictionary")
            for path in glob_all(args.dictionary):
                with open(path,"r") as f:
                    dictionary = dictionary.union({word for word in f.read().split()})

            params.dictionary[:] = dictionary
            logger.info("Dictionary with {} unique words successfully created.".format(len(dictionary)))
        else:
            args.dictionary = None

        if args.dictionary:
            logger.info(
                f"USING A LANGUAGE MODEL IS CURRENTLY EXPERIMENTAL ONLY. NOTE: THE PREDICTION IS VERY SLOW!"
            )
            params.type = CTCDecoderParams.CTC_WORD_BEAM_SEARCH

        return params


if __name__ == "__main__":
    start_time = time.time()
    pred = CalamariPredict(['/opt/working/base_model/model_Calamari_1.ckpt.json'], '/opt/working/data/*.png')
    print(time.time()-start_time)

But this is still what I get:

Checkpoint version 2 is up-to-date.
2020-09-02 15:23:00.328806: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-09-02 15:23:00.328876: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 15:23:01.536020: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-09-02 15:23:01.536096: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 15:23:01.536122: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (6d0f92664ca6): /proc/driver/nvidia/version does not exist
2020-09-02 15:23:01.536450: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-02 15:23:01.542147: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1896005000 Hz
2020-09-02 15:23:01.542953: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560df22dbb30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 15:23:01.543009: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
[... model summary identical to the first run ...]
Non-trainable params: 0
__________________________________________________________________________________________________
None
Prediction of 1 models took 1.556354284286499s
CLK0SU0M((141707
3.4778225421905518

As you can see, the measured wall-clock time is still roughly twice the reported prediction time.
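
My working theory is that the extra seconds go into constructing the MultiPredictor (loading the checkpoint and building the graph) rather than into predict_dataset itself, which appears to be all that the "Prediction of 1 models took ..." figure measures. The sketch below, based on the same calls as my refactored code above, is how I would split the timing to check that; the two-phase breakdown is my own assumption, not something the repo documents:

import time

from calamari_ocr.utils.glob import glob_all
from calamari_ocr.ocr.datasets import DataSetType, create_dataset, DataSetMode
from calamari_ocr.ocr import MultiPredictor

# Phase 1: checkpoint loading and graph construction
# (ctc_decoder_params omitted here for brevity; this mirrors the constructor call above)
t0 = time.time()
predictor = MultiPredictor(checkpoints=['/opt/working/base_model/model_Calamari_1.ckpt'], batch_size=1, processes=1)
print("MultiPredictor construction: {:.2f}s".format(time.time() - t0))

# Phase 2: the actual prediction over the dataset
dataset = create_dataset(
    DataSetType.FILE,
    DataSetMode.PREDICT,
    glob_all(['/opt/working/data/*.png']),
    None,
    skip_invalid=True,
    remove_invalid=True,
    args={"text_index": 0, "pad": None},
)
t1 = time.time()
for result, sample in predictor.predict_dataset(dataset, progress_bar=False):
    pass
print("predict_dataset: {:.2f}s".format(time.time() - t1))

If the first number dominates, keeping a single long-lived predictor in memory and reusing it across calls would presumably amortize most of the delay, which is essentially what I am hoping to achieve.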
