Is there a way to eliminate unnecessary time delay in the Calamari-ocr text detection engine?
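I have been using the Calamari-ocr text detection engine (Calamari ocr repo). What I have noticed is that a prediction run emits a lot of messages, and although the tool reports that the prediction took about 1 second, it actually takes about 5 seconds to reach that prediction. That is, if I time the predict command, it takes roughly 5 seconds even though it tells me the model needed only 1 second to predict.
My investigation led me to this script in the repo, which carries out the prediction. From there I traced the work to the MultiPredictor class (line 92), so I looked up where that class is defined and found that it is defined here.
However, I could not find what causes the flood of messages and warnings that is causing the delay for me.
This is the code snippet I use to run and time the prediction: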
import subprocess
import time
cmd = ['calamari-predict','--checkpoint','/opt/working/base_model/model_Calamari_1.ckpt.json','--files','/opt/working/data/*.png']
start_time = time.time()
subprocess.run(cmd)
print(time.time()-start_time)
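One experiment that may help separate the messages from the delay is to run the same command with TensorFlow's native log output silenced and compare the timings. The following is only a sketch: `quiet_env` and `timed_run` are hypothetical helper names, and it assumes the TensorFlow build loaded by calamari-predict honors the `TF_CPP_MIN_LOG_LEVEL` environment variable. Whether hiding the messages changes the wall-clock time has to be measured, not assumed.

```python
import os
import subprocess
import time


def quiet_env(base=None):
    """Return a copy of the environment with TensorFlow's native logging silenced."""
    env = dict(os.environ if base is None else base)
    # "3" filters out TensorFlow's C++ INFO, WARNING and ERROR messages.
    env["TF_CPP_MIN_LOG_LEVEL"] = "3"
    return env


def timed_run(cmd, env=None):
    """Run cmd and return (return code, wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, env=env)
    return completed.returncode, time.perf_counter() - start
```

With the `cmd` list from the snippet above, `timed_run(cmd, env=quiet_env())` can be compared against `timed_run(cmd)` to see whether the log output itself accounts for any of the extra seconds.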
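Note that I have installed the calamari package along with the dependencies required by the repo's instructions.
This is the output I get when I run the prediction: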
Found 1 files in the dataset
Checkpoint version 2 is up-to-date.
2020-09-02 14:50:49.243030: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-09-02 14:50:49.243090: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 14:50:50.390703: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-09-02 14:50:50.390770: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 14:50:50.390814: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (6d0f92664ca6): /proc/driver/nvidia/version does not exist
2020-09-02 14:50:50.391069: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations,rebuild TensorFlow with the appropriate compiler flags.
2020-09-02 14:50:50.396503: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1896005000 Hz
2020-09-02 14:50:50.396883: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ca0736dbb0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 14:50:50.396937: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host,Default Version
Model: "functional_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_data (InputLayer) [(None,None,48,1) 0
__________________________________________________________________________________________________
conv2d_0 (Conv2D) (None,40) 400 input_data[0][0]
__________________________________________________________________________________________________
pool2d_1 (MaxPooling2D) (None,24,40) 0 conv2d_0[0][0]
__________________________________________________________________________________________________
conv2d_1 (Conv2D) (None,60) 21660 pool2d_1[0][0]
__________________________________________________________________________________________________
pool2d_3 (MaxPooling2D) (None,12,60) 0 conv2d_1[0][0]
__________________________________________________________________________________________________
reshape (Reshape) (None,720) 0 pool2d_3[0][0]
__________________________________________________________________________________________________
bidirectional (Bidirectional) (None,400) 1473600 reshape[0][0]
__________________________________________________________________________________________________
input_sequence_length (InputLay [(None,1)] 0
__________________________________________________________________________________________________
dropout (Dropout) (None,400) 0 bidirectional[0][0]
__________________________________________________________________________________________________
tf_op_layer_FloorDiv (TensorFlo [(None,1)] 0 input_sequence_length[0][0]
__________________________________________________________________________________________________
logits (Dense) (None,29) 11629 dropout[0][0]
__________________________________________________________________________________________________
tf_op_layer_FloorDiv_1 (TensorF [(None,1)] 0 tf_op_layer_FloorDiv[0][0]
__________________________________________________________________________________________________
softmax (Softmax) (None,29) 0 logits[0][0]
__________________________________________________________________________________________________
input_data_params (InputLayer) [(None,1)] 0
__________________________________________________________________________________________________
tf_op_layer_Cast (TensorFlowOpL [(None,1)] 0 tf_op_layer_FloorDiv_1[0][0]
==================================================================================================
Total params: 1,507,289
Trainable params: 1,289
Non-trainable params: 0
__________________________________________________________________________________________________
None
Prediction: 0%| | 0/1 [00:00<?,?it/s]
barcode_0: 'CLK0SU0M((141707'
Prediction: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,1.52s/it]
Prediction of 1 models took 1.5640497207641602s
Average sentence confidence: 85.41%
All files written
4.881764888763428
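As you can see, the tool reports that the prediction took only 1.5640497207641602 seconds, but the last line shows that the whole run actually took 4.881764888763428 seconds.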
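The gap between the two numbers can be worked out directly from the output. The sketch below assumes the log line keeps the form "Prediction of N models took Xs" shown above; `reported_seconds` and `overhead_seconds` are hypothetical helper names used only for illustration.

```python
import re

# Matches the summary line printed at the end of the prediction output.
_TOOK = re.compile(r"Prediction of \d+ models took ([0-9.]+)s")


def reported_seconds(log_text):
    """Return the prediction time found in the log text, or None if absent."""
    match = _TOOK.search(log_text)
    return float(match.group(1)) if match else None


def overhead_seconds(log_text, wall_seconds):
    """Wall-clock time that is not covered by the reported prediction time."""
    reported = reported_seconds(log_text)
    if reported is None:
        raise ValueError("no 'Prediction of N models took Xs' line found")
    return wall_seconds - reported


log = "Prediction of 1 models took 1.5640497207641602s"
print(round(overhead_seconds(log, 4.881764888763428), 2))  # prints 3.32
```

So roughly 3.3 of the 4.9 seconds are spent outside the part that the tool itself times.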
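To make sure that running the prediction as a subprocess is not the culprit behind the delay, I refactored the prediction code and ran it directly: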
from bidi.algorithm import get_base_level
from google.protobuf.json_format import MessageToJson
from calamari_ocr import __version__
from calamari_ocr.utils.glob import glob_all
from calamari_ocr.ocr.datasets import DataSetType,create_dataset,DataSetMode
from calamari_ocr.ocr import MultiPredictor
from calamari_ocr.ocr.voting import voter_from_proto
from calamari_ocr.proto import VoterParams,Predictions,CTCDecoderParams
import os
import cv2
import time
import logging
logger = logging.getLogger()
class PredictionAttrs:
    def __init__(self):
        """Attributes:
        List all image files that shall be processed
            --files
        Optional list of additional text files. E.g. when updating Abbyy prediction, this parameter must be used for the xml files.
            --text_files
        Used to identify the DatasetType
            --dataset
        Define the prediction extension. This parameter can be used to override ground truth files.
            --extension
        Path to the checkpoint without file extension
            --checkpoint
        Number of processes to use
            -j --processes
        The batch size during the prediction (number of lines to process in parallel)
            --batch_size
        Print additional information
            --verbose
        The voting algorithm to use. Possible values: confidence_voter_default_ctc (default), confidence_voter_fuzzy_ctc, sequence_voter
            --voter
        By default the prediction files will be written to the same directory as the given files. You can use this argument to specify a specific output dir for the prediction files.
            --output_dir
        Write: Predicted string, labels; position, probabilities and alternatives of chars to a .pred (protobuf) file
            --extended_prediction_data
        Extension format: Either pred or json. Note that json will not print logits.
            --extended_prediction_data_format
        Do not show any progress bars
            --no_progress_bars
        List of text files that will be used to create a dictionary
            --dictionary
        Number of beams when using the CTCWordBeamSearch decoder
            --beam_width
        # dataset extra args
            --dataset_pad
            --pagexml_text_index
        """
        self.files = []
        self.checkpoint = []
        self.processes = 1
        self.batch_size = 1
        self.verbose = True
        self.voter = "confidence_voter_default_ctc"
        self.output_dir = None
        self.extended_prediction_data = False
        self.extended_prediction_data_format = "json"
        self.no_progress_bars = True
        self.extension = None
        self.dataset = DataSetType.FILE
        self.text_files = None
        self.pagexml_text_index = 0
        self.beam_width = 25
        self.dictionary = []
        self.dataset_pad = None


class CalamariPredict(object):
    def __init__(self, checkpoint_path, file):
        """This class utilises the attributes of PredictionAttrs class and
        uses the specified checkpoint to predict the text for the input file.
        For predicting the method run() is called.
        Arguments:
            checkpoint_path: Path to checkpoint files
        """
        self.args = PredictionAttrs()
        self.args.checkpoint = checkpoint_path
        self.args.files = file
        # add json as extension, resolve wildcard, expand user, ... and remove .json again
        self.args.checkpoint = [(cp if cp.endswith(".json") else cp + ".json") for cp in self.args.checkpoint]
        self.args.checkpoint = glob_all(self.args.checkpoint)
        self.args.checkpoint = [cp[:-5] for cp in self.args.checkpoint]
        self.args.extension = (
            self.args.extension if self.args.extension else DataSetType.pred_extension(self.args.dataset)
        )
        # create ctc decoder
        ctc_decoder_params = self.create_ctc_decoder_params(self.args)
        # create voter
        voter_params = VoterParams()
        voter_params.type = VoterParams.Type.Value(self.args.voter.upper())
        self.voter = voter_from_proto(voter_params)
        # predict for all models
        self.predictor = MultiPredictor(
            checkpoints=self.args.checkpoint,
            batch_size=self.args.batch_size,
            processes=self.args.processes,
            ctc_decoder_params=ctc_decoder_params,
        )
        # checks
        if self.args.extended_prediction_data_format not in ["pred", "json"]:
            raise Exception("Only 'pred' and 'json' are allowed extended prediction data formats")
        # load files
        if self.args.dataset == DataSetType.FILE:
            input_image_files = glob_all(self.args.files)
            if self.args.text_files:
                self.args.text_files = glob_all(self.args.text_files)
        elif self.args.dataset == DataSetType.RAW:
            input_image_files = file
        # skip invalid files and remove them, there wont be predictions of invalid files
        self.dataset = create_dataset(
            self.args.dataset,
            DataSetMode.PREDICT,
            input_image_files,
            self.args.text_files,
            skip_invalid=True,
            remove_invalid=True,
            args={"text_index": self.args.pagexml_text_index, "pad": self.args.dataset_pad},
        )
        if len(self.dataset) == 0:
            raise Exception(
                "Empty dataset provided. Check your files argument (got {})!".format(self.args.files)
            )
        do_prediction = self.predictor.predict_dataset(
            self.dataset, progress_bar=not self.args.no_progress_bars
        )
        avg_sentence_confidence = 0
        n_predictions = 0
        self.dataset.prepare_store()
        # output the voted results to the appropriate files
        for result, sample in do_prediction:
            n_predictions += 1
            for i, p in enumerate(result):
                p.prediction.id = "fold_{}".format(i)
            # vote the results (if only one model is given, this will just return the sentences)
            prediction = self.voter.vote_prediction_result(result)
            prediction.id = "voted"
            sentence = prediction.sentence
            avg_sentence_confidence += prediction.avg_char_probability
            print(sentence)

    def create_ctc_decoder_params(self, args):
        """Method for returning the ctc decoder params
        Arguments:
            args: args from an instance of class PredictionAttrs
        Returns:
            params -- ctc decoder params.
        """
        params = CTCDecoderParams()
        params.beam_width = args.beam_width
        params.word_separator = " "
        if args.dictionary and len(args.dictionary) > 0:
            dictionary = set()
            logger.info("Creating dictionary")
            for path in glob_all(args.dictionary):
                with open(path, "r") as f:
                    dictionary = dictionary.union({word for word in f.read().split()})
            params.dictionary[:] = dictionary
            logger.info("Dictionary with {} unique words successfully created.".format(len(dictionary)))
        else:
            args.dictionary = None
        if args.dictionary:
            logger.info(
                "USING A LANGUAGE MODEL IS CURRENTLY EXPERIMENTAL ONLY. NOTE: THE PREDICTION IS VERY SLOW!"
            )
            params.type = CTCDecoderParams.CTC_WORD_BEAM_SEARCH
        return params


if __name__ == "__main__":
    start_time = time.time()
    pred = CalamariPredict(['/opt/working/base_model/model_Calamari_1.ckpt.json'], '/opt/working/data/*.png')
    print(time.time() - start_time)
But this is still what I get:
Checkpoint version 2 is up-to-date.
2020-09-02 15:23:00.328806: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-09-02 15:23:00.328876: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 15:23:01.536020: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-09-02 15:23:01.536096: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 15:23:01.536122: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (6d0f92664ca6): /proc/driver/nvidia/version does not exist
2020-09-02 15:23:01.536450: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations,rebuild TensorFlow with the appropriate compiler flags.
2020-09-02 15:23:01.542147: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1896005000 Hz
2020-09-02 15:23:01.542953: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560df22dbb30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 15:23:01.543009: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host,289
Non-trainable params: 0
__________________________________________________________________________________________________
None
Prediction of 1 models took 1.556354284286499s
CLK0SU0M((141707
3.4778225421905518
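As you can see, the measured time (about 3.48 seconds) is still roughly twice the prediction time that the tool reports (about 1.56 seconds).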
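To narrow down where the remaining time goes, each stage could be timed separately instead of timing the whole run. The following is a minimal sketch of such a per-phase timer. The `phase` helper and the phase names are hypothetical, and the `time.sleep` calls are stand-ins: in the script above they would be replaced by the calamari_ocr imports, the `MultiPredictor(...)` construction, and the loop over `predict_dataset(...)`.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def phase(name):
    """Record the wall-clock duration of the enclosed block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start


# Stand-in workloads only; swap in the real imports, model construction
# and prediction loop to measure the actual script.
with phase("import"):
    import json  # placeholder for the heavy library imports

with phase("load_model"):
    time.sleep(0.01)  # placeholder for building the predictor

with phase("predict"):
    time.sleep(0.01)  # placeholder for iterating over the predictions

for name, seconds in timings.items():
    print(f"{name}: {seconds:.3f}s")
```

If most of the unexplained seconds land in the import or model-loading phase, the reported prediction time and the wall-clock time would be consistent with each other, since the reported figure would then cover only the last phase.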