

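I have been using the Calamari-ocr text detection engine (Calamari ocr repo). What I have noticed is that a prediction run emits a lot of messages, and that the time it reports for the prediction does not match the time I actually wait: it says the prediction took 1 second, but it takes about 5 seconds before I get the result. In other words, if I time the predict command as a whole, it takes roughly 5 seconds, even though the tool reports that the model needed only 1 second to predict.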

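My investigation led me to this script in the repo, which performs the prediction. From there I traced the work to the MultiPredictor class (line 92). I then looked up where that class is defined and found it here. However, I could not find what triggers the flood of messages and warnings, which I suspect are what is causing the delay.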
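One way to narrow down where the extra seconds go is to time each phase of the run separately instead of timing the whole command. The following is only a minimal sketch: the `timed` helper and the phase names are hypothetical, and the `time.sleep` calls are stand-ins for the real imports, model loading, and prediction.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

# Stand-ins for the real phases (hypothetical): replace the sleeps with the
# TensorFlow/Calamari imports, the predictor construction, and the prediction.
with timed("import"):
    time.sleep(0.05)
with timed("load model"):
    time.sleep(0.03)
with timed("predict"):
    time.sleep(0.01)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```

If most of the time turns out to be in the import or model-loading phase, that would be consistent with the tool reporting only about 1 second for the prediction itself.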


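import subprocess
import time

cmd = ['calamari-predict', '--checkpoint', '/opt/working/base_model/model_Calamari_1.ckpt.json', '--files', '/opt/working/data/*.png']
start_time = time.time()
# Assumed: the snippet is cut off after start_time; running the command and
# printing the elapsed wall-clock time is the presumed remainder.
subprocess.run(cmd)
print("Total time: {}s".format(time.time() - start_time))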



Found 1 files in the dataset
Checkpoint version 2 is up-to-date.
2020-09-02 14:50:49.243030: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-09-02 14:50:49.243090: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 14:50:50.390703: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-09-02 14:50:50.390770: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 14:50:50.390814: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (6d0f92664ca6): /proc/driver/nvidia/version does not exist
2020-09-02 14:50:50.391069: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations,rebuild TensorFlow with the appropriate compiler flags.
2020-09-02 14:50:50.396503: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1896005000 Hz
2020-09-02 14:50:50.396883: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ca0736dbb0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 14:50:50.396937: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host,Default Version
Model: "functional_1"
Layer (type)                    Output Shape         Param #     Connected to                     
input_data (InputLayer)         [(None,None,48,1) 0                                            
conv2d_0 (Conv2D)               (None,40) 400         input_data[0][0]                 
pool2d_1 (MaxPooling2D)         (None,24,40) 0           conv2d_0[0][0]                   
conv2d_1 (Conv2D)               (None,60) 21660       pool2d_1[0][0]                   
pool2d_3 (MaxPooling2D)         (None,12,60) 0           conv2d_1[0][0]                   
reshape (Reshape)               (None,720)    0           pool2d_3[0][0]                   
bidirectional (Bidirectional)   (None,400)    1473600     reshape[0][0]                    
input_sequence_length (InputLay [(None,1)]          0                                            
dropout (Dropout)               (None,400)    0           bidirectional[0][0]              
tf_op_layer_FloorDiv (TensorFlo [(None,1)]          0           input_sequence_length[0][0]      
logits (Dense)                  (None,29)     11629       dropout[0][0]                    
tf_op_layer_FloorDiv_1 (TensorF [(None,1)]          0           tf_op_layer_FloorDiv[0][0]       
softmax (Softmax)               (None,29)     0           logits[0][0]                     
input_data_params (InputLayer)  [(None,1)]          0                                            
tf_op_layer_Cast (TensorFlowOpL [(None,1)]          0           tf_op_layer_FloorDiv_1[0][0]     
Total params: 1,507,289
Trainable params: 1,507,289
Non-trainable params: 0
Prediction:   0%|                                                                                       | 0/1 [00:00<?,?it/s]
barcode_0: '‪CLK0SU0M((141707‬'
Prediction: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,1.52s/it]
Prediction of 1 models took 1.5640497207641602s
Average sentence confidence: 85.41%
All files written
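The timestamps in the output above give a rough idea of how long the TensorFlow startup takes. As a small sketch, the two log lines below are copied from that output (first and last TensorFlow message, shortened), and the `parse_timestamps` helper is a hypothetical name introduced for illustration.

```python
import re
from datetime import datetime

# First and last TensorFlow message from the output above (shortened).
log_lines = [
    "2020-09-02 14:50:49.243030: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'",
    "2020-09-02 14:50:50.396937: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version",
]

TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}):")

def parse_timestamps(lines):
    """Return the datetime at the start of each TensorFlow log line, skipping other lines."""
    stamps = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return stamps

stamps = parse_timestamps(log_lines)
startup_seconds = (stamps[-1] - stamps[0]).total_seconds()
print(f"TensorFlow messages span {startup_seconds:.3f}s")  # prints 1.154s for these two lines
```

Together with the reported 1.56 s prediction time this accounts for roughly 2.7 s. The log cannot show what happens before the first TensorFlow message (for example Python imports), so the rest of the 5 seconds would have to be measured separately.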



from bidi.algorithm import get_base_level

from google.protobuf.json_format import MessageToJson

from calamari_ocr import __version__
from calamari_ocr.utils.glob import glob_all
from calamari_ocr.ocr.datasets import DataSetType, create_dataset, DataSetMode
from calamari_ocr.ocr import MultiPredictor
from calamari_ocr.ocr.voting import voter_from_proto
from calamari_ocr.proto import VoterParams, Predictions, CTCDecoderParams

import os
import cv2
import time
import logging

logger = logging.getLogger()

class PredictionAttrs:
    def __init__(self):
        """
        Attributes:
            files: List all image files that shall be processed
            text_files: Optional list of additional text files. E.g. when updating Abbyy prediction, this parameter must be used for the xml files.
            dataset: Used to identify the DatasetType
            extension: Define the prediction extension. This parameter can be used to override ground truth files.
            checkpoint: Path to the checkpoint without file extension
            processes: Number of processes to use (-j --processes)
            batch_size: The batch size during the prediction (number of lines to process in parallel)
            verbose: Print additional information
            voter: The voting algorithm to use. Possible values: confidence_voter_default_ctc (default), confidence_voter_fuzzy_ctc, sequence_voter
            output_dir: By default the prediction files will be written to the same directory as the given files. You can use this argument to specify a specific output dir for the prediction files.
            extended_prediction_data: Write: Predicted string, labels; position, probabilities and alternatives of chars to a .pred (protobuf) file
            extended_prediction_data_format: Extension format: Either pred or json. Note that json will not print logits.
            no_progress_bars: Do not show any progress bars
            dictionary: List of text files that will be used to create a dictionary
            beam_width: Number of beams when using the CTCWordBeamSearch decoder
            pagexml_text_index, dataset_pad: dataset extra args
        """
        self.files = []
        self.checkpoint = []
        self.processes = 1
        self.batch_size = 1
        self.verbose = True
        self.voter = "confidence_voter_default_ctc"
        self.output_dir = None
        self.extended_prediction_data = False
        self.extended_prediction_data_format = "json"
        self.no_progress_bars = True
        self.extension = None
        self.dataset = DataSetType.FILE
        self.text_files = None
        self.pagexml_text_index = 0
        self.beam_width = 25
        self.dictionary = []
        self.dataset_pad = None

class CalamariPredict(object):
    def __init__(self, checkpoint_path, file):
        """This class utilises the attributes of PredictionAttrs class and
        uses the specified checkpoint to predict the text for the input file.
        For predicting the method run() is called.

        Args:
            checkpoint_path: Path to checkpoint files
        """
        self.args = PredictionAttrs()
        self.args.checkpoint = checkpoint_path
        self.args.files = file

        # add json as extension, resolve wildcard, expand user, ... and remove .json again
        self.args.checkpoint = [(cp if cp.endswith(".json") else cp + ".json") for cp in self.args.checkpoint]
        self.args.checkpoint = glob_all(self.args.checkpoint)
        self.args.checkpoint = [cp[:-5] for cp in self.args.checkpoint]

        self.args.extension = (
            self.args.extension if self.args.extension else DataSetType.pred_extension(self.args.dataset)
        )

        # create ctc decoder
        ctc_decoder_params = self.create_ctc_decoder_params(self.args)

        # create voter
        voter_params = VoterParams()
        voter_params.type = VoterParams.Type.Value(self.args.voter.upper())
        self.voter = voter_from_proto(voter_params)

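        # predict for all models
        # Assumed: the argument list is cut off in the original; these keyword
        # arguments follow Calamari's own predict script.
        self.predictor = MultiPredictor(
            checkpoints=self.args.checkpoint,
            batch_size=self.args.batch_size,
            processes=self.args.processes,
            ctc_decoder_params=ctc_decoder_params,
        )

        # checks
        if self.args.extended_prediction_data_format not in ["pred", "json"]:
            raise Exception("Only 'pred' and 'json' are allowed extended prediction data formats")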
        # load files
        if self.args.dataset == DataSetType.FILE:
            input_image_files = glob_all(self.args.files)
            if self.args.text_files:
                self.args.text_files = glob_all(self.args.text_files)

        elif self.args.dataset == DataSetType.RAW:
            input_image_files = file

        # skip invalid files and remove them, there won't be predictions of invalid files
        self.dataset = create_dataset(
            self.args.dataset,
            DataSetMode.PREDICT,
            input_image_files,
            self.args.text_files,
            skip_invalid=True,
            remove_invalid=True,
            args={"text_index": self.args.pagexml_text_index, "pad": self.args.dataset_pad},
        )

        if len(self.dataset) == 0:
            raise Exception(
                "Empty dataset provided. Check your files argument (got {})!".format(self.args.files)
            )

        do_prediction = self.predictor.predict_dataset(
            self.dataset, progress_bar=not self.args.no_progress_bars
        )
        avg_sentence_confidence = 0
        n_predictions = 0


        # output the voted results to the appropriate files
        for result, sample in do_prediction:
            n_predictions += 1
            for i, p in enumerate(result):
                p.prediction.id = "fold_{}".format(i)

            # vote the results (if only one model is given, this will just return the sentences)
            prediction = self.voter.vote_prediction_result(result)
            prediction.id = "voted"
            sentence = prediction.sentence
            avg_sentence_confidence += prediction.avg_char_probability

    def create_ctc_decoder_params(self, args):
        """Method for returning the ctc decoder params

        Args:
            args: args from an instance of class PredictionAttrs

        Returns:
            params -- ctc decoder params.
        """
        params = CTCDecoderParams()
        params.beam_width = args.beam_width
        params.word_separator = " "

        if args.dictionary and len(args.dictionary) > 0:
            dictionary = set()
            logger.info("Creating dictionary")
            for path in glob_all(args.dictionary):
                with open(path, "r") as f:
                    dictionary = dictionary.union({word for word in f.read().split()})

            params.dictionary[:] = dictionary
            logger.info("Dictionary with {} unique words successfully created.".format(len(dictionary)))
        else:
            # without this else branch the word beam search below could never be selected
            args.dictionary = None

        if args.dictionary:
            params.type = CTCDecoderParams.CTC_WORD_BEAM_SEARCH

        return params

if __name__ == "__main__":
    start_time = time.time()
    pred = CalamariPredict(['/opt/working/base_model/model_Calamari_1.ckpt.json'], '/opt/working/data/*.png')


Checkpoint version 2 is up-to-date.
2020-09-02 15:23:00.328806: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-09-02 15:23:00.328876: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-09-02 15:23:01.536020: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-09-02 15:23:01.536096: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-09-02 15:23:01.536122: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (6d0f92664ca6): /proc/driver/nvidia/version does not exist
2020-09-02 15:23:01.536450: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations,rebuild TensorFlow with the appropriate compiler flags.
2020-09-02 15:23:01.542147: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1896005000 Hz
2020-09-02 15:23:01.542953: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x560df22dbb30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-02 15:23:01.543009: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host,Default Version
...
Trainable params: 1,507,289
Non-trainable params: 0
Prediction of 1 models took 1.556354284286499s
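The lines prefixed with `W tensorflow/...` and `I tensorflow/...` in both outputs come from TensorFlow's C++ runtime, whose verbosity is controlled by the `TF_CPP_MIN_LOG_LEVEL` environment variable. The sketch below shows one way to set it for a child process. The `quiet_tf_env` helper is a hypothetical name, and the child command is a stand-in that only echoes the variable; in practice it would be the `calamari-predict` command. Note that this only hides the messages. Whether it shortens the run is not established here, and output that is printed from Python (the model summary, the progress bar) would presumably be unaffected.

```python
import os
import subprocess
import sys

def quiet_tf_env(level="3"):
    """Copy the current environment and set TF_CPP_MIN_LOG_LEVEL.

    "0" shows everything, "1" hides INFO, "2" also hides WARNING,
    "3" also hides ERROR.
    """
    env = os.environ.copy()
    env["TF_CPP_MIN_LOG_LEVEL"] = level
    return env

# Stand-in child process that only echoes the variable; the real command
# would be the calamari-predict invocation from the question.
child = [sys.executable, "-c", "import os; print(os.environ['TF_CPP_MIN_LOG_LEVEL'])"]
result = subprocess.run(child, env=quiet_tf_env(), capture_output=True, text=True, check=True)
print(result.stdout.strip())
```

When TensorFlow is imported in the same process rather than in a subprocess, the variable has to be set in `os.environ` before the import for it to take effect.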


