Extracting results with .float_val takes very long


For some reason, extracting the results with .float_val takes a very long time.

Example scenario and its output:

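import time

import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# The code below runs inside a method: self.serving_grpc_port and
# self.max_total_detections are attributes of the enclosing class,
# and imgs_array is the batch of input images as a NumPy array.
t2 = time.time()
# Raise the gRPC receive limit so that large responses are accepted.
options = [('grpc.max_receive_message_length', 100 * 4000 * 4000)]
channel = grpc.insecure_channel(
    '{host}:{port}'.format(host='localhost', port=str(self.serving_grpc_port)),
    options=options)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
request = predict_pb2.PredictRequest()
request.model_spec.name = 'ivi-detector'
request.model_spec.signature_name = 'serving_default'

request.inputs['inputs'].CopyFrom(tf.make_tensor_proto(imgs_array, shape=imgs_array.shape))
res = stub.Predict(request, 100.0)  # 100-second timeout

print("Time to detect:")
t3 = time.time(); print("t3:", t3 - t2)

# Time each .float_val access separately.
t11 = time.time()
boxes_float_val = res.outputs['detection_boxes'].float_val
t12 = time.time(); print("t12:", t12 - t11)
classes_float_val = res.outputs['detection_classes'].float_val
t13 = time.time(); print("t13:", t13 - t12)
scores_float_val = res.outputs['detection_scores'].float_val
t14 = time.time(); print("t14:", t14 - t13)

# Reshape the flat value lists into per-image arrays.
# The classes/scores shapes were malformed in the original post (stray "]");
# [batch, max_total_detections] is the presumed intent.
boxes = np.reshape(boxes_float_val, [len(imgs_array), self.max_total_detections, 4])
classes = np.reshape(classes_float_val, [len(imgs_array), self.max_total_detections])
scores = np.reshape(scores_float_val, [len(imgs_array), self.max_total_detections])
t15 = time.time(); print("t15:", t15 - t14)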

Time to detect:
t3: 1.4687104225158691
t12: 1.9140026569366455
t13: 3.719329833984375e-05
t14: 9.298324584960938e-06
t15: 0.0008063316345214844
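
The measurements above are taken from manual time.time() differences. As a side note, the same kind of per-step timing can be written with a small helper. This is only a sketch: `timed` is a hypothetical helper that is not part of the original code, and the workload inside the `with` block is a stand-in for the real `.float_val` access.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Print how long the enclosed block took; optionally record it in `results`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed
        print(f"{label}: {elapsed:.6f} s")


# Stand-in workload. In the snippet above the timed line would be, for example:
#     boxes_float_val = res.outputs['detection_boxes'].float_val
timings = {}
with timed("sum", timings):
    total = sum(range(100_000))
```

Wrapping each `.float_val` access in its own `with timed(...)` block keeps the individual measurements separate, as in the t12/t13/t14 lines.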

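TensorFlow Serving is running an object detection model from the TensorFlow Object Detection API (faster_rcnn_resnet101). As the timings show, extracting the detected boxes takes longer than the prediction itself.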

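The detected boxes currently have the shape [batch_size, 100, 4], where 100 is the maximum number of detections. As a workaround, I can lower the maximum number of detections, which significantly reduces the time needed to extract these values, but in my opinion it still stays unnecessarily high.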
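
For a sense of scale, the size of the boxes tensor can be worked out from the shape above. This is a rough sketch, assuming float32 values (4 bytes each) and a made-up batch size of 8. Neither number comes from the measurements in this post.

```python
# Rough size of a detection_boxes tensor of shape [batch_size, max_detections, 4].
# Assumptions: float32 values (4 bytes each); batch_size=8 is a hypothetical example.

def boxes_payload_bytes(batch_size, max_detections=100, coords_per_box=4, bytes_per_value=4):
    """Return (number of floats, size in bytes) for the boxes tensor."""
    num_values = batch_size * max_detections * coords_per_box
    return num_values, num_values * bytes_per_value


num_values, num_bytes = boxes_payload_bytes(batch_size=8)
print(num_values, "floats,", num_bytes, "bytes")  # 3200 floats, 12800 bytes
```

Under these assumptions the boxes amount to a few kilobytes, which is why a delay of almost two seconds on the first access looks out of proportion to the boxes alone.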

I am using tensorflow-serving-api==2.3.0 with TensorFlow Serving 2.3.0-gpu running as a Docker container.

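It is also important to note that I tried to reproduce this behavior with public saved models (fully trained on ImageNet), and .float_val shows no performance drop there, which suggests that the problem lies with my custom-trained model. I have tried exporting the saved model from the .ckpt files in different ways, but the problem persists. If I apply any of those export methods to a downloaded model (the download contains both the .ckpt files and the saved_model files), the problem does not occur, so the export method is not to blame.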

So now I suspect that something is wrong with, or different about, the model I trained. But why? And how can that affect .float_val in tensorflow-serving-api?
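
One way to compare the custom model with the public one is to list every output the server returns together with its serialized size, in case the custom export returns additional or larger tensors. This is only a guess about where to look, not a confirmed cause. `summarize_outputs` and `print_summary` are hypothetical helpers. They rely only on `ByteSize()`, which protobuf messages provide, and on the `tensor_shape.dim[i].size` fields of a `TensorProto`.

```python
def summarize_outputs(outputs):
    """Return (name, shape, serialized_bytes) for each output, largest first.

    `outputs` is expected to behave like `res.outputs` of a PredictResponse:
    a mapping from output name to a TensorProto-like message.
    """
    rows = []
    for name, tensor in outputs.items():
        shape = [d.size for d in tensor.tensor_shape.dim]
        rows.append((name, shape, tensor.ByteSize()))
    return sorted(rows, key=lambda row: row[2], reverse=True)


def print_summary(outputs):
    """Print one line per output so two models can be compared side by side."""
    for name, shape, size in summarize_outputs(outputs):
        print(f"{name:30s} shape={shape} bytes={size}")
```

With the response from the example above, this would be called as `print_summary(res.outputs)`, once against the custom model and once against the downloaded one.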

The code I used (which runs fast): https://github.com/denisb411/tfserving-od/blob/master/inference-using-tfserving-docker.ipynb

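I don't know what I could do differently in training, because my custom training follows almost the same pipeline.config as the original, so the training process is essentially the same.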

How can I solve this? And if it is related, what does it have to do with .float_val?

Assuming this is a bug, I opened a GitHub issue about it some time ago, but it did not get much attention.
