How can I make TFLite's CoreMLDelegate use both the GPU and CPU on iOS?
I have already used TFLite's MetalDelegate successfully in my app. When I switch to CoreMLDelegate, it runs my (float) tflite model (MobileNet) entirely on the CPU, showing 0 GPU usage. I am running this on a compatible iPhone 11 Pro Max. During initialization, I noticed the following line:
"CoreML delegate: 29 nodes delegated out of 31 nodes, with 2 partitions".
Any ideas? How can I get CoreMLDelegate to use both the GPU and CPU on iOS? I downloaded the mobilenet_v1_1.0_224.tflite
model file from here.
import AVFoundation
import UIKit
import SpriteKit
import Metal

var device: MTLDevice!
var commandQueue: MTLCommandQueue!
private var total_latency: Double = 0
private var total_count: Double = 0
private var sstart = Date().timeIntervalSince1970

class ViewController: UIViewController {
    ...
}

// MARK: CameraFeedManagerDelegate Methods
extension ViewController: CameraFeedManagerDelegate {

    func didOutput(pixelBuffer: CVPixelBuffer) {
        let currentTimeMs = Date().timeIntervalSince1970 * 1000
        guard (currentTimeMs - previousInferenceTimeMs) >= delayBetweenInferencesMs else { return }
        previousInferenceTimeMs = currentTimeMs

        // 1. Create the Metal device and command queue (ideally once, in viewDidLoad()).
        device = MTLCreateSystemDefaultDevice()
        commandQueue = device.makeCommandQueue()

        let start = Date().timeIntervalSince1970

        // 2. Access the shared MTLCaptureManager and start capturing.
        let capManager = MTLCaptureManager.shared()
        let myCaptureScope = capManager.makeCaptureScope(device: device)
        myCaptureScope.begin()
        let commandBuffer = commandQueue.makeCommandBuffer()!

        // Pass the pixel buffer to TensorFlow Lite to perform inference.
        result = modelDataHandler?.runModel(onFrame: pixelBuffer)

        // 3. Encode your kernel, then commit and end the capture scope.
        commandBuffer.commit()
        myCaptureScope.end()

        let end = Date().timeIntervalSince1970
        total_latency += (end - start)
        total_count += 1
        let rfps = total_count / (end - sstart)
        let fps = total_count / (end - start)
        let stri = "Time: \(end - start) avg: \(total_latency / total_count) count: \(total_count) rfps: \(rfps) fps: \(fps)"
        print(stri)

        // Display results by handing off to the InferenceViewController.
        DispatchQueue.main.async {
            guard let finalInferences = self.result?.inferences else {
                self.resultLabel.text = ""
                return
            }
            let resultStrings = finalInferences.map { inference in
                String(format: "%@ %.2f", inference.label, inference.confidence)
            }
            self.resultLabel.text = resultStrings.joined(separator: "\n")
        }
    }
}
2020-08-22 07:09:39.783215-0400 ImageClassification[3039:645963] coreml_version must be 2 or 3. Setting to 3.
2020-08-22 07:09:39.785103-0400 ImageClassification[3039:645963] Created TensorFlow Lite delegate for Metal.
2020-08-22 07:09:39.785505-0400 ImageClassification[3039:645963] Metal GPU Frame Capture Enabled
2020-08-22 07:09:39.786110-0400 ImageClassification[3039:645963] Metal API Validation Enabled
2020-08-22 07:09:39.927854-0400 ImageClassification[3039:645963] Initialized TensorFlow Lite runtime.
2020-08-22 07:09:39.928928-0400 ImageClassification[3039:645963] CoreML delegate: 29 nodes delegated out of 31 nodes, with 2 partitions
Solution
Thanks for trying out the Core ML delegate. Could you share the TFLite version you are using and the code you use to initialize the Core ML delegate? Also, can you confirm that you are running a float model rather than a quantized one?
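For reference, the usual initialization pattern from the TFLite Swift API is sketched below. The model path is a placeholder, and `CoreMLDelegate(options:)` returns nil on devices where the delegate is unsupported; setting `enabledDevices = .all` lifts the default restriction to Neural Engine devices:

```swift
import TensorFlowLite

// Sketch: create an interpreter with the Core ML delegate, falling back
// to the Metal (GPU) delegate where Core ML is unavailable.
// `modelPath` is a placeholder for your mobilenet_v1_1.0_224.tflite path.
func makeInterpreter(modelPath: String) throws -> Interpreter {
    var delegates: [Delegate] = []
    var options = CoreMLDelegate.Options()
    options.enabledDevices = .all  // default restricts to Neural Engine devices
    if let coreMLDelegate = CoreMLDelegate(options: options) {
        delegates.append(coreMLDelegate)
    } else {
        delegates.append(MetalDelegate())
    }
    return try Interpreter(modelPath: modelPath, delegates: delegates)
}
```

Sharing code in this shape makes it easy to spot whether the delegate was actually attached before the interpreter was created.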
The latency can vary depending on what exactly you measure, but when measuring only the inference time, my iPhone 11 Pro shows 11 ms on CPU and 5.5 ms with the Core ML delegate.
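"Only the inference time" means timing just the interpreter invocation, not the whole camera/Metal capture pipeline measured in the code above. A minimal sketch, where `interpreter` and `inputData` are assumptions rather than names from the question:

```swift
import TensorFlowLite
import QuartzCore  // for CACurrentMediaTime

// Sketch: time only the interpreter invocation, excluding input copy,
// camera handling, and any Metal capture work.
func timedInference(interpreter: Interpreter, inputData: Data) throws -> Double {
    try interpreter.copy(inputData, toInputAt: 0)
    let start = CACurrentMediaTime()
    try interpreter.invoke()
    let elapsedMs = (CACurrentMediaTime() - start) * 1000
    print("Inference took \(elapsedMs) ms")
    return elapsedMs
}
```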
The profiler cannot capture Neural Engine utilization, but if you are seeing that latency together with high CPU utilization, it may indicate that your model is running only on the CPU. You can also try the Time Profiler to find out which part consumes the most resources.