Converting PyTorch's grid_sample to CoreML (via coremltools)

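torch.nn.functional.grid_sample is currently unsupported by CoreML and by its conversion utility library, coremltools.

What I'm looking for is a way to export the layer shown below from PyTorch's TorchScript to CoreML, either through a custom op created in Swift or through an efficient PyTorch rewrite of grid_sample.

For details and tips to get you started, see the Tips section.

Minimal verifiable example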

import coremltools as ct
import torch


class GridSample(torch.nn.Module):
    def forward(self, inputs, grid):
        # Rest could be the default behaviour, e.g. bilinear
        return torch.nn.functional.grid_sample(inputs, grid, align_corners=True)


# Image could also have more in_channels, different dimension etc.,
# for example (2, 32, 64, 64)
image = torch.randn(2, 3, 32, 32)  # (batch, in_channels, height, width)
# grid_sample expects a grid of shape (batch, out_height, out_width, 2)
grid = torch.randint(low=-1, high=2, size=(2, 64, 64, 2)).float()

layer = GridSample()
# You could use `torch.jit.script` if preferable
scripted = torch.jit.trace(layer, (image, grid))

# Sanity check
print(scripted(image, grid).shape)


# Error during conversion
coreml_layer = ct.converters.convert(
    scripted,
    source="pytorch",
    inputs=[
        ct.TensorType(name="image", shape=image.shape),
        ct.TensorType(name="grid", shape=grid.shape),
    ],
)

This raises the following error:

Traceback (most recent call last):
  File "/home/REDACTED/Downloads/sample.py", line 23, in <module>
    coreml_layer = ct.converters.convert(
  File "/home/REDACTED/.conda/envs/REDACTED/lib/python3.9/site-packages/coremltools/converters/_converters_entry.py", line 175, in convert
    mlmodel = mil_convert(
  File "/home/REDACTED/.conda/envs/REDACTED/lib/python3.9/site-packages/coremltools/converters/mil/converter.py", line 128, in mil_convert
    proto = mil_convert_to_proto(, convert_from, convert_to,
  File "/home/REDACTED/.conda/envs/REDACTED/lib/python3.9/site-packages/coremltools/converters/mil/converter.py", line 171, in mil_convert_to_proto
    prog = frontend_converter(, **kwargs)
  File "/home/REDACTED/.conda/envs/REDACTED/lib/python3.9/site-packages/coremltools/converters/mil/converter.py", line 85, in __call__
    return load(*args, **kwargs)
  File "/home/REDACTED/.conda/envs/REDACTED/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 81, in load
    raise e
  File "/home/REDACTED/.conda/envs/REDACTED/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 73, in load
    prog = converter.convert()
  File "/home/REDACTED/.conda/envs/REDACTED/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 227, in convert
    convert_nodes(self.context, self.graph)
  File "/home/REDACTED/.conda/envs/REDACTED/lib/python3.9/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 54, in convert_nodes
    raise RuntimeError(
RuntimeError: PyTorch convert function for op 'grid_sampler' not implemented.

Dependencies

Python (conda):

coremltools==4.1
torch==1.8.0

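You could also use nightly/master builds (at least as of the day of writing: 2021-03-20).

Tips

I currently see two possible solutions:

PyTorch only

Rewrite torch.nn.functional.grid_sample from scratch.

This would require sticking to PyTorch operations on tensors only, as loops (e.g. triple nested) would hang the converter and be too inefficient.
You cannot use __getitem__ on list or related types. It seems to work with torch.Tensor, but I had problems with that, so keep it in mind if you get RuntimeError: PyTorch convert function for op '__getitem__' not implemented.

Pros:

Does not need two languages and sticks to a single technology

Cons:

Limited with respect to loops; requires sticking to vectorized operations (most or all of the time)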
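
As a rough illustration of the PyTorch-only route, here is a minimal vectorized sketch of bilinear sampling. The function name grid_sample_bilinear is hypothetical, and the sketch only covers 4-D inputs with mode="bilinear", padding_mode="zeros" and align_corners=True. It avoids Python loops, but whether every op it uses (floor, clamp, gather) is supported by the coremltools converter has not been checked here, so treat it as a starting point rather than a verified solution.

```python
import torch


def grid_sample_bilinear(inputs, grid):
    """Sketch of grid_sample (bilinear, zero padding, align_corners=True)."""
    n, c, h, w = inputs.shape
    h_out, w_out = grid.shape[1], grid.shape[2]

    # Map normalized coordinates in [-1, 1] to pixel coordinates.
    x = (grid[..., 0] + 1) * (w - 1) / 2
    y = (grid[..., 1] + 1) * (h - 1) / 2

    # Corners of the cell surrounding each sampling location.
    x0, y0 = torch.floor(x), torch.floor(y)
    x1, y1 = x0 + 1, y0 + 1

    # Bilinear interpolation weights along each axis.
    wx1, wy1 = x - x0, y - y0
    wx0, wy0 = 1 - wx1, 1 - wy1

    flat = inputs.reshape(n, c, h * w)

    def corner(ix, iy):
        # Out-of-range corners contribute zero (padding_mode="zeros").
        valid = (ix >= 0) & (ix <= w - 1) & (iy >= 0) & (iy <= h - 1)
        ixc = ix.clamp(0, w - 1).long()
        iyc = iy.clamp(0, h - 1).long()
        idx = (iyc * w + ixc).reshape(n, 1, h_out * w_out)
        idx = idx.expand(n, c, h_out * w_out)
        vals = torch.gather(flat, 2, idx).reshape(n, c, h_out, w_out)
        return vals * valid.to(inputs.dtype).unsqueeze(1)

    return (
        corner(x0, y0) * (wx0 * wy0).unsqueeze(1)
        + corner(x1, y0) * (wx1 * wy0).unsqueeze(1)
        + corner(x0, y1) * (wx0 * wy1).unsqueeze(1)
        + corner(x1, y1) * (wx1 * wy1).unsqueeze(1)
    )


# Compare against the built-in on random data, including out-of-range points.
image = torch.randn(2, 3, 32, 32)
grid = torch.rand(2, 64, 64, 2) * 2.4 - 1.2
reference = torch.nn.functional.grid_sample(image, grid, align_corners=True)
max_abs_diff = (grid_sample_bilinear(image, grid) - reference).abs().max()
```

The four corner lookups are expressed as gather calls on the flattened image, so no per-pixel loop or list indexing is needed.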

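Swift & CoreML

Register a custom layer responsible for running grid_sample. A CPU-only implementation would be fine (although using Apple's Metal for GPU speedups would be great).

As I'm not into Swift, I have gathered a few resources which might help you:

https://coremltools.readme.io/docs/custom-operators - starting point, Python only and quite easy, just registering the layer for conversion
https://developer.apple.com/documentation/coreml/mlcustomlayer - API of the layer which would have to be coded in Swift
https://developer.apple.com/documentation/coreml/core_ml_api/creating_a_custom_layer - more about the above (but not much)
https://machinethink.net/blog/coreml-custom-layers/ - blog post with examples and with dispatching the layer to devices (GPU, CPU). Requires Swift (CPU version) and Metal (GPU implementation). The final Metal implementation could be based on PyTorch's CUDA impl; the CPU one and Swift might relate as well. It is 3 years old, so be aware. The swish activation layer seems to be a good starting point (other posts by the same author also shed some light on CoreML itself).
https://github.com/hollance/CoreML-Custom-Layers - repo for the above

Pros:

Possibility to use loops and finer control over the algorithm
Might be easier, as we are not limited to the operations CoreML can currently read

Cons:

Two languages
Sparse documentation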

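Solution

Well, this is not an exact answer but rather some research. grid_sample is in essence a sparse matrix operation, and the idea is to try to make it dense. The code below demonstrates how that can be done. It may be slow, and it requires grid to be static so that grid_sample can be eliminated from the model to be converted, but it kind of works.

The goal is to get our transformation into linear form. To obtain the dense matrix, we feed a unit diagonal (one one-hot image per input pixel) to grid_sample; the result is the matrix holding the transform we are looking for. To apply that transform, multiply the flattened image by this matrix. As you can see, batch=1 here: the conversion has to be done for each grid independently.

Your code: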

import torch

in_sz = 2
out_sz = 4
batch = 1
ch = 3

class GridSample(torch.nn.Module):
    def forward(self, inputs, grid):
        # Rest could be the default behaviour, e.g. bilinear
        return torch.nn.functional.grid_sample(inputs, grid, align_corners=True)

image = torch.randn(batch, ch, in_sz, in_sz)  # (batch, in_channels, height, width)
# grid_sample expects a grid of shape (batch, out_height, out_width, 2)
grid = torch.randint(low=-1, high=2, size=(batch, out_sz, out_sz, 2)).float()

layer = GridSample()
scripted = torch.jit.trace(layer, (image, grid))
print(scripted(image, grid))

Out:

tensor([[[[-0.8226,-0.4457,-0.3382,-0.0795],[-0.4457,-0.0052,-0.8226,-0.6341],[-0.4510,-0.0424]],[[-1.0090,-1.6029,-1.3813,-0.1212],[-1.6029,-2.7920,-1.0090,-1.3060],[-0.5651,-1.4566]],[[ 0.1482,0.7313,0.8916,1.8723],[ 0.7313,0.8144,0.1482,0.4398],[ 1.0103,1.3434]]]])

Conversion:

oness = torch.ones(in_sz * in_sz)
# Each channel of diagg is a one-hot image: (1, in_sz*in_sz, in_sz, in_sz)
diagg = torch.diag(oness).reshape(1, in_sz * in_sz, in_sz, in_sz)
# Sampling the one-hot images gives the dense (out_sz*out_sz, in_sz*in_sz) matrix
denser = torch.nn.functional.grid_sample(diagg, grid, align_corners=True).reshape(in_sz * in_sz, out_sz * out_sz).transpose(0, 1)
print(denser.shape)
print(image.shape)
image_flat = image.reshape(batch, ch, in_sz * in_sz)
print(image_flat.shape)
print(torch.nn.functional.linear(image_flat, denser).reshape(batch, ch, out_sz, out_sz))

Out:

torch.Size([16, 4])
torch.Size([1, 3, 2, 2])
torch.Size([1, 3, 4])
tensor([[[[-0.8226,1.3434]]]])
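
The dense-matrix idea above could be packaged as a module whose forward pass contains only a reshape and a linear op, which is the part that would be handed to the converter. The sketch below is an illustration under assumptions: the class name DenseGridSample and its constructor arguments are hypothetical, the grid is fixed at construction time and shared by all batch items, and conversion of this module with coremltools has not been tested here.

```python
import torch
import torch.nn.functional as F


class DenseGridSample(torch.nn.Module):
    """Applies one fixed sampling grid as a matrix multiplication (sketch)."""

    def __init__(self, grid, in_h, in_w):
        super().__init__()
        # grid: (1, out_h, out_w, 2), static; in_h, in_w: input spatial size.
        n_in = in_h * in_w
        self.out_h, self.out_w = grid.shape[1], grid.shape[2]
        # One one-hot image per input pixel, stacked as channels.
        basis = torch.eye(n_in).reshape(1, n_in, in_h, in_w)
        sampled = F.grid_sample(basis, grid, align_corners=True)
        # Row p of the matrix holds the weights that produce output pixel p.
        matrix = sampled.reshape(n_in, self.out_h * self.out_w).transpose(0, 1)
        self.register_buffer("matrix", matrix.contiguous())

    def forward(self, inputs):
        n, c, h, w = inputs.shape
        flat = inputs.reshape(n, c, h * w)
        out = F.linear(flat, self.matrix)  # (n, c, out_h * out_w)
        return out.reshape(n, c, self.out_h, self.out_w)


# Usage: precompute the matrix once, then compare with grid_sample.
image = torch.randn(2, 3, 5, 7)
grid = torch.rand(1, 4, 6, 2) * 2 - 1
dense_layer = DenseGridSample(grid, in_h=5, in_w=7)
reference = F.grid_sample(image, grid.expand(2, -1, -1, -1), align_corners=True)
max_abs_diff = (dense_layer(image) - reference).abs().max()
```

The price of this design is memory: the matrix has out_h * out_w rows and in_h * in_w columns, so a 32x32 input sampled to 64x64 already needs 4096 x 1024 weights.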

Well, it may not be very efficient, but I hope it is at least interesting.