Deploying a custom, trained PyTorch BERT model to run on GPU for inference


Thank you for taking the time to read this question.

I'd like some advice on deploying a custom, trained PyTorch BERT model that runs on GPU for inference (no training needed; the model is already saved as a .pt file).
I've searched through various AWS docs and found links like these:
https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/scikit_bring_your_own
https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/pytorch_extending_our_containers
https://github.com/aws-samples/amazon-sagemaker-bert-pytorch

First of all, I don't know whether I even need to build a container for a daily automated batch inference job. I included the last link because in that example they don't build a container at all.
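
To make the batch part concrete, this is roughly what I imagine the daily job would look like if the container route is right and I drive it with SageMaker Batch Transform (just a sketch; the job/model names, S3 paths, and instance type are placeholders, and it assumes a SageMaker Model has already been created from my ECR image and .pt artifact):

import boto3

sm = boto3.client('sagemaker')

# All names and paths below are placeholders.
sm.create_transform_job(
    TransformJobName='product-review-batch-2021-01-01',
    ModelName='product-review-model',  # a Model wrapping my ECR image + .pt file
    TransformInput={
        'DataSource': {'S3DataSource': {'S3DataType': 'S3Prefix',
                                        'S3Uri': 's3://my-bucket/reviews/input/'}},
        'ContentType': 'text/csv',
    },
    TransformOutput={'S3OutputPath': 's3://my-bucket/reviews/output/'},
    TransformResources={'InstanceType': 'ml.p3.2xlarge', 'InstanceCount': 1},
)

Is this the right direction, or do people usually schedule invocations against a real-time endpoint instead?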

If so — I've tried to follow the tutorials and build a container directory with the following structure:

container/

Dockerfile
build_and_push.sh
review-classification/

  • predictor.py
  • serve
  • wsgi.py
  • nginx.conf
  • other supporting Python scripts for predictor.py
  • model (a folder holding the saved .pt file)

But right now I'm confused about a few things:

  1. There are many Dockerfile examples online; some start from python, some from ubuntu, and some from the pytorch_training images in AWS's accounts. I picked the one Hugging Face uses for pytorch-gpu: nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04. Question: does it matter which base image I use? Do I also need to write lines such as ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/program and ENV SAGEMAKER_PROGRAM review-classification/serve, or do these only take effect when using the pytorch_training image? (I plan to sanity-check the GPU side with the script after the Dockerfile below.)
  2. I created an image in my ECR with the build_and_push.sh file, but how do I know whether it is set up correctly? (See the check after the script below.)
  3. Does the serve code matter? For now I took the serve code from the first link. It says you normally don't need to modify anything in serve, but as far as I can tell its parameters are set for CPU (the worker count defaults to the number of CPU cores). Do I need to modify it for GPU, and if so, how? (See the sketch right after this list.)
  4. What should the next steps be?
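
For question 3, what I had in mind is capping the gunicorn worker count by the number of visible GPUs instead of CPU cores, so that several workers don't compete for one device. A minimal sketch of the change I was considering in serve (assuming torch is importable in the serving image):

import multiprocessing
import os

import torch  # assumption: torch is installed in the serving image

cpu_count = multiprocessing.cpu_count()
gpu_count = torch.cuda.device_count()

# One worker per GPU when GPUs are visible; otherwise fall back to CPU cores.
default_workers = gpu_count if gpu_count > 0 else cpu_count
model_server_workers = int(os.environ.get('MODEL_SERVER_WORKERS', default_workers))

Is that reasonable, or does the worker count not actually matter for GPU serving?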

My current Dockerfile looks like this:

# https://hub.docker.com/r/huggingface/transformers-pytorch-gpu/dockerfile
FROM nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
# FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.5.0-gpu-py36-cu101-ubuntu16.04
# FROM 785573368785.dkr.ecr.us-east-1.amazonaws.com/sagemaker-inference-pytorch:1.5.0f-gpu-py3

LABEL maintainer="bqge@amazon.com"
LABEL project="product-review-models"

RUN apt update && \
    apt install -y bash \
                   build-essential \
                   git \
                   curl \
                   wget \
                   nginx \
                   ca-certificates \
                   python3 \
                   python3-pip && \
    rm -rf /var/lib/apt/lists/*

# Here we get all python packages.
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
    python3 -m pip install --no-cache-dir \
        mkl \
        torch==1.5.0 \
        transformers==2.11.0 \
        path \
        scikit-learn \
        xlrd \
        spacy==2.1.0 \
        flask \
        gevent \
        gunicorn \
        pandas \
        ipython \
        neuralcoref==4.0 && \
    python3 -m spacy download en_core_web_md

# RUN rm -f /usr/bin/python && ln -s /usr/bin/python3 /usr/bin/python
# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering our standard
# output stream, which means that logs can be delivered to the user quickly. PYTHONDONTWRITEBYTECODE
# keeps Python from writing the .pyc files which are unnecessary in this case. We also update
# PATH so that the train and serve programs are found when the container is invoked.

ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"

# Set up the program in the image.
# /opt/ml and all its subdirectories are used by SageMaker; we store our user code in /opt/program.
COPY review-classification /opt/program
WORKDIR /opt/program

# this environment variable is used by the SageMaker PyTorch container to determine our user code directory.
ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/program

# this environment variable is used by the SageMaker PyTorch container to determine our program entry point
# for training and serving.
ENV SAGEMAKER_PROGRAM review-classification/serve

ENTRYPOINT ["/usr/bin/python3", "/opt/program/serve"]
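
For question 1, to sanity-check that the base image actually exposes the GPU to PyTorch, I was planning to run a tiny script like this inside the built container (e.g. docker run --gpus all <image> python3 check_gpu.py; the filename is mine):

# check_gpu.py -- quick check that torch sees the GPU inside the container
import torch

print('torch version:', torch.__version__)
print('CUDA available:', torch.cuda.is_available())
print('device count:', torch.cuda.device_count())
if torch.cuda.is_available():
    print('device name:', torch.cuda.get_device_name(0))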

My current build_and_push.sh looks like this:

#!/usr/bin/env bash

# This script shows how to build the Docker image and push it to ECR to be ready for use
# by SageMaker.

# The name of our algorithm
algorithm_name=product-review-repo

# parameters
PY_VERSION="py36"


account=$(aws sts get-caller-identity --query Account --output text)

if [ $? -ne 0 ]
then
    exit 255
fi

cd SageMaker/container

chmod +x review-classification/serve

# Get the region defined in the current configuration (default to us-east-1 if none defined)
region=$(aws configure get region)
region=${region:-us-east-1}

TAG="gpu-${PY_VERSION}"

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:${TAG}"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

echo "---> repository done.."
# Get the login password from ECR and log in to the registry host (not the full image name)
aws ecr get-login-password --region ${region} | docker login --username AWS --password-stdin ${account}.dkr.ecr.${region}.amazonaws.com

echo "---> logged in to account ecr.."

echo "Building image with arch=gpu,region=${region}"


# Build the docker image locally with the image name and then push it to ECR
# with the full name.

docker build -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}
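
For question 2, the only check I could come up with after pushing is to list the image back from ECR, e.g. with a small boto3 snippet like this (repository name and tag match the script above; the region is an assumption):

import boto3

ecr = boto3.client('ecr', region_name='us-east-1')  # region assumed to match the script

# Confirm the tag actually landed in the repository.
resp = ecr.describe_images(repositoryName='product-review-repo',
                           imageIds=[{'imageTag': 'gpu-py36'}])
for detail in resp['imageDetails']:
    print(detail['imageTags'], detail.get('imageSizeInBytes'))

Is there anything beyond this (and a local docker run) that tells me the image is wired up correctly for SageMaker?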

Below is my serve code:

#!/usr/bin/env python

# This file implements the scoring service shell. You don't necessarily need to modify it for various
# algorithms. It starts nginx and gunicorn with the correct configurations and then simply waits until
# gunicorn exits.
#
# The flask server is specified to be the app object in wsgi.py
#
# We set the following parameters:
#
# Parameter                Environment Variable              Default Value
# ---------                --------------------              -------------
# number of workers        MODEL_SERVER_WORKERS              the number of CPU cores
# timeout                  MODEL_SERVER_TIMEOUT              60 seconds

from __future__ import print_function
import multiprocessing
import os
import signal
import subprocess
import sys

cpu_count = multiprocessing.cpu_count()

model_server_timeout = os.environ.get('MODEL_SERVER_TIMEOUT', 60)
model_server_workers = int(os.environ.get('MODEL_SERVER_WORKERS', cpu_count))

def sigterm_handler(nginx_pid, gunicorn_pid):
    try:
        os.kill(nginx_pid, signal.SIGQUIT)
    except OSError:
        pass
    try:
        os.kill(gunicorn_pid, signal.SIGTERM)
    except OSError:
        pass

    sys.exit(0)

def start_server():
    print('Starting the inference server with {} workers.'.format(model_server_workers))


    # link the log streams to stdout/err so they will be logged to the container logs
    subprocess.check_call(['ln', '-sf', '/dev/stdout', '/var/log/nginx/access.log'])
    subprocess.check_call(['ln', '-sf', '/dev/stderr', '/var/log/nginx/error.log'])

    nginx = subprocess.Popen(['nginx', '-c', '/opt/program/nginx.conf'])
    gunicorn = subprocess.Popen(['gunicorn',
                                 '--timeout', str(model_server_timeout),
                                 '-k', 'gevent',
                                 '-b', 'unix:/tmp/gunicorn.sock',
                                 '-w', str(model_server_workers),
                                 'wsgi:app'])

    signal.signal(signal.SIGTERM, lambda a, b: sigterm_handler(nginx.pid, gunicorn.pid))

    # If either subprocess exits, so do we.
    pids = set([nginx.pid, gunicorn.pid])
    while True:
        pid, _ = os.wait()
        if pid in pids:
            break

    sigterm_handler(nginx.pid, gunicorn.pid)
    print('Inference server exiting')

# The main routine just invokes the start function.

if __name__ == '__main__':
    start_server()
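
Finally, in case it helps, this is roughly how I intend to load the .pt file onto the GPU inside predictor.py (a sketch with illustrative names; my real file also tokenizes the reviews first):

import torch

# SageMaker mounts model artifacts under /opt/ml/model; the filename is a placeholder.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = torch.load('/opt/ml/model/model.pt', map_location=device)
model.to(device)
model.eval()

def predict(input_ids, attention_mask):
    # Move inputs to the model's device and run without gradients.
    with torch.no_grad():
        return model(input_ids.to(device), attention_mask=attention_mask.to(device))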

Thank you very much!

