How do I generate sentences from nn.Transformer?

I am trying to get a Transformer model to generate sentences. At the moment I am attempting to generate them with beam search.

But my model only ever produces one sentence, made up of nothing but the <sos> token.

That token marks the beginning of a sentence.

I do not understand why it is the only thing being generated. Does anyone have a good solution?

Here is the code for the Transformer model.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import TransformerEncoder,TransformerEncoderLayer,TransformerDecoder,TransformerDecoderLayer

class TransformerModel(nn.Module):

    def __init__(self,in_token,out_token,ninp,nhead,nhid,nlayers,dropout=0.5):
        super(TransformerModel,self).__init__()
        self.model_type = 'Transformer'
        self.pos_encoder = PositionalEncoding(ninp,dropout)
        encoder_layers = TransformerEncoderLayer(ninp,nhead,nhid,dropout)
        self.transformer_encoder = TransformerEncoder(encoder_layers,nlayers)
        self.encoder = nn.Embedding(in_token,ninp)
        self.decoder = nn.Embedding(out_token,ninp)
        self.ninp = ninp
        self.linear = nn.Linear(ninp,out_token)
        decoder_layers = TransformerDecoderLayer(ninp,nhead,nhid,dropout)
        # self.linear projects decoder outputs to the target vocabulary; it is applied as the decoder's final norm
        self.transformer_decoder = TransformerDecoder(decoder_layers,nlayers,norm = self.linear)

        self.init_weights()

    def generate_square_subsequent_mask(self,sz):
        mask = (torch.triu(torch.ones(sz,sz)) == 1).transpose(0,1)
        mask = mask.float().masked_fill(mask == 0,float('-inf')).masked_fill(mask == 1,float(0.0))
        return mask

    def init_weights(self):
        initrange = 0.1
        self.encoder.weight.data.uniform_(-initrange,initrange)
        # self.decoder.bias.data.zero_()
        self.decoder.weight.data.uniform_(-initrange,initrange)

    def forward(self,src,trg):

        src_mask = self.generate_square_subsequent_mask(src.size()[0]).to(device)
        trg_mask = self.generate_square_subsequent_mask(trg.size()[0]).to(device)
        src = self.encoder(src)
        trg = self.decoder(trg)
        src = self.pos_encoder(src)
        trg = self.pos_encoder(trg)
        output = self.transformer_encoder(src,mask = src_mask)
        output = self.transformer_decoder(trg,output,tgt_mask = trg_mask)

        return output

class PositionalEncoding(nn.Module):

    def __init__(self,d_model,dropout=0.1,max_len=5000):
        super(PositionalEncoding,self).__init__()
        self.dropout = nn.Dropout(p=dropout)

        pe = torch.zeros(max_len,d_model)
        position = torch.arange(0,max_len,dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0,d_model,2).float() * (-math.log(10000.0) / d_model))
        #print(pe[:,0::2].size())
        #print(pe[:,1::2].size())
        #print((position*div_term).size())
        pe[:,0::2] = torch.sin(position * div_term)
        pe[:,1::2] = torch.cos(position * div_term)
        pe = pe.unsqueeze(0).transpose(0,1)
        self.register_buffer('pe',pe)

    def forward(self,x):
        x = x + self.pe[:x.size(0),:]
        return self.dropout(x)
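
# Quick shape check for the positional encoding (my own sketch, not part of the
# original post; assumes the PositionalEncoding class defined above).
# The buffer should be (max_len, 1, d_model) and forward() should keep x's shape.
pe_check = PositionalEncoding(d_model=512, dropout=0.0)
print(pe_check.pe.shape)                         # expected: torch.Size([5000, 1, 512])
print(pe_check(torch.zeros(10, 2, 512)).shape)   # expected: torch.Size([10, 2, 512])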

# these are the hyperparameters of the model

in_ntokens = len(SRC.vocab.stoi) # the size of the source vocabulary
out_ntokens = len(TRG.vocab.stoi)
emsize = 512 # embedding dimension
nhid = 512 # the dimension of the feedforward network model in nn.TransformerEncoder
nlayers = 1 # the number of nn.TransformerEncoderLayer in nn.TransformerEncoder
nhead = 2 # the number of heads in the multiheadattention models
dropout = 0.3 # the dropout value
model = TransformerModel(in_ntokens,out_ntokens,emsize,nhead,nhid,nlayers,dropout).to(device)
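
Before running generation, a quick forward-pass smoke test can confirm that the shapes line up (a minimal sketch, assuming the model, the vocabulary sizes and the device defined above; it is not part of the original training script):

dummy_src = torch.randint(0, in_ntokens, (10, 2)).to(device)    # (src_len, batch) of token indices
dummy_trg = torch.randint(0, out_ntokens, (7, 2)).to(device)    # (trg_len, batch) of token indices
with torch.no_grad():
    out = model(dummy_src, dummy_trg)
print(out.shape)   # expected: torch.Size([7, 2, out_ntokens]), since the decoder's final norm is the vocabulary projection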

Here is the code that generates a sentence.

def gen_sentence(sentence,src_field,trg_field,model,max_len = 50):
    model.eval()

    tokens = [src_field.init_token] + \
        tokenizer(sentence) + [src_field.eos_token]
    src = [src_field.vocab.stoi[i] for i in tokens]
    src = torch.LongTensor([src])
    src = torch.t(src)
    src = src.to(device)

    src_tensor = model.encoder(src)
    src_tensor = model.pos_encoder(src_tensor).to(device)
    with torch.no_grad():
        src_output = model.transformer_encoder(src_tensor)#,mask = src_mask)

    trg = trg_field.vocab.stoi[trg_field.init_token]
    trg = torch.LongTensor([[trg]]).to(device)
    output = []
    for i in range(max_len):
        trg_tensor = model.decoder(trg)
        trg_tensor = model.pos_encoder(trg_tensor).to(device)
        trg_mask = model.generate_square_subsequent_mask(trg_tensor.size()[0]).to(device)
        with torch.no_grad():
            pred = model.transformer_decoder(trg_tensor,src_output,tgt_mask = trg_mask)
        pred_word_index = pred.argmax(2)[-1].item()   # index of the most likely next token
        output.append(pred_word_index)
        if pred_word_index == trg_field.vocab.stoi[trg_field.eos_token]:
            break

        last_index = torch.LongTensor([[pred_word_index]]).to(device)
        trg = torch.cat((trg,last_index))

    # predict = "".join(output)
    predict = [trg_field.vocab.itos[i] for i in output]
    predict = "".join(predict)
    # print(predict)

    return predict

# this is how the function is called
sentence = "あまりにも気温が高いということでそれをやめてですね電車で来ました "
gen_sentence(sentence,SRC,TRG,model)

Here is the output.

<sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos><sos>
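
For reference, the indices that the special tokens map to in the target vocabulary can be checked like this (a small sanity check, assuming SRC and TRG are the torchtext fields used above; the printed indices are only examples):

sos_idx = TRG.vocab.stoi[TRG.init_token]
eos_idx = TRG.vocab.stoi[TRG.eos_token]
print(sos_idx, TRG.vocab.itos[sos_idx])   # e.g. 2 <sos>
print(eos_idx, TRG.vocab.itos[eos_idx])   # e.g. 3 <eos>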
