How to specify the number of units in an LSTM layer in PyTorch
I don't fully understand the LSTM layer in PyTorch. When instantiating an LSTM layer, how do I specify the number of LSTM units inside that layer? My first thought was that it is the num_layers argument, assuming the LSTM cells are stacked vertically. But if that were the case, how would we implement, for example, a stacked LSTM with two layers of 8 units each?
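For reference, in PyTorch the per-layer unit count is controlled by hidden_size, while num_layers controls vertical stacking. A minimal sketch of a two-layer stacked LSTM with 8 hidden units per layer (input_size and sequence length here are arbitrary demonstration values):

```python
import torch
import torch.nn as nn

# "Units" per layer is hidden_size; vertical stacking is num_layers.
lstm = nn.LSTM(input_size=10, hidden_size=8, num_layers=2, batch_first=True)

x = torch.randn(1, 5, 10)  # (batch, timesteps, features)
out, (h_n, c_n) = lstm(x)

print(out.shape)  # torch.Size([1, 5, 8])  - top layer's output at every timestep
print(h_n.shape)  # torch.Size([2, 1, 8])  - final hidden state of each of the 2 layers
```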
Solution
The number of cells in an LSTM (or RNN, or GRU) is the number of timesteps your input has/needs. For example, when you want to run the word "hello" through the LSTM function in PyTorch, you just convert it to a vector (with one-hot encoding or embeddings) and pass that vector through the LSTM function. It will then iterate over all the embedded characters ("h", "e", "l", ...) under the hood. Each input can even have a different number of timesteps/cells: for example, passing "hello" and then "Joe" requires a different number of iterations (5 for "hello", 3 for "Joe"). So, as you can see, there is no need to specify a fixed number of cells! I hope the answer satisfies you. :)
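That point about varying timestep counts can be sketched like this (random tensors stand in for real character embeddings; sizes are arbitrary demonstration values):

```python
import torch
import torch.nn as nn

embedding_dim = 16
lstm = nn.LSTM(input_size=embedding_dim, hidden_size=32, batch_first=True)

# The same LSTM handles sequences of different lengths:
# it simply iterates over however many timesteps the input has.
hello = torch.randn(1, 5, embedding_dim)  # "hello" -> 5 timesteps
joe = torch.randn(1, 3, embedding_dim)    # "Joe"   -> 3 timesteps

out_hello, _ = lstm(hello)
out_joe, _ = lstm(joe)
print(out_hello.shape)  # torch.Size([1, 5, 32]) - one output per timestep
print(out_joe.shape)    # torch.Size([1, 3, 32])
```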
Edit
An example:
sentence = "Hey Im Joe"
embedding_size = 300
batch_size = 1  # batch size 1 for demonstration

# create_embedding is a placeholder for your own embedding lookup;
# iterate over words, not characters
input = [create_embedding(word, dims=embedding_size) for word in sentence.split()]
# the LSTM will need three timesteps (cells) to process that sentence
input = torch.stack(input).reshape(batch_size, len(sentence.split()), embedding_size)

hidden_size = 256
layers = 2
lstm = nn.LSTM(input_size=embedding_size, hidden_size=hidden_size, num_layers=layers, dropout=0.5, batch_first=True)

# initialize the hidden state (must be a tuple of tensors with these dimensions)
hidden = (torch.zeros(layers, batch_size, hidden_size),
          torch.zeros(layers, batch_size, hidden_size))

outputs, hidden = lstm(input, hidden)
# outputs now contains the output of every timestep
# for classification you can take the output of the last timestep, like this:
output = outputs[:, -1]
So what happens inside that outputs, hidden = lstm(input, hidden) line?
Again in pseudocode:
# inside the LSTM function
# say, for demonstration, the embedding of "Hey" is a, of "Im" is b, and of "Joe" is c
input = [a, b, c]

def LSTM(input_sentence):
    hidden = ...
    outputs = []
    # each iteration is one timestep/cell
    for embedded_word in input_sentence:
        output, hidden = neural_network(embedded_word, hidden)
        outputs.append(output)
    # return all outputs and the hidden state (which you normally don't need)
    return outputs, hidden
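The loop above is roughly what PyTorch's nn.LSTMCell lets you write out explicitly. A runnable sketch of the same idea, using random tensors as stand-in word embeddings:

```python
import torch
import torch.nn as nn

embedding_dim, hidden_size = 16, 32
cell = nn.LSTMCell(input_size=embedding_dim, hidden_size=hidden_size)

# three embedded "words", i.e. three timesteps, like [a, b, c] above
sentence = [torch.randn(1, embedding_dim) for _ in range(3)]

h = torch.zeros(1, hidden_size)
c = torch.zeros(1, hidden_size)
outputs = []
for embedded_word in sentence:  # each iteration is one timestep/cell
    h, c = cell(embedded_word, (h, c))
    outputs.append(h)

print(len(outputs))       # 3 - one output per timestep
print(outputs[-1].shape)  # torch.Size([1, 32])
```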
Is it clear now what the LSTM function does and how to use it?