How to specify the number of units in an LSTM layer in PyTorch
I don't fully understand the LSTM layer in PyTorch. When instantiating an LSTM layer, how do I specify the number of LSTM units inside that layer? My first thought was that it is the num_layers argument, assuming the LSTM cells are stacked vertically. But if that were the case, how would we implement, for example, a stacked LSTM with two layers of 8 units each?
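For reference, in PyTorch the per-layer unit count is controlled by hidden_size, while num_layers controls vertical stacking. A minimal sketch of a two-layer stacked LSTM with 8 hidden units per layer (input_size and sequence length here are arbitrary demonstration values):

```python
import torch
import torch.nn as nn

# "Units" per layer is hidden_size; vertical stacking is num_layers.
lstm = nn.LSTM(input_size=10, hidden_size=8, num_layers=2, batch_first=True)

x = torch.randn(1, 5, 10)  # (batch, timesteps, features)
out, (h_n, c_n) = lstm(x)

print(out.shape)  # torch.Size([1, 5, 8])  - top layer's output at every timestep
print(h_n.shape)  # torch.Size([2, 1, 8])  - final hidden state of each of the 2 layers
```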
Solution
The number of cells in an LSTM (or RNN, or GRU) is the number of timesteps your input has/needs. For example, when you want to run the word "hello" through the LSTM function in PyTorch, you just convert it to a vector (with one-hot encoding or embeddings) and pass that vector through the LSTM function. It will then iterate over all the embedded characters ("h", "e", "l", ...) under the hood. Each input can even have a different number of timesteps/cells: for example, passing "hello" and then "Joe" requires a different number of iterations (5 for "hello", 3 for "Joe"). So, as you can see, there is no need to specify a fixed number of cells! I hope the answer satisfies you. :)
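That point about varying timestep counts can be sketched like this (random tensors stand in for real character embeddings; sizes are arbitrary demonstration values):

```python
import torch
import torch.nn as nn

embedding_dim = 16
lstm = nn.LSTM(input_size=embedding_dim, hidden_size=32, batch_first=True)

# The same LSTM handles sequences of different lengths:
# it simply iterates over however many timesteps the input has.
hello = torch.randn(1, 5, embedding_dim)  # "hello" -> 5 timesteps
joe = torch.randn(1, 3, embedding_dim)    # "Joe"   -> 3 timesteps

out_hello, _ = lstm(hello)
out_joe, _ = lstm(joe)
print(out_hello.shape)  # torch.Size([1, 5, 32]) - one output per timestep
print(out_joe.shape)    # torch.Size([1, 3, 32])
```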
Edit
An example:
sentence = "Hey Im Joe"
embedding_size = 300
batch_size = 1  # batch size 1 for demonstration

# create_embedding is a placeholder for your own embedding lookup;
# iterate over words, not characters
input = [create_embedding(word, dims=embedding_size) for word in sentence.split()]
# the LSTM will need three timesteps (cells) to process that sentence
input = torch.stack(input).reshape(batch_size, len(sentence.split()), embedding_size)

hidden_size = 256
layers = 2
lstm = nn.LSTM(input_size=embedding_size, hidden_size=hidden_size, num_layers=layers, dropout=0.5, batch_first=True)

# initialize the hidden state (must be a tuple of tensors with these dimensions)
hidden = (torch.zeros(layers, batch_size, hidden_size),
          torch.zeros(layers, batch_size, hidden_size))

outputs, hidden = lstm(input, hidden)
# outputs now contains the output of every timestep
# for classification you can take the output of the last timestep, like this:
output = outputs[:, -1]
So what happens inside that outputs, hidden = lstm(input, hidden) line?
Again in pseudocode:
# inside the LSTM function
# say, for demonstration, the embedding of "Hey" is a, of "Im" is b, and of "Joe" is c
input = [a, b, c]

def LSTM(input_sentence):
    hidden = ...
    outputs = []
    # each iteration is one timestep/cell
    for embedded_word in input_sentence:
        output, hidden = neural_network(embedded_word, hidden)
        outputs.append(output)
    # return all outputs and the hidden state (which you normally don't need)
    return outputs, hidden
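The loop above is roughly what PyTorch's nn.LSTMCell lets you write out explicitly. A runnable sketch of the same idea, using random tensors as stand-in word embeddings:

```python
import torch
import torch.nn as nn

embedding_dim, hidden_size = 16, 32
cell = nn.LSTMCell(input_size=embedding_dim, hidden_size=hidden_size)

# three embedded "words", i.e. three timesteps, like [a, b, c] above
sentence = [torch.randn(1, embedding_dim) for _ in range(3)]

h = torch.zeros(1, hidden_size)
c = torch.zeros(1, hidden_size)
outputs = []
for embedded_word in sentence:  # each iteration is one timestep/cell
    h, c = cell(embedded_word, (h, c))
    outputs.append(h)

print(len(outputs))       # 3 - one output per timestep
print(outputs[-1].shape)  # torch.Size([1, 32])
```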
Is it clear now what the LSTM function does and how to use it?