How to stream training in PyTorch by reading tensors from a tensor file on disk
I have some very large input tensors, and I was running into memory problems while building them, so I wrote them one at a time to a .pt file. When I run the script that generates and saves the tensors, the file keeps growing, so I assume the tensors are being saved correctly. Here is that code:
with open(a_sync_save, "ab") as f:
    print("saved")
    torch.save(torch.unsqueeze(torch.cat(tensors, dim=0), 0), f)
I want to read a fixed number of these tensors from the file at a time, because I don't want to run into memory problems again. But when I try to read back the tensors saved to the file, I only manage to get the first one:
with open(a_sync_save, "rb") as f:
    for tensor in torch.load(f):
        print(tensor.shape)
The output here is the shape of the first tensor, after which the loop promptly exits.
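The root of the problem is that torch.load deserializes exactly one saved object per call, so iterating over its return value only walks through the first tensor that was appended. The solution below reads records back with pickle.load, which implies the records were appended with pickle.dump; here is a minimal sketch of that append-and-stream pattern (the file name records.pkl, the tensor shape, and the (tensor, label) record layout are assumptions of this example, not part of the original post):

import pickle
import torch

# Append one (tensor, label) record per dump; mode "ab" keeps earlier records intact.
with open("records.pkl", "ab") as f:
    pickle.dump((torch.randn(1, 10, 1000), 1.0), f)

# Stream the records back one at a time; pickle.load consumes exactly one
# record per call and raises EOFError once the file is exhausted.
with open("records.pkl", "rb") as f:
    while True:
        try:
            tensor, label = pickle.load(f)
            print(tensor.shape, label)
        except EOFError:
            break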
Solution
Here is the code I used to answer this question. A lot of it is specific to what I am doing, but the gist of it should be usable by anyone running into the same problem.
import pickle

import torch
import torch.nn as nn
import torch.optim as optim


def stream_training(filepath, epochs=100):
    """
    :param filepath: file path of the pkl file
    :param epochs: number of epochs to run
    """
    def training(train_dataloader, model_obj, criterion, optimizer):
        for j, data in enumerate(train_dataloader, start=0):
            # get the inputs; data is a list of [inputs, labels]
            inputs, labels = data
            inputs, labels = inputs.cuda(), labels.cuda()
            outputs = model_obj(inputs.float())
            outputs = torch.flatten(outputs)
            loss = criterion(outputs, labels.float())
            print(loss)
            # zero the parameter gradients
            optimizer.zero_grad()
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model_obj.parameters(), max_norm=1)
            optimizer.step()

    tensors = []
    expected_values = []
    model = Model(1000, 1, 256, 1)
    model.cuda()
    criterion = nn.BCELoss()
    optimizer = optim.Adam(model.parameters(), lr=0.00001, betas=(0.9, 0.99999),
                           eps=1e-08, weight_decay=0.001, amsgrad=True)
    for i in range(epochs):
        with open(filepath, "rb") as openfile:
            while True:
                try:
                    data_list = pickle.load(openfile)
                    tensors.append(data_list[0])
                    expected_values.append(data_list[1])
                    if len(tensors) % BATCH_SIZE == 0:
                        tensors = torch.cat(tensors, dim=0)
                        tensors = torch.reshape(tensors, (tensors.shape[0], tensors.shape[1], -1))
                        train_loader = make_dataset(tensors, expected_values)  # makes a dataloader for the batch that comes in
                        training(train_loader, model, criterion, optimizer)  # performs forward and back prop
                        tensors = []  # washes out the batch to conserve memory on my computer.
                        expected_values = []
                except EOFError:
                    print("This file has finished training")
                    break
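BATCH_SIZE and make_dataset are not shown in the answer. A minimal sketch of what such a helper might look like, built on torch.utils.data.TensorDataset and DataLoader; the name make_dataset, the batch size value, and the float label dtype are assumptions here:

import torch
from torch.utils.data import DataLoader, TensorDataset

BATCH_SIZE = 32  # assumed value; the answer does not show it

def make_dataset(tensors, expected_values):
    # Pair each input tensor with its label and wrap them in a DataLoader
    # so the inner training() loop can iterate over (inputs, labels) batches.
    labels = torch.tensor(expected_values, dtype=torch.float)
    dataset = TensorDataset(tensors, labels)
    return DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)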
And here is the model, for those interested.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Model(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super(Model, self).__init__()
        # dimensions
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers

        # define the layers
        # GRU
        self.gru = nn.GRU(input_size, hidden_dim, n_layers, batch_first=True)
        self.fc1 = nn.Linear(hidden_dim, hidden_dim)
        self.bn1 = nn.BatchNorm1d(num_features=hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.bn2 = nn.BatchNorm1d(num_features=hidden_dim)
        self.fc3 = nn.Linear(hidden_dim, hidden_dim)
        self.bn3 = nn.BatchNorm1d(num_features=hidden_dim)
        self.fc4 = nn.Linear(hidden_dim, hidden_dim)
        self.bn4 = nn.BatchNorm1d(num_features=hidden_dim)
        self.fc5 = nn.Linear(hidden_dim, hidden_dim)
        self.output = nn.Linear(hidden_dim, output_size)

    def forward(self, x):
        x = x.float()
        x = F.relu(self.gru(x)[1])  # [1] is the final hidden state h_n
        x = x[-1, :, :]  # keep the last layer's hidden state; eliminates the first dim
        x = F.dropout(x, 0.5)
        x = F.relu(self.bn1(self.fc1(x)))
        x = F.dropout(x, 0.5)
        x = F.relu(self.bn2(self.fc2(x)))
        x = F.dropout(x, 0.5)
        x = F.relu(self.bn3(self.fc3(x)))
        x = F.dropout(x, 0.5)
        x = F.relu(self.bn4(self.fc4(x)))
        x = F.dropout(x, 0.5)
        x = F.relu(self.fc5(x))
        return torch.sigmoid(self.output(x))

    def init_hidden(self, batch_size):
        hidden = torch.zeros(self.n_layers, batch_size, self.hidden_dim)
        return hidden
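To sanity check the wiring, a dummy forward pass can be run on CPU; the batch and sequence sizes below are arbitrary choices for this example:

import torch

model = Model(input_size=1000, output_size=1, hidden_dim=256, n_layers=1)
x = torch.randn(4, 10, 1000)  # (batch, seq_len, input_size), since batch_first=True
out = model(x)
print(out.shape)  # torch.Size([4, 1]), one sigmoid probability per sample

One caveat worth knowing: F.dropout defaults to training=True, so these dropout layers stay active even after model.eval(); the nn.Dropout module respects eval mode if that matters for inference.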