How do I fix LSTM predictions that come out very low?
I need to predict the workload of a data center with N virtual machines. The data is structured as follows:
id,date,hour,dayofweek,cpu,ram,ram_tot,users,id_vm
5fff03b99b56dba65a873e2a,2020-12-14,00:00,1,2,820,8000,10,1
5fff03ba9b56dba65a873e2c,2458,16000,2
The data includes: id, date, hour, day of week (1-7), number of CPUs of the VM, RAM used, total RAM, number of users connected to that VM, and the VM id (1 or 2). This is imported into a pandas DataFrame. In the DataFrame I build a column called peak, whose value is 1 if the VM is under heavy load (the % of RAM used is very high, > 80%) and 0 otherwise. I build a time-series dataset and normalize it. I then build an LSTM network, with a training and a test phase, to predict whether a workload peak will occur (the predicted variable is peak). In the validation phase I get very bad results: the predicted values are very low compared to the actual ones. I expected that if the network predicts a peak well, the corresponding value would be close to 1.
Here is my code:
# imports assumed by this snippet (not shown in the original)
import numpy as np
from pandas import DataFrame
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
import matplotlib.pyplot as plt

# read data from a MongoDB cursor into a pandas DataFrame
df = DataFrame(list_cur)
# calc for %mem used
df['pmem'] = (df['ram']/df['ram_tot'])*100
conditions = [(df['pmem'] <= 80),(df['pmem'] > 80)] #80
values = [0,1]
df['peak'] = np.select(conditions,values)
df['datetime'] = df['date'] + ' ' + df['hour']
# extract hours and minutes into 2 new columns
df[['hh','mm']] = df.hour.str.split(":",expand=True)
# dataset with 6 features and 1 label
# every row of the dataset = 1 observation
dataset = df[['hh','mm','dayofweek','users','pmem','id_vm','peak']]
# normalization of the dataset
sc = MinMaxScaler(feature_range = (0,1))
dfn = sc.fit_transform(dataset)
# build the time series: each sample looks back n_steps rows
x = []
y = []
n_steps = 192
for i in range(len(dfn)):
    # find the end of this pattern
    end_ix = i + n_steps
    # stop when the window would run past the end of the sequence
    if end_ix > len(dfn)-1:
        break
    # gather the 6 feature columns as input and the 'peak' column as output
    seq_x,seq_y = dfn[i:end_ix,0:6],dfn[end_ix,6]
    x.append(seq_x)
    y.append(seq_y)
# splitting dataset in train and test
X_train,X_test,y_train,y_test = train_test_split(x,y,test_size=0.33,random_state=42)
# convert in arrays
X_train = np.asarray(X_train,dtype=np.float32)
X_test = np.asarray(X_test,dtype=np.float32)
y_train = np.asarray(y_train,dtype=np.float32)
y_test = np.asarray(y_test,dtype=np.float32)
# LSTM neural network model
model = Sequential()
#Adding the first LSTM layer and some Dropout regularisation
model.add(LSTM(units = 6,return_sequences = True,input_shape = (X_train.shape[1],X_train.shape[2])))
model.add(Dropout(0.2))
# Adding a second LSTM layer and some Dropout regularisation
model.add(LSTM(units = 32,return_sequences = True))
model.add(Dropout(0.2))
# Adding a third LSTM layer and some Dropout regularisation
model.add(LSTM(units = 64,return_sequences = True))
model.add(Dropout(0.2))
# Adding a fourth LSTM layer and some Dropout regularisation
model.add(LSTM(units = 32))
model.add(Dropout(0.2))
# Adding the output layer
model.add(Dense(units = 1))
model.summary()
# Compiling the LSTM
model.compile(loss = 'categorical_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
# Fitting the LSTM to the Training set
history = model.fit(X_train,y_train,epochs = 5,batch_size = 32,validation_data=(X_test,y_test))
results = model.evaluate(X_test,y_test,verbose=1,return_dict=True)
print("test loss, test acc:",results)
print("Generate predictions for all samples")
yhat = model.predict(X_test,verbose=1)
plt.figure(figsize=(20,10))
y1 = np.array(y_test)
y2 = np.array(yhat[:,0])
plt.plot(y1,label = "Test",marker="o",linewidth=0)
plt.plot(y2,label = "Predicted",marker="x")
plt.xlabel('sample')
# Set the y axis label of the current axis.
plt.ylabel('peak')
# Set a title of the current axes.
plt.title('Test vs. predicted peak values')
# show a legend on the plot
plt.legend()
# Display a figure.
plt.show()
Here are my results.
Is there an error?
Solution
I'm not sure, but you could try applying the inverse MinMaxScaler transform to the output.
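A minimal sketch of that suggestion, assuming the fitted scaler sc, the 7-column dataset, and the predictions yhat from the question are still in scope. inverse_transform expects the same column layout the scaler was fit on, so the 1-column predictions are padded back into a full-width array first:

import numpy as np

# pad the predictions back to the 7-column layout that sc was fit on
padded = np.zeros((len(yhat), dataset.shape[1]))
padded[:, 6] = yhat[:, 0]  # column 6 held the 'peak' label
yhat_unscaled = sc.inverse_transform(padded)[:, 6]

Note that since peak only takes the values 0 and 1, a MinMaxScaler with feature_range (0,1) maps that column to itself, so the inverse transform should leave these predictions essentially unchanged; if they are still low afterwards, the cause lies elsewhere.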