TensorFlow ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))

How to solve TensorFlow ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))

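I am new to machine learning and thought I would start with Keras. Here I am using binary cross-entropy to classify movie reviews into three classes (1 for positive, 0 for neutral, -1 for negative). When I try to wrap the Keras model with a TensorFlow estimator, I get an error.
The code is as follows: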

import tensorflow as tf
import numpy as np
import pandas as pd
import numpy as K

csvfilename_train = 'train(cleaned).csv'
csvfilename_test = 'test(cleaned).csv'

# Read .csv files as pandas dataframes
df_train = pd.read_csv(csvfilename_train)
df_test = pd.read_csv(csvfilename_test)

train_sentences  = df_train['Comment'].values
test_sentences  = df_test['Comment'].values

# Extract labels from dataframes
train_labels = df_train['Sentiment'].values
test_labels = df_test['Sentiment'].values

vocab_size = 10000
embedding_dim = 16
max_length = 30
trunc_type = 'post'
oov_tok = '<OOV>'

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer(num_words = vocab_size,oov_token = oov_tok)
tokenizer.fit_on_texts(train_sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(train_sentences)
padded = pad_sequences(sequences,maxlen = max_length,truncating = trunc_type)

test_sequences = tokenizer.texts_to_sequences(test_sentences)
test_padded = pad_sequences(test_sequences,maxlen = max_length)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size,embedding_dim,input_length = max_length),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6,activation = 'relu'),
    tf.keras.layers.Dense(2,activation = 'sigmoid'),
])
model.compile(loss = 'binary_crossentropy',optimizer = 'adam',metrics = ['accuracy'])

num_epochs = 10
model.fit(padded,train_labels,epochs = num_epochs,validation_data = (test_padded,test_labels))

The error is as follows:

---> 10 model.fit(padded,test_labels))

and it ends with this:

ValueError: logits and labels must have the same shape ((None,2) vs (None,1))

Solution

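There are several problems with your code. The error message itself arises because the final layer outputs 2 values per sample, shape (None, 2), while each sample has a single label, shape (None, 1). The underlying issues are:

You are using the wrong loss function. Binary cross-entropy loss is meant for binary classification problems, but here you are doing multi-class classification (3 classes: positive, negative, neutral).
Using a sigmoid activation in the last layer is wrong, because the sigmoid function maps logits to the range between 0 and 1, whereas your class labels are 0, 1 and -1. Since sigmoid can only produce values between 0 and 1, the network can never output a negative value and therefore never learns to predict the negative class.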
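The two points above can be checked with a small NumPy sketch. The `sigmoid` helper and the four sample labels are made up for illustration and are not part of the question's code; the sketch only shows that sigmoid outputs stay strictly between 0 and 1 and that two output units per sample do not line up with one label per sample.

```python
import numpy as np

def sigmoid(z):
    """Hypothetical helper: the logistic function applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

# Sigmoid squashes any logit into the open interval (0, 1),
# so a label of -1 can never be matched by the output.
logits = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
probs = sigmoid(logits)
print(probs.min(), probs.max())

# A Dense(2) output gives two values per sample, while the
# labels provide one value per sample: (4, 2) versus (4, 1).
outputs = sigmoid(np.zeros((4, 2)))
labels = np.array([1, 0, -1, 1]).reshape(-1, 1)
print(outputs.shape, labels.shape)
```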

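The correct approach is to treat this as a multi-class classification problem and use categorical cross-entropy loss together with a softmax activation in the final dense layer, which should have 3 units (one per class). Note that categorical cross-entropy loss requires one-hot encoded labels, whereas sparse categorical cross-entropy loss can be used with integer labels.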
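As a rough sketch of the integer-label variant, the model below uses sparse categorical cross-entropy. The random `padded` and `train_labels` arrays are stand-ins for the real data, and shifting the labels by one (so that -1, 0, 1 become class indices 0, 1, 2) is an assumption about how the labels would be prepared, since this loss expects non-negative class indices.

```python
import numpy as np
import tensorflow as tf

vocab_size = 10000
embedding_dim = 16
max_length = 30

# Hypothetical stand-in data: random token ids and labels in {-1, 0, 1}.
rng = np.random.default_rng(0)
padded = rng.integers(0, vocab_size, size=(64, max_length))
train_labels = rng.integers(-1, 2, size=(64,))

# Assumption: shift labels from {-1, 0, 1} to class indices {0, 1, 2}.
class_ids = train_labels + 1

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation='relu'),
    # One softmax unit per class.
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Integer labels of shape (64,) are accepted directly by this loss.
history = model.fit(padded, class_ids, epochs=1, verbose=0)
```

With this loss no one-hot encoding step is needed, which keeps the label pipeline a little simpler.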

Below is the final layer for an example that uses categorical cross-entropy loss.

tf.keras.layers.Dense(3,activation = 'softmax')

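Note the 3 changes:

The loss function is changed to categorical cross-entropy.
The number of units in the final dense layer is 3.
The labels must be one-hot encoded, which can be done with tf.one_hot(train_labels, 3). Because tf.one_hot expects class indices from 0 to 2, the labels -1, 0 and 1 have to be shifted by one first, for example tf.one_hot(train_labels + 1, 3); an index of -1 would otherwise produce an all-zero row.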
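The three changes listed above could be put together roughly as follows. This is only a sketch: the four labels and the random `padded` array are placeholders for the real data, and the `+ 1` shift and the final `- 1` mapping back to the original labels are assumptions about how the -1/0/1 encoding would be handled.

```python
import numpy as np
import tensorflow as tf

vocab_size = 10000
embedding_dim = 16
max_length = 30

# Hypothetical stand-in data; the real code would use its own arrays.
train_labels = np.array([1, 0, -1, 1])
padded = np.random.default_rng(0).integers(0, vocab_size, size=(4, max_length))

# Shift {-1, 0, 1} to {0, 1, 2}, then one-hot encode to shape (4, 3).
one_hot_labels = tf.one_hot(train_labels + 1, depth=3).numpy()

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(padded, one_hot_labels, epochs=1, verbose=0)

# Map predicted class indices {0, 1, 2} back to the labels {-1, 0, 1}.
predicted = np.argmax(model.predict(padded, verbose=0), axis=1) - 1
```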
