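How to resolve the error ValueError: Discrete class variable expected
Hi, I am trying to classify a dataset with SklTreeLearner. I preprocessed the data and saved it to a new file. However, when I run the learner on the newly saved file, I get an error.
Code: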
#Importing Pandas library
import pandas as pd
#Reading file into data frame
csv_path = 'winequality-white-v3.csv'
df0 = pd.read_csv(csv_path)
#Filtering missing values
missingchlorides = pd.isna(df0["chlorides"])
missingIndices = df0[missingchlorides].index
#Replacing missing values by mean
meanchlorides = float(df0["chlorides"].mean())
df0["chlorides"] = df0["chlorides"].fillna(meanchlorides)
#Deleting rows with missing values at the end of the file
missing_winequailty = pd.isna(df0["alcohol"])
missingIndices = df0[missing_winequailty].index
df1 = df0.drop(missingIndices,axis=0)
#Saving into a csv file
df1.to_csv('filtered-winequality-white-v3.csv')
#---------------------------------
#Importing Orange Library
from Orange.data import Table,Domain
#Importing SklTreeLearner
from Orange.classification import SklTreeLearner
#Reading the filtered data
Filtered_data = Table.from_file('filtered-winequality-white-v3.csv')
#Defining features
feature_vars = list(Filtered_data.domain.variables[1:6])
class_label_var = Filtered_data.domain.variables[7]
#Defining domain
winequality_domain = Domain(feature_vars,class_label_var)
Filtered_data= Table.from_table(domain=winequality_domain,source=Filtered_data)
print(Filtered_data.domain)
print(Filtered_data.domain.variables)
print(Filtered_data.domain.attributes)
print(Filtered_data.domain.class_var)
#Shuffling and splitting data for training and testing
Filtered_data.shuffle()
train_data_tab = Filtered_data[:1800]
test_data_tab = Filtered_data[1800:]
#creating tree learner and decision tree
tree_learner = SklTreeLearner()
decision_tree = tree_learner(train_data_tab)
Error:
ValueError: Discrete class variable expected.
I think I need to change the continuous variable into a discrete one, but I'm not sure how. Can anyone help?
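One possible direction, sketched under assumptions: Orange's file reader infers a numeric column with many distinct values as continuous, and as far as I understand its one-line header format, a column name can carry a type flag prefix such as `cD#` (class, discrete) to override that guess. The snippet below only rewrites a CSV header with the standard library. The column name `quality`, the sample rows, and the helper `mark_discrete_class` are hypothetical, since the real layout of the file is not shown in the post.

```python
import csv
import io


def mark_discrete_class(csv_text, target="quality"):
    """Return csv_text with the target column's header flagged as 'cD#<name>'.

    Assumption: Orange reads the 'c' flag as "class variable" and the 'D'
    flag as "discrete", so the column is no longer inferred as continuous.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows[0]
    idx = header.index(target)  # raises ValueError if the column is missing
    header[idx] = "cD#" + target
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical sample with the same kind of columns as the wine data
sample = "alcohol,chlorides,quality\n9.5,0.045,6\n10.1,0.049,5\n"
flagged = mark_discrete_class(sample)
print(flagged.splitlines()[0])
```

If the flagged text is written to the filtered file before calling Table.from_file, the loaded domain should expose the column as its class variable, so Filtered_data.domain.class_var could be passed to Domain instead of indexing domain.variables by position. It may also be worth printing the type of Filtered_data.domain.variables[7] first, because the index column that to_csv writes by default shifts every position by one.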