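How to match column names against names stored in another DataFrame and replace them with their IDs
I have a master DataFrame called Master that contains all question IDs. I have multiple datasets that use these questions as column headers, and I want to replace those headers with their IDs.
The master table looks like this: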
Question              ID
gender                1
sex                   1
what is your gender   1
sexual orientation    1
marital status        2
occupation            3
whats you job         3
df1 looks like this:
gender  marital status  occupation
Male    Single          Doctor
Male    Divorced        Engineer
Desired output:
1       2               3
Male    Single          Doctor
Male    Divorced        Engineer
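If df1 contains any new variable that has no ID in the master table, it should be given a new ID, and the variable name and ID should be added to the master table.
For example, df2 looks like this:
gender  marital status  country
Male    Single          India
Male    Divorced        UK
Desired df2:
1       2               4
Male    Single          India
Male    Divorced        UK
The updated master table would be: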
Question              ID
gender                1
sex                   1
what is your gender   1
sexual orientation    1
marital status        2
occupation            3
whats you job         3
country               4
Solution
Use DataFrame.rename with a Series to set the new column names from the master DataFrame (df here):
df2 = df1.rename(columns=df.set_index('Question')['ID'])
print (df2)
      1         2         3
0  Male    Single    Doctor
1  Male  Divorced  Engineer
EDIT:
When the Question values contain duplicates, a Series with a unique index has to be created first. One possible solution is to remove the duplicates with DataFrame.drop_duplicates. Here is sample data to show how it works:
print (df)
              Question  ID
0               gender  10 <-duplicates,change ID for test
1               gender  15 <-duplicates,change ID for test
2  what is your gender   1
3   sexual orientation   1
4       marital status   2
5           occupation   3
6        whats you job   3
You can test for duplicates in the real data:
print (df[df.duplicated('Question',keep=False)])
  Question  ID
0   gender  10
1   gender  15
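A plain duplicate check also flags repeated rows that agree on the ID, which are harmless for renaming. As a minimal sketch (the helper name conflicting_questions and the sample data are hypothetical, not part of the answer), the check could be narrowed to questions that map to more than one distinct ID:

```python
import pandas as pd

# Hypothetical master table: "gender" is listed twice with different IDs.
master = pd.DataFrame({
    "Question": ["gender", "gender", "sex", "marital status"],
    "ID": [10, 15, 1, 2],
})

def conflicting_questions(master: pd.DataFrame) -> pd.DataFrame:
    """Return the rows whose Question maps to more than one distinct ID."""
    # Number of distinct IDs per question, broadcast back to every row.
    n_ids = master.groupby("Question")["ID"].transform("nunique")
    return master[n_ids > 1]

conflicts = conflicting_questions(master)
print(conflicts)
```

Rows returned by this helper are the ones where a choice between keeping the first or the last ID actually matters.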
Remove duplicates, keeping the first duplicated row, here ID=10:
print (df.drop_duplicates('Question').set_index('Question')['ID'])
Question
gender                 10
what is your gender     1
sexual orientation      1
marital status          2
occupation              3
whats you job           3
Name: ID,dtype: int64
df21 = df1.rename(columns=df.drop_duplicates('Question').set_index('Question')['ID'])
print (df21)
     10         2         3
0  Male    Single    Doctor
1  Male  Divorced  Engineer
Remove duplicates, keeping the last duplicated row, here ID=15:
print (df.drop_duplicates('Question',keep='last').set_index('Question')['ID'])
Question
gender                 15
what is your gender     1
sexual orientation      1
marital status          2
occupation              3
whats you job           3
Name: ID,dtype: int64
df22 = df1.rename(columns=df.drop_duplicates('Question',keep='last').set_index('Question')['ID'])
print (df22)
     15         2         3
0  Male    Single    Doctor
1  Male  Divorced  Engineer
Converting to a dictionary also keeps the last value for a duplicated key:
print (df.set_index('Question')['ID'].to_dict())
{'gender': 15,'what is your gender': 1,'sexual orientation': 1,'marital status': 2,'occupation': 3,'whats you job': 3}
df22 = df1.rename(columns=df.set_index('Question')['ID'].to_dict())
print (df22)
     15         2         3
0  Male    Single    Doctor
1  Male  Divorced  Engineer
EDIT1: If some values do not exist in the master DataFrame, they have to be appended first:
print (df)
              Question  ID
0               gender   1
1                  sex   1
2  what is your gender   1
3   sexual orientation   1
4       marital status   2
5           occupation   3
6        whats you job   3
print (df1)
  gender marital status country  code1  code2
0   Male         Single   India      4      7
1   Male       Divorced      UK      3      5
Get all columns that do not exist in df['Question']:
cols = df1.columns.difference(df['Question'].tolist(),sort=False)
print (cols)
Index(['country','code1','code2'],dtype='object')
Append them to the original df, with IDs continuing after the maximum existing ID. Finally, use the original solution:
df2 = df1.rename(columns=df.set_index('Question')['ID'])
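The step of appending the unseen columns to the master table, with IDs continuing after the current maximum, could be sketched as follows. This is a minimal sketch under assumptions: it uses pd.concat and np.arange for the append, which may differ from the answerer's own code, and the names master, start and new_rows are illustrative.

```python
import numpy as np
import pandas as pd

# Master table and df1 as shown in the sample data.
master = pd.DataFrame({
    "Question": ["gender", "sex", "what is your gender", "sexual orientation",
                 "marital status", "occupation", "whats you job"],
    "ID": [1, 1, 1, 1, 2, 3, 3],
})
df1 = pd.DataFrame({
    "gender": ["Male", "Male"],
    "marital status": ["Single", "Divorced"],
    "country": ["India", "UK"],
    "code1": [4, 3],
    "code2": [7, 5],
})

# Columns of df1 that the master table does not know yet, in their original order.
cols = df1.columns.difference(master["Question"].tolist(), sort=False)

# Assumption: each new column simply gets the next free ID after the current maximum.
start = master["ID"].max() + 1
new_rows = pd.DataFrame({"Question": list(cols),
                         "ID": np.arange(start, start + len(cols))})
master = pd.concat([master, new_rows], ignore_index=True)

# Rename with the updated master, as in the original solution.
df2 = df1.rename(columns=master.set_index("Question")["ID"])
print(master)
print(df2)
```

With this data, country receives ID 4, matching the updated master table in the question, and code1 and code2 receive the next two IDs.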
You can rename each column with the ID of its matching question.
This should work when a given column has multiple possible names.
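A minimal sketch of one way such a rename could look, not necessarily the answerer's code: build a plain dictionary from question text to ID and pass it to DataFrame.rename. The frames df_a and df_b and the name question_to_id are hypothetical. They illustrate that two differently worded headers end up with the same ID.

```python
import pandas as pd

master = pd.DataFrame({
    "Question": ["gender", "sex", "what is your gender", "sexual orientation",
                 "marital status", "occupation", "whats you job"],
    "ID": [1, 1, 1, 1, 2, 3, 3],
})

# Every known wording of a question points at the same ID.
question_to_id = dict(zip(master["Question"], master["ID"]))

# Two hypothetical datasets that word the same questions differently.
df_a = pd.DataFrame({"gender": ["Male"], "occupation": ["Doctor"]})
df_b = pd.DataFrame({"sex": ["Male"], "whats you job": ["Doctor"]})

renamed_a = df_a.rename(columns=question_to_id)
renamed_b = df_b.rename(columns=question_to_id)
print(renamed_a.columns.tolist())
print(renamed_b.columns.tolist())
```

Columns that are missing from the dictionary keep their original names, so unseen questions still need to be added to the master table first.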