How do I add a second axis to a matplotlib / seaborn bar chart and align the secondary points with the correct bars?
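I wrote the (newbie) Python function below to plot bar charts broken down by a primary dimension and, optionally, a secondary one. For example, the chart below shows the percentage of each gender that attained a given level of education.
Question: how can I overlay on each bar the median household size for that subgroup, e.g. place a point representing the value "3" on the College/Female bar? None of the examples I have seen put the points over the correct bars.
I am very new to this, so any help is much appreciated!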
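import pandas as pd
import seaborn as sns

# The source lists six students but only four values for every other column.
# The frame is trimmed to the four complete rows so that it can be constructed,
# so the example values listed in the comments further down may not all be
# reproducible from these rows.
df = pd.DataFrame({'Student': ['Alice', 'Bob', 'Chris', 'Dave'],
                   'Education': ['HS', 'HS', 'College', 'HS'],
                   'Household Size': [4, 4, 3, 6],
                   'Gender': ['F', 'M', 'F', 'M']})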

def MakePercentageFrequencyTable(dataFrame, primaryDimension, secondaryDimension=None, extraAggregatedField=None):
    # Percentage of each primaryDimension value, computed within each secondaryDimension group if one is given
    lod = dataFrame.groupby(secondaryDimension) if secondaryDimension is not None else dataFrame
    primaryDimensionPercent = (lod[primaryDimension].value_counts(normalize=True)
                               .rename('percentage')
                               .mul(100)
                               .reset_index(drop=False))
    if secondaryDimension is not None:
        primaryDimensionPercent = primaryDimensionPercent.sort_values(secondaryDimension)
        sns.catplot(x="percentage", y=secondaryDimension, hue=primaryDimension, kind='bar', data=primaryDimensionPercent)
    else:
        # The name of the first column ('index' or the primaryDimension name) depends on the pandas version
        sns.catplot(x="percentage", y=primaryDimensionPercent.columns[0], kind='bar', data=primaryDimensionPercent)

MakePercentageFrequencyTable(dataFrame=df, primaryDimension='Education', secondaryDimension='Gender')
# Question: I want to send in extraAggregatedField='Household Size' when I call the function such that
# it creates a secondary 'Household Size' axis at the top of the figure
# and aggregates/integrates the 'Household Size' column such that the following points are plotted
# against the secondary axis and positioned over the given bars:
#
# Female/College => 3
# Female/High School => 4
# Male/College => 3
# Male/High School => 4
Picture of what I have been able to achieve so far
Solution
You will have to use the axes-level functions sns.barplot() and sns.stripplot() instead of catplot(), which creates a new figure and a FacetGrid.
Something like this:
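import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# The source lists six students but only four values for every other column;
# the frame is trimmed to the four complete rows so that it can be constructed.
df = pd.DataFrame({'Student': ['Alice', 'Bob', 'Chris', 'Dave'],
                   'Education': ['HS', 'HS', 'College', 'HS'],
                   'Household Size': [4, 4, 3, 6],
                   'Gender': ['F', 'M', 'F', 'M']})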
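
def MakePercentageFrequencyTable(dataFrame, primaryDimension, secondaryDimension=None, extraAggregatedField=None, ax=None):
    ax = plt.gca() if ax is None else ax
    lod = dataFrame.groupby(secondaryDimension) if secondaryDimension is not None else dataFrame
    primaryDimensionPercent = (lod[primaryDimension].value_counts(normalize=True)
                               .rename('percentage')
                               .mul(100)
                               .reset_index(drop=False))
    if secondaryDimension is None:
        # No grouping: one bar per primaryDimension value (the first column holds those values)
        sns.barplot(x="percentage", y=primaryDimensionPercent.columns[0], data=primaryDimensionPercent, ax=ax)
        return ax
    # Fix the category and hue order once so that bars and points are dodged identically
    order = sorted(dataFrame[secondaryDimension].unique())
    hue_order = sorted(dataFrame[primaryDimension].unique())
    sns.barplot(x="percentage", y=secondaryDimension, hue=primaryDimension, order=order, hue_order=hue_order,
                data=primaryDimensionPercent, ax=ax)
    if extraAggregatedField is not None:
        ax2 = ax.twiny()  # second x axis at the top, sharing the y axis with the bars
        # The question asks for the median of the extra field per subgroup
        extraDimension = (dataFrame.groupby([secondaryDimension, primaryDimension])[extraAggregatedField]
                          .median().reset_index(drop=False))
        # y and hue must match the bar plot, otherwise dodge=True has nothing to separate
        sns.stripplot(data=extraDimension, x=extraAggregatedField, y=secondaryDimension, hue=primaryDimension,
                      order=order, hue_order=hue_order, dodge=True, jitter=False,
                      edgecolor='k', linewidth=1, size=10, ax=ax2)
        if ax2.get_legend() is not None:
            ax2.get_legend().remove()  # the bar plot already provides the legend
    return ax

plt.figure()
MakePercentageFrequencyTable(dataFrame=df, primaryDimension='Education', secondaryDimension='Gender', extraAggregatedField='Household Size')
plt.show()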
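
To make it clearer what the dodged overlay has to achieve, here is a minimal plain-matplotlib sketch of the same idea: grouped horizontal bars on one Axes, and a twin Axes created with twiny() whose points sit at exactly the same y positions as the bar centres. The helper dodged_positions, the array names and all numbers are hypothetical and made up for illustration only. The 0.8 group width is an assumption chosen to mirror what I believe is seaborn's default, so treat this as a sketch of the mechanism rather than of seaborn's internals.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical, made-up values purely for illustration.
genders = ["F", "M"]             # categories on the y axis
educations = ["College", "HS"]   # hue levels: one bar per level inside each category
percentage = np.array([[50.0, 50.0],   # rows: genders, columns: educations
                       [25.0, 75.0]])
household = np.array([[3.0, 4.0],      # value to overlay on each bar
                      [3.0, 4.0]])


def dodged_positions(n_categories, n_hues, group_width=0.8):
    """Return (centres, bar_width); centres has shape (n_categories, n_hues)."""
    bar_width = group_width / n_hues
    # Offsets are symmetric around each integer category position.
    offsets = (np.arange(n_hues) - (n_hues - 1) / 2) * bar_width
    centres = np.arange(n_categories)[:, None] + offsets[None, :]
    return centres, bar_width


fig, ax = plt.subplots()
centres, bar_width = dodged_positions(len(genders), len(educations))

# One barh call per hue level, each bar centred on its dodged position.
for j, education in enumerate(educations):
    ax.barh(centres[:, j], percentage[:, j], height=bar_width, label=education)
ax.set_yticks(np.arange(len(genders)))
ax.set_yticklabels(genders)
ax.invert_yaxis()  # first category at the top, as in the seaborn plot
ax.set_xlabel("percentage")
ax.legend(title="Education")

# twiny() shares the y axis with ax and adds an independent x axis at the top.
ax2 = ax.twiny()
ax2.scatter(household.ravel(), centres.ravel(),
            color="white", edgecolors="k", s=100, zorder=3)
ax2.set_xlim(0, household.max() + 1)
ax2.set_xlabel("Household Size")
```

Because both Axes share the y axis, reusing the same centres array for the bars and for the points is what guarantees the alignment. In the seaborn version the equivalent guarantee comes from giving barplot() and stripplot() the same y, hue, order and hue_order. Call fig.savefig(...) or plt.show() to look at the result.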