How to shuffle the outer and inner index levels of a MultiIndex DataFrame in different random orders
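Here is some code that generates a sample DataFrame:
import pandas as pd

# All three columns must have the same length (12 rows, matching the sample table in the solution).
fruits = pd.DataFrame()
fruits['month'] = ['jan', 'feb', 'feb', 'march', 'jan', 'april', 'april', 'june', 'march', 'march', 'june', 'april']
fruits['fruit'] = ['apple', 'orange', 'pear', 'orange', 'apple', 'pear', 'cherry', 'pear', 'orange', 'cherry', 'apple', 'cherry']
fruits['price'] = [30, 20, 40, 25, 30, 45, 60, 45, 25, 55, 37, 60]
ind = fruits.index
ind_mnth = fruits['month'].values
# Outer level: month; inner level: original row number. drop=False keeps the columns.
fruits_grp = fruits.set_index([ind_mnth, ind], drop=False)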
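How can the outer and inner index levels of this MultiIndex DataFrame each be shuffled in a different random order?
Solution
Assume this DataFrame with a MultiIndex as input: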
          month   fruit  price
jan   0     jan   apple     30
feb   1     feb  orange     20
      2     feb    pear     40
march 3   march  orange     25
jan   4     jan   apple     30
april 5   april    pear     45
      6   april  cherry     60
june  7    june    pear     45
march 8   march  orange     25
      9   march  cherry     55
june  10   june   apple     37
april 11  april  cherry     60
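First shuffle all rows of the DataFrame, then regroup the months by selecting the outer labels in a random order:
import numpy as np
np.random.seed(0)  # fixed seed so the result is reproducible
# Unique outer labels (months), shuffled in place
idx0 = np.unique(fruits_grp.index.get_level_values(0))
np.random.shuffle(idx0)
# sample(frac=1) shuffles every row; .loc[idx0] regroups the rows by month in the shuffled order
fruits_grp.sample(frac=1).loc[idx0]
Output: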
          month   fruit  price
jan   0     jan   apple     30
      4     jan   apple     30
april 6   april  cherry     60
      5   april    pear     45
      11  april  cherry     60
feb   1     feb  orange     20
      2     feb    pear     40
june  10   june   apple     37
      7    june    pear     45
march 8   march  orange     25
      9   march  cherry     55
      3   march  orange     25
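As a complementary sketch, and not part of the original answer, the same two-step idea (random order for the outer labels, independent random order for the rows inside each label) can be written as a small helper that uses an explicit NumPy Generator instead of the global seed. The function name shuffle_levels, its seed parameter, and the small demo frame are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd


def shuffle_levels(df, seed=None):
    """Return df with outer-level groups in random order and rows
    inside each group in an independent random order (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    outer = df.index.get_level_values(0)
    # Random order for the unique outer labels.
    group_order = rng.permutation(outer.unique().to_numpy())
    pieces = []
    for label in group_order:
        # Positions of the rows that belong to this outer label.
        positions = np.flatnonzero(outer == label)
        # Shuffle those positions independently of the group order.
        pieces.append(df.iloc[rng.permutation(positions)])
    return pd.concat(pieces)


# Small demo frame built the same way as in the question (assumed data).
demo = pd.DataFrame({
    "month": ["jan", "feb", "feb", "march", "jan", "april"],
    "price": [30, 20, 40, 25, 30, 45],
})
demo = demo.set_index([demo["month"].values, demo.index], drop=False)

shuffled = shuffle_levels(demo, seed=0)
print(shuffled)
```

Passing the same seed reproduces the same ordering, and because the helper works on row positions it should not depend on the index being sorted.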