How do I append a dictionary to a dictionary in a for loop?


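I am trying to create a dictionary in which the value for each key is itself a dictionary with two entries.

I have two lists of patient barcodes (normal tissue and disease tissue), which correspond to value columns in a dataframe. My goal is to match patients across the two lists and then, for each patient present in both, store their normal-tissue and disease-tissue values in a dictionary. The dictionary key will be the patient barcode, and the value will be another dictionary of the form normal tissue: values taken from the dataframe, disease tissue: values taken from the dataframe.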

So, starting from:
In [3]: df = pd.DataFrame({
   ...:     'Patient1_Normal':  ['nan',0.01,0.1,0.16,0.88,0.83,0.82,'nan'],
   ...:     'Patient1_Disease': [0.12,0.06,0.19,0.34,'nan','nan',0.73,0.91],
   ...:     'Patient2_Disease': ['nan','nan','nan',1.0,0.24,0.67,0.97,0.98],
   ...:     'Patient3_Normal':  [0.21,0.25,0.63,0.92,0.3,0.56,0.78,0.9],
   ...:     'Patient3_Disease': [0.11,0.45,'nan',0.45,0.22,0.89,0.17,0.12],
   ...:     'Patient4_Normal':  ['nan',0.35,'nan',0.22,0.45,0.66,0.21,0.91],
   ...:     'Patient4_Disease': ['nan','nan',0.56,0.72,'nan',0.97,0.91,0.79],
   ...:     'Patient5_Disease': [0.34,0.27,'nan',0.16,0.32,0.27,0.55,0.51]})


In [4]: df
Out[4]:
      Patient1_Normal Patient1_Disease Patient2_Disease  Patient3_Normal Patient3_Disease Patient4_Normal Patient4_Disease Patient5_Disease
    0             nan             0.12              nan             0.21             0.11             nan              nan             0.34
    1            0.01             0.06              nan             0.25             0.45            0.35              nan             0.27
    2             0.1             0.19              nan             0.63              nan             nan             0.56              nan
    3            0.16             0.34                1             0.92             0.45            0.22             0.72             0.16
    4            0.88              nan             0.24             0.30             0.22            0.45              nan             0.32
    5            0.83              nan             0.67             0.56             0.89            0.66             0.97             0.27
    6            0.82             0.73             0.97             0.78             0.17            0.21             0.91             0.55
    7             nan             0.91             0.98             0.90             0.12            0.91             0.79             0.51

This is what I have so far:

D_col = [col for col in df if '_Disease' in col]
N_col = [col for col in df if '_Normal' in col]

paired_patients = {}
psi_sets = {}
psi_sets['d'] = []
psi_sets['n'] = []

for patient in N_col:
       patient_id = patient[0:8]

       n_id = patient
       d_id = [i for i in D_col if patient_id in i]

       if len(d_id) > 0:
           psi_sets['n'] = df[n_id].to_list()
           for d in d_id:
               psi_sets['d'] = df[d].to_list()

       paired_patients[patient_id] = psi_sets

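However, the values in my paired_patients dictionary are being overwritten rather than appended, so every patient ends up with the values of the last patient processed and the output of paired_patients looks like this: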

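{'Patient1': {'d': ['nan','nan',0.56,0.72,'nan',0.97,0.91,0.79],'n': ['nan',0.35,'nan',0.22,0.45,0.66,0.21,0.91]},'Patient3': {'d': ['nan','nan',0.56,0.72,'nan',0.97,0.91,0.79],'n': ['nan',0.35,'nan',0.22,0.45,0.66,0.21,0.91]},'Patient4': {'d': ['nan','nan',0.56,0.72,'nan',0.97,0.91,0.79],'n': ['nan',0.35,'nan',0.22,0.45,0.66,0.21,0.91]}}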

How can I fix the last part of my code so that the values are stored correctly for each patient and the paired_patients dictionary looks like this:

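{'Patient1': {'d': [0.12,0.06,0.19,0.34,'nan','nan',0.73,0.91],'n': ['nan',0.01,0.1,0.16,0.88,0.83,0.82,'nan']},'Patient3': {'d': [0.11,0.45,'nan',0.45,0.22,0.89,0.17,0.12],'n': [0.21,0.25,0.63,0.92,0.3,0.56,0.78,0.9]},'Patient4': {'d': ['nan','nan',0.56,0.72,'nan',0.97,0.91,0.79],'n': ['nan',0.35,'nan',0.22,0.45,0.66,0.21,0.91]}}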

Solution

D_col = [col for col in df if '_Disease' in col]
N_col = [col for col in df if '_Normal' in col]
paired_patients = {}


for patient in N_col:
    psi_sets = {}  # create a new dict for every patient instead of reusing one shared dict
    patient_id = patient[0:8]
    n_id = patient
    d_id = [i for i in D_col if patient_id in i]

    if len(d_id) > 0:
        psi_sets['n'] = df[n_id].to_list()
        for d in d_id:
            psi_sets['d'] = df[d].to_list()
 
    paired_patients[patient_id] = psi_sets
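
The reason moving `psi_sets = {}` inside the loop helps is that a dict assignment stores a reference, not a copy. The following is a minimal sketch of that behaviour using made-up keys and values (the names `shared`, `aliased`, `fresh` and `separate` are illustrative only and do not come from the question):

```python
rows = [('Patient1', [1, 2]), ('Patient3', [3, 4])]

# Pattern from the question: one dict created outside the loop.
shared = {}
aliased = {}
for key, values in rows:
    shared['n'] = values      # mutates the single shared dict
    aliased[key] = shared     # every key points at that same dict

# Pattern from the fix: a fresh dict on every iteration.
separate = {}
for key, values in rows:
    fresh = {}
    fresh['n'] = values
    separate[key] = fresh

print(aliased['Patient1'] is aliased['Patient3'])  # True: same object
print(aliased)   # both entries show the last values written
print(separate)  # each entry keeps its own values
```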

Alternatively, you can use df.melt, pd.concat, Series.str.split, df.replace, df.groupby, df.xs, and finally df.to_dict. See the following:

>>> df2 = (pd.concat([
                      df.melt().variable.str.split('_',expand=True),
                      df.melt().drop(columns='variable')
                    ],axis=1)
                       .replace({'Normal':'n','Disease':'d'})
                       .groupby([0,1]).agg(list))
>>> paired_patients = {k: v for k,v in
                       df2.groupby(level=0)
                          .apply(lambda df: df.xs(df.name).value.to_dict())
                          .to_dict().items()
                       if not ({'d','n'} ^ v.keys())}
>>> paired_patients
{'Patient1': {'d': [0.12,0.06,0.19,0.34,'nan','nan',0.73,0.91],'n': ['nan',0.01,0.1,0.16,0.88,0.83,0.82,'nan']},'Patient3': {'d': [0.11,0.45,'nan',0.45,0.22,0.89,0.17,0.12],'n': [0.21,0.25,0.63,0.92,0.3,0.56,0.78,0.9]},'Patient4': {'d': ['nan','nan',0.56,0.72,'nan',0.97,0.91,0.79],'n': ['nan',0.35,'nan',0.22,0.45,0.66,0.21,0.91]}}
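
The filter `not ({'d','n'} ^ v.keys())` in the comprehension above keeps a patient only when its keys are exactly 'd' and 'n'. A small sketch of how that symmetric difference behaves, using shortened illustrative dicts (the names `required`, `both`, `only_d`, `candidates` and `kept` are assumptions for this example):

```python
required = {'d', 'n'}

both = {'d': [0.12], 'n': [0.01]}   # has disease and normal values
only_d = {'d': [0.34]}              # has disease values only

# The symmetric difference is empty only when the key sets are identical.
print(required ^ both.keys())    # set()
print(required ^ only_d.keys())  # {'n'}

candidates = {'Patient1': both, 'Patient5': only_d}
kept = {k: v for k, v in candidates.items() if not (required ^ v.keys())}
print(kept)  # only Patient1 remains
```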

EXPLANATION

>>> df.melt()
            variable  value
0    Patient1_Normal    NaN
1    Patient1_Normal   0.01
2    Patient1_Normal   0.10
..               ...    ...
62  Patient5_Disease   0.55
63  Patient5_Disease   0.51

>>> df.melt().variable.str.split('_',expand=True)
 
           0        1
0   Patient1   Normal
1   Patient1   Normal
2   Patient1   Normal
..       ...      ...
62  Patient5  Disease
63  Patient5  Disease

[64 rows x 2 columns]

# then concat these two,replace 'Normal' and 'Disease' with 'n' and 'd' and drop
# the 'variable' column
>>> pd.concat([
                      df.melt().variable.str.split('_',expand=True),
                      df.melt().drop(columns='variable')
                    ],axis=1).replace({'Normal':'n','Disease':'d'})
           0  1  value
0   Patient1  n    NaN
1   Patient1  n   0.01
2   Patient1  n   0.10
..       ... ..    ...
62  Patient5  d   0.55
63  Patient5  d   0.51

[64 rows x 3 columns]

# then groupby column [0,1] and aggregate into list:
>>> df2 = _.groupby([0,1]).agg(list)
>>> df2
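                                                      value
0        1                                                 
Patient1 d   [0.12, 0.06, 0.19, 0.34, nan, nan, 0.73, 0.91]
         n    [nan, 0.01, 0.1, 0.16, 0.88, 0.83, 0.82, nan]
Patient2 d     [nan, nan, nan, 1.0, 0.24, 0.67, 0.97, 0.98]
Patient3 d  [0.11, 0.45, nan, 0.45, 0.22, 0.89, 0.17, 0.12]
         n   [0.21, 0.25, 0.63, 0.92, 0.3, 0.56, 0.78, 0.9]
Patient4 d    [nan, nan, 0.56, 0.72, nan, 0.97, 0.91, 0.79]
         n   [nan, 0.35, nan, 0.22, 0.45, 0.66, 0.21, 0.91]
Patient5 d  [0.34, 0.27, nan, 0.16, 0.32, 0.27, 0.55, 0.51]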

# Now group by level=0, convert each group into a dict, and finally keep only
# the patients that have both 'n' and 'd' as keys (written here as two
# membership tests; a symmetric set difference on the dict_keys view also works)

>>> paired_patients = {k: v for k,v in
                       df2.groupby(level=0)
                          .apply(lambda df: df.xs(df.name).value.to_dict())
                          .to_dict().items()
                       if ('n' in v) and ('d' in v)}
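
For comparison, the same pairing can probably be written without melt/groupby by splitting each column name and collecting the columns in a `collections.defaultdict`. This is a swapped-in alternative rather than either answer's method; it assumes every column is named `<patient>_<Normal|Disease>`, and it uses only the first three rows of five columns from the question's frame to stay short:

```python
from collections import defaultdict

import pandas as pd

# First three rows of five columns from the question's dataframe.
df = pd.DataFrame({
    'Patient1_Normal':  ['nan', 0.01, 0.1],
    'Patient1_Disease': [0.12, 0.06, 0.19],
    'Patient2_Disease': ['nan', 'nan', 'nan'],
    'Patient3_Normal':  [0.21, 0.25, 0.63],
    'Patient3_Disease': [0.11, 0.45, 'nan'],
})

grouped = defaultdict(dict)
for col in df.columns:
    # 'Patient1_Normal' -> ('Patient1', 'Normal'); assumes a single '_' separator
    patient_id, tissue = col.split('_', 1)
    # 'Normal' -> 'n', 'Disease' -> 'd'
    grouped[patient_id][tissue[0].lower()] = df[col].to_list()

# Keep only patients that have both a normal and a disease column.
paired_patients = {pid: sets for pid, sets in grouped.items()
                   if sets.keys() >= {'n', 'd'}}
print(paired_patients)
```

Because the patient ID comes from `split('_', 1)` rather than the slice `patient[0:8]`, this sketch does not depend on the barcode being exactly eight characters long.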
