I have a DataFrame, and I want to compute the standard deviation of a specific column and then add the result back to the original data.

import pandas as pd

raw_data = {'patient': [242, 151, 111, 122, 342],
            'obs': [1, 2, 3, 1, 2],
            'treatment': [0, 1, 0, 1, 0],
            'score': ['strong', 'weak', 'weak', 'weak', 'strong']}

df = pd.DataFrame(raw_data, columns = ['patient', 'obs', 'treatment', 'score'])

df

   patient  obs  treatment   score
0      242    1          0  strong
1      151    2          1    weak
2      111    3          0    weak
3      122    1          1    weak
4      342    2          0  strong

So I want the standard deviation of the patient column, grouped by the score column.

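The approach I have in mind is to scan the columns, find the patient column, check that it is numeric (I hope to add more columns in the future), compute the standard deviation, and finally add the result to the original df.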

I tried this:

std_dev_patient = []

for col in df.keys():

    df=df.groupby("score")

    if df[col]=='patient':
           np.std(col).append(std_dev_patient)
    else:
        pass

    df.concat([df,std_dev_patient], axis =1)

    df

TypeError: 'str' object is not callable

Is there an efficient way to do this?

Thanks

Expected output

   patient  obs  treatment   score  std_dev_patient  std_dev_obs
0      242    1          0  strong            70.71           ..
1      151    2          1    weak            20.66           ..
2      111    3          0    weak            20.66           ..
3      122    1          1    weak            20.66           ..
4      342    2          0  strong            70.71           ..
Answer

Use pandas.DataFrame.groupby with transform:

# Broadcast each group's standard deviation back onto its rows
df['std_dev_patient'] = df.groupby('score')['patient'].transform('std')
print(df)

Output:

   patient  obs  treatment   score  std_dev_patient
0      242    1          0  strong        70.710678
1      151    2          1    weak        20.663978
2      111    3          0    weak        20.663978
3      122    1          1    weak        20.663978
4      342    2          0  strong        70.710678
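
One thing worth keeping in mind: the 'std' aggregation in pandas uses the sample standard deviation (ddof=1), which is what gives 70.71 for the strong group, whereas np.std defaults to the population version (ddof=0). Here is a minimal sketch of the difference, rebuilding a two-column copy of the question's frame; the names sample_std and population_std are only for illustration:

```python
import pandas as pd

# Reduced copy of the question's data: only the columns needed here
df = pd.DataFrame({'patient': [242, 151, 111, 122, 342],
                   'score': ['strong', 'weak', 'weak', 'weak', 'strong']})

grouped = df.groupby('score')['patient']

# pandas default: sample standard deviation (divides by n - 1)
sample_std = grouped.transform('std')

# Population standard deviation (divides by n), as np.std does by default
population_std = grouped.transform(lambda s: s.std(ddof=0))

print(pd.DataFrame({'sample': sample_std, 'population': population_std}))
```

If the population value is what you actually need, the lambda form above is one way to get it; otherwise transform('std') already matches the expected output in the question.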

To check the dtype, use pandas.DataFrame.select_dtypes together with numpy.number:

import numpy as np

g = df.groupby('score')
# The numeric column list is computed once, before the loop adds new columns
for c in df.select_dtypes(np.number).columns:
    df['std_dev_%s' % c] = g[c].transform('std')

Output:

   patient  obs  treatment   score  std_dev_patient  std_dev_obs  \
0      242    1          0  strong        70.710678     0.707107   
1      151    2          1    weak        20.663978     1.000000   
2      111    3          0    weak        20.663978     1.000000   
3      122    1          1    weak        20.663978     1.000000   
4      342    2          0  strong        70.710678     0.707107   

   std_dev_treatment  
0            0.00000  
1            0.57735  
2            0.57735  
3            0.57735  
4            0.00000
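
If you would rather avoid the explicit loop, the same result can probably be had in one pass by transforming all numeric columns at once and joining them back. This is only a sketch starting from a fresh copy of the question's frame; numeric_cols, std_cols and result are names I made up for the example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'patient': [242, 151, 111, 122, 342],
                   'obs': [1, 2, 3, 1, 2],
                   'treatment': [0, 1, 0, 1, 0],
                   'score': ['strong', 'weak', 'weak', 'weak', 'strong']})

# Collect the numeric columns as a plain list
numeric_cols = df.select_dtypes(np.number).columns.tolist()

# Per-group sample std for every numeric column, aligned to the original rows
std_cols = df.groupby('score')[numeric_cols].transform('std').add_prefix('std_dev_')

# Attach the new columns without modifying df in place
result = df.join(std_cols)
print(result)
```

Because result is a new frame, running this twice does not pick up the std_dev_ columns as inputs, which can happen with the in-place loop if df already contains them.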