The pandas DataFrame df looks like this:
fileName      objectsIdentified   objectName
file_01.jpg   1, 2, 3             obj1, obj2, obj3
file_02.jpg   2, 3                obj2, obj3
file_03.jpg   1, 2, 4, 2          obj1, obj2, obj4, obj2
type(df['objectName'].iloc[0]) is list
type(df['objectName'].iloc[0][0]) is str
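Question: How can I turn the items in objectName into separate columns, with the number of times each item occurs in the row as the value?
Expected output: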
fileName      objectsIdentified   objectName               obj1  obj2  obj3  obj4
file_01.jpg   1, 2, 3             obj1, obj2, obj3         1     1     1
file_02.jpg   2, 3                obj2, obj3                     1     1
file_03.jpg   1, 2, 4, 2          obj1, obj2, obj4, obj2   1     2           1
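Answer
Add the following line to your code:
df = df.join(pd.get_dummies(pd.DataFrame(df['objectName'].tolist()).stack()).groupby(level=0).sum().replace(0, ''))
(Older pandas versions also accepted .sum(level=0) in place of .groupby(level=0).sum(); that form was deprecated in pandas 1.3 and removed in 2.0.)
df then becomes: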
      fileName objectsIdentified                objectName obj1 obj2 obj3  \
0  file_01.jpg         [1, 2, 3]        [obj1, obj2, obj3]    1    1    1
1  file_02.jpg            [2, 3]              [obj2, obj3]         1    1
2  file_03.jpg      [1, 2, 4, 2]  [obj1, obj2, obj4, obj2]    1    2

  obj4
0
1
2    1
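
For comparison, here is a minimal alternative sketch that counts the names in each row with collections.Counter instead of get_dummies. It is not the method from the answer above, only another way to reach the same counts. The sample data is rebuilt from the question, and the names counts and result are chosen here purely for illustration. Zeros are kept as integers rather than blanked out.

```python
from collections import Counter

import pandas as pd

# Sample data rebuilt from the question; each objectName cell is a list of str.
df = pd.DataFrame({
    "fileName": ["file_01.jpg", "file_02.jpg", "file_03.jpg"],
    "objectsIdentified": [[1, 2, 3], [2, 3], [1, 2, 4, 2]],
    "objectName": [
        ["obj1", "obj2", "obj3"],
        ["obj2", "obj3"],
        ["obj1", "obj2", "obj4", "obj2"],
    ],
})

# One Counter per row maps each object name to how often it appears in that row.
# Building a DataFrame from the Counters yields one column per distinct name;
# names missing from a row come out as NaN, so fill them with 0 and cast to int.
counts = (
    pd.DataFrame([Counter(names) for names in df["objectName"]], index=df.index)
    .fillna(0)
    .astype(int)
    .sort_index(axis=1)  # deterministic column order: obj1, obj2, obj3, obj4
)

# Attach the count columns to the original frame by index.
result = df.join(counts)
print(result)
```

If blank cells are preferred over zeros, as in the expected output, counts.replace(0, '') can be applied before the join, at the cost of turning the count columns into object dtype.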