我有一个填充了用户和类别的 pandas 数据框,但这些类别有多个列。
| user | category | val1 | val2 | val3 |
| ------ | ------------------| -----| ---- | ---- |
| user 1 | c1 | 3 | NA | None |
| user 1 | c2 | NA | 4 | None |
| user 1 | c3 | NA | NA | 7 |
| user 2 | c1 | 5 | NA | None |
| user 2 | c2 | NA | 7 | None |
| user 2 | c3 | NA | NA | 2 |
我想得到它,以便将值压缩到单个列中。
| user | category | value|
| ------ | ------------------| -----|
| user 1 | c1 | 3 |
| user 1 | c2 | 4 |
| user 1 | c3 | 7 |
| user 2 | c1 | 5 |
| user 2 | c2 | 7 |
| user 2 | c3 | 2 |
最终得到如下矩阵:
np.array([[3, 4, 7], [5, 7, 2]])
您可以使用pd.DataFrame.bfill将值回填到选定的列上。
val_cols = ['val1', 'val2', 'val3']
df['value'] = pd.to_numeric(df[val_cols].bfill(axis=1).iloc[:, 0], errors='coerce')
print(df)
user0 category val1 val2 val3 value
0 user 1 c1 3.0 NaN None 3.0
1 user 1 c2 NaN 4.0 None 4.0
2 user 1 c3 NaN NaN 7 7.0
3 user 2 c1 5.0 NaN None 5.0
4 user 2 c2 NaN 7.0 2 7.0
5 user 2 c3 NaN NaN 2 2.0
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)