How can I get rid of iterrows()?
Can this be done faster using numpy or pandas?
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8)*0 })
print(df)
# A B C
# 0 foo one 0
# 1 bar one 0
# 2 foo two 0
# 3 bar three 0
# 4 foo two 0
# 5 bar two 0
# 6 foo one 0
# 7 foo three 0
selDict = {"foo":2, "bar":3}
This works:
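for i, r in df.iterrows():
    # mark the row while its 'A' value still has quota left in selDict
    if selDict[r["A"]] > 0:
        selDict[r["A"]] -= 1
        df.at[i, 'C'] = 1  # df.set_value was removed in pandas 1.0
print(df)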
# A B C
# 0 foo one 1
# 1 bar one 1
# 2 foo two 1
# 3 bar three 1
# 4 foo two 0
# 5 bar two 1
# 6 foo one 0
# 7 foo three 0
If I understand correctly, you can use cumcount:
df['C'] = (df.groupby('A').cumcount() < df['A'].map(selDict)).astype('int')
df
Out:
A B C
0 foo one 1
1 bar one 1
2 foo two 1
3 bar three 1
4 foo two 0
5 bar two 1
6 foo one 0
7 foo three 0
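To see why the one-liner reproduces the loop, it may help to split it into its two intermediate Series. The following is only a sketch of that breakdown; the names rank and quota are mine, chosen for illustration, and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'A': 'foo bar foo bar foo bar foo foo'.split(),
    'B': 'one one two three two two one three'.split(),
})
selDict = {"foo": 2, "bar": 3}

# 0-based position of each row within its 'A' group, in original row order
# (rank is an illustrative name)
rank = df.groupby('A').cumcount()

# the quota for each row, looked up from selDict by its 'A' value
# (quota is an illustrative name)
quota = df['A'].map(selDict)

# a row is selected while its position in the group is below the quota
df['C'] = (rank < quota).astype(int)

# show the intermediate columns next to the result
print(df.assign(rank=rank, quota=quota))
```

Two differences from the loop are worth keeping in mind. The vectorized version leaves selDict unchanged, whereas the loop decrements its counters. And a value in 'A' that is missing from selDict maps to NaN, so the comparison is False and the row gets 0, while the loop would raise a KeyError.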