select df.id, count(distinct airports) as num
from df
group by df.id
having count(distinct airports) > 3
I am trying to do the same thing as the query above in Python pandas. I have tried different combinations of filter, nunique, and agg, and nothing worked. Any suggestions?
ex:
df
id airport
1 lax
1 ohare
2 phl
3 lax
2 mdw
2 lax
2 sfw
2 tpe
So I would like the result to be:
id num
2 5
You can use SeriesGroupBy.nunique (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.SeriesGroupBy.nunique.html) with boolean indexing (http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing) or query (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.query.html):
s = df.groupby('id')['airport'].nunique()
print (s)
id
1 2
2 5
3 1
Name: airport, dtype: int64
df1 = s[s > 3].reset_index()
print (df1)
id airport
0 2 5
Or:
df1 = df.groupby('id')['airport'].nunique().reset_index().query('airport > 3')
print (df1)
id airport
1 2 5
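Both results above keep the column name airport, while the SQL query names it num. A minimal self-contained sketch of one way to get the num column directly, assuming a pandas version that supports named aggregation (0.25 or later); the names counts and result are only illustrative:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 2, 2, 2, 2],
    "airport": ["lax", "ohare", "phl", "lax", "mdw", "lax", "sfw", "tpe"],
})

# Named aggregation: count distinct airports per id and call the column `num`
counts = df.groupby("id").agg(num=("airport", "nunique")).reset_index()

# Equivalent of HAVING count(distinct airport) > 3
result = counts[counts["num"] > 3].reset_index(drop=True)
print(result)
```

With the sample data this should leave only id 2, which has 5 distinct airports.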