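There is a difference. value_counts http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html returns:
The resulting object will be in descending order so that the first element is the most frequently-occurring element.
but count http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.DataFrameGroupBy.count.html does not; it sorts the output by the index (created from the column passed to groupby('col')).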
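To make the ordering difference concrete, here is a minimal sketch on a hypothetical toy Series (the name s and its values are made up for illustration and are not part of the sample further down):

```python
import pandas as pd

# Hypothetical toy data: 'z' occurs 3 times, 'y' twice, 'x' once.
s = pd.Series(['x', 'y', 'y', 'z', 'z', 'z'])

by_frequency = s.value_counts()   # ordered by count, most frequent first
by_label = s.groupby(s).count()   # ordered by the group labels (the index)

print(by_frequency.index.tolist())  # ['z', 'y', 'x']
print(by_label.index.tolist())      # ['x', 'y', 'z']
```

Both results hold the same counts; only the row order differs.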
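df.groupby('colA').count()
aggregates all the other columns of df with the function count, so for each group and each column it counts the values, excluding NaNs.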
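A small sketch of the NaN handling, assuming a hypothetical toy frame (toy, key and val are illustrative names). size() is not mentioned above; it is brought in here only as a contrast, since it counts rows rather than non-NaN values:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: group 'a' has one NaN in 'val',
# and the last row has a NaN group key.
toy = pd.DataFrame({'key': ['a', 'a', 'b', np.nan],
                    'val': [1.0, np.nan, 2.0, 3.0]})

counts = toy.groupby('key').count()  # non-NaN values per column and group
sizes = toy.groupby('key').size()    # rows per group, NaN in 'val' included

# By default groupby drops rows whose key is NaN, so only 'a' and 'b' appear.
print(counts)
print(sizes)
```

For group 'a', count reports 1 for val while size reports 2, because the NaN in val is skipped by count but the row still belongs to the group.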
So if you need to count only one column, use:
df.groupby('colA')['colA'].count()
Sample:
import pandas as pd
import numpy as np

df = pd.DataFrame({'colB':list('abcdefg'),
                   'colC':[1,3,5,7,np.nan,np.nan,4],
                   'colD':[np.nan,3,6,9,2,4,np.nan],
                   'colA':['c','c','b','a',np.nan,'b','b']})
print (df)
  colA colB  colC  colD
0    c    a   1.0   NaN
1    c    b   3.0   3.0
2    b    c   5.0   6.0
3    a    d   7.0   9.0
4  NaN    e   NaN   2.0
5    b    f   NaN   4.0
6    b    g   4.0   NaN
print (df['colA'].value_counts())
b    3
c    2
a    1
Name: colA, dtype: int64
print (df.groupby('colA').count())
      colB  colC  colD
colA
a        1     1     1
b        3     2     2
c        2     2     1
print (df.groupby('colA')['colA'].count())
colA
a    1
b    3
c    2
Name: colA, dtype: int64
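As a quick check on the sample, sorting the value_counts result by index should line it up with the single-column groupby count. This is only a sketch; the names vc and gc are introduced here for illustration:

```python
import numpy as np
import pandas as pd

# Same sample data as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'colB': list('abcdefg'),
                   'colC': [1, 3, 5, 7, np.nan, np.nan, 4],
                   'colD': [np.nan, 3, 6, 9, 2, 4, np.nan],
                   'colA': ['c', 'c', 'b', 'a', np.nan, 'b', 'b']})

vc = df['colA'].value_counts().sort_index()  # reorder by label instead of frequency
gc = df.groupby('colA')['colA'].count()

# Compare labels and counts only; the Series names may differ between pandas versions.
print(vc.index.tolist() == gc.index.tolist())  # True
print(vc.tolist() == gc.tolist())              # True
```

Both skip the row where colA is NaN, so after sort_index() the labels and counts match.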