Python - Grouping and finding most frequent values
I have a DataFrame like this:
protein  peptide
a        aaa
a        aaa
a        aba
b        aaa
b        aba
b        aba
But I need to filter the data by finding, for each value in column 1, the most frequently occurring value in column 2.
So the output would look like:
protein  peptide
a        aaa
b        aba
In reality I need the top 3 most frequently occurring values. I don't know how to solve this using Python and pandas.
mode isn't a groupby method, though it is a Series (and DataFrame) method, so you have to pass it to apply:
In [11]: df.groupby('protein')['peptide'].apply(lambda x: x.mode()[0])
Out[11]:
protein
a    aaa
b    aba
Name: peptide, dtype: object
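For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question, so treat it as an assumption about the real data, and the variable name most_common is only illustrative. As far as I know, mode() returns all tied values in sorted order, so taking the first element resolves ties alphabetically.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption about the real data).
df = pd.DataFrame({
    "protein": ["a", "a", "a", "b", "b", "b"],
    "peptide": ["aaa", "aaa", "aba", "aaa", "aba", "aba"],
})

# Series.mode() returns a Series holding the most frequent value(s);
# .iloc[0] takes the first entry positionally, which also settles any ties.
most_common = df.groupby("protein")["peptide"].apply(lambda s: s.mode().iloc[0])
print(most_common)
```

Using .iloc[0] rather than [0] just makes the positional lookup explicit; the result is the same here.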
To get the top three, use value_counts (in the same way):
In [12]: df.groupby('protein')['peptide'].apply(lambda x: x.value_counts()[:3])
Out[12]:
protein
a        aaa    2
         aba    1
b        aba    2
         aaa    1
dtype: int64
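If the number of values to keep varies, the same idea can be wrapped in a small helper. This is only a sketch: top_peptides is a hypothetical name, not a pandas function, the sample frame is again rebuilt by hand from the question, and head(n) is swapped in for the [:3] slice because it reads as explicitly positional.

```python
import pandas as pd


def top_peptides(frame, n=3):
    """Hypothetical helper: the n most frequent peptides per protein, with counts."""
    return (
        frame.groupby("protein")["peptide"]
        # value_counts() sorts by frequency (descending); head(n) keeps the first n.
        .apply(lambda s: s.value_counts().head(n))
    )


# Sample data rebuilt by hand from the question (an assumption about the real data).
df = pd.DataFrame({
    "protein": ["a", "a", "a", "b", "b", "b"],
    "peptide": ["aaa", "aaa", "aba", "aaa", "aba", "aba"],
})

top = top_peptides(df, n=3)
print(top)
```

The result is a Series indexed by (protein, peptide) pairs, so a single count can be looked up with something like top.loc[("a", "aaa")]. Groups with fewer than n distinct peptides simply return what they have.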