python - Pandas, groupby where column value is greater than x
I have a table like this:

    timestamp   avg_hr  hr_quality  avg_rr  rr_quality  activity  sleep_summary_id
    1422404668  66      229         0       0           13        78
    1422404670  64      223         0       0           20        78
    1422404672  64      216         0       0           11        78
    1422404674  66      198         0       40          9         78
    1422404676  65      184         0       30          3         78
    1422404678  64      173         0       10          17        78
    1422404680  66      199         0       20          118       78

I'm trying to group this data by timestamp, sleep ID and rr_quality, where rr_quality > 0.
I've tried the following, but none of them seem to work:

    df3 = df2.groupby([df2.index.hour,'sleep_summary_id',df2['rr_quality']>0])
    df3 = df2.groupby([df2.index.hour,'sleep_summary_id','rr_quality'>0])
    df3 = df2.groupby([df2.index.hour,'sleep_summary_id',['rr_quality']>0])

All of them return a KeyError.
Edit:

I also can't seem to pass more than one filter at a time. I tried the following:

    df2[df2['rr_quality'] >= 150, df2['hr_quality'] > 200]
    df2[df2['rr_quality'] >= 150, ['hr_quality'] > 200]
    df2[[df2['rr_quality'] >= 150, ['hr_quality'] > 200]]

This returns:

    TypeError: 'Series' objects are mutable, thus they cannot be hashed
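The simplest thing here is to filter the df first and then perform the groupby. Take the hour from the filtered frame's index so that the grouping key has the same length as the rows being grouped:

    filtered = df2[df2['rr_quality'] > 0]
    filtered.groupby([filtered.index.hour, 'sleep_summary_id'])

Edit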
If you're intending to assign this back to the original df:

    mask = df2['rr_quality'] > 0
    df2.loc[mask, 'avg_hr'] = df2[mask].groupby([df2[mask].index.hour, 'sleep_summary_id'])['avg_hr'].transform('mean')

The loc call masks the left-hand side so that the result of the transform aligns correctly with the rows being assigned.
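As a rough end-to-end sketch of the filter-then-group approach and the masked assignment described above, using the rows from the question. It assumes the epoch-second timestamps have been converted into a DatetimeIndex, which the question implies by using df2.index.hour but never shows. The column name avg_hr_mean is hypothetical: writing the result to a separate float column keeps the original avg_hr values intact.

```python
import pandas as pd

# Rows copied from the question. Building a DatetimeIndex from the
# epoch-second timestamps is an assumption about how df2 was prepared.
timestamps = [1422404668, 1422404670, 1422404672, 1422404674,
              1422404676, 1422404678, 1422404680]
df2 = pd.DataFrame(
    {
        "avg_hr": [66, 64, 64, 66, 65, 64, 66],
        "hr_quality": [229, 223, 216, 198, 184, 173, 199],
        "avg_rr": [0, 0, 0, 0, 0, 0, 0],
        "rr_quality": [0, 0, 0, 40, 30, 10, 20],
        "activity": [13, 20, 11, 9, 3, 17, 118],
        "sleep_summary_id": [78] * 7,
    },
    index=pd.to_datetime(timestamps, unit="s"),
)

# Filter first, then group the filtered frame by its own hour values
# so the grouping key and the grouped rows have the same length.
mask = df2["rr_quality"] > 0
filtered = df2[mask]
grouped = filtered.groupby([filtered.index.hour, "sleep_summary_id"])

# One aggregated value per (hour, sleep_summary_id) group.
hourly_mean = grouped["avg_hr"].mean()
print(hourly_mean)

# transform("mean") returns one value per filtered row, indexed like
# `filtered`, so it can be written back through the same mask.
# "avg_hr_mean" is a hypothetical column name used for illustration.
df2["avg_hr_mean"] = float("nan")
df2.loc[mask, "avg_hr_mean"] = grouped["avg_hr"].transform("mean")
print(df2[["avg_hr", "rr_quality", "avg_hr_mean"]])
```

Rows that fail the filter keep NaN in the new column, which makes it easy to see which rows took part in the group mean.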
To filter using multiple conditions you need to use the element-wise operators &, | and ~ for and, or and not respectively. Additionally, you need to wrap each condition in parentheses because these operators bind more tightly than the comparison operators:
df2[(df2['rr_quality'] >= 150) & (df2['hr_quality'] > 200)]
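A small sketch of the three operators on the two quality columns from the question. The thresholds here are chosen only so that the sample rows produce non-empty results; they are not the 150 and 200 used above, which none of the posted rows satisfy for rr_quality.

```python
import pandas as pd

# The two quality columns from the question's sample rows.
df2 = pd.DataFrame(
    {
        "hr_quality": [229, 223, 216, 198, 184, 173, 199],
        "rr_quality": [0, 0, 0, 40, 30, 10, 20],
    }
)

# Illustrative thresholds, picked to match some of the sample rows.
# Each comparison is parenthesised because & and | bind more tightly
# than > and >=.
both = df2[(df2["rr_quality"] > 0) & (df2["hr_quality"] > 190)]    # and
either = df2[(df2["rr_quality"] > 0) | (df2["hr_quality"] > 220)]  # or
negated = df2[~(df2["rr_quality"] > 0)]                            # not

print(len(both), len(either), len(negated))
```

Each condition produces a boolean Series, and the operators combine them row by row into a single mask, which is why one mask goes inside the brackets rather than a comma-separated list of conditions.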