python - Pandas, groupby where column value is greater than x
I have a table like this:
timestamp   avg_hr  hr_quality  avg_rr  rr_quality  activity  sleep_summary_id
1422404668  66      229         0       0           13        78
1422404670  64      223         0       0           20        78
1422404672  64      216         0       0           11        78
1422404674  66      198         0       40          9         78
1422404676  65      184         0       30          3         78
1422404678  64      173         0       10          17        78
1422404680  66      199         0       20          118       78
I'm trying to group the data by the hour of timestamp, sleep id, and rr_quality, keeping only rows where rr_quality > 0.
I've tried the following, but none of them seems to work:
df3 = df2.groupby([df2.index.hour,'sleep_summary_id',df2['rr_quality']>0])
df3 = df2.groupby([df2.index.hour,'sleep_summary_id','rr_quality'>0])
df3 = df2.groupby([df2.index.hour,'sleep_summary_id',['rr_quality']>0])
All of them return a KeyError.
Edit:
I also can't seem to pass more than one filter at a time. I tried the following:
df2[df2['rr_quality'] >= 150, df2['hr_quality'] > 200]
df2[df2['rr_quality'] >= 150, ['hr_quality'] > 200]
df2[[df2['rr_quality'] >= 150, ['hr_quality'] > 200]]
This returns: TypeError: 'Series' objects are mutable, thus they cannot be hashed
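The simplest thing here is to filter the df first and then perform the groupby. Take the hour from the filtered frame's index so that the grouping key has the same length as the data being grouped:
filtered = df2[df2['rr_quality'] > 0]
filtered.groupby([filtered.index.hour, 'sleep_summary_id'])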
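As a rough, self-contained sketch of the filter-then-group approach, the snippet below rebuilds the table from the question and averages avg_hr per hour and sleep id. It assumes the timestamp column holds Unix seconds and that it should become the DataFrame index; the choice of avg_hr and mean() as the aggregation is only for illustration.

```python
import pandas as pd

# Rows copied from the question's table.
df2 = pd.DataFrame({
    "timestamp": [1422404668, 1422404670, 1422404672, 1422404674,
                  1422404676, 1422404678, 1422404680],
    "avg_hr": [66, 64, 64, 66, 65, 64, 66],
    "hr_quality": [229, 223, 216, 198, 184, 173, 199],
    "avg_rr": [0, 0, 0, 0, 0, 0, 0],
    "rr_quality": [0, 0, 0, 40, 30, 10, 20],
    "activity": [13, 20, 11, 9, 3, 17, 118],
    "sleep_summary_id": [78] * 7,
})

# Assumption: timestamps are Unix seconds; use them as a DatetimeIndex.
df2.index = pd.to_datetime(df2.pop("timestamp"), unit="s")

# Filter first, then group the filtered frame by its own hour values.
filtered = df2[df2["rr_quality"] > 0]
result = filtered.groupby(
    [filtered.index.hour, "sleep_summary_id"]
)["avg_hr"].mean()

print(result)
```

With this sample all rows fall in the same hour and share one sleep id, so the result has a single group built from the four rows where rr_quality is positive.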
Edit:
If you're intending to assign the result back to the original df:
mask = df2['rr_quality'] > 0
df2.loc[mask, 'avg_hr'] = df2[mask].groupby([df2[mask].index.hour, 'sleep_summary_id'])['avg_hr'].transform('mean')
The loc call masks the LHS so that the result of transform aligns correctly.
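A minimal sketch of that assignment on a cut-down copy of the sample data is shown here. The timestamps are again assumed to be Unix seconds, and avg_hr is stored as float in this example so that the group mean can be written back without changing the column dtype.

```python
import pandas as pd

# Cut-down copy of the question's data; avg_hr is float on purpose.
df2 = pd.DataFrame(
    {
        "avg_hr": [66.0, 64.0, 64.0, 66.0, 65.0, 64.0, 66.0],
        "rr_quality": [0, 0, 0, 40, 30, 10, 20],
        "sleep_summary_id": [78] * 7,
    },
    index=pd.to_datetime(
        [1422404668, 1422404670, 1422404672, 1422404674,
         1422404676, 1422404678, 1422404680],
        unit="s",
    ),
)

mask = df2["rr_quality"] > 0
selected = df2[mask]

# transform returns one value per selected row, indexed like `selected`.
group_means = selected.groupby(
    [selected.index.hour, "sleep_summary_id"]
)["avg_hr"].transform("mean")

# Only the masked rows are overwritten; the others keep their values.
df2.loc[mask, "avg_hr"] = group_means

print(df2)
```

Because transform keeps the index of the rows it was given, the assignment through loc lines each mean up with the row it belongs to, and the rows where rr_quality is 0 are left untouched.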
To filter using multiple conditions you need to use the array comparison operators &, |, and ~ in place of and, or, and not respectively. Additionally, you need to wrap the conditions in parentheses due to operator precedence:
df2[(df2['rr_quality'] >= 150) & (df2['hr_quality'] > 200)]
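The following sketch tries the three operators on two columns from the question's table. The thresholds 20 and 190 are hypothetical, picked only so that each filter matches some of the sample rows (the thresholds in the question would match none of them).

```python
import pandas as pd

df2 = pd.DataFrame({
    "hr_quality": [229, 223, 216, 198, 184, 173, 199],
    "rr_quality": [0, 0, 0, 40, 30, 10, 20],
})

rr_ok = df2["rr_quality"] >= 20   # hypothetical threshold
hr_ok = df2["hr_quality"] > 190   # hypothetical threshold

both = df2[rr_ok & hr_ok]         # element-wise "and"
either = df2[rr_ok | hr_ok]       # element-wise "or"
neither = df2[~(rr_ok | hr_ok)]   # element-wise "not"

# The Python keyword `and` asks for a single truth value of a whole
# Series, which pandas refuses to provide.
try:
    df2[(df2["rr_quality"] >= 20) and (df2["hr_quality"] > 190)]
except ValueError as exc:
    error_message = str(exc)

print(both.index.tolist(), either.index.tolist(), neither.index.tolist())
print(error_message)
```

Naming the boolean masks first, as done here, also sidesteps the precedence issue, since each comparison is evaluated before the masks are combined.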