Python: count the number of unique elements in a CSV column
I'm trying to get counts of the unique items in a CSV column using Python.
Sample CSV file (it has no header):
    ab,asd
    ab,poi
    ab,asd
    bg,put
    bg,asd
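Here is what I've tried so far:
    import csv
    from collections import defaultdict, Counter

    input_file = open('results/1_sample.csv')
    csv_reader = csv.reader(input_file, delimiter=',')

    # Collect every column-1 value under its column-0 key.
    data = defaultdict(list)
    for row in csv_reader:
        data[row[0]].append(row[1])

    # Print each key followed by the occurrence counts of its values.
    for k, v in data.items():
        print(k)
        print(Counter(v))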
This gives output in the format:
    ab
    Counter({'asd': 2, 'poi': 1})
    bg
    Counter({'asd': 1, 'put': 1})
But I want output like:
    ab:2
    bg:2
    total_unique_count:3  # unique count of column[1], irrespective of the data in column[0]
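The desired output can be produced with the standard library alone by collecting values into sets instead of lists, since the size of a set is the number of distinct values. This is only a sketch: the helper name `unique_counts` is hypothetical, the sample data is inlined with `io.StringIO` in place of the real file, and every row is assumed to have exactly two fields.

```python
import csv
import io
from collections import defaultdict

# Inline copy of the sample data; with a real file, pass the opened
# file object to csv.reader instead of the StringIO wrapper.
sample = "ab,asd\nab,poi\nab,asd\nbg,put\nbg,asd\n"


def unique_counts(rows):
    """Return ({key: distinct value count}, overall distinct value count).

    Hypothetical helper; assumes each row has exactly two fields.
    """
    per_key = defaultdict(set)
    all_values = set()
    for key, value in rows:
        per_key[key].add(value)   # distinct column-1 values per column-0 key
        all_values.add(value)     # distinct column-1 values overall
    return {k: len(v) for k, v in per_key.items()}, len(all_values)


counts, total = unique_counts(csv.reader(io.StringIO(sample)))
for key in sorted(counts):
    print("{}:{}".format(key, counts[key]))
print("total_unique_count:{}".format(total))
```

With the sample data this prints `ab:2`, `bg:2` and `total_unique_count:3`.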
You're looking for the SeriesGroupBy method nunique:
    In [11]: df
    Out[11]:
        0    1
    0  ab  asd
    1  ab  poi
    2  ab  asd
    3  bg  put
    4  bg  asd

    In [12]: g = df.groupby(0)

    In [13]: g[1].nunique()
    Out[13]:
    0
    ab    2
    bg    2
    Name: 1, dtype: int64
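The session above does not show how `df` was loaded or how to get the overall count. The following is a minimal end-to-end sketch, assuming pandas is installed. The sample data is inlined with `io.StringIO` so the snippet is self-contained. The names `per_key` and `total_unique` are illustrative. `header=None` tells `read_csv` that the file has no header row, so the columns are labeled 0 and 1.

```python
import io

import pandas as pd

# Inline copy of the sample data; with a real file, pass its path
# (for example 'results/1_sample.csv') to read_csv instead.
csv_text = "ab,asd\nab,poi\nab,asd\nbg,put\nbg,asd\n"

# header=None: no header row, so columns get the integer labels 0 and 1.
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Distinct column-1 values within each column-0 group.
per_key = df.groupby(0)[1].nunique()

# Distinct column-1 values overall, irrespective of column 0.
total_unique = df[1].nunique()

for key, count in per_key.items():
    print("{}:{}".format(key, count))
print("total_unique_count:{}".format(total_unique))
```

With the sample data this prints `ab:2`, `bg:2` and `total_unique_count:3`, which matches the format asked for in the question.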