Python: count number of unique elements in a CSV column


I'm trying to get counts of unique items in a CSV column using Python.

Sample CSV file (has no header):

ab,asd
ab,poi
ab,asd
bg,put
bg,asd

Here is what I've tried so far.

import csv
from collections import defaultdict, Counter

input_file = open('results/1_sample.csv')
csv_reader = csv.reader(input_file, delimiter=',')

# Group the second-column values by the first-column key.
data = defaultdict(list)
for row in csv_reader:
    data[row[0]].append(row[1])
for k, v in data.items():
    print(k)
    print(Counter(v))

This gives output in this format:

ab
Counter({'asd': 2, 'poi': 1})
bg
Counter({'asd': 1, 'put': 1})

But I want output like this:

ab:2
bg:2
total_unique_count:3  # unique count of column[1], irrespective of data in column[0]

You're looking for the pandas SeriesGroupBy method nunique:

In [11]: df
Out[11]:
    0    1
0  ab  asd
1  ab  poi
2  ab  asd
3  bg  put
4  bg  asd

In [12]: g = df.groupby(0)

In [13]: g[1].nunique()
Out[13]:
0
ab    2
bg    2
Name: 1, dtype: int64
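The groupby result covers the per-key counts but not the total_unique_count line from the question. If pandas isn't available, one possible standard-library sketch of the whole requested output is below. It assumes the sample rows from the question are inlined as a string; the names SAMPLE and unique_counts are made up for illustration, and you would pass an open file object instead of the StringIO.

```python
import csv
import io
from collections import defaultdict

# Sample rows from the question, inlined so the sketch is self-contained.
# (SAMPLE and unique_counts are hypothetical names, not from the original post.)
SAMPLE = "ab,asd\nab,poi\nab,asd\nbg,put\nbg,asd\n"


def unique_counts(lines):
    """Return (per-key unique counts of column 1, total unique values in column 1)."""
    per_key = defaultdict(set)   # column 0 value -> set of column 1 values
    all_values = set()           # every distinct column 1 value, ignoring column 0
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        per_key[row[0]].add(row[1])
        all_values.add(row[1])
    return {key: len(values) for key, values in per_key.items()}, len(all_values)


counts, total = unique_counts(io.StringIO(SAMPLE))
for key, n in counts.items():
    print("%s:%d" % (key, n))
print("total_unique_count:%d" % total)
```

Sets are used instead of Counter because only the number of distinct values matters here, not how often each one occurs. In pandas, calling nunique on the column itself (df[1].nunique()) should give the same overall total.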
