Find the median from a CSV file using Python
I have a CSV file named 'salaries.csv'. The content of the file is as follows:
city,job,salary
delhi,doctors,500
delhi,lawyers,400
delhi,plumbers,100
london,doctors,800
london,lawyers,700
london,plumbers,300
tokyo,doctors,900
tokyo,lawyers,800
tokyo,plumbers,400
lawyers,doctors,300
lawyers,lawyers,400
lawyers,plumbers,500
hong kong,doctors,1800
hong kong,lawyers,1100
hong kong,plumbers,1000
moscow,doctors,300
moscow,lawyers,200
moscow,plumbers,100
berlin,doctors,800
berlin,plumbers,900
paris,doctors,900
paris,lawyers,800
paris,plumbers,500
paris,dog catchers,400
I need to print the median salary of each profession. I tried the code below, but it shows an error.
My code:
    from StringIO import StringIO
    import sqlite3
    import csv
    import operator  #from operator import itemgetter, attrgetter

    data = open('sal.csv', 'r').read()
    string = ''.join(data)
    f = StringIO(string)
    reader = csv.reader(f)
    conn = sqlite3.connect(':memory:')
    c = conn.cursor()
    c.execute('''create table data (city text, job text, salary real)''')
    conn.commit()
    count = 0
    for e in reader:
        if count==0:
            print ""
        else:
            e[0]=str(e[0])
            e[1]=str(e[1])
            e[2] = float(e[2])
            c.execute("""insert into data values (?,?,?)""", e)
        count=count+1
    conn.commit()

    labels = []
    counts = []
    count = 0
    c.execute('''select count(salary),job from data group by job''')
    for row in c:
        for i in row:
            if count==0:
                counts.append(i)
                count=count+1
            else:
                count=0
                labels.append(i)

    c.execute('''select salary,job from data order by job''')
    count = 1
    count1 = 1
    temp = 0
    pri = 0
    lis = []
    for row in c:
        lis.append(row)
    for cons in counts:
        if cons%2 == 0:
            pri = cons/2
        else:
            pri = (cons+1)/2
        if count1 == 1:
            for li in lis:
                if count == pri:
                    print "median ",li
                count = count + 1
            count = 0
            temp = pri+cons
        else:
            for li in lis:
                if count == temp:
                    print "median is",li
                count = count+1
            count = 0
            temp = temp + pri
        count1 = count1 + 1
However, it shows this error:
    IndentationError('expected an indented block', ('', 28, 2, 'if count==0:\n'))
How do I fix this error?
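For reference, Python raises "expected an indented block" when the statement that follows a line ending in a colon (such as a for or if line) is not indented further than that line. The sketch below is only an illustration: the two snippet strings and the compiles() helper are hypothetical and are not taken from the file in the question. It uses the built-in compile() to show that the same loop is rejected without indentation and accepted with it.

```python
# Hypothetical snippets for illustration; they are not the asker's exact file.
bad_snippet = "for e in [1, 2]:\nif e == 1:\n    pass\n"
good_snippet = "for e in [1, 2]:\n    if e == 1:\n        pass\n"


def compiles(source):
    """Return True if the source compiles, False on an IndentationError."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except IndentationError:
        return False


# The body of the for loop is not indented, so compilation fails.
print(compiles(bad_snippet))
# The body is indented one level under the for line, so compilation succeeds.
print(compiles(good_snippet))
```

In the question's code, the line named in the error ('if count==0:') would need to sit one indentation level deeper than the for line directly above it, using the same kind of whitespace throughout.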
You can use a defaultdict to collect the salaries for each profession and then take the median of each list.
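    # Python 2 code (print statement, reader.next(), dict.iteritems())
    import csv
    from collections import defaultdict

    with open("c:/users/jimenez/desktop/a.csv","r") as f:
        d = defaultdict(list)   # maps job -> list of salaries
        reader = csv.reader(f)
        reader.next()           # skip the header row
        for row in reader:
            d[row[1]].append(float(row[2]))

    for k,v in d.iteritems():
        # sorted(v)[len(v) // 2] is the middle element; for an even number
        # of values it is the upper of the two middle values
        print "{} median {}".format(k,sorted(v)[len(v) // 2])
        print "{} average {}".format(k,sum(v)/len(v))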
Output:
    plumbers median 500.0
    plumbers average 475.0
    lawyers median 700.0
    lawyers average 628.571428571
    dog catchers median 400.0
    dog catchers average 400.0
    doctors median 800.0
    doctors average 787.5
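Note that sorted(v)[len(v) // 2] returns the upper of the two middle values when a profession has an even number of salaries, which is why plumbers show 500.0 above; averaging the two middle values (400 and 500) would give 450.0. A minimal Python 3 sketch of that variant follows, assuming the standard-library statistics.median is acceptable. The median_by_job() helper and the SAMPLE_CSV string are names introduced here for illustration, and the sample holds only the first six data rows from the question so that the block runs without a file on disk.

```python
import csv
import io
from collections import defaultdict
from statistics import median

# The first six data rows from the question, inlined so the sketch is
# self-contained. This is a subset, not the full file.
SAMPLE_CSV = """city,job,salary
delhi,doctors,500
delhi,lawyers,400
delhi,plumbers,100
london,doctors,800
london,lawyers,700
london,plumbers,300
"""


def median_by_job(csv_file):
    """Group salaries by the job column and return {job: median salary}."""
    salaries = defaultdict(list)
    for row in csv.DictReader(csv_file):
        salaries[row["job"]].append(float(row["salary"]))
    # statistics.median averages the two middle values for even-sized lists.
    return {job: median(values) for job, values in salaries.items()}


result = median_by_job(io.StringIO(SAMPLE_CSV))
for job, value in sorted(result.items()):
    print("{} median {}".format(job, value))
```

To run it against a real file, an open file object can be passed instead of the StringIO, for example median_by_job(open('salaries.csv', newline='')).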