Find the median from a CSV file using Python


I have a CSV file named 'salaries.csv'. Its contents are as follows:

city,job,salary
delhi,doctors,500
delhi,lawyers,400
delhi,plumbers,100
london,doctors,800
london,lawyers,700
london,plumbers,300
tokyo,doctors,900
tokyo,lawyers,800
tokyo,plumbers,400
lawyers,doctors,300
lawyers,lawyers,400
lawyers,plumbers,500
hong kong,doctors,1800
hong kong,lawyers,1100
hong kong,plumbers,1000
moscow,doctors,300
moscow,lawyers,200
moscow,plumbers,100
berlin,doctors,800
berlin,plumbers,900
paris,doctors,900
paris,lawyers,800
paris,plumbers,500
paris,dog catchers,400

I need to print the median salary of each profession. I tried the code below, but it raises an error.

My code:

from StringIO import StringIO
import sqlite3
import csv
import operator #from operator import itemgetter, attrgetter

data = open('sal.csv', 'r').read()
string = ''.join(data)
f = StringIO(string)
reader = csv.reader(f)
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('''create table data (city text, job text, salary real)''')
conn.commit()
count = 0

for e in reader:
    if count==0:
        print ""
    else:
        e[0]=str(e[0])
        e[1]=str(e[1])
        e[2] = float(e[2])
        c.execute("""insert into data values (?,?,?)""", e)
        count=count+1
        conn.commit()

labels = []
counts = []
count = 0
c.execute('''select count(salary),job from data group by job''')

for row in c:
      for i in row:
            if count==0:
               counts.append(i)
               count=count+1
           else:
                count=0
      labels.append(i)

c.execute('''select salary,job from data order by job''')

count = 1
count1 = 1
temp = 0
pri = 0
lis = []

for row in c:
      lis.append(row)
for cons in counts:
      if cons%2 == 0:
         pri = cons/2
     else:
         pri = (cons+1)/2
     if count1 == 1:
        for li in lis:
              if count == pri:
                  print "median ",li
        count = count + 1
        count = 0
        temp = pri+cons
     else:
        for li in lis:
              if count == temp:
                  print "median is",li
              count = count+1
              count = 0
              temp = temp + pri
       count1 = count1 + 1

However, it shows this error:

IndentationError('expected an indented block', ('', 28, 2, 'if count==0:\n'))

How do I fix this error?
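On the error itself: Python raises IndentationError when the line after a block header such as `for`, `if`, or `else` is not indented more deeply than the header, and also when an `else` does not line up with its `if`. The following is a minimal sketch of that behaviour in Python 3, using hypothetical two-line snippets rather than the script from the question. The helper name `check_indentation` is illustrative only. It relies on the built-in `compile()`, which parses source without running it.

```python
# Hypothetical snippet: the body of the `for` loop is not indented,
# which is the situation "expected an indented block" describes.
broken = "for e in [1, 2]:\nif e == 1:\n    pass\n"

# The same snippet with the loop body indented one level deeper.
fixed = "for e in [1, 2]:\n    if e == 1:\n        pass\n"


def check_indentation(source):
    """Return None if `source` parses, else (line number, message)."""
    try:
        compile(source, "<snippet>", "exec")
    except IndentationError as err:
        return err.lineno, err.msg
    return None


print(check_indentation(broken))
print(check_indentation(fixed))
```

The reported line number points at the first line the parser could not place, so the actual mistake is usually on that line or on the block header just above it. Using the same number of spaces for every level, and never mixing tabs with spaces, avoids this class of error.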

You can use a defaultdict to collect the salaries for each profession in a list, then take the median of each list.

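import csv
from collections import defaultdict

with open("c:/users/jimenez/desktop/a.csv","r") as f:
    # Map each job to the list of salaries seen for it
    d = defaultdict(list)
    reader = csv.reader(f)
    reader.next()  # skip the header row (Python 2)
    for row in reader:
        d[row[1]].append(float(row[2]))

for k,v in d.iteritems():
    # Middle element of the sorted list; for an even number of
    # salaries this picks the upper of the two middle values
    print "{} median {}".format(k,sorted(v)[len(v) // 2])
    print "{} average {}".format(k,sum(v)/len(v))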

Output:

plumbers median 500.0
plumbers average 475.0
lawyers median 700.0
lawyers average 628.571428571
dog catchers median 400.0
dog catchers average 400.0
doctors median 800.0
doctors average 787.5
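The code above is Python 2 and takes the upper middle value when a profession has an even number of salaries. A rough Python 3 sketch of the same grouping idea follows, assuming the standard-library `statistics.median` is acceptable; it averages the two middle values for even-length lists. The function name `median_salary_by_job` and the inlined `SAMPLE_CSV` (a handful of rows copied from the question's data) are illustrative choices, not part of the original answer.

```python
import csv
import io
import statistics
from collections import defaultdict

# A few rows from the question's file, inlined so the sketch runs on its own.
SAMPLE_CSV = """city,job,salary
delhi,doctors,500
delhi,lawyers,400
delhi,plumbers,100
london,doctors,800
london,lawyers,700
london,plumbers,300
berlin,doctors,800
berlin,plumbers,900
"""


def median_salary_by_job(csv_file):
    """Group salaries by the 'job' column and return {job: median salary}."""
    salaries = defaultdict(list)
    # DictReader consumes the header row and exposes columns by name
    for row in csv.DictReader(csv_file):
        salaries[row["job"]].append(float(row["salary"]))
    return {job: statistics.median(values) for job, values in salaries.items()}


medians = median_salary_by_job(io.StringIO(SAMPLE_CSV))
for job, median in sorted(medians.items()):
    print("{} median {}".format(job, median))
```

To run it against the real file, pass an open file object instead of the `io.StringIO` wrapper, for example `open("salaries.csv", newline="")`. On the full data set this approach would report 450.0 for plumbers rather than 500.0, because the two middle salaries (400 and 500) are averaged.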
