python - How to speed up bulk insert to MS SQL Server from CSV using pyodbc


Below is the code I'd like some help with. I am having to run it over 1,300,000 rows, meaning it takes up to 40 minutes to insert ~300,000 rows.

I figure bulk insert is the route to go to speed it up? Or is it slow because I'm iterating over the rows via the for data in reader: portion?

# Opens the prepped csv file
with open(os.path.join(newpath, outfile), 'r') as f:
    # Hooks csv reader to the file
    reader = csv.reader(f)
    # Pulls out the columns (which match the SQL table)
    columns = next(reader)
    # Trims any spaces
    columns = [x.strip(' ') for x in columns]
    # Starts the SQL statement
    query = 'bulk insert into SpikeData123({0}) values ({1})'
    # Puts the column names into the SQL query 'query'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))

    print 'Query is: %s' % query
    # Starts the cursor from cnxn (which works)
    cursor = cnxn.cursor()
    # Uploads row by row
    for data in reader:
        cursor.execute(query, data)
        cursor.commit()

I am dynamically picking out my column headers on purpose (as I would like to create the most pythonic code possible).

SpikeData123 is the table name.

BULK INSERT will almost certainly be much faster than reading the source file row-by-row and doing a regular INSERT for each row. However, both BULK INSERT and BCP have a significant limitation regarding CSV files in that they cannot handle text qualifiers (ref: here). That is, if your CSV file does not have qualified text strings in it ...

1,Gord Thompson,2015-04-15
2,Bob Loblaw,2015-04-07

... then you can BULK INSERT it, but if it contains text qualifiers (because some text values contain commas) ...

1,"thompson, gord",2015-04-15 2,"loblaw, bob",2015-04-07 

... then BULK INSERT cannot handle it. Still, it might be faster overall to pre-process such a CSV file into a pipe-delimited file ...

1|Thompson, Gord|2015-04-15
2|Loblaw, Bob|2015-04-07

... or a tab-delimited file (where → represents the tab character) ...

1→Thompson, Gord→2015-04-15
2→Loblaw, Bob→2015-04-07

... and then BULK INSERT that file. For the latter (tab-delimited) file the BULK INSERT code would look something like this (a sketch of the pre-processing conversion itself follows the code):

import pypyodbc
conn_str = "DSN=myDb_SQLEXPRESS;"
cnxn = pypyodbc.connect(conn_str)
crsr = cnxn.cursor()
sql = """
BULK INSERT myDb.dbo.SpikeData123
FROM 'C:\\__tmp\\biTest.txt'
WITH (
    FIELDTERMINATOR='\\t',
    ROWTERMINATOR='\\n'
    );
"""
crsr.execute(sql)
cnxn.commit()
crsr.close()
cnxn.close()
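
As for the pre-processing conversion itself, here is a minimal sketch (not from the original answer; Python 3 style, with placeholder file names) that rewrites a quoted, comma-delimited CSV as the tab-delimited file used above:

import csv

# Placeholder paths; adjust to the actual prepped CSV and the output file
with open('C:\\__tmp\\biTest.csv', 'r', newline='') as fin, \
        open('C:\\__tmp\\biTest.txt', 'w', newline='') as fout:
    reader = csv.reader(fin)  # correctly parses "quoted, text" fields
    # QUOTE_NONE is fine as long as the data itself contains no tab
    # or quote characters
    writer = csv.writer(fout, delimiter='\t', quoting=csv.QUOTE_NONE,
                        lineterminator='\n')
    for row in reader:
        writer.writerow(row)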

Note: As mentioned in a comment, executing a BULK INSERT statement is only applicable if the SQL Server instance can directly read the source file. For cases where the source file is on a remote client, see this answer.
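
As a rough sketch for that remote-client case (not part of the original answer): with plain pyodbc the rows can at least be sent in batches via executemany, and pyodbc 4.0.19 and later offer cursor.fast_executemany to speed that path up considerably. Assuming the same SpikeData123 table and a placeholder DSN and file path:

import csv
import pyodbc

# Placeholder DSN; assumes the same SpikeData123 table as above
cnxn = pyodbc.connect("DSN=myDb_SQLEXPRESS;")
crsr = cnxn.cursor()
crsr.fast_executemany = True  # available in pyodbc 4.0.19 and later

with open('C:\\__tmp\\biTest.csv', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader)  # header row matches the table columns
    sql = "INSERT INTO dbo.SpikeData123 ({0}) VALUES ({1})".format(
        ','.join(columns), ','.join('?' * len(columns)))
    # Send all rows as one parameterized batch instead of committing per row
    crsr.executemany(sql, list(reader))

cnxn.commit()
crsr.close()
cnxn.close()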

