By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
446,190 Members | 764 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 446,190 IT Pros & Developers. It's quick & easy.

Can I get help importing an excel document into mysql using python please?

P: 27
Hello,
This is my first time posting and just my second day using python on mysql, so please be patient with me. I should also say, I am using a linux machine. Using python I was able to make a database and a table. However, I am having difficulty getting the values from an excel and putting them into the table. The Excel document has just three columns, which are:
MATCH - this is just a four digit number
DATE - this is just a date in the 2009-06-29 format
POS - this is either just the phrase 'LONG' or 'SHORT'

However, after I do this and try to view the data, the MATCH values come out perfectly fine. The date values come out 0000-00-00 for all the dates, so that is not correct. and for the POS, I get this error:
mysql_exceptions.OperationalError: (1054, “Unkown column 'SHORT' in 'field list'”)

I have tried changing the variable types from VARCHAR to TEXT or DATE, but that didn't seem to work. Really any suggestions would be greatly appreciated.

The Code I am using is:
Expand|Select|Wrap|Line Numbers
  1. from xlrd import open_workbook, cellname
  2. import MySQLdb as mysql
  3. db=mysql.connect(db_info)
  4. c=db.cursor()
  5. c.execute('CREATE DATABASE actual_data')
  6.  
  7. db=mysql.connect(db_info)
  8. c=db.cursor()
  9. c.execute('CREATE TABLE actual_data_table (MATCH_ID VARCHAR(35) NOT NULL,DATETIME VARCHAR(35) NOT NULL,POS_TYPE VARCHAR(35) NOT NULL)')
  10.  
  11. file_to_import='actualdata.xls'
  12. column_count=5
  13. book=open_workbook(file_to_import)
  14. sheet=book.sheet_by_index(0)
  15. conn=mysql.connect(db_info)
  16. cursor=conn.cursor()
  17.  
  18. for row_index in range(sheet.nrows):
  19.  
  20.     row_num=row_index
  21.     MATCH_ID=sheet.cell(row_index,0).value
  22.     DATETIME=sheet.cell(row_index,1).value
  23.     POS_TYPE=sheet.cell(row_index,2).value
  24.  
  25.     cursor.execute('INSERT INTO actual_data_table(MATCH_ID, DATETIME, POS_TYPE) VALUES (%s, %s, %s)'%(MATCH_ID, DATETIME, POS_TYPE))
  26.  
  27.     (MATCH_ID,DATETIME,POS_TYPE)
  28.  
  29.  
  30. cursor.close()
  31. conn.commit()
  32. conn.close()
Aug 18 '10 #1

✓ answered by bvdet

No problem:
Expand|Select|Wrap|Line Numbers
  1. >>> result
  2. (('15.64',), ('-5.2',), ('24.25',), ('22.18',), ('14.52',), ('21.8',), ('21.2804',), ('-0.13',), ('15.48',), ('2.2',), ('-6.81',))
  3. >>> for item in result:
  4. ...     print float(item[0])
  5. ...     
  6. 15.64
  7. -5.2
  8. 24.25
  9. 22.18
  10. 14.52
  11. 21.8
  12. 21.2804
  13. -0.13
  14. 15.48
  15. 2.2
  16. -6.81
  17. >>> sum([float(item[0]) for item in result])/len(result)
  18. 11.382763636363636
  19. >>> 

Share this Question
Share on Google+
6 Replies


Expert 100+
P: 626
You first want to make sure that the data types are the same, so read the first 10 recs or so and print out the fields, and type(DATE), etc. and make sure they are strings. Second, I think the format for a MySQL insert is (note that you should be using "c" according to your code, and not "cursor")
c.execute('insert into actual_data_table values ("%s","%s","%s")' % (MATCH_ID, DATETIME, POS_TYPE))

Another option just in case because I use another SQL and have both of these examples in the MySQL notes
c.execute(u'''INSERT INTO actual_data_table VALUES (%s, %s, %s)''', MATCH_ID, DATETIME, POS_TYPE)
This avoids SQL injection AFAIK

Finally, you have duplicate code. You can delete the first occurrence of
Expand|Select|Wrap|Line Numbers
  1. db=mysql.connect(db_info)
  2. c=db.cursor()
  3. c.execute('CREATE DATABASE actual_data')
Aug 18 '10 #2

bvdet
Expert Mod 2.5K+
P: 2,851
Excel stores dates as integers. When you read an actual date from an Excel file with xlrd, the date comes in as a float. It represents the number of days since Jan 1, 1900 (Windows). To input the proper information into your database, you can first convert the float into a date. This works for me on my system:
Expand|Select|Wrap|Line Numbers
  1. import time
  2. import datetime
  3.  
  4. wb = xlrd.open_workbook('data.xls')
  5. sheet1 = wb.sheet_by_index(0)
  6.  
  7. rowList = sheet1.row_values(1)
  8. dateStr = datetime.datetime(*time.strptime("Dec 30 1899", "%b %d %Y")[:6]) + \
  9. datetime.timedelta(days=int(rowList[1]))
I used "Dec 30, 1899" instead of "Jan 1, 1900" because of a 2 day difference between expected results and actual results. It's possible Excel and Python calculate leap years differently.
Aug 18 '10 #3

P: 27
Hello,
I am attempting to import an excel document into a mysql database using python. The excel document has three columns of data: MATCH_ID, DATETIME, and POS_TYPE. Using the code I'll show at the end, I have successfully imported the excel document except for one problem. The column DATETIME has values in the format of 'yyyy-mm-dd'. Yet when I try the importing, all the values just come out as '0000-00-00'. When I tried importing this as a varchar file, all of the numbers that came out were just a large five-digit number. I;m thinking this may have something to do with the formating of the cells in excel. The other two values which are just a number string and a short word import perfectly. I have tried changing the variable type, but that was not helpful. Clearly form my code you can see I quite new at this. Your help will be very much appreciated.

Expand|Select|Wrap|Line Numbers
  1. from xlrd import open_workbook, cellname
  2. import MySQLdb as mysql
  3. db=mysql.connect(db_info)
  4. c=db.cursor()
  5. c.execute('CREATE DATABASE actual_data')
  6.  
  7. db=mysql.connect(db_info)
  8. c=db.cursor()
  9. c.execute('CREATE TABLE actual_data_table (MATCH_ID VARCHAR(35) NOT NULL,DATETIME DATE NOT NULL,POS_TYPE VARCHAR(35) NOT NULL, TICKER VARCHAR(35) NOT NULL, EXIT_TYPE TEXT NOT NULL)')
  10.  
  11. file_to_import='actualdata.xls'
  12. column_count=5
  13. book=open_workbook(file_to_import)
  14. sheet=book.sheet_by_index(0)
  15. db=mysql.connect(db_info)
  16. c=db.cursor()
  17.  
  18. for row_index in range(sheet.nrows):
  19.  
  20.     row_num=row_index
  21.     MATCH_ID=sheet.cell(row_index,0).value
  22.     DATETIME=sheet.cell(row_index,1).value
  23.     POS_TYPE=sheet.cell(row_index,2).value
  24.     c.execute('insert into actual_data_table values ("%s","%s","%s")' % (MATCH_ID, DATETIME, POS_TYPE))
  25.     (MATCH_ID,DATETIME,POS_TYPE)
  26.  
  27. cursor.close()
  28. conn.commit()
  29. conn.close()  
  30.  
Aug 20 '10 #4

bvdet
Expert Mod 2.5K+
P: 2,851
Ron,

Please read my reply to your thread here.

To summarize, Excel stores dates as a number and formats the number for display. The number represents the number of days since Jan 1, 1900 (Windows). You can use that number to calculate the date.

bvdet
Aug 20 '10 #5

P: 27
Sorry, I did not keep scrolling down to see the second part of your message. However, that was a LOT of help. THANK YOU!!

However, I have another question that should be much quicker. Using the following code I selected 11 numbers.
Expand|Select|Wrap|Line Numbers
  1. c.execute('SELECT REF_PNL FROM actual_data_table WHERE REF_ENTRY_TIME>"2010-06-29 09:45:00" and REF_ENTRY_TIME<"2010-06-29 10:00:00"')
  2. result1=c.fetchall()
  3.  
when I try to output these results, they come out like this:
>>>result1
(('15.64',), ('-5.2',), ('24.25',), ('22.18',), ('14.52',), ('21.8',), ('21.2804',), ('-0.13',), ('15.48',), ('2.2',), ('-6.81',))
This is fine, however, I can't seem to do any computations with these numbers. I'm trying to take the average of these numbers using numpy. But then I get this error:
TypeError: cannot perform reduce with flexible type

The result looks like a tuple within a tuple or something. So is there anyway to break the results down into regular computable numbers?
Aug 20 '10 #6

bvdet
Expert Mod 2.5K+
P: 2,851
No problem:
Expand|Select|Wrap|Line Numbers
  1. >>> result
  2. (('15.64',), ('-5.2',), ('24.25',), ('22.18',), ('14.52',), ('21.8',), ('21.2804',), ('-0.13',), ('15.48',), ('2.2',), ('-6.81',))
  3. >>> for item in result:
  4. ...     print float(item[0])
  5. ...     
  6. 15.64
  7. -5.2
  8. 24.25
  9. 22.18
  10. 14.52
  11. 21.8
  12. 21.2804
  13. -0.13
  14. 15.48
  15. 2.2
  16. -6.81
  17. >>> sum([float(item[0]) for item in result])/len(result)
  18. 11.382763636363636
  19. >>> 
Aug 20 '10 #7

Post your reply

Sign in to post your reply or Sign up for a free account.