By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
425,534 Members | 1,830 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 425,534 IT Pros & Developers. It's quick & easy.

How do I read Excel file in Python?

P: n/a
How do I read an Excel file in Python?

I have found a package to read excel file, which can be used on any
platform.

http://www.lexicon.net/sjmachin/xlrd.htm
I installed and working on the examples, I found its printing of cell's
contents in a different manner.
>>import xlrd
book=xlrd.open_workbook("Calculation_file.xls" )
book=xlrd.open_workbook("testbook.xls")
sh=book.sheet_by_index(0)
for row in range(sh.nrows):
print sh.row(rx)
[text:u'name', text:u'address', text:u'ph']
[text:u'sudhir', text:u'bangalore', number:1234.0]
[text:u'vinay', text:u'bangalore', number:3264.0]

I am bit confused with slicing. help me....
Thank you,
Regards,
Sudhir.

Oct 5 '06 #1
Share this Question
Share on Google+
12 Replies


P: n/a

kath wrote:
How do I read an Excel file in Python?

I have found a package to read excel file, which can be used on any
platform.
Hi Sudhir,
So far, so good :-)
>
http://www.lexicon.net/sjmachin/xlrd.htm
I installed and working on the examples, I found its printing of cell's
contents in a different manner.
>import xlrd
book=xlrd.open_workbook("Calculation_file.xls")
book=xlrd.open_workbook("testbook.xls")
sh=book.sheet_by_index(0)
for row in range(sh.nrows):
print sh.row(rx)
[text:u'name', text:u'address', text:u'ph']
[text:u'sudhir', text:u'bangalore', number:1234.0]
[text:u'vinay', text:u'bangalore', number:3264.0]
It helps when asking questions if you copy/paste exactly what is on
your screen;
in this case
print sh.row(rx)
would have given an error; you must have typed
for rx in range.....

A row is returned as a sequence of Cell objects. What you are seeing is
Python automatically doing repr(cell) on each cell in the row. The
Cell.__repr__ method formats it that way for debugging. Here are some
examples from a little test file of mine:
>>import xlrd
bk = xlrd.open_workbook('sjm1.xls')
sh = bk.sheet_by_index(0)
row0 = sh.row(0)
row0
[text:u'fubar', number:1.0, number:2.0]
>>firstcell = row0[0]
type(firstcell)
<class 'xlrd.sheet.Cell'>
>>firstcell.ctype
1
>># cell type 1 is text
firstcell.value
u'fubar'
>>repr(firstcell)
"text:u'fubar'"
>
I am bit confused with slicing. help me....
None of the above is anything to do with slicing; is this a 2nd
problem?

Perhaps you are having trouble with this:
>>help(sh.row_slice)
Help on method row_slice in module xlrd.sheet:

row_slice(self, rowx, start_colx=0, end_colx=None) method of
xlrd.sheet.Sheet instance
##
# Returns a slice of the Cell objects in the given row.
>>>
sh.row_slice(rowx, lo, hi) gives the same result as sh.row(rowx)[lo:hi]
-- it is provided because the latter would be inefficient for getting a
small slice from a long row.

If you are having trouble with the general concept of slicing, perhaps
you might like to try the Python tutorial. Otherwise, please try to be
a bit more specific about what the confusion is.

HTH, and e-mail me if you prefer ...

Cheers,
John

Oct 5 '06 #2

P: n/a

John Machin wrote:
kath wrote:
How do I read an Excel file in Python?

I have found a package to read excel file, which can be used on any
platform.

Hi Sudhir,
So far, so good :-)

http://www.lexicon.net/sjmachin/xlrd.htm
I installed and working on the examples, I found its printing of cell's
contents in a different manner.
>>import xlrd
>>book=xlrd.open_workbook("Calculation_file.xls" )
>>book=xlrd.open_workbook("testbook.xls")
>>sh=book.sheet_by_index(0)
>>for row in range(sh.nrows):
print sh.row(rx)
[text:u'name', text:u'address', text:u'ph']
[text:u'sudhir', text:u'bangalore', number:1234.0]
[text:u'vinay', text:u'bangalore', number:3264.0]

It helps when asking questions if you copy/paste exactly what is on
your screen;
in this case
print sh.row(rx)
would have given an error; you must have typed
for rx in range.....

A row is returned as a sequence of Cell objects. What you are seeing is
Python automatically doing repr(cell) on each cell in the row. The
Cell.__repr__ method formats it that way for debugging. Here are some
examples from a little test file of mine:
>import xlrd
bk = xlrd.open_workbook('sjm1.xls')
sh = bk.sheet_by_index(0)
row0 = sh.row(0)
row0
[text:u'fubar', number:1.0, number:2.0]
>firstcell = row0[0]
type(firstcell)
<class 'xlrd.sheet.Cell'>
>firstcell.ctype
1
># cell type 1 is text
firstcell.value
u'fubar'
>repr(firstcell)
"text:u'fubar'"

I am bit confused with slicing. help me....
None of the above is anything to do with slicing; is this a 2nd
problem?

Perhaps you are having trouble with this:
>help(sh.row_slice)
Help on method row_slice in module xlrd.sheet:

row_slice(self, rowx, start_colx=0, end_colx=None) method of
xlrd.sheet.Sheet instance
##
# Returns a slice of the Cell objects in the given row.
>>

sh.row_slice(rowx, lo, hi) gives the same result as sh.row(rowx)[lo:hi]
-- it is provided because the latter would be inefficient for getting a
small slice from a long row.

If you are having trouble with the general concept of slicing, perhaps
you might like to try the Python tutorial. Otherwise, please try to be
a bit more specific about what the confusion is.

HTH, and e-mail me if you prefer ...

Cheers,
John

Hi, thanks for the reply. I just took some time reading help file and
came to know to there is nothing do with slicing. But I do have a
problem with date field in the excel file.

the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?
Thank you.
regards,
Sudhir.

Oct 5 '06 #3

P: n/a
the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?
38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.

-Matt

Oct 5 '06 #4

P: n/a
Matimus wrote:
>>the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?


38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.
I believe the win32all extension contains functionality to help with
that, thought it's a long time since I even ran PythonWin (sorry, Mark).

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Oct 5 '06 #5

P: n/a
On 5 Oct 2006 10:25:37 -0700, Matimus <mc******@gmail.comwrote:
the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?

38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.
>>excel_date = 38938.0
python_date = datetime.date(1900, 1, 1) +
datetime.timedelta(days=excel_date)
>>python_date
datetime.date(2006, 8, 11)

--
Cheers,
Simon B
si***@brunningonline.net
http://www.brunningonline.net/simon/blog/
Oct 5 '06 #6

P: n/a
On 10/5/06, Simon Brunning <si***@brunningonline.netwrote:
On 5 Oct 2006 10:25:37 -0700, Matimus <mc******@gmail.comwrote:
the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?
38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.
>excel_date = 38938.0
python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
python_date
datetime.date(2006, 8, 11)
Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?

--
Cheers,
Simon B
si***@brunningonline.net
http://www.brunningonline.net/simon/blog/
Oct 5 '06 #7

P: n/a
>>excel_date = 38938.0
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>>python_date
datetime.date(2006, 8, 11)

Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
python_date
datetime.date(2006, 8, 9)

HTH.

Oct 5 '06 #8

P: n/a
ho**********@yahoo.com wrote:
>excel_date = 38938.0
>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>python_date
datetime.date(2006, 8, 11)
Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?

Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
python_date
datetime.date(2006, 8, 9)
.... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.

All the OP needs to do is to read the documentation that comes with the
xlrd package. It describes the problems with Excel dates, and offers
functions for conversion between the Excel date numbers and (year,
month, day, hour, minute, second) tuples which of course are
interoperable with Python's datetime module and with mx.DateTime.

| >>import xlrd
| >>xlrd.xldate_as_tuple(38938.0, 0)
| (2006, 8, 9, 0, 0, 0)
| >>xlrd.xldate_as_tuple(38938.0, 1)
| (2010, 8, 10, 0, 0, 0)
| >>>

Cheers,
John

Oct 5 '06 #9

P: n/a
On 5 Oct 2006 12:49:53 -0700, ho**********@yahoo.com
<ho**********@yahoo.comwrote:
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm
Ah - I was reading the OP's 8/9/2006 in the European way - DD/MM/YYYY.

One might argue over whether DD/MM/YYYY or MM/DD/YYYY are more
rational, but I find it best to avoid *both* those formats, 'cos they
are so easily confused.

--
Cheers,
Simon B
si***@brunningonline.net
http://www.brunningonline.net/simon/blog/
Oct 6 '06 #10

P: n/a
John Machin wrote:
ho**********@yahoo.com wrote:
>>excel_date = 38938.0
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>>python_date
datetime.date(2006, 8, 11)
>
Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
>>python_date
datetime.date(2006, 8, 9)

... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.
John,
Just for me own curiosity, is this Excel concept of date numbers same
as the OLE
concept (see http://msdn2.microsoft.com/en-us/library/82ab7w69.aspx or
search "MFC DATE" on MSDN).

I put in some test cases for conversion code here:
http://aspn.activestate.com/ASPN/Coo.../Recipe/496683

But would be interested to add any additional info on variations on
this theme.

Cheers,
Giles

Oct 6 '06 #11

P: n/a
John Machin wrote:
ho**********@yahoo.com wrote:
>>excel_date = 38938.0
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>>python_date
datetime.date(2006, 8, 11)
>
Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
>>python_date
datetime.date(2006, 8, 9)

... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.

All the OP needs to do is to read the documentation that comes with the
xlrd package. It describes the problems with Excel dates, and offers
functions for conversion between the Excel date numbers and (year,
month, day, hour, minute, second) tuples which of course are
interoperable with Python's datetime module and with mx.DateTime.

| >>import xlrd
| >>xlrd.xldate_as_tuple(38938.0, 0)
| (2006, 8, 9, 0, 0, 0)
| >>xlrd.xldate_as_tuple(38938.0, 1)
| (2010, 8, 10, 0, 0, 0)
| >>>

Cheers,
John


Hi,
>>import xlrd
book = xlrd.open_workbook("testbook1.xls")
sh = book.sheet_by_index(0)
sh.cell_value(rowx=1,colx=0)
38938.0
>>type(sh.cell_value(rowx=1,colx=0))
<type 'unicode'>
>>xlrd.xldate_as_tuple( sh.cell_value( rowx = 1,colx= 0 ), 0 )
Traceback (most recent call last):
File "D:\Python23\Testing area\Python and Excel\xlrdRead.py", line
30, in ?
temp=xlrd.xldate_as_tuple(sh.cell_value(rowx=r,col x=c),0)
File "D:\PYTHON23\Lib\site-packages\xlrd\xldate.py", line 61, in
xldate_as_tuple
xldays = int(xldate)
ValueError: invalid literal for int(): Date

because xlrd.xldate_as_tuple() function expects first argument to be an
integer. How do I convert an unicode character to integer, so that I
could get the date using xlrd.xldate_as_tuple() function.

Thank you,
sudhir.

Oct 6 '06 #12

P: n/a

Giles Brown wrote:
John Machin wrote:
ho**********@yahoo.com wrote:
>excel_date = 38938.0
>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>python_date
datetime.date(2006, 8, 11)

Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?
>
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm
>
Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
>python_date
datetime.date(2006, 8, 9)
>
... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.

John,
Just for me own curiosity, is this Excel concept of date numbers same
as the OLE
concept (see http://msdn2.microsoft.com/en-us/library/82ab7w69.aspx or
search "MFC DATE" on MSDN).

I put in some test cases for conversion code here:
http://aspn.activestate.com/ASPN/Coo.../Recipe/496683

But would be interested to add any additional info on variations on
this theme.
Hi Giles,

Those OLE date numbers coincide with Excel 1900-system date numbers
from 1900-03-01 onwards. Excel treats day 60 as the non-existent
1900-02-29.

Cheers,
John

Oct 6 '06 #13

This discussion thread is closed

Replies have been disabled for this discussion.