473,408 Members | 2,009 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,408 software developers and data experts.

How do I read Excel file in Python?

How do I read an Excel file in Python?

I have found a package to read excel file, which can be used on any
platform.

http://www.lexicon.net/sjmachin/xlrd.htm
I installed and working on the examples, I found its printing of cell's
contents in a different manner.
>>import xlrd
book=xlrd.open_workbook("Calculation_file.xls" )
book=xlrd.open_workbook("testbook.xls")
sh=book.sheet_by_index(0)
for row in range(sh.nrows):
print sh.row(rx)
[text:u'name', text:u'address', text:u'ph']
[text:u'sudhir', text:u'bangalore', number:1234.0]
[text:u'vinay', text:u'bangalore', number:3264.0]

I am bit confused with slicing. help me....
Thank you,
Regards,
Sudhir.

Oct 5 '06 #1
12 5984

kath wrote:
How do I read an Excel file in Python?

I have found a package to read excel file, which can be used on any
platform.
Hi Sudhir,
So far, so good :-)
>
http://www.lexicon.net/sjmachin/xlrd.htm
I installed and working on the examples, I found its printing of cell's
contents in a different manner.
>import xlrd
book=xlrd.open_workbook("Calculation_file.xls")
book=xlrd.open_workbook("testbook.xls")
sh=book.sheet_by_index(0)
for row in range(sh.nrows):
print sh.row(rx)
[text:u'name', text:u'address', text:u'ph']
[text:u'sudhir', text:u'bangalore', number:1234.0]
[text:u'vinay', text:u'bangalore', number:3264.0]
It helps when asking questions if you copy/paste exactly what is on
your screen;
in this case
print sh.row(rx)
would have given an error; you must have typed
for rx in range.....

A row is returned as a sequence of Cell objects. What you are seeing is
Python automatically doing repr(cell) on each cell in the row. The
Cell.__repr__ method formats it that way for debugging. Here are some
examples from a little test file of mine:
>>import xlrd
bk = xlrd.open_workbook('sjm1.xls')
sh = bk.sheet_by_index(0)
row0 = sh.row(0)
row0
[text:u'fubar', number:1.0, number:2.0]
>>firstcell = row0[0]
type(firstcell)
<class 'xlrd.sheet.Cell'>
>>firstcell.ctype
1
>># cell type 1 is text
firstcell.value
u'fubar'
>>repr(firstcell)
"text:u'fubar'"
>
I am bit confused with slicing. help me....
None of the above is anything to do with slicing; is this a 2nd
problem?

Perhaps you are having trouble with this:
>>help(sh.row_slice)
Help on method row_slice in module xlrd.sheet:

row_slice(self, rowx, start_colx=0, end_colx=None) method of
xlrd.sheet.Sheet instance
##
# Returns a slice of the Cell objects in the given row.
>>>
sh.row_slice(rowx, lo, hi) gives the same result as sh.row(rowx)[lo:hi]
-- it is provided because the latter would be inefficient for getting a
small slice from a long row.

If you are having trouble with the general concept of slicing, perhaps
you might like to try the Python tutorial. Otherwise, please try to be
a bit more specific about what the confusion is.

HTH, and e-mail me if you prefer ...

Cheers,
John

Oct 5 '06 #2

John Machin wrote:
kath wrote:
How do I read an Excel file in Python?

I have found a package to read excel file, which can be used on any
platform.

Hi Sudhir,
So far, so good :-)

http://www.lexicon.net/sjmachin/xlrd.htm
I installed and working on the examples, I found its printing of cell's
contents in a different manner.
>>import xlrd
>>book=xlrd.open_workbook("Calculation_file.xls" )
>>book=xlrd.open_workbook("testbook.xls")
>>sh=book.sheet_by_index(0)
>>for row in range(sh.nrows):
print sh.row(rx)
[text:u'name', text:u'address', text:u'ph']
[text:u'sudhir', text:u'bangalore', number:1234.0]
[text:u'vinay', text:u'bangalore', number:3264.0]

It helps when asking questions if you copy/paste exactly what is on
your screen;
in this case
print sh.row(rx)
would have given an error; you must have typed
for rx in range.....

A row is returned as a sequence of Cell objects. What you are seeing is
Python automatically doing repr(cell) on each cell in the row. The
Cell.__repr__ method formats it that way for debugging. Here are some
examples from a little test file of mine:
>import xlrd
bk = xlrd.open_workbook('sjm1.xls')
sh = bk.sheet_by_index(0)
row0 = sh.row(0)
row0
[text:u'fubar', number:1.0, number:2.0]
>firstcell = row0[0]
type(firstcell)
<class 'xlrd.sheet.Cell'>
>firstcell.ctype
1
># cell type 1 is text
firstcell.value
u'fubar'
>repr(firstcell)
"text:u'fubar'"

I am bit confused with slicing. help me....
None of the above is anything to do with slicing; is this a 2nd
problem?

Perhaps you are having trouble with this:
>help(sh.row_slice)
Help on method row_slice in module xlrd.sheet:

row_slice(self, rowx, start_colx=0, end_colx=None) method of
xlrd.sheet.Sheet instance
##
# Returns a slice of the Cell objects in the given row.
>>

sh.row_slice(rowx, lo, hi) gives the same result as sh.row(rowx)[lo:hi]
-- it is provided because the latter would be inefficient for getting a
small slice from a long row.

If you are having trouble with the general concept of slicing, perhaps
you might like to try the Python tutorial. Otherwise, please try to be
a bit more specific about what the confusion is.

HTH, and e-mail me if you prefer ...

Cheers,
John

Hi, thanks for the reply. I just took some time reading help file and
came to know to there is nothing do with slicing. But I do have a
problem with date field in the excel file.

the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?
Thank you.
regards,
Sudhir.

Oct 5 '06 #3
the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?
38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.

-Matt

Oct 5 '06 #4
Matimus wrote:
>>the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?


38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.
I believe the win32all extension contains functionality to help with
that, thought it's a long time since I even ran PythonWin (sorry, Mark).

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Oct 5 '06 #5
On 5 Oct 2006 10:25:37 -0700, Matimus <mc******@gmail.comwrote:
the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?

38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.
>>excel_date = 38938.0
python_date = datetime.date(1900, 1, 1) +
datetime.timedelta(days=excel_date)
>>python_date
datetime.date(2006, 8, 11)

--
Cheers,
Simon B
si***@brunningonline.net
http://www.brunningonline.net/simon/blog/
Oct 5 '06 #6
On 10/5/06, Simon Brunning <si***@brunningonline.netwrote:
On 5 Oct 2006 10:25:37 -0700, Matimus <mc******@gmail.comwrote:
the date( 8/9/2006 ) in Excel file, i am getting the value as 38938.0,
which I get when I convert date values to general format in Excel. I
want the actual date value. How do get that?
38938 appears to be the date in days since 1/1/1900. I'm sure someone
can help you figure out how to convert that to a more useful value.
>excel_date = 38938.0
python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
python_date
datetime.date(2006, 8, 11)
Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?

--
Cheers,
Simon B
si***@brunningonline.net
http://www.brunningonline.net/simon/blog/
Oct 5 '06 #7
>>excel_date = 38938.0
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>>python_date
datetime.date(2006, 8, 11)

Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
python_date
datetime.date(2006, 8, 9)

HTH.

Oct 5 '06 #8
ho**********@yahoo.com wrote:
>excel_date = 38938.0
>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>python_date
datetime.date(2006, 8, 11)
Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?

Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
python_date
datetime.date(2006, 8, 9)
.... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.

All the OP needs to do is to read the documentation that comes with the
xlrd package. It describes the problems with Excel dates, and offers
functions for conversion between the Excel date numbers and (year,
month, day, hour, minute, second) tuples which of course are
interoperable with Python's datetime module and with mx.DateTime.

| >>import xlrd
| >>xlrd.xldate_as_tuple(38938.0, 0)
| (2006, 8, 9, 0, 0, 0)
| >>xlrd.xldate_as_tuple(38938.0, 1)
| (2010, 8, 10, 0, 0, 0)
| >>>

Cheers,
John

Oct 5 '06 #9
On 5 Oct 2006 12:49:53 -0700, ho**********@yahoo.com
<ho**********@yahoo.comwrote:
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm
Ah - I was reading the OP's 8/9/2006 in the European way - DD/MM/YYYY.

One might argue over whether DD/MM/YYYY or MM/DD/YYYY are more
rational, but I find it best to avoid *both* those formats, 'cos they
are so easily confused.

--
Cheers,
Simon B
si***@brunningonline.net
http://www.brunningonline.net/simon/blog/
Oct 6 '06 #10
John Machin wrote:
ho**********@yahoo.com wrote:
>>excel_date = 38938.0
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>>python_date
datetime.date(2006, 8, 11)
>
Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
>>python_date
datetime.date(2006, 8, 9)

... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.
John,
Just for me own curiosity, is this Excel concept of date numbers same
as the OLE
concept (see http://msdn2.microsoft.com/en-us/library/82ab7w69.aspx or
search "MFC DATE" on MSDN).

I put in some test cases for conversion code here:
http://aspn.activestate.com/ASPN/Coo.../Recipe/496683

But would be interested to add any additional info on variations on
this theme.

Cheers,
Giles

Oct 6 '06 #11
John Machin wrote:
ho**********@yahoo.com wrote:
>>excel_date = 38938.0
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>>python_date
datetime.date(2006, 8, 11)
>
Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm

Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
>>python_date
datetime.date(2006, 8, 9)

... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.

All the OP needs to do is to read the documentation that comes with the
xlrd package. It describes the problems with Excel dates, and offers
functions for conversion between the Excel date numbers and (year,
month, day, hour, minute, second) tuples which of course are
interoperable with Python's datetime module and with mx.DateTime.

| >>import xlrd
| >>xlrd.xldate_as_tuple(38938.0, 0)
| (2006, 8, 9, 0, 0, 0)
| >>xlrd.xldate_as_tuple(38938.0, 1)
| (2010, 8, 10, 0, 0, 0)
| >>>

Cheers,
John


Hi,
>>import xlrd
book = xlrd.open_workbook("testbook1.xls")
sh = book.sheet_by_index(0)
sh.cell_value(rowx=1,colx=0)
38938.0
>>type(sh.cell_value(rowx=1,colx=0))
<type 'unicode'>
>>xlrd.xldate_as_tuple( sh.cell_value( rowx = 1,colx= 0 ), 0 )
Traceback (most recent call last):
File "D:\Python23\Testing area\Python and Excel\xlrdRead.py", line
30, in ?
temp=xlrd.xldate_as_tuple(sh.cell_value(rowx=r,col x=c),0)
File "D:\PYTHON23\Lib\site-packages\xlrd\xldate.py", line 61, in
xldate_as_tuple
xldays = int(xldate)
ValueError: invalid literal for int(): Date

because xlrd.xldate_as_tuple() function expects first argument to be an
integer. How do I convert an unicode character to integer, so that I
could get the date using xlrd.xldate_as_tuple() function.

Thank you,
sudhir.

Oct 6 '06 #12

Giles Brown wrote:
John Machin wrote:
ho**********@yahoo.com wrote:
>excel_date = 38938.0
>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date)
>python_date
datetime.date(2006, 8, 11)

Err, that's the wrong answer, isn't it? Perhaps it shoud be
datetime.date(1900, 1, 29)?
>
Actually was about to post same solution and got same results. (BTW
Simon, the OP date is Aug 9th, 2006). Scratched head and googled for
excel date calculations... found this bug where it treats 1900 as leap
year incorrectly:
http://www.ozgrid.com/Excel/ExcelDateandTimes.htm
>
Plus it treats 1 jan 1900 as day 1, not 0 so just subtract 2 in the
calc:
>>python_date = datetime.date(1900, 1, 1) + datetime.timedelta(days=excel_date - 2)
>python_date
datetime.date(2006, 8, 9)
>
... and 2006-08-09 is only the correct answer if the spreadsheet was,
when last saved, using the 1900 ("Windows") date system, not the 1904
("Macintosh") date system.

John,
Just for me own curiosity, is this Excel concept of date numbers same
as the OLE
concept (see http://msdn2.microsoft.com/en-us/library/82ab7w69.aspx or
search "MFC DATE" on MSDN).

I put in some test cases for conversion code here:
http://aspn.activestate.com/ASPN/Coo.../Recipe/496683

But would be interested to add any additional info on variations on
this theme.
Hi Giles,

Those OLE date numbers coincide with Excel 1900-system date numbers
from 1900-03-01 onwards. Excel treats day 60 as the non-existent
1900-02-29.

Cheers,
John

Oct 6 '06 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: dmateer | last post by:
I have the following code in an asp.net page: System.Data.OleDb.OleDbConnection cn = new System.Data.OleDb.OleDbConnection(); cn.ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data...
0
by: Ashish Tiwari | last post by:
Can i read excel file into asp.net 2.0
1
by: sunnyK | last post by:
hi all !!! i m new here ,,, i m suffering with excel file. i want help regarding how to open excel file in java as file & read row & column wise ..... not database (e.g. using jdbc odbc...
0
by: student491 | last post by:
hello, I opened the excel file by using workbook.open() method... Now i want to read the data from the excel file.. & save it into array. help out .. how to do this. Then i want to pass...
0
by: minhtran | last post by:
Hi Everyone Lately, I found the code as to read all file from Excel file as ".xls" from Excel before 2007, but does not work for ".xlsx" Excel 2007, Please, any help for me (new hired database...
1
by: plusplus | last post by:
Hi, thanks for taking up your precious time reading my question (: I am using 'Mircrosoft Visual Studio.NET'. The system have to read in an excel file, then convert to XML file, then lastly...
0
by: paintballer4lfe | last post by:
Hello, i am trying to create a code or get something that lets me read an excel file and take whats in the excel file and then put all of the cells into a list box, or import certain cells into a...
0
by: Manesh Pawar | last post by:
Hello All, I need to read a Excel file to Import that data into the database. I can read it When Excel data Is In Tabular Form, but here, Excel data is in Random cells Format. ...
2
by: Ehsan arman | last post by:
I was wondering how I could read excel file names in a particular folder using Access VBA. I need to compare the file names in that folder with what I have in a field of Access table.
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.