By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
445,758 Members | 1,225 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 445,758 IT Pros & Developers. It's quick & easy.

newbie - merging xls files using xldt and xlwt

P: n/a
Hi,

I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, the program works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4, xlwt version 0.7.0, xlrd version 0.5.2, Win NT.

Any ideas?

Thanks in advance!

Albert-Jan
"""
Merge all xls files in a given directory into one multisheet xls file.
The sheets get the orginal file name, without the extension.
File names should not exceed 31 characters, as this is the maximum
sheet name length
"""

import xlrd, xlwt
import glob, os.path

def merge_xls (in_dir, out_file="d:/merged_output.xls"):

xls_files = glob.glob(in_dir + "*.xls")
sheet_names = []
merged_book = xlwt.Workbook()

[sheet_names.append(os.path.basename(v)[:-4]) for k, v in enumerate(xls_files)]
for k, xls_file in enumerate(xls_files):
if len (sheet_names[k]) <= 31:
book = xlrd.open_workbook(xls_file)
ws = merged_book.add_sheet(sheet_names[k])
for sheetx in range(book.nsheets):
sheet = book.sheet_by_index(sheetx)
for rx in range(sheet.nrows):
for cx in range(sheet.ncols):
ws.write(rx, cx, sheet.cell_value(rx, cx))
else:
print "File name too long: <%s.xls(maximum is 31 chars) " % (sheet_names[k])
print "File <%s.xlsis *not* included in the merged xls file." % (sheet_names[k])
merged_book.save(out_file)

print "---Merged xls file written to %s using the following source files: " % (out_file)
for k, v in enumerate(sheet_names):
if len(v) <= 31: print "\t", str(k+1).zfill(3), "%s.xls" % (v)

merge_xls(in_dir="d:/temp/")

*** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero ***
put_cell 0 1

Traceback (most recent call last):
File "G:\generic_syntaxes\merge_xls.py", line 37, in -toplevel-
merge_xls(in_dir="d:/temp/")
File "G:\generic_syntaxes\merge_xls.py", line 21, in merge_xls
book = xlrd.open_workbook(xls_file)
File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 139, in open_workbook
bk.get_sheets()
File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 389, in get_sheets
sht = self.get_sheet(sheetno)
File "C:\Python24\lib\site-packages\xlrd\__init__.py", line 379, in get_sheet
sh.read(self)
File "C:\Python24\lib\site-packages\xlrd\sheet.py", line 285, in read
self.put_cell(rowx, colx, XL_CELL_TEXT, bk._sharedstrings[index])
File "C:\Python24\lib\site-packages\xlrd\sheet.py", line 214, in put_cell
self._cell_types[rowx][colx] = ctype
IndexError: list assignment index out of range

Oct 15 '08 #1
Share this Question
Share on Google+
5 Replies


P: n/a
On Oct 15, 9:16*pm, Albert-jan Roskam <fo...@yahoo.comwrote:
Hi,

I wrote the program below to merge all xls files in a given directory into one multisheet xls file. It uses xlwt and xlrd. The xls files I use for input are generated by Spss. When I open and re-save the files in Excel, theprogram works, but when I use the xls files as they were created by Spss, the program returns an error message (see below). Some details: Python 2.4,xlwt version 0.7.0, xlrd version 0.5.2, Win NT.

Any ideas?
The version of xlrd that you are using is an antique. Go to
http://pypi.python.org/pypi/xlrd and get the latest version. If that
doesn't work, send me a copy of one of the files that is displaying
this problem.
>
Thanks in advance!

Albert-Jan

"""
Merge all xls files in a given directory into one multisheet xls file.
The sheets get the orginal file name, without the extension.
File names should not exceed 31 characters, as this is the maximum
sheet name length
"""

import xlrd, xlwt
import glob, os.path

def merge_xls (in_dir, out_file="d:/merged_output.xls"):

* * xls_files * = glob.glob(in_dir + "*.xls")
* * sheet_names = []
* * merged_book = xlwt.Workbook()

* * [sheet_names.append(os.path.basename(v)[:-4]) for k, v in enumerate(xls_files)]
Wah! Try this:
sheet_names = [os.path.basename(v)[:-4]) for v in xls_files]

* * for k, xls_file in enumerate(xls_files):
* * * * if len (sheet_names[k]) <= 31:
* * * * * * book = xlrd.open_workbook(xls_file)
* * * * * * ws = merged_book.add_sheet(sheet_names[k])
* * * * * * for sheetx in range(book.nsheets):
* * * * * * * * sheet = book.sheet_by_index(sheetx)
* * * * * * * * for rx in range(sheet.nrows):
* * * * * * * * * * for cx in range(sheet.ncols):
* * * * * * * * * * * * *ws.write(rx, cx, sheet..cell_value(rx, cx))
I presume that you're not too worried about any date data.

If an input file has more than 1 sheet, you are creating only one
sheet in the output file, and overwriting cells.
* * * * else:
[snip]

merge_xls(in_dir="d:/temp/")

*** WARNING: OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero***
put_cell 0 1

Traceback (most recent call last):
[snip]
* * self._cell_types[rowx][colx] = ctype
IndexError: list assignment index out of range
Yeah, symptom of an xls file with a DIMENSIONS records that lies ...
xlrd has like Excel become more resilient over time :-)

BTW, consider joining the python-excel group at http://groups.google.com/group/python-excel

Cheers,
John
Oct 15 '08 #2

P: n/a
Hello all,

I was wondering if it would be possible to make a script to grab my
balance account balance a few times a day without having to login every
time. I know I can use the urlib2 library, but not sure how to go about
filling in the forms and submitting them. BOA has a mobile site that is
pretty simple. Anyone else use Bank of America and would be interested in
this. This is not for anything illegal, just for me to prevent overdrafting
my account

https://sitekey.bankofamerica.com/sa...ileDevice=true

y =
urllib.urlopen('https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobl
eDevice=true',urllib.urlencode({'onlineID':'MYONLL INEID'})).readlines()

Oct 16 '08 #3

P: n/a
You'd probably have to use something like mechanize
(http://wwwsearch.sourceforge.net/mechanize/) to fill out the forms,
but if BofA's website uses Javascript at all, you're probably out of
luck.

Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com

On Wed, Oct 15, 2008 at 8:09 AM, Support Desk
<su**************@gmail.comwrote:
Hello all,

I was wondering if it would be possible to make a script to grab my
balance account balance a few times a day without having to login every
time. I know I can use the urlib2 library, but not sure how to go about
filling in the forms and submitting them. BOA has a mobile site that is
pretty simple. Anyone else use Bank of America and would be interested in
this. This is not for anything illegal, just for me to prevent overdrafting
my account

https://sitekey.bankofamerica.com/sa...ileDevice=true

y =
urllib.urlopen('https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobl
eDevice=true',urllib.urlencode({'onlineID':'MYONLL INEID'})).readlines()

--
http://mail.python.org/mailman/listinfo/python-list
Oct 16 '08 #4

P: n/a
I was also looking at the ClientForm Library
http://wwwsearch.sourceforge.net/ClientForm/

which can get me past the first username form, but I noticed it then goes
to a challenge question form and im not sure how to take the resulting for
and resubmit it with new information and then resubmit the resulting form
with the password

-----Original Message-----
From: cv******@gmail.com [mailto:cv******@gmail.com] On Behalf Of Chris
Rebert
Sent: Thursday, October 16, 2008 10:15 AM
To: Support Desk
Cc: py*********@python.org
Subject: Re: account balance checker

You'd probably have to use something like mechanize
(http://wwwsearch.sourceforge.net/mechanize/) to fill out the forms,
but if BofA's website uses Javascript at all, you're probably out of
luck.

Cheers,
Chris
--
Follow the path of the Iguana...
http://rebertia.com

On Wed, Oct 15, 2008 at 8:09 AM, Support Desk
<su**************@gmail.comwrote:
Hello all,

I was wondering if it would be possible to make a script to grab my
balance account balance a few times a day without having to login every
time. I know I can use the urlib2 library, but not sure how to go about
filling in the forms and submitting them. BOA has a mobile site that is
pretty simple. Anyone else use Bank of America and would be interested in
this. This is not for anything illegal, just for me to prevent
overdrafting
my account

https://sitekey.bankofamerica.com/sa...ileDevice=true

y =
urllib.urlopen('https://sitekey.bankofamerica.com/sas/signonScreen.do?isMobl
eDevice=true',urllib.urlencode({'onlineID':'MYONLL INEID'})).readlines()

--
http://mail.python.org/mailman/listinfo/python-list
Oct 16 '08 #5

P: n/a
Hello all,

I've been tearing my hair out trying to get pylon installed most of the day,
and it seems that both setup tools and paster.exe have some serious issues
with 64bit on windows.

Unfortunately I'm stuck with 2.6 64bit.

I think I've got it nearly all up and running, the biggest problem is that
paster fails to find the python executable when run, as it is 32bit and the
python executable is 64 [afaik]. Does anyone either have a 64bit compile, or
know where the source code for paster.exe lives so I can attempt to compile
it myself. It doesn't appear to be in the source for PasterScript package.
Somehow it 'appears' when setup .py install is run, but I cannot find any c
code anywhere.

Many thanks in advance

Jules

Oct 23 '08 #6

This discussion thread is closed

Replies have been disabled for this discussion.