Pre-defining an action to take when an expected error occurs

Tempo

Hello. I am getting the error that is displayed below, and I know
exactly why it occurs. I posted some of my program's code below, and if
you look at it you will see that the error terminates the program
prematurely. Because of this premature termination, the program is
not able to execute its final line of code, which is a very important
line. The last line saves the Excel spreadsheet. So is there a way to
make sure the last line executes? Thanks in advance for all of the
help. Thank you.

Error
####

IndexError: list index out of range

Code Sample
###########

for rx in range(sh.nrows):
    rx = rx +1
    u = sh.cell_value(rx, 0)
    u = str(u)
    if u != end:
        page = urllib2.urlopen(u)
        soup = BeautifulSoup(page)
        p = soup.findAll('span', "sale")
        p = str(p)
        p2 = re.findall('\$\d+\.\d\d', p)
        for row in p2:
            ws.write(r,0,row)

w.save('price_list.xls')

Sep 15 '06 #1

Steven D'Aprano

On Thu, 14 Sep 2006 19:40:35 -0700, Tempo wrote:

Hello. I am getting the error that is displayed below, and I know
exactly why it occurs. I posted some of my program's code below, and if
you look at it you will see that the error terminates the program
prematurely. Because of this premature termination, the program is
not able to execute its final line of code, which is a very important
line. The last line saves the Excel spreadsheet. So is there a way to
make sure the last line executes?

Two methods:

(1) Fix the bug so the program no longer terminates early. You are getting
an IndexError "list index out of range", so fix the program so it no
longer tries to access beyond the end of the list.

I'm guessing that your error is right at the beginning of the loop. You
say:

    for rx in range(sh.nrows):
        rx = rx +1

Why are you adding one to the loop variable? That's equivalent to:

    for rx in range(1, sh.nrows + 1):

which probably means it skips row 0 and tries to access one row past the
end of sh. If all you want to do is skip row 0, do this instead:

    for rx in range(1, sh.nrows):

(2) Stick a band-aid over the error with a try...except block, and hope
you aren't covering up other errors as well. While we're at it, let's
refactor the code a little bit...

# untested
def write_row(rx, sh, end, ws):
    u = str(sh.cell_value(rx, 0))
    if u != end:
        soup = BeautifulSoup(urllib2.urlopen(u))
        p = str(soup.findAll('span', "sale"))
        for row in re.findall('\$\d+\.\d\d', p):
            ws.write(r,0,row) # what's r? did you mean rx?

Now call this:

for rx in range(sh.nrows):
    rx += 1 # but see my comments above...
    try:
        write_row(rx, sh, end, ws)
    except IndexError:
        pass
w.save('price_list.xls')
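
To make point (1) concrete, here is a minimal sketch in Python 3 syntax. The `rows` list and the `all_indices_valid` helper are hypothetical stand-ins for the sheet, so nothing here touches xlrd or the original spreadsheet:

```python
# Hypothetical stand-in for a sheet with a header row and three data rows.
rows = ["header", "url-1", "url-2", "url-3"]
nrows = len(rows)

# Adding 1 inside the loop visits the same indices as range(1, nrows + 1).
shifted = [rx + 1 for rx in range(nrows)]
assert shifted == list(range(1, nrows + 1))


def all_indices_valid(indices):
    """Return True if every index in `indices` can be used on `rows`."""
    try:
        for rx in indices:
            rows[rx]
    except IndexError:
        return False
    return True


print(all_indices_valid(shifted))           # False: the shifted loop runs off the end
print(all_indices_valid(range(1, nrows)))   # True: skips row 0 and stays in bounds
```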

--
Steven D'Aprano

Sep 15 '06 #2

Gabriel Genellina

At Thursday 14/9/2006 23:40, Tempo wrote:

>Hello. I am getting the error that is displayed below, and I know
exactly why it occurs. I posted some of my program's code below, and if
you look at it you will see that the error terminates the program
prematurely. Because of this premature termination, the program is
not able to execute its final line of code, which is a very important
line. The last line saves the Excel spreadsheet. So is there a way to
make sure the last line executes? Thanks in advance for all of the
help. Thank you.

You want a try/finally block.
Read the Python Tutorial: http://docs.python.org/tut/node10.html
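
A minimal sketch of that idea, in Python 3 syntax. `FakeWorkbook` and `process_rows` are hypothetical stand-ins for the poster's workbook object and loop; the only point is that the `finally` clause runs whether or not the loop raises:

```python
class FakeWorkbook:
    """Hypothetical stand-in for the workbook; it only records that save() ran."""

    def __init__(self):
        self.saved_as = None

    def save(self, filename):
        self.saved_as = filename


def process_rows(values, workbook):
    """Read one index past the end of `values`, then save no matter what."""
    try:
        for rx in range(len(values)):
            values[rx + 1]                  # deliberately reproduces the IndexError
    finally:
        workbook.save("price_list.xls")     # runs even though the loop failed


wb = FakeWorkbook()
try:
    process_rows(["a", "b", "c"], wb)
except IndexError as exc:
    print("still raised:", exc)
print("saved as:", wb.saved_as)
```

Note that `finally` does not suppress the error: the exception still propagates after the save, so the underlying bug stays visible.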

Gabriel Genellina
Softlab SRL


Sep 15 '06 #3

John Machin

Tempo wrote:

Hello. I am getting the error that is displayed below, and I know
exactly why it occurs. I posted some of my program's code below, and if
you look at it you will see that the error terminates the program
prematurely. Because of this premature termination, the program is
not able to execute its final line of code, which is a very important
line. The last line saves the Excel spreadsheet. So is there a way to
make sure the last line executes? Thanks in advance for all of the
help. Thank you.

Error
####

IndexError: list index out of range

[snip]

Hi, Tempo, nice to see xlrd getting some usage :-)

Are you sure that you *really* want to save a spreadsheet written by
your buggy program??? It is better to fix errors, rather than ignore
them.

However, as you asked how: To ensure that cleanup code is executed no
matter what happens, use try/except/finally as in the following
example.

HTH,
John

C:\junk>type tempo.py
import sys

def main(n):
    return (10 ** n) / n

try:
    try:
        print "executing the application ..."
        n = int(sys.argv[1])
        if n < 0:
            # simulate something nasty happening ...
            import nosuchmodule
        else:
            x = main(n)
            print "app completed normally: %r" % x
    except KeyboardInterrupt:
        # need this to escape when n is large
        print "kbd interrupt ...."
        raise
    except ZeroDivisionError:
        print "doh!"
        raise
finally:
    # code to be executed no matter what happens
    print "finally ... cleaning up"

C:\junk>tempo.py 0
executing the application ...
doh!
finally ... cleaning up
Traceback (most recent call last):
  File "C:\junk\tempo.py", line 14, in ?
    x = main(n)
  File "C:\junk\tempo.py", line 4, in main
    return (10 ** n) / n
ZeroDivisionError: integer division or modulo by zero

C:\junk>tempo.py -1
executing the application ...
finally ... cleaning up
Traceback (most recent call last):
  File "C:\junk\tempo.py", line 12, in ?
    import nosuchmodule
ImportError: No module named nosuchmodule

C:\junk>tempo.py 3
executing the application ...
app completed normally: 333
finally ... cleaning up

C:\junk>tempo.py 10000000000
executing the application ...
kbd interrupt ....
finally ... cleaning up
Traceback (most recent call last):
  File "C:\junk\tempo.py", line 14, in ?
    x = main(n)
  File "C:\junk\tempo.py", line 4, in main
    return (10 ** n) / n
KeyboardInterrupt

C:\junk>

Sep 15 '06 #4

Tempo

Thanks for all of the help. It all has been very useful to a new
Python programmer. I agree that I should fix the error/bug instead of
handling it with a try/etc. However, I do not know why
"range(sh.nrows)" never gets the right amount of rows right. For
example, if the Excel sheet has 10 rows with data in them, the
statement "range(sh.nrows)" should build the list of numbers [0,
1,...9]. It should, but it doesn't do that. What it does is build a list
from [0, 1...20] or more or a little less, but the point is that it
always grabs empty rows after the last row containing data. Why is that?
I have no idea why, but I do know that that is what is producing the
error I am getting. Thanks again for the responses that I have received
already, and again thanks for any further help. Thank you.

Sep 15 '06 #5

Steve Lianoglou

if the Excel sheet has 10 rows with data in them, the
statement "range(sh.nrows)" should build the list of numbers [0,
1,...9]. It should, but it doesn't do that. What it does is build a list
from [0, 1...20] or more or a little less, but the point is that it
always grabs empty rows after the last row containing data. Why is that?

Just a stab in the dark, but maybe there's some rogue whitespace in
some cell that's in a row you think is empty?

You could try to just select out a (small) region of your data, copy it
and paste it into a new spreadsheet to see if you're still getting the
problem.

-steve

Sep 15 '06 #6

John Machin

Tempo wrote:

Thanks for all of the help. It all has been very useful to a new
Python programmer. I agree that I should fix the error/bug instead of
handling it with a try/etc. However, I do not know why
"range(sh.nrows)" never gets the right amount of rows right. For
example, if the Excel sheet has 10 rows with data in them, the
statement "range(sh.nrows)" should build the list of numbers [0,
1,...9]. It should, but it doesn't do that. What it does is build a list
from [0, 1...20] or more or a little less, but the point is that it
always grabs empty rows after the last row containing data. Why is that?
I have no idea why, but I do know that that is what is producing the
error I am getting. Thanks again for the responses that I have received
already, and again thanks for any further help. Thank you.

So the xlrd package's Book.Sheet.nrows allegedly "never gets the right
amount of rows right"? Never?? Before making such rash statements in a
public forum [1], you might like to check exactly what you have in your
file. Here's how:

(1) Using OpenOffice.org Calc or Gnumeric (or Excel if you must), open
yourfile.xls and save it as yourfile.csv. Inspect yourfile.csv

(2) Use the runxlrd script that's supplied with xlrd:

runxlrd.py show yourfile.xls >yourfile_show.txt

Inspect yourfile_show.txt. You'll see things like:

    cell A23: type=1, data: u'ascii'
    cell B23: type=0, data: ''
    cell C23: type=1, data: u'123456'
    cell D23: type=0, data: ''
    cell E23: type=4, data: 0

The cell-type numbers are in the docs, but briefly: 0 is empty cell, 1
is text, 2 is number, 3 is date, 4 is boolean, 5 is error. If you find
only type=0 in the last row, then indeed you have found a bug and
should report it to the package author (together with a file that
exhibits the problem).

You are likely to find that there are cells containing zero-length
strings, or strings that contain spaces. They *do* contain data, as
opposed to empty cells.
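
As a rough illustration of that distinction, the sketch below (Python 3) labels cells given as `(ctype, value)` pairs. The pairs, the `CELL_TYPE_NAMES` table, and `describe_cell` are hypothetical stand-ins built only from the type numbers listed above; the code does not call xlrd itself:

```python
# Cell-type numbers as listed in the post above.
CELL_TYPE_NAMES = {
    0: "empty", 1: "text", 2: "number", 3: "date", 4: "boolean", 5: "error",
}


def describe_cell(ctype, value):
    """Return a label that separates truly empty cells from blank-looking text."""
    name = CELL_TYPE_NAMES.get(ctype, "unknown")
    if name == "text" and value == "":
        return "text (zero-length string)"
    if name == "text" and value.isspace():
        return "text (whitespace only)"
    return name


# A made-up "last row" that looks empty on screen but is not all type 0.
last_row = [(1, ""), (0, ""), (1, "   "), (0, "")]
for ctype, value in last_row:
    print(describe_cell(ctype, value))
```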

[1] There's a possibility that the package's author reads this
newsgroup, and I've heard tell that he's a cranky old so-and-so; you
wouldn't want him to take umbrage, would you?

HTH,
John

Sep 15 '06 #7

Tempo

John Machin, thanks for all of your help, and I take responsibility for
the way I worded my sentences in my last reply to this topic. So in an
effort to say sorry, I want to make it clear to everybody that it seems
as though errors in my code and use of external programs (Excel in
particular) are making "range(sh.nrows)" have faulty results. I am
trying to pinpoint the spot in my code or use of Excel, before
"range(sh.nrows)" is executed, that is bugged. John Machin, I am
thrilled that the package xlrd exists at all because it simplifies a
daunting task for a beginner programmer--me. Its uses are not bound to
beginners either. So thanks for the package and your help to this point.

Sep 15 '06 #8

Steve Holden

John Machin wrote:
[...]

[1] There's a possibility that the package's author reads this
newsgroup, and I've heard tell that he's a cranky old so-and-so; you
wouldn't want him to take umbrage, would you?

Cranky doesn't even *begin* to describe it ...

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Sep 15 '06 #9

John Machin

On 16/09/2006 2:55 AM, Tempo wrote:

John Machin, thanks for all of your help, and I take responsibility for
the way I worded my sentences in my last reply to this topic. So in an
effort to say sorry, I want to make it clear to everybody that it seems
as though errors in my code and use of external programs (Excel in
particular) are making "range(sh.nrows)" have faulty results. I am
trying to pinpoint the spot in my code or use of Excel, before
"range(sh.nrows)" is executed, that is bugged. John Machin, I am
thrilled that the package xlrd exists at all because it simplifies a
daunting task for a beginner programmer--me. Its uses are not bound to
beginners either. So thanks for the package and your help to this point.

I'm sorry, too: I should have wrapped my post in <humour> ... </humour>
tags :-)

Of course it's up to you to decide the criteria for filtering out
accidental non-data from your spreadsheet. Note that this phenomenon is
not restricted to spreadsheets; one often sees text data files with
blank or empty lines on the end -- one's app just has to cope with that.

Here's an example of a function that will classify a bunch of cells for you:

import xlrd

def usefulness_of_cells(cells):
    """Score each cell:
    as 0 if empty,
    as 1 if zero-length text,
    as 2 if text and value.isspace() is true,
    otherwise as 3.
    Return the highest score found.
    """
    score = 0
    for cell in cells:
        if cell.ctype == xlrd.XL_CELL_EMPTY:
            continue
        if cell.ctype == xlrd.XL_CELL_TEXT:
            if not cell.value:
                if not score:
                    score = 1
                continue
            if cell.value.isspace():
                score = 2
                continue
        return 3
    return score

and here's an example of using it:

def number_of_good_rows(sheet):
    """Return 1 + the index of the last row with meaningful data in it."""
    for rowx in xrange(sheet.nrows - 1, -1, -1):
        score = usefulness_of_cells(sheet.row(rowx))
        if score == 3:
            return rowx+1
    return 0
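
For readers on Python 3, here is a self-contained sketch of the same two functions run against fake data. `Cell`, `FakeSheet`, and the three constants are stand-ins written for this example (the constants use the type numbers given earlier in the thread: 0 for empty, 1 for text, 2 for number), so xlrd does not need to be installed:

```python
from collections import namedtuple

XL_CELL_EMPTY, XL_CELL_TEXT, XL_CELL_NUMBER = 0, 1, 2   # stand-ins for xlrd's constants
Cell = namedtuple("Cell", "ctype value")                  # stand-in for an xlrd cell


class FakeSheet:
    """Stand-in exposing only the two sheet members used here: nrows and row()."""

    def __init__(self, rows):
        self._rows = rows
        self.nrows = len(rows)

    def row(self, rowx):
        return self._rows[rowx]


def usefulness_of_cells(cells):
    """Return 0, 1, 2 or 3: the highest usefulness score among `cells`."""
    score = 0
    for cell in cells:
        if cell.ctype == XL_CELL_EMPTY:
            continue
        if cell.ctype == XL_CELL_TEXT:
            if not cell.value:
                if not score:
                    score = 1
                continue
            if cell.value.isspace():
                score = 2
                continue
        return 3
    return score


def number_of_good_rows(sheet):
    """Return 1 + the index of the last row with meaningful data in it."""
    for rowx in range(sheet.nrows - 1, -1, -1):
        if usefulness_of_cells(sheet.row(rowx)) == 3:
            return rowx + 1
    return 0


sheet = FakeSheet([
    [Cell(XL_CELL_TEXT, "http://example.com/a"), Cell(XL_CELL_NUMBER, 9.99)],
    [Cell(XL_CELL_TEXT, "http://example.com/b"), Cell(XL_CELL_EMPTY, "")],
    [Cell(XL_CELL_TEXT, "  "), Cell(XL_CELL_EMPTY, "")],      # whitespace only
    [Cell(XL_CELL_TEXT, ""), Cell(XL_CELL_TEXT, "\xa0")],     # zero-length + NBSP
])
print(sheet.nrows, number_of_good_rows(sheet))   # 4 rows reported, 2 of them useful
```

Looping over `range(number_of_good_rows(sheet))` instead of `range(sheet.nrows)` would then skip the trailing blank-looking rows.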

A note on using the isspace() method: ensure that you use it on
cell.value (which is Unicode), not on an 8-bit encoding (especially if
your locale is set to the default ("C")).

| >>> '\xA0'.isspace()
| False
| >>> u'\xA0'.isspace()
| True
| >>> import unicodedata as ucd
| >>> ucd.name(u'\xA0')
| 'NO-BREAK SPACE'

You can get these in spreadsheets when folk paste in stuff off a web
page that uses &nbsp; as padding (because HTML trims out
leading/trailing/multiple instances of SPACE). Puzzled the heck out of
me the first time I encountered it until I did:

    print repr(data_that_the_users_were_shrieking_about)

Here's a tip: repr() in Python and "View Page Source" in Firefox come
in very handy when you have "what you see is not what you've got" problems.
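
The interpreter session above is Python 2. As a hedged Python 3 sketch of the same check (the `price` value is made up for illustration): text strings are already Unicode there, so the pitfall moves to `bytes`, and `repr()` still exposes the hidden character:

```python
import unicodedata

price = "$10.99\xa0"          # made-up value with a trailing no-break space

print(price)                  # looks like an ordinary price on screen
print(repr(price))            # the \xa0 escape is visible in the repr
print(unicodedata.name(price[-1]))

# str.isspace() understands Unicode whitespace; bytes.isspace() is ASCII-only.
print("\xa0".isspace(), b"\xa0".isspace())

cleaned = price.strip()       # strip() with no argument removes the NBSP too
print(repr(cleaned))
```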

Anyway, I'll add something like the above functions in an examples
directory in the next release of xlrd (which is at alpha stage right
now). I'll also add in a Q&A section in the docs, starting with "Why
does xlrd report more rows than I see on the screen?" -- so do let us
know what you find down the end of your spreadsheet, in case it's a
strange beast that hasn't been seen before.

HTH,
John

Sep 17 '06 #10

Tempo

It worked. Those two functions (usefulness_of_cells &
number_of_good_rows) seem to work flawlessly...knock on wood. I have
run a number of different Excel spreadsheets through the functions, and
so far the functions have a 100% accuracy rating. The accuracy rating is
based on the functions' returned number of cells containing text,
excluding a space as text, against the actual, hand-counted number of
cells with text. Thank you John Machin for all of your help. I am using
these two functions, with your name tagged to them both. Let me know if
that's a problem. Thank you again.

Sep 20 '06 #11

John Machin

Tempo wrote:

It worked. Those two functions (usefulness_of_cells &
number_of_good_rows) seem to work flawlessly...knock on wood. I have
run a number of different Excel spreadsheets through the functions, and
so far the functions have a 100% accuracy rating. The accuracy rating is
based on the functions' returned number of cells containing text,
excluding a space as text, against the actual, hand-counted number of
cells with text.

So your worksheet(s) did have rows at the end with cells with spaces in
them?

Thank you John Machin for all of your help. I am using
these two functions, with your name tagged to them both. Let me know if
that's a problem. Thank you again.

Not a problem; like I said, I'll put those functions in the next
release as examples.

Cheers,
John

Sep 20 '06 #12
