enumerate overflow

Hello all,

in python2.4, i read lines from a file with

for lineNum, line in enumerate(f): ...

However, lineNum soon overflows and starts counting backwards. How do
i force enumerate to return long integer?

Cheers.

Oct 3 '07 #1
cr**@post.cz wrote:
> Hello all,
>
> in python2.4, i read lines from a file with
>
> for lineNum, line in enumerate(f): ...
>
> However, lineNum soon overflows and starts counting backwards. How do
> i force enumerate to return long integer?

Most probably you can't, because it is a function written in C, I presume.

But as Python 2.4 has generators, it's easy to create an enumerate yourself:

    def lenumerate(f):
        i = 0
        for line in f:
            yield i, line
            i += 1

Diez
Oct 3 '07 #2
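
For readers on a current interpreter, Diez's generator can be sketched as below. This assumes Python 3, where integers have arbitrary precision; the `start` parameter and the sample data are additions for illustration and are not part of the original post.

```python
def lenumerate(iterable, start=0):
    """Yield (index, item) pairs, keeping the index as an ordinary Python integer."""
    index = start
    for item in iterable:
        yield index, item
        index += 1


# Hypothetical sample data standing in for the lines of a file.
pairs = list(lenumerate(["a", "b", "c"]))

# Starting just below 2**31 shows the index crossing the 32-bit boundary.
big = list(lenumerate(["x", "y"], start=2**31 - 1))
print(pairs)
print(big)
```

Because the counter lives in Python code rather than in a C struct field, it is never limited by the width of a C long.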
cr**@post.cz wrote:
> Hello all,
>
> in python2.4, i read lines from a file with
>
> for lineNum, line in enumerate(f): ...
>
> However, lineNum soon overflows and starts counting backwards. How do
> i force enumerate to return long integer?

Just how "soon" exactly do you read sys.maxint lines from a file? I
should have thought that it would take a significant amount of time to
read 2,147,483,647 lines ...

But it is true that Python 2.5 uses an enumobject representation that
limits the index to a (C) long:

    typedef struct {
        PyObject_HEAD
        long en_index;        /* current index of enumeration */
        PyObject* en_sit;     /* secondary iterator of enumeration */
        PyObject* en_result;  /* result tuple */
    } enumobject;

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline

Oct 3 '07 #3
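
The "counting backwards" symptom is what a fixed-width signed counter looks like once it passes its maximum. The sketch below simulates that in Python, assuming a 32-bit `long` that wraps in two's-complement fashion. C leaves signed overflow undefined, so this illustrates the behavior the original poster observed and is not a guarantee; `wrap_int32` is a hypothetical helper, not anything from the CPython source.

```python
def wrap_int32(value):
    """Reduce an integer to the range of a signed 32-bit two's-complement counter."""
    value &= 0xFFFFFFFF                      # keep only the low 32 bits
    if value >= 0x80000000:                  # top bit set means a negative number
        value -= 0x100000000
    return value


last_good = wrap_int32(2**31 - 1)            # largest value that still fits
first_bad = wrap_int32(2**31)                # one step further wraps to the minimum
print(last_good, first_bad)
```

After the wrap the counter is negative and climbs back toward zero, which matches the description of an index that suddenly goes wrong after 2,147,483,647 lines.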
>> for lineNum, line in enumerate(f): ...
>>
>> However, lineNum soon overflows and starts counting backwards. How do
>> i force enumerate to return long integer?
> Just how "soon" exactly do you read sys.maxint lines from a file? I
> should have thought that it would take a significant amount of time to
> read 2,147,483,647 lines ...

A modestly (but not overwhelmingly) long time:

(defining our own xrange-ish generator that can handle things
larger than longs)

    >>> def xxrange(x):
    ...     i = 0
    ...     while i < x:
    ...         yield i
    ...         i += 1
    ...
    >>> for i,j in enumerate(xxrange(2**33)): assert i==j
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AssertionError

It took me about 60-90 minutes to hit the assertion on a
dual-core 2.8 GHz machine under otherwise light load. When
batch-processing lengthy log files or other large data such as
genetic data, it's entirely possible to hit this limit, as the OP
discovered.

-tkc

Oct 3 '07 #4
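
On a current interpreter the same boundary can be checked in well under a second by starting the count just below it. This sketch assumes Python 3 and relies on the `start` argument of `enumerate()`, which did not exist in Python 2.4; `indices_from` is a hypothetical helper name.

```python
from itertools import islice, repeat


def indices_from(start, how_many=4):
    """Return the first few indices enumerate() produces when started at `start`."""
    endless = repeat(None)                   # stand-in for a very long input
    return [i for i, _ in islice(enumerate(endless, start), how_many)]


around_32_bit = indices_from(2**31 - 2)
around_64_bit = indices_from(2**63 - 2)
print(around_32_bit)
print(around_64_bit)
```

Both lists keep increasing across the boundary, so no assertion like Tim's would fire there.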
Tim Chase wrote:
>>> for lineNum, line in enumerate(f): ...
>>>
>>> However, lineNum soon overflows and starts counting backwards. How do
>>> i force enumerate to return long integer?
>> Just how "soon" exactly do you read sys.maxint lines from a file? I
>> should have thought that it would take a significant amount of time to
>> read 2,147,483,647 lines ...
>
> A modestly (but not overwhelmingly) long time:
>
> (defining our own xrange-ish generator that can handle things larger
> than longs)
>
>     >>> def xxrange(x):
>     ...     i = 0
>     ...     while i < x:
>     ...         yield i
>     ...         i += 1
>     ...
>     >>> for i,j in enumerate(xxrange(2**33)): assert i==j
>     ...
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AssertionError
>
> It took me about 60-90 minutes to hit the assertion on a dual-core
> 2.8 GHz machine under otherwise light load. When batch-processing lengthy
> log files or other large data such as genetic data, it's entirely
> possible to hit this limit, as the OP discovered.

I wouldn't dream of suggesting it's impossible. I just regard "soon" as
less than an hour in commuter's terms, I suppose.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline
Oct 3 '07 #5
Steve Holden wrote:
> I wouldn't dream of suggesting it's impossible.
> I just regard "soon" as less than an hour in
> commuter's terms, I suppose.

Sadly, speaking as a Londoner, an hour is indeed
"soon" in commuter terms.

TJG

Oct 3 '07 #6
[Paul Rubin]
> I hope in 3.0 there's a real fix, i.e. the count should promote to
> long.

In Py2.6, I will most likely put in an automatic promotion to long
for both enumerate() and count(). It took a while to figure out how
to do this without killing the performance for normal cases (ones used
in real programs, not examples contrived to say, "omg, see what
*could* happen").

Raymond

Oct 3 '07 #7
Raymond Hettinger <py****@rcn.com> writes:
> In Py2.6, I will most likely put in an automatic promotion to long
> for both enumerate() and count(). It took a while to figure out how
> to do this without killing the performance for normal cases (ones used
> in real programs, not examples contrived to say, "omg, see what
> *could* happen").

Great, this is good to hear. I think it's ok if the enumeration slows
down after fixnum overflow is reached. So it's just a matter of
replacing the overflow signal with consing up a long. The fixnum case
would be the same as it is now. To be fancy, the count could be
stored in two C ints (or a gcc long long) so it would go up to 64 bits
but I don't think it's worth it, especially for itertools.count which
should be able to take arbitrary (i.e. larger than 64 bits) initializers.

As for real programs, well, the Y2038 bug is slowly creeping up on us.
That's when Unix timestamps overflow a signed 32-bit counter. It's
already caused an actual system failure, in 2006:

http://worsethanfailure.com/Articles...he_Epoch_.aspx

Really, the whole idea of int/long unification is so we can stop
worrying about "omg, that could happen". We want to write programs
without special consideration or "omg" about those possibilities, and
still have them keep working smoothly if that DOES happen. Just about
all of us these days have hundreds of GB or more of disk space on our
systems, and files with over 2**32 bytes or lines are not even
slightly unreasonable. We shouldn't have to write special generators
to deal with them; the library should instead just do the right thing.
Oct 3 '07 #8
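
The Y2038 date mentioned above follows directly from the size of a signed 32-bit counter of seconds since the Unix epoch. A minimal sketch of the arithmetic, assuming Python 3 and timestamps counted from 1970-01-01 00:00:00 UTC (the variable names are illustrative):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value a signed 32-bit counter of seconds can hold.
last_second = EPOCH + timedelta(seconds=2**31 - 1)

# Where a two's-complement wraparound would land one second later.
after_wrap = EPOCH + timedelta(seconds=-(2**31))

print(last_second.isoformat())
print(after_wrap.isoformat())
```

The first value falls on 19 January 2038 and the second in December 1901, which is why a wrapped timestamp appears to jump back more than a century.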
Raymond Hettinger <py****@rcn.com> writes:
> [Paul Rubin]
>> I hope in 3.0 there's a real fix, i.e. the count should promote to
>> long.
>
> In Py2.6, I will most likely put in an automatic promotion to long
> for both enumerate() and count(). It took a while to figure out how
> to do this without killing the performance for normal cases (ones
> used in real programs, not examples contrived to say, "omg, see what
> *could* happen").

Using PY_LONG_LONG for the counter, and PyLong_FromLongLong to create
the Python number should work well for huge sequences without
(visibly) slowing down the normal case.
Oct 3 '07 #9
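
A rough estimate shows why a 64-bit counter would be enough for any real file. The sketch below takes the lower end of Tim's measurement earlier in the thread (about 2**31 items per hour) as an assumed rate; the function name and constants are illustrative only.

```python
ITEMS_PER_HOUR = 2**31            # assumption: roughly Tim's rate, 2**31 items in about one hour
HOURS_PER_YEAR = 24 * 365.25


def years_to_exhaust(bits, items_per_hour=ITEMS_PER_HOUR):
    """Years needed to count up to the limit of a signed counter that is `bits` wide."""
    return (2 ** (bits - 1)) / items_per_hour / HOURS_PER_YEAR


years_32 = years_to_exhaust(32)   # about one hour, expressed in years
years_64 = years_to_exhaust(64)   # on the order of half a million years
print(years_32, years_64)
```

At that rate the 32-bit limit is reached in about an hour, while the 64-bit limit is hundreds of thousands of years away.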
On Oct 3, 7:22 pm, Raymond Hettinger <pyt...@rcn.com> wrote:
> In Py2.6, I will most likely put in an automatic promotion to long
> for both enumerate() and count(). It took a while to figure out how
> to do this without killing the performance for normal cases (ones used
> in real programs, not examples contrived to say, "omg, see what
> *could* happen").
>
> Raymond

Thanks everybody for the replies and suggestions. I'm glad to see the
issue has already been discovered/discussed/almost resolved.

By the way, I do not consider my programs in any way 'unreal'.

Oct 3 '07 #10
On Oct 3, 12:52 pm, koara <ko...@atlas.cz> wrote:
> Thanks everybody for the replies and suggestions. I'm glad to see the
> issue has already been discovered/discussed/almost resolved.

The new code is checked in. In Py2.6, enumerate() will no longer
raise an OverflowError and it will automatically shift from ints to
longs. Will check in something similar for itertools.count() when I
get a chance.

Raymond
Oct 3 '07 #11
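
The same check can be made for `itertools.count()`. This sketch assumes Python 3 and says nothing about how the 2.6 change was implemented; it only shows that a count started near a machine-word boundary keeps increasing, and that pairing it with `zip()` gives an enumerate-like numbering. The sample items are hypothetical.

```python
from itertools import count, islice

# Start just below the 64-bit signed limit and take four values.
values = list(islice(count(2**63 - 2), 4))

# count() paired with zip() numbers items much like enumerate() does.
numbered = list(zip(count(2**31 - 1), ["x", "y"]))

print(values)
print(numbered)
```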
On Wed, 03 Oct 2007 08:46:31 -0300, <cr**@post.cz> wrote:
> in python2.4, i read lines from a file with
>
> for lineNum, line in enumerate(f): ...
>
> However, lineNum soon overflows and starts counting backwards. How do
> i force enumerate to return long integer?

(What kind of files are you using? enumerate overflows after more than two
billion lines... is that "soon" for you?)

I'm afraid neither enumerate nor itertools.count will generate a long
integer; upgrading to Python 2.5 won't help. I think the only way is to
roll your own counter:

    lineNum = 0
    for line in f:
        ...
        lineNum += 1

--
Gabriel Genellina

Oct 4 '07 #12
