473,378 Members | 1,412 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,378 software developers and data experts.

Unexpected Behavior Iterating over a Mutating Object

OK, first, I don't often have the time to read this group, so
apologies if this is a FAQ, though I couldn't find anything at
python.org.

Second, this isn't my code. I wouldn't do this. But a colleague did,
got an unexpected result, and asked me why. I think I can infer what
is occurring, and I was able to find a simple work-around. But I
thought I'd ask about it anyway.

I've been pushing Python at work for use as a scripting language to
get simple tasks done. My colleague is modifying a parts library for
a schematic capture program with a Python script. The library is
entirely in ASCII text.

He's trying to delete a group of parts from the library. The simple
code below shows what is occurring (taken from an IDLE session, Python
2.4.1#65 for Windoze). The "parts" to be deleted in the example
contain the string 'DEL':

---begin included file---
data = [ 'First', 'Second DEL', 'Third', 'Fourth', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth']
bfr = []
for item in data: if item.find('DEL') >= 0:
bfr.append(item)
data.remove(item)

data ['First', 'Third', 'Fourth', 'DEL Sixth', 'Eighth DEL', 'Tenth',
'Eleventh', 'Twelfth'] bfr ['Second DEL', 'Fifth DEL', 'Seventh DEL', 'Ninth DEL']
---end included file---

It seems when an item is 'remove'd from data, the rest of the list
'shifts' over one, so what was 'next' now occupies the slot of the
'remove'd item. When the next 'item' is selected from 'data', the
_desired_ 'next' item is skipped. So when 'data' has several
consecutive items to be deleted, only every other one is 'remove'd.

The workaround is very simple:

---begin included file--- for item in data: if item.find('DEL') >= 0:
bfr.append(item)

for item in bfr: data.remove(item)

data ['First', 'Third', 'Fourth', 'Tenth', 'Eleventh', 'Twelfth'] bfr

['Second DEL', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL']
---end included file---

Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me. But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'

Thanks,

-=Dave
--
Change is inevitable, progress is not.
Sep 13 '05 #1
7 1674
You are correct if you remove items from a list that you
are iterating over you get the results you see. You either
need to iterate over the list in reverse order or better yet
create a new list from the old one with the items removed.

In Python 2.4 you can do:

data = [ 'First', 'Second DEL', 'Third', 'Fourth',
'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth']

bfr=[x for x in data if 'DEL' not in x]

Note: I'm not sure why 'DEL' is at the end of all the strings
EXCEPT the Sixth one where it is at the beginning, but if the
"DEL" was consistently at the end of the strings I would do
this instead (which would also work in Python 2.2+):

bfr=[x for x in data if not x.endswith('DEL')]

This also is much easier to read (IMHO) than the loops, etc.
of the original code. In Python 2.4 list comprehensions are
also quite efficient.

Larry Bates

Dave Hansen wrote:
OK, first, I don't often have the time to read this group, so
apologies if this is a FAQ, though I couldn't find anything at
python.org.

Second, this isn't my code. I wouldn't do this. But a colleague did,
got an unexpected result, and asked me why. I think I can infer what
is occurring, and I was able to find a simple work-around. But I
thought I'd ask about it anyway.

I've been pushing Python at work for use as a scripting language to
get simple tasks done. My colleague is modifying a parts library for
a schematic capture program with a Python script. The library is
entirely in ASCII text.

He's trying to delete a group of parts from the library. The simple
code below shows what is occurring (taken from an IDLE session, Python
2.4.1#65 for Windoze). The "parts" to be deleted in the example
contain the string 'DEL':

---begin included file---
data = [ 'First', 'Second DEL', 'Third', 'Fourth',
'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth']

bfr = []
for item in data:
if item.find('DEL') >= 0:
bfr.append(item)
data.remove(item)
data
['First', 'Third', 'Fourth', 'DEL Sixth', 'Eighth DEL', 'Tenth',
'Eleventh', 'Twelfth']
bfr
['Second DEL', 'Fifth DEL', 'Seventh DEL', 'Ninth DEL']
---end included file---

It seems when an item is 'remove'd from data, the rest of the list
'shifts' over one, so what was 'next' now occupies the slot of the
'remove'd item. When the next 'item' is selected from 'data', the
_desired_ 'next' item is skipped. So when 'data' has several
consecutive items to be deleted, only every other one is 'remove'd.

The workaround is very simple:

---begin included file---
for item in data:
if item.find('DEL') >= 0:
bfr.append(item)
for item in bfr:
data.remove(item)
data
['First', 'Third', 'Fourth', 'Tenth', 'Eleventh', 'Twelfth']
bfr


['Second DEL', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL']
---end included file---

Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me. But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'

Thanks,

-=Dave

Sep 13 '05 #2
Dave Hansen wrote:
...
data = [ 'First', 'Second DEL', 'Third', 'Fourth', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth']bfr = []
for item in data:

if item.find('DEL') >= 0:
bfr.append(item)
data.remove(item)


assuming Python 2.4:

kept = []
for position, text in reversed(list(enumerate(data))):
if 'DEL' in text:
removed.append(text)
else:
del data[position]

This uses reversed so positions remain valid as the list is pruned.

--Scott David Da*****@Acm.Org
Scott.Da*****@Acm.Org
Sep 13 '05 #3
[Dave Hansen]
It seems when an item is 'remove'd from data, the rest of the list
'shifts' over one, so what was 'next' now occupies the slot of the
'remove'd item. When the next 'item' is selected from 'data', the
_desired_ 'next' item is skipped. So when 'data' has several
consecutive items to be deleted, only every other one is 'remove'd.
Your analysis is dead-on. Instead of being the "desired" item, the
iterator fetches a[0], a[1], a[2], ... as determined by the state of
the list when the index is retrieved.
Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me.
Right.

But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'


It is intended. The behavior is defined as above (a series of indexed
lookups).

BTW, the usual ways to deal with this are to:
1. iterating from the right using: for item in reversed(data)
2. making a copy of the original list before iterating
3. creating a new result list: data = [x for x in data if 'DEL' not in
x]
Raymond

Sep 13 '05 #4
Dave Hansen wrote:
Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me. But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'


a C programmer wouldn't have much problem with this, of course,
since "for in" behaves like:

for (tmp = 0; var = getitem(seq, tmp); tmp++) {
...
}

</F>

Sep 14 '05 #5
Dave Hansen wrote:
(snip code snippets and sensible explanations)
Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me.
It as *always* been a bad idea to modify a list in place (I mean adding
or removing items) while iterating over it, whatever the language. If
you *really* need to do such a thing (for effeciency reasons - and then
it's usually in low-level C code), you'd better use indexed access, and
adjust the index as needed - but this results in tricky, hard to
maintain code.
But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'


Not being a Language Lawyer(tm), I can't tell for sure, but I'd think
it's the expected behavior. Anyway it's not a behavior I'd relie upon,
since it would be too much of a dirty trick anyway.

My 2 cents
--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'o****@xiludom.gro'.split('@')])"
Sep 14 '05 #6
On Tue, 13 Sep 2005 21:28:21 GMT, id**@hotmail.com (Dave Hansen)
wrote:
OK, first, I don't often have the time to read this group, so
apologies if this is a FAQ, though I couldn't find anything at
python.org.


Thanks to everyone who responded. All is clear now. And I know I
need to look deeper into list comprehensions...

Regards,

-=Dave
--
Change is inevitable, progress is not.
Sep 14 '05 #7
On Wed, 14 Sep 2005 11:12:14 +0200, bruno modulix <on***@xiludom.gro> wrote:
Dave Hansen wrote:
(snip code snippets and sensible explanations)
Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me.
It as *always* been a bad idea to modify a list in place (I mean adding
or removing items) while iterating over it, whatever the language. If
you *really* need to do such a thing (for effeciency reasons - and then
it's usually in low-level C code), you'd better use indexed access, and
adjust the index as needed - but this results in tricky, hard to
maintain code.

Do you consider this to be tricky, or likely to be hard to maintain?
(not specifically tested beyond what you see ;-)
data = [ 'First', 'Second DEL', 'Third', 'Fourth', ... 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
... 'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth'] iw = 0
for item in data: ... if 'DEL' in item: continue
... data[iw] = item
... iw += 1
... del data[iw:]
data ['First', 'Third', 'Fourth', 'Tenth', 'Eleventh', 'Twelfth']

Or that maybe a utility function like the following would be hard to maintain?
def filter_in_place(L, keep=bool): ... iw = 0
... for item in L:
... if keep(item):
... L[iw] = item
... iw += 1
... del L[iw:]
... data = [ 'First', 'Second DEL', 'Third', 'Fourth', ... 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
... 'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth'] data ['First', 'Second DEL', 'Third', 'Fourth', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL'
, 'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth'] filter_in_place(data, lambda x: 'DEL' not in x)
data

['First', 'Third', 'Fourth', 'Tenth', 'Eleventh', 'Twelfth']

But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'


Not being a Language Lawyer(tm), I can't tell for sure, but I'd think
it's the expected behavior. Anyway it's not a behavior I'd relie upon,
since it would be too much of a dirty trick anyway.

I think it depends on the kind of mutation involved. A mutation of the _structure_
of a container being iterated over effectively modifies the initial parameters of
the iteration control, so there you need either to have intimate knowledge of how
the iterator does its thing, or you have to insulate yourself from that need.
OTOH, a mutation of the content can be safe, if it doesn't modify the content
yet to be accessed during the iteration.

In the case of a list, the order can be depended on, so it's not that hard.
Other iterable containers are not so clear, and of course even in a list you
could have cases of access to early items having side effects on upcoming items,
e.g., if the "keep" evaluation had side effects on deep data shared between early
and later list items. But that would be nasty any which way ;-)

Regards,
Bengt Richter
Sep 14 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: robert | last post by:
i've found the solution threads on changing a column on insert. works fine. question: - will one package serve for all such triggers, or does there need to be a package defined to support...
2
by: Attila Feher | last post by:
Hi all, I have not done much work around exceptions; and even when I do I avoid exception specifications. But now I have to teach people about these language facilities, so I am trying them out...
2
by: kw | last post by:
I'm getting different behavior from what I would expect and was hoping someone could clue me in. At run time I need to examine a property of an object for a custom TypeConverter, then use that...
0
by: G Dean Blake | last post by:
I am writing a control that prints any datagrid. My control gets passed a datagrid object and examines it, formats, and prints it. I am, however, encountering what I consider unexpected behavior...
6
by: Samuel M. Smith | last post by:
I have been playing around with a subclass of dict wrt a recipe for setting dict items using attribute syntax. The dict class has some read only attributes that generate an exception if I try to...
9
by: Jeff Louie | last post by:
In C# (and C++/cli) the destructor will be called even if an exception is thrown in the constructor. IMHO, this is unexpected behavior that can lead to an invalid system state. So beware! ...
2
by: Dimitri Furman | last post by:
SQL Server 2000 SP4. Running the script below prints 'Unexpected': ----------------------------- DECLARE @String AS varchar(1) SELECT @String = 'z' IF @String LIKE ''
23
by: gu | last post by:
hi to all! after two days debugging my code, i've come to the point that the problem was caused by an unexpected behaviour of python. or by lack of some information about the program, of course!...
5
by: anne.nospam01 | last post by:
Dear fellow Pythonians, I just stumbled upon the following unexpected behavior: class TestType(type): def Foo(self): return 'TestType Foo' class Test(object): __metaclass__ = TestType def...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.