473,224 Members | 1,770 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,224 software developers and data experts.

Unexpected Behavior Iterating over a Mutating Object

OK, first, I don't often have the time to read this group, so
apologies if this is a FAQ, though I couldn't find anything at
python.org.

Second, this isn't my code. I wouldn't do this. But a colleague did,
got an unexpected result, and asked me why. I think I can infer what
is occurring, and I was able to find a simple work-around. But I
thought I'd ask about it anyway.

I've been pushing Python at work for use as a scripting language to
get simple tasks done. My colleague is modifying a parts library for
a schematic capture program with a Python script. The library is
entirely in ASCII text.

He's trying to delete a group of parts from the library. The simple
code below shows what is occurring (taken from an IDLE session, Python
2.4.1#65 for Windoze). The "parts" to be deleted in the example
contain the string 'DEL':

---begin included file---
data = [ 'First', 'Second DEL', 'Third', 'Fourth', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth']
bfr = []
for item in data: if item.find('DEL') >= 0:
bfr.append(item)
data.remove(item)

data ['First', 'Third', 'Fourth', 'DEL Sixth', 'Eighth DEL', 'Tenth',
'Eleventh', 'Twelfth'] bfr ['Second DEL', 'Fifth DEL', 'Seventh DEL', 'Ninth DEL']
---end included file---

It seems when an item is 'remove'd from data, the rest of the list
'shifts' over one, so what was 'next' now occupies the slot of the
'remove'd item. When the next 'item' is selected from 'data', the
_desired_ 'next' item is skipped. So when 'data' has several
consecutive items to be deleted, only every other one is 'remove'd.

The workaround is very simple:

---begin included file--- for item in data: if item.find('DEL') >= 0:
bfr.append(item)

for item in bfr: data.remove(item)

data ['First', 'Third', 'Fourth', 'Tenth', 'Eleventh', 'Twelfth'] bfr

['Second DEL', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL']
---end included file---

Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me. But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'

Thanks,

-=Dave
--
Change is inevitable, progress is not.
Sep 13 '05 #1
7 1664
You are correct if you remove items from a list that you
are iterating over you get the results you see. You either
need to iterate over the list in reverse order or better yet
create a new list from the old one with the items removed.

In Python 2.4 you can do:

data = [ 'First', 'Second DEL', 'Third', 'Fourth',
'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth']

bfr=[x for x in data if 'DEL' not in x]

Note: I'm not sure why 'DEL' is at the end of all the strings
EXCEPT the Sixth one where it is at the beginning, but if the
"DEL" was consistently at the end of the strings I would do
this instead (which would also work in Python 2.2+):

bfr=[x for x in data if not x.endswith('DEL')]

This also is much easier to read (IMHO) than the loops, etc.
of the original code. In Python 2.4 list comprehensions are
also quite efficient.

Larry Bates

Dave Hansen wrote:
OK, first, I don't often have the time to read this group, so
apologies if this is a FAQ, though I couldn't find anything at
python.org.

Second, this isn't my code. I wouldn't do this. But a colleague did,
got an unexpected result, and asked me why. I think I can infer what
is occurring, and I was able to find a simple work-around. But I
thought I'd ask about it anyway.

I've been pushing Python at work for use as a scripting language to
get simple tasks done. My colleague is modifying a parts library for
a schematic capture program with a Python script. The library is
entirely in ASCII text.

He's trying to delete a group of parts from the library. The simple
code below shows what is occurring (taken from an IDLE session, Python
2.4.1#65 for Windoze). The "parts" to be deleted in the example
contain the string 'DEL':

---begin included file---
data = [ 'First', 'Second DEL', 'Third', 'Fourth',
'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth']

bfr = []
for item in data:
if item.find('DEL') >= 0:
bfr.append(item)
data.remove(item)
data
['First', 'Third', 'Fourth', 'DEL Sixth', 'Eighth DEL', 'Tenth',
'Eleventh', 'Twelfth']
bfr
['Second DEL', 'Fifth DEL', 'Seventh DEL', 'Ninth DEL']
---end included file---

It seems when an item is 'remove'd from data, the rest of the list
'shifts' over one, so what was 'next' now occupies the slot of the
'remove'd item. When the next 'item' is selected from 'data', the
_desired_ 'next' item is skipped. So when 'data' has several
consecutive items to be deleted, only every other one is 'remove'd.

The workaround is very simple:

---begin included file---
for item in data:
if item.find('DEL') >= 0:
bfr.append(item)
for item in bfr:
data.remove(item)
data
['First', 'Third', 'Fourth', 'Tenth', 'Eleventh', 'Twelfth']
bfr


['Second DEL', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL']
---end included file---

Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me. But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'

Thanks,

-=Dave

Sep 13 '05 #2
Dave Hansen wrote:
...
data = [ 'First', 'Second DEL', 'Third', 'Fourth', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth']bfr = []
for item in data:

if item.find('DEL') >= 0:
bfr.append(item)
data.remove(item)


assuming Python 2.4:

kept = []
for position, text in reversed(list(enumerate(data))):
if 'DEL' in text:
removed.append(text)
else:
del data[position]

This uses reversed so positions remain valid as the list is pruned.

--Scott David Da*****@Acm.Org
Scott.Da*****@Acm.Org
Sep 13 '05 #3
[Dave Hansen]
It seems when an item is 'remove'd from data, the rest of the list
'shifts' over one, so what was 'next' now occupies the slot of the
'remove'd item. When the next 'item' is selected from 'data', the
_desired_ 'next' item is skipped. So when 'data' has several
consecutive items to be deleted, only every other one is 'remove'd.
Your analysis is dead-on. Instead of being the "desired" item, the
iterator fetches a[0], a[1], a[2], ... as determined by the state of
the list when the index is retrieved.
Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me.
Right.

But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'


It is intended. The behavior is defined as above (a series of indexed
lookups).

BTW, the usual ways to deal with this are to:
1. iterating from the right using: for item in reversed(data)
2. making a copy of the original list before iterating
3. creating a new result list: data = [x for x in data if 'DEL' not in
x]
Raymond

Sep 13 '05 #4
Dave Hansen wrote:
Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me. But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'


a C programmer wouldn't have much problem with this, of course,
since "for in" behaves like:

for (tmp = 0; var = getitem(seq, tmp); tmp++) {
...
}

</F>

Sep 14 '05 #5
Dave Hansen wrote:
(snip code snippets and sensible explanations)
Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me.
It as *always* been a bad idea to modify a list in place (I mean adding
or removing items) while iterating over it, whatever the language. If
you *really* need to do such a thing (for effeciency reasons - and then
it's usually in low-level C code), you'd better use indexed access, and
adjust the index as needed - but this results in tricky, hard to
maintain code.
But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'


Not being a Language Lawyer(tm), I can't tell for sure, but I'd think
it's the expected behavior. Anyway it's not a behavior I'd relie upon,
since it would be too much of a dirty trick anyway.

My 2 cents
--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'o****@xiludom.gro'.split('@')])"
Sep 14 '05 #6
On Tue, 13 Sep 2005 21:28:21 GMT, id**@hotmail.com (Dave Hansen)
wrote:
OK, first, I don't often have the time to read this group, so
apologies if this is a FAQ, though I couldn't find anything at
python.org.


Thanks to everyone who responded. All is clear now. And I know I
need to look deeper into list comprehensions...

Regards,

-=Dave
--
Change is inevitable, progress is not.
Sep 14 '05 #7
On Wed, 14 Sep 2005 11:12:14 +0200, bruno modulix <on***@xiludom.gro> wrote:
Dave Hansen wrote:
(snip code snippets and sensible explanations)
Again, iterating over an item that is mutating seems like a Bad
Idea(tm) to me.
It as *always* been a bad idea to modify a list in place (I mean adding
or removing items) while iterating over it, whatever the language. If
you *really* need to do such a thing (for effeciency reasons - and then
it's usually in low-level C code), you'd better use indexed access, and
adjust the index as needed - but this results in tricky, hard to
maintain code.

Do you consider this to be tricky, or likely to be hard to maintain?
(not specifically tested beyond what you see ;-)
data = [ 'First', 'Second DEL', 'Third', 'Fourth', ... 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
... 'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth'] iw = 0
for item in data: ... if 'DEL' in item: continue
... data[iw] = item
... iw += 1
... del data[iw:]
data ['First', 'Third', 'Fourth', 'Tenth', 'Eleventh', 'Twelfth']

Or that maybe a utility function like the following would be hard to maintain?
def filter_in_place(L, keep=bool): ... iw = 0
... for item in L:
... if keep(item):
... L[iw] = item
... iw += 1
... del L[iw:]
... data = [ 'First', 'Second DEL', 'Third', 'Fourth', ... 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL',
... 'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth'] data ['First', 'Second DEL', 'Third', 'Fourth', 'Fifth DEL', 'DEL Sixth', 'Seventh DEL', 'Eighth DEL'
, 'Ninth DEL', 'Tenth', 'Eleventh', 'Twelfth'] filter_in_place(data, lambda x: 'DEL' not in x)
data

['First', 'Third', 'Fourth', 'Tenth', 'Eleventh', 'Twelfth']

But I was curious: is this the intended behavior, or
does this fall under what C programmers would call 'undefined
behavior.'


Not being a Language Lawyer(tm), I can't tell for sure, but I'd think
it's the expected behavior. Anyway it's not a behavior I'd relie upon,
since it would be too much of a dirty trick anyway.

I think it depends on the kind of mutation involved. A mutation of the _structure_
of a container being iterated over effectively modifies the initial parameters of
the iteration control, so there you need either to have intimate knowledge of how
the iterator does its thing, or you have to insulate yourself from that need.
OTOH, a mutation of the content can be safe, if it doesn't modify the content
yet to be accessed during the iteration.

In the case of a list, the order can be depended on, so it's not that hard.
Other iterable containers are not so clear, and of course even in a list you
could have cases of access to early items having side effects on upcoming items,
e.g., if the "keep" evaluation had side effects on deep data shared between early
and later list items. But that would be nasty any which way ;-)

Regards,
Bengt Richter
Sep 14 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: robert | last post by:
i've found the solution threads on changing a column on insert. works fine. question: - will one package serve for all such triggers, or does there need to be a package defined to support...
2
by: Attila Feher | last post by:
Hi all, I have not done much work around exceptions; and even when I do I avoid exception specifications. But now I have to teach people about these language facilities, so I am trying them out...
2
by: kw | last post by:
I'm getting different behavior from what I would expect and was hoping someone could clue me in. At run time I need to examine a property of an object for a custom TypeConverter, then use that...
0
by: G Dean Blake | last post by:
I am writing a control that prints any datagrid. My control gets passed a datagrid object and examines it, formats, and prints it. I am, however, encountering what I consider unexpected behavior...
6
by: Samuel M. Smith | last post by:
I have been playing around with a subclass of dict wrt a recipe for setting dict items using attribute syntax. The dict class has some read only attributes that generate an exception if I try to...
9
by: Jeff Louie | last post by:
In C# (and C++/cli) the destructor will be called even if an exception is thrown in the constructor. IMHO, this is unexpected behavior that can lead to an invalid system state. So beware! ...
2
by: Dimitri Furman | last post by:
SQL Server 2000 SP4. Running the script below prints 'Unexpected': ----------------------------- DECLARE @String AS varchar(1) SELECT @String = 'z' IF @String LIKE ''
23
by: gu | last post by:
hi to all! after two days debugging my code, i've come to the point that the problem was caused by an unexpected behaviour of python. or by lack of some information about the program, of course!...
5
by: anne.nospam01 | last post by:
Dear fellow Pythonians, I just stumbled upon the following unexpected behavior: class TestType(type): def Foo(self): return 'TestType Foo' class Test(object): __metaclass__ = TestType def...
0
by: veera ravala | last post by:
ServiceNow is a powerful cloud-based platform that offers a wide range of services to help organizations manage their workflows, operations, and IT services more efficiently. At its core, ServiceNow...
0
by: VivesProcSPL | last post by:
Obviously, one of the original purposes of SQL is to make data query processing easy. The language uses many English-like terms and syntax in an effort to make it easy to learn, particularly for...
3
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 3 Jan 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). For other local times, please check World Time Buddy In...
0
by: jianzs | last post by:
Introduction Cloud-native applications are conventionally identified as those designed and nurtured on cloud infrastructure. Such applications, rooted in cloud technologies, skillfully benefit from...
0
by: mar23 | last post by:
Here's the situation. I have a form called frmDiceInventory with subform called subfrmDice. The subform's control source is linked to a query called qryDiceInventory. I've been trying to pick up the...
2
by: jimatqsi | last post by:
The boss wants the word "CONFIDENTIAL" overlaying certain reports. He wants it large, slanted across the page, on every page, very light gray, outlined letters, not block letters. I thought Word Art...
2
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 7 Feb 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:30 (7.30PM). In this month's session, the creator of the excellent VBE...
0
by: stefan129 | last post by:
Hey forum members, I'm exploring options for SSL certificates for multiple domains. Has anyone had experience with multi-domain SSL certificates? Any recommendations on reliable providers or specific...
0
Git
by: egorbl4 | last post by:
Скачал я git, хотел начать настройку, а там вылезло вот это Что это? Что мне с этим делать? ...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.