altering an object as you iterate over it?

What is the best way of altering something (in my case, a file) while
you are iterating over it? I've tried this before by accident and got an
error, naturally.

I'm trying to read the lines of a file and remove all the blank ones.
One solution I tried is to open the file and use readlines(), then copy
that list into another variable, but this doesn't seem very efficient to
have two variables representing the file.

Perhaps there's also some better way to do it than this, including using
readlines(), but I'm most interested in just how you edit something as
you are iterating over it.

Thanks.
May 19 '06 #1
John Salerno wrote:
What is the best way of altering something (in my case, a file) while
you are iterating over it? I've tried this before by accident and got an
error, naturally.

I'm trying to read the lines of a file and remove all the blank ones.
One solution I tried is to open the file and use readlines(), then copy
that list into another variable, but this doesn't seem very efficient to
have two variables representing the file.

Perhaps there's also some better way to do it than this, including using
readlines(), but I'm most interested in just how you edit something as
you are iterating with it.

Thanks.


Slightly new question as well. here's my code:

phonelist = open('file').readlines()
new_phonelist = phonelist

for line in phonelist:
    if line == '\n':
        new_phonelist.remove(line)

import pprint
pprint.pprint(new_phonelist)

But I notice that there are still several lines that print out as '\n',
so why doesn't it work for all lines?
May 19 '06 #2
John Salerno wrote:
What is the best way of altering something (in my case, a file) while
you are iterating over it? I've tried this before by accident and got an
error, naturally.

I'm trying to read the lines of a file and remove all the blank ones.
One solution I tried is to open the file and use readlines(), then copy
that list into another variable, but this doesn't seem very efficient to
have two variables representing the file.


If the file is huge, this can be a problem. But you cannot modify a text
file in place anyway.

For the general case, the best way to go would probably be an iterator:

def iterfilter(fileObj):
    for line in fileObj:
        if line.strip():
            yield line

f = open(path, 'r')
for line in iterfilter(f):
    doSomethingWith(line)

Now if what you want to do is just to rewrite the file without the blank
lines, you need to use a second file:

fin = open(path, 'r')
fout = open(temp, 'w')
for line in fin:
    if line.strip():
        fout.write(line)
fin.close()
fout.close()

then delete path and rename temp, and you're done. And yes, this is
actually the canonical way to do this !-)
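
For readers on current Python 3, the whole "write a second file, then swap it in" recipe could be sketched roughly as follows. The function name remove_blank_lines and the demo file name are made up for illustration; os.replace() stands in for the separate delete-and-rename steps.

```python
import os
import tempfile

def remove_blank_lines(path):
    """Rewrite the file at `path` without its blank lines."""
    # Put the temporary file in the same directory so that the final
    # rename does not have to cross filesystems.
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as fout, open(path, 'r') as fin:
            for line in fin:
                if line.strip():
                    fout.write(line)
        # os.replace() overwrites the target, so no separate delete step.
        os.replace(temp_path, path)
    except BaseException:
        os.unlink(temp_path)  # do not leave the temporary file behind
        raise

# Small demonstration in a throwaway directory (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'phones.txt')
    with open(demo_path, 'w') as f:
        f.write('alice\n\n   \nbob\n\n')
    remove_blank_lines(demo_path)
    with open(demo_path) as f:
        demo_result = f.read()
    leftover = os.listdir(tmp)
```

One thing to keep in mind with this sketch: the rewritten file carries the permissions of the temporary file rather than those of the original.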

--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'o****@xiludom.gro'.split('@')])"
May 19 '06 #3
bruno at modulix wrote:
Now if what you want to do is just to rewrite the file without the blank
lines, you need to use a second file:

fin = open(path, 'r')
fout = open(temp, 'w')
for line in fin:
    if line.strip():
        fout.write(line)
fin.close()
fout.close()

then delete path and rename temp, and you're done. And yes, this is
actually the canonical way to do this !-)


Thanks, that's what I want. Seems a little strange, but at least you
showed me that line.strip() is far better than line == '\n'
May 19 '06 #4
"John Salerno" <jo******@NOSPAMgmail.com> wrote in message
news:D%******************@news.tufts.edu...
John Salerno wrote:
What is the best way of altering something (in my case, a file) while
you are iterating over it? I've tried this before by accident and got an
error, naturally.

I'm trying to read the lines of a file and remove all the blank ones.
One solution I tried is to open the file and use readlines(), then copy
that list into another variable, but this doesn't seem very efficient to
have two variables representing the file.

Perhaps there's also some better way to do it than this, including using
readlines(), but I'm most interested in just how you edit something as
you are iterating with it.

Thanks.


Slightly new question as well. here's my code:

phonelist = open('file').readlines()
new_phonelist = phonelist

for line in phonelist:
    if line == '\n':
        new_phonelist.remove(line)

import pprint
pprint.pprint(new_phonelist)

But I notice that there are still several lines that print out as '\n',
so why doesn't it work for all lines?


Okay, so it looks like you are moving away from modifying a list while
iterating over it. In general this is good practice, that is, it is good
practice to *not* modify a list while iterating over it (although if you
*must* do this, it is possible, just iterate from back-to-front instead of
front to back, so that deletions don't mess up your "next" pointer).
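
A minimal sketch of the back-to-front idea, using a small in-memory list in place of the lines read from a file (the sample data is made up):

```python
lines = ['alice\n', '\n', '\n', 'bob\n', '   \n', 'carol\n']

# Walk the indices from the last to the first. Deleting lines[i] only
# shifts the items after position i, and those have already been visited.
for i in range(len(lines) - 1, -1, -1):
    if not lines[i].strip():
        del lines[i]

print(lines)
```

Because the loop works with indices and del, it removes exactly the item it is looking at, rather than searching the list for a matching value.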

Your coding style is a little dated - are you using an old version of
Python? This style is the old-fashioned way:

noblanklines = []
lines = open("filename.dat").readlines()
for line in lines:
    if line != '\n':
        noblanklines.append(line)

1. open("xxx") still works - not sure if it's even deprecated or not - but
the new style is to use the file class
2. the file class is itself an iterator, so no need to invoke readlines
3. no need for such a simple for loop, a list comprehension will do the
trick - or even a generator expression passed to a list constructor.

So this construct collapses down to:

noblanklines = [ line for line in file("filename.dat") if line != '\n' ]

Now to your question about why '\n' lines persist into your new list. The
answer is - you are STILL UPDATING THE LIST YOU ARE ITERATING OVER!!!
Here's your code:

new_phonelist = phonelist

for line in phonelist:
    if line == '\n':
        new_phonelist.remove(line)

phonelist and new_phonelist are just two names bound to the same list! If
you have two consecutive '\n's in the file (say lines 3 and 4), then
removing the first (line 3) shortens the list by one, so that line 4 becomes
the new line 3. Then you advance to the next line, being line 4, and the
second '\n' has been skipped over.
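
The aliasing and the skipped element can be reproduced with a tiny in-memory list. This is only a sketch with made-up sample data, followed by one possible fix that builds a genuinely new list:

```python
phonelist = ['alice\n', '\n', '\n', 'bob\n']
new_phonelist = phonelist           # a second name, not a second list
print(new_phonelist is phonelist)   # both names refer to one object

for line in phonelist:
    if line == '\n':
        # Shrinks the very list the for loop is stepping through.
        new_phonelist.remove(line)

print(phonelist)   # one of the two blank lines is still there

# One way out: build a separate list instead of mutating the original.
source = ['alice\n', '\n', '\n', 'bob\n']
cleaned = [line for line in source if line.strip()]
print(cleaned)
```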

Also, don't confuse remove with del. new_phonelist.remove(line) does a
search of new_phonelist for the first matching entry of line. We know line
= '\n' - all this is doing is scanning through new_phonelist and removing
the first occurrence of '\n'. You'd do just as well with:

numEmptyLines = lines.count('\n')
for i in range( numEmptyLines ):
    lines.remove('\n')

Or, more compactly:

for i in range( lines.count('\n') ):
    lines.remove('\n')

The two forms behave the same: the argument to range() is evaluated only
once, before the loop starts, so the count is not recomputed as lines are
removed. Either way, each remove() call rescans the list from the start,
so performance is poor on long lists.

You might also want to strip whitespace from your lines - I expect while you
are removing blank lines, a line composed of all spaces and/or tabs would be
equally removable. Try this:

lines = map(str.rstrip, file("XYZZY.DAT") )
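
Note that stripping alone does not drop anything: a whitespace-only line just becomes an empty string, so a filtering step is still needed afterwards. A rough Python 3 sketch, with an in-memory list standing in for the file (the sample data is an assumption):

```python
raw = ['alice\n', '   \n', '\t\n', 'bob  \n', '\n']

# rstrip() removes trailing whitespace, including the newline, so a
# whitespace-only line turns into the empty string.
stripped = [line.rstrip() for line in raw]

# Empty strings are falsy, so a plain truth test drops them.
nonblank = [line for line in stripped if line]

print(stripped)
print(nonblank)
```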

-- Paul
May 19 '06 #5
Paul McGuire wrote:
Your coding style is a little dated - are you using an old version of
Python? This style is the old-fashioned way:
I'm sure it has more to do with the fact that I'm new to Python, but
what is old-fashioned about open()? Does file() do anything different? I
know they are synonymous, but I like open because it seems like it's
more self-describing than 'file'.
Now to your question about why '\n' lines persist into your new list. The
answer is - you are STILL UPDATING THE LIST YOU ARE ITERATING OVER!!!
Doh! I see that now! :)

You might also want to strip whitespace from your lines


Another good hint. Thanks for the reply!
May 19 '06 #6
Paul McGuire wrote:
Your coding style is a little dated - are you using an old version of
Python? This style is the old-fashioned way: [clip]
1. open("xxx") still works - not sure if it's even deprecated or not - but
the new style is to use the file class

Python 2.3.4 (#4, Oct 25 2004, 21:40:10)
[GCC 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
py> open is file
True

James
--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
May 19 '06 #7
"John Salerno" <jo******@NOSPAMgmail.com> wrote in message
news:WN******************@news.tufts.edu...
Paul McGuire wrote:
Your coding style is a little dated - are you using an old version of
Python? This style is the old-fashioned way:


I'm sure it has more to do with the fact that I'm new to Python, but
what is old-fashioned about open()? Does file() do anything different? I
know they are synonymous, but I like open because it seems like it's
more self-describing than 'file'.


I think it is just part of the objectification trend - "f =
open('xyzzy.dat')" is sort of a functional/verb concept, so it has to return
something, and it's something non-objecty like a file handle - urk! Instead,
using "f = file('xyzzy.dat')" is more of an object construction concept - "I
am creating a file object around 'xyzzy.dat' that I will interact with." In
practice, yes, they both do the same thing. Note though, the asymmetry of
"f = open('blah')" and "f.close()" - there is no "close(f)". I see now in
the help for "file" this statement:

Note: open() is an alias for file().

Sounds like some global namespace pollution that may be up for review come
the new Python millennium (Py3K, that is).
Now to your question about why '\n' lines persist into your new list. The
answer is - you are STILL UPDATING THE LIST YOU ARE ITERATING OVER!!!


Doh! I see that now! :)


Sorry about the ALL CAPS... I think I got a little rant-ish in that last
post, didn't mean to shout. :)

Thanks for being a good sport,
-- Paul
May 19 '06 #8
Paul McGuire wrote:
answer is - you are STILL UPDATING THE LIST YOU ARE ITERATING OVER!!!

Doh! I see that now! :)


Sorry about the ALL CAPS... I think I got a little rant-ish in that last
post, didn't mean to shout. :)

Thanks for being a good sport,


Heh heh, actually it was the all caps that kept making me read it over
and over until I really knew what you were saying! :)
May 19 '06 #9
Paul McGuire wrote:
I think it is just part of the objectification trend - "f =
open('xyzzy.dat')" is sort of a functional/verb concept, so it has to return
something, and it's something non-objecty like a file handle - urk! Instead,
using "f = file('xyzzy.dat')" is more of an object construction concept
I see what you mean, but I think that's why I like using open, because I
like having my functions be verbs instead of nouns.
Note though, the asymmetry of
"f = open('blah')" and "f.close()" - there is no "close(f)".


I'm not sure that's a perfect comparison though, because the counterpart
of close(f) would be open(f), and whether you use file() or open(),
neither is taking f as the parameter like close() does, and you aren't
calling close() on 'blah' above.
May 19 '06 #10
In <1K******************@tornado.texas.rr.com>, Paul McGuire wrote:
1. open("xxx") still works - not sure if it's even deprecated or not - but
the new style is to use the file class


It's not deprecated and may be still used for opening files. I guess the
main reason for introducing `file` as a synonym was the possibility to
inherit from builtins. Inheriting from `open` looks quite strange.

Ciao,
Marc 'BlackJack' Rintsch
May 19 '06 #11
On Fri, 19 May 2006 13:36:35 -0700, James Stroud wrote:
Paul McGuire wrote:
Your coding style is a little dated - are you using an old version of
Python? This style is the old-fashioned way: [clip]
1. open("xxx") still works - not sure if it's even deprecated or not - but
the new style is to use the file class

Python 2.3.4 (#4, Oct 25 2004, 21:40:10)
[GCC 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
py> open is file
True

James


As part of a discussion on Python-Dev in 2004 about using open() or file()
Guido replied:
Then should the following line in the reference be changed?

"The file() constructor is new in Python 2.2. The previous spelling,
open(), is retained for compatibility, and is an alias for file()."

That *strongly* suggests that the preferred spelling is file(), and
that open() shouldn't be used for new code.


Oops, yes. I didn't write that, and it doesn't convey my feelings
about file() vs. open(). Here's a suggestion for better words:

"The file class is new in Python 2.2. It represents the type (class)
of objects returned by the built-in open() function. Its constructor
is an alias for open(), but for future and backwards compatibility,
open() remains preferred."
See: http://mail.python.org/pipermail/pyt...ly/045931.html
--
Richard
May 19 '06 #12
John Salerno a écrit :
John Salerno wrote:
What is the best way of altering something (in my case, a file) while
you are iterating over it? I've tried this before by accident and got
an error, naturally.

I'm trying to read the lines of a file and remove all the blank ones.
One solution I tried is to open the file and use readlines(), then
copy that list into another variable, but this doesn't seem very
efficient to have two variables representing the file.

Perhaps there's also some better way to do it than this, including using
readlines(), but I'm most interested in just how you edit something as
you are iterating with it.

Thanks.

Slightly new question as well. here's my code:

phonelist = open('file').readlines()


readlines() reads the whole file into memory. Take care, you may have
problems with huge files.
new_phonelist = phonelist
Woops ! Gotcha ! Try adding this:
assert(new_phonelist is phonelist)

Got it ? Python 'variables' are really name/object ref pairs, so here
you just made new_phonelist an alias to phonelist.

for line in phonelist:
    if line == '\n':
replace this with:
    if not line.strip():
        new_phonelist.remove(line)
And end up modifying the list in place while iterating over it - which
is usually a very bad idea.

Also, FWIW, you'd have the same result with:

phonelist = filter(str.strip, open('file'))
import pprint
pprint.pprint(new_phonelist)

But I notice that there are still several lines that print out as '\n',
so why doesn't it work for all lines?


Apart from the fact that it's usually safer to use line.strip(), the
main problem is that you modify the list in place while iterating over it.
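
If the list really has to be changed in place, two common ways around the problem are to loop over a copy, or to assign the filtered result back through a slice. A small sketch with made-up data:

```python
phonelist = ['alice\n', '\n', '\n', 'bob\n', ' \n']

# phonelist[:] is a shallow copy, so the loop walks a snapshot that
# remove() never touches.
for line in phonelist[:]:
    if not line.strip():
        phonelist.remove(line)

print(phonelist)

# Slice assignment replaces the contents but keeps the same list object,
# so any other name bound to it sees the cleaned-up version too.
other = ['x\n', '\n', 'y\n']
alias = other
other[:] = [line for line in other if line.strip()]
print(alias is other, alias)
```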
May 19 '06 #13
John Salerno a écrit :
Paul McGuire wrote:
Your coding style is a little dated - are you using an old version of
Python? This style is the old-fashioned way:

I'm sure it has more to do with the fact that I'm new to Python, but
what is old-fashioned about open()?


At one time it was recommended to use file() instead of open().
Don't worry, open() is ok - and I guess almost everyone uses it.
May 19 '06 #14
bruno at modulix a écrit :
(snip)

(responding to myself)
(but under another identity - now that's a bit schizophrenic, isn't it ?-)
For the general case, the best way to go would probably be an iterator:

def iterfilter(fileObj):
    for line in fileObj:
        if line.strip():
            yield line

f = open(path, 'r')
for line in iterfilter(f):
    doSomethingWith(line)


Which is good as an example of simple iterator, but pretty useless since
we have itertools :

import itertools
f = open(path, 'r')
for line in itertools.ifilter(lambda l: l.strip(), f):
    doSomethingWith(line)
f.close()
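
itertools.ifilter only exists in Python 2. In Python 3 the built-in filter() is itself lazy, so the same idea might look like the sketch below. An io.StringIO object stands in for the open file here, purely so the example runs without touching the disk.

```python
import io

# io.StringIO stands in for an open text file (an assumption made for
# the sake of a self-contained example).
f = io.StringIO('alice\n\n   \nbob\n')

kept = []
# filter() yields lines one at a time; str.strip returns '' (falsy)
# for a blank or whitespace-only line, so those are skipped.
for line in filter(str.strip, f):
    kept.append(line)

print(kept)
```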
May 19 '06 #15
Bruno Desthuilliers wrote

At one time it was recommended to use file() instead of
open(). Don't worry, open() is ok - and I guess almost everyone
uses it.


http://mail.python.org/pipermail/pyt...er/059073.html

May 20 '06 #16
John Salerno <jo******@NOSPAMgmail.com> writes:
Paul McGuire wrote:
I think it is just part of the objectification trend - "f =
open('xyzzy.dat')" is sort of a functional/verb concept, so it has
to return something, and it's something non-objecty like a file
handle - urk! Instead, using "f = file('xyzzy.dat')" is more of
an object construction concept


I see what you mean, but I think that's why I like using open,
because I like having my functions be verbs instead of nouns.


Note though that you're calling a class (in this case, type)
constructor, to return a new object. Do you find int(), dict(), set()
et al to be strange names for what they do?

--
\ "I was sleeping the other night, alone, thanks to the |
`\ exterminator." -- Emo Philips |
_o__) |
Ben Finney

May 20 '06 #17
In article <e4**********@daisy.noc.ucla.edu>,
James Stroud <js*****@ucla.edu> wrote:
Paul McGuire wrote:

1. open("xxx") still works - not sure if it's even deprecated or not - but
the new style is to use the file class


Python 2.3.4 (#4, Oct 25 2004, 21:40:10)
[GCC 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
py> open is file
True


Python 2.5a2 (trunk:46052, May 19 2006, 19:54:46)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> open is file
False

Per the other comments in this thread, Guido agreed that making open() a
synonym of file() was a mistake, and my patch to split them was accepted.
Still need to do more doc update (per Uncle Timmy complaint), but that
shouldn't be too hard.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there." --Steve Gonedes
May 20 '06 #18
Aahz wrote:
Python 2.5a2 (trunk:46052, May 19 2006, 19:54:46)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> open is file
False

Per the other comments in this thread, Guido agreed that making open() a
synonym of file() was a mistake, and my patch to split them was accepted.
Still need to do more doc update (per Uncle Timmy complaint), but that
shouldn't be too hard.


Interesting. What is the difference between them now?
May 20 '06 #19
"Bruno Desthuilliers" <bd*****************@free.quelquepart.fr> wrote in
message news:44**********************@news.free.fr...
bruno at modulix a écrit :
(snip)

(responding to myself)
(but under another identity - now that's a bit schizophrenic, isn't it ?-)


Do you ever flame yourself?

-- Paul
May 20 '06 #20
[John Salerno, on the difference between `open` and `file`]
Interesting. What is the difference between them now?


In 2.5 `file` is unchanged but `open` becomes a function:
>>> file
<type 'file'>
>>> open
<built-in function open>
May 20 '06 #21
Tim Peters wrote:
[John Salerno, on the difference between `open` and `file`]
Interesting. What is the difference between them now?


In 2.5 `file` is unchanged but `open` becomes a function:
>>> file
<type 'file'>
>>> open
<built-in function open>


So they are still used in the same way though?
May 20 '06 #22
"Tim Peters" <ti********@gmail.com> writes:
In 2.5 `file` is unchanged but `open` becomes a function:
>>> file
<type 'file'>
>>> open
<built-in function open>


So which one are we supposed to use?
May 20 '06 #23
[Tim Peters]
In 2.5 `file` is unchanged but `open` becomes a function:
>>> file
<type 'file'>
>>> open
<built-in function open>
[Paul Rubin] So which one are we supposed to use?


Use for what? If you're trying to check an object's type, use the
type; if you're trying to open a file, use the function.
>>> type(open('a.file', 'wb')) is file
True
May 20 '06 #24
"Tim Peters" <ti********@gmail.com> writes:
[John Salerno, on the difference between `open` and `file`]
Interesting. What is the difference between them now?


In 2.5 `file` is unchanged but `open` becomes a function:
>>> file
<type 'file'>
>>> open
<built-in function open>


In that case I'll happily use 'file()', since it meshes nicely with
creating a new instance of any built-in type.

--
\ "None can love freedom heartily, but good men; the rest love |
`\ not freedom, but license." -- John Milton |
_o__) |
Ben Finney

May 20 '06 #25
bruno at modulix wrote:
fin = open(path, 'r')
fout = open(temp, 'w')
for line in fin:
if line.strip():
fout.write(line)
fin.close()
fout.close()

then delete path and rename temp, and you're done. And yes, this is
actually the canonical way to do this !-)


What if there's a hard link to path?
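
One possible answer, sketched for Python 3: renaming a new file over path gives that name a new inode, so other hard links keep pointing at the old contents. Rewriting the existing file in place avoids that, at the cost of holding the kept lines in memory and of not being atomic (a crash halfway through can leave a damaged file). The function name and demo file name below are made up for illustration.

```python
import os
import tempfile

def remove_blank_lines_in_place(path):
    """Drop blank lines while keeping the same inode (and hard links)."""
    with open(path, 'r+') as f:
        kept = [line for line in f if line.strip()]  # held in memory
        f.seek(0)
        f.writelines(kept)
        f.truncate()  # cut off the leftover tail of the old contents

# Small demonstration in a throwaway directory (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, 'phones.txt')
    with open(demo, 'w') as f:
        f.write('alice\n\nbob\n')
    inode_before = os.stat(demo).st_ino
    remove_blank_lines_in_place(demo)
    inode_after = os.stat(demo).st_ino
    with open(demo) as f:
        result = f.read()
```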
--
Vaibhav

May 20 '06 #26
Ben Finney wrote:

I see what you mean, but I think that's why I like using open,
because I like having my functions be verbs instead of nouns.


Note though that you're calling a class (in this case, type)
constructor, to return a new object.


no, he's calling a factory function to get an object that provides the
expected behaviour.

in future versions of Python, open() may not always create an instance
of the file type.
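
For what it's worth, this is how things look in Python 3, where the file type is gone and open() picks a class from the io module depending on the mode. A small sketch, using a throwaway file name:

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'a.file')
    # Same function, three different result classes.
    with open(path, 'w') as text_f:
        text_type = type(text_f)
    with open(path, 'wb') as bin_f:
        binary_type = type(bin_f)
    with open(path, 'rb', buffering=0) as raw_f:
        raw_type = type(raw_f)

print(text_type, binary_type, raw_type)
```

All three classes derive from io.IOBase, so isinstance(f, io.IOBase) is one way to ask "is this a file-like object from open()?" without naming a single concrete type.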

</F>

May 20 '06 #27
In article <ma***************************************@python.org>,
Ben Finney <bi****************@benfinney.id.au> wrote:
"Tim Peters" <ti********@gmail.com> writes:
[John Salerno, on the difference between `open` and `file`]

Interesting. What is the difference between them now?


In 2.5 `file` is unchanged but `open` becomes a function:
>>> file
<type 'file'>
>>> open
<built-in function open>


In that case I'll happily use 'file()', since it meshes nicely with
creating a new instance of any built-in type.


Nobody will prevent you from going against the standard decreed by Guido.
But you also probably won't be able to contribute any code to the
standard library, and other people mucking with your code who do care
about Guido's decrees will probably change it.

Unlike all the other built-in types, files are special because they are
proxies for non-Python external objects. For that reason, there has long
been interest in extending open() to work with file-like objects (such as
URLs). Splitting open() and file() is a necessary precondition to making
that happen, and it's also possible that Python 3.0 may have separate
textfile and binary file objects. Finally, file() doesn't exist in
Python 2.1 and earlier, and using file() instead of open() is gratuitous
breakage.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there." --Steve Gonedes
May 21 '06 #28
Paul McGuire wrote:
"Bruno Desthuilliers" <bd*****************@free.quelquepart.fr> wrote in
message news:44**********************@news.free.fr...
bruno at modulix a écrit :
(snip)

(responding to myself)
(but under another identity - now that's a bit schizophrenic, isn't it ?-)

Do you ever flame yourself?


class Myself(Developper, Schizophrenic):
    def _flame(self):
        """ implementation left as an exercise to the reader...
        Note that this is *not* part of the public API !-)
        """
        pass
--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'o****@xiludom.gro'.split('@')])"
May 22 '06 #29
