Allowing ref counting to close file items bad style?

Dan
Is this discouraged?:

    for line in open(filename):
        <do something with line>

That is, should I do this instead?:

    fileptr = open(filename)
    for line in fileptr:
        <do something with line>
    fileptr.close()

Can I count on the ref count going to zero to close the file?

How about a write case? For example:

    class Foo(list):
        def __init__(self):
            self.extend([1, 2, 3, 4])
        def write(self, fileptr):
            for item in self:
                fileptr.write("%s\n" % item)

    foo_obj = Foo()
    foo_obj.write(open("the.file", "w"))

Is my data safer if I explicitly close, like this?:
    fileptr = open("the.file", "w")
    foo_obj.write(fileptr)
    fileptr.close()

I understand that the upcoming 'with' statement will obviate this
question, but how about without 'with'?

/Dan

--
dedded att verizon dott net
Aug 30 '06 #1
Dan <bo*****@foo.org> writes:
> Is this discouraged?:
>
>     for line in open(filename):
>         <do something with line>

Yes.

> Can I count on the ref count going to zero to close the file?

You really shouldn't. It's a CPython artifact.

> I understand that the upcoming 'with' statement will obviate this
> question, but how about without 'with'?

    f = open(filename)
    try:
        for line in f:
            <do something with line>
    finally:
        f.close()
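
For comparison, here is a minimal sketch of the same cleanup written with the 'with' statement from PEP 343. It assumes Python 3, where 'with' needs no __future__ import. The helper names first_line and fail_while_reading and the scratch file are hypothetical, chosen only for illustration. The file is closed when the block exits, whether normally or through an exception.

```python
import os
import tempfile


def first_line(path):
    """Return the first line of path and the file object used to read it."""
    with open(path) as f:
        line = f.readline()
    # Leaving the with block has already closed f, like the finally clause.
    return line, f


def fail_while_reading(path):
    """Raise inside the with block and report whether the file was closed."""
    handle = None
    try:
        with open(path) as f:
            handle = f
            raise ValueError("simulated failure while processing")
    except ValueError:
        pass
    return handle.closed


# Build a small scratch file to read back.
fd, scratch = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as out:
    out.write("alpha\nbeta\n")

line, f = first_line(scratch)
closed_after_error = fail_while_reading(scratch)
os.remove(scratch)
```

The second helper is the case try/finally exists for: the exception leaves the block early, and the file is closed anyway.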
Aug 30 '06 #2
Dan
Paul Rubin wrote:
> Dan <bo*****@foo.org> writes:
>> Is this discouraged?:
>>
>>     for line in open(filename):
>>         <do something with line>
>
> Yes.

Well, not what I wanted to hear, but what I expected.

Thanks,
Dan

--
dedded att verizon dott net
Aug 30 '06 #3
Paul Rubin wrote:
> Dan <bo*****@foo.org> writes:
>> Is this discouraged?:
>>
>>     for line in open(filename):
>>         <do something with line>
>
> Yes.
>
>> Can I count on the ref count going to zero to close the file?
>
> You really shouldn't. It's a CPython artifact.

I disagree, somewhat. No, you shouldn't count on the "ref count" per
se going to 0. And you shouldn't count on the file object being GC'd
_immediately_ after the last reference is destroyed. You should be able
to rely on it being GC'd at some point in the not-horribly-distant
future, though.

Doing an explicit .close() is not normally useful and muddies the code
(and introduces more lines for potential bugs to infest).

And yes, I know that the language spec technically allows for no GC at
all--it's a QOI issue, not a spec issue, but any implementation that
didn't GC would be useless as a general Python platform (perhaps useful
for specific embedded uses, but programming for such an environment
would be different from programming for rational Python platforms in
bigger ways than this).

(And personally I think the benefits to programmers of guaranteeing
ref-counting semantics would outweigh the additional headaches for
Jython and other alternative implementations).

Aug 30 '06 #4
"sj*******@yahoo.com" <sj*******@yahoo.com> writes:
> I disagree, somewhat. No, you shouldn't count on the "ref count" per
> se going to 0. And you shouldn't count on the file object being GC'd
> _immediately_ after the last reference is destroyed. You should be able
> to rely on it being GC'd at some point in the not-horribly-distant
> future, though.

Is there something in the language specification that says I should be
able to rely on something like that? In Jython, for example, I think
GC is handled totally by the underlying JVM and therefore totally up
to the Java implementation.

> Doing an explicit .close() is not normally useful and muddies the code
> (and introduces more lines for potential bugs to infest).

Yes, the "with" statement is the right way to do it.

> And yes, I know that the language spec technically allows for no GC at
> all--it's a QOI issue, not a spec issue, but any implementation that

QOI?

> didn't GC would be useless as a general Python platform (perhaps useful

GC's typically track memory allocation but not file handle allocation.
If you're opening a lot of files, you could run out of fd's before the
GC ever runs.

> (And personally I think the benefits to programmers of guaranteeing
> ref-counting semantics would outweigh the additional headaches for
> Jython and other alternative implementations).

Yes, "with" (doing an implicit close guaranteed to happen at the right
time) takes care of it properly.
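
To illustrate the file-descriptor point, here is a sketch, assuming Python 3, of a loop over many files that never has more than one of them open at a time, regardless of when any collector runs. The function count_lines and the scratch directory are hypothetical.

```python
import os
import tempfile


def count_lines(paths):
    """Count lines across paths, holding at most one file open at a time."""
    total = 0
    for path in paths:
        with open(path) as f:   # closed at the end of each iteration
            for _ in f:
                total += 1
    return total


# Create a few hundred small scratch files, each with three lines.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(300):
        path = os.path.join(tmp, "part%03d.txt" % i)
        with open(path, "w") as out:
            out.write("a\nb\nc\n")
        paths.append(path)
    total = count_lines(paths)
```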
Aug 30 '06 #5
Dan wrote:
> Is this discouraged?:
>
>     for line in open(filename):
>         <do something with line>
>
> That is, should I do this instead?:
>
>     fileptr = open(filename)
>     for line in fileptr:
>         <do something with line>
>     fileptr.close()

depends on the use case; in a small program that you know will only read
a few files, you can leave it to the system (especially on CPython). if
you're about to process a large number of files, or you're writing files,
it's usually better to be explicit.

note that to be really safe, you should use try/finally:

    f = open(filename, "w")
    try:
        f.write(...)
    finally:
        f.close()

</F>

Aug 30 '06 #6
Paul Rubin wrote:
> "sj*******@yahoo.com" <sj*******@yahoo.com> writes:
>> I disagree, somewhat. No, you shouldn't count on the "ref count" per
>> se going to 0. And you shouldn't count on the file object being GC'd
>> _immediately_ after the last reference is destroyed. You should be able
>> to rely on it being GC'd at some point in the not-horribly-distant
>> future, though.
>
> Is there something in the language specification that says I should be
> able to rely on something like that?

No, as I said I know the language spec doesn't require any GC at all.

> In Jython, for example, I think
> GC is handled totally by the underlying JVM and therefore totally up
> to the Java implementation.

Sure. But most Java GCs are pretty reasonable and for typical code
will run periodically (what I call the not-horribly-distant future).

>> Doing an explicit .close() is not normally useful and muddies the code
>> (and introduces more lines for potential bugs to infest).
>
> Yes, the "with" statement is the right way to do it.

Ugh.

>> And yes, I know that the language spec technically allows for no GC at
>> all--it's a QOI issue, not a spec issue, but any implementation that
>
> QOI?

Sorry, I had introduced and defined it earlier but wound up editing out
that sentence. Quality of implementation.

>> didn't GC would be useless as a general Python platform (perhaps useful
>
> GC's typically track memory allocation but not file handle allocation.
> If you're opening a lot of files, you could run out of fd's before the
> GC ever runs.

Yes, if you're opening lots of files quickly without giving the GC time
to work then you may be stuck having to use some hack to support
non-refcounting implementations (or simply deciding that the cost of
doing so is not worth supporting implementations with nondeterministic
GC). Yet another reason I said:

>> (And personally I think the benefits to programmers of guaranteeing
>> ref-counting semantics would outweigh the additional headaches for
>> Jython and other alternative implementations).
>
> Yes, "with" (doing an implicit close guaranteed to happen at the right
> time) takes care of it properly.

In many cases, that's adding additional programmer burden to duplicate
information about an object's lifetime that's already in the code. In
simple cases, it uglifies the code with what should be an unnecessary
statement (and adds additional layers of indentation).

Guaranteeing ref-counting semantics at least for local variables when
you return from a function makes for more readable code and makes life
easier on the programmer.

It's obvious to the reader that in code like:

    def myFunc(filename):
        f = open(filename, 'r')
        for line in f:
            # do something not using f

that f is used only in myFunc. Indeed, such scoping is a big part of
the point of having functions, and having to duplicate scope
declarations (via with statements or anything else) is broken.

Having f destructed at least when the function returns makes for more
readable code and fewer mistakes. CPython's refcounting behaves very
nicely in this regard, and Python programmers would be much better
served IMO if the language required at least this level of
sophistication from the GC (if not full ref-counting).

Aug 30 '06 #7
"sj*******@yahoo.com" <sj*******@yahoo.com> writes:
> Sure. But most Java GCs are pretty reasonable and for typical code
> will run periodically (what I call the not-horribly-distant future).

If your system allows max 100 files open and you're using 98 of them,
then "horribly distant future" can be awfully close by.

> (And personally I think the benefits to programmers of guaranteeing
> ref-counting semantics would outweigh the additional headaches for
> Jython and other alternative implementations).

Ref counting is a rather primitive GC technique and implementations
shouldn't be stuck having to use it.

> It's obvious to the reader that in code like:
>
>     def myFunc(filename):
>         f = open(filename, 'r')
>         for line in f:
>             # do something not using f

That's not obvious except by recognizing the idiom and knowing the
special semantics of files. Otherwise, look at

    def myOtherFunc(x):
        a = SomeClass(x)   # make an instance of some class
        b = a.foo()
        # do something with b

One can't say for sure that 'a' can be destructed when the above
function finishes. Maybe a.foo() saved a copy of its 'self' argument
somewhere. It's the same thing with your file example: "for line in f"
calls f's iter method and then repeatedly calls f's next method.
Those methods could have side effects that save f somewhere.

> Having f destructed at least when the function returns makes for more
> readable code and fewer mistakes. CPython's refcounting behaves very
> nicely in this regard,

The ref counting only works if it applies to all the lower scopes and
not just the local scope. That means you can't use any other type of GC.
Aug 30 '06 #8
Paul Rubin wrote:
> "sj*******@yahoo.com" <sj*******@yahoo.com> writes:
>> (And personally I think the benefits to programmers of guaranteeing
>> ref-counting semantics would outweigh the additional headaches for
>> Jython and other alternative implementations).
>
> Ref counting is a rather primitive GC technique

I disagree strongly with this assertion. It's not as efficient overall
as other GC implementations, but it's not a case of "less efficient to
do the same task". Reference counting buys you deterministic GC in the
pretty common case where you do not have circular references--and
determinism is very valuable to programmers. Other GCs may be faster, but
they don't actually accomplish the same task.

I can come up with plenty of "superior" algorithms for all kinds of
tasks if I'm not bound to any particular semantics, but losing
correctness for speed is rarely a good idea.
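
The circular-reference caveat can be observed directly. The sketch below assumes CPython 3.4 or later; Resource and its log list are hypothetical stand-ins for a file object. Other implementations may run the finalizer at a different time in every one of these steps.

```python
import gc


class Resource:
    """Hypothetical stand-in for a file: records when it gets closed."""

    def __init__(self, log):
        self.log = log
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            self.log.append("closed")

    def __del__(self):
        self.close()


def run_demo():
    log = []
    was_enabled = gc.isenabled()
    gc.disable()                  # keep the cycle collector out of the way
    try:
        r = Resource(log)
        del r                     # no cycle: refcount hits zero right here
        after_plain_del = list(log)

        r = Resource(log)
        r.me = r                  # reference cycle
        del r                     # refcount stays above zero
        after_cycle_del = list(log)

        gc.collect()              # only now is the cyclic object finalized
        after_collect = list(log)
    finally:
        if was_enabled:
            gc.enable()
    return after_plain_del, after_cycle_del, after_collect


plain, cyclic, collected = run_demo()
```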

Aug 30 '06 #9
Dan <bo*****@foo.org> wrote:
> Is this discouraged?:
>
>     for line in open(filename):
>         <do something with line>
>
> That is, should I do this instead?:
>
>     fileptr = open(filename)
>     for line in fileptr:
>         <do something with line>
>     fileptr.close()

One reason to use close() explicitly is to make sure that errors are
reported properly.

If you use close(), an error from the operating system will cause an
exception at a well-defined point in your code. With the implicit
close, an error will probably cause a message to be spewed to stderr
and you might never know about it.

If (as in your example) the file was open for reading only, errors from
close() are unlikely. But I do not think they are guaranteed not to
occur. If you were writing to the file, checking for errors on close()
is indispensable.
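
A sketch of why close() is the point that matters for writes, assuming Python 3 with default buffering. The scratch file and the read_back helper are hypothetical. A small write sits in the writer's buffer until it is flushed, so close() is where the data reaches the operating system and where a write error would surface as an exception.

```python
import os
import tempfile


def read_back(path):
    """Read the file through a separate handle, as another program would."""
    with open(path) as f:
        return f.read()


fd, scratch = tempfile.mkstemp(text=True)
os.close(fd)

writer = open(scratch, "w")
writer.write("important data\n")
before_close = read_back(scratch)   # data is still in the writer's buffer
writer.close()                      # flushes; an OS error would be raised here
after_close = read_back(scratch)
os.remove(scratch)
```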

-M-

Aug 30 '06 #10
"sj*******@yahoo.com" <sj*******@yahoo.com> writes:
> I disagree strongly with this assertion. It's not as efficient overall
> as other GC implementations, but it's not a case of "less efficient to
> do the same task". Reference counting buys you deterministic GC in the
> pretty common case where you do not have circular references--and
> determinism is very valuable to programmers. Other GCs may be faster, but
> they don't actually accomplish the same task.

GC is supposed to create the illusion that all objects stay around
forever. It releases unreachable objects since the application can't
tell whether those objects are gone or not.

Closing a file is a state change in which stuff is supposed to
actually happen (buffers flushed, CLOSE message sent over socket,
etc.). That's independent of releasing it. In your example
(simplified):

    def func(x):
        f = open_some_file(x)
        # do stuff with f

it might even be that the open call saves the file handle somewhere,
maybe for logging purposes. You presumably still want it closed at
function exit. The GC can't possibly do that for you. Relying on GC
to close files is simply a kludge that Python users have been relying
on, because doing it "manually" has been messy prior to 2.5.

> I can come up with plenty of "superior" algorithms for all kinds of
> tasks if I'm not bound to any particular semantics, but losing
> correctness for speed is rarely a good idea.

Then don't write incorrect code that relies on the GC's implementation
accidents to make it work ;-). PEP 343 really is the right way to
handle this.
Aug 31 '06 #11
Paul Rubin wrote:
> "sj*******@yahoo.com" <sj*******@yahoo.com> writes:
>> I disagree strongly with this assertion. It's not as efficient overall
>> as other GC implementations, but it's not a case of "less efficient to
>> do the same task". Reference counting buys you deterministic GC in the
>> pretty common case where you do not have circular references--and
>> determinism is very valuable to programmers. Other GCs may be faster, but
>> they don't actually accomplish the same task.
>
> GC is supposed to create the illusion that all objects stay around
> forever. It releases unreachable objects since the application can't
> tell whether those objects are gone or not.

No, that's not true of all GC implementations. Refcounting
implementations give much nicer deterministic guarantees.

Aug 31 '06 #12
On 8/30/06, Dan <bo*****@foo.org> wrote:
> Is this discouraged?:
>
>     for line in open(filename):
>         <do something with line>

In theory, it is. In practice, that is the way Python code is written
because it is more natural and to the point. Not just for hacked-together
scripts; lots of third-party modules include code like "data =
open(filename).read()" and similar idioms.

> Is my data safer if I explicitly close, like this?:
>
>     fileptr = open("the.file", "w")
>     foo_obj.write(fileptr)
>     fileptr.close()

Have you ever experienced a problem caused by not explicitly closing
your file handles?

--
mvh Björn
Aug 31 '06 #13
Dan
BJörn Lindqvist wrote:
> On 8/30/06, Dan <bo*****@foo.org> wrote:
>> Is my data safer if I explicitly close, like this?:
>>
>>     fileptr = open("the.file", "w")
>>     foo_obj.write(fileptr)
>>     fileptr.close()
>
> Have you ever experienced a problem caused by not explicitly closing
> your file handles?

No. If I had, I wouldn't have asked the question. It seems to work,
but can I really count on it?

I am a sample of one (In that happy place that Brooks described as
quadrant 1, where a person writes programs for himself to be run on his
own computer. Or perhaps to be run by a handful of co-workers.) Such a
small sample isn't statistically significant; a 100% success rate
doesn't mean much. I've also never had a burned CD go bad, but I know
that they do.

/Dan

--
dedded att verizon dott net
Aug 31 '06 #14
Dan wrote:
> BJörn Lindqvist wrote:
>> On 8/30/06, Dan <bo*****@foo.org> wrote:
>>> Is my data safer if I explicitly close, like this?:
>>>
>>>     fileptr = open("the.file", "w")
>>>     foo_obj.write(fileptr)
>>>     fileptr.close()
>>
>> Have you ever experienced a problem caused by not explicitly closing
>> your file handles?
>
> No. If I had, I wouldn't have asked the question. It seems to work,
> but can I really count on it?

In CPython, you can rely on file objects to be closed when the last
reference to them is removed. In general, the language spec does not
guarantee that so if you use Jython or other implementations you cannot
rely on files being closed on last reference. (see my other posts in
this thread for why I think the language spec should be changed to
guarantee the ref-counting semantics at least in simple cases).

Sep 1 '06 #15
