Loading contents behind the scenes

Hi, I wanted to know how safe it is to do something like:

f = file("filename", "rb")
f.read()

for a possibly huge file. When calling f.read(), and not doing
anything with the return value, what is Python doing internally? Is it
loading the content of the file into memory (regardless of whether it
is discarding it immediately)?

In my case, what I'm doing is sending the return value through a
socket:

sock.send(f.read())

Is that gonna make a difference (memory-wise)? I guess I'm just
concerned with whether I can do a file.read() for any file in the
system in an efficient and memory-kind way, and with low overhead in
general. (For one thing, I'm not loading the contents into a
variable.)

Not that I'm saying that loading a huge file into memory will horribly
crash the system, but it's good to try to program in the safest way
possible. For example, if you try something like this in the
interpreter on a Windows machine, everything will start running
slowly, and you'll likely have to reboot the OS:

s = ((("abc" * 999999) * 999999) * 99999) * 999999
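
For a sense of scale, the lengths that expression asks for can be worked out with plain integer arithmetic, without building any string. This is only a sketch; the names `factors` and `sizes` are made up for the illustration.

```python
# Length requested at each step of
# ((("abc" * 999999) * 999999) * 99999) * 999999,
# computed as integers so that no large string is ever allocated.
factors = [999999, 999999, 99999, 999999]

size = len("abc")
sizes = []
for factor in factors:
    size *= factor          # length the string would have after this step
    sizes.append(size)

for step, chars in enumerate(sizes, start=1):
    print("after step %d: %d characters" % (step, chars))
```

The first step is about 3 MB, the second is already about 3 TB, and the last is on the order of 3 x 10^23 characters, far more than any machine can hold.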
Jun 27 '08 #1
5 Replies

On 2008-05-22, s0****@gmail.com <s0****@gmail.com> wrote:
> Hi, I wanted to know how safe it is to do something like:
>
> f = file("filename", "rb")
> f.read()
>
> for a possibly huge file. When calling f.read(), and not doing
> anything with the return value, what is Python doing internally? Is it
> loading the content of the file into memory (regardless of whether it
> is discarding it immediately)?

I am not a Python interpreter developer, but as a user, yes, I'd expect that to
happen. The method doesn't know you are not doing anything with its return
value.

> In my case, what I'm doing is sending the return value through a
> socket:
>
> sock.send(f.read())
>
> Is that gonna make a difference (memory-wise)? I guess I'm just
> concerned with whether I can do a file.read() for any file in the
> system in an efficient and memory-kind way, and with low overhead in
> general. (For one thing, I'm not loading the contents into a
> variable.)

Doesn't matter. You allocate a string into which the contents are loaded (the
return value of 'f.read()'), and you hand over (a reference to) that string to
the 'send()' method.

Note that in Python memory is allocated for data *values*, not for *variables*
(variables are merely references to values).
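
A small illustration of that point, as a sketch rather than anything taken from the interpreter's internals: binding a second name to an object, or passing it to a function, copies nothing. The names `data`, `alias`, and `consume` are invented for the example.

```python
import sys

data = b"x" * 10**6          # one bytes object of about 1 MB is allocated here
alias = data                 # a second name for the same object; nothing is copied

same_object = alias is data
size_in_bytes = sys.getsizeof(data)

def consume(payload):
    # The argument arrives as a reference to the caller's object.
    return payload is data

passed_by_reference = consume(data)
```

So `sock.send(f.read())` costs the same memory as reading into a variable first: the value returned by `read()` exists either way.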

> Not that I'm saying that loading a huge file into memory will horribly
> crash the system, but it's good to try to program in the safest way
> possible. For example, if you try something like this in the

Depends on your system, and your biggest file.

On a 32-bit platform, anything bigger than about 4 GB (usually already at around
3 GB) will crash the program, for the simple reason that you run out of
address space to store the bytes in.

To fix, read and write blocks by specifying a block size in the 'read()' call.
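
One possible shape for that, as a minimal sketch: a generator that hands out the file one block at a time. The helper name `read_in_chunks` and the 64 KB default are assumptions made for the example, and an in-memory `io.BytesIO` stands in for a real file opened with `open(path, "rb")`.

```python
import io

def read_in_chunks(fileobj, block_size=64 * 1024):
    """Yield successive blocks of at most block_size bytes until EOF."""
    while True:
        block = fileobj.read(block_size)
        if not block:        # an empty result means end of file
            break
        yield block

# Stand-in for a large file; only one block is held in memory at a time.
source = io.BytesIO(b"a" * 150000)
sizes = [len(block) for block in read_in_chunks(source, block_size=65536)]
```

Peak memory use is then bounded by the block size rather than by the size of the file.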

Sincerely,
Albert
Jun 27 '08 #2

On May 22, 8:51 am, "A.T.Hofkamp" <h...@se-162.se.wtb.tue.nl> wrote:
> To fix, read and write blocks by specifying a block-size in the 'read()' call.

I see... Thanks for the reply.

So what would be a good approach to solve that problem? The best I can
think of is something like:

MAX_BUF_SIZE = 100000000  # about 100 MB

f = file("filename", "rb")
f.seek(0, 2)  # relative to EOF
length = f.tell()
bPos = 0

while bPos < length:
    f.seek(bPos)
    bPos += sock.send(f.read(MAX_BUF_SIZE))
Jun 27 '08 #3

s0****@gmail.com wrote:
> I see... Thanks for the reply.
>
> So what would be a good approach to solve that problem? The best I can
> think of is something like:

You are aware that read() takes an int-argument to limit the number of bytes
returned, and of course advances the internal seek-pointer for you?
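
A quick way to see this for yourself, sketched with an in-memory `io.BytesIO` standing in for a file opened in "rb" mode (the variable names are illustrative only):

```python
import io

f = io.BytesIO(b"0123456789")   # stand-in for open("filename", "rb")

first = f.read(4)               # bytes 0..3
pos_after_first = f.tell()      # position has moved forward by 4
second = f.read(4)              # continues at byte 4, no seek() needed
pos_after_second = f.tell()
rest = f.read(4)                # only 2 bytes are left, so 2 are returned
at_eof = f.read(4)              # empty bytes object once the end is reached
```

The explicit `seek()` and `tell()` bookkeeping is therefore unnecessary when each block that was read is sent in full.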

Diez
Jun 27 '08 #4

On May 22, 3:20 pm, s0s...@gmail.com wrote:
> I see... Thanks for the reply.
>
> So what would be a good approach to solve that problem? The best I can
> think of is something like:
>
> MAX_BUF_SIZE = 100000000 # about 100 MB
>
> f = file("filename", "rb")
> f.seek(0, 2) # relative to EOF
> length = f.tell()
> bPos = 0
>
> while bPos < length:
>     f.seek(bPos)
>     bPos += sock.send(f.read(MAX_BUF_SIZE))

I would go with:

f = file("filename", "rb")
while True:
    data = f.read(MAX_BUF_SIZE)
    if not data:
        break
    sock.sendall(data)
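
The same loop can be wrapped in a function and tried out locally. The following is a sketch under a few assumptions: the helper names `send_file` and `receive_all` are invented here, `socket.socketpair()` stands in for a real network connection, and `io.BytesIO` stands in for the file (with a real file you would pass `open("filename", "rb")`, since the `file` builtin exists only in Python 2).

```python
import io
import socket
import threading

def send_file(sock, fileobj, block_size=64 * 1024):
    """Send fileobj over sock one block at a time; return the bytes sent."""
    total = 0
    while True:
        data = fileobj.read(block_size)
        if not data:
            break
        sock.sendall(data)   # sendall() keeps sending until the block is out
        total += len(data)
    return total

def receive_all(sock, sink):
    """Collect everything the peer sends until it closes the connection."""
    while True:
        data = sock.recv(4096)
        if not data:
            break
        sink.append(data)

payload = b"abc" * 100000
sender, receiver = socket.socketpair()
received = []
reader = threading.Thread(target=receive_all, args=(receiver, received))
reader.start()

sent = send_file(sender, io.BytesIO(payload), block_size=8192)
sender.close()               # signals end of stream to the reader
reader.join()
receiver.close()
```

Using `sendall()` rather than `send()` matters here: `send()` may transmit only part of the block and return the count, which is what the seek-based version was compensating for.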
Jun 27 '08 #5

On Thu, 22 May 2008 14:05:42 -0300, MRAB <go****@mrabarnett.plus.com> wrote:
> On May 22, 3:20 pm, s0s...@gmail.com wrote:
>> In my case, what I'm doing is sending the return value through a
>> socket:
>>
>> sock.send(f.read())
>
> I would go with:
>
> f = file("filename", "rb")
> while True:
>     data = f.read(MAX_BUF_SIZE)
>     if not data:
>         break
>     sock.sendall(data)

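Another way is to use the shutil module:

import shutil

fin = open("filename", "rb")
fout = sock.makefile("wb")  # open for binary writing (the default mode is "r")
shutil.copyfileobj(fin, fout)
fout.flush()  # push any buffered data out to the socket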
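
A self-contained way to try this out, again only as a sketch: `socket.socketpair()` and `io.BytesIO` stand in for the real connection and file, the helper `drain` is invented for the example, and the 16 KB buffer size passed to `copyfileobj` is an arbitrary choice.

```python
import io
import shutil
import socket
import threading

def drain(sock, sink):
    """Collect everything the peer sends until it closes the connection."""
    while True:
        data = sock.recv(4096)
        if not data:
            break
        sink.append(data)

payload = b"0123456789" * 5000
sender, receiver = socket.socketpair()
received = []
reader = threading.Thread(target=drain, args=(receiver, received))
reader.start()

fin = io.BytesIO(payload)                  # stand-in for open("filename", "rb")
fout = sender.makefile("wb")               # file-like wrapper around the socket
shutil.copyfileobj(fin, fout, 16 * 1024)   # third argument is the buffer size
fout.flush()
fout.close()
sender.close()                             # end of stream for the reader
reader.join()
receiver.close()
```

`copyfileobj` runs the same read-a-block, write-a-block loop internally, so memory use stays bounded by the buffer size.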

--
Gabriel Genellina

Jun 27 '08 #6
