Loading contents behind the scenes

Hi, I wanted to know how cautious it is to do something like:

f = file("filename", "rb")
f.read()

for a possibly huge file. When calling f.read(), and not doing
anything with the return value, what is Python doing internally? Is it
loading the content of the file into memory (regardless of whether it
is discarding it immediately)?

In my case, what I'm doing is sending the return value through a
socket:

sock.send(f.read())

Is that gonna make a difference (memory-wise)? I guess I'm just
concerned with whether I can do a file.read() for any file in the
system in an efficient and memory-kind way, and with low overhead in
general. (For one thing, I'm not loading the contents into a
variable.)

Not that I'm saying that loading a huge file into memory will horribly
crash the system, but it's good to try to program in the safest way
possible. For example, if you try something like this in the
interpreter on a Windows machine, everything will start working
slowly, and you'll likely have to reboot the OS:

s = ((("abc" * 999999) * 999999) * 99999) * 999999
Jun 27 '08 #1
On 2008-05-22, s0****@gmail.com <s0****@gmail.com> wrote:
> Hi, I wanted to know how cautious it is to do something like:
>
> f = file("filename", "rb")
> f.read()
>
> for a possibly huge file. When calling f.read(), and not doing
> anything with the return value, what is Python doing internally? Is it
> loading the content of the file into memory (regardless of whether it
> is discarding it immediately)?

I am not a Python interpreter developer, but as a user, yes, I'd expect that to
happen. The method doesn't know that you are not doing anything with its return
value.

> In my case, what I'm doing is sending the return value through a
> socket:
>
> sock.send(f.read())
>
> Is that gonna make a difference (memory-wise)? I guess I'm just
> concerned with whether I can do a file.read() for any file in the
> system in an efficient and memory-kind way, and with low overhead in
> general. (For one thing, I'm not loading the contents into a
> variable.)

Doesn't matter. You allocate a string into which the contents are loaded (the
return value of 'f.read()'), and you hand over (a reference to) that string to
the 'send()' method.

Note that in Python memory is allocated for data *values*, not for *variables*
(variables are merely references to values).
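
A quick way to see this for yourself: the sketch below calls read() without binding the result to any name and uses tracemalloc to look at the peak allocation. It assumes Python 3 (where open() replaces file()) and a throwaway 1 MB temporary file made up for the demo, so treat it as an illustration rather than a benchmark.

```python
import os
import tempfile
import tracemalloc

# Hypothetical test file: 1 MB of zero bytes in a temporary location.
SIZE = 1_000_000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * SIZE)
    path = tmp.name

tracemalloc.start()
with open(path, "rb") as f:
    f.read()  # result is discarded, but the full bytes object was still built
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
os.remove(path)

# The peak covers at least the whole file, even though nothing was kept.
print(peak >= SIZE)
```

The object is freed again as soon as the statement finishes, but the peak is what matters when the file is larger than the memory you can spare.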

> Not that I'm saying that loading a huge file into memory will horribly
> crash the system, but it's good to try to program in the safest way
> possible. For example, if you try something like this in the

Depends on your system, and on your biggest file.

On a 32-bit platform, reading anything bigger than about 4 GB (in practice often
already around 2 to 3 GB) will fail, typically with a MemoryError, for the simple
reason that the process runs out of address space to store the bytes in.

To fix this, read and write in blocks by specifying a block size in the 'read()' call.

Sincerely,
Albert
Jun 27 '08 #2
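Albert's suggestion to read in blocks could look roughly like the following. It is only a sketch, assuming Python 3; iter_chunks, CHUNK_SIZE and the temporary demo file are names invented here, not anything from the thread.

```python
import os
import tempfile
from functools import partial

CHUNK_SIZE = 64 * 1024  # assumed block size; tune it for your workload


def iter_chunks(path, chunk_size=CHUNK_SIZE):
    """Yield the file's contents as successive blocks of at most chunk_size bytes."""
    with open(path, "rb") as f:
        # iter(callable, sentinel) calls f.read(chunk_size) until it returns b"".
        for chunk in iter(partial(f.read, chunk_size), b""):
            yield chunk


# Demo on a throwaway 150,000-byte file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 150_000)
    demo_path = tmp.name

sizes = [len(chunk) for chunk in iter_chunks(demo_path)]
os.remove(demo_path)
print(sizes)  # [65536, 65536, 18928]
```

Only one block is alive at a time, so memory use stays near CHUNK_SIZE no matter how large the file is.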
On May 22, 8:51 am, "A.T.Hofkamp" <h...@se-162.se.wtb.tue.nl> wrote:
> To fix, read and write blocks by specifying a block-size in the 'read()' call.

I see... Thanks for the reply.

So what would be a good approach to solve that problem? The best I can
think of is something like:

MAX_BUF_SIZE = 100000000 # about 100 MBs

f = file("filename", "rb")
f.seek(0, 2) # relative to EOF
length = f.tell()
bPos = 0

while bPos < length:
    f.seek(bPos)
    bPos += sock.send(f.read(MAX_BUF_SIZE))
Jun 27 '08 #3
s0****@gmail.com wrote:
> So what would be a good approach to solve that problem? The best I can
> think of is something like:

You are aware that read() takes an int-argument to limit the number of bytes
returned, and of course advances the internal seek-pointer for you?

Diez
Jun 27 '08 #4
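Diez's point is easy to check in a few lines. The sketch below uses io.BytesIO as an in-memory stand-in for a real file (an assumption made only to keep the example self-contained); an ordinary file opened in binary mode behaves the same way with respect to read(n) and tell().

```python
import io

f = io.BytesIO(b"abcdefghij")  # stand-in for open("filename", "rb")

first = f.read(4)            # b"abcd"
pos_after_first = f.tell()   # 4: the position moved past what was read
second = f.read(4)           # b"efgh", continues where the last read stopped
pos_after_second = f.tell()  # 8
rest = f.read(4)             # b"ij", only two bytes were left
at_eof = f.read(4)           # b"", an empty result signals end of file

print(first, pos_after_first, second, pos_after_second, rest, at_eof)
```

So there is no need to seek() before every read; the empty result at the end is what a read loop can test for.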
On May 22, 3:20 pm, s0s...@gmail.com wrote:
> So what would be a good approach to solve that problem? The best I can
> think of is something like:
>
> MAX_BUF_SIZE = 100000000 # about 100 MBs
>
> f = file("filename", "rb")
> f.seek(0, 2) # relative to EOF
> length = f.tell()
> bPos = 0
>
> while bPos < length:
>     f.seek(bPos)
>     bPos += sock.send(f.read(MAX_BUF_SIZE))

I would go with:

f = file("filename", "rb")
while True:
    data = f.read(MAX_BUF_SIZE)
    if not data:
        break
    sock.sendall(data)
Jun 27 '08 #5
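For anyone who wants to try this loop without a real server, here is a self-contained sketch. It assumes Python 3, uses socket.socketpair() to get two connected sockets in one process, and feeds the loop from an in-memory io.BytesIO instead of a file on disk; send_file_obj and receive_all are hypothetical helper names. The reason for sendall() is that send() may write only part of the data and returns how many bytes it actually sent, whereas sendall() keeps going until everything is written or an error occurs.

```python
import io
import socket
import threading


def send_file_obj(sock, fileobj, chunk_size=64 * 1024):
    """Send everything readable from fileobj over sock, one block at a time."""
    total = 0
    while True:
        data = fileobj.read(chunk_size)
        if not data:  # empty bytes means end of file
            break
        sock.sendall(data)  # unlike send(), does not stop after a partial write
        total += len(data)
    return total


def receive_all(sock, sink):
    """Append received blocks to the list sink until the peer closes."""
    while True:
        block = sock.recv(65536)
        if not block:
            break
        sink.append(block)


payload = b"0123456789" * 50_000  # 500,000 bytes of demo data
left, right = socket.socketpair()
received = []
reader = threading.Thread(target=receive_all, args=(right, received))
reader.start()

sent = send_file_obj(left, io.BytesIO(payload))
left.close()  # closing the sending side tells the receiver the stream is over
reader.join()
right.close()

print(sent == len(payload), b"".join(received) == payload)
```

The receiver runs in its own thread because the payload is larger than typical socket buffers, so the sender would otherwise block waiting for someone to read.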
On Thu, 22 May 2008 14:05:42 -0300, MRAB <go****@mrabarnett.plus.com> wrote:
> On May 22, 3:20 pm, s0s...@gmail.com wrote:
>> In my case, what I'm doing is sending the return value through a
>> socket:
>> sock.send(f.read())
>
> I would go with:
>
> f = file("filename", "rb")
> while True:
>     data = f.read(MAX_BUF_SIZE)
>     if not data:
>         break
>     sock.sendall(data)

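Another way is to use the shutil module:

import shutil

fin = open("filename", "rb")
fout = sock.makefile("wb")  # binary write mode; the default mode is for reading
shutil.copyfileobj(fin, fout)
fout.flush()  # push any buffered data out to the socket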

--
Gabriel Genellina

Jun 27 '08 #6
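Worth noting that copyfileobj itself copies block by block, so it does not pull the whole source into memory either; its optional third argument is the block size. The sketch below shows that with an in-memory source and a made-up RecordingSink class (hypothetical, standing in for the socket file) that just records how large each write was. It assumes Python 3.

```python
import io
import shutil


class RecordingSink:
    """Hypothetical file-like destination that records the size of each write."""

    def __init__(self):
        self.write_sizes = []

    def write(self, data):
        self.write_sizes.append(len(data))
        return len(data)


src = io.BytesIO(b"z" * 40_000)  # stand-in for the open file
sink = RecordingSink()           # stand-in for sock.makefile("wb")

# Third argument: copy in blocks of at most 16 KiB.
shutil.copyfileobj(src, sink, 16 * 1024)

print(sink.write_sizes)  # [16384, 16384, 7232]
```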
