
os.path.getsize() on Windows

Hi all,

I'm seeing some behavior that is confusing me. I often use a simple
function to tell if a file is growing, i.e. being copied into a certain
location. (I can't process it until it's complete.) My function is not
working on Windows, and I'm wondering if I am missing something
simple, or if I have just never tried this before. Here's what I'm
trying to do:

import os
import time

def isGrowing(f, timeout):
    ssize = os.path.getsize(f)   # size at the start of the interval
    time.sleep(timeout)
    esize = os.path.getsize(f)   # size at the end of the interval
    return esize != ssize

On Windows, this returns the size of the file as it _will be_, not the
size that it currently is. Is this a feature? What is the proper way
to get the current size of the file? I noticed
win32file.GetFileSize(). Does that behave the way I expect?

PS. I also tried os.stat()[6]

~Sean
Mar 18 '08 #1
9 Replies


Sean DiZazzo <ha**********@gmail.com> wrote:
> On windows, this returns the size of the file as it _will be_, not the
> size that it currently is. Is this a feature? What is the proper way
> to get the current size of the file? I noticed
> win32File.GetFileSize() Does that behave the way I expect?
>
> PS. I also tried os.stat()[6]

I think all of those will return the current size of the file, but that may
be the same as the final size: just because the data hasn't been copied
doesn't mean the file space hasn't been allocated. You don't say how you
are copying the file, but I seem to remember that the Windows copy command
pre-allocates the file at its final size (so as to reduce fragmentation) and
then just copies the data after that.
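
To illustrate the pre-allocation point, here is a minimal sketch (not from the thread) that imitates a copier reserving the final length before writing any data. The function name preallocated_size_demo and the use of truncate() to reserve space are assumptions made for illustration; the thread does not show how the Windows copy command actually does it.

```python
import os
import tempfile

def preallocated_size_demo(final_size=1024 * 1024):
    """Return (size_after_preallocation, size_after_writing) for a temp file."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            # Reserve the final length up front, before any data is written.
            f.truncate(final_size)
            size_before = os.path.getsize(path)
            # Now fill in the data, as a copy would.
            f.write(b"x" * final_size)
            f.flush()
            size_after = os.path.getsize(path)
        return size_before, size_after
    finally:
        os.remove(path)
```

Both sizes come back equal, so a check that compares two os.path.getsize() readings cannot tell such a file apart from a finished one.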

If you need to make sure you don't access a file until the copy has
finished, then get whatever is doing the copy to copy it to a temporary
filename in the same folder and rename it when complete. Then you just have
to check for existence of the target file.
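
A rough sketch of that temporary-name approach, for the case where you do control the copying side. The helper name copy_then_rename and the ".part" suffix are hypothetical; os.replace() needs Python 3.3 or later and both names must be on the same volume for the rename to be a single step.

```python
import os
import shutil

def copy_then_rename(src, dst):
    """Copy src next to dst under a temporary name, then rename into place."""
    tmp = dst + ".part"          # hypothetical temporary suffix
    shutil.copyfile(src, tmp)    # the slow part happens under the temp name
    os.replace(tmp, dst)         # dst only appears once the data is complete
    return dst
```

The consumer then only needs os.path.exists(dst) and can ignore anything ending in the temporary suffix.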
Mar 18 '08 #2

On Mar 18, 2:27 pm, Duncan Booth <duncan.bo...@invalid.invalid> wrote:
> If you need to make sure you don't access a file until the copy has
> finished, then get whatever is doing the copy to copy it to a temporary
> filename in the same folder and rename it when complete. Then you just have
> to check for existence of the target file.

Hmmm... The file could be copied in by several different sources over
which I have no control. I can't use your technique in my situation.
I also tried getting md5 hashes with some time in between on advice,
but the file is not opened for reading until the copy completes so I
can't get the hashes.

Any other ideas?
Mar 18 '08 #3

On Mar 19, 9:27 am, Sean DiZazzo <half.ital...@gmail.com> wrote:
> Hmmm... The file could be copied in by several different sources over
> which I have no control. I can't use your technique in my situation.
> I also tried getting md5 hashes with some time in between on advice,
> but the file is not opened for reading until the copy completes so I
> can't get the hashes.
>
> Any other ideas?

Why not try to open the file for exclusive write/update access?
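
Something along these lines, as a sketch only. The name can_open_for_update is made up, and the idea rests on the assumption that whatever is writing the file holds it open without sharing write access; a writer that does share access will not be detected.

```python
def can_open_for_update(path):
    """Return True if path can be opened for read/write right now."""
    try:
        # "r+b" asks for read/write access without truncating or creating.
        with open(path, "r+b"):
            return True
    except OSError:
        # Covers a sharing or permission failure as well as a missing file.
        return False
```

A False result means "not yet", so the caller would sleep and retry rather than treat it as a hard error.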

Mar 18 '08 #4

Sean DiZazzo <ha**********@gmail.com> wrote:
> In this case, there will be so few people touching the system, that I
> think I can get away with having the copy be done from Unix, but it
> would be nice to have a general way of knowing this on Windows.

Doesn't the CreateFile call I posted earlier do what you want?
Mar 20 '08 #5

On Thu, 20 Mar 2008 10:45:03 -0700, Sean DiZazzo wrote:
> After trying again this morning, the file is opened for reading. I must
> have had some wonky permissions on that file, so the error method won't
> work.

Then fix the permissions.
--
Steven
Mar 20 '08 #6

Sean DiZazzo <ha**********@gmail.com> wrote:
>
> The overall idea is to be able to tell if a file has finished being
> placed in a directory without any control over what is putting it
> there.

There is simply no way to do this on Windows that works in the general
case.
--
Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Mar 21 '08 #7

> def isGrowing(f, timeout):
>     ssize = os.path.getsize(f)
>     time.sleep(timeout)
>     esize = os.path.getsize(f)
>     return esize != ssize
>
> On windows, this returns the size of the file as it _will be_, not the
> size that it currently is.

Why do you say that? It most definitely returns what the size currently
is, not what it will be in the future (how could it know, anyway).

Regards,
Martin
Mar 21 '08 #8

Martin v. Löwis wrote:
>> def isGrowing(f, timeout):
>>     ssize = os.path.getsize(f)
>>     time.sleep(timeout)
>>     esize = os.path.getsize(f)
>>     return esize != ssize
>>
>> On windows, this returns the size of the file as it _will be_, not the
>> size that it currently is.
>
> Why do you say that? It most definitely returns what the size currently
> is, not what it will be in the future (how could it know, anyway).

I've seen this before, when copying a file in Windows. Windows reports
the size the file will be after the copy is complete (it knows, after
all, the size of the source file). I always thought this meant that
Windows is just much smarter than me, so I ignored it.

Paul
Mar 21 '08 #9

>> Why do you say that? It most definitely returns what the size currently
>> is, not what it will be in the future (how could it know, anyway).
>
> I've seen this before, when copying a file in Windows. Windows reports
> the size the file will be after the copy is complete (it knows, after
> all, the size of the source file). I always thought this meant that
> Windows is just much smarter than me, so I ignored it.

No, I really think the target file has its size right from the
beginning.

Regards,
Martin

Mar 21 '08 #10
