os.path.getsize() on Windows

Hi all,

I'm seeing some behavior that is confusing me. I often use a simple
function to tell if a file is growing, i.e. being copied into a certain
location. (I can't process it until it's complete.) My function is not
working on Windows, and I'm wondering if I am missing something
simple, or if I have just never tried this before. Here's what I'm
trying to do:

import os
import time

def isGrowing(f, timeout):
    ssize = os.path.getsize(f)
    time.sleep(timeout)
    esize = os.path.getsize(f)
    return esize != ssize

On Windows, this returns the size of the file as it _will be_, not the
size that it currently is. Is this a feature? What is the proper way
to get the current size of the file? I noticed
win32file.GetFileSize(). Does that behave the way I expect?

PS. I also tried os.stat()[6]

~Sean
Mar 18 '08 #1
Sean DiZazzo <ha**********@gmail.com> wrote:
> On Windows, this returns the size of the file as it _will be_, not the
> size that it currently is. Is this a feature? What is the proper way
> to get the current size of the file? I noticed
> win32file.GetFileSize(). Does that behave the way I expect?
>
> PS. I also tried os.stat()[6]
I think all of those will return the current size of the file, but that may
be the same as the final size: just because the data hasn't been copied
doesn't mean the file space hasn't been allocated. You don't say how you
are copying the file, but I seem to remember that the Windows copy command
pre-allocates the file at its final size (so as to reduce fragmentation) and
then just copies the data after that.

If you need to make sure you don't access a file until the copy has
finished, then get whatever is doing the copy to copy it to a temporary
filename in the same folder and rename it when complete. Then you just have
to check for existence of the target file.
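
The temporary-name-then-rename approach described above might look roughly like the following. This is only a sketch, assuming Python 3.3 or later (for os.replace) and assuming you control the code doing the copy; the names publish_file and is_ready and the ".part" suffix are made up for illustration.

```python
import os
import tempfile


def publish_file(data: bytes, target: str) -> None:
    """Write data under a temporary name, then rename it to target."""
    folder = os.path.dirname(os.path.abspath(target))
    # Create the temporary file in the same folder as the target so the
    # final rename never has to move data between volumes.
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # The target name only appears once all the data has been written.
        os.replace(tmp_path, target)
    except BaseException:
        # Don't leave a half-written temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def is_ready(target: str) -> bool:
    """The consumer only has to check that the final name exists."""
    return os.path.exists(target)
```

A consumer watching the folder would also need to ignore the ".part" names while a copy is in progress.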
Mar 18 '08 #2
On Mar 18, 2:27 pm, Duncan Booth <duncan.bo...@invalid.invalid> wrote:
> I think all of those will return the current size of the file, but that may
> be the same as the final size: just because the data hasn't been copied
> doesn't mean the file space hasn't been allocated. You don't say how you
> are copying the file, but I seem to remember that the Windows copy command
> pre-allocates the file at its final size (so as to reduce fragmentation) and
> then just copies the data after that.
>
> If you need to make sure you don't access a file until the copy has
> finished, then get whatever is doing the copy to copy it to a temporary
> filename in the same folder and rename it when complete. Then you just have
> to check for existence of the target file.

Hmmm... The file could be copied in by several different sources over
which I have no control, so I can't use your technique in my situation.
On advice, I also tried taking md5 hashes with some time in between,
but the file can't be opened for reading until the copy completes, so I
can't get the hashes.

Any other ideas?
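
For reference, the md5 comparison mentioned above could be sketched like this. The function names are hypothetical, and the None return value is just one way to report the case described in the post, where the file cannot be opened for reading while the copy is running.

```python
import hashlib
import time


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_changed(path, interval):
    """Hash the file twice, interval seconds apart.

    Returns True or False depending on whether the digests differ, or
    None if the file could not be opened or read at all.
    """
    try:
        before = md5_of(path)
        time.sleep(interval)
        after = md5_of(path)
    except OSError:
        return None
    return before != after
```

Note that hashing reads the whole file twice, so this gets expensive for large files.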
Mar 18 '08 #3
On Mar 19, 9:27 am, Sean DiZazzo <half.ital...@gmail.com> wrote:
> Hmmm... The file could be copied in by several different sources over
> which I have no control, so I can't use your technique in my situation.
> On advice, I also tried taking md5 hashes with some time in between,
> but the file can't be opened for reading until the copy completes, so I
> can't get the hashes.
>
> Any other ideas?

Why not try to open the file for exclusive write/update access?
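
One way to try this from plain Python is sketched below. It is an approximation, not a true exclusive open: the built-in open() does not request exclusive access, but on Windows opening for update should fail with a sharing violation while another process holds the file open without sharing write access. On POSIX systems locks are advisory, so the check says little there. The function name is hypothetical.

```python
def can_open_for_update(path):
    """Return True if the file can be opened for reading and writing.

    Returns False on any OSError, which includes a sharing violation on
    Windows but also a missing file or insufficient permissions.
    """
    try:
        # "r+b" opens for update without truncating or creating the file.
        with open(path, "r+b"):
            return True
    except OSError:
        return False
```

Because every OSError is treated the same way, a False result does not by itself prove that a copy is still in progress.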

Mar 18 '08 #4
Sean DiZazzo <ha**********@gmail.com> wrote:
> In this case, there will be so few people touching the system, that I
> think I can get away with having the copy be done from Unix, but it
> would be nice to have a general way of knowing this on Windows.

Doesn't the CreateFile call I posted earlier do what you want?
Mar 20 '08 #5
On Thu, 20 Mar 2008 10:45:03 -0700, Sean DiZazzo wrote:
> After trying again this morning, the file is opened for reading. I must
> have had some wonky permissions on that file, so the error method won't
> work.

Then fix the permissions.
--
Steven
Mar 20 '08 #6
Sean DiZazzo <ha**********@gmail.com> wrote:
> The overall idea is to be able to tell if a file has finished being
> placed in a directory without any control over what is putting it
> there.

There is simply no way to do this on Windows that works in the general
case.
--
Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Mar 21 '08 #7
> def isGrowing(f, timeout):
>     ssize = os.path.getsize(f)
>     time.sleep(timeout)
>     esize = os.path.getsize(f)
>     return esize != ssize
>
> On Windows, this returns the size of the file as it _will be_, not the
> size that it currently is.

Why do you say that? It most definitely returns what the size currently
is, not what it will be in the future (how could it know, anyway).

Regards,
Martin
Mar 21 '08 #8
Martin v. Löwis wrote:
> > def isGrowing(f, timeout):
> >     ssize = os.path.getsize(f)
> >     time.sleep(timeout)
> >     esize = os.path.getsize(f)
> >     return esize != ssize
> >
> > On Windows, this returns the size of the file as it _will be_, not the
> > size that it currently is.
>
> Why do you say that? It most definitely returns what the size currently
> is, not what it will be in the future (how could it know, anyway).

I've seen this before, when copying a file in Windows. Windows reports
the size the file will be after the copy is complete (it knows, after
all, the size of the source file). I always thought this meant that
Windows is just much smarter than me, so I ignored it.

Paul
Mar 21 '08 #9
> > Why do you say that? It most definitely returns what the size currently
> > is, not what it will be in the future (how could it know, anyway).
>
> I've seen this before, when copying a file in Windows. Windows reports
> the size the file will be after the copy is complete (it knows, after
> all, the size of the source file). I always thought this meant that
> Windows is just much smarter than me, so I ignored it.

No, I really think the target file has its size right from the
beginning.

Regards,
Martin
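
The effect discussed in this thread can be reproduced without any copy tool. The sketch below assumes nothing about how a particular copy program works; it only shows that once a writer has set a file's length up front, os.path.getsize() reports that length even though no data has been written yet. The function name is made up for illustration.

```python
import os
import tempfile


def preallocated_size(final_size):
    """Set a new file's length up front and report what getsize() sees."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            # Set the file length to final_size without writing any data.
            f.truncate(final_size)
            f.flush()
            # The file is still open and holds no real data at this point.
            return os.path.getsize(path)
    finally:
        os.remove(path)
```

A size comparison like isGrowing() cannot tell such a file apart from a finished one, because its reported size never changes while the data is filled in.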

Mar 21 '08 #10
