Hi all,
I'm writing some code that monitors a directory for the appearance of
files from a workflow. When those files appear I write a command file
to a device that tells the device how to process the file. The
appearance of the command file triggers the device to grab the
original file. My problem is I don't want to write the command file to
the device until the original file from the workflow has been copied
completely. Since these files are large, my program has a good chance
of scanning the directory while they are mid-copy, so I need to
determine which files are finished being copied and which are still
mid-copy.
I haven't seen anything on Google talking about this, and I don't see
an obvious way of doing this using the os.stat() method on the
filepath. Anyone have any ideas about how I might accomplish this?
Thanks in advance!
Doug
The best way to do this is to have the program that copies the files copy them
to a temporarily named file and rename it when it is completed. That way you
know when it is done by scanning for files with a specific mask.
If that is not possible you might be able to use pyinotify
(http://pyinotify.sourceforge.net/) to watch for IN_CLOSE_WRITE events on the
directory and then process the files.
-Larry
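A rough sketch of the copy-then-rename idea, assuming you control the program that does the copying. The ".part" suffix and the function names are made up for illustration, and the rename is only atomic when the temporary and final names are on the same filesystem.

```python
import os
import shutil


def copy_then_rename(src, dest_dir, suffix=".part"):
    """Copy src into dest_dir under a temporary name, then rename it.

    The ".part" suffix is an arbitrary choice for this sketch.
    """
    final_path = os.path.join(dest_dir, os.path.basename(src))
    temp_path = final_path + suffix
    shutil.copyfile(src, temp_path)
    # The final name only appears once the copy has completed.
    os.replace(temp_path, final_path)
    return final_path


def finished_files(directory, suffix=".part"):
    """Return the names in directory that do not carry the temporary suffix."""
    return sorted(
        name for name in os.listdir(directory)
        if not name.endswith(suffix)
        and os.path.isfile(os.path.join(directory, name))
    )
```

The watcher then only ever calls something like finished_files() and never looks at the temporary names.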
This seems to be a synchronization problem. A scenario description could clear
things up so we can help:
Program W (the workflow) copies file F to directory B.
Program D (the dog) polls directory B to find out if there's any new file F.
In this scenario, program D does not know whether F has been fully
copied, but W does.
Solution:
Create a custom lock mechanism. Program W writes a file B/F.lock to
indicate file F is not complete; it's removed when F is fully copied.
If program W crashes in mid-copy, both F and F.lock are kept, so program D
does not bother to process F. Recovery from a crash in W would be another
issue to tackle.
Best regards,
Manuel.
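The lock-file scheme could look something like this. It is only a sketch: it assumes program W can be changed, and the ".lock" suffix and both function names are invented for the example.

```python
import os
import shutil

LOCK_SUFFIX = ".lock"  # hypothetical naming convention for this sketch


def copy_with_lock(src, dest_dir):
    """Program W side: create F.lock, copy F, then remove F.lock."""
    final_path = os.path.join(dest_dir, os.path.basename(src))
    lock_path = final_path + LOCK_SUFFIX
    # Create the lock first so the watcher never sees F without F.lock.
    with open(lock_path, "w"):
        pass
    shutil.copyfile(src, final_path)
    # Only reached if the copy succeeded; a crash leaves F and F.lock behind.
    os.remove(lock_path)
    return final_path


def ready_files(directory):
    """Program D side: names that have no matching lock file next to them."""
    names = set(os.listdir(directory))
    return sorted(
        name for name in names
        if not name.endswith(LOCK_SUFFIX)
        and name + LOCK_SUFFIX not in names
    )
```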
Also available:
pgm-W copies/creates-fills whatever B/dummy
when done, pgm-W renames B/dummy to B/F
pgm-D only scouts for B/F and does its thing when found
Steve
Guys,
Thanks for your replies, they are helpful. I should have included in
my initial question that I don't have as much control over the program
that writes (pgm-W) as I'd like. Otherwise, the write to a different
filename and then rename solution would work great. Is there no way to
tell from os.stat() when the file is finished
being copied? I ran some test programs, one of which continuously
copies big files from one directory to another, and another that
continuously does a glob.glob("*.pdf") on those files and looks at the
st_atime and st_mtime parts of the return value of os.stat(filename).
From that experiment it looks like st_atime and st_mtime equal each
other until the file has finished being copied. Nothing in the
documentation about st_atime or st_mtime leads me to think this is
true, it's just my observations about the two test programs I've
described.
Any thoughts? Thanks!
Doug
I guess the problem is "What is the definition of 'finished copying'?". There
is no explicit operating system command that says "I'm done copying to this file
and I won't add anything on to the end of it".
If I could not control the sending application, I would make an estimation of
how long the longest file could possibly take to copy, double it and then only
look at files where the st_ctime was at least that far in the past. What you
suggest could work as well.
-Larry
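As a sketch of that age check: the helper below (name and parameters are made up) treats a file as settled only when both st_mtime and st_ctime are at least min_age_seconds old. Checking st_mtime as well as st_ctime is my own assumption here, and what st_ctime means differs between Unix and Windows.

```python
import os
import time


def older_than(path, min_age_seconds, now=None):
    """True if the newest of st_mtime/st_ctime is at least min_age_seconds old."""
    if now is None:
        now = time.time()
    st = os.stat(path)
    newest = max(st.st_mtime, st.st_ctime)
    return (now - newest) >= min_age_seconds
```

With this, min_age_seconds would be set to twice the estimated worst-case copy time.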
Could you maybe use the os module to call out to lsof to see if anyone
still has the target file open? I am assuming that when the write process
finishes writing it would close the file.
Check "man lsof"
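A possible way to call lsof, using subprocess rather than os. This is an untested-in-production sketch: the function name is made up, it returns None when lsof is not installed, and lsof may not see files held open by other users unless it runs with enough privileges.

```python
import shutil
import subprocess


def is_open_by_any_process(path):
    """Return True/False based on lsof output, or None if lsof is missing."""
    if shutil.which("lsof") is None:
        return None
    # -t asks lsof for process IDs only; no output means lsof found no
    # process with the file open.
    result = subprocess.run(
        ["lsof", "-t", "--", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return bool(result.stdout.strip())
```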
The solution my team has used is to monitor the file size. If the file
has stopped growing for x amount of time (we use 45 seconds) the file is
done copying. Not elegant, but it works.
--
Ethan
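A minimal sketch of the size-polling approach. The 45 second default comes from the post above; the one second poll interval and the function name are assumptions for the example.

```python
import os
import time


def wait_until_size_stable(path, quiet_seconds=45.0, poll_interval=1.0):
    """Block until the file size has not changed for quiet_seconds.

    Returns the final size in bytes.
    """
    last_size = os.path.getsize(path)
    last_change = time.monotonic()
    while time.monotonic() - last_change < quiet_seconds:
        time.sleep(poll_interval)
        size = os.path.getsize(path)
        if size != last_size:
            # The file changed size, so restart the quiet period.
            last_size = size
            last_change = time.monotonic()
    return last_size
```

A copy that stalls for longer than quiet_seconds would still be reported as finished, so the threshold has to be chosen for the slowest link in the workflow.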
Also I think that matching the md5sums may work. Just set up so that it
checks the copy's md5sum every couple of seconds (or whatever time
interval you want) and matches against the original's. When they match
copying's done. I haven't actually tried this but think it may work.
Any more experienced programmers out there let me know if this is
unworkable please.
K
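For what it's worth, the checksum comparison might be sketched like this. It assumes the watcher can read the original file as well as the copy, which may not be the case in the workflow described; the function names are invented.

```python
import hashlib


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_matches_original(original, copy):
    """True when both files currently have the same MD5 digest."""
    return md5_of(original) == md5_of(copy)
```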
If the files are large this could consume a lot of CPU and I/O to recalculate
the checksum over and over. I would try the "hasn't been
modified/accessed/created" in some amount of time first.
-Larry