File Polling (Rereading)

Hello Fellow Python Programmers,

I have the following problem:

I want to read the content of a file every 10 seconds. What is the
best function for that? Opening the file, reading it, and closing it
consumes a lot of CPU time. Is there a better solution?

Greetings Daniel
--
Please use my GnuPG (PGP) Public-Key for secure message exchange.
(http://157.161.155.222/damueller-pubkey.txt)
Jul 18 '05 #1
Daniel Mueller wrote:
> Hello Fellow Python Programmers,
>
> I have the following problem:
>
> I want to read the content of a file every 10 seconds. What is the
> best function for that? Opening the file, reading it, and closing it
> consumes a lot of CPU time. Is there a better solution?

No, at least not if what you really want is always the full content of the
file. If you're only interested in the content when the file has actually
changed, that's a totally different animal, and under Unix it is doable using
stat calls (somewhere in the os module). I'm not sure how far that extends
to Windows, but I'm pretty confident that similar functionality is
available.

Diez
Jul 18 '05 #2
Diez B. Roggisch wrote:
> If you're only interested in the content when the file has actually
> changed, that's a totally different animal,

Good to hear! I only need the changes! Do you have code examples?

> and under Unix it is doable using
> stat calls (somewhere in the os module). I'm not sure how far that extends
> to Windows, but I'm pretty confident that similar functionality is
> available.

I'm programming under Linux.

Jul 18 '05 #3
> Good to hear! I only need the changes! Do you have code examples?

This is a misunderstanding: I didn't mean "the changes". What you can get by
calling

os.stat

is the timestamp of the last modification - so you can then skip rereading
the file if there has been no modification after the last time you read the
file.

But you _can't_ read only what has changed in the file. There is no
such facility in Python or in the underlying OSes (which is why
Python has no such functionality).

I suggest you tell us more about what you actually want to accomplish; then
we might be able to offer better advice.

Diez
Jul 18 '05 #4
Daniel Mueller <da********@gmx.net> wrote:
> I want to read the content of a file every 10 seconds. What is the
> best function for that? Opening the file, reading it, and closing it
> consumes a lot of CPU time. Is there a better solution?


The portable way is to stat the file (see os.stat) every
10 seconds, look at the mtime (modification time), and
if it has changed, then open/read/close. This still means
polling the file, but at least stat is much more efficient
than open/read/close. If you're also interested in changes
to the file's metadata (permissions, owner,
etc.), you also have to look at the ctime.
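
The stat-then-read approach described above could be sketched roughly as follows. This is only a minimal illustration: the names read_if_changed and poll, and the rounds parameter, are made up for the example and do not come from the thread.

```python
import os
import time


def read_if_changed(path, last_mtime):
    """Return (content, mtime); content is None when the mtime is unchanged.

    Illustrative helper, not an API from the thread.
    """
    mtime = os.stat(path).st_mtime_ns
    if mtime == last_mtime:
        return None, last_mtime          # unchanged: skip open/read/close
    with open(path) as f:
        return f.read(), mtime


def poll(path, interval=10.0, rounds=3):
    """Check `path` every `interval` seconds, for a fixed number of rounds."""
    last_mtime = None
    for _ in range(rounds):
        content, last_mtime = read_if_changed(path, last_mtime)
        if content is not None:
            print(content, end="")
        time.sleep(interval)
```

Only the cheap os.stat call runs on every round; the file is opened and read only when its modification time differs from the one seen last.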

If you're lucky enough to work on a FreeBSD UNIX system
(and don't need a portable solution), you can use FreeBSD's
kqueue API. Using that interface, you don't have to poll
at all. The kernel will notify you immediately when a
specific event occurs, such as someone writing to a certain
file (this is used by the "tail -f" command, for example,
so it doesn't have to poll the file). The Python bindings
for the kqueue interface are in the ports collection of
FreeBSD (see ports/devel/kqueue). Otherwise, see this
webpage for more information:

http://people.freebsd.org/~dwhite/PyKQueue/
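
For illustration only: current Python versions expose kqueue through the standard select module on BSD-derived systems, so a notification-based wait might look like the sketch below. It uses the standard library rather than the PyKQueue binding linked above, the helper name wait_for_write is hypothetical, and on Linux the function simply raises because kqueue does not exist there.

```python
import os
import select

# True on FreeBSD/macOS, False on Linux (assumption: stdlib select module).
HAVE_KQUEUE = hasattr(select, "kqueue")


def wait_for_write(path, timeout=None):
    """Block until `path` is written to; return False if the timeout expires.

    Hypothetical helper, not part of PyKQueue or the thread.
    """
    if not HAVE_KQUEUE:
        raise RuntimeError("kqueue is not available on this platform")
    fd = os.open(path, os.O_RDONLY)
    kq = select.kqueue()
    try:
        event = select.kevent(
            fd,
            filter=select.KQ_FILTER_VNODE,
            flags=select.KQ_EV_ADD | select.KQ_EV_CLEAR,
            fflags=select.KQ_NOTE_WRITE,
        )
        # Register the event and wait for at most one notification.
        return len(kq.control([event], 1, timeout)) > 0
    finally:
        kq.close()
        os.close(fd)
```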

Best regards
Oliver

--
Oliver Fromme, Konrad-Celtis-Str. 72, 81369 Munich, Germany

``All that we see or seem is just a dream within a dream.''
(E. A. Poe)
Jul 18 '05 #5
Hello Diez,
>> Good to hear! I only need the changes! Do you have code examples?
>
> This is a misunderstanding: I didn't mean "the changes". What you can get by
> calling
>
> os.stat
>
> is the timestamp of the last modification - so you can then skip rereading
> the file if there has been no modification after the last time you read the
> file.

After that you can compare MD5 hashes (which are fast to compute) to find out
whether the file content has changed.
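
A rough sketch of the hash comparison, assuming the standard hashlib module. The function names are invented for the example. Note that computing the digest still requires reading the whole file, so this detects a change but does not avoid the read.

```python
import hashlib


def file_digest(path):
    """Return the MD5 hex digest of the file's bytes."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def content_changed(path, last_digest):
    """Return (changed, digest), comparing against the previous digest."""
    digest = file_digest(path)
    return digest != last_digest, digest
```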

> But you _can't_ read only what has changed in the file. There is no
> such facility in Python or in the underlying OSes (which is why
> Python has no such functionality).

IMO it is possible (I don't know exactly how) on journaling file systems.

Bye.
--
------------------------------------------------------------------------
Miki Tebeka <mi*********@gmail.com>
http://tebeka.spymac.net
The only difference between children and adults is the price of the toys
Jul 18 '05 #6
Oliver Fromme wrote:
> The portable way is to stat the file (see os.stat) every
> 10 seconds, look at the mtime (modification time), and
> if it has changed, then open/read/close.

Hmm, I'll have to rephrase my question... I KNOW that the file changes
every 10 seconds... The easiest thing would be to just give you the source code ;)
The program is a nice little window that displays ACPI information
about battery status and the like. The program is working just
fine, but I want to tune its performance...

Here is the link to the source code:

http://sourceforge.net/project/showf...kage_id=117073

Help would be greatly appreciated!

> If you're lucky enough to work on a FreeBSD UNIX system
> (and don't need a portable solution)

Linux platform... and no, I don't need a portable solution.
Jul 18 '05 #7
>> But you _can't_ read only what has changed in the file. There is no
>> such facility in Python or in the underlying OSes (which is why
>> Python has no such functionality).
>
> IMO it is possible (I don't know exactly how) on journaling file systems.

I seriously doubt that. At most, the sort of diff you could get would be
a list of file offsets together with blocks of a certain
size. The OS doesn't do versioned updates like Subversion or CVS do. So
most of the time that won't be of much use (think of XML data: how would you
replace only certain parts, which do not necessarily respect the structural
requirements of XML?)

But as long as the OP doesn't fill us in with more details of what he is
after, this is all idle speculation.

--
Regards,

Diez B. Roggisch
Jul 18 '05 #8
> Hmm, I'll have to rephrase my question... I KNOW that the file changes
> every 10 seconds... The easiest thing would be to just give you the source code ;)
> The program is a nice little window that displays ACPI information
> about battery status and the like. The program is working just
> fine, but I want to tune its performance...

Why don't you keep the file open and read until the end? That will give
you everything that has been appended since the last read. You might run into
buffering issues here, but you should be able to modify the
termios settings of the file descriptor so you can read data even if only a
byte has been sent.

--
Regards,

Diez B. Roggisch
Jul 18 '05 #9
Daniel Mueller wrote:
> Hmm, I'll have to rephrase my question... I KNOW that the file changes
> every 10 seconds... The easiest thing would be to just give you the source code ;)
> The program is a nice little window that displays ACPI information
> about battery status and the like. The program is working just
> fine.

What makes you think there is any way to detect the file changes
or read only the changes from files in the /proc file system?

Looking briefly at /proc on my own (non-laptop, so no acpi folder)
machine, I see that the dates as shown by "ls -l" or os.stat()
are *always* the current time... they increment second-by-second
as I run os.stat() repeatedly.

> but I want to tune its performance...


Why? Do you have any evidence that you have a performance problem
related to reading the data from these pseudo-files? (I'm guessing
that you think they are real files on your hard drive, but even
if that were the case, you probably haven't measured the access
or read times to prove that you actually *have* a problem.)

/proc is a *virtual* file system, so reading data from it is
about as fast as transferring bytes around in memory (to a
first approximation, anyway). Measure it!
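
One simple way to measure it, as a sketch: time a batch of open/read/close cycles and average them. The function name time_reads is made up for this example, and the path in the usage note is only a guess at where an ACPI battery file might live; substitute whatever file the program actually reads.

```python
import time


def time_reads(path, repeats=1000):
    """Return the average wall-clock seconds for one open/read/close cycle."""
    start = time.perf_counter()
    for _ in range(repeats):
        with open(path, "rb") as f:
            f.read()
    return (time.perf_counter() - start) / repeats
```

For example, print(time_reads("/proc/acpi/battery/BAT0/state")) would show the cost per read on a machine where that (assumed) file exists. If the result is a tiny fraction of the 10-second interval, the read itself is not the bottleneck.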

If you don't have evidence of poor performance, you are probably
doing "premature optimization".

-Peter
Jul 18 '05 #10
Peter Hansen wrote:
> Why? Do you have any evidence that you have a performance problem
> related to reading the data from these pseudo-files? (I'm guessing
> that you think they are real files on your hard drive, but even
> if that were the case, you probably haven't measured the access
> or read times to prove that you actually *have* a problem.)

Yeah, I know that /proc is a virtual filesystem. And yes, I know that the
files always have the present timestamp.

Well, the application uses up to 10% of CPU power with every poll... Is
that a normal amount?

Daniel

Jul 18 '05 #11
Daniel Mueller wrote:
> Peter Hansen wrote:
>> Why? Do you have any evidence that you have a performance problem
>
> Yeah, I know that /proc is a virtual filesystem. And yes, I know that the
> files always have the present timestamp.


Then we can at least dispense with any kind of "read changes only"
concept, can't we? You clearly have to read the entire file and
compare it with the previous file to know exactly what has changed.
Or read the file and parse it and compare the parsed results. I'm
unclear what other approach could be conceived of...
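
The "parse it and compare the parsed results" idea could look something like this sketch. It assumes the file consists of "key: value" lines; that layout and the sample values are assumptions for illustration, not taken from the thread, and both function names are hypothetical.

```python
def parse_status(text):
    """Parse 'key: value' lines into a dict (assumed file layout)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def changed_fields(old, new):
    """Return {key: (old_value, new_value)} for every key that differs."""
    return {
        key: (old.get(key), new.get(key))
        for key in old.keys() | new.keys()
        if old.get(key) != new.get(key)
    }


# Made-up sample data standing in for two successive reads.
before = parse_status("charging state: discharging\nremaining capacity: 3000 mAh\n")
after = parse_status("charging state: discharging\nremaining capacity: 2990 mAh\n")
print(changed_fields(before, after))
# {'remaining capacity': ('3000 mAh', '2990 mAh')}
```

With this, the display only needs updating for the keys that actually changed between polls.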

> Well, the application uses up to 10% of CPU power with every poll... Is
> that a normal amount?


Sure... actually I wouldn't be surprised if my program appeared
to take 100% of CPU while processing the data after reading the
files... to do otherwise would be abnormal. What I'd be focusing
on, however, was for *how long* it did so.

-Peter
Jul 18 '05 #12
Daniel Mueller <da********@gmx.net> wrote:
> Oliver Fromme wrote:
>> The portable way is to stat the file (see os.stat) every
>> 10 seconds, look at the mtime (modification time), and
>> if it has changed, then open/read/close.
>
> Hmm, I'll have to rephrase my question... I KNOW that the file changes
> every 10 seconds...


In that case you have to re-read the file every 10 seconds
anyway, no matter what.

The only optimization that's worthwhile is to keep the file
open all the time, so you spare the overhead of the open()
system call. In other words, open it _once_, then read it
(do not close it!), and after 10 seconds rewind -- that is,
file.seek(0) -- and re-read it.
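
A minimal sketch of that open-once, rewind, re-read loop. The names reread and watch are invented for the example, and the file is opened unbuffered in binary mode (an assumption on my part) so that every read goes straight to the OS instead of a Python-level buffer.

```python
import time


def reread(f):
    """Rewind an already-open file and return its full content."""
    f.seek(0)
    return f.read()


def watch(path, interval=10.0, rounds=3):
    """Open `path` once, then re-read it `rounds` times without reopening."""
    with open(path, "rb", buffering=0) as f:   # open() happens only once
        for _ in range(rounds):
            print(reread(f).decode(errors="replace"), end="")
            time.sleep(interval)
```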

I don't know what the contents of that file look like (I
don't use Linux), nor do I want to know. But since the Linux
procfs usually produces ASCII text files (which is a
mistake, in my opinion), I guess it should be pretty easy
to parse and find the difference.

If it's more complicated than that, Python's difflib might
be helpful: http://docs.python.org/lib/module-difflib.html
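
As a small illustration of the difflib suggestion, the sketch below compares two snapshots of a text file line by line. The helper name diff_snapshots and the sample contents are made up for the example.

```python
import difflib


def diff_snapshots(old_text, new_text):
    """Return unified-diff lines describing how new_text differs from old_text."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    ))


# Made-up sample data standing in for two successive reads.
old = "state: discharging\nrate: 1200 mW\n"
new = "state: discharging\nrate: 1150 mW\n"
for line in diff_snapshots(old, new):
    print(line)
```

Lines starting with "-" were only in the previous snapshot and lines starting with "+" are new; identical snapshots produce an empty list.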

Best regards
Oliver

Jul 18 '05 #13
Daniel Mueller <da********@gmx.net> wrote in message news:<cl***********@news.imp.ch>...
> Hello Fellow Python Programmers,
>
> I have the following problem:
>
> I want to read the content of a file every 10 seconds. What is the
> best function for that? Opening the file, reading it, and closing it
> consumes a lot of CPU time. Is there a better solution?
>
> Greetings Daniel


Daniel,

I have written a python program that does this. It does not use a lot
of CPU cycles. You need to change the SLEEP variable from 0.50 to 10.
I have used it on Windows and on Linux. Here it is...

-John

#!/usr/bin/env python
"""
tail.py
John Taylor, Oct 19 2004
Adapted from:
http://groups.google.com/groups?hl=e...t%40python.org
"""

import os
import sys
import time

SLEEP = 0.50  # seconds between polls; set to 10 to check every 10 seconds

if len(sys.argv) != 2:
    print()
    print("tail.py [ filename ]")
    print()
    sys.exit(1)

FILENAME = sys.argv[1]

try:
    fd = os.open(FILENAME, os.O_RDONLY)  # on Linux, may want |O_LARGEFILE
except OSError as e:
    print(e)
    sys.exit(1)

# Start at the current end of the file so only newly appended data is shown.
lastsize = os.fstat(fd).st_size

try:
    while True:
        size = os.fstat(fd).st_size
        if size > lastsize:
            # Read only the bytes appended since the last poll.
            os.lseek(fd, lastsize, os.SEEK_SET)
            data = os.read(fd, size - lastsize)
            sys.stdout.buffer.write(data)
            sys.stdout.flush()
            lastsize += len(data)
        elif size < lastsize:
            # The file was truncated; start over from its new end.
            lastsize = size
        time.sleep(SLEEP)
except KeyboardInterrupt:
    pass
finally:
    os.close(fd)

# end of program
Jul 18 '05 #14
