File Polling (Rereading)

Hello Fellow Python Programmers,

I have the following problem:

I want to read the content of a file every 10 seconds. What is the
best function to do that? Opening the file, reading it and closing it
consumes a lot of CPU time. Is there a better solution?

Greetings Daniel
--
Please use my GnuPG (PGP) Public-Key for secure message exchange.
(http://157.161.155.222/damueller-pubkey.txt)
Jul 18 '05 #1
Daniel Mueller wrote:
> Hello Fellow Python Programmers,
>
> I have the following problem:
>
> I want to read the content of a file every 10 seconds. What is the
> best function to do that? Opening the file, reading it and closing it
> consumes a lot of CPU time. Is there a better solution?


No. At least not if what you really want is always the full content of the
file. If you're only interested in the content when the file has actually
changed, that's a totally different animal, and on Unix it is doable using
stat calls (os.stat, in the os module). I'm not sure how much of that extends
to Windows, but I'm fairly confident that similar functionality is
available.

Diez
Jul 18 '05 #2
Diez B. Roggisch wrote:
> If you're only interested in the content if the file has actually
> changed - that's a totally different animal,

Good to hear! I only need the changes! Do you have code examples?

> and is under unix doable using
> stat-calls (somewhere in the os module). I'm not sure how much that extends
> to windows, but I'm pretty much confident that there is similar stuff
> available.

I'm programming under Linux.

Jul 18 '05 #3
> Good to hear! i only need the changes! do you have code examples?

This is a misunderstanding: I didn't mean "the changes". What you can get by
calling

os.stat

is the timestamp of the last modification, so you can skip rereading
the file if it has not been modified since the last time you read it.

But you _can't_ read only what has changed in the file. There is no
such facility in Python or in the underlying operating systems (which is
why Python has no such functionality).

I suggest you tell us more about what you actually want to accomplish, then
we might be able to offer better advice.

Diez
Jul 18 '05 #4
Daniel Mueller <da********@gmx.net> wrote:
> I want to read the content of a file every 10 seconds. What is the
> best function to do that? Opening the file, reading it and closing it
> consumes a lot of CPU time. Is there a better solution?


The portable way is to stat the file (see os.stat) every
10 seconds, look at the mtime (modification time), and
open/read/close only if it changed. This still means
polling the file, but at least stat is much more efficient
than open/read/close. If you're also interested in changes
to the file's metadata (permissions, owner etc.), you also
have to look at the ctime.
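A minimal Python 3 sketch of the stat-based polling described above, assuming a regular file on disk. The names ChangeWatcher and read_if_changed are made up for illustration and do not come from any library:

```python
import os


class ChangeWatcher:
    """Remember a file's last seen mtime/ctime and report changes."""

    def __init__(self, path):
        self.path = path
        self.last_seen = None  # (mtime_ns, ctime_ns) from the previous check

    def changed(self):
        """Return True if mtime or ctime differs from the previous call."""
        st = os.stat(self.path)
        current = (st.st_mtime_ns, st.st_ctime_ns)
        is_changed = current != self.last_seen
        self.last_seen = current
        return is_changed


def read_if_changed(watcher):
    """Return the file content if the stat data changed, else None."""
    if not watcher.changed():
        return None
    with open(watcher.path, "rb") as f:
        return f.read()
```

The first call always reports a change, because nothing has been seen yet. In a polling loop you would call read_if_changed(watcher) and then time.sleep(10). Two writes that land within the filesystem's timestamp resolution can go unnoticed, so treat this as a cheap filter and not a guarantee.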

If you're lucky enough to work on a FreeBSD UNIX system
(and don't need a portable solution), you can use FreeBSD's
kqueue API. Using that interface, you don't have to poll
at all. The kernel will notify you immediately when a
specific event occurs, such as someone writing to a certain
file (this is used by the "tail -f" command, for example,
so it doesn't have to poll the file). The Python bindings
for the kqueue interface are in the ports collection of
FreeBSD (see ports/devel/kqueue). Otherwise, see this
webpage for more information:

http://people.freebsd.org/~dwhite/PyKQueue/
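
For readers on a BSD-derived system: current Python versions expose kqueue in the standard select module, so the sketch below uses that instead of the PyKQueue bindings linked above. wait_for_write is a hypothetical helper name, and the sketch assumes you only care about write events on a single file. On platforms without kqueue (Linux, for example) it returns None:

```python
import os
import select


def wait_for_write(path, timeout):
    """Block until `path` is written to, or until `timeout` seconds pass.

    Returns True on a write event, False on timeout, and None where
    select.kqueue does not exist (for example on Linux).
    """
    if not hasattr(select, "kqueue"):
        return None
    fd = os.open(path, os.O_RDONLY)
    kq = select.kqueue()
    try:
        # Ask the kernel to report writes to the file behind fd.
        event = select.kevent(
            fd,
            filter=select.KQ_FILTER_VNODE,
            flags=select.KQ_EV_ADD | select.KQ_EV_CLEAR,
            fflags=select.KQ_NOTE_WRITE,
        )
        # control() registers the event and waits for at most one result.
        return bool(kq.control([event], 1, timeout))
    finally:
        kq.close()
        os.close(fd)
```

A long-running program would keep the kqueue object and the file descriptor open between waits. They are opened and closed per call here only to keep the sketch short.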

Best regards
Oliver

--
Oliver Fromme, Konrad-Celtis-Str. 72, 81369 Munich, Germany

``All that we see or seem is just a dream within a dream.''
(E. A. Poe)
Jul 18 '05 #5
Hello Diez,
>> Good to hear! i only need the changes! do you have code examples?
> This is a misunderstanding: I didn't mean "the changes". What you can get by
> calling
>
> os.stat
>
> is the timestamp of the last modification - so you can then skip rereading
> the file if there has been no modification after the last time you read the
> file.

After that you can compare md5 hashes (which are fast to compute) and know if
the file content has changed.
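
A small Python 3 sketch of the hash comparison, for illustration only. content_digest is a made-up helper name. Note that computing the hash still means reading the whole file, so this saves any processing you do after the read, not the read itself:

```python
import hashlib


def content_digest(path, chunk_size=65536):
    """Return the MD5 hex digest of the file's content."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Keep the digest from the previous poll and reprocess the file only when the new digest differs from it.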

> But you _can't_ only read what has been changed in the file - there is no
> such thing neither in python, nor the underlying OSses (which is the reason
> for python not having such a functionality)

IMO it is possible (I don't know exactly how) on journaling file systems.

Bye.
--
------------------------------------------------------------------------
Miki Tebeka <mi*********@gmail.com>
http://tebeka.spymac.net
The only difference between children and adults is the price of the toys
Jul 18 '05 #6
Oliver Fromme wrote:
> The portable way is to stat the file (see os.stat) every
> 10 seconds, look at the mtime (modification time), and
> if it did change, then open/read/close.

Hmm, I'll have to rephrase my question... I KNOW that the file changes
every 10 seconds... The easiest would be to just give you the source code ;)
The program is a nice little window which displays ACPI information
about battery status and the like. The program is working just
fine, but I want to tweak its performance...

Here is the link to the source code:

http://sourceforge.net/project/showf...kage_id=117073

Help would be greatly appreciated!

> If you're lucky enough to work on a FreeBSD UNIX system
> (and don't need a portable solution)

Linux platform... and no, I don't need a portable solution.
Jul 18 '05 #7
>> But you _can't_ only read what has been changed in the file - there is no
>> such thing neither in python, nor the underlying OSses (which is the
>> reason for python not having such a functionality)
>
> IMO it is possible (don't know exactly how) on journalized file systems.

I seriously doubt that. At best, the diff you could get would be no
more than a list of file offsets together with blocks of a certain
size. The OS doesn't do versioned updates like Subversion or CVS do. So
most of the time that won't be of much use (think of XML data: how would
you replace only certain parts that don't necessarily respect the
structural requirements of XML?)

But as long as the OP doesn't fill us in with more details of what he is
after, this is all idle speculation.

--
Regards,

Diez B. Roggisch
Jul 18 '05 #8
> Hmm, I'll have to rephrase my question... I KNOW that the file changes
> every 10 seconds... The easiest would be to just give you the source code ;)
> The program is a nice little window which displays ACPI information
> about battery status and the like. The program is working just
> fine, but I want to tweak its performance...


Why don't you keep the file open and read until the end? That will give
you everything that has been appended since the last read. You might run
into buffering issues here, but you should be able to modify the
termios settings of the file descriptor so you can read data even if only
a single byte has been sent.
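
A rough Python 3 sketch of the keep-it-open idea, assuming the file only ever grows by appending. AppendReader is a hypothetical name, not an existing class. It opens the file unbuffered (buffering=0) so Python itself holds no read buffer; whatever buffering the writing process does is outside the reader's control:

```python
class AppendReader:
    """Keep one file open and return only newly appended bytes."""

    def __init__(self, path):
        # Unbuffered binary mode: every read goes straight to the OS.
        self._f = open(path, "rb", buffering=0)

    def poll(self):
        """Return the bytes appended since the previous poll (maybe empty)."""
        # The file position stays at the old end of file between calls,
        # so reading to EOF yields exactly the new data.
        return self._f.read()

    def close(self):
        self._f.close()
```

Call poll() once per interval. It returns an empty bytes object when nothing new has arrived.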

--
Regards,

Diez B. Roggisch
Jul 18 '05 #9
Daniel Mueller wrote:
> Hmm, I'll have to rephrase my question... I KNOW that the file changes
> every 10 seconds... The easiest would be to just give you the source code ;)
> The program is a nice little window which displays ACPI information
> about battery status and the like. The program is working just
> fine.

What makes you think there is any way to detect the file changes
or read only the changes from files in the /proc file system?

Looking briefly at /proc on my own (non-laptop, so no acpi folder)
machine, I see that the dates as shown by "ls -l" or os.stat()
are *always* the current time... they increment second-by-second
as I run os.stat() repeatedly.

> but I want to tweak its performance...


Why? Do you have any evidence that you have a performance problem
related to reading the data from these pseudo-files? (I'm guessing
that you think they are real files on your hard drive, but even
if that were the case, you probably haven't measured the access
or read times to prove that you actually *have* a problem.)

/proc is a *virtual* file system, so reading data from it is
about as fast as transferring bytes around in memory (to a
first approximation, anyway). Measure it!
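
One way to measure it, as a minimal Python 3 sketch. average_read_time is a made-up helper name, and the default of 100 repeats is an arbitrary choice:

```python
import time


def average_read_time(path, repeats=100):
    """Return the mean wall-clock seconds for one open/read/close cycle."""
    start = time.perf_counter()
    for _ in range(repeats):
        with open(path, "rb") as f:
            f.read()
    return (time.perf_counter() - start) / repeats
```

Pass it the path of the /proc file your program polls and compare the result with the 10-second polling interval.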

If you don't have evidence of poor performance, you are probably
doing "premature optimization".

-Peter
Jul 18 '05 #10
Peter Hansen wrote:
> Why? Do you have any evidence that you have a performance problem
> related to reading the data from these pseudo-files? (I'm guessing
> that you think they are real files on your hard drive, but even
> if that were the case, you probably haven't measured the access
> or read times to prove that you actually *have* a problem.)

Yeah, I know that /proc is a virtual filesystem. And yes, I know that the
files always have the present timestamp.

Well, the application uses up to 10% of CPU power with every poll... Is
that a normal amount?

Daniel

Jul 18 '05 #11
Daniel Mueller wrote:
> Peter Hansen wrote:
>> Why? Do you have any evidence that you have a performance problem
>
> Yeah, I know that /proc is a virtual filesystem. And yes, I know that the
> files always have the present timestamp.


Then we can at least dispense with any kind of "read changes only"
concept, can't we? You clearly have to read the entire file and
compare it with the previous file to know exactly what has changed.
Or read the file and parse it and compare the parsed results. I'm
unclear what other approach could be conceived of...

> Well, the application uses up to 10% of CPU power with every poll... Is
> that a normal amount?


Sure... actually I wouldn't be surprised if my program appeared
to take 100% of CPU while processing the data after reading the
files... to do otherwise would be abnormal. What I'd be focusing
on, however, was for *how long* it did so.

-Peter
Jul 18 '05 #12
Daniel Mueller <da********@gmx.net> wrote:
> Oliver Fromme wrote:
>> The portable way is to stat the file (see os.stat) every
>> 10 seconds, look at the mtime (modification time), and
>> if it did change, then open/read/close.
>
> Hmm, I'll have to rephrase my question... I KNOW that the file changes
> every 10 seconds...


In that case you have to re-read the file every 10 seconds
anyway, no matter what.

The only optimization that's worthwhile is to keep the file
open all the time, so you spare the overhead of the open()
system call. In other words, open it _once_, then read it
(do not close it!), and after 10 seconds rewind -- that is,
file.seek(0) -- and re-read it.
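
In Python 3 terms, the open-once-and-rewind approach might look like the sketch below. RewindingReader is a hypothetical name. It assumes the writer updates the file in place; if the writer replaces the file by renaming a new one over it, the open handle keeps pointing at the old file and you would have to reopen:

```python
class RewindingReader:
    """Open a file once and re-read it from the start on every poll."""

    def __init__(self, path):
        # Unbuffered, so no stale data from an earlier read is reused.
        self._f = open(path, "rb", buffering=0)

    def read_all(self):
        """Rewind to the beginning and return the current full content."""
        self._f.seek(0)
        return self._f.read()

    def close(self):
        self._f.close()
```

Create the reader once at startup and call read_all() every 10 seconds.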

I don't know what the contents of that file look like (I
don't use Linux), nor do I want to know. But since the
Linux procfs usually produces ASCII text files (which is a
mistake, in my opinion), I guess it should be pretty easy
to parse and find the difference.

If it's more complicated than that, Python's difflib might
be helpful: http://docs.python.org/lib/module-difflib.html
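
A short Python 3 sketch of what a difflib comparison of two snapshots could look like. changed_lines is a made-up helper name, and any field names you feed it are your own data, not the real procfs format:

```python
import difflib


def changed_lines(old_text, new_text):
    """Return the removed ("- ") and added ("+ ") lines between two snapshots."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

Keep the text from the previous poll, pass it in together with the new text, and update the display only for the lines that come back.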

Best regards
Oliver

Jul 18 '05 #13
Daniel Mueller <da********@gmx.net> wrote in message news:<cl***********@news.imp.ch>...
> Hello Fellow Python Programmers,
>
> I have the following problem:
>
> I want to read the content of a file every 10 seconds. What is the
> best function to do that? Opening the file, reading it and closing it
> consumes a lot of CPU time. Is there a better solution?
>
> Greetings Daniel


Daniel,

I have written a python program that does this. It does not use a lot
of CPU cycles. You need to change the SLEEP variable from 0.50 to 10.
I have used it on Windows and on Linux. Here it is...

-John

#!/usr/bin/env python
"""
tail.py
John Taylor, Oct 19 2004
Adapted from:
http://groups.google.com/groups?hl=e...t%40python.org
"""
# Note: Python 2 syntax (print statements, "except OSError, e").

import sys,os,time,stat
SLEEP = 0.50   # seconds between polls

if len(sys.argv) != 2:
    print
    print "tail.py [ filename ]"
    print
    sys.exit(1)

FILENAME = sys.argv[1]

try:
    fd = os.open(FILENAME,os.O_RDONLY) # on Linux, may want |O_LARGEFILE
except OSError, e:
    print e
    sys.exit(1)

# Remember the current size so that only data appended later is printed.
info = os.fstat( fd )
lastsize = info[stat.ST_SIZE]
os.lseek( fd, 0, 2 )   # whence 2: position at the end of the file

try:
    while True:
        info = os.fstat( fd )
        size = info[stat.ST_SIZE]
        if size > lastsize:
            # The file has grown: read only the newly appended bytes.
            os.lseek(fd, lastsize, 0)
            data = os.read(fd, size - lastsize)
            print data,
            lastsize=size
        time.sleep( SLEEP )
except KeyboardInterrupt:
    pass

# end of program
Jul 18 '05 #14
