Python and stale file handles

Hi, All!

I started programming in Python again after a hiatus of several
years and have run into a sticky problem that I can't seem to fix,
no matter how hard I try. It starts with tailing a log file.

Basically, I'm trying to tail a log file and send each line
elsewhere in the script (here, I call it processor()). My first
iteration below works perfectly fine, as long as the log file itself
(logfile.log) keeps getting written to.

I have a shell script that constantly writes to logfile.log. If I
happen to kill it off and restart it (overwriting the log file with
new entries), then the Python script stops sending anything out at
all.

import time, os

def processor(message, address):
    # do something clever here
    pass

# Set the filename and open the file
filename = 'logfile.log'
file = open(filename, 'r')

# Find the size of the file and move to the end
st_results = os.stat(filename)
st_size = st_results[6]
file.seek(st_size)

while 1:
    where = file.tell()
    line = file.readline()
    if not line:
        time.sleep(1)
        file.seek(where)
    else:
        print line,  # line already has a trailing newline
        data = line
        if not data:
            break
        else:
            processor(data, addr)
            print "Sending message '", data, "'....."

someotherstuffhere()

===

This is perfectly normal behavior, since the same thing happens when I
do a tail -f on the log file. However, I was hoping to build a bit
of cleverness into the Python script: it would notice that the log
file had changed and compensate for it.

So, I wrote up a new script that opens the file to begin with,
attempts a quick measurement of the file (to see if it has suddenly
stopped growing) and then reopens the log file if something dodgy is
going on.

However, it's not quite working the way that I really intended it to.
It will either start reading the file from the beginning (instead of
tailing from the end) or just sit there confuzzled until I kill it
off.

===
import time, os

filename = 'logfile.log'

def processor(message):
    # do something clever here
    pass

def checkfile(filename):
    file = open(filename, 'r')
    print "checking file, first pass"
    pass1 = os.stat(filename)
    pass1_size = pass1[6]

    time.sleep(5)

    print "file check, 2nd pass"
    pass2 = os.stat(filename)
    pass2_size = pass2[6]
    if pass1_size == pass2_size:
        print "reopening file"
        file.close()
        file = open(filename, 'r')
    else:
        print "file is OK"
        pass

while 1:
    checkfile(filename)
    where = file.tell()
    line = file.readline()
    print "reading file", where
    if not line:
        print "sleeping here"
        time.sleep(5)
        print "seeking file here"
        file.seek(where)
    else:
        # print line,  # already has newline
        data = line
        print "readying line"
        if not data:
            print "no data, breaking here"
            break
        else:
            print "sending line"
            processor(data)

So, does anyone have any thoughts on how to keep a Python script from
bugging out after a tailed file has been refreshed? I'd love to hear
any thoughts you may have on the matter, even if they're of the
'that's the way things work' variety.

Cheers, and thanks in advance for any ideas on how to get around the
issue.

tom
Jun 27 '08 #1
2 Replies
On 17 Apr, 04:22, tgiles <tgi...@gmail.com> wrote:
[snip: original post quoted in full]
Possibly, restarting the program that writes the log file creates a
new file rather than appending to the old one?

I think you should always reopen the file between the first and the
second pass of your checkfile function, and then:
- if the file has the same size, it is probably the same file (though
it would be better to check the modification time as well), so seek
to the end of it;
- otherwise, it's a new file, so start reading it from the beginning.

To reduce the number of seeks, you could perform checkfile only after
N cycles in which you did not get any data.
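
Something along these lines is what I mean; just a rough, untested
sketch (the function name, the one-second sleep, the five-cycle
threshold and the shrink test are arbitrary choices of mine, not
taken from your scripts):

import os, time

def follow(filename, processor):
    f = open(filename, 'r')
    f.seek(0, 2)                      # whence=2: seek relative to end, tail from EOF
    idle = 0
    while 1:
        line = f.readline()
        if line:
            idle = 0
            processor(line)
            continue
        idle += 1
        time.sleep(1)
        if idle >= 5:                 # only stat the file after a few empty reads
            idle = 0
            try:
                size_on_disk = os.stat(filename)[6]
            except OSError:
                continue              # file briefly missing while being rewritten
            if size_on_disk < f.tell():
                # the file shrank: assume it was overwritten, so reopen
                # and read the new contents from the beginning
                f.close()
                f = open(filename, 'r')

You could compare modification times from os.stat instead of sizes;
the important part is reopening once the old handle looks stale.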

Ciao
-----
FB
Jun 27 '08 #2
On 17 Apr, 14:43, bock...@virgilio.it wrote:
On 17 Apr, 04:22, tgiles <tgi...@gmail.com> wrote:
[snip: original post quoted in full]

Possibly, restarting the program that writes the log file creates a
new file rather than appending to the old one?
It seems that, at the least, the OP should definitely reopen the file:

# create a file
In [322]: f1 = open("test.txt", 'w')
In [323]: f1.write("test\n")
In [324]: f1.close()

# check content of file
In [325]: f_test1 = open("test.txt")
In [326]: f_test1.readline()
Out[326]: 'test\n'
# check twice, we never know
In [327]: f_test1.seek(0)
In [328]: f_test1.readline()
Out[328]: 'test\n'

# rewrite over the same file
In [329]: f1 = open("test.txt", 'w')
In [330]: f1.write("new test\n")
In [331]: f1.close()

# check if ok
In [332]: f_test2 = open("test.txt")
In [333]: f_test2.readline()
Out[333]: 'new test\n'

# first file object has not seen the change
In [334]: f_test1.seek(0)
In [335]: f_test1.readline()
Out[335]: 'test\n'
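
So the tail loop has to notice when the name on disk no longer refers
to the file it is holding open, and reopen it. A rough, untested
sketch of one way to do that on a POSIX system (the inode comparison
and the function name are my own suggestion, not something from the
original scripts):

import os, time

def tail(filename, processor):
    f = open(filename, 'r')
    f.seek(0, 2)                          # start at the end, like tail -f
    while 1:
        line = f.readline()
        if line:
            processor(line)
            continue
        time.sleep(1)
        try:
            on_disk = os.stat(filename)
        except OSError:
            continue                      # file briefly absent while being replaced
        in_hand = os.fstat(f.fileno())
        if on_disk.st_ino != in_hand.st_ino or on_disk.st_size < f.tell():
            # the path now points at a different (or truncated) file;
            # the old handle is stale, so reopen from the beginning
            f.close()
            f = open(filename, 'r')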
Jun 27 '08 #3
