Python and stale file handles

Hi, All!

I started programming in Python again after a hiatus of several
years and have run into a sticky problem that I can't seem to fix,
regardless of how hard I try. It starts with tailing a log file.

Basically, I'm trying to tail a log file and send the contents
elsewhere in the script (here, I call it processor()). My first
iteration below works perfectly fine- as long as the log file itself
(logfile.log) keeps getting written to.

I have a shell script that constantly writes to logfile.log. If I
happen to kill it off and restart it (overwriting the log file with
new entries), then the Python script stops sending anything out at
all.

import time, os

def processor(message,address):
    #do something clever here
    pass

#Set the filename and open the file
filename = 'logfile.log'
file = open(filename,'r')

#Find the size of the file and move to the end
st_results = os.stat(filename)
st_size = st_results[6]
file.seek(st_size)

while 1:
    where = file.tell()
    line = file.readline()
    if not line:
        time.sleep(1)
        file.seek(where)
    else:
        print line, # already has newline
        data = line
        if not data:
            break
        else:
            processor(data,addr)
            print "Sending message '",data,"'....."

someotherstuffhere()

===

This is perfectly normal behavior, since the same thing happens when I
do a tail -f on the log file. However, I was hoping to build a bit of
cleverness into the Python script, so that it would notice a change
in the log file and compensate for it.

So, I wrote a new script that opens the file to begin with, takes a
quick measurement of the file's size (to see if it's suddenly stuck)
and then reopens the log file if there's something dodgy going on.

However, it's not quite working the way that I really intended it to.
It will either start reading the file from the beginning (instead of
tailing from the end) or just sit there confuzzled until I kill it
off.

===
import time, os

filename = 'logfile.log'

def processor(message):
    # do something clever here
    pass

def checkfile(filename):
    file = open(filename,'r')
    print "checking file, first pass"
    pass1 = os.stat(filename)
    pass1_size = pass1[6]

    time.sleep(5)

    print "file check, 2nd pass"
    pass2 = os.stat(filename)
    pass2_size = pass2[6]
    if pass1_size == pass2_size:
        print "reopening file"
        file.close()
        file = open(filename,'r')
    else:
        print "file is OK"
        pass

while 1:
    checkfile(filename)
    where = file.tell()
    line = file.readline()
    print "reading file", where
    if not line:
        print "sleeping here"
        time.sleep(5)
        print "seeking file here"
        file.seek(where)
    else:
        # print line, # already has newline
        data = line
        print "readying line"
        if not data:
            print "no data, breaking here"
            break
        else:
            print "sending line"
            processor(data)

So, does anyone have thoughts on how to keep a Python script from
bugging out after a tailed file has been refreshed? I'd love to hear
any thoughts you may have on the matter, even if they're of the
'that's the way things work' variety.

Cheers, and thanks in advance for any ideas on how to get around the
issue.

tom
Jun 27 '08 #1
On 17 Apr, 04:22, tgiles <tgi...@gmail.com> wrote:
[snip]

Possibly, restarting the program that writes the log file creates a
new file rather than appending to the old one?

I think you should always reopen the file between the first and the
second pass of your checkfile function, and then:
- if the file has the same size, it is probably the same file (but it
  would be better to check the update time!), so seek to the end of it
- otherwise, it's a new file, so start reading it from the beginning
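
A minimal sketch of the reopen-and-position step, in Python 3 syntax (the thread's own code is Python 2). The helper name open_for_tail is hypothetical, and instead of comparing two size readings it compares the current size against how far the old handle had read, which is a swapped-in rule and an assumption on my part. It would not notice a replacement file that has already grown past the old offset; checking the update time, as suggested above, would be one way to narrow that gap.

```python
import os

def open_for_tail(path, last_offset):
    """Reopen `path` and position the new handle for tailing.

    `last_offset` is the byte offset the previous handle had reached.
    If the file is now shorter than that, it was presumably truncated
    or replaced, so reading starts at the beginning. Otherwise reading
    resumes where the old handle left off.
    """
    # Binary mode, so that offsets are plain byte counts.
    f = open(path, "rb")
    size = os.fstat(f.fileno()).st_size
    if size < last_offset:
        f.seek(0)            # looks like a new (shorter) file
    else:
        f.seek(last_offset)  # same file, or one that has only grown
    return f
```

Typical use would be to remember file.tell() from the old handle, close it, and pass that value in as last_offset.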

To reduce the number of seeks, you could run checkfile only if you
did not get any data for N cycles.
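
The idle-cycle idea could look roughly like the sketch below (Python 3). The Tailer class, its idle_limit parameter and the inode comparison inside _check are all assumptions for illustration, not code from this thread, and the inode test assumes a POSIX-style filesystem.

```python
import os

class Tailer:
    """Poll-based tail that re-checks the file only after N idle polls."""

    def __init__(self, path, idle_limit=3):
        self.path = path
        self.idle_limit = idle_limit  # the "N cycles" from the text above
        self.idle = 0
        self.f = open(path, "rb")
        self.f.seek(0, os.SEEK_END)   # start at the end, like tail -f

    def poll(self):
        """Return the lines appended since the last call (possibly none)."""
        # Note: the last item may be a partial line if the writer is mid-write.
        lines = self.f.readlines()
        if lines:
            self.idle = 0
            return lines
        self.idle += 1
        if self.idle >= self.idle_limit:
            self.idle = 0
            self._check()
        return []

    def _check(self):
        """Reopen if the path now names another file, or the file shrank."""
        try:
            on_disk = os.stat(self.path)
        except FileNotFoundError:
            return  # file is briefly missing; try again on a later poll
        current = os.fstat(self.f.fileno())
        replaced = (on_disk.st_dev, on_disk.st_ino) != (current.st_dev, current.st_ino)
        truncated = on_disk.st_size < self.f.tell()
        if replaced or truncated:
            self.f.close()
            self.f = open(self.path, "rb")  # new handle starts at offset 0
```

A caller would loop over poll(), hand each returned line to processor(), and time.sleep() for a second or so whenever the list comes back empty.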

Ciao
-----
FB
Jun 27 '08 #2
On 17 Apr, 14:43, bock...@virgilio.it wrote:

Possibly, restarting the program that writes the log file creates a
new file rather than appending to the old one?

It seems the OP should, at the very least, reopen the file:

# create a file
In [322]: f1 = open("test.txt", 'w')
In [323]: f1.write("test\n")
In [324]: f1.close()

# check content of file
In [325]: f_test1 = open("test.txt")
In [326]: f_test1.readline()
Out[326]: 'test\n'
# check twice, we never know
In [327]: f_test1.seek(0)
In [328]: f_test1.readline()
Out[328]: 'test\n'

# rewrite over the same file
In [329]: f1 = open("test.txt", 'w')
In [330]: f1.write("new test\n")
In [331]: f1.close()

# check if ok
In [332]: f_test2 = open("test.txt")
In [333]: f_test2.readline()
Out[333]: 'new test\n'

# first file object has not seen the change
In [334]: f_test1.seek(0)
In [335]: f_test1.readline()
Out[335]: 'test\n'
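
The same effect can be scripted, along with one way to detect it. This is only a sketch in Python 3, assuming a POSIX system where the writer replaces the file by renaming a new one over it; the same_file helper is a hypothetical name, not something from this thread.

```python
import os
import tempfile

def same_file(handle, path):
    """True if the open handle and the path refer to the same file on disk."""
    a = os.fstat(handle.fileno())
    b = os.stat(path)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w") as f:
        f.write("test\n")

    reader = open(path)
    first = reader.readline()

    # Replace the file with a new one under the same name (new inode).
    new_path = path + ".new"
    with open(new_path, "w") as f:
        f.write("new test\n")
    os.replace(new_path, path)

    reader.seek(0)
    stale = reader.readline()             # the old handle still sees the old file
    still_same = same_file(reader, path)  # the name now points elsewhere
    reader.close()

    with open(path) as fresh:
        reopened = fresh.readline()       # a new handle sees the new content
```

When same_file() returns False, the tailing script knows it is holding a stale handle and should reopen the file by name.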
Jun 27 '08 #3
