Python and stale file handles

Hi, All!

I started programming in Python again after a hiatus of several
years and have run into a sticky problem that I can't seem to fix,
no matter how hard I try. It starts with tailing a log file.

Basically, I'm trying to tail a log file and send each line
elsewhere in the script (here, I call it processor()). My first
iteration below works perfectly fine, as long as the log file itself
(logfile.log) keeps getting written to.

I have a shell script that constantly writes to logfile.log. If I
happen to kill it off and restart it (overwriting the log file with
new entries), then the Python script stops sending anything out at
all.

import time, os

def processor(message, address):
    # do something clever here
    pass

# Set the filename and open the file
filename = 'logfile.log'
file = open(filename, 'r')

# Find the size of the file and move to the end
st_results = os.stat(filename)
st_size = st_results[6]
file.seek(st_size)

while 1:
    where = file.tell()
    line = file.readline()
    if not line:
        time.sleep(1)
        file.seek(where)
    else:
        print line,  # line already has a trailing newline
        data = line
        if not data:
            break
        else:
            processor(data, addr)
            print "Sending message '", data, "'....."

someotherstuffhere()

===

This is perfectly normal behavior, since the same thing happens when I
do a tail -f on the log file. However, I was hoping to build a bit
of cleverness into the Python script: it would notice that the log
file had changed and compensate for it.

So, I wrote up a new script that opens the file to begin with,
attempts a quick measurement of the file (to see if it has suddenly
stopped growing) and then reopens the log file if something dodgy is
going on.

However, it's not quite working the way that I really intended it to.
It will either start reading the file from the beginning (instead of
tailing from the end) or just sit there confuzzled until I kill it
off.

===
import time, os

filename = 'logfile.log'

def processor(message):
    # do something clever here
    pass

def checkfile(filename):
    file = open(filename, 'r')
    print "checking file, first pass"
    pass1 = os.stat(filename)
    pass1_size = pass1[6]

    time.sleep(5)

    print "file check, 2nd pass"
    pass2 = os.stat(filename)
    pass2_size = pass2[6]
    if pass1_size == pass2_size:
        print "reopening file"
        file.close()
        file = open(filename, 'r')
    else:
        print "file is OK"
        pass

while 1:
    checkfile(filename)
    where = file.tell()
    line = file.readline()
    print "reading file", where
    if not line:
        print "sleeping here"
        time.sleep(5)
        print "seeking file here"
        file.seek(where)
    else:
        # print line,  # already has newline
        data = line
        print "readying line"
        if not data:
            print "no data, breaking here"
            break
        else:
            print "sending line"
            processor(data)

So, does anyone have any thoughts on how to keep a Python script from
bugging out after a tailed file has been refreshed? I'd love to hear
any thoughts you may have on the matter, even if they're of the
'that's the way things work' variety.

Cheers, and thanks in advance for any ideas on how to get around the
issue.

tom
Jun 27 '08 #1
2 Replies
On 17 Apr, 04:22, tgiles <tgi...@gmail.com> wrote:
[snip: original post quoted in full]
Possibly, restarting the program that writes the log file creates a
new file rather than appending to the old one?

I think you should always reopen the file between the first and the
second pass of your checkfile function, and then:
- if the file has the same size, it is probably the same file (though
it would be better to check the modification time as well), so seek
to the end of it;
- otherwise, it's a new file, so start reading it from the beginning.

To reduce the number of seeks, you could perform checkfile only after
N cycles in which you did not get any data.
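
Something along these lines is what I mean; just a rough, untested
sketch (the function name, the one-second sleep, the five-cycle
threshold and the shrink test are arbitrary choices of mine, not
taken from your scripts):

import os, time

def follow(filename, processor):
    f = open(filename, 'r')
    f.seek(0, 2)                      # whence=2: seek relative to end, tail from EOF
    idle = 0
    while 1:
        line = f.readline()
        if line:
            idle = 0
            processor(line)
            continue
        idle += 1
        time.sleep(1)
        if idle >= 5:                 # only stat the file after a few empty reads
            idle = 0
            try:
                size_on_disk = os.stat(filename)[6]
            except OSError:
                continue              # file briefly missing while being rewritten
            if size_on_disk < f.tell():
                # the file shrank: assume it was overwritten, so reopen
                # and read the new contents from the beginning
                f.close()
                f = open(filename, 'r')

You could compare modification times from os.stat instead of sizes;
the important part is reopening once the old handle looks stale.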

Ciao
-----
FB
Jun 27 '08 #2
On 17 Apr, 14:43, bock...@virgilio.it wrote:
On 17 Apr, 04:22, tgiles <tgi...@gmail.com> wrote:
[snip: original post quoted in full]

Possibly, restarting the program that writes the log file creates a
new file rather than appending to the old one?
It seems that, at the least, the OP should definitely reopen the file:

# create a file
In [322]: f1 = open("test.txt", 'w')
In [323]: f1.write("test\n")
In [324]: f1.close()

# check content of file
In [325]: f_test1 = open("test.txt")
In [326]: f_test1.readline()
Out[326]: 'test\n'
# check twice, we never know
In [327]: f_test1.seek(0)
In [328]: f_test1.readline()
Out[328]: 'test\n'

# rewrite over the same file
In [329]: f1 = open("test.txt", 'w')
In [330]: f1.write("new test\n")
In [331]: f1.close()

# check if ok
In [332]: f_test2 = open("test.txt")
In [333]: f_test2.readline()
Out[333]: 'new test\n'

# first file object has not seen the change
In [334]: f_test1.seek(0)
In [335]: f_test1.readline()
Out[335]: 'test\n'
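
So the tail loop has to notice when the name on disk no longer refers
to the file it is holding open, and reopen it. A rough, untested
sketch of one way to do that on a POSIX system (the inode comparison
and the function name are my own suggestion, not something from the
original scripts):

import os, time

def tail(filename, processor):
    f = open(filename, 'r')
    f.seek(0, 2)                          # start at the end, like tail -f
    while 1:
        line = f.readline()
        if line:
            processor(line)
            continue
        time.sleep(1)
        try:
            on_disk = os.stat(filename)
        except OSError:
            continue                      # file briefly absent while being replaced
        in_hand = os.fstat(f.fileno())
        if on_disk.st_ino != in_hand.st_ino or on_disk.st_size < f.tell():
            # the path now points at a different (or truncated) file;
            # the old handle is stale, so reopen from the beginning
            f.close()
            f = open(filename, 'r')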
Jun 27 '08 #3
