File Parsing Question

Hi,
I am new to Python. I am trying to do the following:

inp = open(my_file, 'r')

for line in inp:
    # Perform some operations with line
    if condition something:
        # Start re-reading from that position again
        for line in inp:
            if some other condition:
                break
        # I need to go back one line and use that line's value.
        # I need to perform the operations listed at the top
        # with this line's value. I cannot move those operations here.
        # I cannot do this with seek or tell.

In Perl this is what I have
while (<inp>) {
    # my_operations
    next if /pattern/;
    while (<inp>) {
        # operations again
        last if /pattern2/;
    }
    seek(inp, (-1-length), 1);
}

This works perfectly in Perl. Can I do the same in Python?

Thanks
Jee
Sep 12 '07 #1

Save the previous line in a variable if you only want the previous
line.

prev_line = None
for line in inp:
    # Perform some operations with line
    if condition something:
        print prev_line
        print line
        break
    # Remember this line so the next iteration can go back to it
    prev_line = line

If you want to do more than that, then use data = inp.readlines(), or
you can use
data = open(myfile, "r").readlines(). The data will be stored as a
list, so you can access each line individually.

Sep 12 '07 #2
I'm assuming you know that Python has a file.seek(), but you have to
know the number of bytes you want to move from the beginning of the
file or from the current location. You could save the length of the
previous record and use file.seek() to back up and then move forward,
but it is simpler to save the previous record, or to use readlines()
if the file will fit into a reasonable amount of memory.
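
For what it's worth, the "save the length of the previous record and seek back" idea might look roughly like the sketch below. It is written for Python 3 and opens the file in binary mode, so that len() of a record is a byte count and a relative seek is allowed; the function name and the sample file contents are made up for illustration.

```python
import os
import tempfile

def reread_last_line(path):
    """Read two lines, step back over the second one, and read it again."""
    with open(path, "rb") as inp:          # binary mode: lengths are bytes
        first = inp.readline()
        second = inp.readline()
        # Move back by the length of the record just read, measured
        # from the current position (whence = os.SEEK_CUR).
        inp.seek(-len(second), os.SEEK_CUR)
        again = inp.readline()
    return first, second, again

# Demonstration on a throwaway file (contents are hypothetical).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"alpha\nbeta\ngamma\n")
    demo_path = tmp.name

result = reread_last_line(demo_path)
os.remove(demo_path)
print(result)
```

The second and third values should be the same record, since the seek undoes exactly one readline().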

Sep 12 '07 #3
I would prefer to use something with seek. I am not able to use seek()
with "for line in inp". Using tell and seek does not seem to do
anything in the code. When I try to do

for line in inp.readlines():
    # Top of Loop
    if not condition in line:
        do_something
    else:
        for lines in inp.readlines():
            if not condition:
                do_something
            else:
                break
        pos = inp.tell()
        inp.seek(pos)   # <-- This line has no effect in the program

Not sure if I am missing something very basic. Also, the previous line
needs to be used at the position I call # Top of Loop.

Thanks
Sep 12 '07 #4
> for line in inp.readlines():

If you are now using readlines() instead of readline(), then
a) it is only called once, to read all the data into a container
b) you can access each element/line by its relative number

data = open(filename, "r").readlines()
for eachline in data:    ## iterate over data, not readlines()

so try
print data[0]   ## first rec
print data[9]   ## 10th rec, etc.

You can use
ctr = 0
for eachline in data:
    ## do something
    if ctr > 0:
        print "this line is", eachline   ## or data[ctr]
        print "prev_line = ", data[ctr-1]
    ctr += 1

or a slightly different way
stop = len(data)
ctr = 0
while ctr < stop:
    ## do something
    if ctr > 0:
        this_line = data[ctr]
        prev_line = data[ctr-1]
    ctr += 1
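
Applied to the nested loop from the original question, the index-based approach could be sketched like this (Python 3). The markers "START" and "END" and the function name walk are placeholders for whatever conditions the real script uses; the point is only that the inner loop stops without consuming the line that ended it, so the outer loop handles that line next.

```python
def walk(lines, start_marker, end_marker):
    """Return a log of which loop handled which line."""
    events = []
    i = 0
    while i < len(lines):
        line = lines[i]
        events.append(("outer", line))      # the "top of loop" operations
        i += 1
        if start_marker in line:
            # Inner loop: consume lines until the end marker shows up.
            while i < len(lines) and end_marker not in lines[i]:
                events.append(("inner", lines[i]))
                i += 1
            # i still points at the end-marker line, so the outer loop
            # processes it on its next pass ("go back one line").
    return events

sample = ["a", "START", "b", "c", "END", "d"]
events = walk(sample, "START", "END")
for who, text in events:
    print(who, text)
```

No seek or tell is needed here because nothing is read from the file after readlines(); "going back" is just a matter of not advancing the counter.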

Sorry, I don't use file.seek() so can't help there

Sep 13 '07 #5
On Wed, 12 Sep 2007 17:28:08 -0500, Shankarjee Krishnamoorthi wrote:

> I would prefer to use something with seek.

Writing Perl in any language?

> I am not able to use seek()
> with "for line in inp". Use tell and seek does not seem to do anything
> with the code. When I try to do
>
> for line in inp.readlines():

readlines() reads the whole file at once, so inp.tell() will give the
position at the end of the file from now on.

>     # Top of Loop
>     if not condition in line:
>         do_something
>     else:
>         for lines in inp.readlines():
>             if not condition:
>                 do_something
>             else:
>                 break
>         pos = inp.tell()
>         inp.seek(pos)   # <-- This line has no effect in the program
>
> Not sure if I am missing something very basic. Also the previous line
> needs to be used in the position I call # Top of Loop.

If you want to use seek/tell you can't iterate over the file directly
because

for line in inp:
# ...

reads ahead to make that iteration highly efficient -- so you will often
get a position further ahead than the end of the current line.

But you can use readline() (which doesn't read ahead) in conjunction with
tell/seek; just replace all occurrences of

for line in inp:
# ...

with

for line in iter(inp.readline, ""):
# ...

Peter
Sep 13 '07 #6
Dennis Lee Bieber wrote:

> for line in inp:
>
> will read one line at a time (I'm fairly sure the iterator doesn't
> attempt to buffer multiple lines behind the scenes)

You are wrong:

>>> open("tmp.txt", "w").writelines("%s\n" % (9*c) for c in "ABCDE")
>>> instream = open("tmp.txt")
>>> for line in instream:
...     print instream.tell(), line.strip()
...
50 AAAAAAAAA
50 BBBBBBBBB
50 CCCCCCCCC
50 DDDDDDDDD
50 EEEEEEEEE
>>>

Here's the workaround:

>>> instream = open("tmp.txt")
>>> for line in iter(instream.readline, ""):
...     print instream.tell(), line.strip()
...
10 AAAAAAAAA
20 BBBBBBBBB
30 CCCCCCCCC
40 DDDDDDDDD
50 EEEEEEEEE
>>>
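
The session above is Python 2. Under Python 3 the same experiment behaves a little differently for text files: calling tell() inside a plain "for line in instream" loop raises OSError instead of reporting a read-ahead position, while the readline-based iterator still works. A rough sketch that checks both, assuming a throwaway file with the same five records:

```python
import os
import tempfile

# Five 10-byte records, written in binary mode so the newline is a
# single byte on every platform.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"".join(bytes([c]) * 9 + b"\n" for c in b"ABCDE"))
    path = tmp.name

# 1) Direct iteration: tell() is refused inside the loop in Python 3.
tell_failed = False
with open(path, "r", newline="") as instream:
    for line in instream:
        try:
            instream.tell()
        except OSError:
            tell_failed = True
        break

# 2) The readline-based iterator keeps tell() usable.
offsets = []
with open(path, "r", newline="") as instream:
    for line in iter(instream.readline, ""):
        offsets.append((instream.tell(), line.strip()))

os.remove(path)
print(tell_failed)
print(offsets)
```

Either way the conclusion is the same as in the session: use iter(instream.readline, "") when the loop needs tell() or seek().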
Peter
Sep 13 '07 #7
Great. That worked for me. I had some of my routines implemented in
Perl earlier. Now that I started using Python I am trying to do all my
automation scripts with Python. Thanks a ton

Jee

Sep 13 '07 #8
