Parsing ascii file

Hello,

I have a file that contains the following data (example) and does NOT have
any line feeds:

11 22 33 44 55 66 77 88 99 00 aa bb cc
dd ....to 128th byte 11 22 33 44 55 66 77 88 99
00 aa bb cc dd .... and so on

Record 1 starts at 0 and finishes at 128, record 2 starts at 129 and
finishes at 256, and so on. There can be as many as 5000 records per file. I
would like to parse the file, retrieve the value of the field at bytes 64-65,
and conduct an arithmetical operation on that field (sum them all up).

Can I do this with python?

If I were to use awk, it would look something like this:

cat <filename> | fold -w 128 | awk '{ SUM = SUM + substr($0,64,2) } END { print SUM }'

Regards
Dean
Jul 18 '05 #1

Is it an ASCII or a binary file? I'm not entirely sure from your description.
In the following I assume binary data, but it should be easy to modify the
value() function if those two bytes are ASCII digits.

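import struct, sys
from itertools import imap

def fold(instream, width=80):
    # yield consecutive fixed-width chunks until the stream is exhausted
    while 1:
        line = instream.read(width)
        if not line: break
        yield line

def value(line, start=64): # may be an "off by one" bug
    # for ASCII digits use this instead:
    # return int(line[start:start+2])
    # "h" reads a signed 16-bit integer in native byte order
    return struct.unpack("h", line[start:start+2])[0]

if __name__ == "__main__":
    try:
        filename = sys.argv[1]
    except IndexError:
        instream = sys.stdin
    else:
        # binary mode, so the bytes are read unchanged on every platform
        instream = open(filename, "rb")

    print sum(imap(value, fold(instream, 128)))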
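
For readers on Python 3, here is a minimal sketch of the same fold-and-sum idea. It is not the code posted above: the names RECORD_WIDTH, FIELD_START, binary_value, ascii_value and sum_field are made up for illustration, the little-endian byte order is an assumption, and the 0-based offset 64 carries the same possible off-by-one as the original.

```python
import io
import struct

RECORD_WIDTH = 128  # record size taken from the question
FIELD_START = 64    # 0-based offset (assumption); use 63 if the field is counted from 1


def fold(stream, width=RECORD_WIDTH):
    """Yield consecutive fixed-width chunks from a binary stream."""
    while True:
        chunk = stream.read(width)
        if not chunk:
            break
        yield chunk


def binary_value(record, start=FIELD_START):
    """Read two bytes as a signed 16-bit little-endian integer (assumed order).

    Raises struct.error if a trailing record is too short to hold the field.
    """
    return struct.unpack("<h", record[start:start + 2])[0]


def ascii_value(record, start=FIELD_START):
    """Read two bytes as ASCII decimal digits, e.g. b"42" gives 42."""
    return int(record[start:start + 2])


def sum_field(stream, value=binary_value):
    """Sum the chosen field over every record in the stream."""
    return sum(value(record) for record in fold(stream))


if __name__ == "__main__":
    # Two synthetic 128-byte records whose field holds the digits "12" and "30".
    def make_record(field):
        return b"." * 64 + field + b"." * 62

    demo = io.BytesIO(make_record(b"12") + make_record(b"30"))
    print(sum_field(demo, ascii_value))  # 12 + 30 = 42
```

For a real file, something like sum_field(open(path, "rb"), ascii_value) would apply, with the value function chosen to match how the field is actually stored.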

Peter

Jul 18 '05 #2

You can use stdin.read(128) to get consecutive records and slicing to extract
the fields. Something like:

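from sys import stdin
total = 0  # named total to avoid shadowing the built-in sum()
while True:
    record = stdin.read(128)
    if not record: break
    # awk's substr($0,64,2) is 1-based, so it maps to the slice [63:65]
    total += int(record[63:65])
print total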
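
One thing to watch when porting the awk one-liner: awk's substr() counts from 1 and takes a length, while a Python slice counts from 0 and excludes its end index, so a slice such as [64:65] yields a single character. A small sketch of the correspondence, written for Python 3, where awk_substr is a hypothetical helper and not part of any library:

```python
def awk_substr(text, start, length):
    """Mimic awk's substr(s, start, length) for start >= 1 (1-based start)."""
    return text[start - 1:start - 1 + length]


record = "0123456789"

# substr(record, 4, 2) covers the 4th and 5th characters.
print(awk_substr(record, 4, 2))  # 34
# The equivalent Python slice starts one lower and ends at start - 1 + length.
print(record[3:5])               # 34
# A slice whose bounds differ by one is a single character, not a two-character field.
print(record[4:5])               # 4
```

By the same rule, substr($0,64,2) on a 128-character record corresponds to record[63:65].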

Frankly, I'd stick with the awk version unless it's a pedagogical exercise.
Actually, I'd go further and have a script that simply sums up all the numbers
in the input, and add 'cut' to the pipeline to extract the columns first.
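
As a rough illustration of that idea in Python 3, the summing step could look like the sketch below. The function name sum_numbers and the file name sumall.py are hypothetical, and the sketch assumes every whitespace-separated token is a decimal integer (int() raises ValueError otherwise).

```python
import sys


def sum_numbers(lines):
    """Sum every whitespace-separated integer token found in the given lines."""
    total = 0
    for line in lines:
        for token in line.split():
            total += int(token)
    return total


# To use this as a pipeline filter, end the script with:
#     print(sum_numbers(sys.stdin))
```

Saved as sumall.py with that final line added, it would slot into a pipeline such as: fold -w 128 <filename> | cut -c64-65 | python sumall.py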

Eddie
Jul 18 '05 #3
