Extract data from ASCII file

Ren
Suppose I have a file containing several lines similar to this:

:10000000E7280530AC00A530AD00AD0B0528AC0BE2

The data I want to extract are 8 hexadecimal strings, the first of
which is E728, like this:

:10000000 E728 0530 AC00 A530 AD00 AD0B 0528 AC0B E2

Also, the bytes in the string are reversed. The E728 needs to be 28E7,
0530 needs to be 3005 and so on.

I can do this in C++ and Pascal, but it seems like Python may be more
suited for the task.

How is this accomplished using Python?
Jul 18 '05 #1
With Python 2.3:
>>> def splitter( line ):
...     line = line[9:] # skip prefix
...     while line:
...         prefix, line = line[:4],line[4:]
...         yield prefix[2:]+prefix[:2]
...
>>> for number in splitter( ':10000000E7280530AC00A530AD00AD0B0528AC0BE2'):
...     print number
...
28E7
3005
00AC
30A5
00AD
0BAD
2805
0BAC
E2

If you want to convert the hexadecimal strings to actual integers, use
int( prefix, 16 ).
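
A minimal sketch of that conversion, written for Python 3 rather than the 2.3 used in the session above. The function name `swapped_words_as_ints` and its `offset`/`count` parameters are made up for illustration; only `int(text, 16)` comes from the post.

```python
def swapped_words_as_ints(line, offset=9, count=8):
    """Return `count` 16-bit values, swapping the two bytes of each 4-character group."""
    values = []
    for i in range(offset, offset + 4 * count, 4):
        word = line[i:i + 4]
        swapped = word[2:] + word[:2]      # 'E728' -> '28E7'
        values.append(int(swapped, 16))    # '28E7' -> 0x28E7
    return values

line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"
values = swapped_words_as_ints(line)
print(values[:2])                          # [10471, 12293]
print(["%04X" % v for v in values])        # back to 4-digit hex strings
```

Formatting the integers back with `"%04X"` is a quick way to check that nothing was lost in the round trip.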

HTH,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/

Jul 18 '05 #2
Ren wrote:
Suppose I have a file containing several lines similar to this:

:10000000E7280530AC00A530AD00AD0B0528AC0BE2


Say the file is called data.txt
Try this:
---------------------------------
def process(line):
    line=line[9:]
    result=[]
    for i in range(0,32,4):
        result.append( line[i+2:i+4] + line[i:i+2] )
    return result

for line in open("data.txt"):
    print process(line)
---------------------------------
For your single example data line, it prints
['28E7', '3005', '00AC', '30A5', '00AD', '0BAD', '2805', '0BAC']

It's a list containing the 8 extracted hexadecimal strings.
Instead of printing the list you can do whatever you want with it.
If you need more info, just ask.
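
The loop above stops after eight groups, so the trailing E2 is not returned. One possible reading, not stated anywhere in this thread, is that the line is an Intel HEX record: in that format the last byte is a checksum chosen so that all bytes after the colon add up to 0 modulo 256. The Python 3 sketch below only tests that assumption against the sample line; `record_sums_to_zero` is a hypothetical helper name.

```python
def record_sums_to_zero(line):
    """True if all bytes after ':' (checksum included) add up to 0 modulo 256."""
    raw = bytes.fromhex(line.strip()[1:])   # drop ':' and any newline, decode the hex pairs
    return sum(raw) % 256 == 0

line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"
print(record_sums_to_zero(line))            # True
```

If the check holds for every line in the file, treating E2 as a checksum rather than as data is probably safe; if not, the assumption does not apply.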

--Irmen de Jong
Jul 18 '05 #3
Ren,
If you go here:

http://www.python.org/doc/current/tu...00000000000000

about half way down the page it talks about string slicing.
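
For readers who do not follow the link, here is a short Python 3 sketch of the slicing operations used throughout this thread, applied to the sample line. The variable names are illustrative only.

```python
line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"

header = line[:9]            # characters 0..8: ':10000000'
first = line[9:13]           # the first 4-character group: 'E728'
swapped = first[2:] + first[:2]   # last two characters first: '28E7'
tail = line[-2:]             # negative indices count from the end: 'E2'

print(header, first, swapped, tail)
```

A slice `s[a:b]` includes index `a` and excludes index `b`; leaving out either bound means "from the start" or "to the end".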

wes

Jul 18 '05 #4
The first response only works with python-2.3 (yield is a newly
reserved word).

The second response did not work for me and left off the last couple
values.

You might want to try this. It iterates down the list, grabbing two
characters at a time, reversing them and appending them to a list. It
also allows a second list argument to store the first 8 digits
(mutable lists are passed by reference)

-------------------------------------------------------
from types import *

def process(line,key):
    """ Pass in a string type (line) and
    an empty list to store the key """
    if type(key) is ListType and key == []:
        key.append(line[1:9])  # the 8 digits after the leading ':'
    else:
        print "Key not ListType or not empty"
    result=[]
    line=line[9:]
    while line:
        k2,k1 = line[:2],line[2:4]
        line=line[4:]
        result.append(k1+k2)
    return result
-------------------------------------------------------
Jul 18 '05 #5
1. Use FIELDWIDTHS in (GNU) Awk.

2. Use string slice in Python.

3. Use variable operation in (Bash) shell.

--
William Park, Open Geometry Consulting, <op**********@yahoo.ca>
Linux solution for data management and processing.
Jul 18 '05 #6
How is this accomplished using Python?


Check the struct documentation.
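
A minimal Python 3 sketch of what the struct approach might look like. It assumes the 32 hex characters after the 9-character header are 16 raw bytes holding eight little-endian 16-bit values; `bytes.fromhex` is Python 3 only, so this is not what would have run on the Python of the time.

```python
import struct

line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"

raw = bytes.fromhex(line[9:41])        # 32 hex characters -> 16 bytes
words = struct.unpack("<8H", raw)      # '<' little-endian, '8H' eight unsigned 16-bit ints

print(["%04X" % w for w in words])
```

Reading the bytes as little-endian does the byte swap implicitly: the bytes E7 28 become the integer 0x28E7.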

- Josiah
Jul 18 '05 #7
el*******@bah.com (eleyg) wrote:
:10000000E7280530AC00A530AD00AD0B0528AC0BE2

The data I want to extract are 8 hexadecimal strings, the first of
which is E728, like this:

:10000000 E728 0530 AC00 A530 AD00 AD0B 0528 AC0B E2

Also, the bytes in the string are reversed. The E728 needs to be 28E7,
0530 needs to be 3005 and so on.


The first response only works with python-2.3 (yield is a newly
reserved word).

The second response did not work for me and left off the last couple
values.


The third response uses typechecking and stores a value in an
unreachable place ...

Maybe the feachur-less code is better (tested very lightly):

def asBytes(line,offset):
    """ split a line into 2-char chunks, starting at offset"""
    res = []
    for i in range(offset,len(line),2):
        res.append(line[i:i+2])
    return res

def asWords(line,offset=0,swapbytes=0):
    """split a line into words that have maximally 4 chars,
    starting at offset, optionally swapping 2-char chunks"""
    res = []
    flip = 0
    for b in asBytes(line,offset):
        if flip:
            if swapbytes:
                res.append(b+prev)
            else:
                res.append(prev+b)
        else:
            prev = b
        flip = 1-flip
    if flip:
        res.append(b)
    return res

def test():
    line =":10000000E7280530AC00A530AD00AD0B0528AC0BE2"
    print asWords(line,offset=9,swapbytes=1)

if __name__=='__main__':
    test()

output is:

['28E7', '3005', '00AC', '30A5', '00AD', '0BAD', '2805', '0BAC', 'E2']

Anton
Jul 18 '05 #8
Ren
What is 'prefix' used for? I searched the docs and didn't come up with
anything that seemed appropriate.
"Mike C. Fletcher" <mc******@rogers.com> wrote in message news:<ma**************************************@pyt hon.org>...
With Python 2.3:
>>> def splitter( line ):
...     line = line[9:] # skip prefix
...     while line:
...         prefix, line = line[:4],line[4:]
...         yield prefix[2:]+prefix[:2]
...
>>> for number in splitter( ':10000000E7280530AC00A530AD00AD0B0528AC0BE2'):
...     print number
...

............snip...............
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/

Jul 18 '05 #9
Ren wrote:
What is 'prefix' used for? I searched the docs and didn't come up with
anything that seemed appropriate.


Umm... it's just a variable name :-)

--Irmen
Jul 18 '05 #10
Ren wrote:
What is 'prefix' used for? I searched the docs and didn't come up with
anything that seemed appropriate.

It's just the name (variable) I used to store the "prefix" of the rest
of the line. It could just as easily have been called "vlad", but using
simple, descriptive names for variables makes the code easier to read
(in most cases, this being the obvious counter-example). In Python when
you assign to something:

x, y = v, t

you are creating a (possibly new) bound name (if something of the same
name exists in a higher namespace it is shadowed by this bound name, so
even if there was a built-in function called "prefix" my assignment to
the name would have shadowed the name).

This line here says:

prefix, line = line[:4],line[4:]

that is, assign the name "prefix" to the result of slicing the line from
the starting index to index 4, and assign the name "line" to the result
of slicing from index 4 to the ending index. Under the covers the
right-hand-side of the expression is creating a two-element tuple, then
that tuple is unpacked to assign its elements to the two variables on
the left-hand-side.
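
The packing and unpacking described above can be spelled out in two steps. This is only an illustrative Python 3 sketch; the intermediate name `pair` is not part of the original code.

```python
line = "E7280530AC00"

pair = (line[:4], line[4:])    # the right-hand side builds a two-element tuple
prefix, line = pair            # the tuple is unpacked into the two names

print(prefix)                  # E728
print(line)                    # 0530AC00

# The same mechanism gives the familiar swap idiom:
a, b = 1, 2
a, b = b, a
print(a, b)                    # 2 1
```

Because the whole right-hand side is evaluated before any name is rebound, `line[4:]` still sees the old value of `line`.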

Python is a fairly small language, if a linguistic construct works a
particular way in one context it *normally* works that way in every
context (unless the programmer explicitly changes that (and that's
generally *only* done by meta-programmers seeking to create
domain-specific functionality, and even then as a matter of style, it's
kept to a minimum to avoid confusing people (and in this particular
case, AFAIK there's no way to override variable assignment (though (evil
;) ) people have proposed adding such a hook on numerous occasions)))).

The later line is simply manipulating the (string) object now referred
to as "prefix":

result.append( prefix[2:]+prefix[:2] )

that is, take the result of slicing from index 2 to the end and add it
to the result of slicing from the start to index 2. This has the effect
of swapping the two 2-character hexadecimal encodings of the bytes.

Oh, and since someone took issue with my use of (new in Python 2.2)
yield (luddites :) ;) ), here's a non-generator version using the same
basic code pattern:
>>> def splitter( line ):
...     line = line[9:] # skip prefix
...     result = []
...     while line:
...         prefix, line = line[:4],line[4:]
...         result.append( prefix[2:]+prefix[:2] )
...     return result
...
>>> splitter( ':10000000E7280530AC00A530AD00AD0B0528AC0BE2')
['28E7', '3005', '00AC', '30A5', '00AD', '0BAD', '2805', '0BAC', 'E2']

Have fun :) ,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/

Jul 18 '05 #11
"Mike C. Fletcher" <mc******@rogers.com> wrote:
Oh, and since someone took issue with my use of (new in Python 2.2)
yield (luddites :) ;) ), here's a non-generator version using the same
basic code pattern:
>>> def splitter( line ):
...     line = line[9:] # skip prefix
...     result = []
...     while line:
...         prefix, line = line[:4],line[4:]
...         result.append( prefix[2:]+prefix[:2] )
...     return result


The basic problem with this code pattern is that it makes a lot of
large slices of the line. With a small line there is no problem but it
looks like it doesn't scale well.
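
One way to make the scaling concern concrete is to count how many characters each approach copies. The Python 3 sketch below uses two hypothetical helpers; it counts characters only and does not measure time, and it assumes every slice copies its characters.

```python
def copied_by_reslicing(n):
    """Characters copied when each pass rebinds line to line[4:]."""
    line = "0" * n
    copied = 0
    while line:
        prefix, line = line[:4], line[4:]
        copied += len(prefix) + len(line)   # the 4-char word plus the whole remainder
    return copied

def copied_by_indexing(n):
    """Characters copied when only 4-char words are sliced out by index."""
    line = "0" * n
    copied = 0
    for i in range(0, len(line), 4):
        copied += len(line[i:i + 4])
    return copied

for n in (32, 3200):
    print(n, copied_by_reslicing(n), copied_by_indexing(n))
```

For a 32-character line the difference is negligible, but the re-slicing count grows roughly with the square of the line length while the indexing count grows linearly.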

After reconsidering all alternatives I finally favor a variant of
Irmen's code, but without slicing the whole line and -after all-
definitely *using* yield because it seems appropriate here.

def process(line,offset):
    for i in xrange(offset,len(line),4):
        yield line[i+2:i+4] + line[i:i+2]

def test():
    line = ":10000000E7280530AC00A530AD00AD0B0528AC0BE2"
    print '\n'.join(process(line,9))

if __name__=='__main__':
    test()

output is:

28E7
3005
00AC
30A5
00AD
0BAD
2805
0BAC
E2
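
For anyone running this on Python 3, a hedged port might look like the sketch below. `range` replaces `xrange` and `print` becomes a function. The `rstrip` call is an addition: lines read from a file end with a newline, which would otherwise end up inside the last chunk. The `lines` list stands in for a real file such as `open("data.txt")`.

```python
def process(line, offset=9):
    """Yield byte-swapped 4-character groups from one record line."""
    line = line.rstrip("\r\n")                  # drop the line ending, if any
    for i in range(offset, len(line), 4):
        yield line[i + 2:i + 4] + line[i:i + 2]

# Stand-in for iterating over a file object.
lines = [":10000000E7280530AC00A530AD00AD0B0528AC0BE2\n"]

for line in lines:
    print(" ".join(process(line)))
```

The final chunk has only two characters, so it comes out unchanged as `E2`.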

Anton
Jul 18 '05 #12
