parseline (my favorite simple script): does something similar exist?

One of my all-time favorite scripts is parseline, which is printed
below

    def parseline(line, format):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = []
        words = line.split()
        for i in range(len(format)):
            f = format[i]
            trans = xlat.get(f,'None')
            if trans: result.append(trans(words[i]))
        if len(result) == 0: return None
        if len(result) == 1: return result[0]
        return result

This takes a line of text, splits it, and then applies simple
formatting characters to return different Python types. For example,
given the line

H 0.000 0.000 0.000

I can call parseline(line, 'sfff') and it will return the string 'H',
and three floats. If I wanted to omit the first, I could just call
parseline(line, 'xfff'). If I only wanted the first 0.000, I could call
parseline(line, 'xf'). Clearly I don't do all of my parsing this way,
but I find parseline useful in a surprising number of applications.
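
For anyone who wants to try the three calls described above, here is a minimal Python 3 sketch. It is a transcription of the posted function with two assumptions of mine: the second parameter is renamed `fmt`, and the `get` default is left as `None` rather than the string `'None'`, so an unknown format character is skipped just like `'x'`.

```python
def parseline(line, fmt):
    # Map each format character to a converter; 'x' means "skip this word".
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    result = []
    words = line.split()
    for i in range(len(fmt)):
        trans = xlat.get(fmt[i])  # assumption: default None, not the string 'None'
        if trans:
            result.append(trans(words[i]))
    if len(result) == 0:
        return None
    if len(result) == 1:
        return result[0]  # a single value comes back bare, not in a list
    return result

line = "H 0.000 0.000 0.000"
print(parseline(line, 'sfff'))  # ['H', 0.0, 0.0, 0.0]
print(parseline(line, 'xfff'))  # [0.0, 0.0, 0.0]
print(parseline(line, 'xf'))    # 0.0
```

Note that the return type depends on how many values survive: None, a bare value, or a list.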

I'm posting this here because (1) I'm feeling smug at what a bright
little coder I am, and (2) (in a more realistic and humble frame of
mind) I realize that many many people have probably found solutions to
similar needs, and I'd imagine that many are better than the above. I
would love to hear how other people do similar things.

Rick

Oct 12 '06 #1
"RickMuller" <rp******@gmail.com> writes:
> def parseline(line, format):
>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
>     result = []
>     words = line.split()
>     for i in range(len(format)):
>         f = format[i]
>         trans = xlat.get(f,'None')
>         if trans: result.append(trans(words[i]))
>     if len(result) == 0: return None
>     if len(result) == 1: return result[0]
>     return result

Untested, but maybe more in current Pythonic style:

    def parseline(line, format):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = []
        words = line.split()
        for f,w in zip(format, words):
            trans = xlat[f]
            if trans is not None:
                result.append(trans(w))
        return result

Differences:
- doesn't ignore improper format characters, raises exception instead
- always returns values in a list, including as an empty list if
there's no values
- uses iterator protocol and zip to avoid ugly index variable
and subscripts
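
The differences listed above can be seen in a short Python 3 sketch. The function body follows the post; the parameter name `fmt`, the sample line and the bogus format character `'q'` are made up for illustration.

```python
def parseline(line, fmt):
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    result = []
    for f, w in zip(fmt, line.split()):
        trans = xlat[f]          # KeyError for an unknown format character
        if trans is not None:
            result.append(trans(w))
    return result

line = "H 0.000 0.000 0.000"
print(parseline(line, 'xf'))     # [0.0]  (a one-element list, not a bare float)
print(parseline(line, 'xxxx'))   # []     (an empty list, not None)

try:
    parseline(line, 'sq')
except KeyError as exc:
    print("bad format character:", exc)   # bad format character: 'q'
```

One further difference that the list does not mention: zip stops at the shorter of its two inputs, so a format string longer than the line is silently truncated, where the indexed loop would raise IndexError.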
Oct 12 '06 #2
Hi Rick,

Nice little script indeed !

You probably mean

    trans = xlat.get(f,None)

instead of

    trans = xlat.get(f,'None')

in the case where an invalid format character is supplied. The string
'None' evaluates to True, so that trans(words[i]) raises an exception.
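
A small Python 3 check of that point, as a sketch (the unknown character `'q'` is made up for the example):

```python
xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}

trans = xlat.get('q', 'None')   # 'q' is not a known format character
print(bool(trans))              # True: any non-empty string is truthy

try:
    trans('0.000')              # a str is not callable
except TypeError as exc:
    print("TypeError:", exc)

print(xlat.get('q'))            # None: the default when no fallback is given
```

So with the string default the `if trans:` test passes and the call fails with TypeError, while with a real None the unknown character is quietly skipped.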

A variant, with a list comprehension instead of the for loop :

    def parseline(line, format):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = []
        words = line.split()
        result = [ xlat[f](w) for f,w in zip(format,words)
                   if xlat.get(f,None) ]
        if not result: return None
        if len(result) == 1: return result[0]
        return result

Regards,
Pierre

Oct 12 '06 #3
RickMuller wrote:
> One of my all-time favorite scripts is parseline, which is printed

here is another way to write that:

    def parseline(line, format):
        trans = {'x':lambda x:None,'s':str,'f':float,'d':int,'i':int}
        return [ trans[f](w) for f,w in zip(format, line.split()) ]

    >>> parseline('A 1 22 3 6', 'sdxf')
    ['A', 1, None, 3.0]

I.

Oct 12 '06 #4
> >>> parseline('A 1 22 3 6', 'sdxf')
> ['A', 1, None, 3.0]

Yes, but in this case the OP expects to get ['A',1,3.0]

A shorter version :

    def parseline(line, format):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = [ xlat[f](w) for f,w in zip(format,line.split())
                   if xlat.get(f,None) ]
        if len(result) == 0: return None
        if len(result) == 1: return result[0]
        return result

Pierre

Oct 12 '06 #5
On 2006-10-12, Pierre Quentel <qu************@wanadoo.fr> wrote:
>> >>> parseline('A 1 22 3 6', 'sdxf')
>> ['A', 1, None, 3.0]
>
> Yes, but in this case the OP expects to get ['A',1,3.0]
>
> A shorter version :
>
> def parseline(line, format):
>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
>     result = [ xlat[f](w) for f,w in zip(format,line.split())
>                if xlat.get(f,None) ]
>     if len(result) == 0: return None
>     if len(result) == 1: return result[0]
>     return result

I don't like the name, since it actually seems to be parsing a
string.

--
Neil Cerutti
Oct 12 '06 #6

Rick> def parseline(line, format):
Rick>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
Rick>     result = []
Rick>     words = line.split()
Rick>     for i in range(len(format)):
Rick>         f = format[i]
Rick>         trans = xlat.get(f,'None')
Rick>         if trans: result.append(trans(words[i]))
Rick>     if len(result) == 0: return None
Rick>     if len(result) == 1: return result[0]
Rick>     return result

Note that your setting and testing of the trans variable is problematic. If
you're going to use xlat.get(), either spell None correctly or take the
default:

    trans = xlat.get(f)
    if trans:
        result.append(trans(words[i]))

As Paul indicated though, it would also be better not to silently let
unrecognized format characters pass. I probably wouldn't let KeyError float
up to the caller though:

    if f not in xlat:
        raise ValueError("unrecognized format character %s" % f)
    trans = xlat[f]
    if trans:   # 'x' maps to None and is skipped, not rejected
        result.append(trans(words[i]))

Finally, you might consider doing the splitting outside of this function and
pass in a list. That way you could (for example) easily pass in a row of
values read by the csv module's reader class (untested):

    def format(words, fmt):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = []
        for i in range(len(fmt)):
            f = fmt[i]
            if f not in xlat:
                raise ValueError("unrecognized format character %s" % f)
            trans = xlat[f]
            if trans:   # 'x' maps to None: skip the word rather than reject it
                result.append(trans(words[i]))
        return result
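
The csv idea can be sketched in Python 3 as follows. The name `convert_row` is hypothetical (the post calls the function `format`, which would shadow the built-in), and the sample data is made up. The sketch tests membership in `xlat` so that `'x'` is treated as "skip" while any other unknown character raises ValueError.

```python
import csv
import io

def convert_row(words, fmt):
    # Hypothetical name; takes an already-split row instead of a raw line.
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    result = []
    for f, w in zip(fmt, words):
        if f not in xlat:
            raise ValueError("unrecognized format character %s" % f)
        trans = xlat[f]
        if trans is not None:   # 'x' is a valid character that means "skip"
            result.append(trans(w))
    return result

# Made-up sample data standing in for a real file.
sample = io.StringIO("H,0.000,0.000,0.000\nO,0.000,0.000,1.200\n")
rows = [convert_row(row, 'sfff') for row in csv.reader(sample)]
print(rows)   # [['H', 0.0, 0.0, 0.0], ['O', 0.0, 0.0, 1.2]]
```

Because the function no longer splits anything itself, the same converter works for whitespace-separated lines (pass `line.split()`) and for csv rows.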

Rick> I'm posting this here because (1) I'm feeling smug at what a
Rick> bright little coder I am, and (2) (in a more realistic and humble
Rick> frame of mind) I realize that many many people have probably found
Rick> solutions to similar needs, and I'd imagine that many are better
Rick> than the above. I would love to hear how other people do similar
Rick> things.

It seems quite clever to me.

Skip
Oct 12 '06 #7
Wow! 6 responses in just a few minutes. Thanks for all of the great
feedback!
Oct 12 '06 #8
RickMuller wrote:
> I'm posting this here because (1) I'm feeling smug at what a bright
> little coder I am

if you want to show off, and use a more pythonic interface, you can do
it with a lot fewer lines. here's one example:

    def parseline(line, *types):
        result = [c(x) for (x, c) in zip(line.split(), types) if c] or [None]
        return len(result) != 1 and result or result[0]

    text = "H 0.000 0.000 0.000"

    print(parseline(text, str, float, float, float))
    print(parseline(text, None, float, float, float))
    print(parseline(text, None, float))

etc. and since you know how many items you'll get back from the
function, you might as well go for the one-liner version, and do
the unpacking on the way out:

    def parseline(line, *types):
        return [c(x) for (x, c) in zip(line.split(), types) if c] or [None]

    text = "H 0.000 0.000 0.000"

    [tag, value] = parseline(text, str, float)
    [value] = parseline(text, None, float)
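
A Python 3 sketch of the "unpack on the way out" idea, using the one-liner from the post. The last call is my own addition, there only to show what happens when the number of targets does not match the number of converters.

```python
def parseline(line, *types):
    # Pair each word with its converter; a converter of None skips the word.
    return [c(x) for (x, c) in zip(line.split(), types) if c] or [None]

text = "H 0.000 0.000 0.000"

tag, x, y, z = parseline(text, str, float, float, float)
print(tag, x, y, z)          # H 0.0 0.0 0.0

[value] = parseline(text, None, float)
print(value)                 # 0.0

try:
    [value] = parseline(text, str, float)   # two values, one target
except ValueError as exc:
    print("unpacking failed:", exc)
```

The function always returns a list, and the caller's unpacking doubles as a check that the line had the expected shape.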

</F>

Oct 12 '06 #9

RickMuller wrote:
> One of my all-time favorite scripts is parseline, which is printed
> below
>
> def parseline(line, format):
>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
>     result = []
>     words = line.split()
>     for i in range(len(format)):
>         f = format[i]
>         trans = xlat.get(f,'None')
>         if trans: result.append(trans(words[i]))
>     if len(result) == 0: return None
>     if len(result) == 1: return result[0]
>     return result
>
> This takes a line of text, splits it, and then applies simple
> formatting characters to return different python types. For example,
> given the line
>
> H 0.000 0.000 0.000
>
> I can call parseline(line, 'sfff') and it will return the string 'H',
> and three floats. If I wanted to omit the first, I could just call
> parseline(line, 'xfff'). If I only wanted the first 0.000, I could call
> parseline(line, 'xf').
> [...]
> I would love to hear how other people do similar things.
>
> Rick

    MAP = {'s':str,'f':float,'d':int,'i':int}

    def parseline(line, format, separator=' '):
        '''
        >>> parseline('A 1 2 3 4', 'sdxf')
        ['A', 1, 3.0]
        '''
        mapping = [ (i, MAP[f]) for (i,f) in enumerate(format) if f != 'x' ]
        parts = line.split(separator)
        return [f(parts[i]) for (i,f) in mapping]

    def parseline2(line, format):
        '''
        >>> parseline2('A 1 2 3 4', 'sdxf')
        ['A', 1, 3.0]
        '''
        return [f(line.split()[i]) for (i,f) in [(i, MAP[f]) for (i,f) in enumerate(format) if f != 'x']]

    def parselines(lines, format, separator=' '):
        '''
        >>> lines = [ 'A 1 2 3 4', 'B 5 6 7 8', 'C 9 10 11 12']
        >>> list(parselines(lines, 'sdxf'))
        [['A', 1, 3.0], ['B', 5, 7.0], ['C', 9, 11.0]]
        '''
        mapping = [ (i, MAP[f]) for (i,f) in enumerate(format) if f != 'x' ]
        for line in lines:
            parts = line.split(separator)
            yield [f(parts[i]) for (i,f) in mapping]

    import doctest
    doctest.testmod(verbose=True)
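
One thing worth checking with an explicit `separator=' '` default: `str.split(' ')` does not collapse runs of spaces the way a bare `split()` does, so column-aligned input can shift the indexes. The Python 3 sketch below shows the difference and one possible adjustment, a default of `separator=None`; that default and the sample lines are assumptions of mine, not part of the post.

```python
line = "H   0.000  0.000"

print(line.split())      # ['H', '0.000', '0.000']
print(line.split(' '))   # ['H', '', '', '0.000', '', '0.000']

MAP = {'s': str, 'f': float, 'd': int, 'i': int}

def parseline(line, fmt, separator=None):
    # separator=None (an assumption; the post uses ' ') makes str.split
    # collapse runs of whitespace, as in the original script.
    mapping = [(i, MAP[f]) for (i, f) in enumerate(fmt) if f != 'x']
    parts = line.split(separator)
    return [conv(parts[i]) for (i, conv) in mapping]

print(parseline(line, 'sff'))                   # ['H', 0.0, 0.0]
print(parseline("H,0.000,0.000", 'sxf', ','))   # ['H', 0.0]
```

An explicit separator is still available for delimited data, where empty fields between two delimiters are usually meaningful.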

Oct 12 '06 #10
