parseline (my favorite simple script): does something similar exist?

RickMuller

One of my all-time favorite scripts is parseline, which is printed
below

def parseline(line, format):
    xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
    result = []
    words = line.split()
    for i in range(len(format)):
        f = format[i]
        trans = xlat.get(f,'None')
        if trans: result.append(trans(words[i]))
    if len(result) == 0: return None
    if len(result) == 1: return result[0]
    return result

This takes a line of text, splits it, and then applies simple
formatting characters to return different python types. For example,
given the line

H 0.000 0.000 0.000

I can call parseline(line, 'sfff') and it will return the string 'H',
and three floats. If I wanted to omit the first, I could just call
parseline(line, 'xfff'). If I only wanted the first 0.000, I could call
parseline(line, 'xf'). Clearly I don't do all of my parsing this way,
but I find parseline useful in a surprising number of applications.
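
As a quick check of the behaviour described above, here is a minimal Python 3 sketch. It is a rewrite for illustration, not the script as posted: the parameter name `fmt` is my own choice (it avoids shadowing the built-in `format`), unknown format characters are assumed to be skipped like 'x', and `zip` is used, so a format string longer than the line is silently truncated.

```python
def parseline(line, fmt):
    """Split `line` on whitespace and convert each word according to `fmt`."""
    # 'x' maps to None, meaning "skip this word".
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    result = []
    for code, word in zip(fmt, line.split()):
        convert = xlat.get(code)  # assumption: unknown codes are skipped too
        if convert is not None:
            result.append(convert(word))
    if not result:
        return None
    if len(result) == 1:
        return result[0]
    return result


line = "H 0.000 0.000 0.000"
print(parseline(line, 'sfff'))  # ['H', 0.0, 0.0, 0.0]
print(parseline(line, 'xfff'))  # [0.0, 0.0, 0.0]
print(parseline(line, 'xf'))    # 0.0
```

Note that the return type depends on how many values survive: None, a bare value, or a list.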

I'm posting this here because (1) I'm feeling smug at what a bright
little coder I am, and (2) (in a more realistic and humble frame of
mind) I realize that many many people have probably found solutions to
similar needs, and I'd imagine that many are better than the above. I
would love to hear how other people do similar things.

Rick

Oct 12 '06 #1

Paul Rubin

"RickMuller" <rp******@gmail.com> writes:

> def parseline(line, format):
>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
>     result = []
>     words = line.split()
>     for i in range(len(format)):
>         f = format[i]
>         trans = xlat.get(f,'None')
>         if trans: result.append(trans(words[i]))
>     if len(result) == 0: return None
>     if len(result) == 1: return result[0]
>     return result

Untested, but maybe more in current Pythonic style:

def parseline(line, format):
    xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
    result = []
    words = line.split()
    for f,w in zip(format, words):
        trans = xlat[f]
        if trans is not None:
            result.append(trans(w))
    return result

Differences:
- doesn't ignore improper format characters, raises exception instead
- always returns values in a list, including an empty list if
there are no values
- uses iterator protocol and zip to avoid ugly index variable
and subscripts
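
The differences listed above can be seen in a small Python 3 sketch. The function body follows the version in this post; the parameter name `fmt`, the sample inputs, and the invalid format character 'q' are only illustrative assumptions.

```python
def parseline(line, fmt):
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    result = []
    for code, word in zip(fmt, line.split()):
        convert = xlat[code]  # KeyError for an unknown format character
        if convert is not None:
            result.append(convert(word))
    return result


print(parseline("H 0.000 0.000 0.000", "xf"))  # [0.0]  (a list, not a bare float)
print(parseline("H 0.000 0.000 0.000", "x"))   # []     (an empty list, not None)

try:
    parseline("H 0.000", "sq")  # 'q' is not in the table
except KeyError as exc:
    print("bad format character:", exc)
```

Because the result is always a list, callers can unpack or iterate over it without first checking its type.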

Oct 12 '06 #2

Pierre Quentel

Hi Rick,

Nice little script indeed!

You probably mean

trans = xlat.get(f,None)

instead of

trans = xlat.get(f,'None')

in the case where an invalid format character is supplied. The string
'None' evaluates to True, so that trans(words[i]) raises an exception.
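
A short Python 3 sketch of the point above: the string 'None' is truthy and not callable, whereas the None object makes the `if trans:` test fail. The format character 'q' is just an example of an invalid one.

```python
xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}

trans = xlat.get('q', 'None')  # 'q' is not a valid format character
print(repr(trans))             # 'None' (a string, not the None object)
print(bool(trans))             # True, so the "if trans:" test passes

try:
    trans('0.000')             # tries to call a string
except TypeError as exc:
    print("TypeError:", exc)

print(xlat.get('q', None))     # None, so "if trans:" skips the word
```

Since None is already the default for `dict.get`, `xlat.get(f)` behaves the same as `xlat.get(f, None)`.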

A variant, with a list comprehension instead of the for loop:

def parseline(line, format):
    xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
    result = []
    words = line.split()
    result = [ xlat[f](w) for f,w in zip(format,words)
               if xlat.get(f,None) ]
    if not result: return None
    if len(result) == 1: return result[0]
    return result

Regards,
Pierre

Oct 12 '06 #3

Istvan Albert

RickMuller wrote:

> One of my all-time favorite scripts is parseline, which is printed

here is another way to write that:

def parseline(line, format):
    trans = {'x':lambda x:None,'s':str,'f':float,'d':int,'i':int}
    return [ trans[f](w) for f,w in zip(format, line.split()) ]

>>> parseline('A 1 22 3 6', 'sdxf')
['A', 1, None, 3.0]

I.

Oct 12 '06 #4

Pierre Quentel

> >>> parseline('A 1 22 3 6', 'sdxf')
> ['A', 1, None, 3.0]

Yes, but in this case the OP expects to get ['A',1,3.0]

A shorter version:

def parseline(line, format):
    xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
    result = [ xlat[f](w) for f,w in zip(format,line.split())
               if xlat.get(f,None) ]
    if len(result) == 0: return None
    if len(result) == 1: return result[0]
    return result

Pierre

Oct 12 '06 #5

Neil Cerutti

On 2006-10-12, Pierre Quentel <qu************@wanadoo.fr> wrote:

>> >>> parseline('A 1 22 3 6', 'sdxf')
>> ['A', 1, None, 3.0]
>
> Yes, but in this case the OP expects to get ['A',1,3.0]
>
> A shorter version:
>
> def parseline(line, format):
>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
>     result = [ xlat[f](w) for f,w in zip(format,line.split())
>                if xlat.get(f,None) ]
>     if len(result) == 0: return None
>     if len(result) == 1: return result[0]
>     return result

I don't like the name, since it actually seems to be parsing a
string.

--
Neil Cerutti

Oct 12 '06 #6

skip

Rick> def parseline(line, format):
Rick>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
Rick>     result = []
Rick>     words = line.split()
Rick>     for i in range(len(format)):
Rick>         f = format[i]
Rick>         trans = xlat.get(f,'None')
Rick>         if trans: result.append(trans(words[i]))
Rick>     if len(result) == 0: return None
Rick>     if len(result) == 1: return result[0]
Rick>     return result

Note that your setting and testing of the trans variable is problematic. If
you're going to use xlat.get(), either spell None correctly or take the
default:

trans = xlat.get(f)
if trans:
    result.append(trans(words[i]))

As Paul indicated though, it would also be better not to silently let
unrecognized format characters pass. I probably wouldn't let KeyError float
up to the caller though:

if f not in xlat:
    raise ValueError("unrecognized format character %s" % f)
trans = xlat[f]
if trans:
    result.append(trans(words[i]))

Finally, you might consider doing the splitting outside of this function and
passing in a list. That way you could (for example) easily pass in a row of
values read by the csv module's reader class (untested):

def format(words, fmt):
    xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
    result = []
    for i in range(len(fmt)):
        f = fmt[i]
        if f not in xlat:
            raise ValueError("unrecognized format character %s" % f)
        trans = xlat[f]
        if trans:
            result.append(trans(words[i]))
    return result
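
To make the csv suggestion concrete, here is a rough Python 3 sketch. The function name `convert_fields` and the sample data are hypothetical, and `io.StringIO` stands in for an open file. The sketch checks membership in the table first, so 'x' is skipped while any other unknown character raises ValueError.

```python
import csv
import io


def convert_fields(words, fmt):
    """Convert an already-split list of strings according to `fmt`."""
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    result = []
    for code, word in zip(fmt, words):
        if code not in xlat:
            raise ValueError("unrecognized format character %s" % code)
        convert = xlat[code]
        if convert is not None:  # 'x' means "skip this field"
            result.append(convert(word))
    return result


# Hypothetical sample data; a real program would pass an open file instead.
data = io.StringIO("H,0.000,0.000,0.000\nO,0.000,0.000,1.200\n")
rows = [convert_fields(row, 'sfff') for row in csv.reader(data)]
print(rows)  # [['H', 0.0, 0.0, 0.0], ['O', 0.0, 0.0, 1.2]]
```

Keeping the splitting outside the function means the same converter works for whitespace-separated lines (`line.split()`) and for csv rows.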

Rick> I'm posting this here because (1) I'm feeling smug at what a
Rick> bright little coder I am, and (2) (in a more realistic and humble
Rick> frame of mind) I realize that many many people have probably found
Rick> solutions to similar needs, and I'd imagine that many are better
Rick> than the above. I would love to hear how other people do similar
Rick> things.

It seems quite clever to me.

Skip

Oct 12 '06 #7

RickMuller

Wow! 6 responses in just a few minutes. Thanks for all of the great
feedback!

Oct 12 '06 #8

Fredrik Lundh

RickMuller wrote:

> I'm posting this here because (1) I'm feeling smug at what a bright
> little coder I am

if you want to show off, and use a more pythonic interface, you can do
it with a lot fewer lines. here's one example:

def parseline(line, *types):
    result = [c(x) for (x, c) in zip(line.split(), types) if c] or [None]
    return len(result) != 1 and result or result[0]

text = "H 0.000 0.000 0.000"

print(parseline(text, str, float, float, float))
print(parseline(text, None, float, float, float))
print(parseline(text, None, float))

etc. and since you know how many items you'll get back from the
function, you might as well go for the one-liner version, and do
the unpacking on the way out:

def parseline(line, *types):
    return [c(x) for (x, c) in zip(line.split(), types) if c] or [None]

text = "H 0.000 0.000 0.000"

[tag, value] = parseline(text, str, float)
[value] = parseline(text, None, float)
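
The first version above picks its return value with the `and ... or ...` idiom. As a sketch only, the same selection can be written with a conditional expression (available since Python 2.5); the rest follows the code above, run here under Python 3.

```python
def parseline(line, *types):
    # Pair each word with its converter; a converter of None skips the word.
    result = [c(x) for x, c in zip(line.split(), types) if c] or [None]
    # Conditional expression in place of "len(result) != 1 and result or result[0]".
    return result[0] if len(result) == 1 else result


text = "H 0.000 0.000 0.000"
print(parseline(text, str, float, float, float))   # ['H', 0.0, 0.0, 0.0]
print(parseline(text, None, float, float, float))  # [0.0, 0.0, 0.0]
print(parseline(text, None, float))                # 0.0
print(parseline(text, None))                       # None
```

The `or [None]` fallback is what makes the last call return None: an empty result list is replaced by a one-element list before the length test.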

</F>

Oct 12 '06 #9

Gerard Flanagan

RickMuller wrote:

> One of my all-time favorite scripts is parseline, which is printed
> below
>
> def parseline(line, format):
>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
>     result = []
>     words = line.split()
>     for i in range(len(format)):
>         f = format[i]
>         trans = xlat.get(f,'None')
>         if trans: result.append(trans(words[i]))
>     if len(result) == 0: return None
>     if len(result) == 1: return result[0]
>     return result
>
> This takes a line of text, splits it, and then applies simple
> formatting characters to return different python types. For example,
> given the line
>
> H 0.000 0.000 0.000
>
> I can call parseline(line, 'sfff') and it will return the string 'H',
> and three floats. If I wanted to omit the first, I could just call
> parseline(line, 'xfff'). If I only wanted the first 0.000, I could call
> parseline(line, 'xf').
>
> [...]
>
> I would love to hear how other people do similar things.
>
> Rick

MAP = {'s':str,'f':float,'d':int,'i':int}

def parseline( line, format, separator=' '):
    '''
    >>> parseline('A 1 2 3 4', 'sdxf')
    ['A', 1, 3.0]
    '''
    mapping = [ (i, MAP[f]) for (i,f) in enumerate(format) if f != 'x' ]
    parts = line.split(separator)
    return [f(parts[i]) for (i,f) in mapping]

def parseline2( line, format):
    '''
    >>> parseline2('A 1 2 3 4', 'sdxf')
    ['A', 1, 3.0]
    '''
    return [f(line.split()[i]) for (i,f) in [(i, MAP[f]) for (i,f) in
            enumerate(format) if f != 'x']]

def parselines(lines, format, separator=' '):
    '''
    >>> lines = [ 'A 1 2 3 4', 'B 5 6 7 8', 'C 9 10 11 12']
    >>> list(parselines(lines, 'sdxf'))
    [['A', 1, 3.0], ['B', 5, 7.0], ['C', 9, 11.0]]
    '''
    mapping = [ (i, MAP[f]) for (i,f) in enumerate(format) if f != 'x' ]
    for line in lines:
        parts = line.split(separator)
        yield [f(parts[i]) for (i,f) in mapping]

import doctest
doctest.testmod(verbose=True)

Oct 12 '06 #10
