One of my all-time favorite scripts is parseline, which is printed
below
    def parseline(line,format):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = []
        words = line.split()
        for i in range(len(format)):
            f = format[i]
            trans = xlat.get(f,'None')
            if trans: result.append(trans(words[i]))
        if len(result) == 0: return None
        if len(result) == 1: return result[0]
        return result
This takes a line of text, splits it, and then applies simple
formatting characters to return different python types. For example,
given the line
H 0.000 0.000 0.000
I can call parseline(line,'sfff') and it will return the string 'H',
and three floats. If I wanted to omit the first, I could just call
parseline(line,'xfff'). If I only wanted the first 0.000, I could call
parseline(line,'xf'). Clearly I don't do all of my parsing this way,
but I find parseline useful in a surprising number of applications.
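For anyone who wants to try the calls described above, here is a minimal Python 3 sketch. It follows the listing closely but assumes a plain `xlat.get(...)` lookup (default `None`) rather than the `'None'` string default, so it is an approximation of the posted function, not a copy of it.

```python
def parseline(line, format):
    # Map each format character to a converter; 'x' means "skip this word".
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    result = []
    words = line.split()
    for i in range(len(format)):
        # Assumption: .get() with its default of None, not the string 'None'.
        trans = xlat.get(format[i])
        if trans:
            result.append(trans(words[i]))
    if len(result) == 0:
        return None
    if len(result) == 1:
        return result[0]
    return result

line = "H 0.000 0.000 0.000"
print(parseline(line, 'sfff'))  # ['H', 0.0, 0.0, 0.0]
print(parseline(line, 'xfff'))  # [0.0, 0.0, 0.0]
print(parseline(line, 'xf'))    # 0.0
```

Note that the return type depends on the format string: a list, a single bare value, or None when nothing is selected.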
I'm posting this here because (1) I'm feeling smug at what a bright
little coder I am, and (2) (in a more realistic and humble frame of
mind) I realize that many, many people have probably found solutions to
similar needs, and I'd imagine that many are better than the above. I
would love to hear how other people do similar things.
Rick
"RickMuller" <rp******@gmail.com> writes:
> def parseline(line,format):
>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
>     result = []
>     words = line.split()
>     for i in range(len(format)):
>         f = format[i]
>         trans = xlat.get(f,'None')
>         if trans: result.append(trans(words[i]))
>     if len(result) == 0: return None
>     if len(result) == 1: return result[0]
>     return result
Untested, but maybe more in current Pythonic style:
    def parseline(line,format):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = []
        words = line.split()
        for f,w in zip(format, words):
            trans = xlat[f]
            if trans is not None:
                result.append(trans(w))
        return result
Differences:
- doesn't ignore improper format characters, raises exception instead
- always returns values in a list, including an empty list if
  there are no values
- uses iterator protocol and zip to avoid ugly index variable
  and subscripts
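A small Python 3 sketch of those differences in action, using the zip-based version with its indentation restored. The sample inputs are made up for illustration. One further consequence, not in the list above, is that zip stops at the shorter input, so a format string longer than the line no longer raises IndexError.

```python
def parseline(line, format):
    xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}
    result = []
    for f, w in zip(format, line.split()):
        trans = xlat[f]              # KeyError on an unknown format character
        if trans is not None:
            result.append(trans(w))
    return result

print(parseline("H 0.000 0.000 0.000", 'xf'))  # [0.0]  (a list, not a bare float)
print(parseline("H 0.000", 'x'))               # []     (not None)

try:
    parseline("H 0.000", 'sq')                 # 'q' is not a format character
except KeyError as exc:
    print("bad format character:", exc)

# zip truncates to the shorter sequence, so the extra 'ff' is ignored.
print(parseline("H 0.000", 'sfff'))            # ['H', 0.0]
```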
Hi Rick,
Nice little script indeed!
You probably mean
    trans = xlat.get(f,None)
instead of
    trans = xlat.get(f,'None')
in the case where an invalid format character is supplied. The string
'None' evaluates to True, so that trans(words[i]) raises an exception.
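The point about the string default can be checked directly. This is a short Python 3 sketch; the format character 'q' is just an arbitrary invalid one chosen for the demonstration.

```python
xlat = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}

trans = xlat.get('q', 'None')      # 'q' is not a valid format character
print(repr(trans), bool(trans))    # 'None' True

try:
    trans('0.000')                 # calling a str object fails
except TypeError as exc:
    print("TypeError:", exc)

print(xlat.get('q', None))         # None
print(xlat.get('q'))               # None (None is already the default)
```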
A variant, with a list comprehension instead of the for loop:
    def parseline(line,format):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = []
        words = line.split()
        result = [ xlat[f](w) for f,w in zip(format,words)
                   if xlat.get(f,None) ]
        if not result: return None
        if len(result) == 1: return result[0]
        return result
Regards,
Pierre
RickMuller wrote:
> One of my all-time favorite scripts is parseline, which is printed
here is another way to write that:
    def parseline(line, format):
        trans = {'x':lambda x:None,'s':str,'f':float,'d':int,'i':int}
        return [ trans[f](w) for f,w in zip(format, line.split() ) ]
    >>> parseline( 'A 1 22 3 6', 'sdxf')
    ['A', 1, None, 3.0]
I.
> >>> parseline( 'A 1 22 3 6', 'sdxf')
> ['A', 1, None, 3.0]
Yes, but in this case the OP expects to get ['A',1,3.0]
A shorter version:
    def parseline(line,format):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = [ xlat[f](w) for f,w in zip(format,line.split())
                   if xlat.get(f,None) ]
        if len(result) == 0: return None
        if len(result) == 1: return result[0]
        return result
Pierre
On 2006-10-12, Pierre Quentel <qu************@wanadoo.fr> wrote:
>> >>> parseline( 'A 1 22 3 6', 'sdxf')
>> ['A', 1, None, 3.0]
> Yes, but in this case the OP expects to get ['A',1,3.0]
> A shorter version:
>     def parseline(line,format):
>         xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
>         result = [ xlat[f](w) for f,w in zip(format,line.split())
>                    if xlat.get(f,None) ]
>         if len(result) == 0: return None
>         if len(result) == 1: return result[0]
>         return result
I don't like the name, since it actually seems to be parsing a
string.
--
Neil Cerutti
Rick> def parseline(line,format):
Rick>     xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
Rick>     result = []
Rick>     words = line.split()
Rick>     for i in range(len(format)):
Rick>         f = format[i]
Rick>         trans = xlat.get(f,'None')
Rick>         if trans: result.append(trans(words[i]))
Rick>     if len(result) == 0: return None
Rick>     if len(result) == 1: return result[0]
Rick>     return result
Note that your setting and testing of the trans variable is problematic. If
you're going to use xlat.get(), either spell None correctly or take the
default:
    trans = xlat.get(f)
    if trans:
        result.append(trans(words[i]))
As Paul indicated though, it would also be better not to silently let
unrecognized format characters pass. I probably wouldn't let KeyError float
up to the caller though:
    trans = xlat.get(f)
    if trans:
        result.append(trans(words[i]))
    else:
        raise ValueError("unrecognized format character %s" % f)
Finally, you might consider doing the splitting outside of this function and
passing in a list. That way you could (for example) easily pass in a row of
values read by the csv module's reader class (untested):
    def format(words, fmt):
        xlat = {'x':None,'s':str,'f':float,'d':int,'i':int}
        result = []
        for i in range(len(fmt)):
            f = fmt[i]
            trans = xlat.get(f)
            if trans:
                result.append(trans(words[i]))
            else:
                raise ValueError("unrecognized format character %s" % f)
        return result
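A possible Python 3 sketch of that idea, fed from csv.reader. One caveat with the untested snippet as posted: 'x' maps to None, so the `if trans: ... else: raise` test would reject 'x' along with genuinely unknown characters. The version below checks membership first. The name `convert` and the sample data are invented for illustration (and `convert` avoids shadowing the built-in `format`).

```python
import csv
import io

XLAT = {'x': None, 's': str, 'f': float, 'd': int, 'i': int}

def convert(words, fmt):
    """Convert an already-split row according to fmt (hypothetical helper)."""
    result = []
    for f, w in zip(fmt, words):
        if f not in XLAT:
            raise ValueError("unrecognized format character %s" % f)
        trans = XLAT[f]
        if trans is not None:      # 'x' is valid and means "skip this field"
            result.append(trans(w))
    return result

# Sample data made up for this example.
data = io.StringIO("H,0.000,0.000,0.000\nO,0.000,0.000,1.100\n")
rows = [convert(row, 'sxxf') for row in csv.reader(data)]
print(rows)  # [['H', 0.0], ['O', 1.1]]
```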
Rick> I'm posting this here because (1) I'm feeling smug at what a
Rick> bright little coder I am, and (2) (in a more realistic and humble
Rick> frame of mind) I realize that many many people have probably found
Rick> solutions to similar needs, and I'd imagine that many are better
Rick> than the above. I would love to hear how other people do similar
Rick> things.
It seems quite clever to me.
Skip
Wow! 6 responses in just a few minutes. Thanks for all of the great
feedback!
RickMuller wrote:
> I'm posting this here because (1) I'm feeling smug at what a bright
> little coder I am
if you want to show off, and use a more pythonic interface, you can do
it with a lot fewer lines. here's one example:
    def parseline(line, *types):
        result = [c(x) for (x, c) in zip(line.split(), types) if c] or [None]
        return len(result) != 1 and result or result[0]

    text = "H 0.000 0.000 0.000"
    print(parseline(text, str, float, float, float))
    print(parseline(text, None, float, float, float))
    print(parseline(text, None, float))
etc. and since you know how many items you'll get back from the
function, you might as well go for the one-liner version, and do
the unpacking on the way out:
    def parseline(line, *types):
        return [c(x) for (x, c) in zip(line.split(), types) if c] or [None]

    text = "H 0.000 0.000 0.000"
    [tag, value] = parseline(text, str, float)
    [value] = parseline(text, None, float)
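The `len(result) != 1 and result or result[0]` line in the first version is the old and/or idiom. A sketch of the same function in Python 3 with a conditional expression (available since Python 2.5) may make the intent easier to read; this is a restatement for illustration, not the posted code.

```python
def parseline(line, *types):
    result = [c(x) for (x, c) in zip(line.split(), types) if c] or [None]
    # Conditional expression in place of "len(result) != 1 and result or result[0]".
    return result if len(result) != 1 else result[0]

text = "H 0.000 0.000 0.000"
print(parseline(text, str, float, float, float))  # ['H', 0.0, 0.0, 0.0]
print(parseline(text, None, float))               # 0.0
print(parseline(text, None))                      # None

# The and/or form only works when the middle operand is truthy:
print(True and [] or "fallback")                  # fallback
print([] if True else "fallback")                 # []
```

In the posted version the idiom is safe, because the `or [None]` guarantees that `result` is never empty, so whenever its length is not 1 it is a truthy list.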
</F>
RickMuller wrote:
> [...]
> I would love to hear how other people do similar things.
> Rick
    MAP = {'s':str,'f':float,'d':int,'i':int}

    def parseline( line, format, separator=' '):
        '''
        >>> parseline('A 1 2 3 4', 'sdxf')
        ['A', 1, 3.0]
        '''
        mapping = [ (i, MAP[f]) for (i,f) in enumerate(format) if f != 'x' ]
        parts = line.split(separator)
        return [f(parts[i]) for (i,f) in mapping]

    def parseline2( line, format):
        '''
        >>> parseline2('A 1 2 3 4', 'sdxf')
        ['A', 1, 3.0]
        '''
        return [f(line.split()[i]) for (i,f) in [(i, MAP[f]) for (i,f) in
                enumerate(format) if f != 'x']]

    def parselines(lines, format, separator=' '):
        '''
        >>> lines = [ 'A 1 2 3 4', 'B 5 6 7 8', 'C 9 10 11 12']
        >>> list(parselines(lines, 'sdxf'))
        [['A', 1, 3.0], ['B', 5, 7.0], ['C', 9, 11.0]]
        '''
        mapping = [ (i, MAP[f]) for (i,f) in enumerate(format) if f != 'x' ]
        for line in lines:
            parts = line.split(separator)
            yield [f(parts[i]) for (i,f) in mapping]

    import doctest
    doctest.testmod(verbose=True)
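One thing worth knowing about the `separator=' '` default above: `str.split(' ')` does not collapse repeated spaces the way a bare `split()` does, so column-aligned input can shift the indices. A minimal Python 3 sketch follows, where the `separator=None` default is an assumed variation and not part of the posted code.

```python
MAP = {'s': str, 'f': float, 'd': int, 'i': int}

line = "A  1 2   3 4"             # note the repeated spaces

print(line.split())               # ['A', '1', '2', '3', '4']
print(line.split(' '))            # ['A', '', '1', '2', '', '', '3', '4']

def parseline(line, format, separator=None):
    # Assumption: separator=None, so split() collapses runs of whitespace.
    mapping = [(i, MAP[f]) for (i, f) in enumerate(format) if f != 'x']
    parts = line.split(separator)
    return [f(parts[i]) for (i, f) in mapping]

print(parseline(line, 'sdxf'))    # ['A', 1, 3.0]
```

With `separator=' '` the same call would try `int('')` on the empty field and raise ValueError.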
Amazing! There were lots of great suggestions in response to my original
post, but this is my favorite.
Rick
Fredrik Lundh wrote:
> if you want to show off, and use a more pythonic interface, you can do
> it with a lot fewer lines. here's one example:
> [...]