split parameter line with quotes

Hello,

First, I'm a newbie to Python (but I searched the Internet, I swear).
I'm looking for some way to split a string into a list of 'key=value'
pairs. The code should be able to handle this particular example string:

qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess

I know I can do that with a regexp (I'm currently trying to learn
them), but if there's some other way...

thanks
Jan 11 '08 #1
This is unconventional, and using eval is not safe either.

>>> s = 'qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset="utf-8",algorithm="md5-sess"'
>>> d = eval('dict(%s)' % s)
>>> d.items()
[('algorithm', 'md5-sess'), ('maxbuf', 1024), ('charset', 'utf-8'),
 ('cipher', 'rc4-40,rc4-56,rc4,des,3des'), ('qop', 'auth,auth-int,auth-conf')]
>>> for k, v in d.iteritems(): print k, '=', v
...
algorithm = md5-sess
maxbuf = 1024
charset = utf-8
cipher = rc4-40,rc4-56,rc4,des,3des
qop = auth,auth-int,auth-conf

For safe eval, take a look at http://aspn.activestate.com/ASPN/Coo.../Recipe/364469
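
For comparison, here is a minimal eval-free sketch in Python 3. It is not the recipe linked above. It assumes that values are either double-quoted strings or runs of characters without commas, and the names PAIR_RE and parse_directives are made up for illustration. A regular expression extracts each key and raw value, and ast.literal_eval converts only the values that are valid Python literals, so unquoted values such as utf-8 are kept as plain strings.

```python
import ast
import re

# Assumed grammar: key, '=', then a double-quoted string or a run of non-commas.
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|[^,]*)')

def parse_directives(text):
    """Return a dict of key/value pairs without calling eval (illustrative helper)."""
    result = {}
    for key, raw in PAIR_RE.findall(text):
        try:
            # literal_eval accepts only literals (strings, numbers, ...),
            # so it cannot run arbitrary code the way eval can.
            result[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Unquoted text such as utf-8 or md5-sess stays a plain string.
            result[key] = raw
    return result

sample = ('qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",'
          'maxbuf=1024,charset=utf-8,algorithm=md5-sess')
directives = parse_directives(sample)
```

With this approach it does not matter whether charset arrives quoted or unquoted, and maxbuf comes back as an int.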

-N
Jan 11 '08 #2
Take a look at the shlex module. You might be able to fiddle with the shlex
object and convince it to split on the commas. But, to be honest, the string
above would be a lot easier to parse if the dividing commas were spaces
instead.

j
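
One way the "split on the commas" idea might look in Python 3 is sketched below. It treats the comma as the only whitespace character and turns on whitespace_split, so everything between unquoted commas becomes one token; posix mode strips the double quotes. The function name split_pairs and the sample variable are made up for illustration.

```python
import shlex

def split_pairs(text):
    """Split 'k=v,k="a,b"' style text on unquoted commas (illustrative helper)."""
    lexer = shlex.shlex(text, posix=True)  # posix mode removes the quotes
    lexer.whitespace = ','                 # commas are the only separators
    lexer.whitespace_split = True          # split on separators only
    lexer.commenters = ''                  # do not treat '#' as a comment
    return dict(token.split('=', 1) for token in lexer)

sample = ('qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",'
          'maxbuf=1024,charset=utf-8,algorithm=md5-sess')
params = split_pairs(sample)
```

All values come back as strings here, so maxbuf would still need an int() call if a number is wanted.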
Jan 11 '08 #3
On 11 jan, 20:28, Nanjundi <nanju...@gmail.com> wrote:
> This is unconventional, and using eval is not safe either.
> >>> s = 'qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset="utf-8",algorithm="md5-sess"'
> >>> d = eval('dict(%s)' % s)
> >>> d.items()

Thanks for that. The problem is that I don't have charset="utf-8" but
charset=utf-8. Sometimes the values are quoted, sometimes not!
Jan 11 '08 #4
Those quoted strings sure are pesky when you try to split along
commas. Here is a solution using pyparsing - note the argument field
access methods at the bottom. Also, the parse action attached to
integer will do conversion of the string to an int at parse time.

More info on pyparsing at http://pyparsing.wikispaces.com.

-- Paul

from pyparsing import Word, nums, alphas, quotedString, \
    delimitedList, Literal, CharsNotIn, Dict, Group, \
    removeQuotes

arg = '''qop="auth,auth-int,auth-conf",
cipher="rc4-40,rc4-56,rc4,des,3des",
maxbuf=1024,charset=utf-8,algorithm=md5-sess'''

# format is: delimited list of key=value groups, where value is
# a quoted string, an integer, or a non-quoted string up to the next
# ',' character
key = Word(alphas)
EQ = Literal("=").suppress()
integer = Word(nums).setParseAction(lambda t: int(t[0]))
quotedString.setParseAction(removeQuotes)
other = CharsNotIn(",")
val = quotedString | integer | other

# parse each key=val arg into its own group
argList = delimitedList(Group(key + EQ + val))
args = argList.parseString(arg)

# print the parsed results
print args.asList()
print

# add dict-like retrieval capabilities, by wrapping in a Dict expression
argList = Dict(delimitedList(Group(key + EQ + val)))
args = argList.parseString(arg)

# print the modified results, using dump() (shows dict entries too)
print args.dump()

# access the values by key name
print "Keys =", args.keys()
print "cipher =", args["cipher"]

# or can access them like attributes of an object
print "maxbuf =", args.maxbuf

Prints:

[['qop', 'auth,auth-int,auth-conf'], ['cipher', 'rc4-40,rc4-56,rc4,des,3des'], ['maxbuf', 1024], ['charset', 'utf-8'], ['algorithm', 'md5-sess']]

[['qop', 'auth,auth-int,auth-conf'], ['cipher', 'rc4-40,rc4-56,rc4,des,3des'], ['maxbuf', 1024], ['charset', 'utf-8'], ['algorithm', 'md5-sess']]
- algorithm: md5-sess
- charset: utf-8
- cipher: rc4-40,rc4-56,rc4,des,3des
- maxbuf: 1024
- qop: auth,auth-int,auth-conf
Keys = ['maxbuf', 'cipher', 'charset', 'algorithm', 'qop']
cipher = rc4-40,rc4-56,rc4,des,3des
maxbuf = 1024
Jan 11 '08 #5
The problem is that you are using commas for delimiters at two
different levels.

I would start by replacing the commas between quotation marks with
some other delimiter, such as spaces or semicolons. To do that, step
through each character and keep a count of quotation marks. While the
count is odd, replace each comma with the selected alternative
delimiter. While the count is even, leave the comma. [An alternative
would be to replace the commas outside the quotation marks.]

Once that is done, the problem is straightforward. Split the string on
commas (using string.split(",")). Then split each item in the list by
"=". Use the [0] element for the key, and use the [1] element for the
value (first stripping off the quotation marks if necessary). If you
need to further split each of the values, just split on whatever
delimiter you chose to replace the commas.
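
The steps above can be sketched in Python 3 roughly as follows. The semicolon as the replacement delimiter and the names parse_params and inner_sep are arbitrary choices for illustration, and the sketch assumes the values never contain a semicolon or an escaped quotation mark.

```python
def parse_params(text, inner_sep=';'):
    """Quote-counting parser following the description above (illustrative)."""
    chars = []
    quote_count = 0
    for ch in text:
        if ch == '"':
            quote_count += 1
        elif ch == ',' and quote_count % 2 == 1:
            # Odd count means we are inside quotation marks:
            # swap the comma for the alternative delimiter.
            ch = inner_sep
        chars.append(ch)

    params = {}
    # Only top-level commas are left, so a plain split is now safe.
    for item in ''.join(chars).split(','):
        key, value = item.split('=', 1)
        # Strip the quotation marks, then split the inner list.
        params[key.strip()] = value.strip().strip('"').split(inner_sep)
    return params

sample = ('qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",'
          'maxbuf=1024,charset=utf-8,algorithm=md5-sess')
parsed = parse_params(sample)
```

Every value ends up as a list of strings, including single-item ones such as maxbuf, which keeps later handling uniform.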
Jan 11 '08 #6
One more point. Whoever chose the structure of the string you are
parsing didn't do a very good job. If you know that person, you should
tell him or her to use different delimiters at the different levels.
Use commas for one level, and spaces or semicolons for the other
level. Then you won't have to "correct" the string before you parse
it.
Jan 11 '08 #7
Wow! This shlex module is perfect! Thanks for pointing it out!

Jan 11 '08 #8
I know this is a bad design of sorts, but the problem is that I
receive this string from a Jabber server and cannot do anything to
change it. I should still try to verify whether it is a correct
implementation of the Jabber protocol...

Jan 11 '08 #9
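Here's the solution I have for the moment:

import shlex

t = shlex.shlex(data)
# let these characters be part of a word instead of separate tokens
t.wordchars = t.wordchars + "/+.-"
r = ''
while 1:
    token = t.get_token()
    if not token:
        break
    # rebuild the string with spaces in place of the top-level commas
    if not token == ',':
        r = r + token
    else:
        r = r + ' '
self.DEBUG(r, 'ok')
for pair in r.split(' '):
    key, value = pair.split('=', 1)
    print(key + ':' + value)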

I know this is still not perfect, but I've come a long way from very
bad PHP habits! :o)
And thanks for your help!

Jan 11 '08 #10
Maybe you'd like this:

>>> x = 'qop = "auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des, 3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess'
>>> dict(zip([k[-1].strip() for k in (j.split(',') for j in ''.join(i
... for i in x if i != '"').split('='))][:-1], [k[:-1] or k for k in
... (j.split(',') for j in ''.join(i for i in x if i != '"').split('='))][1:]))
{'maxbuf': ['1024'], 'cipher': ['rc4-40', 'rc4-56', 'rc4', 'des', ' 3des'], 'charset': ['utf-8'], 'algorithm': ['md5-sess'], 'qop': [' auth', 'auth-int', 'auth-conf']}

Jan 14 '08 #11
