"Boštjan Jerko" <bo***********@mf.uni-lj.si> wrote in message
news:87************@bostjan-pc.mf.uni-lj.si...
Hello !
I am trying to understand pyparsing. Here is a little test program
to check Optional subclass:
from pyparsing import Word,nums,Literal,Optional
lbrack=Literal("[").suppress()
rbrack=Literal("]").suppress()
ddot=Literal(":").suppress()
start = Word(nums+".")
step = Word(nums+".")
end = Word(nums+".")
sequence=lbrack+start+Optional(ddot+step)+ddot+end +rbrack
tokens = sequence.parseString("[0:0.1:1]")
print tokens
tokens1 = sequence.parseString("[1:2]")
print tokens1
It works for the first string, but an error is raised for
the second string ("[1:2]"). I don't get it. I did use
Optional for ddot and step, so I assumed they were optional.
Any hints what I am doing wrong?
The versions are pyparsing 1.1.2 and Python 2.3.3.
Thanks,
B.
Bostjan -
Here's how pyparsing is processing your input strings:
[0:0.1:1]
[ = lbrack
0 = start
:0.1 = ddot + step (Optional match)
: = ddot
1 = end
] = rbrack
[1:2]
[ = lbrack
1 = start
:2 = ddot + step (Optional match)
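] = oops! expected ddot -> failure
Optional matches whenever it can, and pyparsing does not back up and retry
with the Optional skipped, so in "[1:2]" the ":2" is consumed as ddot + step
and nothing is left for the mandatory ddot + end.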
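To see this for yourself, here is a minimal sketch of the same grammar. It
assumes a current Python 3 and a recent pyparsing rather than the 1.1.2
release from the question, and try_parse is just a hypothetical helper name
for this example:

```python
from pyparsing import Word, nums, Literal, Optional, ParseException

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()
ddot = Literal(":").suppress()
number = Word(nums + ".")

# Same shape as the grammar in the question: start, optional step, then end.
sequence = lbrack + number + Optional(ddot + number) + ddot + number + rbrack


def try_parse(text):
    """Return the parsed tokens as a list, or None if the grammar rejects text."""
    try:
        return sequence.parseString(text).asList()
    except ParseException:
        return None


three_part = try_parse("[0:0.1:1]")  # Optional consumes ":0.1", the rest matches
two_part = try_parse("[1:2]")        # Optional consumes ":2", then ddot is missing
print(three_part)
print(two_part)
```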
Dang Griffith proposed one alternative construct; here's another, perhaps
more explicit:
lbrack + start + ( ( ddot + step + ddot + end ) | (ddot + end) ) + rbrack
Note that the order of the inner construct is important, so as to not match
ddot+end before trying ddot+step+ddot+end; '|' takes the first alternative
that matches, not the longest one, creating a MatchFirst object from
pyparsing's class library. You could avoid this confusion by using '^',
which generates an Or object:
lbrack + start + ( (ddot + end) ^ ( ddot + step + ddot + end ) ) + rbrack
This will evaluate both subconstructs, and choose the one that matches the
longer stretch of input.
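As a rough sketch of the two alternatives side by side (again assuming a
recent pyparsing under Python 3; the names first_match, longest and results
are made up for this example):

```python
from pyparsing import Word, nums, Literal

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()
ddot = Literal(":").suppress()
start = Word(nums + ".")
step = Word(nums + ".")
end = Word(nums + ".")

# MatchFirst ('|'): the longer alternative has to be listed first.
first_match = lbrack + start + ((ddot + step + ddot + end) | (ddot + end)) + rbrack

# Or ('^'): both alternatives are tried and the longer match is kept,
# so the order in which they are written does not matter here.
longest = lbrack + start + ((ddot + end) ^ (ddot + step + ddot + end)) + rbrack

results = {}
for text in ("[0:0.1:1]", "[1:2]"):
    results[text] = (
        first_match.parseString(text).asList(),
        longest.parseString(text).asList(),
    )
    print(text, results[text])
```

Both grammars should accept both strings. The '^' form costs a little extra
work, since every alternative is attempted before one is chosen.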
Or you can use another pyparsing helper, the delimited list:
lbrack + delimitedList( Word(nums+"."), delim=":") + rbrack
This implicitly suppresses delimiters, so that all you will get back are
["0","0.1","1"] in the first case and ["1","2"] in the second.
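A small sketch of the delimited-list version, assuming a pyparsing release
that still exports the delimitedList name (I believe newer releases also
offer it under other spellings, so check your installed version):

```python
from pyparsing import Word, nums, Literal, delimitedList

lbrack = Literal("[").suppress()
rbrack = Literal("]").suppress()

# One or more numbers separated by ":"; the delimiters are dropped from the output.
sequence = lbrack + delimitedList(Word(nums + "."), delim=":") + rbrack

three_part = sequence.parseString("[0:0.1:1]").asList()
two_part = sequence.parseString("[1:2]").asList()
single = sequence.parseString("[5]").asList()
print(three_part, two_part, single)
```

Keep in mind this grammar is looser than the original: it accepts any number
of colon-separated values, including just one, so check the length of the
result if only the two- and three-part forms are valid for you.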
Happy pyparsing!
-- Paul