By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
444,035 Members | 1,384 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 444,035 IT Pros & Developers. It's quick & easy.

pyparsing and svg

P: n/a
Hi - I have been trying, but I need some help:
Here's my test code to parse an SVG path element, can anyone steer me right?

d1="""
M 209.12237 , 172.2415
L 286.76739 , 153.51369
L 286.76739 , 275.88534
L 209.12237 , 294.45058
L 209.12237 , 172.2415
z """

#Try it with no enters
d1="""M 209.12237,172.2415 L 286.76739,153.51369 L 286.76739,275.88534 L
209.12237,294.45058 L 209.12237,172.2415 z """

#Try it with no spaces
d1="""M209.12237,172.2415L286.76739,153.51369L286. 76739,275.88534L209.12237,294.45058L209.12237,172. 2415z"""

#For later, has more commands
d2="""
M 269.78326 , 381.27104
C 368.52151 , 424.27023
90.593578 , -18.581883
90.027729 , 129.28708
C 89.461878 , 277.15604
171.04501 , 338.27184
269.78326 , 381.27104
z """

## word :: M, L, C, Z
## number :: group of numbers
## dot :: "."
## comma :: ","
## couple :: number dot number comma number dot number
## command :: word
##
## phrase :: command couple

from pyparsing import Word, Literal, alphas, nums, Optional, OneOrMore

command = Word("MLCZ")
comma = Literal(",")
dot = Literal(".")
float = nums + dot + nums
couple = float + comma + float

phrase = OneOrMore(command + OneOrMore(couple | ( couple + couple ) ) )

print phrase

print phrase.parseString(d1.upper())

Thanks
\d

Nov 8 '07 #1
Share this Question
Share on Google+
2 Replies


P: n/a
On Nov 8, 3:14 am, Donn Ingle <donn.in...@gmail.comwrote:
float = nums + dot + nums
Should be:

float = Combine(Word(nums) + dot + Word(nums))

nums is a string that defines the set of numeric digits for composing
Word instances. nums is not an expression by itself.

For that matter, I see in your later tests that some values have a
leading minus sign, so you should really go with:

float = Combine(Optional("-") + Word(nums) + dot + Word(nums))

Some other comments:

1. Read up on the Word class, you are not using it quite right.

command = Word("MLCZ")

will work with your test set, but it is not the form I would choose.
Word(characterstring) will match any "word" made up of the characters
in the input string. So Word("MLCZ") will match
M
L
C
Z
MM
LC
MCZL
MMLCLLZCZLLM

I would suggest instead using:

command = Literal("M") | "L" | "C" | "Z"

or

command = oneOf("M L C Z")

2. Change comma to

comma = Literal(",").suppress()

The comma is important to the parsing process, but the ',' token is
not much use in the returned set of matched tokens, get rid of it (by
using suppress).

3. Group your expressions, such as

couple = Group(float + comma + float)

It will really simplify getting at the resulting parsed tokens.
4. What is the purpose of (couple + couple)? This is sufficient:

phrase = OneOrMore(command + Group(OneOrMore(couple)) )

(Note use of Group to return the coord pairs as a sublist.)
5. Results names!

phrase = OneOrMore(command("command") + Group(OneOrMore(couple))
("coords") )

will allow you to access these fields by name instead of by index.
This will make your parser code *way* more readable.
-- Paul

Nov 8 '07 #2

P: n/a
Paul McGuire wrote:
On Nov 8, 3:14 am, Donn Ingle <donn.in...@gmail.comwrote:

>float = nums + dot + nums

Should be:

float = Combine(Word(nums) + dot + Word(nums))

nums is a string that defines the set of numeric digits for composing
Word instances. nums is not an expression by itself.

For that matter, I see in your later tests that some values have a
leading minus sign, so you should really go with:

float = Combine(Optional("-") + Word(nums) + dot + Word(nums))
I have a working path data parser (in pyparsing) at
http://code.google.com/p/wxpsvg.

Parsing the numeric values initially gave me a lot of trouble - I
translated the BNF in the spec literally and there was a *ton* of
backtracking going on with every numeric value. I ended up using a more
generous grammar, and letting pythons float() reject invalid values.

I couldn't get repeating path elements (like M 100 100 200 200, which is
the same as M 100 100 M 200 200) working right in the grammar, so I
expand those with post-processing.

The parser itself can be seen at
http://wxpsvg.googlecode.com/svn/trunk/svg/pathdata.py
Some other comments:

1. Read up on the Word class, you are not using it quite right.

command = Word("MLCZ")

will work with your test set, but it is not the form I would choose.
Word(characterstring) will match any "word" made up of the characters
in the input string. So Word("MLCZ") will match
M
L
C
Z
MM
LC
MCZL
MMLCLLZCZLLM

I would suggest instead using:

command = Literal("M") | "L" | "C" | "Z"

or

command = oneOf("M L C Z")

2. Change comma to

comma = Literal(",").suppress()

The comma is important to the parsing process, but the ',' token is
not much use in the returned set of matched tokens, get rid of it (by
using suppress).

3. Group your expressions, such as

couple = Group(float + comma + float)

It will really simplify getting at the resulting parsed tokens.
4. What is the purpose of (couple + couple)? This is sufficient:

phrase = OneOrMore(command + Group(OneOrMore(couple)) )

(Note use of Group to return the coord pairs as a sublist.)
5. Results names!

phrase = OneOrMore(command("command") + Group(OneOrMore(couple))
("coords") )

will allow you to access these fields by name instead of by index.
This will make your parser code *way* more readable.
-- Paul

Nov 8 '07 #3

This discussion thread is closed

Replies have been disabled for this discussion.