Parsing nested constructs

Hi,

I need to parse some source with nested braces, like this:
>cut-------------
{
{item1}
{
{item2}
{item3}
}
}
>cut-------------

In fact, I'd like to get the start index of every item and its end index
(or length).

I know regexps are rather limited for this type of problem.
I'd prefer not to use an external module.

What would you suggest?

Thanks.
Sep 8 '07 #1
On 9/8/07, tool69 <ki**@free.fr> wrote:
> Hi,
>
> I need to parse some source with nested parenthesis, like this :

If this is exactly how your data looks, then how about a loop which
searches for "{item" and the following "}"? You can use the "find"
string method for that.
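
As a rough sketch of that first approach, a loop over str.find might look like the following. The function name find_items, its parameters, and the sample string are illustrative assumptions, not something from the original post; it also assumes an item never contains a "}" of its own.

```python
def find_items(text, opener="{item", closer="}"):
    """Return (start, end) index pairs, end exclusive, for each item."""
    spans = []
    pos = 0
    while True:
        start = text.find(opener, pos)
        if start == -1:
            break  # no more items
        end = text.find(closer, start + len(opener))
        if end == -1:
            break  # unterminated item; stop scanning
        spans.append((start, end + len(closer)))
        pos = end + len(closer)
    return spans


# Sample input modelled on the data in the question.
data = "{\n{item1}\n{\n{item2}\n{item3}\n}\n}\n"

for start, end in find_items(data):
    print(start, end - start, data[start:end])
```

Each pair gives the start index and the index just past the closing brace, so the length is simply end - start.
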

Otherwise, if the items don't look exactly like "{item", but the
formatting is otherwise exactly the same as above, then look for lines
that have both "{" and "}" on them, then get the string between them.

Otherwise, if the "{", the "}" and the item aren't always on the same
line, then go through the string character by character and keep track
of each "{" and "}" you encounter. When you reach a "}" and the most
recent brace before it was a "{" (i.e., not a "}"), the text between
that "{" and the "}" is an item.
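
A minimal sketch of this character-by-character scan, using only the standard library, could look like the following. The name brace_spans and the shape of its return value are assumptions made for illustration. It keeps a stack of open-brace positions so that it can also report the enclosing groups, and it marks a pair as a leaf when the brace seen just before its "}" was a "{", which is the rule described above.

```python
def brace_spans(text):
    """Return (start, end, depth, is_leaf) for every matched {...} pair.

    start is the index of the "{", end is one past the matching "}",
    depth is the nesting level (0 for the outermost pair), and is_leaf
    is True when the pair contains no nested braces.
    """
    spans = []
    stack = []         # indexes of the currently open "{"
    last_brace = None  # most recent brace character seen
    for i, ch in enumerate(text):
        if ch == "{":
            stack.append(i)
            last_brace = ch
        elif ch == "}":
            if not stack:
                raise ValueError(f"unmatched '}}' at index {i}")
            start = stack.pop()
            spans.append((start, i + 1, len(stack), last_brace == "{"))
            last_brace = ch
    if stack:
        raise ValueError(f"unclosed '{{' at index {stack[-1]}")
    return spans


# Sample input modelled on the data in the question.
data = "{\n{item1}\n{\n{item2}\n{item3}\n}\n}\n"

for start, end, depth, is_leaf in brace_spans(data):
    if is_leaf:
        print(start, end - start, data[start + 1:end - 1])
```

Filtering on is_leaf yields only the innermost items; dropping the filter gives the start and end of every group, including the outer ones.
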
Sep 8 '07 #2
Hi David,

Thanks for answering. I'll go with the last option, since my items are
not always on a single line and may contain a lot of other content.

cheers,

6TooL9
Sep 8 '07 #3
Well, it is an external module, but pyparsing makes this pretty
straightforward:

from pyparsing import Forward, Group, Suppress, Word, ZeroOrMore, alphanums

data = """
{
{item1}
{
{item2}
{item3}
}

}
"""

# define delimiters, but suppress them from the output
LBRACE, RBRACE = map(Suppress, "{}")

# forward-declare the recursive items list
items = Forward()

# items is zero or more words of letters and digits, or an embedded
# group enclosed in braces
items <<= ZeroOrMore(Word(alphanums) | Group(LBRACE + items + RBRACE))

# parse the input string, and print out the results
print(items.parseString(data))

"""
prints:
[[['item1'], [['item2'], ['item3']]]]

or:
[
[
['item1'],
[
['item2'],
['item3']
]
]
]
"""

-- Paul

Sep 9 '07 #4
Paul McGuire wrote:
> Well, it is an external module, but pyparsing makes this pretty
> straightforward:
>
> [snip delightful parsing]

Again pyparsing to the rescue :)

I have to do a parsing project in Java right now and I dearly miss
pyparsing. I explained it to the guy I'm working for, and he was pretty
impressed.

Thought that might make you smile.

/W
Sep 9 '07 #5
Wildemar -

Thanks for such a glowing testimonial! It really is a boost of
encouragement when I find projects that are using pyparsing, or see
postings on c.l.py or the tutor list (by people other than me!)
recommending using pyparsing in response to someone's post.

I'm glad you find pyparsing so useful in your Python endeavors.

-- Paul

Sep 9 '07 #6
