problem with pyparsing - suppress

Hi,
I want to learn pyparsing and got stuck with this:
    from pyparsing import *

    body = ZeroOrMore(word)
    begin = Keyword('begin story').suppress()
    end = Keyword('end story').suppress()

    sentence = begin + body + end
    print(sentence.parseString("begin story once upon a time end story"))
I'm getting this error:
ParseException: Expected "end story" (at char 36), (line:1, col:37)

I don't get why. 'end story' is there, isn't it?
Thanks
Mar 2 '10 #1

Glenton
Hi,

Your line 3 should be "body = ZeroOrMore(Word(alphas))", right? Or did you already define word = Word(alphas)?

Anyway, this is a classic "gotcha" in regular expressions. The repetition always takes the longest string it can match.

For example, if you run it with sentence defined as begin + body, you get
['once', 'upon', 'a', 'time', 'end', 'story']. In other words, the "end story" is matched by the body. Then the parser comes to the end of the string and goes, "But where's the 'end story' that was prophesied?" *

It needs something to help it differentiate. D*mmit, man, it's a string-parser, not a mind-reader!!** But if you gave it something to work with:
    from pyparsing import Keyword, Word, ZeroOrMore, alphas

    word = Word(alphas)
    # Word(alphas) cannot match '$', so the body stops before '$end story'
    body = ZeroOrMore(word)
    begin = Keyword('begin story').suppress()
    end = Keyword('$end story').suppress()

    sentence = begin + body + end
    print(sentence.parseString("begin story once upon a time $end story"))
Good luck!

*might be over anthropomorphising the string parser
**might not be an exact Star Trek quote
Mar 3 '10 #2
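
To make the "longest string" behaviour concrete, here is a small sketch using Python's standard `re` module rather than pyparsing. The patterns are illustrative assumptions, not taken from the thread. The first shows a greedy capture swallowing the end tag when nothing is required after it. The second shows that a regex engine will give characters back when the rest of the pattern requires the tag, which is the step a strictly left-to-right parser does not take.

```python
import re

text = "begin story once upon a time end story"

# Greedy capture with nothing required after it: the body swallows the end tag.
greedy = re.match(r"begin story (.*)", text)
print(greedy.group(1).split())
# ['once', 'upon', 'a', 'time', 'end', 'story']

# Same greedy capture, but the pattern requires " end story" afterwards.
# (.*) first grabs everything, the tail fails to match, and the regex
# engine backs up until " end story" fits.
with_tail = re.match(r"begin story (.*) end story$", text)
print(with_tail.group(1).split())
# ['once', 'upon', 'a', 'time']
```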
kc2ine
LOL, it's not a mind reader? Shoot :)

But what if I want the 'end story' ending tag without the dollar sign? :(

Thanks anyway, Glenton.
Mar 4 '10 #3
Glenton
Well, if you know it ends with ' end story', you could just use string slicing.
    from pyparsing import Keyword, Word, ZeroOrMore, alphas

    word = Word(alphas)
    body = ZeroOrMore(word)
    begin = Keyword('begin story').suppress()

    sentence = begin + body

    myString = "begin story once upon a time end story"
    # [:-10] drops the trailing ' end story' (10 characters) before parsing
    print(sentence.parseString(myString[:-10]))
Mar 5 '10 #4
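
The hard-coded `[:-10]` silently cuts the wrong text if the input does not actually end with the tag. A slightly more defensive sketch of the same slicing idea is below. The helper name `strip_end_tag` is hypothetical, and it uses only built-in string methods.

```python
def strip_end_tag(text, tag=" end story"):
    """Return text without the trailing tag; raise if the tag is missing."""
    if not text.endswith(tag):
        raise ValueError("expected text to end with %r" % tag)
    # Slice by the tag's length instead of a hard-coded number.
    return text[:len(text) - len(tag)]


my_string = "begin story once upon a time end story"
trimmed = strip_end_tag(my_string)
print(trimmed)
# begin story once upon a time
```

The trimmed string can then be passed to `sentence.parseString(...)` as in the post above.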
ptmcg
This is a very common issue when learning pyparsing. Pyparsing does not do any right-to-left backtracking like regexes do. It is purely left-to-right. So make sure your repetition does not accidentally include the terminating sentinel value.

See embedded comments below:

    from pyparsing import Keyword, Word, ZeroOrMore, alphas

    # define these up front
    begin = Keyword('begin story').suppress()
    end = Keyword('end story').suppress()
    word = Word(alphas)

    # what you *really* mean by 'body' - you want
    # ZeroOrMore words, as long as they aren't 'end story' -
    # so just say that
    body = ZeroOrMore(~end + word)

    # the rest is just like you had it
    sentence = begin + body + end
    print(sentence.parseString("begin story once upon a time end story"))

prints:

    ['once', 'upon', 'a', 'time']
-- Paul
Mar 9 '10 #5
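
As a rough illustration of why the lookahead matters, here is a toy matcher in plain Python. It is not pyparsing's implementation: it works on whitespace-split tokens rather than characters, and the function name `match_story` and its `use_lookahead` flag are made up for this sketch. It only mimics the "consume left to right, never give anything back" behaviour described above.

```python
def match_story(tokens, use_lookahead):
    """Toy left-to-right matcher for: 'begin story' words... 'end story'."""
    begin, end = ["begin", "story"], ["end", "story"]
    pos = 0
    if tokens[pos:pos + 2] != begin:
        raise ValueError("expected 'begin story' at token %d" % pos)
    pos += 2

    body = []
    while pos < len(tokens):
        # The lookahead plays the role of `~end` in ZeroOrMore(~end + word).
        if use_lookahead and tokens[pos:pos + 2] == end:
            break
        body.append(tokens[pos])
        pos += 1

    # Whatever the loop consumed stays consumed: there is no backing up.
    if tokens[pos:pos + 2] != end:
        raise ValueError("expected 'end story' at token %d" % pos)
    return body


tokens = "begin story once upon a time end story".split()

print(match_story(tokens, use_lookahead=True))
# ['once', 'upon', 'a', 'time']

try:
    match_story(tokens, use_lookahead=False)
except ValueError as exc:
    print(exc)
# expected 'end story' at token 8
```

Without the lookahead, the loop eats 'end' and 'story' as ordinary words and then finds nothing left for the end tag, which corresponds to the original ParseException at the end of the input.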
