Pyparsing: Grammar Suggestion

I am trying to come up with a grammar that describes the following:

record = f1,f2,...,fn END_RECORD
The f(i) must appear in that order.
Any f(i) can be absent (e.g. f1,,f3,f4,,f6 END_RECORD).
The number of f(i)'s can vary. For example, the following are allowed:
f1,f2 END_RECORD
f1,f2,,f4,,f6 END_RECORD

Any suggestions?

Thanks,
Khoa
May 17 '06 #1
pyparsing includes a built-in expression, commaSeparatedList, for just such
a case. Here is a simple pyparsing program to crack your input text:

data = """f1,f2,f3,f4,f5,f6 END_RECORD
f1,f2 END_RECORD
f1,f2,,f4,,f6 END_RECORD"""

from pyparsing import commaSeparatedList

# scanString yields (tokens, start, end) for each match found in the input
for tokens, start, end in commaSeparatedList.scanString(data):
    print(tokens)

This returns:

['f1', 'f2', 'f3', 'f4', 'f5', 'f6 END_RECORD']
['f1', 'f2 END_RECORD']
['f1', 'f2', '', 'f4', '', 'f6 END_RECORD']

Note that consecutive commas in the input return empty strings at the
corresponding places in the results.

Unfortunately, commaSeparatedList embeds its own definition of what is
allowed between commas, and that definition accepts embedded spaces, so
END_RECORD ends up as part of the last field. We could copy the definition
of commaSeparatedList and exclude END_RECORD, but it is simpler just to add
a parse action to commaSeparatedList that removes END_RECORD from the last
list element:

def stripEND_RECORD(s, l, t):
    # s = original string, l = match location, t = matched tokens
    last = t[-1]
    if last.endswith("END_RECORD"):
        # return a copy of t with last element trimmed of "END_RECORD"
        return t[:-1] + [last[:-len("END_RECORD")].rstrip()]
    # falling through returns None, which leaves the tokens unchanged

commaSeparatedList.setParseAction(stripEND_RECORD)
for tokens, start, end in commaSeparatedList.scanString(data):
    print(tokens)

This returns:

['f1', 'f2', 'f3', 'f4', 'f5', 'f6']
['f1', 'f2']
['f1', 'f2', '', 'f4', '', 'f6']

As one of my wife's 3rd graders concluded on a science report - "wah-lah!"
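
For comparison, here is a rough sketch of the other route mentioned above: writing the grammar out explicitly so that END_RECORD is never swallowed by the last field. This is not the actual definition of commaSeparatedList. The names END, field, record and parse_records are made up for illustration, and it assumes each field is a plain run of letters and digits with one record per line. It uses the classic camelCase method names; newer pyparsing releases also offer snake_case equivalents.

```python
from pyparsing import Literal, Optional, Suppress, Word, ZeroOrMore, alphanums

data = """f1,f2,f3,f4,f5,f6 END_RECORD
f1,f2 END_RECORD
f1,f2,,f4,,f6 END_RECORD"""

# Hypothetical names; fields are assumed to be letters/digits only.
END = Literal("END_RECORD")

# A field must not be the END_RECORD marker; an absent field yields "".
field = Optional(~END + Word(alphanums), default="")

# One or more comma-separated fields, then the marker (dropped from the results).
record = field + ZeroOrMore(Suppress(",") + field) + Suppress(END)

def parse_records(text):
    """Parse each non-blank line as one record and return plain lists."""
    return [record.parseString(line).asList()
            for line in text.splitlines() if line.strip()]

parsed = parse_records(data)
for fields in parsed:
    print(fields)
```

The trade-off is that you now own the definition of a field, so quoted strings or embedded spaces would need to be added by hand, whereas the parse action approach gets them from commaSeparatedList for free.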

Python also includes a csv module if this example doesn't work for you, but
you asked for a pyparsing solution, so there it is.
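
In case the csv route is of interest, a minimal sketch using only the standard library might look like the following. The helper name read_records and the MARKER constant are invented for this example, and it assumes one record per line with END_RECORD attached to the last field, as in the sample data.

```python
import csv
import io

MARKER = "END_RECORD"  # assumed record terminator, taken from the sample data

def read_records(text):
    """Split each line on commas and trim the trailing marker from the last field."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # csv.reader yields [] for blank lines
        last = row[-1]
        if last.endswith(MARKER):
            row[-1] = last[:-len(MARKER)].rstrip()
        records.append(row)
    return records

data = """f1,f2,f3,f4,f5,f6 END_RECORD
f1,f2 END_RECORD
f1,f2,,f4,,f6 END_RECORD"""

records = read_records(data)
for fields in records:
    print(fields)
```

As with the pyparsing version, absent fields come back as empty strings in their original positions.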

-- Paul
May 17 '06 #2
