Newbie code review of parsing program Please

len
I have created the following program to read a text file which happens
to be a COBOL file definition. The program then writes a file that is
essentially a list definition, which I can later copy and paste into a
Python program. I will eventually expand the program to also output an
SQL script to create a SQL file in MySQL.

The program still needs a little work; it does not handle the following
items yet:

1. It does not handle OCCURS yet.
2. It does not handle REDEFINE yet.
3. GROUP structures will need work.
4. Does not create SQL script yet.

I anticipate that any files created by this program may need manual
tweaking, but I have a large number of COBOL file definitions which I
may need to work with, and this seemed like a better solution than
typing each list definition and SQL create file script by hand.

What I would like is if some kind soul could review my code and give me
some suggestions on how I might improve it. I think the use of regular
expressions might cut the code down or at least simplify the parsing,
but I'm just starting to read those chapters in the book ;)

*** SAMPLE INPUT FILE ***

000100 FD SALESMEN-FILE
000200 LABEL RECORDS ARE STANDARD
000300 VALUE OF FILENAME IS "SALESMEN".
000400
000500 01 SALESMEN-RECORD.
000600 05 SALESMEN-NO PIC 9(3).
000700 05 SALESMEN-NAME PIC X(30).
000800 05 SALESMEN-TERRITORY PIC X(30).
000900 05 SALESMEN-QUOTA PIC S9(7) COMP.
001000 05 SALESMEN-1ST-BONUS PIC S9(5)V99 COMP.
001100 05 SALESMEN-2ND-BONUS PIC S9(5)V99 COMP.
001200 05 SALESMEN-3RD-BONUS PIC S9(5)V99 COMP.
001300 05 SALESMEN-4TH-BONUS PIC S9(5)V99 COMP.
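
As a rough illustration of the regular-expression idea mentioned above, the PIC strings in the sample could be picked apart as follows. This is only a sketch: parse_pic, count_positions, PIC_RE and ITEM_RE are hypothetical names that do not appear in the program, and only the picture characters seen in the sample (X, 9, S and V) are handled.

```python
import re

# Hypothetical helper names; only X, 9, S and V pictures are assumed.
PIC_RE = re.compile(r"""
    (?P<sign>S)?                          # optional sign
    (?P<body>(?:[X9](?:\(\d+\))?)+)       # X or 9 items, each with optional (n)
    (?:V(?P<frac>(?:9(?:\(\d+\))?)+))?    # optional implied decimal part
    $""", re.VERBOSE)

# One picture item: the character plus an optional repeat count.
ITEM_RE = re.compile(r"([X9])(?:\((\d+)\))?")


def count_positions(part):
    """Count character positions, e.g. '9(5)' gives 5 and '99' gives 2."""
    return sum(int(n) if n else 1 for _, n in ITEM_RE.findall(part))


def parse_pic(pic):
    """Split a PIC string such as 'S9(5)V99' into its parts."""
    m = PIC_RE.match(pic.rstrip('.'))
    if m is None:
        raise ValueError("unsupported PIC string: %r" % pic)
    body = m.group('body')
    frac = m.group('frac')
    return {
        'signed': m.group('sign') is not None,
        'kind': 'X' if body[0] == 'X' else '9',
        'digits': count_positions(body),
        'decimals': count_positions(frac) if frac else 0,
    }


if __name__ == '__main__':
    for pic in ('9(3).', 'X(30).', 'S9(7)', 'S9(5)V99'):
        print(pic, parse_pic(pic))
```

Keeping the sign, the integer positions and the decimal positions as separate fields means the size calculation no longer depends on find() and slicing.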

*** PROGRAM CODE ***

#!/usr/bin/python

import sys

f_path = '/home/lenyel/Bruske/MCBA/Internet/'
f_name = sys.argv[1]

fd = open(f_path + f_name, 'r')

def fmtline(fieldline):
    # fieldline is a split field line without its sequence number,
    # e.g. ['05', 'SALESMEN-NO', 'PIC', '9(3).']
    size = ''
    type = ''
    dec = ''
    codeline = []
    if fieldline.count('COMP.') > 0:
        # COMP field: size is digits // 2 + 1 bytes
        left = fieldline[3].find('(') + 1
        right = fieldline[3].find(')')
        num = fieldline[3][left:right].lstrip()
        if fieldline[3].count('V'):
            left = fieldline[3].find('V') + 1
            dec = int(len(fieldline[3][left:]))
            size = ((int(num) + int(dec)) // 2) + 1
        else:
            size = (int(num) // 2) + 1
            dec = 0
        type = 'Pdec'
    elif fieldline[3][0] in ('X', '9'):
        # unsigned display field: one byte per position
        dec = 0
        left = fieldline[3].find('(') + 1
        right = fieldline[3].find(')')
        size = int(fieldline[3][left:right].lstrip('0'))
        if fieldline[3][0] == 'X':
            type = 'Xstr'
        else:
            type = 'Xint'
    else:
        dec = 0
        left = fieldline[3].find('(') + 1
        right = fieldline[3].find(')')
        size = int(fieldline[3][left:right].lstrip('0'))
        if fieldline[3][0] == 'X':
            type = 'Xint'
    codeline.append(fieldline[1].replace('-', '_').replace('.', '').lower())
    codeline.append(size)
    codeline.append(type)
    codeline.append(dec)
    return codeline

wrkfd = []
rec_len = 0

for line in fd:
    if line[6] == '*':      # drop comment lines
        continue
    newline = line.split()
    if len(newline) == 1:   # drop blank line
        continue
    newline = newline[1:]   # drop the sequence number
    if 'FILENAME' in newline:
        filename = newline[-1].replace('"','').lower()
        filename = filename.replace('.','')
        output = open('/home/lenyel/Bruske/MCBA/Internet/'+filename+'.fd', 'w')
        code = filename + ' = [\n'
        output.write(code)
    elif newline[0].isdigit() and 'PIC' in newline:
        wrkfd.append(fmtline(newline))
        rec_len += wrkfd[-1][1]

fd.close()

fmtfd = []

# every field but the last is followed by a comma
for wrkline in wrkfd[:-1]:
    fmtline = str(tuple(wrkline)) + ',\n'
    output.write(fmtline)

fmtline = tuple(wrkfd[-1])
fmtline = str(fmtline) + '\n'
output.write(fmtline)

lastline = ']\n'
output.write(lastline)

lenrec = filename + '_len = ' + str(rec_len)
output.write(lenrec)

output.close()

*** RESULTING OUTPUT ***

salesmen = [
('salesmen_no', 3, 'Xint', 0),
('salesmen_name', 30, 'Xstr', 0),
('salesmen_territory', 30, 'Xstr', 0),
('salesmen_quota', 4, 'Pdec', 0),
('salesmen_1st_bonus', 4, 'Pdec', 2),
('salesmen_2nd_bonus', 4, 'Pdec', 2),
('salesmen_3rd_bonus', 4, 'Pdec', 2),
('salesmen_4th_bonus', 4, 'Pdec', 2)
]
salesmen_len = 83
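
The SQL script planned as item 4 could be derived from the same list. The sketch below is one possible mapping, not a tested design: sql_type and create_table are hypothetical names, the choice of column types is an assumption, and a 'Pdec' size is taken to be a byte count that holds at most 2 * size - 1 digits, which follows from the size formula in the program.

```python
# Field list in the format the program writes to the .fd file.
salesmen = [
    ('salesmen_no', 3, 'Xint', 0),
    ('salesmen_name', 30, 'Xstr', 0),
    ('salesmen_territory', 30, 'Xstr', 0),
    ('salesmen_quota', 4, 'Pdec', 0),
    ('salesmen_1st_bonus', 4, 'Pdec', 2),
    ('salesmen_2nd_bonus', 4, 'Pdec', 2),
    ('salesmen_3rd_bonus', 4, 'Pdec', 2),
    ('salesmen_4th_bonus', 4, 'Pdec', 2),
]


def sql_type(size, kind, dec):
    """Map one (size, type, dec) triple to a column type (assumed mapping)."""
    if kind == 'Xstr':
        return 'CHAR(%d)' % size
    if kind == 'Xint':
        return 'DECIMAL(%d,0)' % size
    if kind == 'Pdec':
        # size is in bytes; n packed bytes hold at most 2n - 1 digits
        return 'DECIMAL(%d,%d)' % (2 * size - 1, dec)
    raise ValueError('unknown field type: %r' % kind)


def create_table(name, fields):
    """Build a CREATE TABLE statement from a field list."""
    cols = ['    %s %s' % (fname, sql_type(size, kind, dec))
            for fname, size, kind, dec in fields]
    return 'CREATE TABLE %s (\n%s\n);' % (name, ',\n'.join(cols))


if __name__ == '__main__':
    print(create_table('salesmen', salesmen))
```

Because the list stores a byte size rather than the digit count for packed fields, the precision above is an upper bound; carrying the digit count through from the PIC clause would make it exact.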

If you find this code useful please feel free to use any or all of it
at your own risk.

Thanks
Len S
Nov 16 '08 #1

"len" <ls******@gmail.com> wrote in message
news:fc**********************************@u18g2000pro.googlegroups.com...
You might want to check out the pyparsing library.

-Mark

Nov 16 '08 #2
len
On Nov 16, 12:40 pm, "Mark Tolonen" <M8R-yft...@mailinator.com> wrote:
> "len" <lsumn...@gmail.com> wrote in message

> You might want to check out the pyparsing library.
>
> -Mark

Thanks Mark, I will check it out right now.

Len
Nov 16 '08 #3
Mark Tolonen wrote:
> "len" <ls******@gmail.com> wrote in message
> news:fc**********************************@u18g2000pro.googlegroups.com...
> [...]
>
> You might want to check out the pyparsing library.

And you might want to trim your messages to avoid quoting irrelevant
stuff. This is not directed personally at Mark, but at all readers.

Loads of us do it, and I wish we'd stop it. It's poor netiquette because
it forces people to skip past stuff that isn't relevant to the point
being made. It's also a global waste of bandwidth and storage space,
though that's less important than it used to be.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Nov 16 '08 #4
len wrote:
> if fieldline.count('COMP.') > 0:

I take it you're only handling a particular subset of COBOL constructs: thus, "COMP" is never "COMPUTATIONAL" or "USAGE IS COMPUTATIONAL", and it always occurs just before the full-stop (can't remember enough COBOL syntax to be sure if anything else can go afterwards).

> elif newline[0].isdigit() and 'PIC' in newline:

Similarly, "PIC" is never "PICTURE" or "PICTURE IS".
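
If those longer spellings ever do turn up, a single pattern could accept them. This is a sketch only, with the assumptions labeled: FIELD_RE and match_field are hypothetical names, and the pattern covers just the variants mentioned here (PIC or PICTURE with an optional IS, and COMP or COMPUTATIONAL with an optional USAGE IS).

```python
import re

# Hypothetical pattern for one field line; it covers only the clause
# spellings discussed above, not full COBOL syntax.
FIELD_RE = re.compile(r"""
    ^\d{6}\s+                        # sequence number
    (?P<level>\d{2})\s+              # level number, e.g. 05
    (?P<name>[A-Z0-9-]+)\s+          # data name
    PIC(?:TURE)?(?:\s+IS)?\s+        # PIC, PICTURE, PIC IS, PICTURE IS
    (?P<pic>[SXV9()\d]+)             # picture string, e.g. S9(5)V99
    (?:\s+(?:USAGE\s+(?:IS\s+)?)?    # optional USAGE or USAGE IS
       (?P<usage>COMP(?:UTATIONAL)?))?
    \s*\.\s*$                        # closing full stop
    """, re.VERBOSE)


def match_field(line):
    """Return the parts of a field line as a dict, or None if no match."""
    m = FIELD_RE.match(line)
    return m.groupdict() if m else None


if __name__ == '__main__':
    samples = [
        '000600     05  SALESMEN-NO          PIC 9(3).',
        '000900     05  SALESMEN-QUOTA       PIC S9(7) COMP.',
        '001000     05  SALESMEN-1ST-BONUS   PICTURE IS S9(5)V99 USAGE IS COMPUTATIONAL.',
    ]
    for s in samples:
        print(match_field(s))
```

Group lines such as the 01 level have no PIC clause, so they return None and would still need separate handling.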

Aargh, I think I have to stop. I'm remembering more than I ever wanted to about COBOL. Must ... rip ... brain ... out ...
Nov 17 '08 #5

"Steve Holden" <st***@holdenweb.com> wrote in message
news:ma**************************************@python.org...
> Mark Tolonen wrote:
>> "len" <ls******@gmail.com> wrote in message
>> news:fc**********************************@u18g2000pro.googlegroups.com...
>> [...]
>>
>> You might want to check out the pyparsing library.
>
> And you might want to trim your messages to avoid quoting irrelevant
> stuff. This is not directed personally at Mark, but at all readers.
>
> Loads of us do it, and I wish we'd stop it. It's poor netiquette because
> it forces people to skip past stuff that isn't relevant to the point
> being made. It's also a global waste of bandwidth and storage space,
> though that's less important than it used to be.

Point taken...or I could top post ;^)

-Mark

Nov 17 '08 #6
Mark Tolonen wrote:
> Point taken...or I could top post ;^)

A: A Rolls seats six.
Q: What's the saddest thing about seeing a Rolls with five top-posters in it going over a cliff?
Nov 17 '08 #7
On Nov 17, 7:11 pm, Lawrence D'Oliveiro <l...@geek-central.gen.new_zealand> wrote:
> Mark Tolonen wrote:
>> Point taken...or I could top post ;^)
>
> A: A Rolls seats six.
> Q: What's the saddest thing about seeing a Rolls with five top-posters in it going over a cliff?

+1 but you forgot the boot & the roof rack AND if it was a really old
one there'd be space for a few on the running boards (attached like
the Norwegian Blue parrot)
Nov 17 '08 #8
On Nov 16, 12:53 pm, len <lsumn...@gmail.com> wrote:
> On Nov 16, 12:40 pm, "Mark Tolonen" <M8R-yft...@mailinator.com> wrote:
>
>> You might want to check out the pyparsing library.
>> -Mark
>
> Thanks Mark, I will check it out right now.
>
> Len

Len -

Here is a rough pyparsing starter for your problem:

from pyparsing import *

COMP = Optional("USAGE IS") + oneOf("COMP COMPUTATIONAL")
PIC = oneOf("PIC PICTURE") + Optional("IS")
PERIOD,LPAREN,RPAREN = map(Suppress,".()")

ident = Word(alphanums.upper()+"_-")
integer = Word(nums).setParseAction(lambda t:int(t[0]))
lineNum = Suppress(Optional(LineEnd()) + LineStart() + Word(nums))

# X(30) is expanded to thirty X's, then combined into one string
rep = LPAREN + integer + RPAREN
repchars = "X" + rep
repchars.setParseAction(lambda tokens: ['X']*tokens[1])
strdecl = Combine(OneOrMore(repchars | "X"))

# 9(5) is expanded the same way, for both sides of the V
SIGN = Optional("S")
repdigits = "9" + rep
repdigits.setParseAction(lambda tokens: ['9']*tokens[1])
intdecl = SIGN("sign") + Combine(OneOrMore(repdigits | "9"))("intpart")
realdecl = SIGN("sign") + Combine(OneOrMore(repdigits | "9"))("intpart") + "V" + \
                Combine(OneOrMore(repdigits | "9"))("realpart")

type = Group((strdecl | realdecl | intdecl) +
                Optional(COMP("COMP")))

fieldDecl = lineNum + "05" + ident("name") + \
                PIC + type("type") + PERIOD
structDecl = lineNum + "01" + ident("name") + PERIOD + \
                OneOrMore(Group(fieldDecl))("fields")

It prints out:

SALESMEN-RECORD
   SALESMEN-NO ['999']
   SALESMEN-NAME ['XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX']
   SALESMEN-TERRITORY ['XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX']
   SALESMEN-QUOTA ['S', '9999999', 'COMP']
   SALESMEN-1ST-BONUS ['S', '99999', 'V', '99', 'COMP']
   SALESMEN-2ND-BONUS ['S', '99999', 'V', '99', 'COMP']
   SALESMEN-3RD-BONUS ['S', '99999', 'V', '99', 'COMP']
   SALESMEN-4TH-BONUS ['S', '99999', 'V', '99', 'COMP']

I too have some dim, dark, memories of COBOL. I seem to recall having
to infer from the number of digits in an integer or real what size the
number would be. I don't have that logic implemented, but here is an
extension to the above program, which shows you where you could put
this kind of type inference logic (insert this code before the call to
searchString):

class TypeDefn(object):
    @staticmethod
    def intType(tokens):
        self = TypeDefn()
        self.str = "int(%d)" % (len(tokens.intpart),)
        self.isSigned = bool(tokens.sign)
        return self
    @staticmethod
    def realType(tokens):
        self = TypeDefn()
        self.str = "real(%d.%d)" % (len(tokens.intpart),len(tokens.realpart))
        self.isSigned = bool(tokens.sign)
        return self
    @staticmethod
    def charType(tokens):
        self = TypeDefn()
        # Combine leaves a single string token, so measure that string
        self.str = "char(%d)" % len(tokens[0])
        self.isSigned = False
        self.isComp = False
        return self
    def __repr__(self):
        return ("+-" if self.isSigned else "") + self.str
intdecl.setParseAction(TypeDefn.intType)
realdecl.setParseAction(TypeDefn.realType)
strdecl.setParseAction(TypeDefn.charType)

This prints:

SALESMEN-RECORD
   SALESMEN-NO [int(3)]
   SALESMEN-NAME [char(30)]
   SALESMEN-TERRITORY [char(30)]
   SALESMEN-QUOTA [+-int(7), 'COMP']
   SALESMEN-1ST-BONUS [+-real(5.2), 'COMP']
   SALESMEN-2ND-BONUS [+-real(5.2), 'COMP']
   SALESMEN-3RD-BONUS [+-real(5.2), 'COMP']
   SALESMEN-4TH-BONUS [+-real(5.2), 'COMP']
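
The size inference itself could be a small function called from those parse actions. This is a sketch, assuming (as len's program does) that a COMP field is stored as packed decimal, two digits per byte plus a sign nibble, and that every other field takes one byte per position; other compilers may store COMP differently. storage_size is a hypothetical name.

```python
def storage_size(digits, decimals=0, packed=False):
    """Bytes occupied by a field (assumed rules, see the note above).

    digits   -- positions before the implied decimal point
    decimals -- positions after it
    packed   -- True for COMP fields, assumed to be packed decimal
    """
    total = digits + decimals
    if packed:
        # two digits per byte, plus half a byte for the sign
        return total // 2 + 1
    return total


if __name__ == '__main__':
    # (digits, decimals, packed) for the fields of SALESMEN-RECORD
    fields = [(3, 0, False), (30, 0, False), (30, 0, False),
              (7, 0, True), (5, 2, True), (5, 2, True),
              (5, 2, True), (5, 2, True)]
    print(sum(storage_size(*f) for f in fields))
```

For the sample record this gives the same sizes as len's output, 4 bytes for each COMP field and 83 bytes in total.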

You can post more questions about pyparsing on the Discussion tab of
the pyparsing wiki home page.

Best of luck!
-- Paul
Nov 17 '08 #9
len
On Nov 16, 9:57 pm, Lawrence D'Oliveiro <l...@geek-central.gen.new_zealand> wrote:
> len wrote:
>> if fieldline.count('COMP.') > 0:
>
> I take it you're only handling a particular subset of COBOL constructs: thus, "COMP" is never "COMPUTATIONAL" or "USAGE IS COMPUTATIONAL", and it always occurs just before the full-stop (can't remember enough COBOL syntax to be sure if anything else can go afterwards).
>
>> elif newline[0].isdigit() and 'PIC' in newline:
>
> Similarly, "PIC" is never "PICTURE" or "PICTURE IS".
>
> Aargh, I think I have to stop. I'm remembering more than I ever wanted to about COBOL. Must ... rip ... brain ... out ...

Most of the COBOL code originally comes from packages and is
relatively consistent.

Thanks
Len
Nov 17 '08 #10
len
Thanks Paul

I will be going over your code today. I started looking at pyparsing
last night, but it just got too late and my brain started to fog over.
I would really like to thank you for taking the time to provide me with
the code sample; I'm sure it will really help. Again, thank you very much.

Len

Nov 17 '08 #11
