parsing a dbIII file

Hello everybody, I'm new to Python (...I work with COBOL...)

I have to parse a file (a dbIII file) whose structure looks like this:
|string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|

Is there anything in Python that parses this stuff?
Thanks a lot
korovev

Aug 7 '07 #1
ko*******@gmail.com wrote:
> Hello everybody, I'm new to Python (...I work with COBOL...)
>
> I have to parse a file (a dbIII file) whose structure looks like this:
> |string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|
>
> Is there anything in Python that parses this stuff?
> Thanks a lot
> korovev

That's not a standard dBaseIII data file though, correct? It looks more
like something that was produced *from* a dBaseIII file.

If the format is similar to Excel's CSV format then the csv module from
Python's standard library may well be what you want. Otherwise there are
parsers at all levels - one called PyParsing is quite popular, and I am
sure other readers will have their own suggestions.

I am not sure whether the pipe bars actually appear in your data file,
so it is difficult to know quite exactly what to suggest, but I would
play with the file in an interactive interpreter session first to see
whether csv can do the job.
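
A minimal sketch of that kind of experiment, assuming the pipe bars really are in the file. The `sample` line is made up for illustration; it compares the default csv dialect with one that treats `|` as the quote character:

```python
import csv

# Hypothetical sample line in the format described in the question.
sample = "|string|, |a, b|, 1, 2, |other string|"

# Default dialect: '|' is ordinary text, so the comma inside |a, b| splits the field.
default_row = next(csv.reader([sample]))

# Declaring '|' as the quote character keeps the quoted field in one piece,
# and skipinitialspace drops the blank that follows each comma.
piped_row = next(csv.reader([sample], quotechar="|", skipinitialspace=True))

print(default_row)
print(piped_row)
```

If the second row comes back with one entry per logical field, csv is probably enough for the job.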

Good luck with your escape from COBOL!

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
--------------- Asciimercial ------------------
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
----------- Thank You for Reading -------------

Aug 7 '07 #2
On 7 Aug, 09:21, Steve Holden <st...@holdenweb.com> wrote:
> That's not a standard dBaseIII data file though, correct? It looks more
> like something that was produced *from* a dBaseIII file.

Yep... unfortunately it is not.

> Good luck with your escape from COBOL!

I'm not escaping for now... Actually I'd like to use COBOL for the rest
of my life (as a programmer) ;-)
But thanks anyway!

korovev

Aug 7 '07 #3
ko*******@gmail.com wrote:
> Hello everybody, I'm new to Python (...I work with COBOL...)
>
> I have to parse a file (a dbIII file) whose structure looks like this:
> |string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|

There are a number of relatively simple options that come to mind, including
regular expressions:

##### BEGIN CODE #####

import re

#
# dbIII.txt:
# |string|, |string|, |string|, |string|, |,1,2,3,4|, |other string|
#
handle = open('dbIII.txt')
for line in handle:
    # Each match captures the text between one pair of pipe separators.
    for match in re.finditer(r'\|\s*([^|]+)\s*\|,*', line):
        for each in match.groups():
            print(each)

handle.close()

##### END CODE #####
Without knowing what you need to do with the data, it's hard to suggest a better
method for parsing it. The above should work, provided that the data is always
in the format | data | with no pipe symbols in between the ones used as separators.
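
Note that in the format from the original question the numbers are not wrapped in pipes, so the pattern above would skip them. A hedged variation (not part of the code above; the `token` pattern and the sample `line` are illustrative assumptions) that picks up both pipe-quoted strings and bare integers:

```python
import re

# Hypothetical sample line in the format shown in the question.
line = "|string|, |string that may contain commas, inside|, 1, 2, 3, |other string|"

# Either a pipe-quoted string (group 1) or a bare run of digits (group 2).
token = re.compile(r'\|([^|]*)\||(\d+)')

fields = []
for match in token.finditer(line):
    quoted, number = match.groups()
    # Bare digits become ints; quoted text is kept as a string.
    fields.append(int(number) if number is not None else quoted)

print(fields)
```

This still assumes no pipe symbols appear inside a quoted field.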

HTH,

-Jay
Aug 7 '07 #4
On Aug 7, 2:21 am, Steve Holden <st...@holdenweb.com> wrote:
> korove...@gmail.com wrote:
>> Hello everybody, I'm new to Python (...I work with COBOL...)
>> I have to parse a file (a dbIII file) whose structure looks like this:
>> |string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|

As Steve mentioned pyparsing, here is a pyparsing version for cracking
your data:

from pyparsing import *

data = "|string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|"

integer = Word(nums)
# change unquoteResults to True to omit '|' chars from results
string = QuotedString("|", unquoteResults=False)
itemList = delimitedList( integer | string )

# parse the data and print out the results as a simple list
print(itemList.parseString(data).asList())

# add a parse action to convert integer strings to actual integers
integer.setParseAction(lambda t:int(t[0]))

# reparse the data and now get converted integers in results
print(itemList.parseString(data).asList())

Prints:

['|string|', '|string|', '|string that may contain commas inside|',
'1', '2', '3', '|other string|']
['|string|', '|string|', '|string that may contain commas inside|', 1,
2, 3, '|other string|']

-- Paul

Aug 7 '07 #5
On 8/7/07, ko*******@gmail.com <ko*******@gmail.com> wrote:
> I have to parse a file (a dbIII file) whose structure looks like this:
> |string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|

The CSV module is probably the easiest way to go:

>>> data = "|string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|"
>>> import csv
>>> reader = csv.reader([data], quotechar="|", skipinitialspace=True)
>>> for row in reader:
...     print(row)
...
['string', 'string', 'string that may contain commas inside', '1', '2', '3', 'other string']
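
csv.reader returns every field as a string, including the numbers. If real integers are wanted (as in the pyparsing reply), one possible sketch is a small conversion step. The `convert` helper is a hypothetical name, not part of the csv module:

```python
import csv

data = "|string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|"

def convert(field):
    """Return an int for all-digit fields, otherwise the original string."""
    return int(field) if field.isdigit() else field

row = next(csv.reader([data], quotechar="|", skipinitialspace=True))
converted = [convert(field) for field in row]
print(converted)
```

One caveat: csv drops the quoting information, so a quoted field such as |123| would also be turned into an integer by this helper.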

--
Jerry
Aug 7 '07 #6
On 7 Aug, 17:47, "Jerry Hill" <malaclyp...@gmail.com> wrote:
> On 8/7/07, korove...@gmail.com <korove...@gmail.com> wrote:
>> I have to parse a file (a dbIII file) whose structure looks like this:
>> |string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|
>
> The CSV module is probably the easiest way to go:
>
> >>> data = "|string|, |string|, |string that may contain commas inside|, 1, 2, 3, |other string|"
> >>> import csv
> >>> reader = csv.reader([data], quotechar="|", skipinitialspace=True)
> >>> for row in reader:
> ...     print(row)
> ...
> ['string', 'string', 'string that may contain commas inside', '1', '2', '3', 'other string']
>
> --
> Jerry

You were all right, I should have mentioned that I have to put the data
into MySQL... So the best way to do it is with csv.reader: I tried
it and it works!

Thanks very much!
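
For readers with the same goal, here is a rough sketch of feeding csv.reader rows into a database. It uses the standard-library sqlite3 module purely as a stand-in for MySQL so that it runs anywhere; the table name, column names and sample rows are all invented for illustration:

```python
import csv
import io
import sqlite3

# Stand-in for the real export file; in practice this would be open(path, newline="").
export = io.StringIO(
    "|alpha|, |beta|, |text, with commas|, 1, 2, 3, |other|\n"
    "|gamma|, |delta|, |more text|, 4, 5, 6, |another|\n"
)

# In-memory SQLite database standing in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records "
    "(a TEXT, b TEXT, note TEXT, x INTEGER, y INTEGER, z INTEGER, extra TEXT)"
)

# Each row from csv.reader is a list of seven strings, one per column.
reader = csv.reader(export, quotechar="|", skipinitialspace=True)
conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?, ?, ?)", reader)
conn.commit()

rows = conn.execute("SELECT note, x FROM records ORDER BY x").fetchall()
print(rows)
```

With an actual MySQL driver the connection call differs and the placeholder is typically `%s` rather than `?`, but the pattern of passing the reader to a parameterized insert should carry over.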

Aug 8 '07 #7
