Bytes | Software Development & Data Engineering Community

OK, I'm quite new to Python

But I'm a good C++ programmer.

What I want to do is parse a text file and store the information in relevant
fields:

//Text File:

*Version 200
*SCENE {
AMBIENT_COLOUR 0.0 0.0 0.0
}

*MATERIAL_LIST{
*MATERIAL_COUNT 0
}
*GEOMOBJECT {
*NODE_NAME "obj1"
*MESH {
*MESH_VERTEX_LIST{
*MESH_VERTEX 0 0 0 0
*MESH_VERTEX 1 0 1 2
}
*MESH_FACE_LIST {
*MESH_FACE 1 2 3
}
}
}
/* ... More GEOMOBJECTS ...*/
but I have no idea what the best way to do this is?
Any thoughts??

Many Thanks

Mike


Jul 18 '05 #1
Michael wrote:
> But I'm a good C++ programmer.
>
> What I want to do is parse a text file and store the information in relevant
> fields:

Well, if you have (or know how to write) an EBNF grammar, SimpleParse
would likely be ideal for this. See the VRML97 sample grammar in
SimpleParse (or even the VRML97 loader in OpenGLContext for a more
real-world example).

The primary value of SimpleParse for this kind of thing is that it's fast
compared to most other Python parser generators while still being easy
to use. If you're loading largish (say 10s of MBs) models the speed can
be quite useful. (It was originally written explicitly to produce a
fast VRML97 parser, btw.)

If you're loading *huge* models (100s of MBs), you may need to go for a
C/C++ extension to directly convert from an on-disk buffer to objects,
but try it with the Python versions first. Even with 100s of MBs, you
can write SimpleParse grammars fast enough to parse them quite quickly;
it just requires a little more care with how you structure your productions.

> but I have no idea what the best way to do this is?
> Any thoughts??

Mostly it's just a matter of what you feel comfortable with. There's
quite a range of Python text-processing tools available. See the text
"Text Processing in Python" (available in both dead-tree and online
format) for extensive treatment of various approaches, from writing your
own recursive descent parsers through using one of the parser-generators.
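
To make the hand-written end of that range concrete, here is a minimal recursive-descent sketch for the brace format in the question. It is only an illustration under some assumptions the thread does not confirm: a starred word or a `}` ends the current entry's values, strings are double-quoted without escapes, and comments can simply be stripped first. The names `tokenize`, `parse_block`, and `parse` are made up for this sketch and are not part of SimpleParse or any other library.

```python
import re

# Assumed token shapes: a brace, a double-quoted string, or a run of
# characters containing no whitespace and no braces.
TOKEN_RE = re.compile(r'[{}]|"[^"]*"|[^\s{}]+')
# /* ... */ and // comments are dropped before tokenizing.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def tokenize(text):
    """Strip comments, then split the text into tokens."""
    return TOKEN_RE.findall(COMMENT_RE.sub(" ", text))


def parse_block(tokens, pos):
    """Parse entries until a closing brace or the end of the input.

    Returns (entries, new_pos); each entry is a (name, values, children)
    tuple, where children is the list of entries inside the entry's braces.
    """
    entries = []
    while pos < len(tokens) and tokens[pos] != "}":
        name = tokens[pos]
        pos += 1
        values = []
        # Plain values follow the name until a brace or the next *KEYWORD.
        while (pos < len(tokens)
               and tokens[pos] not in ("{", "}")
               and not tokens[pos].startswith("*")):
            values.append(tokens[pos])
            pos += 1
        children = []
        if pos < len(tokens) and tokens[pos] == "{":
            children, pos = parse_block(tokens, pos + 1)
            if pos >= len(tokens):
                raise ValueError("missing closing brace for %s" % name)
            pos += 1  # step over the closing "}"
        entries.append((name, values, children))
    return entries, pos


def parse(text):
    """Parse a whole file and return the list of top-level entries."""
    tokens = tokenize(text)
    entries, pos = parse_block(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected '}' at token %d" % pos)
    return entries


if __name__ == "__main__":
    sample = '*Version 200\n*SCENE {\nAMBIENT_COLOUR 0.0 0.0 0.0\n}\n'
    for entry in parse(sample):
        print(entry)
```

Values are left as strings here; converting them to numbers, or mapping entries onto your own classes, would be a separate pass over the returned tuples.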

Good luck,
Mike

________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com

Jul 18 '05 #2

"Michael" <sl***********@hotmail.com> wrote in message
news:ck**********@hercules.btinternet.com...
> But I'm a good C++ programmer.
>
> What I want to do is parse a text file and store the information in
> relevant
> fields:


A more useful-to-the-reader and possibly more fruitful-to-you subject line
would have been something like 'Need help parsing text files'.

tjr

Jul 18 '05 #3
Well, here's a sre-based scanner and recursive-descent parser based on
my understanding of the grammar you gave.

Using a real scanner and parser may or may not be a better choice, but
it's not hard in Python to create a scanner and write a
recursive-descent parser for a simple grammar.

Jeff

------------------------------------------------------------------------
# This code is in the public domain
import re
import sys


class PeekableIterator:
    """Wrap an iterable and allow one item of lookahead."""
    def __init__(self, s):
        self.s = iter(s)
        self._peek = []

    def atend(self):
        try:
            self.peek()
        except StopIteration:
            return True
        return False

    def peek(self):
        if not self._peek:
            self._peek = [next(self.s)]
        return self._peek[0]

    def next(self):
        if self._peek:
            return self._peek.pop()
        return next(self.s)

    __next__ = next

    def __iter__(self):
        return self


def tok(scanner, s):
    return s


def num(scanner, s):
    try:
        return int(s)
    except ValueError:
        return float(s)


# re.Scanner is the undocumented scanner class of the sre engine.
scanner = re.Scanner([
    (r"/\*(?:[^*]|[*]+[^/])*\*/", None),    # /* ... */ comments
    (r"\*?[A-Za-z_][A-Za-z0-9_]*", tok),    # names, optionally starred
    (r"//.*$", None),                       # // comments
    (r"[0-9]*\.[0-9]+|[0-9]+\.?", num),     # numbers
    (r"[{}]", tok),                         # braces
    (r'"(?:[^\\"]|\\.)*"', tok),            # quoted strings
    (r"[ \t\r\n]*", None),                  # whitespace
], re.MULTILINE)


class Node:
    def __init__(self, name):
        self.name = name
        self.contents = []

    def add(self, v):
        self.contents.append(v)

    def __str__(self):
        sc = " ".join(map(repr, self.contents))
        return "<%s: %s>" % (self.name, sc)
    __repr__ = __str__


def parse_nodes(t):
    # Parse the nodes inside a { ... } block, consuming the closing brace.
    n = []
    while True:
        if t.peek() == "}":
            t.next()
            break
        n.append(parse_node(t))
    return n


def parse_contents(n, t):
    if t.atend():
        return
    if t.peek() == "{":
        t.next()
        for n1 in parse_nodes(t):
            n.add(n1)
    # Plain values run until a closing brace or the next starred name.
    while True:
        if t.atend():
            break
        p = t.peek()
        if p == "}":
            break
        if isinstance(p, str) and p.startswith("*"):
            break
        n.add(t.next())


def parse_node(t):
    n = Node(t.next())
    parse_contents(n, t)
    return n


def parse_top(t):
    while not t.atend():
        yield parse_node(t)


def main(source=sys.stdin):
    tokens, rest = scanner.scan(source.read())
    if rest:
        print("Garbage at end of file:", repr(rest))
    for n in parse_top(PeekableIterator(tokens)):
        print(n)


if __name__ == '__main__':
    main()
------------------------------------------------------------------------
$ python michael.py < michael.txt # and reindented for show
<*Version: 200>
<*SCENE: <AMBIENT_COLOUR: 0.0 0.0 0.0>>
<*MATERIAL_LIST: <*MATERIAL_COUNT: 0>>
<*GEOMOBJECT:
    <*NODE_NAME: '"obj1"'>
    <*MESH:
        <*MESH_VERTEX_LIST:
            <*MESH_VERTEX: 0 0 0 0>
            <*MESH_VERTEX: 1 0 1 2>
        >
        <*MESH_FACE_LIST: <*MESH_FACE: 1 2 3>>
    >
>
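
Once a tree like that exists, getting the data into "relevant fields" is mostly a matter of walking it. The following is a rough sketch rather than part of the parser: it uses a small stand-in `Node` class with the same `name` and `contents` attributes as the one above, and it assumes each `*MESH_VERTEX` entry holds an index followed by x, y, z, which the thread never actually states. `find_all` and `mesh_vertices` are invented names for the illustration.

```python
class Node:
    """Stand-in for the parser's Node: a name plus a list of contents."""
    def __init__(self, name, contents=()):
        self.name = name
        self.contents = list(contents)


def find_all(node, name):
    """Yield `node` and every descendant Node whose name equals `name`."""
    if node.name == name:
        yield node
    for child in node.contents:
        if isinstance(child, Node):
            yield from find_all(child, name)


def mesh_vertices(geomobject):
    """Map vertex index -> (x, y, z) for one *GEOMOBJECT node.

    Assumes the four numbers after *MESH_VERTEX are index, x, y, z.
    """
    vertices = {}
    for vertex in find_all(geomobject, "*MESH_VERTEX"):
        index, x, y, z = vertex.contents
        vertices[index] = (x, y, z)
    return vertices


# Hand-built tree with the same shape as the sample output.
geom = Node("*GEOMOBJECT", [
    Node("*NODE_NAME", ['"obj1"']),
    Node("*MESH", [
        Node("*MESH_VERTEX_LIST", [
            Node("*MESH_VERTEX", [0, 0, 0, 0]),
            Node("*MESH_VERTEX", [1, 0, 1, 2]),
        ]),
        Node("*MESH_FACE_LIST", [Node("*MESH_FACE", [1, 2, 3])]),
    ]),
])

if __name__ == "__main__":
    print(mesh_vertices(geom))
```

The same `find_all` helper can be pointed at `*MESH_FACE` or `*NODE_NAME` to fill in the other fields.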



Jul 18 '05 #4
On Wed, 13 Oct 2004 00:03:41 +0000 (UTC), "Michael" <sl***********@hotmail.com> wrote:
> But I'm a good C++ programmer.
>
> What I want to do is parse a text file and store the information in relevant
                       ^^^^^^^^^^^^^^^^^[1]  ^^^^^[2]  ^^^^^^^^^^^[3] ^^^^^^^^-
> fields:
  -^^^^^[4]
[1] ok
[2] where?
[3] which
[4] relevant to what?
[5] ;-)

> //Text File:
>
> *Version 200
> *SCENE {
> AMBIENT_COLOUR 0.0 0.0 0.0
> }
>
> *MATERIAL_LIST {
> *MATERIAL_COUNT 0
> }
> *GEOMOBJECT {
> *NODE_NAME "obj1"
> *MESH {
> *MESH_VERTEX_LIST{
> *MESH_VERTEX 0 0 0 0
> *MESH_VERTEX 1 0 1 2
> }
> *MESH_FACE_LIST {
> *MESH_FACE 1 2 3
> }
> }
> }
> /* ... More GEOMOBJECTS ...*/
> but I have no idea what the best way to do this is?
                                          ^^^^^^^[1]
[1] do what?

> Any thoughts??

I'd probably start with stripping out the tokens with a regular expression
and then process the list to build a tree that you can then walk. To start:
>>> data = """\
... *Version 200
... *SCENE {
... AMBIENT_COLOUR 0.0 0.0 0.0
... }
...
... *MATERIAL_LIST{
... *MATERIAL_COUNT 0
... }
... *GEOMOBJECT {
... *NODE_NAME "obj1"
... *MESH {
... *MESH_VERTEX_LIST{
... *MESH_VERTEX 0 0 0 0
... *MESH_VERTEX 1 0 1 2
... }
... *MESH_FACE_LIST {
... *MESH_FACE 1 2 3
... }
... }
... }
... """
>>> import re
>>> rxs = re.compile(r'([{}]|"[^"]*"|[*A-Z_a-z]+|[0-9.]+)')
>>> tokens = rxs.findall(data)
>>> tokens
['*Version', '200', '*SCENE', '{', 'AMBIENT_COLOUR', '0.0', '0.0', '0.0', '}',
 '*MATERIAL_LIST', '{', '*MATERIAL_COUNT', '0', '}', '*GEOMOBJECT', '{',
 '*NODE_NAME', '"obj1"', '*MESH', '{', '*MESH_VERTEX_LIST', '{',
 '*MESH_VERTEX', '0', '0', '0', '0', '*MESH_VERTEX', '1', '0', '1', '2', '}',
 '*MESH_FACE_LIST', '{', '*MESH_FACE', '1', '2', '3', '}', '}', '}']

IWT that isolates the basic info of interest. It should not be hard to make a tree or
extract what suits your purposes, but I'm not going to guess what those are ;-)
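
For what it's worth, one way the "make a tree" step could look is sketched below, using an explicit stack instead of recursion. This is only one possible reading of the format: it assumes any token starting with `*`, `_`, or a letter begins a new entry and everything else is a value of the most recent entry. `build_tree` is a hypothetical helper name, not something from the standard library.

```python
import re

# Same token pattern as in the session above.
rxs = re.compile(r'([{}]|"[^"]*"|[*A-Z_a-z]+|[0-9.]+)')


def build_tree(tokens):
    """Turn a flat token list into nested [name, values, children] lists."""
    root = []        # entries at the top level
    stack = [root]   # stack[-1] is the block currently being filled
    for tok in tokens:
        if tok == "{":
            if not stack[-1]:
                raise ValueError("'{' with no preceding name")
            # Descend into the children list of the most recent entry.
            stack.append(stack[-1][-1][2])
        elif tok == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        elif tok[0] in "*_" or tok[0].isalpha():
            # A name starts a new entry in the current block.
            stack[-1].append([tok, [], []])
        else:
            if not stack[-1]:
                raise ValueError("value %r with no preceding name" % tok)
            # Numbers and quoted strings attach to the most recent entry.
            stack[-1][-1][1].append(tok)
    if len(stack) != 1:
        raise ValueError("missing '}'")
    return root


if __name__ == "__main__":
    sample = '*Version 200\n*SCENE {\nAMBIENT_COLOUR 0.0 0.0 0.0\n}\n'
    for entry in build_tree(rxs.findall(sample)):
        print(entry)
```

Walking the result is then a plain loop over `[name, values, children]` triples, recursing into `children` wherever it is non-empty.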

Regards,
Bengt Richter
Jul 18 '05 #5
