Ernesto wrote:
Xavier Morel wrote: Ernesto wrote: I'm not sure if I should use RE's or some other mechanism. Thanks
I think a line-based state machine parser could be a better idea. Much
simpler to build and debug if not faster to execute.
What is a line-based state machine ?
Parse your file line-by-line (since it seems that it's the way your data
is organized).
Keep state information somewhere.
Change your state based on the current state and the data being fed to
your parser.
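Those three steps can be sketched as a small generic loop. This is only an illustration under my own assumptions: run_machine, handlers and memory are made-up names, and each handler is assumed to be a function that takes the current line and returns the name of the next state.

```python
def run_machine(lines, handlers, initial_state):
    """Feed lines one at a time to the handler of the current state.

    handlers maps a state name to a function taking (line, memory) and
    returning the name of the next state. memory is a plain dict shared
    by all handlers (hypothetical design, any container would do).
    """
    state = initial_state
    memory = {}  # the "state information kept somewhere"
    for line in lines:
        # The next state depends on the current state and on the line fed in.
        state = handlers[state](line.rstrip("\n"), memory)
    return state, memory
```

With only a handful of states, a plain if/elif chain on the state variable works just as well as a dict of handlers; the dict mostly pays off when the number of states grows.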
For example, here you basically have 3 states:
No Title, which is the initial state of the machine (it has not
encountered any title yet, and you do stuff based on titles)
Title loaded, when you've met a title. "Title loaded" loops on itself:
if you meet a "Title: whatever" line, you change the title currently
stored but you stay in the "Title loaded" state (you change the current
state of the machine from "title loaded" to "title loaded").
Request loaded, which can be reached only when you're in the "Title
loaded" state and then encounter a line starting with "Request: ". When you
reach that stage, do your processing (you have a title loaded, which is
the latest title you encountered, and you have a request loaded, which
is the request that immediately follows the loaded title), then you go
back to the "No Title" state, since you've processed (and therefore
unloaded) the current title.
So the state diagram could look something like this:
(it's supposed to be a single state diagram, but I suck at ASCII
diagrams so I'll create one mini-diagram for each state)
NoTitle =0> TitleLoaded
=0>
Event: on encountering a line starting with "Title: "
Action: save the title (to whatever variable you see fit)
Change state to: TitleLoaded
TitleLoaded =1> TitleLoaded
||
2
\/
Request
=1>
Event: on encountering a line starting with "Title: "
Action: save the title (replace the current value of your title variable)
Change state to: TitleLoaded
=2>
Event: on encountering a line starting with "Request: "
Action: save the request (if you need it); immediately process the Request state
Change state to: Request
Request =3> NoTitle
||
4
\/
TitleLoaded
=3>
Event: the Request state is reached, the request is either "Play" or "Next"
Action: Do whatever you want to do; nuke the content of the title variable
Change state to: NoTitle
=4>
Event: the Request state is reached, the request is neither "Play" nor
"Next"
Action: Nuke the content of the request variable (if you saved it), do
nothing else
Change state to: TitleLoaded
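Here is a rough sketch of transitions 0 to 4 in Python, to make the diagrams concrete. It rests on assumptions: the function name parse is mine, "doing whatever you want" is stood in for by collecting (title, request) pairs, and lines that are neither "Title: " nor "Request: " lines are simply ignored. Since the Request state is processed immediately, it is not stored as a separate value of the state variable.

```python
NO_TITLE = "NoTitle"
TITLE_LOADED = "TitleLoaded"

def parse(lines):
    """Return (title, request) pairs for every Play/Next request found."""
    state = NO_TITLE
    title = None
    results = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("Title: "):
            # Transitions 0 and 1: save (or replace) the title.
            title = line[len("Title: "):].strip()
            state = TITLE_LOADED
        elif line.startswith("Request: ") and state == TITLE_LOADED:
            # Transition 2: the Request state is processed right away.
            request = line[len("Request: "):].strip()
            if request in ("Play", "Next"):
                # Transition 3: do the processing, then unload the title.
                results.append((title, request))
                title = None
                state = NO_TITLE
            # Transition 4: any other request is dropped, the title is
            # kept, and we stay in TitleLoaded.
    return results
```

You would call it with an open file, e.g. parse(open("your_log.txt")), where the file name is a placeholder. A "Request: " line met in the NoTitle state falls through both branches and is ignored, which matches the diagrams since NoTitle has no transition for it.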
As a final note, I'd recommend reading "Text Processing in Python". Even
though it puts quite a big emphasis on functional programming (which you
may or may not appreciate), it's an extremely good introduction to
text-file handling, parsing and processing.