
LaTeX-Like Parsing in C

My problem's with parsing. I have this (arbitrary, from a file) string,
let's say:

"Directory: /file{File:/filename(/size) }"

I would like it to behave similarly to LaTeX. I parse it, and then I
write it out for different variables, like:

"Directory: File:.(0) File:..(0) File:a.out(12) File:foo(1) "

But I keep getting into a mess of complication. I'm using C (of course.)
How do I parse it? strpbrk(,"/{}") (what then?) How can I get the string
into a data structure that I could write out? Algorithms?

-Neil

Jul 26 '07 #1
2 Replies


ne****@po-box.mcgill.ca said:
> My problem's with parsing. I have this (arbitrary, from a file) string,
> let's say:
>
> "Directory: /file{File:/filename(/size) }"
>
> I would like it to behave similarly to LaTeX. I parse it, and then I
> write it out for different variables, like:
>
> "Directory: File:.(0) File:..(0) File:a.out(12) File:foo(1) "
>
> But I keep getting into a mess of complication. I'm using C (of course.)
> How do I parse it? strpbrk(,"/{}") (what then?) How can I get the string
> into a data structure that I could write out? Algorithms?

Start with a lexing stage, where you simply break the input into lexical
tokens, doing your best to identify them as you go but not worrying too
much about odd cases. Store your lexical tokens in some kind of dynamic
data structure such as a linked list. Yes, strpbrk will work for this,
or even strtok if your input is writeable.

That will massively reduce the complexity of the parsing stage, since
you won't have to worry about tokenisation (because each token is
simply the next node on the linked list), and so you can focus purely
on the grammar that you are trying to implement.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Jul 26 '07 #2

Richard Heathfield wrote:
> ne****@po-box.mcgill.ca said:
>> My problem's with parsing. I have this (arbitrary, from a file) string,
>> let's say:
>>
>> "Directory: /file{File:/filename(/size) }"
>>
>> I would like it to behave similarly to LaTeX. I parse it, and then I
>> write it out for different variables, like:
>>
>> "Directory: File:.(0) File:..(0) File:a.out(12) File:foo(1) "
>>
>> But I keep getting into a mess of complication. I'm using C (of course.)
>> How do I parse it? strpbrk(,"/{}") (what then?) How can I get the string
>> into a data structure that I could write out? Algorithms?
>
> Start with a lexing stage, where you simply break the input into lexical
> tokens, doing your best to identify them as you go but not worrying too
> much about odd cases. Store your lexical tokens in some kind of dynamic
> data structure such as a linked list. Yes, strpbrk will work for this,
> or even strtok if your input is writeable.

And if your tokenisation rules are sufficiently bizarre [1], you can
resort to tools such as [f]lex, which [typically|can] generate C
code/tables for you.

> That will massively reduce the complexity of the parsing stage, since
> you won't have to worry about tokenisation (because each token is
> simply the next node on the linked list), and so you can focus purely
> on the grammar that you are trying to implement.

And again, if you end up with a sufficiently complex grammar [1again],
there are tools that will help. But if you're in control of the grammar,
such complexity may be a grammar smell ...

(Also helpful: existing books. And writing unit tests.)

[1] What counts as "sufficiently" is variable.

--
Far-Fetched Hedgehog
"It took a very long time, much longer than the most generous estimates."
- James White, /Sector General/

Jul 27 '07 #3
