ASCII file parser - to read between brackets ()


Hi,
My ascii file is not exactly a comma separated file. The following is
a small but complete example of such a file. (This is the ISCAS circuit
file format that I need to read in.)

----------- Example c17.bench ----------------
INPUT(1)
INPUT(2)
INPUT(3)
INPUT(6)
INPUT(7)

OUTPUT(22)
OUTPUT(23)

10 = NAND(1, 3)
11 = NAND(3, 6)
16 = NAND(2, 11)
19 = NAND(11, 7)
22 = NAND(10, 16)
23 = NAND(16, 19)
----------------End of Example ----------------

I would like to note that the numbers can also be replaced by other
symbols, so they should be treated as strings and not as numbers.
E.g. "1" could also be "N_1" or "Node_1", or any other string
representation.

Now I would like to know what should be my approach to reading in this
file, i.e. the algorithm.

Off the top of my head I think I would just have to read in each line
as a string. Then I would search the string for various keywords. On
finding a keyword I would then find the location of the two brackets ()
- and then parse the values between them.

I am wondering if this approach is the right way to go.

Thanks a lot guys,
O.O.

Feb 15 '06 #1
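
[Editor's sketch] The line-by-line approach described in the post above could look roughly like the following in C++. This is only a minimal sketch under assumptions that are not in the original post: each statement sits on one line, there is exactly one pair of brackets per line, and the names BenchLine, trim and parseLine are made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One parsed line of a .bench-style file (hypothetical structure).
struct BenchLine {
    std::string keyword;            // e.g. "INPUT", "OUTPUT", "NAND"
    std::string target;             // left-hand side of '=', empty for INPUT/OUTPUT
    std::vector<std::string> args;  // names found between the brackets
};

// Remove leading and trailing spaces, tabs and line-end characters.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Returns false for blank or malformed lines; fills `out` on success.
bool parseLine(const std::string& line, BenchLine& out) {
    const std::size_t open = line.find('(');
    const std::size_t close = line.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return false;

    // Everything before '(' is either "KEYWORD" or "target = KEYWORD".
    const std::string head = line.substr(0, open);
    const std::size_t eq = head.find('=');
    BenchLine result;
    if (eq == std::string::npos) {
        result.keyword = trim(head);
    } else {
        result.target = trim(head.substr(0, eq));
        result.keyword = trim(head.substr(eq + 1));
    }
    if (result.keyword.empty()) return false;

    // Split the text between the brackets on commas.
    const std::string inside = line.substr(open + 1, close - open - 1);
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = inside.find(',', start);
        const std::size_t len =
            (comma == std::string::npos) ? std::string::npos : comma - start;
        const std::string arg = trim(inside.substr(start, len));
        if (!arg.empty()) result.args.push_back(arg);
        if (comma == std::string::npos) break;
        start = comma + 1;
    }
    out = result;
    return true;
}
```

Because the names are kept as std::string, "1", "N_1" and "Node_1" are all handled the same way. A caller would typically read the file with std::getline and skip every line for which parseLine returns false.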
ol*******@yahoo.it wrote:
Hi,
My ascii file is not exactly a comma separated file. [...]

Now I would like to know what should be my approach to reading in this
file, i.e. the algorithm.

Off the top of my head I think I would just have to read in each line
as a string. Then I would search the string for various keywords. On
finding a keyword I would then find the location of the two brackets ()
- and then parse the values between them.

I am wondering if this approach is the right way to go.


Sounds fine. I don't see any C++ relation, however. Please don't just
say that you're "writing it in C++". The algorithm you've described can
just as easily be written in almost any other language. Did you mean to
post it to 'comp.programming'?

V
--
Please remove capital As from my address when replying by mail
Feb 15 '06 #2
TB
ol*******@yahoo.it said:
Hi,
My ascii file is not exactly a comma separated file. [...]

Off the top of my head I think I would just have to read in each line
as a string. Then I would search the string for various keywords. On
finding a keyword I would then find the location of the two brackets ()
- and then parse the values between them.


Tokenize the input before parsing.

--
TB @ SWEDEN
Feb 15 '06 #3
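
[Editor's sketch] One possible reading of "tokenize the input" for this file format is shown below. It assumes that a name consists only of letters, digits and underscores, and that every other non-space character is a one-character token; neither rule comes from the thread, and the function names are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// True for characters that may appear in a name (an assumed rule).
bool isNameChar(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) != 0 || ch == '_';
}

// Split one line into tokens: names (runs of name characters) and
// single-character punctuation such as '=', '(', ',' and ')'.
// Whitespace separates tokens and is otherwise dropped.
std::vector<std::string> tokenize(const std::string& line) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < line.size()) {
        if (std::isspace(static_cast<unsigned char>(line[i]))) {
            ++i;  // skip whitespace
        } else if (isNameChar(line[i])) {
            std::size_t j = i;
            while (j < line.size() && isNameChar(line[j])) ++j;
            tokens.push_back(line.substr(i, j - i));  // whole name
            i = j;
        } else {
            tokens.push_back(std::string(1, line[i]));  // punctuation
            ++i;
        }
    }
    return tokens;
}
```

With the tokens in hand, the parsing step only has to check their order (name, "=", name, "(", names separated by ",", ")") instead of searching inside raw strings.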
<ol*******@yahoo.it> wrote in message
news:11**********************@o13g2000cwo.googlegroups.com...
: Hi,
: My ascii file is not exactly a comma separated file. [...]
:
: Off the top of my head I think I would just have to read in each line
: as a string. Then I would search the string for various keywords. On
: finding a keyword I would then find the location of the two brackets ()
: - and then parse the values between them.
:
: I am wondering if this approach is the right way to go.

There are several ways in which this can be accomplished.
But because I don't know the complete 'grammar' of the file,
I am not sure which would be the most appropriate
(e.g. I assume there is not only NAND, but XOR etc.
Can a more complex expression be used? Unary NOT ? )

In any case, rather than parsing each line manually, you
could use one of the existing lexer or parser generators,
such as flex (with or without bison; a bit old-fashioned,
but it works - http://www.gnu.org/software/flex/),
or boost::spirit (http://www.boost.org/libs/spirit/index.html).

If the files are simple enough, a regular-expressions package
might be an alternative for extracting the needed identifiers from
each line (e.g. http://www.boost.org/libs/regex/doc/index.html).
These are just a few of the available options...
hth-Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
Brainbench MVP for C++ <> http://www.brainbench.com
Feb 15 '06 #4
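
[Editor's sketch] To illustrate the regular-expression route, here is a minimal sketch. It uses std::regex from the C++11 standard library rather than the Boost.Regex package linked above, so it assumes a C++11 compiler. The pattern restricts names to letters, digits and underscores (\w), which is an assumption about the file format, and the function names are made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Matches "KEYWORD(args)" or "target = KEYWORD(args)".
// Group 1: optional target, group 2: keyword, group 3: raw argument list.
bool matchBenchLine(const std::string& line, std::string& target,
                    std::string& keyword, std::string& args) {
    static const std::regex pattern(
        R"(^\s*(?:(\w+)\s*=\s*)?(\w+)\s*\(([^)]*)\)\s*$)");
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) return false;
    target = m[1].str();   // empty when there is no "target ="
    keyword = m[2].str();
    args = m[3].str();
    return true;
}

// Pull the individual names out of the raw argument list.
std::vector<std::string> splitArgs(const std::string& args) {
    static const std::regex name(R"(\w+)");
    std::vector<std::string> out;
    for (std::sregex_iterator it(args.begin(), args.end(), name), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

If names may contain other characters, the \w classes would need to be widened accordingly.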
Dear Victor,
Thanks for responding. I forgot to mention in my post that I am
dealing with C++. I know that my algorithm was general, but sometimes a
certain language may have some features to handle this situation
differently. E.g. I had heard of regexes in Perl, and I thought I
could not use them in C++. Also, I did not know what "tokenize" means;
I learned that only after TB suggested it.
That's what I was looking for.
Thanks,
O.O.

Feb 15 '06 #5
Thanks TB. I think this is what I was looking for. I think my file is
simple enough that I don't need to use regexes. I have found some
code in the example at http://www.codeproject.com/cpp/stringtok.asp
that I think will be useful to me.
Regards,
O.O.

Feb 15 '06 #6
ol*******@yahoo.it wrote:
> Thanks for responding. I forgot to mention in my post that I am
> dealing with C++.

That's what I was afraid of...

> I know that my algorithm was general, but sometimes a
> certain language may have some features to handle this situation
> differently. E.g. I had heard of RegEx's in perl, and I thought I
> could not use them in C++.

They are not part of the language yet. As soon as you see the TR1
implemented, you could try using <regex> and whatever it is going to
contain. Until then, alas, no language mechanism to help you except some
very simple ones, like 'string', 'fstream', and others of which you are
probably already aware.

> Also I did not know of what Tokenize means
> which I learnt only after TB suggested it.


"Tokenize" usually means "identify and split the input stream into tokens"
and it can mean _whatever_you_make_it_to_mean_ because it depends entirely
on your definition of "a token".

V
--
Please remove capital As from my address when replying by mail
Feb 15 '06 #7
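
[Editor's sketch] The point that the result depends on your definition of "a token" can be seen by splitting the same line under two different definitions. Both functions below are hypothetical helpers built only on std::string and std::istringstream; the delimiter set used in the note afterwards is an assumption, not something specified in the thread.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Definition A: a token is any run of non-whitespace characters.
std::vector<std::string> splitOnWhitespace(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string word;
    while (in >> word) tokens.push_back(word);
    return tokens;
}

// Definition B: a token is any run of characters not listed in `delims`;
// the delimiters themselves are discarded.
std::vector<std::string> splitOnDelimiters(const std::string& line,
                                           const std::string& delims) {
    std::vector<std::string> tokens;
    std::size_t start = line.find_first_not_of(delims);
    while (start != std::string::npos) {
        const std::size_t end = line.find_first_of(delims, start);
        const std::size_t len =
            (end == std::string::npos) ? std::string::npos : end - start;
        tokens.push_back(line.substr(start, len));
        start = (end == std::string::npos)
                    ? std::string::npos
                    : line.find_first_not_of(delims, end);
    }
    return tokens;
}
```

For the line "10 = NAND(1, 3)", definition A yields "10", "=", "NAND(1," and "3)", which still needs further splitting, while definition B with the delimiters " \t=()," yields "10", "NAND", "1" and "3" but throws away the punctuation that tells you where the argument list begins.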
Thanks Ivan. I have heard of regexes, but I have not used them
much. I think I will start with a string tokenizer, and if that becomes
too complicated I will try this method.
O.O.

Feb 15 '06 #8
ol*******@yahoo.it wrote:
Thanks Ivan. I have heard of RegEx's - but I have not used them
much. I think I would start with string tokenizer and if that becomes
too complicated I would attempt this method.
O.O.


most implementations of sscanf will handle this for you, no problem.

if you don't like sscanf you can use regex.

if you don't like regex you can use lex (probably don't need a parser,
just the scanner should suffice).

if you don't like lex/yacc you can write your own scanner, the grammar
you have there isn't too complex.

if you don't want to write your own scanner you can...
actually that's sort of the problem people have with c++. your options
when dealing with any particular problem are quite literally endless.

perl kind of herds you into trying to approach everything with regexes
and hashes, while vb/.net will get you to buy some prebuilt item. i'm
guessing that's why you came to a c++ group: to find out what approach c++
lends itself to most easily. unfortunately, c++ lends itself to just
about every solution :p
Feb 16 '06 #9
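
[Editor's sketch] A rough illustration of the sscanf suggestion, using standard scansets such as %63[^ \t=]. It assumes each line has already been read without its trailing newline, that names are at most 63 characters long (an arbitrary limit chosen here), and that every gate has exactly two inputs. The function names scanGate and scanPort are made up for illustration.

```cpp
#include <cstdio>
#include <string>

// Parse "target = GATE(a, b)". Returns false if the line has another shape.
bool scanGate(const char* line, std::string& target, std::string& gate,
              std::string& a, std::string& b) {
    char t[64], g[64], x[64], y[64];
    char close = 0;
    // Each scanset reads a name up to the next space, tab or punctuation mark.
    const int n = std::sscanf(
        line, " %63[^ \t=] = %63[^ \t(] ( %63[^ \t,)] , %63[^ \t,)] %c",
        t, g, x, y, &close);
    if (n != 5 || close != ')') return false;
    target = t;
    gate = g;
    a = x;
    b = y;
    return true;
}

// Parse "INPUT(name)" or "OUTPUT(name)".
bool scanPort(const char* line, std::string& keyword, std::string& name) {
    char k[64], v[64];
    char close = 0;
    const int n =
        std::sscanf(line, " %63[^ \t=(] ( %63[^ \t,)] %c", k, v, &close);
    if (n != 3 || close != ')') return false;
    keyword = k;
    name = v;
    return true;
}
```

A caller would try scanPort first and fall back to scanGate. Gates with one input or more than two inputs would need additional format strings, which is where a tokenizer or regular expression starts to pay off.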
Thanks pillbug. I did not have this insight before.
O.O.

Feb 16 '06 #10
