ASCII file parser - to read between brackets ()

olson_ord

Hi,
My ascii file is not exactly a comma separated file. The following is
a small but complete example of such a file. (This is the ISCAS circuit
file format that I need to read in.)

----------- Example c17.bench ----------------
INPUT(1)
INPUT(2)
INPUT(3)
INPUT(6)
INPUT(7)

OUTPUT(22)
OUTPUT(23)

10 = NAND(1, 3)
11 = NAND(3, 6)
16 = NAND(2, 11)
19 = NAND(11, 7)
22 = NAND(10, 16)
23 = NAND(16, 19)
----------------End of Example ----------------

I should note that the numbers can also be replaced by other
symbols, so they have to be treated as strings and not as numbers.
E.g. "1" could also be "N_1" or "Node_1", or any other string
representation.

Now I would like to know what should be my approach to reading in this
file, i.e. the algorithm.

Off the top of my head I think I would just have to read in each line
as a string. Then I would search the string for various keywords. On
finding a keyword I would then find the location of the two brackets ()
- and then parse the values between them.

I am wondering if this approach is the right way to go.
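
For illustration, the line-by-line approach described above could be sketched roughly as follows. This is only a sketch under assumptions the post does not state: the names `BenchLine`, `trim` and `parse_line` are made up here, every statement is assumed to fit on one line, and the names between the brackets are assumed to be separated by commas.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type: one parsed line of a .bench file.
struct BenchLine {
    std::string keyword;            // "INPUT", "OUTPUT", "NAND", ...
    std::string target;             // left-hand side of '=', empty for INPUT/OUTPUT
    std::vector<std::string> args;  // names found between '(' and ')'
};

// Strip leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s) {
    const std::size_t first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Parse one line; returns false for blank or malformed lines.
bool parse_line(const std::string& line, BenchLine& out) {
    const std::size_t open = line.find('(');
    const std::size_t close = line.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return false;

    out = BenchLine{};
    // Everything before '(' is either "KEYWORD" or "target = KEYWORD".
    const std::string head = line.substr(0, open);
    const std::size_t eq = head.find('=');
    if (eq == std::string::npos) {
        out.keyword = trim(head);
    } else {
        out.target = trim(head.substr(0, eq));
        out.keyword = trim(head.substr(eq + 1));
    }

    // Split the text between the brackets on commas.
    const std::string inside = line.substr(open + 1, close - open - 1);
    std::size_t start = 0;
    while (start <= inside.size()) {
        std::size_t comma = inside.find(',', start);
        if (comma == std::string::npos) comma = inside.size();
        const std::string name = trim(inside.substr(start, comma - start));
        if (!name.empty()) out.args.push_back(name);
        start = comma + 1;
    }
    return !out.keyword.empty();
}
```

Each line read with `std::getline` from a `std::ifstream` would be handed to `parse_line`. Because the names are kept as `std::string`, "1", "N_1" and "Node_1" are all handled the same way.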

Thanks a lot guys,
O.O.

Feb 15 '06 #1


Victor Bazarov

ol*******@yahoo.it wrote:

> Hi,
> My ascii file is not exactly a comma separated file. [...]
>
> Now I would like to know what should be my approach to reading in this
> file, i.e. the algorithm.
>
> Off the top of my head I think I would just have to read in each line
> as a string. Then I would search the string for various keywords. On
> finding a keyword I would then find the location of the two brackets ()
> - and then parse the values between them.
>
> I am wondering if this approach is the right way to go.

Sounds fine. I don't see any C++ relation, however. Please don't just
say that you're "writing it in C++". The algorithm you've described can
just as easily be written in almost any other language. Did you mean to
post it to 'comp.programming'?

V
--
Please remove capital As from my address when replying by mail

Feb 15 '06 #2

ol*******@yahoo.it said:

> [...]
>
> Now I would like to know what should be my approach to reading in this
> file, i.e. the algorithm.
>
> Off the top of my head I think I would just have to read in each line
> as a string. Then I would search the string for various keywords. On
> finding a keyword I would then find the location of the two brackets ()
> - and then parse the values between them.

Tokenize the input before parsing.
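
A minimal sketch of what tokenizing one line of this format might look like. The function name `tokenize` and the choice of token boundaries are assumptions made for this example: `(`, `)`, `,` and `=` each count as a token of their own, whitespace only separates tokens, and anything else is part of a name.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split one line into tokens. Assumption: '(', ')', ',' and '=' are
// single-character tokens; whitespace separates tokens and is dropped;
// any other run of characters is a name (e.g. "NAND", "10", "Node_1").
std::vector<std::string> tokenize(const std::string& line) {
    std::vector<std::string> tokens;
    std::string current;  // the name being collected, if any
    for (const char c : line) {
        const bool is_punct = (c == '(' || c == ')' || c == ',' || c == '=');
        const bool is_space = std::isspace(static_cast<unsigned char>(c)) != 0;
        if (is_punct || is_space) {
            // A separator ends the name collected so far.
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
            if (is_punct) tokens.push_back(std::string(1, c));
        } else {
            current += c;
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

With this definition, `10 = NAND(1, 3)` becomes the eight tokens `10`, `=`, `NAND`, `(`, `1`, `,`, `3`, `)`, and the parsing step only has to look at the order of the tokens rather than at character positions.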

--
TB @ SWEDEN

Feb 15 '06 #3

Ivan Vecerina

<ol*******@yahoo.it> wrote in message
news:11**********************@o13g2000cwo.googlegroups.com...
: [...]
:
: Now I would like to know what should be my approach to reading in this
: file, i.e. the algorithm.
:
: Off the top of my head I think I would just have to read in each line
: as a string. Then I would search the string for various keywords. On
: finding a keyword I would then find the location of the two brackets ()
: - and then parse the values between them.
:
: I am wondering if this approach is the right way to go.

There are several ways in which this can be accomplished.
But because I don't know the complete 'grammar' of the file,
I am not sure which would be the most appropriate
(e.g. I assume there is not only NAND, but also XOR etc.
Can more complex expressions be used? A unary NOT?)

In any case, rather than parsing each line manually, you
could use one of the existing lexers or parser generators,
such as flex (with or without bison; a bit old-fashioned,
but it works - http://www.gnu.org/software/flex/),
or boost::spirit (http://www.boost.org/libs/spirit/index.html).

If the files are simple enough, a regular-expression package
might be an alternative for extracting the needed identifiers from
each line (e.g. http://www.boost.org/libs/regex/doc/index.html).
These are among a number of other options...
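
As a rough sketch of the regular-expression route: the example below uses `std::regex` from the `<regex>` header, which has been part of standard C++ since C++11, rather than Boost.Regex itself (the Boost interface is very similar). The `Gate` struct, the function `match_gate` and the pattern are assumptions made for this example; in particular, names are assumed to contain no whitespace, brackets, commas or `=`.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for a gate line such as "10 = NAND(1, 3)".
struct Gate {
    std::string output;
    std::string type;
    std::string inputs;  // raw text between the brackets, e.g. "1, 3"
};

// Returns true if the whole line has the shape  name = TYPE(anything).
bool match_gate(const std::string& line, Gate& gate) {
    // Assumption: names contain no whitespace, '=', '(', ')' or ','.
    static const std::regex pattern(
        R"re(\s*([^\s=(),]+)\s*=\s*([^\s=(),]+)\s*\(([^()]*)\)\s*)re");
    std::smatch m;
    if (!std::regex_match(line, m, pattern)) return false;
    gate.output = m[1].str();  // name on the left of '='
    gate.type = m[2].str();    // gate keyword, e.g. NAND
    gate.inputs = m[3].str();  // still needs splitting on ','
    return true;
}
```

The captured input list still has to be split on commas, so for files with a variable number of gate inputs a regular expression alone only does part of the job.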
hth-Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
Brainbench MVP for C++ <> http://www.brainbench.com

Feb 15 '06 #4

olson_ord

Dear Victor,
Thanks for responding. I forgot to mention in my post that I am
dealing with C++. I know that my algorithm was general, but sometimes a
certain language may have some features to handle this situation
differently. E.g. I had heard of RegEx's in perl, and I thought I
could not use them in C++. Also I did not know of what Tokenize means
which I learnt only after TB suggested it.
That's what I was looking for.
Thanks,
O.O.

Feb 15 '06 #5

olson_ord

Thanks TB. I think this is what I was looking for. I think my file is
simple enough that I don't need to use RegEx's. I have found some
code in the example at http://www.codeproject.com/cpp/stringtok.asp
that I think will be useful to me.
Regards,
O.O.

Feb 15 '06 #6

Victor Bazarov

ol*******@yahoo.it wrote:

> Thanks for responding. I forgot to mention in my post that I am
> dealing with C++.

That's what I was afraid of...

> I know that my algorithm was general, but sometimes a
> certain language may have some features to handle this situation
> differently. E.g. I had heard of RegEx's in perl, and I thought I
> could not use them in C++.

They are not part of the language yet. As soon as you see the TR1
implemented, you could try using <regex> and whatever it is going to
contain. Until then, alas, no language mechanism to help you except some
very simple ones, like 'string', 'fstream', and others of which you are
probably already aware.

> Also I did not know of what Tokenize means
> which I learnt only after TB suggested it.

"Tokenize" usually means "identify and split the input stream into tokens"
and it can mean _whatever_you_make_it_to_mean_ because it depends entirely
on your definition of "a token".
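
To illustrate how much depends on that definition, here is a small sketch. The helper `split` is hypothetical (it is not a standard library function): it treats every character in `separators` as a token boundary and throws the boundary characters away, so the caller decides what a token is simply by choosing the separator set.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split `text` into the runs of characters that lie between any of the
// characters in `separators`. The separators themselves are discarded.
std::vector<std::string> split(const std::string& text,
                               const std::string& separators) {
    std::vector<std::string> tokens;
    std::size_t start = text.find_first_not_of(separators);
    while (start != std::string::npos) {
        // `end` is npos when the token runs to the end of the text;
        // substr then simply takes everything that is left.
        const std::size_t end = text.find_first_of(separators, start);
        tokens.push_back(text.substr(start, end - start));
        start = text.find_first_not_of(separators, end);
    }
    return tokens;
}
```

For the line `10 = NAND(1, 3)`, the separator set `" \t=(),"` yields the names `10`, `NAND`, `1`, `3`, while splitting on blanks and tabs alone yields `10`, `=`, `NAND(1,` and `3)`. Same input, different tokens.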

V
--
Please remove capital As from my address when replying by mail

Feb 15 '06 #7

olson_ord

Thanks Ivan. I have heard of RegEx's - but I have not used them
much. I think I would start with string tokenizer and if that becomes
too complicated I would attempt this method.
O.O.

Feb 15 '06 #8

pillbug

ol*******@yahoo.it wrote:

> Thanks Ivan. I have heard of RegEx's - but I have not used them
> much. I think I would start with string tokenizer and if that becomes
> too complicated I would attempt this method.
> O.O.

most implementations of sscanf will handle this for you no problem.

if you don't like sscanf you can use regex.

if you don't like regex you can use lex (probably don't need a parser,
just the scanner should suffice).

if you don't like lex/yacc you can write your own scanner, the grammar
you have there isn't too complex.

if you don't want to write your own scanner you can..
actually that's sort of the problem people have with c++. your options
when dealing with any particular problem are quite literally endless.

perl kind of herds you into trying to approach everything with regex's
and hashes, while vb/.net will get you to buy some prebuilt item. i'm
guessing that's why you came to a c++ group, to find out what approach c++
lends itself most easily to. unfortunately, c++ lends itself to just
about every solution :p

Feb 16 '06 #9

olson_ord

Thanks pillbug. I did not have this insight before.
O.O.

Feb 16 '06 #10
