Parsing - is this a sensible idea?

I have a program that needs to do a small amount of relatively simple
parsing. The routines I've written work fine, but the code using them
is a bit long-winded.

I therefore had the idea of creating a class to do parsing. It could
be used as follows:

int a, n, x, y;
Parser par;
par << string;
if (par >> "From" >> ' ' >> x >> ' ' >> "to" >> ' ' >> y) a = 1;
else if (par >> "Number" >> ' ' >> n) a = 2;
else a = 3;

Then if string is "From 3 to 5" this will set a=1, x=3, y=5. If the
string is "Number 2" this will set a=2 and n=2. If string is
"Other" then a=3. For convenience, I'll assume that an input of "From
4 other" is allowed to alter the value of x while returning a=3.

I think I could write a class that would do this. It would need to
keep track of whether the current parsing was succeeding and, if so,
how far through the string it had got. It would need overloaded >>
operators, obviously, some of them taking references. And it would
need a conversion operator, which I think would need to be to void *,
which would not only return whether the current parse had succeeded
but would also reset the flag and counter ready for another attempt.
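
A minimal sketch of the class described above might look like the following. It assumes C++11. The member names (text_, pos_, ok_), the classify helper, the rule that ' ' skips a possibly empty run of whitespace, and the restriction to unsigned decimal integers are all assumptions made for illustration, not part of the original design.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical sketch; names and details are assumptions, not the poster's code.
class Parser {
public:
    // Load a new string and start parsing from its beginning.
    Parser& operator<<(const std::string& s) {
        text_ = s;
        reset();
        return *this;
    }

    // Succeeds if the literal appears at the current position.
    Parser& operator>>(const char* literal) {
        if (ok_) {
            const std::size_t len = std::strlen(literal);
            if (text_.compare(pos_, len, literal) == 0) pos_ += len;
            else ok_ = false;
        }
        return *this;
    }

    // ' ' skips a (possibly empty) run of whitespace; any other character
    // must match exactly.
    Parser& operator>>(char c) {
        if (!ok_) return *this;
        if (c == ' ') {
            while (pos_ < text_.size() && is_space(text_[pos_])) ++pos_;
        } else if (pos_ < text_.size() && text_[pos_] == c) {
            ++pos_;
        } else {
            ok_ = false;
        }
        return *this;
    }

    // Reads an unsigned decimal integer (no overflow check) into value.
    Parser& operator>>(int& value) {
        if (!ok_) return *this;
        const std::size_t start = pos_;
        int result = 0;
        while (pos_ < text_.size() && is_digit(text_[pos_])) {
            result = result * 10 + (text_[pos_] - '0');
            ++pos_;
        }
        if (pos_ == start) ok_ = false;  // no digits found
        else value = result;
        return *this;
    }

    // Reports whether the chain succeeded, then resets the flag and position
    // so that the next chain starts from the beginning of the string.
    operator void*() {
        void* result = ok_ ? this : nullptr;
        reset();
        return result;
    }

private:
    static bool is_space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
    static bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
    void reset() { ok_ = true; pos_ = 0; }

    std::string text_;
    std::size_t pos_ = 0;
    bool ok_ = true;
};

// Mirrors the usage shown in the post; returns the value the post calls a.
int classify(const std::string& input, int& x, int& y, int& n) {
    Parser par;
    par << input;
    if (par >> "From" >> ' ' >> x >> ' ' >> "to" >> ' ' >> y) return 1;
    else if (par >> "Number" >> ' ' >> n) return 2;
    return 3;
}
```

Note that the conversion operator changes the object's state, so testing the same parser twice in a row gives different answers. In C++11 and later, an explicit operator bool is the usual replacement for the void* conversion.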

So my questions are, is this a sensible thing to try to do, and are
there any potential snags that I haven't spotted?

Thanks.
Paul.
Nov 16 '08 #1
On 2008-11-16 22:16, gw****@aol.com wrote:
> So my questions are, is this a sensible thing to try to do, and are
> there any potential snags that I haven't spotted?

If you need to parse a lot you should probably try a tool like yacc or
some other parser generator. If you only need to be able to parse a very
small grammar (and want a good exercise) you can try to write the state
machine by hand.

Your example looks like a runtime construct (though perhaps you can make
it compile-time with some fancy template metaprogramming), which does
not sound like a good idea to me.
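
For a grammar this small, the hand-written state machine mentioned above could be sketched roughly as follows. This assumes C++11 and whitespace-separated tokens. The State and Result types and the parse_line and to_int functions are hypothetical names chosen for illustration, and the result kinds 1, 2 and 3 simply mirror the values of a in the original post.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical names throughout; this is a sketch, not a reference design.
enum class State { Start, FromX, FromTo, FromY, NumberN, Done, Error };

struct Result {
    int kind;  // 1 = "From x to y", 2 = "Number n", 3 = anything else
    int x, y, n;
};

// Converts a token made only of decimal digits (no overflow check).
bool to_int(const std::string& tok, int& out) {
    if (tok.empty()) return false;
    int value = 0;
    for (char c : tok) {
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
        value = value * 10 + (c - '0');
    }
    out = value;
    return true;
}

Result parse_line(const std::string& line) {
    Result r{3, 0, 0, 0};
    std::istringstream in(line);
    std::string tok;
    State s = State::Start;
    // Each token moves the machine to the next state or to Error.
    while (s != State::Error && in >> tok) {
        switch (s) {
        case State::Start:
            s = tok == "From" ? State::FromX
              : tok == "Number" ? State::NumberN
              : State::Error;
            break;
        case State::FromX:
            s = to_int(tok, r.x) ? State::FromTo : State::Error;
            break;
        case State::FromTo:
            s = tok == "to" ? State::FromY : State::Error;
            break;
        case State::FromY:
            s = to_int(tok, r.y) ? State::Done : State::Error;
            if (s == State::Done) r.kind = 1;
            break;
        case State::NumberN:
            s = to_int(tok, r.n) ? State::Done : State::Error;
            if (s == State::Done) r.kind = 2;
            break;
        case State::Done:   // a trailing token after a complete command
            s = State::Error;
            break;
        case State::Error:  // not reached; the loop stops on Error
            break;
        }
    }
    if (s != State::Done) r.kind = 3;
    return r;
}
```

Unlike the operator-chaining idea, this version rejects trailing tokens after a complete command. As in the original post, a failed "From 4 other" still leaves x set to 4.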

--
Erik Wikström
Nov 16 '08 #2
On 16 Nov, 21:42, Erik Wikström <Erik-wikst...@telia.com> wrote:

> If you need to parse a lot you should probably try a tool like yacc or
> some other parser generator. If you only need to be able to parse a very
> small grammar (and want a good exercise) you can try to write the state
> machine by hand.

I don't think I'm going to be doing that much parsing, though I'll
bear that in mind if I do.

> Your example looks like a runtime construct (though perhaps you can make
> it compile-time with some fancy template metaprogramming), which does
> not sound like a good idea to me.

How my example works: par >> "text" will check whether the
next bit of the string to be parsed contains the characters "text".
par >> n will check whether the next bit of the string is a number
and, if so, set n to that number. par >> ' ' will skip whitespace. The
routine doesn't build up a "template" of what the string is supposed
to look like; it just checks each bit of it in turn, as I would have
thought any parser needs to.

Thanks for any further thoughts.
Paul.
Nov 16 '08 #3

Paul wrote:
> How my example works - par >> "text" will check to see whether the
> next bit of the string to be parsed contains the characters "text".
> par >> n will check to see if the next bit of the string is a number,
> and if so, set n to that number. par >> ' ' will skip whitespace. The
> routine doesn't build up a "template" of what the string is supposed
> to look like, it just checks each bit of it in turn, as I would have
> thought any parser needs to.

It is definitely possible.

The only part of your design that sticks out as really weird is the
side effects of the conversion operator. I would prefer to have the
operator>> overloads return copies of the original with the changed
member variables. If you use a reference-counting smart pointer for
the string, your class would be no larger than 4 integers on most
platforms (one for the pointer, one for its reference count, one for the
position and less than one for the flag). The cost of copying four
integers is not terrible. If all the lines you want to parse are
fairly short, as in your examples, you won't be making too many
copies. This is likely a reasonable tradeoff for avoiding the magic
in the operator void*().

In general, though, returning copies is not scalable. On the other
hand, your design has limited scalability too, as advanced parsing
requires more sophisticated techniques. But considering your
examples, it sounds like you don't need a powerful parser, just
something to parse simple strings, so all this might be just fine
for you.
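
The copy-returning alternative suggested above could be sketched like this, assuming C++11. The class name ParseState, the helper classify_by_copy, and the use of std::shared_ptr plus explicit operator bool (in place of a void* conversion) are assumptions made for illustration. For brevity, operator>>(char) here only skips whitespace, and integers are unsigned decimal.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical sketch of the copy-returning design; all names are assumptions.
class ParseState {
public:
    explicit ParseState(const std::string& s)
        : text_(std::make_shared<std::string>(s)) {}

    // Each operator leaves *this untouched and returns an updated copy.
    ParseState operator>>(const char* literal) const {
        ParseState next(*this);
        if (!ok_) return next;
        const std::size_t len = std::strlen(literal);
        if (text_->compare(pos_, len, literal) == 0) next.pos_ += len;
        else next.ok_ = false;
        return next;
    }

    // Skips a (possibly empty) run of whitespace, whatever character is passed.
    ParseState operator>>(char) const {
        ParseState next(*this);
        if (!ok_) return next;
        while (next.pos_ < text_->size() &&
               std::isspace(static_cast<unsigned char>((*text_)[next.pos_]))) {
            ++next.pos_;
        }
        return next;
    }

    // Reads an unsigned decimal integer (no overflow check) into value.
    ParseState operator>>(int& value) const {
        ParseState next(*this);
        if (!ok_) return next;
        int result = 0;
        while (next.pos_ < text_->size() &&
               std::isdigit(static_cast<unsigned char>((*text_)[next.pos_]))) {
            result = result * 10 + ((*text_)[next.pos_] - '0');
            ++next.pos_;
        }
        if (next.pos_ == pos_) next.ok_ = false;  // no digits found
        else value = result;
        return next;
    }

    // No side effects: only reports whether every step so far succeeded.
    explicit operator bool() const { return ok_; }

private:
    std::shared_ptr<const std::string> text_;  // shared between copies
    std::size_t pos_ = 0;
    bool ok_ = true;
};

int classify_by_copy(const std::string& input, int& x, int& y, int& n) {
    const ParseState start(input);  // never modified, so every chain restarts here
    if (start >> "From" >> ' ' >> x >> ' ' >> "to" >> ' ' >> y) return 1;
    if (start >> "Number" >> ' ' >> n) return 2;
    return 3;
}
```

Because start is const and never changes, no reset is needed between attempts, which removes the hidden state change from the conversion operator.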

Nov 17 '08 #4
On Nov 16, 11:09 pm, gw7...@aol.com wrote:
> How my example works - par >> "text" will check to see whether
> the next bit of the string to be parsed contains the
> characters "text".

I think that that's what I really don't care for in it. One
expects >> to read, not to check.

What's wrong with just using boost::regex?
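
As a rough illustration of the regular-expression route, here is a sketch that uses std::regex from the C++11 standard library rather than boost::regex, so that it needs no external dependency. The function name classify_with_regex and the two patterns are assumptions based on the example strings in the original post.

```cpp
#include <regex>
#include <string>

// Hypothetical helper; the patterns are guesses at the intended grammar.
int classify_with_regex(const std::string& line, int& x, int& y, int& n) {
    static const std::regex from_re(R"(From\s+(\d+)\s+to\s+(\d+))");
    static const std::regex number_re(R"(Number\s+(\d+))");

    std::smatch m;
    // regex_match succeeds only if the whole line matches the pattern.
    if (std::regex_match(line, m, from_re)) {
        x = std::stoi(m[1].str());  // std::stoi throws if the number is too large
        y = std::stoi(m[2].str());
        return 1;
    }
    if (std::regex_match(line, m, number_re)) {
        n = std::stoi(m[1].str());
        return 2;
    }
    return 3;
}
```

One difference from the operator-chaining idea: nothing is assigned unless the whole line matches, so an input such as "From 4 other" leaves x unchanged.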

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Nov 17 '08 #5
In article <52baba84-3b2a-40fc-b95b-
e8**********@a17g2000prm.googlegroups.com>, gw****@aol.com says...
> I have a program that needs to do a small amount of relatively simple
> parsing. The routines I've written work fine, but the code using them
> is a bit long-winded.
>
> I therefore had the idea of creating a class to do parsing. It could
> be used as follows:

Depending on what you're doing, I'd consider using a regular expression
library such as boost::regex, or a template-based parser generator such
as boost::Spirit 2.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Nov 19 '08 #6
rwp
I wrote a class like that a few years ago and it turned out to be quite
useful.

Example code:

string Part1, Part2, Key;
parse_str(Line) >> Part1 >> "%" >> Key >> "%" >> Part2 >> "";
if (Key.size() != 0) ...

The class was modelled after the Rexx parse command so it uses some
special strings like
"." for word
"10" to go to position 10 in the string
"+10" to go 10 positions forward in the string
"," to go to the next line in the string etc.

The construction of the class is as follows

parse_str(const string& in_s) ...
Constructor that just saves the string variable internally

// method that picks up the integer variable to assign a value to and returns
// the object so that further >> operators can be chained
parse_str& operator>>(int& ival)
{
wordstep();
(this->*m_try_assign)();
m_pvar = (void*)&ival;
m_try_assign = &parse_str::try_assign_int;
m_wordmatch = 1;
return *this;
}

// method that recognizes special strings and search items
parse_str& operator>>(const char* in_psz)...

// method that converts a part of the parse string to an integer
int try_assign_int() ...
// variables

void* m_pvar; // pointer to variable to set value to
const string m_str; // string passed in as argument to constructor
int (t_parse_string::* m_try_assign)(void); // pointer to the member function that assigns the variable
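
The key idea in the fragments above seems to be deferred assignment: a variable cannot be filled in until the next delimiter shows where its text ends. The following is a much-simplified sketch of that idea only, assuming C++11. It handles std::string targets and literal delimiters, supports none of the Rexx-style special strings, and treats "" as "rest of the string", which is inferred from the example. DeferredParse and split_on_percent are hypothetical names, and a plain pointer stands in for the void* and member-function-pointer pair.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical, simplified sketch; not the class from the post.
class DeferredParse {
public:
    explicit DeferredParse(const std::string& s) : str_(s) {}

    // Only remember where the next field should go; its end is not known yet.
    DeferredParse& operator>>(std::string& target) {
        pending_ = &target;
        return *this;
    }

    // A delimiter ends the pending field. "" means "up to the end of the string".
    DeferredParse& operator>>(const char* delim) {
        const std::string d(delim);
        std::size_t end = d.empty() ? str_.size() : str_.find(d, pos_);
        if (end == std::string::npos) end = str_.size();  // delimiter missing: take the rest
        if (pending_) {
            *pending_ = str_.substr(pos_, end - pos_);
            pending_ = nullptr;
        }
        pos_ = std::min(end + d.size(), str_.size());  // continue after the delimiter
        return *this;
    }

private:
    const std::string str_;           // string passed to the constructor
    std::size_t pos_ = 0;             // current position in str_
    std::string* pending_ = nullptr;  // variable waiting for its value
};

// Usage in the style of the example above.
void split_on_percent(const std::string& line, std::string& part1,
                      std::string& key, std::string& part2) {
    DeferredParse(line) >> part1 >> "%" >> key >> "%" >> part2 >> "";
}
```

With this sketch, a line without any "%" puts the whole line in part1 and leaves key empty, which is consistent with the Key.size() check in the example.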

Nov 19 '08 #7
