Re: String Delimiters

On Sep 12, 7:25 am, Ruben <ru...@www2.mrbrklyn.com> wrote:
> I have strings that are something like
> 3.4meq/kg/day
> which I want to parse. The numbers on the front are critical
> for later calculations. But I want to pull the descriptors
> apart. I can use get() to do this, but it just seems so
> sloppy.
> This is a code snippet that extracts the numbers. Can't I
> change the delimiter for the istringstream class?

You can specify a delimiter when calling getline, but I'd tend
to consider that a bit of an abuse. I'd just parse the string
directly, rather than using stringstream. Possibly with
boost::regex, but something as simple as the following should
work:

    #include <algorithm>
    #include <string>
    #include <vector>

    std::vector< std::string >
    breakIntoFields(
        std::string const&  source,
        char                separator )
    {
        std::vector< std::string >
                            result ;
        std::string::const_iterator
                            current = source.begin() ;
        while ( current != source.end() ) {
            //  Find the end of the current field.
            std::string::const_iterator
                                next
                = std::find( current, source.end(), separator ) ;
            result.push_back( std::string( current, next ) ) ;
            if ( next != source.end() ) {
                ++ next ;       //  Skip the separator itself.
            }
            current = next ;    //  Advance, or the loop never ends.
        }
        return result ;
    }
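
Applied to the original question, the splitting function would be called after the number has been peeled off the front. The following is a minimal sketch, not part of the original post: `Quantity`, `parseQuantity` and `demoParseQuantity` are hypothetical names, the input is assumed to start with a number that `std::strtod` accepts in the "C" locale, and `breakIntoFields` is repeated only so that the block compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Same splitting logic as in the post, repeated to keep this block self-contained.
std::vector< std::string >
breakIntoFields( std::string const& source, char separator )
{
    std::vector< std::string > result ;
    std::string::const_iterator current = source.begin() ;
    while ( current != source.end() ) {
        std::string::const_iterator next
            = std::find( current, source.end(), separator ) ;
        result.push_back( std::string( current, next ) ) ;
        if ( next != source.end() ) {
            ++ next ;           // skip the separator
        }
        current = next ;
    }
    return result ;
}

// Hypothetical helper type: the leading number plus its unit descriptors.
struct Quantity
{
    double                      value ;
    std::vector< std::string >  units ;
} ;

// Assumption: the text starts with a number; anything after it is a list of
// descriptors separated by '/'.
Quantity
parseQuantity( std::string const& text )
{
    char const* begin = text.c_str() ;
    char*       end   = 0 ;
    double      value = std::strtod( begin, &end ) ;
    if ( end == begin ) {
        throw std::invalid_argument( "no leading number in: " + text ) ;
    }
    Quantity result ;
    result.value = value ;
    result.units = breakIntoFields( std::string( end ), '/' ) ;
    return result ;
}

// Small self-check showing the expected decomposition.
void
demoParseQuantity()
{
    Quantity q = parseQuantity( "3.4meq/kg/day" ) ;
    assert( q.units.size() == 3 ) ;
    assert( q.units[ 0 ] == "meq" ) ;
    assert( q.units[ 1 ] == "kg" ) ;
    assert( q.units[ 2 ] == "day" ) ;
}
```

Note that `std::strtod` also accepts exponents, so a descriptor that begins with something like `e5` directly after the number would be read as part of the number; a stricter scanner would be needed if such inputs can occur.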

(BTW: the generally acceptable maximum size for a signature is
four lines.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Sep 12 '08 #1
On Fri, 12 Sep 2008 01:34:22 -0700, James Kanze wrote:
> On Sep 12, 7:25 am, Ruben <ru...@www2.mrbrklyn.com> wrote:
>> I have strings that are something like
>> 3.4meq/kg/day
>> which I want to parse. The numbers on the front are critical for later
>> calculations. But I want to pull the descriptors apart. I can use
>> get() to do this, but it just seems so sloppy.
>> This is a code snippet that extracts the numbers. Can't I change the
>> delimiter for the istringstream class?
>
> You can specify a delimiter when calling getline, but I'd tend to consider
> that a bit of an abuse. I'd just parse the string directly, rather than
> using stringstream. Possibly with boost::regex, but something as simple as
> the following should work:
>
>     std::vector< std::string >
>     breakIntoFields(
>         std::string const&  source,
>         char                separator )
>     {
>         std::vector< std::string >
>                             result ;
>         std::string::const_iterator
>                             current = source.begin() ;
>         while ( current != source.end() ) {
>             std::string::const_iterator
>                                 next
>                 = std::find( current, source.end(), separator ) ;
>             result.push_back( std::string( current, next ) ) ;
>             if ( next != source.end() ) {
>                 ++ next ;
>             }
>             current = next ;
>         }
>         return result ;
>     }
>
> (BTW: the generally acceptable maximum size for a signature is four
> lines.)

Thanks James:

I was considering doing something along these lines but I'm really quite
surprised that I can't just change the default delimiters for iostreams.

Thanks for the concise and easy to understand mock up. It sure makes one
appreciate Perl's split function.

Ruben
--
http://www.mrbrklyn.com - Interesting Stuff
http://www.nylxs.com - Leadership Development in Free Software
So many immigrant groups have swept through our town that Brooklyn, like Atlantis, reaches mythological proportions in the mind of the world - RI Safir 1998
http://fairuse.nylxs.com DRM is THEFT - We are the STAKEHOLDERS - RI Safir 2002
© Copyright for the Digital Millennium

Sep 12 '08 #2
On Sep 12, 5:23 pm, Ruben <ru...@www2.mrbrklyn.com> wrote:
> On Fri, 12 Sep 2008 01:34:22 -0700, James Kanze wrote:

    [...]
> I was considering doing something along these lines but I'm
> really quite surprised that I can't just change the default
> delimiters for iostreams.

Well, you can, in fact, but I find it more abuse than use as
designed, as well as a lot of work. The delimiters for >> are
defined by the locale, and you can define custom locales, and set
the stream to use them. There is also a form of getline which
takes an additional argument for the delimiter. My personal
feeling, however, is that locales should be used for specifying
locale specific data and functionality, not implementing
parsing, and that if getline is used to do something else, it's
sort of lying to the reader.
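
For reference, the two mechanisms mentioned in this paragraph can be sketched roughly as follows. This is an illustration under assumptions rather than code from the thread: `splitWithGetline`, `SeparatorTable`, `SeparatorIsSpace`, `splitWithLocale` and `demoStreamSplitting` are made-up names, and the facet relies on `std::ctype< char >::classic_table()` being publicly accessible, which holds from C++11 on.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Variant 1: std::getline with an explicit delimiter. Empty fields between
// adjacent separators are kept.
std::vector< std::string >
splitWithGetline( std::string const& source, char separator )
{
    std::istringstream in( source ) ;
    std::vector< std::string > result ;
    std::string field ;
    while ( std::getline( in, field, separator ) ) {
        result.push_back( field ) ;
    }
    return result ;
}

// Copy of the classic "C" classification table in which one extra character
// is also flagged as whitespace. It lives in a separate base class so that
// the table is fully built before std::ctype< char > is constructed.
struct SeparatorTable
{
    std::vector< std::ctype_base::mask > masks ;

    explicit SeparatorTable( char separator )
        : masks( std::ctype< char >::classic_table(),
                 std::ctype< char >::classic_table()
                     + std::ctype< char >::table_size )
    {
        masks[ static_cast< unsigned char >( separator ) ]
            |= std::ctype_base::space ;
    }
} ;

// Variant 2: a ctype facet that makes >> treat the separator as whitespace.
class SeparatorIsSpace : private SeparatorTable, public std::ctype< char >
{
public:
    explicit SeparatorIsSpace( char separator )
        : SeparatorTable( separator )
        , std::ctype< char >( &masks[ 0 ] )
    {
    }
} ;

std::vector< std::string >
splitWithLocale( std::string const& source, char separator )
{
    std::istringstream in( source ) ;
    // The locale takes ownership of the facet and deletes it when done.
    in.imbue( std::locale( in.getloc(), new SeparatorIsSpace( separator ) ) ) ;
    std::vector< std::string > result ;
    std::string field ;
    while ( in >> field ) {
        result.push_back( field ) ;
    }
    return result ;
}

// Self-check: the two variants differ on adjacent separators.
void
demoStreamSplitting()
{
    assert( splitWithGetline( "meq/kg/day", '/' ).size() == 3 ) ;
    assert( splitWithLocale( "meq/kg/day", '/' ).size() == 3 ) ;
    assert( splitWithGetline( "a//b", '/' ).size() == 3 ) ;  // "a", "", "b"
    assert( splitWithLocale( "a//b", '/' ).size() == 2 ) ;   // "a", "b"
}
```

The amount of scaffolding needed for the locale variant is a fair illustration of the "lot of work" objection above.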

In the past, Dietmar Kuehl has also pointed out that you can
easily use a filtering streambuf to convert your separators into
spaces, so that >> will work as expected as well. I'll admit
that I find that a bit of obfuscation as well (although I use
filtering streambuf's in a lot of other cases).
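
One possible shape for such a filtering streambuf is sketched below. It is not Dietmar Kuehl's or the poster's actual code: the class name `SeparatorToSpaceBuf` and the helpers `readFields` and `demoFilteringBuf` are invented for illustration, and the filter handles input only, one character at a time.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Input-only filtering streambuf: forwards characters from another streambuf,
// replacing every occurrence of the separator with a space.
class SeparatorToSpaceBuf : public std::streambuf
{
public:
    SeparatorToSpaceBuf( std::streambuf* source, char separator )
        : mySource( source )
        , mySeparator( separator )
        , myBuffer( '\0' )
    {
    }

protected:
    // Called whenever the one-character get area has been consumed.
    virtual int_type underflow()
    {
        int_type ch = mySource->sbumpc() ;
        if ( traits_type::eq_int_type( ch, traits_type::eof() ) ) {
            return traits_type::eof() ;
        }
        myBuffer = traits_type::to_char_type( ch ) ;
        if ( myBuffer == mySeparator ) {
            myBuffer = ' ' ;
        }
        // Expose exactly one character to the stream.
        setg( &myBuffer, &myBuffer, &myBuffer + 1 ) ;
        return traits_type::to_int_type( myBuffer ) ;
    }

private:
    std::streambuf* mySource ;      // not owned
    char            mySeparator ;
    char            myBuffer ;
} ;

// Reads whitespace-separated words through the filter, so that the chosen
// separator behaves like a space for operator>>.
std::vector< std::string >
readFields( std::string const& source, char separator )
{
    std::istringstream  raw( source ) ;
    SeparatorToSpaceBuf filter( raw.rdbuf(), separator ) ;
    std::istream        in( &filter ) ;
    std::vector< std::string > result ;
    std::string word ;
    while ( in >> word ) {
        result.push_back( word ) ;
    }
    return result ;
}

// Self-check on the string from the original question, minus the number.
void
demoFilteringBuf()
{
    std::vector< std::string > fields = readFields( "meq/kg/day", '/' ) ;
    assert( fields.size() == 3 ) ;
    assert( fields[ 0 ] == "meq" ) ;
    assert( fields[ 2 ] == "day" ) ;
}
```

Because the stream itself is left untouched, the same filter can sit in front of `std::cin` or a file stream by passing its `rdbuf()`.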

> Thanks for the concise and easy to understand mock up. It sure
> makes one appreciate Perl's split function.

Long before I learned C++, I was using AWK for a lot of small,
text based processing. My first classes in C++ were String
(this was before the standard), RegularExpression and
FieldArray: the code I posted is basically from an earlier
version of CharacterSeparatedFieldArray. (Earlier versions of
FieldArray used the template method pattern to customize what
made a field---CharacterSeparatedFieldArray corresponded to a
single, arbitrary character as separator.) In sum, I added the
functionality I was used to in AWK.

If you're doing any work with text at all, you should build
yourself a small toolkit with such tools, based on what is
available in a text oriented language, like AWK or Perl. (Much
of what I use is available in the Text subsystem at my site:
http://kanze.james.neuf.fr/code-en.html.)

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Sep 12 '08 #3
On Fri, 12 Sep 2008 15:21:58 -0700, James Kanze wrote:
> On Sep 12, 5:23 pm, Ruben <ru...@www2.mrbrklyn.com> wrote:
>> On Fri, 12 Sep 2008 01:34:22 -0700, James Kanze wrote:
>
> [...]
>> I was considering doing something along these lines but I'm really quite
>> surprised that I can't just change the default delimiters for iostreams.
>
> Well, you can, in fact, but I find it more abuse than use as designed, as
> well as a lot of work. The delimiters for >> are defined by the locale,
> and you can define custom locales, and set the stream to use them. There
> is also a form of getline which takes an additional argument for the
> delimiter.

I saw that, but this has its own problems. Evidently the overloaded
cin >> operator leaves the trailing line feeds in the stream. So you need
to run a cin.ignore() when using getline.

This is problematic in so many ways. What if there is more than one
linefeed? A user would never do that. That is a programming problem that
should never happen. getline should ignore the initial white space.
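
For what it's worth, the standard library does offer a way to get that behaviour without counting line feeds: the `std::ws` manipulator discards all leading whitespace before `getline` runs. The sketch below uses hypothetical names (`readCountThenLine`, `demoReadCountThenLine`) and assumes the line of interest is not meant to be empty or to start with significant spaces, since `std::ws` would swallow those as well.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads an integer with >>, then the next non-blank line with getline.
// Returns false if either read fails.
bool
readCountThenLine( std::istream& in, int& count, std::string& line )
{
    if ( !( in >> count ) ) {
        return false ;
    }
    // >> stops before the newline; std::ws skips it, along with any further
    // blank lines and leading spaces, so no cin.ignore() is required.
    return !std::getline( in >> std::ws, line ).fail() ;
}

// Self-check with several line feeds between the number and the text.
void
demoReadCountThenLine()
{
    std::istringstream in( "42\n\n\n  hello world\nnext\n" ) ;
    int         count = 0 ;
    std::string line ;
    assert( readCountThenLine( in, count, line ) ) ;
    assert( count == 42 ) ;
    assert( line == "hello world" ) ;
}
```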

> My personal feeling, however, is that locales should be used for
> specifying locale specific data and functionality, not implementing
> parsing, and that if getline is used to do something else, it's sort of
> lying to the reader.
>
> In the past, Dietmar Kuehl has also pointed out that you can easily use
> a filtering streambuf to convert your separators into spaces, so that >>
> will work as expected as well. I'll admit that I find that a bit of
> obfuscation as well (although I use filtering streambuf's in a lot of
> other cases).

Eh. strtok is looking better and better, which is really distressing to
think about because the C string library, IMO, is a kludge of disjointed
functions that don't have consistent syntax or behaviors.

>> Thanks for the concise and easy to understand mock up. It sure makes
>> one appreciate Perl's split function.
>
> Long before I learned C++, I was using AWK for a lot of small, text
> based processing. My first classes in C++ were String (this was before
> the standard), RegularExpression and FieldArray: the code I posted is
> basically from an earlier version of CharacterSeparatedFieldArray.
> (Earlier versions of FieldArray used the template method pattern to
> customize what made a field---CharacterSeparatedFieldArray corresponded
> to a single, arbitrary character as separator.) In sum, I added the
> functionality I was used to in AWK.

I was thinking that this is going to have to be done, and it's beyond me
how at this late stage the C++ string interface seems so broken from a
design perspective.

At minimum I'm going to have to build chomp and a simple form of split.
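
A chomp along those lines is only a few lines of C++. The version below is a sketch under assumptions, not an exact port of Perl's function: it removes a single trailing "\n" (or "\r\n", which is an added convenience) and returns the number of characters removed. The names `chomp` and `demoChomp` are chosen here for illustration.

```cpp
#include <cassert>
#include <string>

// Removes one trailing line terminator ("\n" or "\r\n") in place and returns
// how many characters were removed. Anything else is left untouched.
std::string::size_type
chomp( std::string& text )
{
    std::string::size_type removed = 0 ;
    if ( !text.empty() && text[ text.size() - 1 ] == '\n' ) {
        text.erase( text.size() - 1 ) ;
        ++ removed ;
        if ( !text.empty() && text[ text.size() - 1 ] == '\r' ) {
            text.erase( text.size() - 1 ) ;
            ++ removed ;
        }
    }
    return removed ;
}

// Self-check covering the common cases.
void
demoChomp()
{
    std::string unixLine( "line\n" ) ;
    assert( chomp( unixLine ) == 1 && unixLine == "line" ) ;

    std::string dosLine( "line\r\n" ) ;
    assert( chomp( dosLine ) == 2 && dosLine == "line" ) ;

    std::string bare( "line" ) ;
    assert( chomp( bare ) == 0 && bare == "line" ) ;
}
```

Lines read with `std::getline` already have their terminating "\n" stripped, so a helper like this matters mostly for text that arrives by other routes.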

> If you're doing any work with text at all, you should build yourself a
> small toolkit with such tools, based on what is available in a text
> oriented language, like AWK or Perl. (Much of what I use is available
> in the Text subsystem at my site:
> http://kanze.james.neuf.fr/code-en.html.)

I'll take a look. Thanks. I have to admit that I find the string and
sstream tandem just bizarre, error prone and difficult to learn and use.
Here I am learning an object-oriented language which should facilitate the
development of rational APIs, and yet C++ is leaving me to memorize a
bunch of language-specific, randomized, useless implementation facts and
method weirdness. I'm probably just not getting it, but it seems that the
I/O functionality is completely in disarray.

Ruben

--
http://www.mrbrklyn.com - Interesting Stuff
http://www.nylxs.com - Leadership Development in Free Software

So many immigrant groups have swept through our town that Brooklyn, like Atlantis, reaches mythological proportions in the mind of the world - RI Safir 1998

http://fairuse.nylxs.com DRM is THEFT - We are the STAKEHOLDERS - RI Safir 2002

"Yeah - I write Free Software...so SUE ME"

"The tremendous problem we face is that we are becoming sharecroppers to our own cultural heritage -- we need the ability to participate in our own society."

"I'm an engineer. I choose the best tool for the job, politics be damned.<
You must be a stupid engineer then, because politics and technology have been attached at the hip since the 1st dynasty in Ancient Egypt. I guess you missed that one."

© Copyright for the Digital Millennium

Sep 14 '08 #4
