468,719 Members | 1,810 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 468,719 developers. It's quick & easy.

Iterating through a string

Hi Guys,

I have a string which contains data elements separated by spaces. I also
have a function which returns the number of characters from the beginning of
the string for a given number of spaces. I am using a loop with strchr for
the number of spaces, and I was just wondering if there was a more efficient
(ie faster) way of achieving this. Any ideas please? Current function code
below:

int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

do {
b = strchr(b, ' '); // find next space
++i; ++b; // increment space counter and pointer
} while(i<Spaces && b != NULL);

if(b) return b - DataStr.c_str();
else return -1;
}
Any help is appreciated,
Thanks for reading,
Steve.
Jul 22 '05 #1
16 4107
On Tue, 8 Jun 2004 22:42:36 +0100 in comp.lang.c++, "Steve"
<st*******@hotmail.com> wrote,
the string for a given number of spaces. I am using a loop with strchr for
the number of spaces, and I was just wondering if there was a more efficient
(ie faster) way of achieving this.


For efficiency, avoid going over the same ground more than once, i.e.
continue from the previous position and do not restart at the beginning
of the string.

Use string::find() functions rather than .c_str() and dropping down to
low-level old library strchr().

Do not subtract one .c_str() from another. There is no guarantee that
..c_str() will point to the same buffer every time.

See if using a string splitter function would suit you better?
http://groups.google.com/gr*********....earthlink.net

Jul 22 '05 #2
There's probably a standard library input stream class which you can use to
suck out the pieces of data you want.
If you want to roll your own, you could create a specialized iterator object
which takes your string and lets you iterate over it. It seems that you
are always starting from the beginning when you want the next position and
count through the string to find the next one. By using an object to wrap
the
string, you could maintain this state information in the object and wouldn't
need to compute it from scratch every time.
class FooBar
{
public:
FooBar( const std::string& DataStr )
:_DataStr(DataStr), _position(0)
{
// move position to point to the first data item.
}

std::string GetDataItem( ) const
{
// return a string representing the next data item.
}
bool IsValid() const ; // return true if there are more items to return,
false otherwise.
bool Next(); // advance through datastring until end of string or next
data item is found,
// update position counter.

bool First() ; // wind back position to be at the first data item, so we
can iterate again over
// the string.

private:
const std::string _DataStr
int _position;

};

.......

FooBar myfoobar( DataStr);

for ( myfoobar.First() ; myfoobar.IsValid( DataItem); myfoobar.Next() )
{
string ds = myfoobar.GetDataItem();

}

"Steve" <st*******@hotmail.com> wrote in message
news:Ohrxc.1911$ud5.1728@newsfe4-gui...
Hi Guys,

I have a string which contains data elements separated by spaces. I also
have a function which returns the number of characters from the beginning of the string for a given number of spaces. I am using a loop with strchr for
the number of spaces, and I was just wondering if there was a more efficient (ie faster) way of achieving this. Any ideas please? Current function code
below:

int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

do {
b = strchr(b, ' '); // find next space
++i; ++b; // increment space counter and pointer
} while(i<Spaces && b != NULL);

if(b) return b - DataStr.c_str();
else return -1;
}
Any help is appreciated,
Thanks for reading,
Steve.

Jul 22 '05 #3
In article <Ohrxc.1911$ud5.1728@newsfe4-gui>,
"Steve" <st*******@hotmail.com> wrote:
Hi Guys,

I have a string which contains data elements separated by spaces. I also
have a function which returns the number of characters from the beginning of
the string for a given number of spaces. I am using a loop with strchr for
the number of spaces, and I was just wondering if there was a more efficient
(ie faster) way of achieving this. Any ideas please? Current function code
below:

int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

do {
b = strchr(b, ' '); // find next space
++i; ++b; // increment space counter and pointer
} while(i<Spaces && b != NULL);

if(b) return b - DataStr.c_str();
else return -1;
}


I don't think your code does what you want it to do, but since you
didn't show what you wanted, I'm not sure.

I wrote this:

int main() {
string elements( "aaa bbb ccc ddd" );
for ( int x = 0; x < 10; ++x ) {
int pos = GetNewPosition( elements, x );
cout << pos << endl;
}
}

and got this output:

4
4
8
12
-654603

*crash*

I'm not entirely sure what your code is supposed to do, but this
function may be more in line with what you are looking for:

int GetNewPosition(const std::string& DataStr, int Spaces)
{
assert( Spaces >= 0 );
int result = -1;
unsigned working = DataStr.find_first_not_of( ' ' );
// strictly speaking 'working != string::npos' in the
// below isn't necessary. It makes the code more efficient.
while ( working != string::npos && Spaces-- ) {
working = DataStr.find_first_not_of( ' ',
DataStr.find( ' ', working ) );
}
assert( working == string::npos || working <= INT_MAX );
if ( working != string::npos )
result = working;
return result;
}

On most computers, I could have done without the 'working' variable,
instead using the 'result' variable. However, from what I understand -1
== string::npos is not guaranteed to be true.

Also note the two asserts in the function. They check the assumptions
made in the algorighm (a) Spaces isn't negitive (it may work with a
negitive value for Spaces, but I didn't test it, (b) that the position
of the char we are looking for is small enough to fit in an int (what
would it take to generate a string that would cause this assert to fire?)
Jul 22 '05 #4
In article <25********************@comcast.com>,
"Dave Townsend" <da********@comcast.net> wrote:
If you want to roll your own, you could create a specialized iterator object
which takes your string and lets you iterate over it.
If the OP decides to do this, I think he would be better off if he
followed the standard iterator concept.

class FooBar
{
public:
FooBar( const std::string& DataStr )
:_DataStr(DataStr), _position(0)
{
// move position to point to the first data item.
}

std::string GetDataItem( ) const
{
// return a string representing the next data item.
}
bool IsValid() const ; // return true if there are more items to return,
false otherwise.
bool Next(); // advance through datastring until end of string or next
data item is found,
// update position counter.

bool First() ; // wind back position to be at the first data item, so we
can iterate again over
// the string.

private:
const std::string _DataStr
int _position;

};

The class below is a standard input iterator:

class word_iterator:
public std::iterator<std::input_iterator_tag, std::string>
{
friend bool operator==( const word_iterator& lhs,
const word_iterator& rhs );
friend std::ostream& operator<<( std::ostream& lhs,
const word_iterator& rhs ); // for testing only

const std::string* str;
std::string::size_type pos;

public:
word_iterator( const std::string* s = 0 ):
str( s && !s->empty() ? s: 0 ),
pos(std::string::npos) {
if ( s )
pos = s->find_first_not_of( ' ' );
}

const std::string operator*() const {
assert( pos != std::string::npos );
unsigned len = str->find( ' ', pos );
if ( len < std::string::npos ) len -= pos;
return str->substr( pos, len );
}

const std::string* operator->() const {
// is there a better way to write this function?
static std::string bit;
bit = **this;
return &bit;
}

const word_iterator& operator++() {
assert( pos != std::string::npos );
if ( pos != std::string::npos )
pos = str->find_first_not_of( ' ', str->find( ' ', pos ) );
return *this;
}

const word_iterator operator++(int) {
word_iterator tmp( *this );
++(*this);
return tmp;
}
};

bool operator==( const word_iterator& lhs, const word_iterator& rhs ) {
// comparing iterators from two different containers is undefined
return lhs.pos == rhs.pos;
}

bool operator!=( const word_iterator& lhs, const word_iterator& rhs ) {
return !( lhs == rhs );
}

std::ostream& operator<<( std::ostream& lhs, const word_iterator& rhs ) {
// for testing purposes only
lhs << "Pos: " << rhs.pos << " of string: '"
<< ((rhs.str) ? *rhs.str : std::string("NULL")) << "'";
return lhs;
}
FooBar myfoobar( DataStr);

for ( myfoobar.First() ; myfoobar.IsValid( DataItem); myfoobar.Next() )
{
string ds = myfoobar.GetDataItem();

}


Mine would be used like this:

string s;
//...
for ( word_iterator x( &s ); x != word_iterator(); ++x )
cout << "word: " << *x
<< " is " << x->length() << "long.\n";
Or you could use it in the standard algorithms:

string s;
//...
copy( word_iterator(&s), word_iterator(),
ostream_iterator<string>(cout, "\n"));
Jul 22 '05 #5
"David Harmon" <so****@netcom.com.invalid> wrote in message
Use string::find() functions rather than .c_str() and dropping down to
low-level old library strchr().


The strcmp, strchr, memcmp functions may be more efficient than std::equal,
std::find because of block optimizations (ie. builtin CPU instructions to
analyze an entire block of bytes at once). In principle an implementation
may specialize std::find etc to call strchr, but I have yet to find one.
Jul 22 '05 #6
"Steve" <st*******@hotmail.com> wrote in message
news:Ohrxc.1911$ud5.1728@newsfe4-gui...
int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

do {
b = strchr(b, ' '); // find next space
++i; ++b; // increment space counter and pointer
} while(i<Spaces && b != NULL);

if(b) return b - DataStr.c_str();
else return -1;
}


In addition to the other points, what happens when strchr returns NULL?
Your code does ++b, which may (and should) crash the program.

As Dave points out, you should store the result of the first DataStr.c_str
in a pointer, and avoid calling the function a second time as it is allowed
to return a different pointer.
Jul 22 '05 #7

"Steve" <st*******@hotmail.com> wrote in message
news:Ohrxc.1911$ud5.1728@newsfe4-gui...
Hi Guys,

I have a string which contains data elements separated by spaces. I also
have a function which returns the number of characters from the beginning of the string for a given number of spaces. I am using a loop with strchr for
the number of spaces, and I was just wondering if there was a more efficient (ie faster) way of achieving this. Any ideas please? Current function code
below:

int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

do {
b = strchr(b, ' '); // find next space
++i; ++b; // increment space counter and pointer
} while(i<Spaces && b != NULL);

if(b) return b - DataStr.c_str();
else return -1;
}

Your current function is bugged. If there are no spaces, then b == NULL,
then you increment b, then you test it for NULL. That isn't going to work.
You also have a completely gratuitous const_cast in your code.

It's quite a good example of how newbies worry about efficiency without
other more important considerations such as writing working code, and
writing clear code.

However strchr is probably the fastest way to do it, however as with all
these questions there is no official answer as to which method is fastest.
The C++ standard makes no mention of which functions or techniques are
faster than others, the only way to tell for sure is try different methods
and time them.

Here's my version, completely untested, so apologies in advance for any
bugs.

int GetNewPosition(const std::string& DataStr, int Spaces)
{
const char* start = DataStr.c_str();
const char* p = start;
for (int i = 0; i < Spaces; ++i)
{
if (p == NULL)
return -1;
p = strchr(p + 1, ' ');
}
return p - start;
}

john
Jul 22 '05 #8
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2inljvFpkd8vU1@uni-
"Steve" <st*******@hotmail.com> wrote in message
int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

You also have a completely gratuitous const_cast in your code.


This last point seems fine to me. The original object was obviously
non-const, as it was created through new char[N] or allocator.allocate(N),
so it seems safe to use const_cast. I could be wrong though. Let me know.
I think some libraries provide only strchr(char *, char), but they should
also have strchr(const char *, char).

Jul 22 '05 #9

"Siemel Naran" <Si*********@REMOVE.att.net> wrote in message
news:Aw*******************@bgtnsc04-news.ops.worldnet.att.net...
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2inljvFpkd8vU1@uni-
"Steve" <st*******@hotmail.com> wrote in message
int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of

string
You also have a completely gratuitous const_cast in your code.
This last point seems fine to me. The original object was obviously
non-const, as it was created through new char[N] or allocator.allocate(N),
so it seems safe to use const_cast. I could be wrong though. Let me

know. I think some libraries provide only strchr(char *, char), but they should
also have strchr(const char *, char).


I agree it's safe, but its also unnecessary. By writing const_cast the OP
gives the impression that something tricky is going on in the code, which
isn't the case. Its a completely normal function which can be written
without any casts.

C++ should provide two overloads for strchr

const char* strchr(const char*, int);

and

char* strchr(char*, int);

Because C doesn't have function overloading it can only provide one version
which is the non-const correct

char* strchr(const char*, int);

Maybe that is what persuaded the OP that he needed to use a cast.

john
Jul 22 '05 #10

"Steve" <st*******@hotmail.com> wrote in message
news:Ohrxc.1911$ud5.1728@newsfe4-gui...
Hi Guys,

I have a string which contains data elements separated by spaces.


You could break it apart using strtok()
Jul 22 '05 #11
Thanks folks,

This exercise just tells me no matter how long I think I've been doing C++,
I'll always be a n00b! Fancy overlooking the "case of NULL", D'Oh!!

Thanks for the code advice too, it's appreciated...though I'm beginning to
think It'd be better for the world if I gave up programming ;)

Steve.
"Steve" <st*******@hotmail.com> wrote in message
news:Ohrxc.1911$ud5.1728@newsfe4-gui...
Hi Guys,

I have a string which contains data elements separated by spaces. I also
have a function which returns the number of characters from the beginning of the string for a given number of spaces. I am using a loop with strchr for
the number of spaces, and I was just wondering if there was a more efficient (ie faster) way of achieving this. Any ideas please? Current function code
below:

int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

do {
b = strchr(b, ' '); // find next space
++i; ++b; // increment space counter and pointer
} while(i<Spaces && b != NULL);

if(b) return b - DataStr.c_str();
else return -1;
}
Any help is appreciated,
Thanks for reading,
Steve.

Jul 22 '05 #12

With regard to the first comment I originally posted, I thought there might
be a solution
using streams to parse the words out of a string, I looked a bit closer,
below is
my attempt. This works fine apart from one detail, perhaps somebody might
jump in, there doesn't seem to be a way to specify the separator chars of
the original stream, in the example, the last word is returned as "dog." not
"dog"
Other than that, its a wonderfully simple snippet of code, praise to STL!

I've not benchmarked this code, but I presume its going to be relatively
efficient
since the iterator will maintain the position in the stringstream as the
words are pulled out,
there will be some extra overhead I suspect from using the string class as
opposed to
plain characters, that's life.

The question is then, how to specify the separator/whitespace for the
stream.
#include <iostream>
#include <sstream>
#include <iterator>

using namespace std;
int main(int argc, char* argv[])
{
istringstream is("the quick brown fox jumped over the lazy dog.");

istream_iterator<string> is_iter(is);
istream_iterator<string> end;

for ( ; is_iter != end; ++is_iter)
{
string nextword = *is_iter;
cout << "next word is \"" << nextword << "\"\n";
}
return 0;
}

"Steve" <st*******@hotmail.com> wrote in message
news:Ohrxc.1911$ud5.1728@newsfe4-gui...
Hi Guys,

I have a string which contains data elements separated by spaces. I also
have a function which returns the number of characters from the beginning of the string for a given number of spaces. I am using a loop with strchr for
the number of spaces, and I was just wondering if there was a more efficient (ie faster) way of achieving this. Any ideas please? Current function code
below:

int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

do {
b = strchr(b, ' '); // find next space
++i; ++b; // increment space counter and pointer
} while(i<Spaces && b != NULL);

if(b) return b - DataStr.c_str();
else return -1;
}
Any help is appreciated,
Thanks for reading,
Steve.

Jul 22 '05 #13
In article <2i************@uni-berlin.de>,
"John Harrison" <jo*************@hotmail.com> wrote:
"Steve" <st*******@hotmail.com> wrote in message
news:Ohrxc.1911$ud5.1728@newsfe4-gui...
Hi Guys,

I have a string which contains data elements separated by spaces. I also
have a function which returns the number of characters from the beginning

of
the string for a given number of spaces. I am using a loop with strchr for
the number of spaces, and I was just wondering if there was a more

efficient
(ie faster) way of achieving this. Any ideas please? Current function code
below:

int GetNewPosition(const std::string& DataStr, const int Spaces)
{
int i = 0; // space counter
char* b = const_cast<char*>(DataStr.c_str()); // beginning of string

do {
b = strchr(b, ' '); // find next space
++i; ++b; // increment space counter and pointer
} while(i<Spaces && b != NULL);

if(b) return b - DataStr.c_str();
else return -1;
}

Your current function is bugged. If there are no spaces, then b == NULL,
then you increment b, then you test it for NULL. That isn't going to work.
You also have a completely gratuitous const_cast in your code.

It's quite a good example of how newbies worry about efficiency without
other more important considerations such as writing working code, and
writing clear code.

However strchr is probably the fastest way to do it, however as with all
these questions there is no official answer as to which method is fastest.
The C++ standard makes no mention of which functions or techniques are
faster than others, the only way to tell for sure is try different methods
and time them.

Here's my version, completely untested, so apologies in advance for any
bugs.

int GetNewPosition(const std::string& DataStr, int Spaces)
{
const char* start = DataStr.c_str();
const char* p = start;
for (int i = 0; i < Spaces; ++i)
{
if (p == NULL)
return -1;
p = strchr(p + 1, ' ');
}
return p - start;
}


I suspect your code above doesn't do what the OP wanted...

GetNewPosition("", 0) should return -1 if I understand what he is trying
to do, yours returns 0. Care to try again? :-)
Jul 22 '05 #14
In article <S-********************@comcast.com>,
"Dave Townsend" <da********@comcast.net> wrote:

With regard to the first comment I originally posted, I thought there might
be a solution
using streams to parse the words out of a string, I looked a bit closer,
below is
my attempt. This works fine apart from one detail, perhaps somebody might
jump in, there doesn't seem to be a way to specify the separator chars of
the original stream, in the example, the last word is returned as "dog." not
"dog"
That wasn't a requirement anyway.

Other than that, its a wonderfully simple snippet of code, praise to STL!

I've not benchmarked this code, but I presume its going to be relatively
efficient
since the iterator will maintain the position in the stringstream as the
words are pulled out,
there will be some extra overhead I suspect from using the string class as
opposed to
plain characters, that's life.

The question is then, how to specify the separator/whitespace for the
stream.
Good question...

#include <iostream>
#include <sstream>
#include <iterator>

using namespace std;
int main(int argc, char* argv[])
{
istringstream is("the quick brown fox jumped over the lazy dog.");

istream_iterator<string> is_iter(is);
istream_iterator<string> end;

for ( ; is_iter != end; ++is_iter)
{
string nextword = *is_iter;
cout << "next word is \"" << nextword << "\"\n";
}
return 0;
}


Your idea skips over all whitespace, not just spaces as the OP's code
did. I don't know if he would have a problem with that or not...

Otherwise, this works just as well as my custom iterator. What I did
would only be useful if one also implemented op-- (so you could go both
forward and backward) and/or a non-const version of op* (so you could
replace words.)
Jul 22 '05 #15
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2innbjFpabdoU1@uni-
C++ should provide two overloads for strchr

const char* strchr(const char*, int);

and

char* strchr(char*, int);


Good point. Because the OP is using std::string it's clear he's using C++.

Jul 22 '05 #16

"Siemel Naran" <Si*********@REMOVE.att.net> wrote in message
news:xN*********************@bgtnsc05-news.ops.worldnet.att.net...
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2innbjFpabdoU1@uni-
C++ should provide two overloads for strchr

const char* strchr(const char*, int);

and

char* strchr(char*, int);
Good point. Because the OP is using std::string it's clear he's using

C++.


Posting to comp.lang.c++ should also mean C++, although that isn't always
the case. Some posters profess to be coding in a mysterious language called
C/C++.

john
Jul 22 '05 #17

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

12 posts views Thread by Matthew Wilson | last post: by
28 posts views Thread by rbt | last post: by
74 posts views Thread by Michael | last post: by
7 posts views Thread by Dave Hansen | last post: by
6 posts views Thread by Gustaf Liljegren | last post: by
4 posts views Thread by dustin.getz | last post: by
5 posts views Thread by mikehulluk | last post: by
13 posts views Thread by kj | last post: by
1 post views Thread by CARIGAR | last post: by
1 post views Thread by Oskars | last post: by
reply views Thread by zhoujie | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.