
STL speed

P: n/a
Hi,
The task was simple: read 2,000,000 words (average length 9), sort
them, and write them to a new file.
I did this with the STL, and it was almost 17 times slower than my previous
"clear-C-code".
I used <vector>, <algorithm> and <iostream>. Is the STL really so
slow?

Thx in adv.
Przemo

P.S. I know that the STL uses IntrospectiveSort, which seems to be a good
choice. I suppose that input (cin) is extremely slow, and <vector> as a
dynamic structure also isn't too fast... any ideas?
Jul 22 '05 #1
30 Replies


P: n/a
"Przemo Drochomirecki" <pe******@gazeta.pl> wrote in message
news:bt**********@nemesis.news.tpi.pl...
Hi,
The task was indeed simple. Read 2.000.000 words (average length = 9), sort them and write it to new file.
I've made this in STL, and it was almost 17 times slower than my previous
"clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>. Is STL really so
slow?

Thx in adv.
Przemo

p.s. i know that STL uses IntrospectiveSort which seems to be good choice, i suppose that INPUT (cin) is extremaly slow,
and <vector> as a dynamic structure also isn't to fast... any ideas?


I doubt that it's inherently slow. It depends on the implementation. I
remember tracing through the stream output on an MS compiler and discovered
that it fabricated a format string and called sprintf! Fifty billion dollars
in the bank, but they chose the cheapest, nastiest implementation possible.

DW

Jul 22 '05 #2

P: n/a
"Przemo Drochomirecki" <pe******@gazeta.pl> wrote in
news:bt**********@nemesis.news.tpi.pl:
Hi,
The task was indeed simple. Read 2.000.000 words (average length = 9),
sort them and write it to new file.
I've made this in STL, and it was almost 17 times slower than my
previous "clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>. Is STL
really so slow?

Thx in adv.
Przemo

p.s. i know that STL uses IntrospectiveSort which seems to be good
choice, i suppose that INPUT (cin) is extremaly slow,
and <vector> as a dynamic structure also isn't to fast... any ideas?


You didn't post your "clear-C-code", nor your Standard C++ code, thus there
is no way for us to even guess at what you've done inefficiently.

Gazing into my crystal ball says that you haven't expressed the same
program in C and C++.
Jul 22 '05 #3

P: n/a
> Is STL really so slow?

That's hard to say without the source and it depends on quite a few other
things, like the STL implementation and the Compiler.
I'd write the program as follows:

#include <algorithm>
#include <iterator>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

using namespace std;

int main()
{
ifstream file("input.txt");
vector<string> words(istream_iterator<string>(file),
(istream_iterator<string>()));
sort(words.begin(), words.end());
ofstream outfile("output.txt");
copy(words.begin(), words.end(), ostream_iterator<string>(outfile, "\n"));
}

Regards
Ignaz
Jul 22 '05 #4

P: n/a
In article <bt**********@nemesis.news.tpi.pl>, Przemo Drochomirecki wrote:
Hi,
The task was indeed simple. Read 2.000.000 words (average length = 9), sort
them and write it to new file.
I've made this in STL, and it was almost 17 times slower than my previous
"clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>. Is STL really so
slow?
(1) Depends on the compiler, and how the compiler is invoked, and possibly
on the STL implementation. Abstraction penalty is almost nil for a good
optimising compiler with optimisation turned on.

(2) As others pointed out, or hinted at, there are numerous ways the programs
could actually be fundamentally different. Possible performance issues here
depend on what functions are getting called, but both the input and output
could differ (on several implementations, C++ style streams are slower,
especially if you don't pay careful attention to how you do the IO)
p.s. i know that STL uses IntrospectiveSort which seems to be good choice, i
suppose that INPUT (cin) is extremaly slow,
Well why not actually check this ?
and <vector> as a dynamic structure also isn't to fast... any ideas?


You can allocate space for it upfront. See reserve()

Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/
Jul 22 '05 #5

P: n/a
Przemo Drochomirecki wrote:
The task was indeed simple.
Read 2.000.000 words (average length = 9),
sort them and write it to new file.
I've made this in STL,
and it was almost 17 times slower than my previous "clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>.
Is STL really so slow?


No. You just screwed up.
Post both your C and C++ code
so that we can see what you did wrong.

Jul 22 '05 #6

P: n/a

"E. Robert Tisdale" <E.**************@jpl.nasa.gov> wrote in message
news:3F**********@jpl.nasa.gov...
Przemo Drochomirecki wrote:
The task was indeed simple.
Read 2.000.000 words (average length = 9),
sort them and write it to new file.
I've made this in STL,
and it was almost 17 times slower than my previous "clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>.
Is STL really so slow?


No. You just screwed up.
Post both your C and C++ code
so that we can see what you did wrong.


---STL CODE---

#include <string>
#include <conio.h>
#include <iostream>
#include <algorithm>
#include <vector>

using namespace std;
struct wordstruct { string word; };

void read_names(vector<wordstruct>& x)
{
wordstruct q;
while (true) {
cin >> q.word;
if (q.word == string("0"))
break;
x.push_back(q);
}
}
struct wordCompare
{
bool operator()(const wordstruct& a, const wordstruct& b) {
return a.word<b.word;
}
};

wordCompare wordc;

int main()
{
vector<wordstruct> x;
read_names(x);
sort(x.begin(), x.end(), wordc);
// vector x is sorted
return 0;
}

compiled under VC++6.0

--- C ---

A simple loop reads words with fgets; each word is separately allocated with
memalloc, and then qsort is called.

compiled under gcc 3.0

thx for help:) (all five STL-masters)
Jul 22 '05 #7

P: n/a
#include <algorithm>
#include <iterator>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

using namespace std;

int main()
{
ifstream file("input.txt");
vector<string> words(istream_iterator<string>(file),
14 (istream_iterator<string>()));
15 sort(words.begin(), words.end());
16 ofstream outfile("output.txt");
17 copy(words.begin(), words.end(), ostream_iterator<string>(outfile,
"\n"));
}
I used the well-known copy & paste technique, and here is what I got.
Frankly speaking, I don't understand it at all (I'm not too familiar
with STL syntax...).
(No extra compiler settings, simply gcc.exe ignez.cpp)

---------- gcc ----------
stl.cpp: In function `int main()':
stl.cpp:14: parse error before `)' token
stl.cpp:15: request for member `begin' in `words(...)', which is of
non-aggregate type `std::vector<std::string, std::allocator<std::string> > ()(...)'
stl.cpp:15: request for member `end' in `words(...)', which is of
non-aggregate type `std::vector<std::string, std::allocator<std::string> > ()(...)'
stl.cpp:17: request for member `begin' in `words(...)', which is of
non-aggregate type `std::vector<std::string, std::allocator<std::string> > ()(...)'
stl.cpp:17: request for member `end' in `words(...)', which is of
non-aggregate type `std::vector<std::string, std::allocator<std::string> > ()(...)'

Output completed (2 sec consumed) - Normal Termination

Thanks for the extremely brief code; maybe the STL isn't as bad as I thought :)
Przemo
Jul 22 '05 #8

P: n/a
In article <bt**********@nemesis.news.tpi.pl>, Przemo Drochomirecki wrote:
#include <algorithm>
#include <iterator>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

using namespace std;

int main()
{
ifstream file("input.txt");
vector<string> words(istream_iterator<string>(file),
14 (istream_iterator<string>()));
15 sort(words.begin(), words.end());
16 ofstream outfile("output.txt");
17 copy(words.begin(), words.end(), ostream_iterator<string>(outfile,
"\n"));
}
i used well know copy&paste technique and here is what i got.
and frankly speaking and don't understand it at all (i'm not too familiar
with STL syntax...)
(no extra compiling setting -> simply gcc.exe ignez.cpp )


The compiler is getting confused because it thinks that you're declaring a
function when you declare the vector words. The ugly workaround is to use
this:

istream_iterator<string> in1(file), in2;
vector<string> words (in1,in2);

The pretty way (but may not work with older compilers):

vector<string> words( (istream_iterator<string>(file)),
istream_iterator<string>());

Note that the extra parens go around the *first* argument, not the second
one.

Scott Meyers discusses this in Effective STL in some depth.

Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/
Jul 22 '05 #9

P: n/a
Przemo Drochomirecki wrote:
"E. Robert Tisdale" <E.**************@jpl.nasa.gov> wrote in message
news:3F**********@jpl.nasa.gov...
Przemo Drochomirecki wrote:

The task was indeed simple.
Read 2.000.000 words (average length = 9),
sort them and write it to new file.
I've made this in STL,
and it was almost 17 times slower than my previous "clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>.
Is STL really so slow?


No. You just screwed up.
Post both your C and C++ code
so that we can see what you did wrong.

---STL CODE---

#include <string>
#include <conio.h>
#include <iostream>
#include <algorithm>
#include <vector>

using namespace std;
struct wordstruct { string word; };

void read_names(vector<wordstruct>& x)
{
wordstruct q;
while (true) {
cin >> q.word;
if (q.word == string("0"))
break;
x.push_back(q);
}
}


You'll find most of the time is spent in the routine above.

This can be quite time consuming because you will allocate and
deallocate string("0") for every iteration.

if (q.word == string("0"))

You might do this:

if (q.word.length()==1 && q.word[0] == '0')

Still, this is probably only a 10-15% improvement if the compiler could
not optimize it.

Also, "cin >>" is really quite expensive as well and probably where
you're spending most of the time.

Also, instead of using a vector, you might be better off using a "deque"
container to limit the re-allocation required when growing the vector.

Probably the fastest implementation is to map the file into memory
(off-topic here) and parse the file in memory and then use std::sort.

Jul 22 '05 #10
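
The "parse the file in memory and then use std::sort" idea from the reply above might look roughly like this. It is a sketch under stated assumptions: it needs C++17 for std::string_view, the name `sorted_words` is hypothetical, and how the buffer gets filled (one large read, or a platform-specific memory mapping) is deliberately left out.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Splits `buffer` on whitespace and returns the words in sorted order.
// No per-word string is allocated: each view points into `buffer`, so
// `buffer` must outlive the returned vector (do not pass a temporary).
std::vector<std::string_view> sorted_words(const std::string& buffer)
{
    std::vector<std::string_view> words;
    const std::size_t n = buffer.size();
    std::size_t i = 0;
    while (i < n) {
        // Skip whitespace before the next word.
        while (i < n && std::isspace(static_cast<unsigned char>(buffer[i])))
            ++i;
        const std::size_t start = i;
        // Advance to the end of the word.
        while (i < n && !std::isspace(static_cast<unsigned char>(buffer[i])))
            ++i;
        if (i > start)
            words.push_back(std::string_view(buffer.data() + start, i - start));
    }
    std::sort(words.begin(), words.end());
    return words;
}
```

The design choice here is to trade one large buffer for two million small allocations; whether that wins in practice would have to be measured.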

P: n/a
In article <bt**********@nemesis.news.tpi.pl>, Przemo Drochomirecki wrote:

int main()
{
vector<wordstruct> x;
read_names(x);
sort(x.begin(), x.end(), wordc);
// vector x is sorted
return 0;
}

compiled under VC++6.0

--- C ---

simple loop reading words with fgets, each word is seperately allocated with
memalloc and
than qsort.

compiled under gcc 3.0


Oh boy. Did you run them on different computers too ?

Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/
Jul 22 '05 #11

P: n/a
In article <bt**********@nemesis.news.tpi.pl>,
Przemo Drochomirecki <pe******@gazeta.pl> wrote:

"E. Robert Tisdale" <E.**************@jpl.nasa.gov> wrote in message
news:3F**********@jpl.nasa.gov...
Przemo Drochomirecki wrote:
> The task was indeed simple.
> Read 2.000.000 words (average length = 9),
> sort them and write it to new file.
> I've made this in STL,
> and it was almost 17 times slower than my previous "clear-C-code".
> I used <vector>, <algorithm>, <iostream> and <algorithm>.
> Is STL really so slow?


No. You just screwed up.
Post both your C and C++ code
so that we can see what you did wrong.


---STL CODE---

#include <string>
#include <conio.h>
#include <iostream>
#include <algorithm>
#include <vector>

using namespace std;
struct wordstruct { string word; };

void read_names(vector<wordstruct>& x)
{
wordstruct q;
while (true) {
cin >> q.word;
if (q.word == string("0"))
break;
x.push_back(q);
}
}


Is there some specific reason why you're using the word "0"
as a terminator? It would be more efficient simply to test for
end of file:

while (cin >> q.word)
{
x.push_back(q);
}

It will also speed things up if you estimate the number of words in
advance and reserve space in the vector before starting to read:

x.reserve (desired_size);

In most implementations, when a vector reaches its capacity on
push_back(), it allocates a new buffer twice as big as the old one, and
copies all the current data from the old buffer to the new one.

How do you handle the capacity issue in the C version?

--
Jon Bell <jt*******@presby.edu> Presbyterian College
Dept. of Physics and Computer Science Clinton, South Carolina USA
Jul 22 '05 #12
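
To see the effect of reserve() described in the reply above, one can count how often the vector's capacity changes while it is being filled. A small sketch; `count_reallocations` is a hypothetical helper, and the growth factor it observes depends on the library implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Appends `count` copies of a short word and returns how many times the
// vector's capacity changed, i.e. how many reallocations took place.
// Pass reserve_first = true to call reserve(count) before the loop.
std::size_t count_reallocations(std::size_t count, bool reserve_first)
{
    std::vector<std::string> words;
    if (reserve_first)
        words.reserve(count);

    std::size_t reallocations = 0;
    std::size_t last_capacity = words.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        words.push_back("word");
        if (words.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = words.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, the loop triggers no reallocation at all; without it, the count stays small compared with the number of elements because the capacity grows geometrically.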

P: n/a

"Donovan Rebbechi" <ab***@aol.com> wrote in message
news:sl******************@panix2.panix.com...
In article <bt**********@nemesis.news.tpi.pl>, Przemo Drochomirecki wrote:

int main()
{
vector<wordstruct> x;
read_names(x);
sort(x.begin(), x.end(), wordc);
// vector x is sorted
return 0;
}

compiled under VC++6.0

--- C ---

simple loop reading words with fgets, each word is seperately allocated with memalloc and
than qsort.

compiled under gcc 3.0


Oh boy. Did you run them on different computers too ?

Cheers,


It may be simpler than that. If the compiler does not inline the hundreds of
small calls that this code makes, the performance WILL
be slow on any platform. On several platforms that I have used, inlining is
not performed when building a debug version, and on
some it is not done unless optimization is turned on.
Jul 22 '05 #13

P: n/a
Przemo Drochomirecki wrote:


compiled under VC++6.0

--- C ---

simple loop reading words with fgets, each word is seperately allocated with
memalloc and
than qsort.
How did you grow the array when words are added?

compiled under gcc 3.0

thx for help:) (all five STL-masters)


Also: Did you ever consider a std::map for doing things like that?

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #14

P: n/a
On Fri, 9 Jan 2004 04:07:02 -0800, "Przemo Drochomirecki"
<pe******@gazeta.pl> wrote:

"E. Robert Tisdale" <E.**************@jpl.nasa.gov> wrote in message
news:3F**********@jpl.nasa.gov...
Przemo Drochomirecki wrote:
> The task was indeed simple.
> Read 2.000.000 words (average length = 9),
> sort them and write it to new file.
> I've made this in STL,
> and it was almost 17 times slower than my previous "clear-C-code".
> I used <vector>, <algorithm>, <iostream> and <algorithm>.
> Is STL really so slow?
No. You just screwed up.
Post both your C and C++ code
so that we can see what you did wrong.


---STL CODE---

#include <string>
#include <conio.h>
#include <iostream>
#include <algorithm>
#include <vector>

using namespace std;
struct wordstruct { string word; };


Why have wordstruct? What's wrong with using string directly?

void read_names(vector<wordstruct>& x)
{
wordstruct q;
while (true) {
cin >> q.word;
if (q.word == string("0"))
break;
The above would be much more efficient as:

if (q.word == "0")

since that saves an extra memory allocation per iteration.
x.push_back(q);
}
}
struct wordCompare
{
bool operator()(const wordstruct& a, const wordstruct& b) {
return a.word<b.word;
}
The above should be:

bool operator()(const wordstruct& a, const wordstruct& b) const {
return a.word<b.word;
}

(not that that will improve performance)
};

wordCompare wordc;

int main()
{
Here you should add:
std::ios::sync_with_stdio(false);
since buffering will be disabled on cin and cout on some
implementations if you don't.
vector<wordstruct> x;
Here you might want to reserve some space in the vector:
x.reserve(1000); //or more
read_names(x);
sort(x.begin(), x.end(), wordc);
// vector x is sorted
return 0;
}

compiled under VC++6.0

--- C ---

simple loop reading words with fgets, each word is seperately allocated with
memalloc and
than qsort.

compiled under gcc 3.0

thx for help:) (all five STL-masters)


Why did you compile the two using different compilers!? Also make sure
you use maximum optimization settings - C++ code relies heavily on
optimization to inline all of the small functions.

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
Jul 22 '05 #15

P: n/a

"Przemo Drochomirecki" <pe******@gazeta.pl> skrev i en meddelelse
news:bt**********@nemesis.news.tpi.pl...
Hi,
The task was indeed simple. Read 2.000.000 words (average length = 9), sort them and write it to new file.
I've made this in STL, and it was almost 17 times slower than my previous
"clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>. Is STL really so
slow?

Thx in adv.
Przemo

p.s. i know that STL uses IntrospectiveSort which seems to be good choice, i suppose that INPUT (cin) is extremaly slow,
and <vector> as a dynamic structure also isn't to fast... any ideas?


Hi Przemo

Some general comments. First, it seems that the Microsoft
stream implementation is not very good. The penalty of using streams
compared to C file I/O is significant in your example, giving (if I
remember correctly) a factor of four in performance. For GCC, streams should
have the same performance as C file I/O.
Secondly, when compiling C++, optimization is far more important than in C.
So if you do not optimize your program you can be quite certain it will be
slow; a factor of, say, 5 or 17 should not be surprising at all.
Thirdly, your implementation is sub-optimal, as pointed out by others.
With VC++ 6.0 (which from an optimization point of view isn't that bad), the
performance of <vector> should be comparable to the C way of doing things,
and your sort should be much faster.

Kind regards
Peter
Jul 22 '05 #16
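
The expectation above that the C++ sort can beat qsort comes from how the comparison is invoked: qsort receives a function pointer, while std::sort receives the comparator as a template argument that an optimising compiler can inline. The sketch below shows the two call shapes on an array of C strings; all names in it are invented for illustration, and which one is actually faster depends on the compiler and flags.

```cpp
#include <algorithm>
#include <cstdlib>
#include <cstring>
#include <vector>

// Comparator in the shape qsort() requires: it receives pointers to the
// array elements, which here are themselves `const char*`.
int compare_c_strings(const void* a, const void* b)
{
    const char* lhs = *static_cast<const char* const*>(a);
    const char* rhs = *static_cast<const char* const*>(b);
    return std::strcmp(lhs, rhs);
}

// C style: every comparison goes through a function pointer.
void sort_with_qsort(std::vector<const char*>& words)
{
    if (!words.empty())
        std::qsort(&words[0], words.size(), sizeof(const char*),
                   compare_c_strings);
}

// Function object whose call operator can be inlined into std::sort
// when optimisation is enabled.
struct CStringLess {
    bool operator()(const char* a, const char* b) const
    {
        return std::strcmp(a, b) < 0;
    }
};

// C++ style: the comparator type is a template argument of std::sort.
void sort_with_std_sort(std::vector<const char*>& words)
{
    std::sort(words.begin(), words.end(), CStringLess());
}
```

Both functions produce the same order, so they can be swapped in and out of a timing harness without changing the output.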

P: n/a
Peter Koch Larsen wrote:
"Przemo Drochomirecki" <pe******@gazeta.pl> skrev i en meddelelse
news:bt**********@nemesis.news.tpi.pl...

Some general comments. First it seems that the Microsoft
stream-implementation is not very good. The penalty of using streams
compared to C file I/O is significant in your example - giving (if i
remember correctly) a factor of four in performance. For GCC streams should
have the same performance as C file I/O.


They are very slow too :-( At least they were a year ago.

I profiled write operations with gcc 3.2 and was surprised by the
result: something like 8:1, though I can't remember the exact
value. But it was high!

c u

Christoph
Jul 22 '05 #17

P: n/a
"David White" <no@email.provided> wrote in message
news:wx******************@nasal.pacific.net.au...
P.S. I know that the STL uses IntrospectiveSort, which seems to be a good
choice. I suppose that input (cin) is extremely slow,
and <vector> as a dynamic structure also isn't too fast... any ideas?
I doubt that it's inherently slow. It depends on the implementation. I
remember tracing through the stream output on an MS compiler and
discovered that it fabricated a format string and called sprintf! Fifty
billion dollars in the bank, but they chose the cheapest, nastiest
implementation possible.

Well, I didn't have access to any of that $50B when I wrote that code
in 1993, but I did write significant chunks of the C and C++ Standards
in those areas. I knew that printf gets right all sorts of subtle
corner cases that practically every iostreams implementation botched
one way or the other. I also had mineral rights to all the code I needed
to do the job other than `cheap and nasty,' and I was unable to get any
significant improvement over fabricating a format string and calling
sprintf.

FWIW, Microsoft's stash has roughly doubled since the day they chose to
adopt our cheap and nasty approach. Coincidence? (Probably.)

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
Jul 22 '05 #18

P: n/a
"Peter Koch Larsen" <pk*@mailme.dk> wrote in message
news:3f**********************@dread14.news.tele.dk ...
Some general comments. First it seems that the Microsoft
stream-implementation is not very good. The penalty of using streams
compared to C file I/O is significant in your example - giving (if i
remember correctly) a factor of four in performance. For GCC streams should have the same performance as C file I/O.
Uh, there was a recent thread that showed neither of these factoids
to be all that true.
Secondly, when compiling C++ optimization is far more important than in C.
So if you do not optimize your program you can be quite certain it will be
slow - a factor of say 5 or 17 should not be surprising at all.
That I agree with.
Thirdly, your implementation is sub-optimal as pointed out by others.
With VC++ 6.0 (which from an optimization point of view isnt that bad), the performance of <vector> should be comparable to the C-way of doing things - and your sort should be much faster.


Also agree.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
Jul 22 '05 #19

P: n/a
In article <ru****************@news-binary.blueyonder.co.uk>, Nick Hounsome wrote:
Oh boy. Did you run them on different computers too ?

Cheers,
It may be simpler than that.


Yes, I'm almost certain it is. What I meant was that he's manipulated several
variables, including the ones he's interested in.
If the compiler does not inline the hundreds of
small calls that this makes the performance WILL
be slow on any platform.


Yes, as I said in my other post.

Cheers,
--
Donovan Rebbechi
http://pegasus.rutgers.edu/~elflord/
Jul 22 '05 #20

P: n/a
> The task was indeed simple. Read 2.000.000 words (average length = 9),
sort
them and write it to new file.


So, the STL can reserve some space for each string.
Add string object overhead and the vector object, and you can have 512 bytes
for each word.
That is ~1 GB for all your data!!! Plus the program itself and the OS.
So your OS uses virtual memory, which is SLOOOOW.
The STL is a nice toy, but not for this kind of thing.

Tõnu.
Jul 22 '05 #21
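
The per-word figure claimed in the post above can be checked against a real implementation rather than guessed. The sketch below adds up the vector's buffer and each string's character capacity; `approximate_bytes` is a hypothetical helper, and the result is only a rough estimate because allocator bookkeeping is ignored and a small-string optimisation may keep short words inside the string object itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Rough estimate of the bytes held by a vector of strings: the vector's
// own buffer plus each string's character capacity. Allocator overhead is
// not counted, and short strings stored inside the string object (small
// string optimisation) are counted twice, so treat the result as a guide.
std::size_t approximate_bytes(const std::vector<std::string>& words)
{
    std::size_t total = words.capacity() * sizeof(std::string);
    for (std::size_t i = 0; i < words.size(); ++i)
        total += words[i].capacity();
    return total;
}
```

Dividing the result by words.size() gives an approximate per-word cost to compare with the 512 bytes assumed above.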

P: n/a
I'm quite new to STL too...

but one thought was about the dynamic resizing of the vector as you
fill it.
AFAIK the vector will extend itself dynamically until it can no longer
remain a contiguous allocation in memory, upon which it reallocates
itself in a new area of memory... (?)

Anyway, can expensive allocations / deallocations / copies be avoided
by using the reserve() member function to *try to* ensure that
there is enough contiguous space around your vector to avoid
reallocations....?

MHTWEOTSH...

"Przemo Drochomirecki" <pe******@gazeta.pl> wrote in message news:<bt**********@nemesis.news.tpi.pl>...
Hi,
The task was indeed simple. Read 2.000.000 words (average length = 9), sort
them and write it to new file.
I've made this in STL, and it was almost 17 times slower than my previous
"clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>. Is STL really so
slow?

Thx in adv.
Przemo

p.s. i know that STL uses IntrospectiveSort which seems to be good choice, i
suppose that INPUT (cin) is extremaly slow,
and <vector> as a dynamic structure also isn't to fast... any ideas?

Jul 22 '05 #22

P: n/a
Please don't top-post.

EnTn wrote:
I'm quite new to STL too...

but one thought was about the dynamic resizing of the vector as you
fill it.
AFAIK the vector will extend itself dynamically until it can no longer
remain a continuous allocation in memory, upon which is reallocates
itself in a new area of memory... (?)
That's the idea.
Anyway, Can expensive allocations / deallocations / copies be avoided
by by using the reserve() member function to *try to* ensure that
there is enough contiginous space around your vector to avoid
reallocations....?
Yes, that's the whole purpose of the "reserve" method.
MHTWEOTSH...
I give up. Is that a common greeting in some language I don't speak, or
does it stand for something?

"Przemo Drochomirecki" <pe******@gazeta.pl> wrote in message news:<bt**********@nemesis.news.tpi.pl>...
Hi,
The task was indeed simple. Read 2.000.000 words (average length = 9), sort
them and write it to new file.
I've made this in STL, and it was almost 17 times slower than my previous
"clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>. Is STL really so
slow?

Thx in adv.
Przemo

p.s. i know that STL uses IntrospectiveSort which seems to be good choice, i
suppose that INPUT (cin) is extremaly slow,
and <vector> as a dynamic structure also isn't to fast... any ideas?


Jul 22 '05 #23

P: n/a
tom_usenet wrote:
On Fri, 9 Jan 2004 04:07:02 -0800, "Przemo Drochomirecki"
<pe******@gazeta.pl> wrote:

"E. Robert Tisdale" <E.**************@jpl.nasa.gov> wrote in message
news:3F**********@jpl.nasa.gov...
Przemo Drochomirecki wrote:

> The task was indeed simple.
> Read 2.000.000 words (average length = 9),
> sort them and write it to new file.
> I've made this in STL,
> and it was almost 17 times slower than my previous "clear-C-code".
> I used <vector>, <algorithm>, <iostream> and <algorithm>.
> Is STL really so slow?

No. You just screwed up.
Post both your C and C++ code
so that we can see what you did wrong.

---STL CODE---

#include <string>
#include <conio.h>
#include <iostream>
#include <algorithm>
#include <vector>

using namespace std;
struct wordstruct { string word; };


Why have wordstruct? What's wrong with using string directly?

void read_names(vector<wordstruct>& x)
{
wordstruct q;
while (true) {
cin >> q.word;
if (q.word == string("0"))
break;


The above would be much more efficient as:

if (q.word == "0")

since that saves an extra memory allocation per iteration.


Are you sure here that the compiler would not just create a string("0") ?

Personally I would use a either a const static or pre constructed const
string, is this overkill?

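class HowIWouldDoIt
{
private:
// shared sentinel, constructed once rather than per comparison
static const std::string _testString;
public:
void read_names(std::vector<wordstruct>& x);
};

const std::string HowIWouldDoIt::_testString = "0";

void HowIWouldDoIt::read_names(vector<wordstruct>& x)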
{
wordstruct q;
while (true) {
cin >> q.word;
if (q.word == _testString )
break;

cont...
<snip>
Here you should add:
std::ios::sync_with_stdio(false);
since buffering will be disabled on cin and cout on some
implementations if you don't.
Interesting... I've not used this before.
vector<wordstruct> x;


Here you might want to reserve some space in the vector:
x.reserve(1000); //or more


I think this will probably be the crux of the problem, the other stuff I'd
consider "fine tuning", but worthwhile none the less.
read_names(x);
sort(x.begin(), x.end(), wordc);
// vector x is sorted
return 0;
}


Out of interest, I wonder how this would perform using a set<>? I.e. sorted
(operator<) on insert......

:-)

--
Regards

Sean Clarke
-------------------------------------------------
Linux.... for those whose IQ is greater than 98 !!
Jul 22 '05 #24

P: n/a
EnTn wrote:

Anyway, Can expensive allocations / deallocations / copies be avoided
by by using the reserve() member function to *try to* ensure that
there is enough contiginous space around your vector to avoid
reallocations....?


That's the whole idea of reserve.

Another possibility for the OP would be to change strategy. He could
try to use a std::map instead of the vector. I did this once when
toying around (counting and sorting the words of the bible) and achieved
a noticeable speedup.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #25

P: n/a
On Fri, 09 Jan 2004 05:30:01 -0800, EnTn wrote:
Anyway, Can expensive allocations / deallocations / copies be avoided
by by using the reserve() member function to *try to* ensure that
there is enough contiginous space around your vector to avoid
reallocations....?


As long as you interpret *try to* as *throws an exception when out of
memory*, yes. A vector's contents are always contiguous.

HTH,
M4

Jul 22 '05 #26

P: n/a
On Fri, 09 Jan 2004 15:10:30 +0200, Tõnu Aas wrote:
The task was indeed simple. Read 2.000.000 words (average length = 9),

sort
them and write it to new file.


So, STL can reserve some space for each string.
+ string object overhead + vector object and you can have 512 bytes for each
word.
That is ~ 1GB for all you data !!! + program itselt & OS.
So you OS uses virtual memory, which is SLOOOOW.
STL is nice toy, but not for this kind of things.


No. This is an algorithmic issue. You can do the same in C, which will be
equally slow, or you can choose another algorithm.

This case is one where the overhead of the STL (if any) completely
disappears.

HTH,
M4

Jul 22 '05 #27

P: n/a
On Fri, 09 Jan 2004 14:47:31 +0000, Sean Clarke
<se*********@no-spam.sec-consulting.co.uk> wrote:
The above would be much more efficient as:

if (q.word == "0")

since that saves an extra memory allocation per iteration.
Are you sure here that the compiler would not just create a string("0") ?


Well, it's a QoI (quality of implementation) issue (operator== can make
as many allocations as it likes), but there is a

template<class charT, class traits, class Allocator>
bool operator==(const basic_string<charT,traits,Allocator>& lhs,
const charT* rhs);

An implementation would have to be amazingly stupid to just forward
the call to the 2 string version, triggering an unnecessary string
construction.
Personally I would use a either a const static or pre constructed const
string, is this overkill?
A pre-constructed one is a better solution - in theory the comparison
can be slightly faster.
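class HowIWouldDoIt
{
private:
// shared sentinel, constructed once rather than per comparison
static const std::string _testString;
public:
void read_names(std::vector<wordstruct>& x);
};

const std::string HowIWouldDoIt::_testString = "0";

void HowIWouldDoIt::read_names(vector<wordstruct>& x)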
{
wordstruct q;
while (true) {
cin >> q.word;
if (q.word == _testString )
break;
How about just:

wordstruct q;
std::string const testString("0");
while (true) {
cin >> q.word;
if (q.word == testString)
break;
vector<wordstruct> x;


Here you might want to reserve some space in the vector:
x.reserve(1000); //or more


I think this will probably be the crux of the problem, the other stuff I'd
consider "fine tuning", but worthwhile none the less.


vector::reserve is useful, but because of the exponential growth
behaviour of vector, it only tends to make a small difference. If you
don't pre-allocate, you typically only construct twice as many
objects. With a reference counted string (rare these days) the gain
will be even smaller.
read_names(x);
sort(x.begin(), x.end(), wordc);
// vector x is sorted
return 0;
}


Out of intestest, I wonder how this would perform using a set<> ? ie sorted
(operator<) on insert......


Probably quite a bit worse. The O complexity will be the same, but the
constant is likely to be bigger.

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
Jul 22 '05 #28
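
The set<> question discussed above can be tried directly. The sketch below sorts the same input both ways; the function names are invented for illustration. It uses std::multiset rather than std::set because a set would silently drop repeated words, which would change the output of a word-sorting task.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Sorts by copying everything into a vector and calling std::sort once.
std::vector<std::string> sort_with_vector(const std::vector<std::string>& input)
{
    std::vector<std::string> words(input);
    std::sort(words.begin(), words.end());
    return words;
}

// Sorts by inserting into a std::multiset, which keeps its elements in
// order. Each element lives in its own tree node, so this typically costs
// one extra allocation per word compared with the vector version.
std::vector<std::string> sort_with_multiset(const std::vector<std::string>& input)
{
    std::multiset<std::string> ordered(input.begin(), input.end());
    return std::vector<std::string>(ordered.begin(), ordered.end());
}
```

Both return the same sequence, so timing one against the other on the real input would settle whether the larger constant factor predicted above shows up in practice.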

P: n/a
"P.J. Plauger" <pj*@dinkumware.com> wrote in message
news:0%****************@nwrddc03.gnilink.net...
"David White" <no@email.provided> wrote in message
news:wx******************@nasal.pacific.net.au...
P.S. I know that the STL uses IntrospectiveSort, which seems to be a good
choice. I suppose that input (cin) is extremely slow,
and <vector> as a dynamic structure also isn't too fast... any ideas?


I doubt that it's inherently slow. It depends on the implementation. I
remember tracing through the stream output on an MS compiler and
discovered that it fabricated a format string and called sprintf! Fifty
billion dollars in the bank, but they chose the cheapest, nastiest
implementation possible.

Well, I didn't have access to any of that $50B when I wrote that code
in 1993, but I did write significant chunks of the C and C++ Standards
in those areas. I knew that printf gets right all sorts of subtle
corner cases that practically every iostreams implementation botched
one way or the other. I also had mineral rights to all the code I needed
to do the job other than `cheap and nasty,' and I was unable to get any
significant improvement over fabricating a format string and calling
sprintf.


That certainly sounds remarkable. In the case of string output with a given
field width there doesn't _seem_ to be a lot to do if it is done directly
(speaking from zero experience in implementing such things). To create a
format string and then have sprintf interpret it and then do the output
sounds like a lot of added overhead for a short string.

What about input? I remember reading a large text file full of numbers in
VC++ 5 or 6 and having to rewrite the code the C way because the C++
ifstream was many, many times slower. Maybe this is a clue to why sprintf
didn't make much difference to the output speed: there's already so much
overhead in C++ streams that using the C library didn't matter. If so,
programmers used to C won't exactly be encouraged to switch to streams.

FWIW, Microsoft's stash has roughly doubled since the day they chose to
adopt our cheap and nasty approach. Coincidence? (Probably.)


No, I think you deserve a cut :-)

DW

P.S. Something I couldn't remember was whether you used sprintf to fabricate
the format string as well. Did you?

Jul 22 '05 #29

P: n/a
"David White" <no@email.provided> wrote in message
news:gC******************@nasal.pacific.net.au...
"P.J. Plauger" <pj*@dinkumware.com> wrote in message
news:0%****************@nwrddc03.gnilink.net...
"David White" <no@email.provided> wrote in message
news:wx******************@nasal.pacific.net.au...
> p.s. i know that STL uses IntrospectiveSort which seems to be good choice, i
> suppose that INPUT (cin) is extremely slow,
> and <vector> as a dynamic structure also isn't too fast... any ideas?

I doubt that it's inherently slow. It depends on the implementation. I
remember tracing through the stream output on an MS compiler and discovering
that it fabricated a format string and called sprintf! Fifty billion dollars
in the bank, but they chose the cheapest, nastiest implementation possible.

Well, I didn't have access to any of that $50B when I wrote that code
in 1993, but I did write significant chunks of the C and C++ Standards
in those areas. I knew that printf gets right all sorts of subtle
corner cases that practically every iostreams implementation botched
one way or the other. I also had mineral rights to all the code I needed
to do the job other than `cheap and nasty,' and I was unable to get any
significant improvement over fabricating a format string and calling
sprintf.


That certainly sounds remarkable. In the case of string output with a given
field width there doesn't _seem_ to be a lot to do if it is done directly
(speaking from zero experience in implementing such things). To create a
format string and then have sprintf interpret it and then do the output
sounds like a lot of added overhead for a short string.

That's the trouble with software. What *seems* inefficient can only be
verified by measurement. One's intuition is so often wrong.

What about input? I remember reading a large text file full of numbers in
VC++ 5 or 6 and having to rewrite the code the C way because the C++
ifstream was many, many times slower. Maybe this is a clue to why sprintf
didn't make much difference to the output speed: there's already so much
overhead in C++ streams that using the C library didn't matter. If so,
programmers used to C won't exactly be encouraged to switch to streams.

Perhaps you ran afoul of the regrettable bug we had in that version.
See http://www.dinkumware.com/vc_fixes.html for the one-line fix.
If you opened a stream by filename, the bug defeated file buffering.
But absent this bug, the raw overhead of shoveling bytes through
iostreams isn't all that bad.

And to answer your leadoff question, we supply our own scanners for
integers and floating-point fields, since scanf has always been
klunkier than printf.

FWIW, Microsoft's stash has roughly doubled since the day they chose to
adopt our cheap and nasty approach. Coincidence? (Probably.)


No, I think you deserve a cut :-)


I keep trying...

P.S. Something I couldn't remember was whether you used sprintf to fabricate the format string as well. Did you?


No.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
Jul 22 '05 #30

P: n/a
Przemo Drochomirecki wrote:
[code redacted]
compiled under VC++6.0


There's your problem. VC5 and VC6 had a known buffering issue with ifstreams.
P.J. Plauger has commented on it in this thread, and the one-line fix is on
the Dinkumware website at http://www.dinkumware.com/vc_fixes.html.

red floyd

Jul 22 '05 #31
