STL speed

Hi,
The task was indeed simple: read 2.000.000 words (average length = 9), sort
them and write them to a new file.
I did this with the STL, and it was almost 17 times slower than my previous
"clear-C-code".
I used <vector>, <algorithm>, <iostream> and <algorithm>. Is STL really so
slow?

Thx in adv.
Przemo

p.s. I know that the STL uses IntrospectiveSort, which seems to be a good choice. I
suppose that INPUT (cin) is extremely slow,
and <vector> as a dynamic structure also isn't too fast... any ideas?
Jul 22 '05
> The task was indeed simple. Read 2.000.000 words (average length = 9),
> sort them and write it to new file.


So, the STL can reserve some space for each string.
Add the string object overhead and the vector object, and you can have
512 bytes for each word.
That is ~1 GB for all your data!!! Plus the program itself and the OS.
So your OS uses virtual memory, which is SLOOOOW.
The STL is a nice toy, but not for this kind of thing.
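
The per-word figure above can be sanity-checked rather than guessed. The following is only a rough sketch: `approx_footprint_bytes` is a hypothetical helper (not a library function), it ignores allocator bookkeeping, and it deliberately over-counts by charging every string for a separate character buffer even where a small-string optimisation would store the text inline.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: pessimistic estimate of the memory held by a
// vector of strings. Counts the vector's whole reserved array plus one
// character buffer of capacity()+1 bytes per string. Allocator
// bookkeeping is platform-specific and is not included.
std::size_t approx_footprint_bytes(const std::vector<std::string>& words) {
    std::size_t total = words.capacity() * sizeof(std::string);
    for (const std::string& w : words) {
        total += w.capacity() + 1;  // +1 for the terminating '\0'
    }
    return total;
}
```

Dividing the result by `words.size()` for a vector of nine-character words gives the approximate cost per word on a given platform, which can then be compared with the 512-byte estimate.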

Tõnu.
Jul 22 '05 #21
I'm quite new to STL too...

but one thought was about the dynamic resizing of the vector as you
fill it.
AFAIK the vector will extend itself dynamically until it can no longer
remain a contiguous allocation in memory, at which point it reallocates
itself in a new area of memory... (?)

Anyway, can expensive allocations / deallocations / copies be avoided
by using the reserve() member function to *try to* ensure that
there is enough contiguous space around your vector to avoid
reallocations...?

MHTWEOTSH...

Jul 22 '05 #22
Please don't top-post.

EnTn wrote:
> I'm quite new to STL too...
>
> but one thought was about the dynamic resizing of the vector as you
> fill it.
> AFAIK the vector will extend itself dynamically until it can no longer
> remain a contiguous allocation in memory, at which point it reallocates
> itself in a new area of memory... (?)

That's the idea.

> Anyway, can expensive allocations / deallocations / copies be avoided
> by using the reserve() member function to *try to* ensure that
> there is enough contiguous space around your vector to avoid
> reallocations...?

Yes, that's the whole purpose of the "reserve" method.

> MHTWEOTSH...

I give up. Is that a common greeting in some language I don't speak, or
does it stand for something?

Jul 22 '05 #23
tom_usenet wrote:
On Fri, 9 Jan 2004 04:07:02 -0800, "Przemo Drochomirecki"
<pe******@gazeta.pl> wrote:

"E. Robert Tisdale" <E.**************@jpl.nasa.gov> wrote in message
news:3F**********@jpl.nasa.gov...
Przemo Drochomirecki wrote:

> The task was indeed simple.
> Read 2.000.000 words (average length = 9),
> sort them and write it to new file.
> I've made this in STL,
> and it was almost 17 times slower than my previous "clear-C-code".
> I used <vector>, <algorithm>, <iostream> and <algorithm>.
> Is STL really so slow?

No. You just screwed up.
Post both your C and C++ code
so that we can see what you did wrong.

---STL CODE---

#include <string>
#include <conio.h>
#include <iostream>
#include <algorithm>
#include <vector>

using namespace std;
struct wordstruct { string word; };


Why have wordstruct? What's wrong with using string directly?

void read_names(vector<wordstruct>& x)
{
wordstruct q;
while (true) {
cin >> q.word;
if (q.word == string("0"))
break;


The above would be much more efficient as:

if (q.word == "0")

since that saves an extra memory allocation per iteration.


Are you sure here that the compiler would not just create a string("0")?

Personally I would use either a const static or a pre-constructed const
string. Is this overkill?

class HowIWouldDoIt
{
private:
    // constructed once, shared by every call to read_names
    static const std::string _testString;
public:
    void read_names(std::vector<wordstruct>& x);
};

const std::string HowIWouldDoIt::_testString = "0";

void HowIWouldDoIt::read_names(std::vector<wordstruct>& x)
{
    wordstruct q;
    while (true) {
        cin >> q.word;
        if (q.word == _testString)
            break;

cont...
<snip>
Here you should add:
std::ios::sync_with_stdio(false);
since buffering will be disabled on cin and cout on some
implementations if you don't.
Interesting... I've not used this before.
vector<wordstruct> x;


Here you might want to reserve some space in the vector:
x.reserve(1000); //or more


I think this will probably be the crux of the problem, the other stuff I'd
consider "fine tuning", but worthwhile none the less.
read_names(x);
sort(x.begin(), x.end(), wordc);
// vector x is sorted
return 0;
}


Out of interest, I wonder how this would perform using a set<>, i.e. sorted
(operator<) on insert...

:-)

--
Regards

Sean Clarke
-------------------------------------------------
Linux.... for those whose IQ is greater than 98 !!
Jul 22 '05 #24
EnTn wrote:

> Anyway, can expensive allocations / deallocations / copies be avoided
> by using the reserve() member function to *try to* ensure that
> there is enough contiguous space around your vector to avoid
> reallocations...?


That's the whole idea of reserve.

Another possibility for the OP would be to change strategy. He could
try to use a std::map instead of the vector. I did this once when
toying around (counting and sorting the words of the Bible) and achieved
a noticeable speedup.
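
A minimal sketch of the map-based approach, under my own assumptions: the post does not show its code, so `count_words` and its string-based interface are hypothetical. The map keeps its keys in sorted order and stores each distinct word once, which may be where a speedup comes from when the input has many repeated words.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Counts how often each whitespace-separated word occurs in `text`.
// Iterating over the returned map visits the words in sorted order.
std::map<std::string, std::size_t> count_words(const std::string& text) {
    std::map<std::string, std::size_t> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        ++counts[word];  // operator[] inserts a zero count on first sight
    }
    return counts;
}
```

Note that this fits the original task only if repeated words may be collapsed into a word plus a count; with mostly unique words the per-node overhead of the map works against it.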

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #25
On Fri, 09 Jan 2004 05:30:01 -0800, EnTn wrote:
> Anyway, can expensive allocations / deallocations / copies be avoided
> by using the reserve() member function to *try to* ensure that
> there is enough contiguous space around your vector to avoid
> reallocations...?


As long as you interpret *try to* as *throws an exception when out of
memory*, yes. A vector's contents are always contiguous.

HTH,
M4

Jul 22 '05 #26
On Fri, 09 Jan 2004 15:10:30 +0200, Tõnu Aas wrote:
>> The task was indeed simple. Read 2.000.000 words (average length = 9),
>> sort them and write it to new file.
>
> So, the STL can reserve some space for each string.
> Add the string object overhead and the vector object, and you can have
> 512 bytes for each word.
> That is ~1 GB for all your data!!! Plus the program itself and the OS.
> So your OS uses virtual memory, which is SLOOOOW.
> The STL is a nice toy, but not for this kind of thing.


No. This is an algorithmic issue. You can do the same in C, which will be
equally slow, or you can choose another algorithm.

This case is one where the overhead of the STL (if any) completely
disappears.

HTH,
M4

Jul 22 '05 #27
On Fri, 09 Jan 2004 14:47:31 +0000, Sean Clarke
<se*********@no-spam.sec-consulting.co.uk> wrote:

>> The above would be much more efficient as:
>>
>> if (q.word == "0")
>>
>> since that saves an extra memory allocation per iteration.
>
> Are you sure here that the compiler would not just create a string("0")?


Well, it's a QOI (quality of implementation) issue (operator== can make
as many allocations as it likes), but there is a

template<class charT, class traits, class Allocator>
bool operator==(const basic_string<charT,traits,Allocator>& lhs,
                const charT* rhs);

An implementation would have to be amazingly stupid to just forward
the call to the two-string version, triggering an unnecessary string
construction.
> Personally I would use either a const static or a pre-constructed const
> string. Is this overkill?

A pre-constructed one is a better solution - in theory the comparison
can be slightly faster.
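
The two variants under discussion can be put side by side as follows. This is a sketch only: the function names are made up for illustration, and whether either one is measurably faster is exactly the quality-of-implementation question raised above.

```cpp
#include <string>

// Compares against a string literal. This selects the
// operator==(const std::string&, const char*) overload, so the caller
// does not construct a temporary std::string for the comparison.
bool is_terminator_literal(const std::string& word) {
    return word == "0";
}

// Compares against a std::string that is constructed once, on first use,
// and reused on every later call.
bool is_terminator_preconstructed(const std::string& word) {
    static const std::string terminator("0");
    return word == terminator;
}
```

Both return the same result for every input; the choice only affects how much work the comparison may do per call.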
> class HowIWouldDoIt
> {
> private:
>     static const std::string _testString;
> public:
>     void read_names(std::vector<wordstruct>& x);
> };
>
> const std::string HowIWouldDoIt::_testString = "0";
>
> void HowIWouldDoIt::read_names(std::vector<wordstruct>& x)
> {
>     wordstruct q;
>     while (true) {
>         cin >> q.word;
>         if (q.word == _testString)
>             break;

How about just:

wordstruct q;
std::string const testString("0");
while (true) {
cin >> q.word;
if (q.word == testString)
break;
>> vector<wordstruct> x;
>>
>> Here you might want to reserve some space in the vector:
>> x.reserve(1000); //or more
>
> I think this will probably be the crux of the problem, the other stuff I'd
> consider "fine tuning", but worthwhile none the less.


vector::reserve is useful, but because of the exponential growth
behaviour of vector, it only tends to make a small difference. If you
don't pre-allocate, you typically only construct twice as many
objects. With a reference counted string (rare these days) the gain
will be even smaller.
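
The effect of exponential growth is easy to observe by counting how often the capacity changes. A minimal sketch, with the caveat that `count_reallocations` is a made-up helper and that the growth factor (and therefore the exact count) differs between standard library implementations:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Appends n placeholder words to a vector and returns how many times the
// vector's capacity changed, i.e. how many reallocations took place.
// If reserve_hint is non-zero, that much capacity is reserved up front.
std::size_t count_reallocations(std::size_t n, std::size_t reserve_hint) {
    std::vector<std::string> words;
    if (reserve_hint > 0) {
        words.reserve(reserve_hint);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = words.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        words.push_back("word");
        if (words.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = words.capacity();
        }
    }
    return reallocations;
}
```

With a sufficient hint the count is zero; without one it grows only logarithmically with n, which is why reserve tends to make a modest rather than a dramatic difference.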
>> read_names(x);
>> sort(x.begin(), x.end(), wordc);
>> // vector x is sorted
>> return 0;
>> }
>
> Out of interest, I wonder how this would perform using a set<>, i.e. sorted
> (operator<) on insert...


Probably quite a bit worse. The O complexity will be the same, but the
constant is likely to be bigger.
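
For anyone who wants to measure rather than guess, the two approaches can be written as interchangeable functions. This is a sketch with one deliberate substitution: it uses std::multiset rather than std::set, because a set would silently drop duplicate words and the results would no longer match the sorted vector. The function names are illustrative.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Collect first, sort once: one contiguous array, O(n log n) comparisons.
std::vector<std::string> sorted_via_vector(std::vector<std::string> words) {
    std::sort(words.begin(), words.end());
    return words;
}

// Sort on insert: one tree node allocated per word, also O(n log n)
// comparisons overall, but with more allocation and pointer chasing.
std::vector<std::string> sorted_via_multiset(
        const std::vector<std::string>& words) {
    std::multiset<std::string> ordered(words.begin(), words.end());
    return std::vector<std::string>(ordered.begin(), ordered.end());
}
```

Both functions return the same sequence for the same input, so they can be timed against each other on the real word list.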

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
Jul 22 '05 #28
"P.J. Plauger" <pj*@dinkumware.com> wrote in message
news:0%****************@nwrddc03.gnilink.net...
> "David White" <no@email.provided> wrote in message
> news:wx******************@nasal.pacific.net.au...
>>> p.s. i know that STL uses IntrospectiveSort which seems to be good
>>> choice, i suppose that INPUT (cin) is extremely slow,
>>> and <vector> as a dynamic structure also isn't too fast... any ideas?
>>
>> I doubt that it's inherently slow. It depends on the implementation. I
>> remember tracing through the stream output on an MS compiler and
>> discovered that it fabricated a format string and called sprintf! Fifty
>> billion dollars in the bank, but they chose the cheapest, nastiest
>> implementation possible.
>
> Well, I didn't have access to any of that $50B when I wrote that code
> in 1993, but I did write significant chunks of the C and C++ Standards
> in those areas. I knew that printf gets right all sorts of subtle
> corner cases that practically every iostreams implementation botched
> one way or the other. I also had mineral rights to all the code I needed
> to do the job other than `cheap and nasty,' and I was unable to get any
> significant improvement over fabricating a format string and calling
> sprintf.


That certainly sounds remarkable. In the case of string output with a given
field width there doesn't _seem_ to be a lot to do if it is done directly
(speaking from zero experience in implementing such things). To create a
format string and then have sprintf interpret it and then do the output
sounds like a lot of added overhead for a short string.

What about input? I remember reading a large text file full of numbers in
VC++ 5 or 6 and having to rewrite the code the C way because the C++
ifstream was many, many times slower. Maybe this is a clue to why sprintf
didn't make much difference to the output speed: there's already so much
overhead in C++ streams that using the C library didn't matter. If so,
programmers used to C won't exactly be encouraged to switch to streams.

> FWIW, Microsoft's stash has roughly doubled since the day they chose to
> adopt our cheap and nasty approach. Coincidence? (Probably.)


No, I think you deserve a cut :-)

DW

P.S. Something I couldn't remember was whether you used sprintf to fabricate
the format string as well. Did you?

Jul 22 '05 #29
"David White" <no@email.provided> wrote in message
news:gC******************@nasal.pacific.net.au...

> That certainly sounds remarkable. In the case of string output with a
> given field width there doesn't _seem_ to be a lot to do if it is done
> directly (speaking from zero experience in implementing such things). To
> create a format string and then have sprintf interpret it and then do the
> output sounds like a lot of added overhead for a short string.

That's the trouble with software. What *seems* inefficient can only be
verified by measurement. One's intuition is so often wrong.
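
Measuring need not be elaborate. Below is a minimal timing sketch using the standard `<chrono>` clock; `seconds_taken` is a hypothetical helper name, and a serious comparison would repeat each run several times and use an optimised build.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
// steady_clock is used because it never jumps backwards.
template <typename Work>
double seconds_taken(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Wrapping the read, sort and write phases of the program in separate calls shows which phase actually accounts for the slowdown.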

> What about input? I remember reading a large text file full of numbers in
> VC++ 5 or 6 and having to rewrite the code the C way because the C++
> ifstream was many, many times slower. Maybe this is a clue to why sprintf
> didn't make much difference to the output speed: there's already so much
> overhead in C++ streams that using the C library didn't matter. If so,
> programmers used to C won't exactly be encouraged to switch to streams.

Perhaps you ran afoul of the regrettable bug we had in that version.
See http://www.dinkumware.com/vc_fixes.html for the one-line fix.
If you opened a stream by filename, the bug defeated file buffering.
But absent this bug, the raw overhead of shoveling bytes through
iostreams isn't all that bad.

And to answer your leadoff question, we supply our own scanners for
integers and floating-point fields, since scanf has always been
klunkier than printf.

>> FWIW, Microsoft's stash has roughly doubled since the day they chose to
>> adopt our cheap and nasty approach. Coincidence? (Probably.)
>
> No, I think you deserve a cut :-)

I keep trying...

> P.S. Something I couldn't remember was whether you used sprintf to
> fabricate the format string as well. Did you?


No.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
Jul 22 '05 #30
