By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
457,976 Members | 1,162 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 457,976 IT Pros & Developers. It's quick & easy.

String performance

P: n/a
I have recently done a simple measurement of std::string performance in C++,
and I'm shocked how bad it is. C++ is first and foremost meant to be a
performance language, with near-zero overhead. This holds quite well with
regards to the language, but overheads of the std library are absolutely
abysmal. And string is one of the most often used datatypes.

The test program creates 10 million strings and puts them in a container
(see source code #1, #2 below).

Results:
C++ (VS2005, -O2): 7.265s
C# (VS2005): 0.56s

C# string is 13 times faster than C++ std::string. Horrible.

There's a crucial difference between C++ and C# strings - C# strings are
immutable. Does this make a difference? To test it, I created the simplest
and fastest possible C++ string implementation (see code #3 below). It is
absolutely minimalistic (only handles construction). It does not make any
memory allocations, all the memory is obtained from a static pool by simply
incrementing a pointer. There's no deallocation scheme of any sort
implemented - this would certainly slow things down additionally. I don't
think it is possible to create a faster implementation of string in C++ than
#3. _Any_ real-life implementation will be slower.

The results when std::string was replaced with my String?
C++ (VS2005, -O2): 0.64s - still slower than 0.56s of C#

C# string is still 14% faster than the fastest theoretically imaginable C++
string. What?! How did they do it?

One reason may be that C++ string is constructed from a const char *
pointer. This is a fatal C/C++ flaw - the size of such string is unknown,
even if it's given at compile time as literal such as "foo". To find out the
size, the string must be scanned at runtime for 0 character. C# does not
have this limitation. Compiler can place the string directly where it's
needed, and also store its length in a member variable. All during compile
time.

Any comments?

Test programs for reference:

// #1: C++
std::vector<std::stringv;
int main()
{
clock_t t1 = clock();
for (int i = 0; i < 10000000; ++i)
v.push_back("poo");
clock_t t2 = clock();
cout << double(t2 - t1) / CLOCKS_PER_SEC;
}
// #2: C#
class Test
{
public static void Main()
{
DateTime t1 = System.DateTime.Now;
List<stringv = new List<string>();
for (int i = 0; i < 10000000; ++i)
v.Add("foo");
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
}

// #3: std::string replacement used for testing: a fastest theoretically
possible C++ string implementation using an "infinite" memory pool (with no
deallocation)
struct String
{
String(const char *s = "")
{
str = ptr;
do // Copy string to the pool incrementing pool pointer
{
*ptr = *s;
++ptr;
++s;
} while (*s);
}
char *str; // Pointer to this string
data in the pool
static char buffer[100000000]; // Global pool
static char *ptr; // Global free memory
pointer in the pool
};
Oct 20 '07 #1
Share this Question
Share on Google+
24 Replies


P: n/a
Marcin Kalicinski wrote:
:: I have recently done a simple measurement of std::string
:: performance in C++, and I'm shocked how bad it is. C++ is first
:: and foremost meant to be a performance language, with near-zero
:: overhead. This holds quite well with regards to the language, but
:: overheads of the std library are absolutely abysmal. And string is
:: one of the most often used datatypes.
::
:: Test programs for reference:
::
:: // #1: C++
:: std::vector<std::stringv;
:: int main()
:: {

What happens if you add

v.reserve(10000000);

here?
:: clock_t t1 = clock();
:: for (int i = 0; i < 10000000; ++i)
:: v.push_back("poo");
:: clock_t t2 = clock();
:: cout << double(t2 - t1) / CLOCKS_PER_SEC;
:: }
::
::
:: // #2: C#
:: class Test
:: {
:: public static void Main()
:: {
:: DateTime t1 = System.DateTime.Now;
:: List<stringv = new List<string>();
:: for (int i = 0; i < 10000000; ++i)
:: v.Add("foo");
:: DateTime t2 = System.DateTime.Now;
:: Console.WriteLine(t2 - t1);
:: }
:: }
::
Are you sure you are testing std::string and not std::vector?

Bo Persson
Oct 20 '07 #2

P: n/a
What happens if you add
>
v.reserve(10000000);

Are you sure you are testing std::string and not std::vector?
Adding reserve of course makes things way faster:

C++ (VS2005, -O2): 7.265s

C++ (VS2005, -O2 with vector::reserve): 1.42s
C# (VS2005): 0.56s

But still 2.5 times slower than .NET without any "preallocation hints". To
make things more equal, I also altered C# code to preallocate collection
storage as well (see program below).

Result: C# (VS2005 using array): 0.14s

And it's back to over 10x faster than C++...

// C# code with preallocated storage for collection (i.e. roughly an
equivalent of using reserve on C++ vector)
public static void Main()
{
DateTime t1 = System.DateTime.Now;
string[] v = new string[10000000];
for (int i = 0; i < 10000000; ++i)
v[i] = "poo";
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}


Oct 20 '07 #3

P: n/a
* Marcin Kalicinski:
>What happens if you add

v.reserve(10000000);

Are you sure you are testing std::string and not std::vector?

Adding reserve of course makes things way faster:

C++ (VS2005, -O2): 7.265s

C++ (VS2005, -O2 with vector::reserve): 1.42s
C# (VS2005): 0.56s

But still 2.5 times slower than .NET without any "preallocation hints". To
make things more equal, I also altered C# code to preallocate collection
storage as well (see program below).

Result: C# (VS2005 using array): 0.14s

And it's back to over 10x faster than C++...

// C# code with preallocated storage for collection (i.e. roughly an
equivalent of using reserve on C++ vector)
public static void Main()
{
DateTime t1 = System.DateTime.Now;
string[] v = new string[10000000];
for (int i = 0; i < 10000000; ++i)
v[i] = "poo";
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
You didn't answer Bo's question. First you were testing vector against
string, now you've thrown in a string array.
--
Derek
Oct 20 '07 #4

P: n/a
Marcin Kalicinski wrote:
....
Any comments?

Test programs for reference:

// #1: C++
std::vector<std::stringv;
int main()
{
clock_t t1 = clock();
for (int i = 0; i < 10000000; ++i)
v.push_back("poo");
clock_t t2 = clock();
cout << double(t2 - t1) / CLOCKS_PER_SEC;
}
A slightly more comparable test:

std::vector<std::stringv;
int main()
{
v.reserve( 10000000 );
std::string poo = "poo";
clock_t t1 = clock();
for (int i = 0; i < 10000000; ++i)
{
v.push_back( poo );
}
std::clock_t t2 = std::clock();
std::cout << double(t2 - t1) / CLOCKS_PER_SEC;
}

If I recall correctly, C# is taking a reference to the objects while C++
you are making copies.

This below would be more akin to your C# example.

std::vector<std::string *v;
int main()
{
v.reserve( 10000000 );
std::string poo = "poo";
clock_t t1 = clock();
for (int i = 0; i < 10000000; ++i)
{
v.push_back( & poo );
}
std::clock_t t2 = std::clock();
std::cout << double(t2 - t1) / CLOCKS_PER_SEC;
}
>
// #2: C#
class Test
{
public static void Main()
{
DateTime t1 = System.DateTime.Now;
List<stringv = new List<string>();
for (int i = 0; i < 10000000; ++i)
v.Add("foo");
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
}
....
Oct 20 '07 #5

P: n/a
You didn't answer Bo's question. First you were testing vector against
string, now you've thrown in a string array.
Ok, you're right. I should also have used C++ array instead of std::vector
with reserve. So there we go:

std::string v[10000000];
int main()
{
clock_t t1 = clock();
for (int i = 0; i < 10000000; ++i)
v[i] = "foo";
clock_t t2 = clock();
cout << double(t2 - t1) / CLOCKS_PER_SEC;
}

Result: 0.42s

Faster than vector+reserve, but still 3x slower than C# 0.14s...

At that point I must say that I've started using .NET only quite recently,
but I've been programming C++ for some 10 years now. So initially I
approached this C# thing with a lot of prejudicament on how slow and bloated
it's gonna be compared to C++. But it turned out the opposite. It's blazing
fast. Granted you can eventually write a program in C++ that is faster than
C# version. But you must try really hard and go very low. Forget any nice
abstractions like std::string. To be faster you have to use strlen, strcmp,
allocate your strings from pools, manually manage pointers, basically go
through hell. And even then your string code may be just slightly faster.

Summarizing, after my experiences with .NET I think C++ could use a new std
library that could compete with latest from Microsoft (and Sun). Containers,
algorithms of the current std are allright, but string and especially
streams are really bad. More focus should be put on memory management.
Allocating everything with operator new by default is a performance killer.
Garbage collected heaps of Java/C# pretty much blow it to pieces. And std
allocators are rather hard to use, plus the entire design of them is flawed,
so it's hardly a real alternative.

cheers,
Marcin
Oct 20 '07 #6

P: n/a
On 2007-10-20 23:26, Marcin Kalicinski wrote:
I have recently done a simple measurement of std::string performance in C++,
and I'm shocked how bad it is. C++ is first and foremost meant to be a
performance language, with near-zero overhead. This holds quite well with
regards to the language, but overheads of the std library are absolutely
abysmal. And string is one of the most often used datatypes.

The test program creates 10 million strings and puts them in a container
(see source code #1, #2 below).
// #2: C#
class Test
{
public static void Main()
{
DateTime t1 = System.DateTime.Now;
List<stringv = new List<string>();
for (int i = 0; i < 10000000; ++i)
v.Add("foo");
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
}
I am not a C# expert, but I thing you are comparing apples and oranges
here. In C# a string is a reference type, and as you pointed out strings
are immutable. What your C# code above does is to create 1 (one) string
and then add 10000000 references to that in a container. The equivalent
C++ code would look something like this.

#include <time.h>
#include <iostream>
#include <vector>

int main()
{
std::vector<char*v;
clock_t t1 = clock();
for (int i = 0; i < 10000000; ++i)
v.push_back("poo");
clock_t t2 = clock();
std::cout << double(t2 - t1) / CLOCKS_PER_SEC;
}

Runs in 0.2 sec in release mode, but my hardware is probably different
from yours.

As you can see there are two problems with this code, first of it does
not involve strings and second, it only measures the performance of your
vector implementation.

--
Erik Wikström
Oct 20 '07 #7

P: n/a
Marcin Kalicinski wrote:
>You didn't answer Bo's question. First you were testing vector against
string, now you've thrown in a string array.

Ok, you're right. I should also have used C++ array instead of std::vector
with reserve. So there we go:

std::string v[10000000];
int main()
{
clock_t t1 = clock();
for (int i = 0; i < 10000000; ++i)
v[i] = "foo";
clock_t t2 = clock();
cout << double(t2 - t1) / CLOCKS_PER_SEC;
}

Result: 0.42s

Faster than vector+reserve, but still 3x slower than C# 0.14s...

At that point I must say that I've started using .NET only quite recently,
but I've been programming C++ for some 10 years now. So initially I
approached this C# thing with a lot of prejudicament on how slow and bloated
it's gonna be compared to C++. But it turned out the opposite. It's blazing
fast. Granted you can eventually write a program in C++ that is faster than
C# version. But you must try really hard and go very low. Forget any nice
abstractions like std::string. To be faster you have to use strlen, strcmp,
allocate your strings from pools, manually manage pointers, basically go
through hell. And even then your string code may be just slightly faster.
Correct me if I am wrong but your C# example not allocating any new
strings while the C++ version does !

The languages are different and you're not measuring apples to apples.
In many cases, it would be that you would do things less efficiently is
C++ and in many cases not. Performance really only matters when it can
make a difference and C++ has many tools available to you when you need
it. I don't know about C# enough to comment.

>
Summarizing, after my experiences with .NET I think C++ could use a new std
library that could compete with latest from Microsoft (and Sun). Containers,
algorithms of the current std are allright, but string and especially
streams are really bad.
You're not measuring string performance in your C# example so the
sentence above makes no sense.

More focus should be put on memory management.
Allocating everything with operator new by default is a performance killer.
Garbage collected heaps of Java/C# pretty much blow it to pieces.
Really ? Well, you can have a C++ garbage collector if you really want
to but there are many good reasons to remain explicit.

And std
allocators are rather hard to use, plus the entire design of them is flawed,
so it's hardly a real alternative.
What is the flaw ?
Oct 20 '07 #8

P: n/a
If I recall correctly, C# is taking a reference to the objects while C++
you are making copies.
Yes, but not in this case. In the test program every string in the C# list
is a separate instance of a string. Of course internally they may share the
same buffer (because the contents are the same), but that's invisible to the
user. But because string in C# are immutable, this is hard to show. A better
test (where all the strings are different) would be:

// C++
char str_template[] = "string template #\0\0\0\0\0\0\0\0\0\0";
std::string v[1000000];
int main()
{
clock_t t1 = clock();
for (int i = 0; i < 1000000; ++i)
{
itoa(i, str_template + 17, 10);
v[i] = str_template;
}
clock_t t2 = clock();
cout << double(t2 - t1) / CLOCKS_PER_SEC;
}
// C#
class Test
{
public static void Main()
{
DateTime t1 = System.DateTime.Now;
string[] v = new string[1000000];
for (int i = 0; i < 1000000; ++i)
v[i] = "string template #" + i.ToString();
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
}

And the results:
C++: 0.829
C#: 1.61

So indeed, when all the strings are different C++ is 2x as fast. My faith in
C++ was restored :-)

cheers,
Marcin
Oct 20 '07 #9

P: n/a
>Summarizing, after my experiences with .NET I think C++ could use a new
>std library that could compete with latest from Microsoft (and Sun).
Containers, algorithms of the current std are allright, but string and
especially streams are really bad.

You're not measuring string performance in your C# example so the sentence
above makes no sense.
Yes, I've made a mistake because all the strings were the same, so C#
compiler was just storing a reference to the same string. When I made them
all different (see the other post) C++ is 2x faster than C#.
More focus should be put on memory management.
>Allocating everything with operator new by default is a performance
killer. Garbage collected heaps of Java/C# pretty much blow it to pieces.
Really ? Well, you can have a C++ garbage collector if you really want to
but there are many good reasons to remain explicit.
Garbage collection for C++ is possible, but as long as we are working with
raw pointers it cannot be as good, because the collector cannot move objects
around. Fast collectors must move objects. Of course one can create a
"garbage collected pointer" class (I don't know if anyone has done this
before) that will support moving: gc_ptr<T>?
And std
>allocators are rather hard to use, plus the entire design of them is
flawed, so it's hardly a real alternative.

What is the flaw ?
The flaw is that using a different allocator changes the type of container.
So vector with allocator A1 has different type than vector with allocator
A2. You cannot write non-templated function that works with vector
regardless of its allocator.

cheers,
Marcin

Oct 20 '07 #10

P: n/a
As you can see there are two problems with this code, first of it does
not involve strings and second, it only measures the performance of your
vector implementation.
You're right. See the other post. When I made all strings different C++ is
2x faster than C#.

cheers,
Marcin
Oct 20 '07 #11

P: n/a
On 2007-10-21 00:56, Marcin Kalicinski wrote:
Allocating everything with operator new by default is a performance killer.
Garbage collected heaps of Java/C# pretty much blow it to pieces.
No, allocating everything with new is idiotic. In C++ you have the
choice of where to place objects, use it. Also, garbage collection might
seem faster since you do not have to spend any time when the object is
"deleted" (goes out of scope or whatever). However you still have to do
the actual garbage collection sometime and that takes time, and in
current C# implementations you never know when that will happen.

I am not saying that C# and such languages are no good, only that all
languages have their strengths and weaknesses, and if you do not use the
language properly the results will never be good. C# have many uses, but
for performance C++ is still the thing. Ever wondered what language the
JIT compiler that .Net apps depends on is written in and why?

--
Erik Wikström
Oct 20 '07 #12

P: n/a
Marcin Kalicinski wrote:
....
So indeed, when all the strings are different C++ is 2x as fast. My faith in
C++ was restored :-)
Um - not really. I think you're measuring your processor's main memory
bandwidth which has very little to do with the generated code.
Oct 21 '07 #13

P: n/a
>So indeed, when all the strings are different C++ is 2x as fast. My faith
>in C++ was restored :-)

Um - not really. I think you're measuring your processor's main memory
bandwidth which has very little to do with the generated code.
Probably. But that means that C++ requires 1/2 of the memory bandwidth to do
the same string operation as C#.
Oct 21 '07 #14

P: n/a
On 2007-10-21 01:38, Marcin Kalicinski wrote:
>>Summarizing, after my experiences with .NET I think C++ could use a new
std library that could compete with latest from Microsoft (and Sun).
Containers, algorithms of the current std are allright, but string and
especially streams are really bad.

You're not measuring string performance in your C# example so the sentence
above makes no sense.

Yes, I've made a mistake because all the strings were the same, so C#
compiler was just storing a reference to the same string. When I made them
all different (see the other post) C++ is 2x faster than C#.
> More focus should be put on memory management.
>>Allocating everything with operator new by default is a performance
killer. Garbage collected heaps of Java/C# pretty much blow it to pieces.
Really ? Well, you can have a C++ garbage collector if you really want to
but there are many good reasons to remain explicit.

Garbage collection for C++ is possible, but as long as we are working with
raw pointers it cannot be as good, because the collector cannot move objects
around. Fast collectors must move objects. Of course one can create a
"garbage collected pointer" class (I don't know if anyone has done this
before) that will support moving: gc_ptr<T>?
Not quite correct, there is nothing in the standard that says that a
pointer must contain the actual address of the object, it just has to
point to it in some way. This actually allows an implementation where
objects are moved. Of course, the problem with allowing objects to move
around is that all references to them usually involves an extra layer of
indirection.

--
Erik Wikström
Oct 21 '07 #15

P: n/a
On 2007-10-21 01:27, Marcin Kalicinski wrote:
>If I recall correctly, C# is taking a reference to the objects while C++
you are making copies.

Yes, but not in this case. In the test program every string in the C# list
is a separate instance of a string. Of course internally they may share the
same buffer (because the contents are the same), but that's invisible to the
user. But because string in C# are immutable, this is hard to show. A better
test (where all the strings are different) would be:

// C++
char str_template[] = "string template #\0\0\0\0\0\0\0\0\0\0";
std::string v[1000000];
int main()
{
clock_t t1 = clock();
for (int i = 0; i < 1000000; ++i)
{
itoa(i, str_template + 17, 10);
v[i] = str_template;
}
clock_t t2 = clock();
cout << double(t2 - t1) / CLOCKS_PER_SEC;
}
// C#
class Test
{
public static void Main()
{
DateTime t1 = System.DateTime.Now;
string[] v = new string[1000000];
for (int i = 0; i < 1000000; ++i)
v[i] = "string template #" + i.ToString();
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
}

And the results:
C++: 0.829
C#: 1.61

So indeed, when all the strings are different C++ is 2x as fast. My faith in
C++ was restored :-)
Actually I am afraid that you are now comparing the difference between
itoa() and ToString(). ToString() is probably not as fast as itoa()
since it performs a culture-aware conversion.

--
Erik Wikström
Oct 21 '07 #16

P: n/a
Gianni Mariani wrote:
:: Marcin Kalicinski wrote:
:::: You didn't answer Bo's question. First you were testing vector
:::: against string, now you've thrown in a string array.
:::
::: Ok, you're right. I should also have used C++ array instead of
::: std::vector with reserve. So there we go:
:::
::: std::string v[10000000];
::: int main()
::: {
::: clock_t t1 = clock();
::: for (int i = 0; i < 10000000; ++i)
::: v[i] = "foo";
::: clock_t t2 = clock();
::: cout << double(t2 - t1) / CLOCKS_PER_SEC;
::: }
:::
::: Result: 0.42s
:::
::: Faster than vector+reserve, but still 3x slower than C# 0.14s...
:::
::: At that point I must say that I've started using .NET only quite
::: recently, but I've been programming C++ for some 10 years now. So
::: initially I approached this C# thing with a lot of prejudicament
::: on how slow and bloated it's gonna be compared to C++. But it
::: turned out the opposite. It's blazing fast. Granted you can
::: eventually write a program in C++ that is faster than C# version.
::: But you must try really hard and go very low. Forget any nice
::: abstractions like std::string. To be faster you have to use
::: strlen, strcmp, allocate your strings from pools, manually manage
::: pointers, basically go through hell. And even then your string
::: code may be just slightly faster.
::
:: Correct me if I am wrong but your C# example not allocating any new
:: strings while the C++ version does !
::
:: The languages are different and you're not measuring apples to
:: apples. In many cases, it would be that you would do things less
:: efficiently is C++ and in many cases not. Performance really only
:: matters when it can make a difference and C++ has many tools
:: available to you when you need it. I don't know about C# enough
:: to comment.
::

Right. If all I really wanted was a vector of 10 million identical
strings, I would do it like this:

// #1c: C++
int main()
{
std::clock_t t1 = std::clock();

std::vector<std::stringv(10000000, "foo");

std::clock_t t2 = std::clock();
std::cout << double(t2 - t1) / CLOCKS_PER_SEC;
}

And it runs even faster.
Bo Persson

Oct 21 '07 #17

P: n/a
On Oct 20, 5:26 pm, "Marcin Kalicinski" <kal...@poczta.onet.plwrote:
I have recently done a simple measurement of std::string performance in C++,
and I'm shocked how bad it is. C++ is first and foremost meant to be a
performance language, with near-zero overhead. This holds quite well with
regards to the language, but overheads of the std library are absolutely
abysmal. And string is one of the most often used datatypes.

The test program creates 10 million strings and puts them in a container
(see source code #1, #2 below).

Results:
C++ (VS2005, -O2): 7.265s
C# (VS2005): 0.56s

C# string is 13 times faster than C++ std::string. Horrible.

There's a crucial difference between C++ and C# strings - C# strings are
immutable. Does this make a difference? To test it, I created the simplest
and fastest possible C++ string implementation (see code #3 below). It is
absolutely minimalistic (only handles construction). It does not make any
memory allocations, all the memory is obtained from a static pool by simply
incrementing a pointer. There's no deallocation scheme of any sort
implemented - this would certainly slow things down additionally. I don't
think it is possible to create a faster implementation of string in C++ than
#3. _Any_ real-life implementation will be slower.

The results when std::string was replaced with my String?
C++ (VS2005, -O2): 0.64s - still slower than 0.56s of C#

C# string is still 14% faster than the fastest theoretically imaginable C++
string. What?! How did they do it?

One reason may be that C++ string is constructed from a const char *
pointer. This is a fatal C/C++ flaw - the size of such string is unknown,
even if it's given at compile time as literal such as "foo". To find out the
size, the string must be scanned at runtime for 0 character. C# does not
have this limitation. Compiler can place the string directly where it's
needed, and also store its length in a member variable. All during compile
time.

Any comments?

Test programs for reference:

// #1: C++
std::vector<std::stringv;
int main()
{
clock_t t1 = clock();
for (int i = 0; i < 10000000; ++i)
v.push_back("poo");
clock_t t2 = clock();
cout << double(t2 - t1) / CLOCKS_PER_SEC;

}
On my system VC2005/same compile options (computer with low RAM)-- 210
secs.
Altering to write how you would see this code in real life--
clock_t t1 = clock();
v.resize(10000000,"poo");
clock_t t2 = clock();

This ran in under five seconds. I'm sure on your computer (that is
obvious more powerful than my laptop) you would see better results.

Lance

Oct 21 '07 #18

P: n/a
On Oct 21, 1:38 am, "Marcin Kalicinski" <kal...@poczta.onet.plwrote:
Allocating everything with operator new by default is a
performance killer. Garbage collected heaps of Java/C#
pretty much blow it to pieces.
Really ? Well, you can have a C++ garbage collector if you
really want to but there are many good reasons to remain
explicit.
Agreed, but this isn't one of them. For that matter, it's hard
to imagine a case where there's any good reason to explicitly
manage memory (just don't expect garbage collection to manage
anything else).
Garbage collection for C++ is possible, but as long as we are
working with raw pointers it cannot be as good, because the
collector cannot move objects around. Fast collectors must
move objects.
Not necessarily. There are many different garbage collection
algorithms, and it's not been proved that moving collectors are
systematically faster than non-moving ones. There has also been
work on "mostly moving" collectors for C++; I'm not sure of the
details, but apparently, the compiler is able to provide the
garbage collector with enough information that it can move most
of the allocated memory.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique oriente objet/
Beratung in objektorientierter Datenverarbeitung
9 place Smard, 78210 St.-Cyr-l'cole, France, +33 (0)1 30 23 00 34

Oct 21 '07 #19

P: n/a
On Oct 21, 11:58 am, Erik Wikstrm <Erik-wikst...@telia.comwrote:

[...]
Actually I am afraid that you are now comparing the difference
between itoa() and ToString(). ToString() is probably not as
fast as itoa() since it performs a culture-aware conversion.
What does "culture-aware" mean? In C, itoa depends on the
locale.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique oriente objet/
Beratung in objektorientierter Datenverarbeitung
9 place Smard, 78210 St.-Cyr-l'cole, France, +33 (0)1 30 23 00 34

Oct 21 '07 #20

P: n/a
On 2007-10-21 16:28, James Kanze wrote:
On Oct 21, 11:58 am, Erik Wikström <Erik-wikst...@telia.comwrote:

[...]
>Actually I am afraid that you are now comparing the difference
between itoa() and ToString(). ToString() is probably not as
fast as itoa() since it performs a culture-aware conversion.

What does "culture-aware" mean? In C, itoa depends on the
locale.
Not sure really, the .Net framework is not something I have studied in
detail. I am not sure what the difference between locale and culture is,
it might be two names for the same thing, but calling ToString() with no
arguments involves getting the culture object of the current thread,
from there you get a format provider object which will be used to format
the text representation of the object the function was called on.

While itoa() might also produce different output depending on the
current locale I do not think that it is quite as complicated as in
..Net, and to make things more fun ToString() is a virtual function
inherited from Object.

Nitpick: Of course, in C (as in standard C) itoa() is not defined.

--
Erik Wikström
Oct 21 '07 #21

P: n/a
On Oct 20, 7:38 pm, "Marcin Kalicinski" <kal...@poczta.onet.plwrote:
What is the flaw ?

The flaw is that using a different allocator changes the type of container.
So vector with allocator A1 has different type than vector with allocator
A2. You cannot write non-templated function that works with vector
regardless of its allocator.

cheers,
Marcin
It is possible to write STL containers (including strings) that use
"polymorphic allocators" such that changing the allocator does not
change the type. See http://www.lancediduck.com/papers/Cpp/StatefulSTL.pdf
I worked for a company recently that has an obsession with fixing this
"flaw" in standard container allocators--so much so that they actually
wrote their own version of the Standard libraries that "fixed" this.
And indeed, everyone could use std containers where you could pass in
a pointer to whatever allocator you dreamt up.
The resulting mess was beyond description, but suffice it to say that
those of us that actually had to build working scalable robust systems
quickly learned how to put the "change my type" allocators back in.
I hear over and over that the std allocator scheme is "broken"
"flawed" and such. And indeed it is hard to use. There are several
proposals for "improving" the allocators, two of which look promising
(N2035 and N1953). But often the fix is far worse than the problem.
In C++, by default, when you change the allocator (container or
otherwise) you change the type. This has been a feature of C++ since
Cfront 1.0. Using allocators that do not change the type is an option,
and this technique was use even in early C++. However, allocators that
know about the type of object they create allows many optimizations.

Java/C# use a form of the allocator that has O(1) heap allocations. C/C
++ typically use a system that has O(lg n) profile. This is why C/C++
will be at a disadvantage over Java/C# when making variable sized
objects such as vectors and strings, esp when using the default
allocator. But it is fairly easy (and seems to be a lost art) to make
a O(1) allocators in C++, as long as the allocator know about the type
of object being created. And you get to keep the destructors.

Lance


Oct 22 '07 #22

P: n/a
On Oct 21, 11:56 am, "Marcin Kalicinski" <kal...@poczta.onet.pl>
wrote:
One reason may be that C++ string is constructed from a const char *
pointer. This is a fatal C/C++ flaw - the size of such string is unknown,
even if it's given at compile time as literal such as "foo". To find out the
size, the string must be scanned at runtime for 0 character.

Granted you can eventually write a program in C++ that is faster than
C# version. But you must try really hard and go very low. Forget any nice
abstractions like std::string. To be faster you have to use strlen, strcmp,
You don't have a clue what you are talking about.
First you say that it is a flaw that you scan
the string for a 0 character at runtime. And then you
say that to make the program faster, you should call
functions that scan for 0 instead of using the stored length.

Oct 22 '07 #23

P: n/a
Marcin Kalicinski wrote:
Forget any nice
abstractions like std::string. To be faster you have to use strlen, strcmp,
How exactly will strlen() be faster than string::length(), given that
the latter just reads the value of a variable while the former always
scans the string?
Oct 24 '07 #24

P: n/a
On 20 Oct, 22:26, "Marcin Kalicinski" <kal...@poczta.onet.plwrote:
I have recently done a simple measurement of std::string performance in C++,
and I'm shocked how bad it is. <snip>
This probably ought to be in the FAQ. Very very often, when someone
says "Look, the performance of X is crap; my test code proves it!",
either their code is not measuring what they think it is (as in your
case), or their code is responsible for the crap performance, or
both. Don't guess - get a profiler.

Oct 24 '07 #25

This discussion thread is closed

Replies have been disabled for this discussion.