
StringBuilder much much faster and better than String for concatenation !!!

StringBuilder better and faster than string for adding many strings.

Look at the below. It's amazing how much faster StringBuilder is than
string.

The last loop below is telling: for adding 200000 strings of 8 char
each, string took over 25 minutes while StringBuilder took 40
milliseconds!

Can anybody explain such a radical difference?

The hardware running this program was a Pentium IV with 2 GB RAM.

RL

// stringbuilder much faster than string in concatenation

//////////////
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace console1
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("hi \n");
            // The constructor argument is the number of concatenations to time;
            // the results listed below come from varying this value.
            UpdateTime myUpdateTime = new UpdateTime(1000);
            myUpdateTime.UpdateTimeMethod();
            Console.WriteLine("times str,sb are: {0}, {1}",
                myUpdateTime.txtConcatTime, myUpdateTime.txtStringBTime);
        }
    }
}

/*
 * OUTPUT
 * results:
 * for 1000 iterations: string = 10.01 ms; stringbuilder = 0
 * for 5000 iterations: string = 410.6 ms; stringbuilder = 0
 * for 10k iterations: string = 1772.5 ms; stringbuilder = 0
 * for 50k iterations: string = 79013 ms; stringbuilder = 0
 * for 75k iterations: string = 186237.8 ms; stringbuilder = 20.03 ms
 * for 100k iterations: string = 334.4k ms (5.6 min); stringbuilder = 20.03 ms
 * for 200k iterations: string = 1515.6k ms (25.3 min); stringbuilder = 40.06 ms
 */
//////////////////////////////
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace console1
{
    class UpdateTime
    {
        int txtInterations;
        public string txtConcatTime;
        public string txtStringBTime;

        public UpdateTime(int i)
        {
            txtInterations = i;
            txtConcatTime = "";
            txtStringBTime = "";
        }

        public void UpdateTimeMethod()
        {
            int iterations = txtInterations;

            // Plain string concatenation: every += builds a new string.
            string theString = "MyString";
            DateTime strCall = DateTime.Now;
            string targetString = null;

            for (int x = 0; x < iterations; x++)
            {
                targetString += theString;
            }

            TimeSpan time = (DateTime.Now - strCall);
            txtConcatTime = time.TotalMilliseconds.ToString();

            // StringBuilder: every Append writes into the same growable buffer.
            DateTime inCall = DateTime.Now;
            string theString2 = "MyStrig2";
            StringBuilder sb = new StringBuilder(theString2);

            for (int x = 0; x < iterations; x++)
            {
                sb.Append(theString2);
            }

            time = (DateTime.Now - inCall);
            txtStringBTime = time.TotalMilliseconds.ToString();
        }
    }
}
/////////////////////
Sep 22 '08 #1
34 Replies


I've noticed this, and have had to use it in various places,
usually after I've run a profiler to see where it's needed.

Basically there's a fair bit of work when you add two strings together,
as you're making a new object each time. StringBuilder is a single object,
and so doesn't have anywhere near the same overhead to append.

Colin.
Sep 22 '08 #2

On Sep 22, 8:36 am, raylopez99 <raylope...@yahoo.com> wrote:
> StringBuilder better and faster than string for adding many strings.
>
> Look at the below. It's amazing how much faster StringBuilder is than
> string.
>
> The last loop below is telling: for adding 200000 strings of 8 char
> each, string took over 25 minutes while StringBuilder took 40
> milliseconds!
>
> Can anybody explain such a radical difference?
>
> The hardware running this program was a Pentium IV with 2 GB RAM.
>
> RL

See the following article.

http://www.yoda.arachsys.com/csharp/stringbuilder.html

Sep 22 '08 #3

raylopez99 <ra********@yahoo.com> wrote:
> StringBuilder better and faster than string for adding many strings.
>
> Look at the below. It's amazing how much faster StringBuilder is than
> string.
>
> The last loop below is telling: for adding 200000 strings of 8 char
> each, string took over 25 minutes while StringBuilder took 40
> milliseconds!
>
> Can anybody explain such a radical difference?

Very easily. It's all to do with creating copies. This difference is
the whole point of StringBuilder existing in the first place.

See http://pobox.com/~skeet/csharp/stringbuilder.html

--
Jon Skeet - <sk***@pobox.com>
Web site: http://www.pobox.com/~skeet
Blog: http://www.msmvps.com/jon.skeet
C# in Depth: http://csharpindepth.com
Sep 22 '08 #4

The "string table" you mention is the interner; by default, strings in
compiled code get interned, but not strings that you build at runtime
(for example, via concatenation); as such, the string "abc" never gets
collected, because it is interned. In fact, the compiler is too clever
by half, and actually does the "abcd" concatenation itself, so the
string "abcd" is interned too ;-p

StringBuilder *operates* like a list/array of characters, but is
actually implemented as a regular .NET string, which it abuses and
tortures to mutate at runtime.

Marc
Sep 22 '08 #5

On Sep 22, 7:15 am, Jon Skeet [C# MVP] <sk...@pobox.com> wrote:
> raylopez99 <raylope...@yahoo.com> wrote:
> > StringBuilder better and faster than string for adding many strings.
> >
> > Look at the below. It's amazing how much faster StringBuilder is than
> > string.
> >
> > The last loop below is telling: for adding 200000 strings of 8 char
> > each, string took over 25 minutes while StringBuilder took 40
> > milliseconds!
> >
> > Can anybody explain such a radical difference?
>
> Very easily. It's all to do with creating copies. This difference is
> the whole point of StringBuilder existing in the first place.
>
> See http://pobox.com/~skeet/csharp/stringbuilder.html
>
> --
> Jon Skeet - <sk...@pobox.com>
> Web site: http://www.pobox.com/~skeet
> Blog: http://www.msmvps.com/jon.skeet
> C# in Depth: http://csharpindepth.com

See http://pobox.com/~skeet/csharp/stringbuilder.html; that was a good
article...

Thanks Jon.

-Cnu

Sep 22 '08 #6

raylopez99 wrote:
> The last loop below is telling: for adding 200000 strings of 8 char
> each, string took over 25 minutes while StringBuilder took 40
> milliseconds!
>
> Can anybody explain such a radical difference?

Each time you add eight characters to the string, the entire string
is copied into a new string along with the new characters.

In the first iteration you copy 16 bytes (8 characters, two bytes each).
In the second iteration you copy 32 bytes.
In the third iteration you copy 48 bytes.
And so on...

When you reach the 200000th iteration, you will have copied:

16*(1+2+3+4+5+...+200000) = 16*(100000*200001) = 320001600000 bytes

That is 320 GB, which is 160 times more than the memory you have in your
computer. To create a string that is 1600000 characters, you have copied
100000 times that much data.

The StringBuilder has to grow its internal string several times during
the loop, but each time its size is doubled, so in the end the
StringBuilder will have copied about two times the size of the string.

So in this case the StringBuilder should be about 50000 times faster
than concatenating the strings, which corresponds to your result.

(If you specify the final size when creating the StringBuilder, the
internal string never has to be reallocated, so it will be twice as fast.)
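
The arithmetic above can be checked with a small sketch. It only counts bytes and builds no strings, so it runs instantly. CopyCost and both method names are made up for illustration, and the doubling buffer is an idealized model of the growth strategy described here, not the actual StringBuilder internals.

```csharp
using System;

// Hypothetical helper: counts copied bytes only, it never builds a string.
public static class CopyCost
{
    // Bytes copied when every += copies the whole result so far
    // (2 bytes per char, as in the calculation above).
    public static long NaiveConcatBytes(int iterations, int charsPerPiece)
    {
        long bytesPerPiece = 2L * charsPerPiece;
        long total = 0;
        for (long i = 1; i <= iterations; i++)
        {
            total += bytesPerPiece * i; // the i-th result holds i pieces
        }
        return total;
    }

    // Bytes copied by an idealized buffer that doubles its capacity when full.
    // Assumption: a simplified model, not the real StringBuilder implementation.
    public static long DoublingBufferBytes(int iterations, int charsPerPiece, int initialCapacityChars)
    {
        long capacity = initialCapacityChars;
        long length = 0;
        long copiedChars = 0;
        for (int i = 0; i < iterations; i++)
        {
            while (length + charsPerPiece > capacity)
            {
                copiedChars += length; // move existing contents into the bigger buffer
                capacity *= 2;
            }
            copiedChars += charsPerPiece; // write the new piece
            length += charsPerPiece;
        }
        return 2L * copiedChars;
    }
}
```

CopyCost.NaiveConcatBytes(200000, 8) gives 320001600000, the same figure as above, while CopyCost.DoublingBufferBytes(200000, 8, 16) stays under 10 MB.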

> int txtInterations;

If you want to use Hungarian notation to specify the data type, you
should not use a prefix that contradicts the data type.

However, in a type-safe language there isn't really any need to use
Hungarian notation to keep track of the data types.

--
Göran Andersson
_____
http://www.guffa.com
Sep 22 '08 #7

Göran Andersson wrote:
> When you reach the 200000th iteration, you will have copied:
>
> 16*(1+2+3+4+5+...+200000) = 16*(100000*200001) = 320001600000 bytes
>
> That is 320 GB. That is 160 times more than you have in your computer. To
> create a string that is 1600000 characters, you have copied 100000 times

Well that's interesting Goran. But my PC did not crash, and I don't
have 320 GB of HD, so somehow it must be doing some fancy stuff in the
background to truncate.

> The StringBuilder has to grow its internal string several times during
> the loop, but each time its size is doubled, so in the end the
> StringBuilder will have copied about two times the size of the string.
>
> So in this case the StringBuilder should be about 50000 times faster
> than concatenating the strings, which corresponds to your result.
>
> (If you specify the final size when creating the StringBuilder, the
> internal string never has to be reallocated, so it will be twice as fast.)

That's counterintuitive, if you're saying specifying the final size
will make StringBuilder *slower*. Very strange if true. Anyway I
never specify anything so I'm OK.

> int txtInterations;
>
> If you want to use Hungarian notation to specify the data type, you
> should not use a prefix that contradicts the data type.
>
> However, in a type-safe language there isn't really any need to use
> Hungarian notation to keep track of the data types.

And why is that? Anyhow, I just discovered this cool property for
runtime type checking:

// using public static object ChangeType(object value, Type conversionType);
// example:

Type myTargetType = typeof(int);
object theSourceStr = "42";
object theResult = Convert.ChangeType(theSourceStr, myTargetType);
Console.WriteLine(theResult); // 42
Console.WriteLine(theResult.GetType()); // System.Int32

// pretty cool, eh? I bet it only works though for 'primitive' data
// types like int, etc.
// UPDATE: I see C# has no easy way of casting any object...or so it
// seems. I'll post in a separate thread on this...

RL
Sep 22 '08 #8

Assuming all the calcs here are correct, your computer shouldn't crash:
you have COPIED 320 GB of data, but the largest string actually created
is only 1.6 million characters (about 3.2 MB).
After a string has been copied it is available for garbage collection, and
part of the vast time difference you are seeing is the GC doing its job.

Re specifying the final size being 'twice as fast': that would be twice
as fast as not specifying a final size, or 100000 times faster than
concatenating strings.
Sep 22 '08 #9

On Sep 22, 2:43 pm, raylopez99 <raylope...@yahoo.com> wrote:
> Göran Andersson wrote:
> > When you reach the 200000th iteration, you will have copied:
> > 16*(1+2+3+4+5+...+200000) = 16*(100000*200001) = 320001600000 bytes
> > That is 320 GB. That is 160 times more than you have in your computer. To
> > create a string that is 1600000 characters, you have copied 100000 times
>
> Well that's interesting Goran. But my PC did not crash, and I don't
> have 320 GB of HD, so somehow it must be doing some fancy stuff in the
> background to truncate.

All of that memory isn't in play simultaneously.

> > The StringBuilder has to grow its internal string several times during
> > the loop, but each time its size is doubled, so in the end the
> > StringBuilder will have copied about two times the size of the string.
> > So in this case the StringBuilder should be about 50000 times faster
> > than concatenating the strings, which corresponds to your result.
> > (If you specify the final size when creating the StringBuilder, the
> > internal string never has to be reallocated, so it will be twice as fast.)
>
> That's counterintuitive, if you're saying specifying the final size
> will make StringBuilder *slower*. Very strange if true. Anyway I
> never specify anything so I'm OK.

He said it would be twice as *fast*.

> > int txtInterations;
> > If you want to use Hungarian notation to specify the data type, you
> > should not use a prefix that contradicts the data type.
> > However, in a type-safe language there isn't really any need to use
> > Hungarian notation to keep track of the data types.
>
> And why is that?

I'll pass on that.

> Anyhow, I just discovered this cool property for
> runtime type checking:
>
> // using public static object ChangeType(object value, Type conversionType);
> // example:
>
> Type myTargetType = typeof(int);
> object theSourceStr = "42";
> object theResult = Convert.ChangeType(theSourceStr, myTargetType);
> Console.WriteLine(theResult); // 42
> Console.WriteLine(theResult.GetType()); // System.Int32
>
> // pretty cool, eh? I bet it only works though for 'primitive' data
> // types like int, etc.
> // UPDATE: I see C# has no easy way of casting any object...or so it
> // seems. I'll post in a separate thread on this...

It does, but like you said it's better to post that question in
another thread.
Sep 22 '08 #10

Brian Gideon wrote:

OK, got it. And I understand now that Hungarian notation isn't needed,
since the compiler will catch your error.

RL
Sep 22 '08 #11

On Sep 22, 3:37 pm, raylopez99 <raylope...@yahoo.com> wrote:
> Brian Gideon wrote:
>
> OK, got it. And I understood now Hungarian Notation not needed since
> the compiler will catch your error.
>
> RL

Basically. And for what it's worth I've adopted the "m_" prefix for
instance members and "s_" for static members.
Sep 22 '08 #12

On Mon, 22 Sep 2008 09:58:48 -0700, Göran Andersson <gu***@guffa.com>
wrote:
> [...]
> > int txtInterations;
>
> If you want to use Hungarian notation to specify the data type, you
> should not use a prefix that contradicts the data type.

I hope that's a statement that anyone can perceive as trivially true. :)

> However, in a type-safe language there isn't really any need to use
> Hungarian notation to keep track of the data types.

I will insert my standard disclaimer here:

Even in a type unsafe language, Hungarian's primary purpose isn't to keep
track of the data type. If and when the type is identical to the semantic
of the variable, then of course it will. But otherwise, the type tag (not
prefix) in Hungarian reflects the _semantic_ usage of the data, not its
literal type.

In fact, for variables that are typed as built-in types such as "int",
"char", etc. the tag will most often _not_ reflect the actual type of the
variable. For example, "x", "dx", and "cx" are common variable names when
dealing with the X coordinate in a Cartesian coordinate space, but they
can all refer to a variety of integral types: "int", "short", "ushort",
"long", etc. In the Hungarian philosophy, the naming is there to ensure
semantic correctness, not compiler correctness.

This is in fact why Hungarian is still valuable even when using a
strongly-typed language.

This is something that Microsoft's "Systems" version of Hungarian gets
very, very wrong. Unfortunately, that's the Hungarian most people are
exposed to.

Pete
Sep 22 '08 #13

Peter Duniho wrote:
> Even in a type unsafe language, Hungarian's primary purpose isn't to
> keep track of the data type. If and when the type is identical to the
> semantic of the variable, then of course it will. But otherwise, the
> type tag (not prefix) in Hungarian reflects the _semantic_ usage of the
> data, not its literal type.
>
> In fact, for variables that are typed as built-in types such as "int",
> "char", etc. the tag will most often _not_ reflect the actual type of
> the variable. For example, "x", "dx", and "cx" are common variable
> names when dealing with the X coordinate in a Cartesian coordinate
> space, but they can all refer to a variety of integral types: "int",
> "short", "ushort", "long", etc. In the Hungarian philosophy, the naming
> is there to ensure semantic correctness, not compiler correctness.
>
> This is in fact why Hungarian is still valuable even when using a
> strongly-typed language.
>
> This is something that Microsoft's "Systems" version of Hungarian gets
> very, very wrong. Unfortunately, that's the Hungarian most people are
> exposed to.
>
> Pete

I've done quite a bit of ASP/VBScript, and although it wasn't exactly the
original intention of Hungarian notation, using it to keep track of
the data type is very useful in that environment. Otherwise you could
easily get surprised by the results, like:

Dim page
page = Request.QueryString("page")
If page = 42 Then
' we don't get here even if we put 42 in the query string
' as the variable page contains "42", not 42.
End If

--
Göran Andersson
_____
http://www.guffa.com
Sep 22 '08 #14

raylopez99 wrote:
> Göran Andersson wrote:
> > When you reach the 200000th iteration, you will have copied:
> >
> > 16*(1+2+3+4+5+...+200000) = 16*(100000*200001) = 320001600000 bytes
> >
> > That is 320 GB. That is 160 times more than you have in your computer. To
> > create a string that is 1600000 characters, you have copied 100000 times
>
> Well that's interesting Goran. But my PC did not crash, and I don't
> have 320 GB of HD, so somehow it must be doing some fancy stuff in the
> background to truncate.

For each string that you create, the previous string is up for garbage
collection.

As you go through a lot more memory than there is in the computer, it
means that it has done more than 160 garbage collections during the
loop, and probably something closer to 1000.

> > The StringBuilder has to grow its internal string several times during
> > the loop, but each time its size is doubled, so in the end the
> > StringBuilder will have copied about two times the size of the string.
> >
> > So in this case the StringBuilder should be about 50000 times faster
> > than concatenating the strings, which corresponds to your result.
> >
> > (If you specify the final size when creating the StringBuilder, the
> > internal string never has to be reallocated, so it will be twice as fast.)
>
> That's counterintuitive, if you're saying specifying the final size
> will make StringBuilder *slower*. Very strange if true. Anyway I
> never specify anything so I'm OK.

If you find it counterintuitive, then perhaps you should read it again
to see if you got it right. In this case you got it backwards.

> Anyhow, I just discovered this cool property for
> runtime type checking:
>
> // using public static object ChangeType(object value, Type conversionType);
> // example:
>
> Type myTargetType = typeof(int);
> object theSourceStr = "42";
> object theResult = Convert.ChangeType(theSourceStr, myTargetType);
> Console.WriteLine(theResult); // 42
> Console.WriteLine(theResult.GetType()); // System.Int32
>
> // pretty cool, eh? I bet it only works though for 'primitive' data
> // types like int, etc.

The type has to implement the IConvertible interface, which the
primitive types do.

The primitive types already have methods in the Convert class, so
instead of doing Convert.ChangeType(var, typeof(int)) you can use
Convert.ToInt32(var).

Casting to a type that you specify dynamically isn't very useful in a
strongly typed language. You can't do much with the data anyway without
casting the reference to the actual type.

> // UPDATE: I see C# has no easy way of casting any object...or so it
> // seems. I'll post in a separate thread on this...

Converting a string to an int is parsing. There are several methods for
doing that, like int.Parse(string), int.TryParse(string, out int),
Convert.ToInt32(string)...

For casting, C# uses the same syntax as C/C++:

int value = 42;
long bigValue = (long)value;
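
To put the options above side by side, here is a minimal sketch. The class ConversionDemo and its method names are invented for illustration; the calls inside them (int.Parse, int.TryParse, Convert.ChangeType and a plain cast) are the ones mentioned in this post.

```csharp
using System;

// Hypothetical wrapper class; only the framework calls inside are from the thread.
public static class ConversionDemo
{
    // Parsing: turns text into a number and throws FormatException on bad input.
    public static int ParseStrict(string text)
    {
        return int.Parse(text);
    }

    // TryParse: reports failure through its return value instead of throwing.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }

    // Convert.ChangeType: the target type is chosen at run time,
    // so the result can only come back typed as object.
    public static object ChangeTo(object source, Type target)
    {
        return Convert.ChangeType(source, target);
    }

    // Cast: a conversion between types that are both known at compile time.
    public static long Widen(int value)
    {
        return (long)value;
    }
}
```

ConversionDemo.ChangeTo("42", typeof(int)) hands back a boxed int, which still has to be cast to int before you can do arithmetic with it. That is the point about dynamic conversion not buying much in a strongly typed language.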

--
Göran Andersson
_____
http://www.guffa.com
Sep 22 '08 #15

"raylopez99" <ra********@yahoo.com> wrote in message
news:46**********************************@m45g2000hsb.googlegroups.com...
> StringBuilder better and faster than string for adding many strings.
>
> Look at the below. It's amazing how much faster StringBuilder is than
> string.
>
> The last loop below is telling: for adding 200000 strings of 8 char
> each, string took over 25 minutes while StringBuilder took 40
> milliseconds!
>
> Can anybody explain such a radical difference?

You got lots of answers as to why String performed so badly in this
situation compared to StringBuilder.
Many people misinterpret this to mean that you should always prefer
StringBuilder over String when doing concatenation.
But that is not a safe assumption.

For straight concatenation like this

string foo = "A" + "B" + "C"...

String is optimized to handle this case and it is very efficient.

Even for strings constructed by looping

String foo = "";
foreach (string str in Bar)
{
    foo += str;
}

if the number of loops is under 10-20, this code can actually outperform
similar code using StringBuilder.

Obviously you saw what can happen if the number gets large, so it is
good to understand how it works under the covers and use the right tool
for the job.
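
The cases above look like this in code. This is only a sketch: SmallConcat and its members are made-up names, and it shows the shapes being compared rather than measuring them.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical class for illustration only.
public static class SmallConcat
{
    // All operands are literals, so the C# compiler folds them into the constant "ABC".
    public const string Folded = "A" + "B" + "C";

    // Non-constant operands in a single expression: the compiler emits one
    // concatenation call for the whole expression, not one temporary per '+'.
    public static string OneExpression(string a, string b, string c)
    {
        return a + b + c;
    }

    // Loop with +=: every pass allocates a new string holding everything so far.
    public static string LoopConcat(IEnumerable<string> parts)
    {
        string result = "";
        foreach (string part in parts)
        {
            result += part;
        }
        return result;
    }

    // Loop with StringBuilder: pieces go into one buffer, one string is made at the end.
    public static string LoopBuilder(IEnumerable<string> parts)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }
}
```

LoopConcat and LoopBuilder return the same text for the same input; they differ only in how much copying happens on the way, which is what matters once the number of parts gets large.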

Bill


Sep 23 '08 #16

Göran Andersson wrote:
> Casting to a type that you specify dynamically isn't very useful in a
> strongly typed language. You can't do much with the data anyway without
> casting the reference to the actual type.

Very good. It explains why C# doesn't have, like C++/CLI, this
function "safe_cast":
Object ^ obj = safe_cast<Object^>(anEnumHere->Current);

Or maybe not.

Anyway today, via FTM's and Jon Skeet's help, I learned this query,
for iterating through a list having mixed strings and ints and picking
out the first two letters of the strings:
List<object> words = new List<object> { "green", "blue", 3,
"violet", 5 };

IEnumerable<string> query =
words.AsQueryable().OfType<string>().Cast<string>().Select(str =>
str.Substring(0, Math.Min(str.Length, 2))); // prevents out-of-range
// exception and also string cast problem

Pretty cool eh? On top of doing my day job, which is heating up
now... multitasking big time...I gotta run play a game of online chess
then crash...

RL
Sep 23 '08 #17

On Mon, 22 Sep 2008 17:57:34 -0700, Bill Butler <qw****@asdf.com> wrote:
> [...]
> Even for strings constructed by looping
>
> String foo = "";
> foreach(string str in Bar)
> {
> foo += str;
> }
>
> If the number of loops is under 10-20 this code can actually outperform
> similar code using StringBuilder

For what it's worth, being curious, I did a quick test. I found that the
two techniques reach near-parity at just 5 concatenations, and StringBuilder
is definitively faster at 10 concatenations. At 20, there's no contest.

Even with just two concatenations, the difference is "only" a factor of
2. And of course, in that situation the performance of the actual
concatenation is unlikely to be significant in the overall algorithm.

To me, that means that it's "safe" to use plain string concatenation as a
_simplification_ of the code, when you're sure that the number of
concatenations will be small. I would even accept the performance
overhead up to 20 concatenations or so, as long as it wasn't a critical
difference, because I do feel the code is easier to read. But only when I
could be assured the number of concatenations wouldn't ever be much
greater than that.

It's hard to imagine a situation in which choosing string concatenation
over StringBuilder would be a legitimate real-world performance
optimization. Practically any situation in which you have few enough
concatenations for string concatenation to win, there would be too few
concatenations for the concatenation to matter much at all.

Again, perfectly valid as a code maintenance/readability choice in certain
"safe" situations, but probably not something someone's going to do as a
performance enhancement. For example, I wouldn't create code that has a
special code-path for small numbers of concatenations just to take
advantage of that performance difference. The decrease in maintainability
isn't worth the marginal improvement in performance, even in the best case.

Pete
Sep 23 '08 #18

"Peter Duniho" <Np*********@nnowslpianmk.com> wrote in message
news:op***************@petes-computer.local...
> On Mon, 22 Sep 2008 17:57:34 -0700, Bill Butler <qw****@asdf.com>
> wrote:
> > [...]
> > Even for strings constructed by looping
> >
> > String foo = "";
> > foreach(string str in Bar)
> > {
> > foo += str;
> > }
> >
> > If the number of loops is under 10-20 this code can actually
> > outperform similar code using StringBuilder
>
> For what it's worth, curious I did a quick test. I found that the two
> techniques reach near-parity at just 5 concatenations, and
> StringBuilder is definitively faster at 10 concatenations. At 20,
> there's no contest.

Correct. That's what I get for trying to pull the number from memory.

> Even with just two concatenations, the difference is "only" a factor
> of 2. And of course, in that situation the performance of the actual
> concatenation is unlikely to be significant in the overall algorithm.
>
> To me, that means that it's "safe" to use plain string concatenation
> as a _simplification_ of the code, when you're sure that the number
> of concatenations will be small. I would even accept the performance
> overhead up to 20 concatenations or so, as long as it wasn't a
> critical difference, because I do feel the code is easier to read.
> But only when I could be assured the number of concatenations
> wouldn't ever be much greater than that.
>
> It's hard to imagine a situation in which choosing string
> concatenation over StringBuilder would be a legitimate real-world
> performance optimization. Practically any situation in which you
> have few enough concatenations for string concatenation to win, there
> would be too few concatenations for the concatenation to matter much
> at all.
>
> Again, perfectly valid as a code maintenance/readability choice in
> certain "safe" situations, but probably not something someone's going
> to do as a performance enhancement. For example, I wouldn't create
> code that has a special code-path for small numbers of concatenations
> just to take advantage of that performance difference. The decrease
> in maintainability isn't worth the marginal improvement in
> performance, even in the best case.

I agree completely.
But I have seen folks that avoid string concatenation like the plague,
since they heard that it is slower.
In most cases readability triumphs.

Thanks for the correction
Bill


Sep 23 '08 #19

On Mon, 22 Sep 2008 21:39:26 -0700, Bill Butler <qw****@asdf.com> wrote:
> I agree completely.
> But I have seen folks that avoid string concatenation like the plague,
> since they heard that it is slower.

I agree that avoiding anything "like the plague" can often lead to poor
design and implementation choices. It's important to know the trade-offs.

> In most cases readability triumphs.

I don't know about "most". There are lots of examples of theoretically
unbounded scenarios. Often what happens is that there's no guarantee of a
certain bounding, but an immediately expected data set that empirically
has a low bound. Then once the code has been "in the wild" for a while,
the data set changes and the inefficient approach becomes painfully
apparent.

I would say that only in the situation where it is _guaranteed_ up front
that the number of concatenations is bounded to some relatively small
number (dozens, or if the concatenation is in a known "not bottleneck"
section of code perhaps hundreds) should the readability of straight
string concatenation be preferred. I suspect that these scenarios don't
in fact make up "most cases" (I know that in my own experience they don't,
but lacking a real survey of real-world code I can't state for sure what's
universally true :) ).

Barring those conditions, it's just too dangerous to not be using
StringBuilder, especially given that readability doesn't suffer _that_
much for using it. I dislike premature optimizations as much as the next
guy, but in this case the potential for catastrophic performance problems
outweighs the relatively minor improvement in readability.

> Thanks for the correction

No problem...thank you for inspiring the test! :)

Pete
Sep 23 '08 #20

raylopez99 wrote:
> Göran Andersson wrote:
> > Casting to a type that you specify dynamically isn't very useful in a
> > strongly typed language. You can't do much with the data anyway without
> > casting the reference to the actual type.
>
> Very good. It explains why C# doesn't have, like C++/CLI, this
> function "safe_cast":
> Object ^ obj = safe_cast<Object^>(anEnumHere->Current);
>
> Or maybe not.

Actually it doesn't. The reason that there is no "safe_cast" in C# is
that every cast is safe. If you want an unsafe cast you have to do it in
an unsafe code block, and even then you might have to cast it to a void
pointer before casting to a different type to prevent the compiler from
telling you that you are doing something wrong.

> Anyway today, via FTM's and Jon Skeet's help, I learned this query,
> for iterating through a list having mixed strings and ints and picking
> out the first two letters of the strings:
> List<object> words = new List<object> { "green", "blue", 3,
> "violet", 5 };
>
> IEnumerable<string> query =
> words.AsQueryable().OfType<string>().Cast<string>().Select(str =>
> str.Substring(0, Math.Min(str.Length, 2))); // prevents out-of-range
> // exception and also string cast problem

The OfType extension already returns a typed sequence, so you don't
need the Cast extension. Also, OfType is an extension of
IEnumerable, so you don't need the AsQueryable extension to use it on
the List:

IEnumerable<string> query = words.OfType<string>().Select(str =>
str.Substring(0, Math.Min(str.Length, 2)));
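
Wrapped up so it can be compiled and run on its own, the simplified query might look like the sketch below. PrefixQuery and FirstTwoLetters are made-up names, and the ToList call is added only so the result is easy to inspect.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper around the simplified query.
public static class PrefixQuery
{
    // Keeps only the strings in a mixed list and returns at most
    // the first two characters of each one.
    public static List<string> FirstTwoLetters(IEnumerable<object> items)
    {
        return items
            .OfType<string>() // skips the ints, yields the strings already typed
            .Select(str => str.Substring(0, Math.Min(str.Length, 2))) // Math.Min guards short strings
            .ToList();
    }
}
```

For the list in the quoted post ("green", "blue", 3, "violet", 5) this yields "gr", "bl" and "vi".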

--
Göran Andersson
_____
http://www.guffa.com
Sep 23 '08 #21

On Sep 22, 6:21 pm, "Peter Duniho" <NpOeStPe...@nnowslpianmk.com>
wrote:
> For what it's worth, curious I did a quick test. I found that the two
> techniques reach near-parity at just 5 concatenations, and StringBuilder
> is definitively faster at 10 concatenations. At 20, there's no contest.

I'm curious how you did such a quick test, especially since the
DateTime structure is only accurate to at best 10 ms or greater.

You lying *again*, Peter Duniho?

Hahahaha.

RL
Sep 23 '08 #22

"raylopez99" <ra********@yahoo.com> wrote in message
news:98**********************************@x35g2000hsb.googlegroups.com...
> On Sep 22, 6:21 pm, "Peter Duniho" <NpOeStPe...@nnowslpianmk.com>
> wrote:
> > For what it's worth, curious I did a quick test. I found that the two
> > techniques reach near-parity at just 5 concatenations, and
> > StringBuilder is definitively faster at 10 concatenations. At 20,
> > there's no contest.
>
> I'm curious how you did such a quick test, especially since the
> DateTime structure is only accurate to at best 10 ms or greater.

You simply repeat each test about 100,000 times in a loop.
Voilà!
Sep 23 '08 #23

Bill Butler wrote:
>> I'm curious how you did such a quick test, especially since the
>> DateTime structure is only accurate to at best 10 ms or greater.
>
> You simply repeat each test about 100,000 times in a loop.
> Voilà!

But Peter said:
> For what it's worth, curious I did a quick test. I found that the two
> techniques reach near-parity at just 5 concatenations, and StringBuilder
> is definitively faster at 10 concatenations. At 20, there's no contest.

Which implies he only tested five concatenations, and up to 20, no
more.

RL
Sep 24 '08 #24

On Sep 24, 10:36 am, raylopez99 <raylope...@yahoo.com> wrote:
> But Peter said:
>> For what it's worth, curious I did a quick test. I found that the two
>> techniques reach near-parity at just 5 concatenations, and StringBuilder
>> is definitively faster at 10 concatenations. At 20, there's no contest.
>
> Which implies he only tested five concatenations, and up to 20, no
> more.

You can repeatedly do 20 concatenations though. Doing 20
concatenations 100,000 times is not the same thing as doing 20 *
100,000 concatenations.

Jon

Sep 24 '08 #25

Jon Skeet [C# MVP] wrote:
> On Sep 24, 10:36 am, raylopez99 <raylope...@yahoo.com> wrote:
>> But Peter said:
>>> For what it's worth, curious I did a quick test. I found that the two
>>> techniques reach near-parity at just 5 concatenations, and StringBuilder
>>> is definitively faster at 10 concatenations. At 20, there's no contest.
>>
>> Which implies he only tested five concatenations, and up to 20, no
>> more.
>
> You can repeatedly do 20 concatenations though. Doing 20
> concatenations 100,000 times is not the same thing as doing 20 *
> 100,000 concatenations.

OK, I see. Do 20 loops, store the time difference, and repeat 100k
times. Makes sense.

RL
Sep 24 '08 #26

On Sep 24, 12:28 pm, raylopez99 <raylope...@yahoo.com> wrote:
>> You can repeatedly do 20 concatenations though. Doing 20
>> concatenations 100,000 times is not the same thing as doing 20 *
>> 100,000 concatenations.
>
> OK, I see. Do 20 loops, store the time difference, and repeat 100k
> times. Makes sense.

Not quite. The point is to only compare the times at the very start
and end:

1) Start Stopwatch (System.Diagnostics.Stopwatch)
2) Concatenate string 20 times
3) Repeat step 2 100,000 times
4) Stop Stopwatch
5) Report results

Jon
Sep 24 '08 #27
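
The five steps in the post above can be sketched roughly as follows. This is an illustration under assumptions, not the test Peter actually ran: the class and method names are invented, and the 8-character string is borrowed from the original post in this thread. The Stopwatch is read only once, after all 100,000 repetitions of the 20-concatenation job:

```csharp
using System;
using System.Diagnostics;
using System.Text;

class ConcatBenchmark
{
    const int Concatenations = 20;   // step 2: size of one small job
    const int Repetitions = 100000;  // step 3: how often the job is repeated

    // Builds a string with repeated += (a new string is allocated each time).
    public static string BuildWithString(int count)
    {
        string s = "";
        for (int i = 0; i < count; i++) s += "abcdefgh";
        return s;
    }

    // Builds the same string by appending to a StringBuilder.
    public static string BuildWithStringBuilder(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) sb.Append("abcdefgh");
        return sb.ToString();
    }

    // Steps 1, 3 and 4: start the Stopwatch, repeat the job, stop, return ms.
    public static long TimeMs(Func<int, string> build)
    {
        Stopwatch sw = Stopwatch.StartNew();
        int totalLength = 0;
        for (int r = 0; r < Repetitions; r++)
            totalLength += build(Concatenations).Length; // use the result
        sw.Stop();
        if (totalLength != Repetitions * Concatenations * 8)
            throw new InvalidOperationException("unexpected total length");
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Step 5: report results (actual numbers depend on the machine).
        Console.WriteLine("string:        {0}ms", TimeMs(BuildWithString));
        Console.WriteLine("StringBuilder: {0}ms", TimeMs(BuildWithStringBuilder));
    }
}
```

Each repetition starts again from an empty string, so no string ever grows beyond 20 pieces. That is the difference from doing 20 * 100,000 concatenations onto one ever-growing string.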

> I'm curious how you did such a quick test, especially since the
> DateTime structure is only accurate to at best 10 ms or greater.

Yeah, DateTime subtraction is a terrible way to time stuff. Use the
Stopwatch.Elapsed.TotalSeconds instead. (And be warned that if you are
running on a motherboard with multiple sockets Stopwatch won't work
right without Vista or manually setting the CPU affinity on your
thread.)
Sep 24 '08 #28
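
A small sketch of the Stopwatch.Elapsed.TotalSeconds suggestion, under the assumption that a short Thread.Sleep stands in for the code being timed. The class and method names are made up. Stopwatch.IsHighResolution and Stopwatch.Frequency report what kind of timer the Stopwatch is using on the current machine:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class StopwatchInfo
{
    // Times a sleep of the given length and returns the elapsed seconds.
    public static double TimeSleepSeconds(int milliseconds)
    {
        Stopwatch sw = Stopwatch.StartNew();
        Thread.Sleep(milliseconds); // stand-in for the work being measured
        sw.Stop();
        return sw.Elapsed.TotalSeconds;
    }

    static void Main()
    {
        // Values below vary by machine, so no expected output is shown.
        Console.WriteLine("High resolution:  {0}", Stopwatch.IsHighResolution);
        Console.WriteLine("Ticks per second: {0}", Stopwatch.Frequency);
        Console.WriteLine("Sleep(50) took {0:F4} s", TimeSleepSeconds(50));
    }
}
```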

I've been looking for a decent stopwatch, and have checked out various
MSDN articles that say it really can't be done (the one that says "if
you're looking for a metronome, you've come to the wrong place").

If you have code or pseudo code on this "Stopwatch" method, please
post here so I can add it to my bag of tricks aka library.

Thank you,

RL

not_a_commie wrote:
>> I'm curious how you did such a quick test, especially since the
>> DateTime structure is only accurate to at best 10 ms or greater.
>
> Yeah, DateTime subtraction is a terrible way to time stuff. Use the
> Stopwatch.Elapsed.TotalSeconds instead. (And be warned that if you are
> running on a motherboard with multiple sockets Stopwatch won't work
> right without Vista or manually setting the CPU affinity on your
> thread.)
Sep 24 '08 #29

raylopez99 <ra********@yahoo.com> wrote:
> I've been looking for a decent stopwatch, and have checked out various
> MSDN articles that say it really can't be done (the one that says "if
> you're looking for a metronome, you've come to the wrong place").
>
> If you have code or pseudo code on this "Stopwatch" method, please
> post here so I can add it to my bag of tricks aka library.

As I said before, see System.Diagnostics.Stopwatch. (It's not a method,
it's a type.)

Simple example:

using System;
using System.Diagnostics;

class Test
{
    static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        // Do something expensive
        string x = "";
        for (int i = 0; i < 100000; i++)
        {
            x += " "; // Eek!
        }

        sw.Stop();
        Console.WriteLine("Elapsed time: {0}ms",
                          sw.ElapsedMilliseconds);
    }
}

--
Jon Skeet - <sk***@pobox.com>
Web site: http://www.pobox.com/~skeet
Blog: http://www.msmvps.com/jon.skeet
C# in Depth: http://csharpindepth.com
Sep 24 '08 #30

> sw.ElapsedMilliseconds

On a decent motherboard you should be able to get near microsecond
accuracy with that, which means you probably want to use
sw.Elapsed.TotalMilliseconds instead of sw.ElapsedMilliseconds. I've also
seen a few weird glitches with the latter where occasionally the
machine will hiccup and you'll be off by six orders of magnitude.
Sep 25 '08 #31
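
To make the difference between the two properties concrete, here is a minimal sketch (class and method names invented, Thread.Sleep standing in for real work). Stopwatch.ElapsedMilliseconds is a long holding whole milliseconds, while Stopwatch.Elapsed is a TimeSpan whose TotalMilliseconds is a double that keeps the fractional part:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class ElapsedDemo
{
    // Times a short sleep and returns both readings from the same stopped Stopwatch.
    public static void Measure(int sleepMs, out long wholeMs, out double fractionalMs)
    {
        Stopwatch sw = Stopwatch.StartNew();
        Thread.Sleep(sleepMs);
        sw.Stop();
        wholeMs = sw.ElapsedMilliseconds;            // long: whole milliseconds
        fractionalMs = sw.Elapsed.TotalMilliseconds; // double: includes the fraction
    }

    static void Main()
    {
        long whole;
        double fractional;
        Measure(15, out whole, out fractional);
        // Exact values vary from run to run.
        Console.WriteLine("ElapsedMilliseconds:       {0}", whole);
        Console.WriteLine("Elapsed.TotalMilliseconds: {0}", fractional);
    }
}
```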

On Sep 25, 3:44 pm, not_a_commie <notacom...@gmail.com> wrote:
>> sw.ElapsedMilliseconds
>
> On a decent motherboard you should be able to get near microsecond
> accuracy with that, which means you probably want to use
> sw.Elapsed.TotalMilliseconds instead of sw.ElapsedMilliseconds.

I disagree. If your test is so sensitive that +/- 1ms makes any
difference, it's running for way too short a time to be meaningful,
IMO. I'd be very suspicious of any test running in less than a second
- and moderately suspicious of a test running in less than 10 seconds,
unless it's to show that (say) an algorithm taking 5 seconds is much
slower than one taking half a second. The variation between runs is
very, very rarely going to be less than a millisecond, so where's the
benefit in giving more precision? The benefit of giving *less*
precision is that integers are, IMO, easier to immediately recognise
in terms of magnitude than reals. It's easier to compare at a glance,
say, 12532 and 3250 than 12532.23401 and 3250.195323.

> I've also
> seen a few weird glitches with the latter where occasionally the
> machine will hiccup and you'll be off by six orders of magnitude.

That's very odd - never seen anything like that.

Jon
Sep 25 '08 #32

On Sep 25, 7:44 am, not_a_commie <notacom...@gmail.com> wrote:
>> sw.ElapsedMilliseconds
>
> On a decent motherboard you should be able to get near microsecond
> accuracy with that, which means you probably want to use
> sw.Elapsed.TotalMilliseconds instead of sw.ElapsedMilliseconds. I've also
> seen a few weird glitches with the latter where occasionally the
> machine will hiccup and you'll be off by six orders of magnitude.

Thanks !commie. On a related note, I've seen strange stuff with
garbage collected timers, they tend to "bunch up" and/or 'hiccup'
where they don't fire on time. The DateTime structure also has
a .Ticks "total milliseconds" (actually I think it's close to
nanoseconds, or some multiple of ns) from the year 0 AD that's also
good but the granularity of DateTime is >= 10ms, so that defeats the
purpose.

Thanks to you and Jon for the hint--I've put this into my bag of
tricks.

RL
Sep 25 '08 #33

On Sep 25, 4:30 pm, raylopez99 <raylope...@yahoo.com> wrote:
> Thanks !commie. On a related note, I've seen strange stuff with
> garbage collected timers, they tend to "bunch up" and/or 'hiccup'
> where they don't fire on time. The DateTime structure also has
> a .Ticks "total milliseconds" (actually I think it's close to
> nanoseconds, or some multiple of ns) from the year 0 AD that's also
> good but the granularity of DateTime is >= 10ms, so that defeats the
> purpose.

The granularity of DateTime itself is fine - it's the granularity of
the timer that DateTime.Now uses that makes it less suitable than
Stopwatch for performance timing.

Jon
Sep 25 '08 #34
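
A short sketch of the distinction drawn above, with invented class and method names. The DateTime type itself counts in ticks of 100 nanoseconds starting from midnight on January 1 of year 0001, so the type can represent very fine intervals. How often the value returned by the system clock actually changes is a separate question, and the loop below simply observes it. It uses DateTime.UtcNow, which reads the same system clock as DateTime.Now without the time zone conversion:

```csharp
using System;

class TicksDemo
{
    // Spins until DateTime.UtcNow changes and returns the size of that step in ticks.
    public static long ObservedClockStepTicks()
    {
        DateTime start = DateTime.UtcNow;
        DateTime next;
        do { next = DateTime.UtcNow; } while (next == start);
        return (next - start).Ticks;
    }

    static void Main()
    {
        Console.WriteLine(TimeSpan.TicksPerMillisecond); // 10000: one tick is 100 ns
        Console.WriteLine(DateTime.MinValue.Ticks);      // 0
        Console.WriteLine(DateTime.MinValue.Year);       // 1: ticks count from year 0001

        // Machine and OS dependent, so no expected value is shown.
        Console.WriteLine("DateTime.UtcNow advanced by {0} ticks",
                          ObservedClockStepTicks());
    }
}
```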
