StringBuilder much much faster and better than String for concatenation !!!

StringBuilder better and faster than string for adding many strings.

Look at the below. It's amazing how much faster StringBuilder is than
string.

The last loop below is telling: for adding 200000 strings of 8 char
each, string took over 25 minutes while StringBuilder took 40
milliseconds!

Can anybody explain such a radical difference?

The hardware running this program was a Pentium IV with 2 GB RAM.

RL

// stringbuilder much faster than string in concatenation

//////////////
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace console1
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("hi \n");
            UpdateTime myUpdateTime = new UpdateTime(1000);
            myUpdateTime.UpdateTimeMethod();
            Console.WriteLine("times str,sb are: {0}, {1}",
                myUpdateTime.txtConcatTime, myUpdateTime.txtStringBTime);
        }
    }
}

/*
 * OUTPUT
 * results:
 * for 1000 iterations: string = 10.01 ms; stringbuilder = 0
 * for 5000 iterations: string = 410.6 ms; stringbuilder = 0
 * for 10k iterations: string = 1772.5 ms; stringbuilder = 0
 * for 50k iterations: string = 79013 ms; stringbuilder = 0
 * for 75k iterations: string = 186237.8 ms; stringbuilder = 20.03 ms
 * for 100k iterations: string = 334.4k ms (5.6 min); stringbuilder = 20.03 ms
 * for 200k iterations: string = 1515.6k ms (25.3 min); stringbuilder = 40.06 ms
 */
//////////////////////////////
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace console1
{
    class UpdateTime
    {
        int txtInterations;
        public string txtConcatTime;
        public string txtStringBTime;

        public UpdateTime(int i)
        {
            txtInterations = i;
            txtConcatTime = "";
            txtStringBTime = "";
        }

        public void UpdateTimeMethod()
        {
            int iterations = txtInterations;

            string theString = "MyString";

            DateTime strCall = DateTime.Now;

            string targetString = null;

            for (int x = 0; x < iterations; x++)
            {
                targetString += theString;
            }

            TimeSpan time = (DateTime.Now - strCall);

            txtConcatTime = time.TotalMilliseconds.ToString();

            //StringBuilder

            DateTime inCall = DateTime.Now;

            string theString2 = "MyStrig2";
            StringBuilder sb = new StringBuilder(theString2);

            for (int x = 0; x < iterations; x++)
            {
                sb.Append(theString2);
            }

            time = (DateTime.Now - inCall);

            txtStringBTime = time.TotalMilliseconds.ToString();
        }
    }
}
/////////////////////
Sep 22 '08
Brian Gideon wrote:

OK, got it. And I understand now that Hungarian notation is not needed,
since the compiler will catch your error.

RL
Sep 22 '08 #11
On Sep 22, 3:37 pm, raylopez99 <raylope...@yahoo.com> wrote:
> Brian Gideon wrote:
>
> OK, got it. And I understand now that Hungarian notation is not needed,
> since the compiler will catch your error.
>
> RL

Basically. And for what it's worth I've adopted the "m_" prefix for
instance members and "s_" for static members.
Sep 22 '08 #12
On Mon, 22 Sep 2008 09:58:48 -0700, Göran Andersson <gu***@guffa.com>
wrote:
> [...]
>> int txtInterations;
>
> If you want to use Hungarian notation to specify the data type, you
> should not use a prefix that contradicts the data type.

I hope that's a statement that anyone can perceive as trivially true. :)

> However, in a type safe language there isn't really any need to use
> Hungarian notation to keep track of the data types.

I will insert my standard disclaimer here:

Even in a type unsafe language, Hungarian's primary purpose isn't to keep
track of the data type. If and when the type is identical to the semantic
of the variable, then of course it will. But otherwise, the type tag (not
prefix) in Hungarian reflects the _semantic_ usage of the data, not its
literal type.

In fact, for variables that are typed as built-in types such as "int",
"char", etc. the tag will most often _not_ reflect the actual type of the
variable. For example, "x", "dx", and "cx" are common variable names when
dealing with the X coordinate in a Cartesian coordinate space, but they
can all refer to a variety of integral types: "int", "short", "ushort",
"long", etc. In the Hungarian philosophy, the naming is there to ensure
semantic correctness, not compiler correctness.

This is in fact why Hungarian is still valuable even when using a
strongly-typed language.
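
As a rough illustration of the semantic-tag idea, here is a minimal sketch. The tag meanings used (x for a horizontal position, dx for a horizontal distance, y for a vertical position) and every variable name are assumptions made up for this example, not taken from any particular codebase.

```csharp
using System;

class HungarianSketch
{
    static void Main()
    {
        // All three are plain ints to the compiler; only the tags carry the meaning.
        int xLeft = 10;    // x  : a horizontal position
        int dxWidth = 25;  // dx : a horizontal distance (difference of two x values)
        int yTop = 40;     // y  : a vertical position

        // The tags line up: position + distance gives a position.
        int xRight = xLeft + dxWidth;

        // This compiles without complaint, but the mismatched tags
        // (an x computed from x + y) flag a likely bug to a human reader.
        int xSuspicious = xLeft + yTop;

        Console.WriteLine(xRight);       // 35
        Console.WriteLine(xSuspicious);  // 50
    }
}
```

The compiler accepts both additions because the types match; only the naming makes the second one look wrong.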

This is something that Microsoft's "Systems" version of Hungarian gets
very, very wrong. Unfortunately, that's the Hungarian most people are
exposed to.

Pete
Sep 22 '08 #13
Peter Duniho wrote:
> Even in a type unsafe language, Hungarian's primary purpose isn't to
> keep track of the data type. If and when the type is identical to the
> semantic of the variable, then of course it will. But otherwise, the
> type tag (not prefix) in Hungarian reflects the _semantic_ usage of the
> data, not its literal type.
>
> In fact, for variables that are typed as built-in types such as "int",
> "char", etc. the tag will most often _not_ reflect the actual type of
> the variable. For example, "x", "dx", and "cx" are common variable
> names when dealing with the X coordinate in a Cartesian coordinate
> space, but they can all refer to a variety of integral types: "int",
> "short", "ushort", "long", etc. In the Hungarian philosophy, the naming
> is there to ensure semantic correctness, not compiler correctness.
>
> This is in fact why Hungarian is still valuable even when using a
> strongly-typed language.
>
> This is something that Microsoft's "Systems" version of Hungarian gets
> very, very wrong. Unfortunately, that's the Hungarian most people are
> exposed to.
>
> Pete

I've done quite a bit of ASP/VBScript, and although it wasn't exactly the
original intention of Hungarian notation, using it to keep track of
the data type is very useful in that environment. Otherwise you could
easily get surprised by the results, like:

Dim page
page = Request.QueryString("page")
If page = 42 Then
    ' we don't get here even if we put 42 in the query string
    ' as the variable page contains "42", not 42.
End If

--
Göran Andersson
_____
http://www.guffa.com
Sep 22 '08 #14
raylopez99 wrote:
> Göran Andersson wrote:
>> When you reach the 200000th iteration, you will have copied:
>>
>> 16*(1+2+3+4+5+...+200000) = 16*(100000*200001) = 320001600000 bytes
>>
>> That is 320 GB. That is 160 times more than you have in your computer. To
>> create a string that is 1600000 characters, you have copied 100000 times
>
> Well that's interesting Goran. But my PC did not crash, and I don't
> have 320 GB of HD, so somehow it must be doing some fancy stuff in the
> background to truncate.

For each string that you create, the previous string is up for garbage
collection.

As you go through a lot more memory than there is in the computer, it
means that it has done more than 160 garbage collections during the
loop, and probably something closer to 1000.
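
The copy arithmetic quoted above can be checked with a minimal sketch. It assumes 2 bytes per char (so 16 bytes for each 8-character piece) and counts only character data, ignoring object headers; the class and variable names are made up for this example.

```csharp
using System;

class CopyCost
{
    static void Main()
    {
        const long iterations = 200000;  // loop count from the original post
        const long bytesPerAppend = 16;  // "MyString": 8 chars * 2 bytes per char

        // Repeated += copies the whole string built so far on every pass:
        // 16 * (1 + 2 + ... + n) = 16 * n * (n + 1) / 2
        long concatBytes = bytesPerAppend * iterations * (iterations + 1) / 2;

        // Size of the finished string's character data.
        long finalBytes = bytesPerAppend * iterations;

        Console.WriteLine(concatBytes);               // 320001600000
        Console.WriteLine(finalBytes);                // 3200000
        Console.WriteLine(concatBytes / finalBytes);  // 100000
    }
}
```

The last line is the ratio between the data copied and the data actually kept, which is where the factor of 100000 comes from.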
>> The StringBuilder has to grow its internal string several times during
>> the loop, but each time its size is doubled, so in the end the
>> StringBuilder will have copied about two times the size of the string.
>>
>> So in this case the StringBuilder should be about 50000 times faster
>> than concatenating the strings, which corresponds to your result.
>>
>> (If you specify the final size when creating the StringBuilder, the
>> internal string never has to be reallocated, so it will be twice as fast.)
>
> That's counterintuitive, if you're saying specifying the final size
> will make StringBuilder *slower*. Very strange if true. Anyway I
> never specify anything so I'm OK.

If you find it counterintuitive, then perhaps you should read it again
to see if you got it right. In this case you got it backwards.
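
To make the pre-sizing point concrete, here is a minimal sketch that counts how often a StringBuilder's Capacity changes with and without an initial capacity. The helper name CountCapacityChanges is hypothetical, and how the builder grows internally differs between runtime versions, so the first count should be treated as an observation rather than a fixed number.

```csharp
using System;
using System.Text;

class CapacitySketch
{
    // Appends `piece` to `sb` `count` times and reports how often Capacity changed.
    static int CountCapacityChanges(StringBuilder sb, string piece, int count)
    {
        int changes = 0;
        int lastCapacity = sb.Capacity;
        for (int i = 0; i < count; i++)
        {
            sb.Append(piece);
            if (sb.Capacity != lastCapacity)
            {
                changes++;
                lastCapacity = sb.Capacity;
            }
        }
        return changes;
    }

    static void Main()
    {
        const string piece = "MyString";
        const int count = 200000;

        // Default capacity: the builder has to grow as the text gets longer.
        int grown = CountCapacityChanges(new StringBuilder(), piece, count);

        // Final size given up front: no growth is needed during the loop.
        int preSized = CountCapacityChanges(
            new StringBuilder(piece.Length * count), piece, count);

        Console.WriteLine("default capacity changed {0} times", grown);
        Console.WriteLine("pre-sized capacity changed {0} times", preSized);
    }
}
```

Passing the final size can only remove growth steps, so it cannot make the builder slower than the default.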
> Anyhow, I just discovered this cool property for
> runtime type checking:
>
> // using public static object ChangeType (object value, Type conversionType);
> // example:
>
> Type myTargetType = typeof (int);
> object theSourceStr = 42;
> object theResult = Convert.ChangeType(theSourceStr, myTargetType);
> Console.WriteLine(theResult); //42
> Console.WriteLine(theResult.GetType()); //System.Int32
>
> //pretty cool, eh? I bet it only works though for 'primitive' data
> //types like int, etc.

The type has to implement the IConvertible interface, which the
primitive types do.

The primitive types already have methods in the Convert class, so
instead of doing Convert.ChangeType(var, typeof(int)) you can use
Convert.ToInt32(var).

Casting to a type that you specify dynamically isn't very useful in a
strongly typed language. You can't do much with the data anyway without
casting the reference to the actual type.
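
A minimal sketch of the two calls side by side may help. It assumes a string source value, so that a real conversion takes place; the variable names are made up for this example.

```csharp
using System;

class ConvertSketch
{
    static void Main()
    {
        object source = "42";  // a string this time, so a real conversion happens

        // Target type chosen at run time; the result is still typed as object.
        object viaChangeType = Convert.ChangeType(source, typeof(int));

        // Target type known at compile time; the result is an int right away.
        int viaToInt32 = Convert.ToInt32(source);

        Console.WriteLine(viaChangeType.GetType());  // System.Int32
        Console.WriteLine(viaToInt32 + 1);           // 43

        // To do arithmetic with the ChangeType result it has to be unboxed first.
        int unboxed = (int)viaChangeType;
        Console.WriteLine(unboxed + 1);              // 43
    }
}
```

The cast on the last step is the point made above: the dynamically converted value is only usable as an int once the code names the type explicitly anyway.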
> // UPDATE: I see C# has no easy way of casting any object...or so it
> seems. I'll post in a separate thread on this...

Converting a string to an int is parsing. There are several methods for
doing that, like int.Parse(string), int.TryParse(string, out int),
Convert.ToInt32(string)...

For casting, C# uses the same syntax as C/C++:

int value = 42;
long bigValue = (long)value;
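
The difference between parsing and casting can be sketched as follows. This is only an illustration; the input strings and variable names are made up.

```csharp
using System;

class ParseVersusCast
{
    static void Main()
    {
        // Parsing: turning text into a number.
        int parsed = int.Parse("42");
        int converted = Convert.ToInt32("42");

        // TryParse reports failure through its return value instead of throwing.
        int result;
        bool ok = int.TryParse("forty-two", out result);

        // Casting: converting between types that are already numbers.
        long bigValue = (long)parsed;

        Console.WriteLine(parsed + converted);  // 84
        Console.WriteLine(ok);                  // False
        Console.WriteLine(result);              // 0
        Console.WriteLine(bigValue);            // 42
    }
}
```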

--
Göran Andersson
_____
http://www.guffa.com
Sep 22 '08 #15

"raylopez99" <ra********@yahoo.com> wrote in message
news:46*********************************@m45g2000hsb.googlegroups.com...
> StringBuilder better and faster than string for adding many strings.
>
> Look at the below. It's amazing how much faster StringBuilder is than
> string.
>
> The last loop below is telling: for adding 200000 strings of 8 char
> each, string took over 25 minutes while StringBuilder took 40
> milliseconds!
>
> Can anybody explain such a radical difference?

You got lots of answers as to why String performed so badly in this
situation compared to StringBuilder.
Many people misinterpret this to say that you should always prefer
StringBuilder over String when doing concatenation.
But this is not a true assumption.

For straight concatenation like this

string foo = "A" + "B" + "C"....

String is optimized to handle this case and it is very efficient.

Even for strings constructed by looping

String foo = "";
foreach(string str in Bar)
{
foo += str;
}

If the number of loops is under 10-20 this code can actually outperform
similar code using StringBuilder

Obviously you saw what can happen if the number gets large, so it is
good to understand how it works under the covers and use the right tool
for the job.
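
A minimal sketch of what "optimized" means for a single concatenation expression, based on how the C# compiler handles the + operator on strings. The class and variable names are made up for this example.

```csharp
using System;

class ConcatSketch
{
    static void Main()
    {
        // Only literals: the compiler folds this into the single constant "ABC",
        // so it is the same interned instance as the literal "ABC".
        string folded = "A" + "B" + "C";
        Console.WriteLine(object.ReferenceEquals(folded, "ABC"));  // True

        // Variables in one expression: compiled to a single string.Concat call,
        // so no intermediate "AB" string is created.
        string a = "A", b = "B", c = "C";
        string joined = a + b + c;
        Console.WriteLine(joined == string.Concat(a, b, c));       // True
    }
}
```

The loop case is different because each pass is a separate statement, so every += has to produce a complete new string.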

Bill


Sep 23 '08 #16
Göran Andersson wrote:
> Casting to a type that you specify dynamically isn't very useful in a
> strongly typed language. You can't do much with the data anyway without
> casting the reference to the actual type.

Very good. It explains why C# doesn't have, like C++/CLI, this
function "safe_cast":

Object^ obj = safe_cast<Object^>(anEnumHere->Current);

Or maybe not.

Anyway today, via FTM's and Jon Skeet's help, I learned this query,
for iterating through a list having mixed strings and ints and picking
out the first two letters of the strings:

List<object> words = new List<object> { "green", "blue", 3,
    "violet", 5 };

IEnumerable<string> query =
    words.AsQueryable().OfType<string>().Cast<string>().Select(str =>
        str.Substring(0, Math.Min(str.Length, 2))); //prevents out-of-range
        //exception and also string cast problem

Pretty cool eh? On top of doing my day job, which is heating up
now... multitasking big time...I gotta run play a game of online chess
then crash...

RL
Sep 23 '08 #17
On Mon, 22 Sep 2008 17:57:34 -0700, Bill Butler <qw****@asdf.com> wrote:
> [...]
> Even for strings constructed by looping
>
> String foo = "";
> foreach(string str in Bar)
> {
> foo += str;
> }
>
> If the number of loops is under 10-20 this code can actually outperform
> similar code using StringBuilder

For what it's worth, I was curious and did a quick test. I found that the two
techniques reach near-parity at just 5 concatenations, and StringBuilder
is definitively faster at 10 concatenations. At 20, there's no contest.

Even with just two concatenations, the difference is "only" a factor of
2. And of course, in that situation the performance of the actual
concatenation is unlikely to be significant in the overall algorithm.

To me, that means that it's "safe" to use plain string concatenation as a
_simplification_ of the code, when you're sure that the number of
concatenations will be small. I would even accept the performance
overhead up to 20 concatenations or so, as long as it wasn't a critical
difference, because I do feel the code is easier to read. But only when I
could be assured the number of concatenations wouldn't ever be much
greater than that.

It's hard to imagine a situation in which choosing string concatenation
over StringBuilder would be a legitimate real-world performance
optimization. Practically any situation in which you have few enough
concatenations for string concatenation to win, there would be too few
concatenations for the concatenation to matter much at all.

Again, perfectly valid as a code maintenance/readability choice in certain
"safe" situations, but probably not something someone's going to do as a
performance enhancement. For example, I wouldn't create code that has a
special code-path for small numbers of concatenations just to take
advantage of that performance difference. The decrease in maintainability
isn't worth the marginal improvement in performance, even in the best case.
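
A quick test of this kind could look roughly like the sketch below. It is not the code behind the numbers above; the repeat count, the piece being appended and the helper names are all assumptions, and the timings it prints depend on the machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text;

class SmallConcatBenchmark
{
    static string WithConcat(string piece, int count)
    {
        string s = "";
        for (int i = 0; i < count; i++) s += piece;
        return s;
    }

    static string WithBuilder(string piece, int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) sb.Append(piece);
        return sb.ToString();
    }

    // Runs `build` `repeats` times and returns the elapsed milliseconds.
    static double TimeMs(Func<string, int, string> build, int count, int repeats)
    {
        build("MyString", count);  // warm-up so JIT compilation is not timed
        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < repeats; r++) build("MyString", count);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        const int repeats = 100000;
        foreach (int count in new[] { 2, 5, 10, 20 })
        {
            Console.WriteLine("{0,2} appends: += {1:F1} ms, StringBuilder {2:F1} ms",
                count,
                TimeMs(WithConcat, count, repeats),
                TimeMs(WithBuilder, count, repeats));
        }
    }
}
```

Stopwatch is used instead of DateTime.Now because the intervals at these sizes are far below the resolution of the system clock, which is why each case is repeated many times.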

Pete
Sep 23 '08 #18

"Peter Duniho" <Np*********@nnowslpianmk.com> wrote in message
news:op***************@petes-computer.local...
> On Mon, 22 Sep 2008 17:57:34 -0700, Bill Butler <qw****@asdf.com>
> wrote:
>> [...]
>> Even for strings constructed by looping
>>
>> String foo = "";
>> foreach(string str in Bar)
>> {
>> foo += str;
>> }
>>
>> If the number of loops is under 10-20 this code can actually outperform
>> similar code using StringBuilder
>
> For what it's worth, I was curious and did a quick test. I found that the
> two techniques reach near-parity at just 5 concatenations, and
> StringBuilder is definitively faster at 10 concatenations. At 20,
> there's no contest.

Correct. That's what I get for trying to pull the number from memory.

> Even with just two concatenations, the difference is "only" a factor
> of 2. And of course, in that situation the performance of the actual
> concatenation is unlikely to be significant in the overall algorithm.
>
> To me, that means that it's "safe" to use plain string concatenation
> as a _simplification_ of the code, when you're sure that the number
> of concatenations will be small. I would even accept the performance
> overhead up to 20 concatenations or so, as long as it wasn't a
> critical difference, because I do feel the code is easier to read.
> But only when I could be assured the number of concatenations
> wouldn't ever be much greater than that.
>
> It's hard to imagine a situation in which choosing string
> concatenation over StringBuilder would be a legitimate real-world
> performance optimization. Practically any situation in which you
> have few enough concatenations for string concatenation to win, there
> would be too few concatenations for the concatenation to matter much
> at all.
>
> Again, perfectly valid as a code maintenance/readability choice in
> certain "safe" situations, but probably not something someone's going
> to do as a performance enhancement. For example, I wouldn't create
> code that has a special code-path for small numbers of concatenations
> just to take advantage of that performance difference. The decrease
> in maintainability isn't worth the marginal improvement in
> performance, even in the best case.

I agree completely.
But I have seen folks that avoid string concatenation like the plague,
since they heard that it is slower.
In most cases readability triumphs.

Thanks for the correction
Bill


Sep 23 '08 #19
On Mon, 22 Sep 2008 21:39:26 -0700, Bill Butler <qw****@asdf.com> wrote:
> I agree completely.
> But I have seen folks that avoid string concatenation like the plague,
> since they heard that it is slower.

I agree that avoiding anything "like the plague" can often lead to poor
design and implementation choices. It's important to know the trade-offs.

> In most cases readability triumphs.

I don't know about "most". There are lots of examples of theoretically
unbounded scenarios. Often what happens is that there's no guarantee of a
certain bounding, but an immediately expected data set that empirically
has a low bound. Then once the code has been "in the wild" for awhile,
the data set changes and the inefficient approach becomes painfully
apparent.

I would say that only in the situation where it is _guaranteed_ up front
that the number of concatenations is bounded to some relatively small
number (dozens, or if the concatenation is in a known "not bottleneck"
section of code perhaps hundreds) should the readability of straight
string concatenation be preferred. I suspect that these scenarios don't
in fact make up "most cases" (I know that in my own experience they don't,
but lacking a real survey of real-world code I can't state for sure what's
universally true :) ).

Barring those conditions, it's just too dangerous to not be using
StringBuilder, especially given that readability doesn't suffer _that_
much for using it. I dislike premature optimizations as much as the next
guy, but in this case the potential for catastrophic performance problems
outweighs the relatively minor improvement in readability.

> Thanks for the correction

No problem...thank you for inspiring the test! :)

Pete
Sep 23 '08 #20
