String.Join vs. StringBuilder, which is faster?

Bob
I have a function that takes in a list of IDs (hundreds) as input parameter
and needs to pass the data to another step as a comma delimited string. The
source can easily create this list of IDs in a comma-delimited string or
string array. I don't want it to be a string because I want to overload
this function, and its sister already uses a string input parameter. Now
if I define the function to take in a string array, it solves my overload
issue, but then I have to convert the array inside the function to a comma
delimited string using string.Join(). Alternatively, I can define the input
parameter as a StringBuilder (which just contains the comma delimited
string), and then do a sb.ToString() to get the string. Which would be a
better solution between using string array and then join vs. StringBuilder
and ToString?

I know if I don't overload at all, it would make the best sense from a
performance perspective, but code readability and maintenance become harder
as the two functions really do very similar things.

Thanks
Bob
Nov 15 '05 #1
Since strings are immutable (a new memory address is allocated each time
you alter the string), the StringBuilder is the way to go.

Nov 15 '05 #2
What does the fact that strings are immutable have to do with Bob's
question? String.Join is a static method of that class and creates a new
string object when called. Personally I would go with an array and Join, it
will be easier to understand (you're passing an array or a list of values)
and you won't have to loop to put all your values into a string builder.
That will never be faster than calling Join. Unless of course your input is
not an array of strings (you're saying a list of IDs, not sure about the
type there).
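
For what it's worth, the array overload could look roughly like the sketch below. The class name IdProcessor and the method name Process are made up for illustration (Bob never named his functions), and the "next step" is just a stand-in that returns a string.

```csharp
using System;

// Hypothetical names: IdProcessor and Process are illustrative only.
public static class IdProcessor
{
    // Existing "sister" overload: takes an already delimited string.
    public static string Process(string commaDelimitedIds)
    {
        // Stand-in for handing the data to the next step.
        return "IDs: " + commaDelimitedIds;
    }

    // New overload: takes the IDs as an array and joins them exactly once.
    public static string Process(string[] ids)
    {
        if (ids == null) throw new ArgumentNullException("ids");
        return Process(string.Join(",", ids));
    }
}
```

The caller passes a plain string[] and the join happens in one call inside the overload, so there is no loop to maintain.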

Jerry

Nov 15 '05 #3
String.Join is implemented internally by the run-time. It is serviced by the
ConcatenateJoinHelperArray method which turns out to be extremely fast at
creating the final string as opposed to the slightly slower
StringBuilder.Append. If you have an array of strings, I would recommend using
that with String.Join over StringBuilder.Append since String.Join is going to be
quite a bit faster.

using System;
using System.Text;

public class JoinVsBuilder {
    private static string[] strings = new string[0];

    private static void Main(string[] args) {
        // Number of strings to join, taken from the command line.
        int count = int.Parse(args[0]);

        strings = new string[count];
        for(int i = 0; i < count; i++) {
            strings[i] = i.ToString();
        }

        DateTime start, end;

        // Time a single String.Join call over the whole array.
        start = DateTime.Now;
        string newStr = string.Join("foo", strings);
        end = DateTime.Now;
        Console.WriteLine("String::Join timing is {0}", end-start);

        // Time the equivalent StringBuilder loop (default initial capacity).
        StringBuilder sb = new StringBuilder();
        start = DateTime.Now;

        // Faster than worrying about when to append the connector
        // (assumes count >= 1, otherwise strings[0] is out of range)
        sb.Append(strings[0]);
        for(int i = 1; i < strings.Length; i++) {
            sb.Append("foo");
            sb.Append(strings[i]);
        }
        string newStr2 = sb.ToString();
        end = DateTime.Now;
        Console.WriteLine("StringBuilder::Append timing is {0}", end-start);
    }
}

C:\Projects\CSharp\Samples\JoinVsBuilder>JoinVsBuilder.exe 1000000
String::Join timing is 00:00:00.4606624
StringBuilder::Append timing is 00:00:02.9141904
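
A possible variation on the same test, for anyone repeating it on .NET 2.0 or later: System.Diagnostics.Stopwatch gives finer resolution than DateTime.Now. This is only a sketch, not the code that produced the numbers above. The class and method names are invented here, and timings will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical helper class; the names are illustrative only.
public static class JoinVsBuilderStopwatch
{
    public static string JoinWithSeparator(string[] parts, string separator)
    {
        return string.Join(separator, parts);
    }

    public static string AppendWithSeparator(string[] parts, string separator)
    {
        if (parts.Length == 0) return string.Empty;

        StringBuilder sb = new StringBuilder();
        sb.Append(parts[0]);
        for (int i = 1; i < parts.Length; i++)
        {
            sb.Append(separator);
            sb.Append(parts[i]);
        }
        return sb.ToString();
    }

    // Times both approaches once and prints the elapsed milliseconds.
    public static void Run(int count)
    {
        string[] parts = new string[count];
        for (int i = 0; i < count; i++) parts[i] = i.ToString();

        Stopwatch sw = Stopwatch.StartNew();
        string joined = JoinWithSeparator(parts, "foo");
        sw.Stop();
        Console.WriteLine("String.Join: {0} ms", sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        string built = AppendWithSeparator(parts, "foo");
        sw.Stop();
        Console.WriteLine("StringBuilder.Append: {0} ms", sw.ElapsedMilliseconds);

        // Sanity check that both approaches produce the same text.
        Console.WriteLine("Same result: {0}", joined == built);
    }
}
```

Calling JoinVsBuilderStopwatch.Run(1000000) mirrors the command line used above. A single run is still a rough measurement, so repeating it a few times is advisable.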
--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

Nov 15 '05 #4

"Jerry III" <je******@hotmail.com> wrote in message
news:uT****************@TK2MSFTNGP09.phx.gbl...
What does the fact that strings are immutable have to do with Bob's
question? String.Join is a static method of that class and creates a new
string object when called. Personally I would go with an array and Join, it will be easier to understand (you're passing an array or a list of values)
and you won't have to loop to put all your values into a string builder.
That will never be faster than calling Join. Unless of course your input is not an array of strings (you're saying a list of IDs, not sure about the
type there).

Because, as you said "String.Join is a static method of that class and
creates a new string object when called.". Each time it's called, it will
create a new string, and thus a new spot on the heap. A StringBuilder will
only need one memory address and for this reason, usually performs better.

Nov 15 '05 #5
Scott M. <s-***@BADSPAMsnet.net> wrote:
Because, as you said "String.Join is a static method of that class and
creates a new string object when called.". Each time it's called, it will
create a new string, and thus a new spot on the heap. A StringBuilder will
only need one memory address and for this reason, usually performs better.


No, because here only one call to String.Join is needed. A
StringBuilder is better than manually creating lots of new strings, but
that doesn't happen in String.Join.
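
To make the distinction concrete, here is a small sketch (the class and method names are invented for illustration). The loop version is the pattern StringBuilder is meant to replace. The second version is what Bob's case actually needs.

```csharp
using System;

// Hypothetical class name, for illustration only.
public static class ConcatVersusJoin
{
    // Manual concatenation: each += on a non-empty operand produces a
    // brand new string, so N parts leave roughly 2N intermediate strings.
    public static string ConcatInLoop(string[] parts)
    {
        string result = "";
        for (int i = 0; i < parts.Length; i++)
        {
            if (i > 0) result += ",";
            result += parts[i];
        }
        return result;
    }

    // One call, one result string.
    public static string JoinOnce(string[] parts)
    {
        return string.Join(",", parts);
    }
}
```

Both methods return the same text. The difference is only in how many temporary strings get created along the way.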

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #6
Yes, this is a perfect example where StringBuilder at the very best provides
no advantage, and most likely is at a strong disadvantage. Just because
Microsoft and others have preached that it should always be used whenever
there's any kind of concatenation, don't believe it.

Nov 15 '05 #7
I recreated your test, tweaking it so that there is no connector (thus only
one Append in the loop). I consistently get Join being 2 to 2-1/2 times
faster than a series of SB Appends.
Nov 15 '05 #8
Daniel Billingsley <db**********@NO.durcon.SPAAMM.com> wrote:
Yes, this is a perfect example where StringBuilder at the very best provides
no advantage, and most likely is at a strong disadvantage. Just because
Microsoft and others have preached that it should always be used whenever
there's any kind of concatenation, don't believe it.


I don't think they *have* actually preached that - people who've
understood *part* of why it's worth using StringBuilder but not all of
it have preached it. I don't think MS put particularly blanket
recommendations out. Let me know where they are and I'll complain about
them, if they do exist :)

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #9
These issues with StringBuilder performance have a strange ring of
familiarity, eh Skeet?

--
Regards,
Alvin Bruney [ASP.NET MVP]
Got tidbits? Get it here...
http://tinyurl.com/3he3b

Nov 15 '05 #10
Figured I'd post up my latest testing. The StringBuilder really isn't as bad
as the original test showed. What is missing is that behind the scenes
String::Join is computing an optimally sized memory array, while the
StringBuilder is constantly expanding its own. If you use capacity planning
then the StringBuilder is only about 25% slower than the String::Join. In
addition there are some string interning problems that tend to affect
performance quite a bit that I've been able to compute out of my analysis.

Original performance testing post:
http://weblogs.asp.net/justin_rogers.../04/84306.aspx

Revised performance testing post:
http://weblogs.asp.net/justin_rogers.../05/84986.aspx

I definitely understand why Microsoft recommends the use of StringBuilder.
Most developers tend to subscribe to a form of lazy concatenation rather than
preparing their data for a more specialized method. Many algorithms also
favor lazy concatenation rather than preparing data up front (ASP .NET is
probably the number one subscriber since their entire system is based on lazy
concatenation). If you do some capacity planning or limit the number of times
the builder will be resized you can get much better performance from the
StringBuilder than I'm sure most people are getting because they don't take
performance very seriously
or don't know what it takes to squeeze performance out of the StringBuilder.
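
As a rough illustration of the capacity planning idea, the sketch below computes the final length up front and hands it to the StringBuilder(int capacity) constructor. It is not the code from the linked posts. The class and method names are invented, and it assumes the array contains no null elements.

```csharp
using System;
using System.Text;

// Hypothetical class name, for illustration only.
public static class PresizedBuilder
{
    // Total characters needed: every part plus one separator between parts.
    // Assumes no element of parts is null.
    public static int ComputeLength(string[] parts, string separator)
    {
        if (parts.Length == 0) return 0;

        int total = separator.Length * (parts.Length - 1);
        for (int i = 0; i < parts.Length; i++) total += parts[i].Length;
        return total;
    }

    // Appends into a builder sized once, so it should not need to grow.
    public static string JoinPresized(string[] parts, string separator)
    {
        if (parts.Length == 0) return string.Empty;

        StringBuilder sb = new StringBuilder(ComputeLength(parts, separator));
        sb.Append(parts[0]);
        for (int i = 1; i < parts.Length; i++)
        {
            sb.Append(separator);
            sb.Append(parts[i]);
        }
        return sb.ToString();
    }
}
```

The extra pass over the array costs something too, so whether this beats a default sized builder is worth measuring rather than assuming.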

--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

Nov 15 '05 #11
Justin Rogers <Ju****@games4dotnet.com> wrote:
In addition there
are some string interning problems that tend to affect performance quite a bit
that I've been able to compute out of my analysis.


I don't understand this point. In the code that I've seen, there isn't
any interning going on apart from the interning "foo" which only
happens once. Nothing should be interning each of the bits which ends
up being joined.

Far more likely, IMO, is that you've got garbage collection occurring -
but that's an entirely different thing from string interning.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #12
All strings are present in a global string table. This is how you ensure
immutability. To compact the string table and prevent instances where the
same string is present in memory more than once they do a scan to make sure
the string doesn't already exist (using a string hash I believe). If it does
then your string reference simply points to the already allocated string
table slot. Since we are building the same exact string many times, the
second and any subsequent time we build the same string we incur a
performance hit as the string table is searched. Since the strings in the
example are very large it takes a while to do the comparison and find the
string (aka computing the hash).

There is no garbage collection in the example if you run it using the
parameters I pointed out in the article. At least not with a sufficient
amount of memory. The string allocations themselves only take up say:

3 * 1million + 7 * 1million, or approximately 10 million characters.

If the sample is changed to ensure that the strings are referenced
throughout the operation of the program and then used at the end of the
program, it makes no changes to the performance characteristics. No GC is
happening.
--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

Nov 15 '05 #13
Justin,
You just described Interning, however as Jon stated Interning is not
involved here!

Try the following:

char c1 = 'a', c2 = 'b';

string s1 = "ab", s2 = string.Concat(c1, c2);

System.Diagnostics.Debug.WriteLine(string.Equals(s1, s2), "s1 == s2");
System.Diagnostics.Debug.WriteLine(string.ReferenceEquals(s1, s2), "s1 is s2");

string.Equals is true as they contain the same characters, however
string.ReferenceEquals is false as they were not interned!
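
A slightly extended sketch of the same point, wrapped in methods so the results can be checked. The class and method names are invented for illustration. Two strings built at run time are equal but are distinct objects, and they only end up sharing a reference once string.Intern is called explicitly.

```csharp
using System;

// Hypothetical class name, for illustration only.
public static class InternDemo
{
    // Builds "ab" at run time, so a fresh string object is allocated.
    private static string Build()
    {
        return new string(new char[] { 'a', 'b' });
    }

    // Same characters, so value equality holds.
    public static bool SeparatelyBuiltAreEqual()
    {
        return string.Equals(Build(), Build());
    }

    // Two separately built strings are not the same object.
    public static bool SeparatelyBuiltShareReference()
    {
        return object.ReferenceEquals(Build(), Build());
    }

    // After an explicit string.Intern, both refer to the one pooled instance.
    public static bool InternedCopiesShareReference()
    {
        return object.ReferenceEquals(string.Intern(Build()), string.Intern(Build()));
    }
}
```

Nothing in String.Join or StringBuilder.ToString calls string.Intern for you, which is why interning does not come into the timings discussed in this thread.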

Hope this helps
Jay

Nov 15 '05 #14
Justin Rogers <Ju****@games4dotnet.com> wrote:
> All strings are present in a global string table. This is how you ensure
> immutability.

No they're not. Only strings which are interned do this. It happens
automatically for string literals, and you can ask other strings to be
interned. It doesn't happen automatically for *all* strings.

<snip>

> There is no garbage collection in the example if you run it using the
> parameters I pointed out in the article.

Yes there is.

> At least not with a sufficient amount of memory.
> The string allocations themselves only take up say:
>
> 3 * 1million + 7 * 1million, or approximately 10 million characters.

And that's far more than the size of generation 0 in the heap - so
garbage collection *will* take place. You can see this with the
performance monitor if you want.

> If the sample is changed to ensure that the strings are referenced
> throughout the operation of the program and then used at the end of the
> program, it makes no changes to the performance characteristics. No GC is
> happening.

Well there certainly isn't any interning happening...

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
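
Besides the performance monitor, one way to check this from code is to compare GC.CollectionCount(0) before and after the work. Note that this API arrived in .NET 2.0, so it is newer than this thread. The sketch below uses invented names and makes no claim about what count you will see on a given machine.

```csharp
using System;

// Hypothetical class name, for illustration only.
public static class GcProbe
{
    // Returns how many generation 0 collections ran while 'count' strings
    // were created and joined. The number depends on runtime and heap state.
    public static int Gen0CollectionsDuringJoin(int count)
    {
        int before = GC.CollectionCount(0);

        string[] parts = new string[count];
        for (int i = 0; i < count; i++) parts[i] = i.ToString();
        string joined = string.Join("foo", parts);

        int after = GC.CollectionCount(0);

        // Keep the result reachable until after the second reading.
        GC.KeepAlive(joined);
        return after - before;
    }
}
```

A non-zero return value for something like GcProbe.Gen0CollectionsDuringJoin(1000000) would indicate that collections did take place during the test.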
Nov 15 '05 #15
