String concatenation

Is it true that joining the string elements of a list is faster than
concatenating them via the '+' operator?

"".join(['a', 'b', 'c'])

vs

'a'+'b'+'c'

If so, can anyone explain why?

\\ jonas galvez
// jonasgalvez.com

Jul 18 '05 #1
Jonas Galvez wrote:
Is it true that joining the string elements of a list is faster than
concatenating them via the '+' operator?

"".join(['a', 'b', 'c'])

vs

'a'+'b'+'c'

If so, can anyone explain why?


It's because the latter one has to build a temporary
string consisting of 'ab' first, then the final string
with 'c' added, while the join can (and probably does) add up
all the lengths of the strings to be joined and build the final
string all in one go.

Note that there's also '%s%s%s' % ('a', 'b', 'c'), which is
probably on par with the join technique for both performance
and lack of readability.

Note much more importantly, however, that you should probably
not pick the join approach over the concatenation approach
based on performance. Concatenation is more readable in the
above case (ignoring the fact that it's a contrived example),
as you're being more explicit about your intentions.

Joining lists is popular because += performs terribly when
one is gradually building up a string in pieces, compared with
appending the pieces to a list and then doing a single join
at the end.

So

l = []
l.append('a')
l.append('b')
l.append('c')
s = ''.join(l)

is _much_ faster (therefore better) in real-world cases than

s = ''
s += 'a'
s += 'b'
s += 'c'

With the latter, if you picture longer and many more strings,
and realize that each += causes a new string to be created
consisting of the contents of the two old strings joined together,
steadily growing longer and requiring lots of wasted copying,
you can see why it's very bad on memory and performance.

The list approach doesn't copy the strings at all, but just
holds references to them in a list (which does grow in a
similar but much more efficient manner). The join figures
out the sizes of all of the strings and allocates enough
space to do only a single copy from each.
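
To put rough numbers on the copying described above, here is a minimal sketch that only models the cost; it does not measure a real interpreter. It assumes every += builds a brand-new string and that join copies each piece exactly once. The function names are made up for illustration, and some interpreters can avoid part of this copying in practice.

```python
def chars_copied_by_concat(pieces):
    """Characters copied if every += creates a completely new string."""
    copied = 0
    length = 0
    for piece in pieces:
        length += len(piece)  # length of the newly created string
        copied += length      # old contents and the new piece are both copied
    return copied


def chars_copied_by_join(pieces):
    """Characters copied if join allocates once and copies each piece once."""
    return sum(len(piece) for piece in pieces)


pieces = ["x"] * 1000
print(chars_copied_by_concat(pieces))  # 500500
print(chars_copied_by_join(pieces))    # 1000
```

Under this model the += loop grows quadratically with the number of pieces, while the join grows linearly.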

Again though, other than the += versus .append() case, you should
probably not pick ''.join() over + since readability will
suffer more than your performance will improve.

-Peter
Jul 18 '05 #2
Peter Hansen <pe***@engcorp.com> wrote in news:xvydnWNN7t2X50vdRVn-gw@powergate.ca:
Jonas Galvez wrote:
Is it true that joining the string elements of a list is faster than
concatenating them via the '+' operator?

Note that there's also '%s%s%s' % ('a', 'b', 'c'), which is
probably on par with the join technique for both performance
and lack of readability.


A few more points.

Yes, the format string in this example isn't the clearest, but if you have
a case where some of the strings are fixed and others vary, then the format
string can be the clearest.

e.g.

'<a href="%s" alt="%s">%s</a>' % (uri, alt, text)

rather than:

'<a href="'+uri+'" alt="'+alt+'">' +text+'</a>'

In many situations I find I use a combination of all three techniques.
Build a list of strings to be concatenated to produce the final output, but
each of these strings might be built from a format string or simple
addition as above.
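
A small sketch of that combination, assuming Python 3 and inputs that are already safe to embed in HTML (no escaping is done here). The function name `render_page` and the sample data are invented for illustration.

```python
def render_page(title, links):
    """Build a page fragment from a title and (uri, alt, text) tuples."""
    # Simple addition for a short, fixed wrapper around one value.
    parts = ['<h1>' + title + '</h1>', '<ul>']
    for uri, alt, text in links:
        # Format string where fixed markup is mixed with several values.
        parts.append('<li><a href="%s" alt="%s">%s</a></li>' % (uri, alt, text))
    parts.append('</ul>')
    # One join at the end produces the final output.
    return ''.join(parts)


print(render_page('Links', [('/a', 'first', 'A'), ('/b', 'second', 'B')]))
```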

On the readability of ''.join(), I would suggest never writing it more than
once. That means I tend to do something like:

    concatenate = ''.join
    ...
    concatenate(myList)

Or

    def concatenate(*args):
        return ''.join(args)
    ...
    concatenate('a', 'b', 'c')

depending on how it is to be used.

It's also worth saying that a lot of the time you find you don't want the
empty separator at all, (e.g. maybe newline is more appropriate), and in
this case the join really does become easier than simple addition, but
again it is worth wrapping it so that your intention at the point of call
is clear.
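
For instance, with a newline separator (the name `join_lines` and the sample strings are only illustrative):

```python
# Give the bare-string method a name that states the intention.
join_lines = '\n'.join

fields = ['name: spam', 'count: 3', 'status: ok']
report = join_lines(fields)

# The same result by simple addition needs the separator repeated by hand.
by_addition = fields[0] + '\n' + fields[1] + '\n' + fields[2]

print(report == by_addition)  # True
```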

Finally, a method call on a bare string (''.join, or '\n'.join) looks
sufficiently bad that if, for some reason, you don't want to give it a name
as above, I would suggest using the alternative form for calling it:

str.join('\n', aList)

rather than:

'\n'.join(aList)
Jul 18 '05 #3
Peter Hansen wrote:
Jonas Galvez wrote:
Is it true that joining the string elements of a list is faster than
concatenating them via the '+' operator?

"".join(['a', 'b', 'c'])

vs

'a'+'b'+'c'

If so, can anyone explain why?

It's because the latter one has to build a temporary
string consisting of 'ab' first, then the final string
with 'c' added, while the join can (and probably does) add up
all the lengths of the strings to be joined and build the final
string all in one go.


Idea sprang to mind: Often (particularly in generating web pages) one
wants to do lots of += without thinking about "".join.
So what about creating a class that will do this quickly?
The following class does this and is much faster when adding together
lots of strings. I only seem to see performance gains above about 6000
strings...

David

class faststr(str):
    def __init__(self, *args, **kwargs):
        self.appended = []
        str.__init__(self, *args, **kwargs)
    def __add__(self, otherstr):
        self.appended.append(otherstr)
        return self
    def getstr(self):
        return str(self) + "".join(self.appended)

def testadd(start, n):
    for i in range(n):
        start += str(i)
    if hasattr(start, "getstr"):
        return start.getstr()
    else:
        return start

if __name__ == "__main__":
    import sys
    if len(sys.argv) >= 3 and sys.argv[2] == "fast":
        start = faststr("test")
    else:
        start = "test"
    s = testadd(start, int(sys.argv[1]))
Jul 18 '05 #4

Let's try this :

def test_concat():
    s = ''
    for i in xrange( test_len ):
        s += str( i )
    return s

def test_join():
    s = []
    for i in xrange( test_len ):
        s.append( str( i ))
    return ''.join(s)

def test_join2():
    return ''.join( map( str, range( test_len ) ))

Results, with and without psyco:

test_len = 1000
String concatenation (normal)    4.85290050507 ms
[] append + join (normal)        4.27646517754 ms
map + join (normal)              2.37970948219 ms

String concatenation (psyco)     2.0838675499 ms
[] append + join (psyco)         2.29129695892 ms
map + join (psyco)               2.21130692959 ms

test_len = 5000
String concatenation (normal)    40.3251230717 ms
[] append + join (normal)        23.3911275864 ms
map + join (normal)              13.844203949 ms

String concatenation (psyco)     9.65108215809 ms
[] append + join (psyco)         13.0564379692 ms
map + join (psyco)               13.342962265 ms

test_len = 10000
String concatenation (normal)    163.02690506 ms
[] append + join (normal)        47.6168513298 ms
map + join (normal)              28.5276055336 ms

String concatenation (psyco)     19.6494650841 ms
[] append + join (psyco)         26.637775898 ms
map + join (psyco)               26.7823898792 ms

test_len = 20000
String concatenation (normal)    4556.57429695 ms
[] append + join (normal)        92.0199871063 ms
map + join (normal)              56.7145824432 ms

String concatenation (psyco)     42.247030735 ms
[] append + join (psyco)         58.3201909065 ms
map + join (psyco)               53.8239884377 ms

Conclusion:

- join is faster, but worth the annoyance only if you join 1000s of strings
- map is useful
- psyco makes join unnecessary, if you can use psyco (that depends on
which web framework you use)
- python is really pretty fast even without psyco (it runs about one mips!)

Note :

Did I mention psyco has a special optimization for string concatenation ?
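
The timing harness behind the figures above is not shown in the post. One way to run a similar comparison is sketched below, assuming Python 3 (so `range` replaces `xrange`) and the standard `timeit` module; the helper `time_ms` and its `repeat`/`number` defaults are choices made for this sketch, and psyco is not covered. Results will differ from the figures above depending on interpreter version and machine.

```python
import timeit

test_len = 1000


def test_concat():
    s = ''
    for i in range(test_len):
        s += str(i)
    return s


def test_join():
    parts = []
    for i in range(test_len):
        parts.append(str(i))
    return ''.join(parts)


def test_join2():
    return ''.join(map(str, range(test_len)))


def time_ms(func, repeat=5, number=100):
    """Best observed time per call, in milliseconds."""
    best = min(timeit.repeat(func, repeat=repeat, number=number))
    return best / number * 1000.0


if __name__ == "__main__":
    for func in (test_concat, test_join, test_join2):
        print("%-12s %.4f ms" % (func.__name__, time_ms(func)))
```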


Jul 18 '05 #5
Duncan Booth wrote:

[...]
Finally, a method call on a bare string (''.join, or '\n'.join) looks
sufficiently bad that if, for some reason, you don't want to give it a name
as above, I would suggest using the alternative form for calling it:

str.join('\n', aList)

rather than:

'\n'.join(aList)


This is, of course, pure prejudice. Not that there's anything wrong with
that ...

regards
Steve
Jul 18 '05 #6
