Efficient string concatenation methods

As a relative Python newb, but an old hack in general, I've been
interested in the impact of having string objects (and other
primitives) be immutable. It seems to me that string concatenation is
a rather common operation, and in Python having immutable strings
results in a performance gotcha for anyone not aware of the impact of
doing lots of concatenation in the obvious way.
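
A minimal sketch of the two approaches at issue, written for Python 3 (an assumption; the thread predates it). The names `concat_naive` and `concat_join` are illustrative and do not correspond to the method numbers on the linked page. Because strings are immutable, each `+=` may have to build a new string object, whereas `''.join` assembles the result once from the collected pieces.

```python
def concat_naive(count):
    """Build the string with repeated +=, the 'obvious' way."""
    result = ""
    for num in range(count):
        # Strings are immutable, so this may allocate a new string each time.
        result += str(num)
    return result


def concat_join(count):
    """Collect the pieces in a list and join them once at the end."""
    pieces = []
    for num in range(count):
        pieces.append(str(num))
    return "".join(pieces)


if __name__ == "__main__":
    print(concat_naive(5), concat_join(5))
```

Both functions return the same string; only the way the intermediate pieces are held differs.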

I found several sources with advice on how to do concatenation in a
pythonic way (e.g. ref #1), but I hadn't seen any measurements or
comparisons. So I put together a little test case and ran it through
six different methods. Here are the results, and my conclusions:

http://www.skymind.com/~ocrow/python_string/

I'd be happy to hear if anyone else has done similar tests and if
there are any other good candidate methods that I missed.

ref #1: http://manatee.mojam.com/~skip/pytho...html#stringcat

Oliver
Jul 18 '05 #1
Oliver Crow wrote:
> http://www.skymind.com/~ocrow/python_string/
>
> I'd be happy to hear if anyone else has done similar tests and if
> there are any other good candidate methods that I missed.


You left out the StringIO module (having done only the cStringIO
version of that).
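
For reference, a StringIO-based variant might look like the sketch below. It assumes Python 3, where `io.StringIO` takes the place of the `StringIO` and `cStringIO` modules discussed here; the name `method_stringio` is hypothetical and is not one of the six methods on the page.

```python
import io


def method_stringio(loop_count):
    """Write each number into an in-memory text buffer, then read it out once."""
    buffer = io.StringIO()
    for num in range(loop_count):
        buffer.write(str(num))
    return buffer.getvalue()
```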

Note also that, for any of the ones which do method calls,
you can speed up the call by saving a reference to the
bound method in a local variable. For example, in method 4
you can do "app_list = str_list.append" and then use
"app_list(`num`)" instead of str_list.append(`num`). This
saves an attribute lookup on each loop iteration. It's
not "idiomatic" to do so except in (a) cases of optimization
obsession, or (b) benchmarks. ;-)

Interesting and useful results. Thanks! :-)

-Peter
Jul 18 '05 #2
I was curious about this like an hour ago and googled for it, and hit
your page. Thanks! It was quite helpful.
Jul 18 '05 #3
Oliver Crow <oc***@skymind.com> wrote:
> As a relative Python newb, but an old hack in general, I've been
> interested in the impact of having string objects (and other
> primitives) be immutable. It seems to me that string concatenation is
> a rather common operation, and in Python having immutable strings
> results in a performance gotcha for anyone not aware of the impact of
> doing lots of concatenation in the obvious way.
>
> I found several sources with advice on how to do concatenation in a
> pythonic way (e.g. ref #1), but I hadn't seen any measurements or
> comparisons. So I put together a little test case and ran it through
> six different methods. Here are the results, and my conclusions:
>
> http://www.skymind.com/~ocrow/python_string/
>
> I'd be happy to hear if anyone else has done similar tests and if
> there are any other good candidate methods that I missed.
>
> ref #1: http://manatee.mojam.com/~skip/pytho...html#stringcat
>
> Oliver


Try printing the integers to a file, then read it back. Should be
similar to Method 5.
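
One way to read this suggestion, sketched for Python 3 with a temporary file. The function name `method_tempfile` and the use of the standard `tempfile` module are assumptions for illustration, not part of the original test script.

```python
import tempfile


def method_tempfile(loop_count):
    """Write the integers to a temporary file, then read the whole file back."""
    with tempfile.TemporaryFile(mode="w+") as handle:
        for num in range(loop_count):
            handle.write(str(num))
        handle.seek(0)  # rewind before reading what was just written
        return handle.read()
```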

--
William Park, Open Geometry Consulting, <op**********@yahoo.ca>
Linux solution/training/migration, Thin-client
Jul 18 '05 #4
Oliver Crow wrote:
> I'd be happy to hear if anyone else has done similar tests and if
> there are any other good candidate methods that I missed.


I'd like to try out another variant, but I'm unable to run your
script... Where does that timing module come from? I can't find it anywhere.

The method I propose is simply

def method7():
    return ''.join(map(str, xrange(loop_count)))

I ran my own little test and it seems to be faster than method6, but I'd
like to run it in your script for more reliable results. Also, I haven't
done any memory measurement, only a timing.
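
A Python 3 rendering of this proposal, with a rough peak-memory reading added since only timing was measured above. This is a sketch under assumptions: `range` stands in for `xrange`, `loop_count` becomes a parameter, and `peak_memory_of` is a hypothetical helper built on the standard `tracemalloc` module, which only sees allocations made through Python's allocator.

```python
import tracemalloc


def method7(loop_count):
    """Join the string form of each integer without an explicit loop."""
    return "".join(map(str, range(loop_count)))


def peak_memory_of(func, *args):
    """Return (result, peak_bytes) for one call of func, traced by tracemalloc."""
    tracemalloc.start()
    try:
        result = func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    text, peak = peak_memory_of(method7, 100000)
    print(len(text), "characters, peak", peak, "bytes")
```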

--
"Codito ergo sum"
Roel Schroeven
Jul 18 '05 #5
Oliver Crow <oc***@skymind.com> wrote:
> I found several sources with advice on how to do concatenation in a
> pythonic way (e.g. ref #1), but I hadn't seen any measurements or
> comparisons. So I put together a little test case and ran it through
> six different methods. Here are the results, and my conclusions:
>
> http://www.skymind.com/~ocrow/python_string/
>
> I'd be happy to hear if anyone else has done similar tests and if
> there are any other good candidate methods that I missed.


This

def method4():
    str_list = []
    for num in xrange(loop_count):
        str_list.append(`num`)
    return ''.join(str_list)

will run slightly faster modified to this:

def method4():
    str_list = []
    append = str_list.append
    for num in xrange(loop_count):
        append(`num`)
    return ''.join(str_list)

by factoring the method lookup out of the loop. Ditto for 3 and 5.
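
To see what the hoisting saves, the sketch below counts lookups of `append` directly instead of timing them. It assumes Python 3 (so `str(num)` and `range` replace the backticks and `xrange`), and `CountingList` is a hypothetical helper invented for this illustration, not something from the original script. The counting hook itself adds overhead, so this shows the number of lookups, not the speed difference.

```python
class CountingList(list):
    """A list that counts how often its 'append' attribute is looked up."""

    def __init__(self):
        super().__init__()
        self.append_lookups = 0

    def __getattribute__(self, name):
        if name == "append":
            self.append_lookups += 1
        return super().__getattribute__(name)


def method4_plain(loop_count):
    str_list = CountingList()
    for num in range(loop_count):
        str_list.append(str(num))  # looks up 'append' on every iteration
    return "".join(str_list), str_list.append_lookups


def method4_hoisted(loop_count):
    str_list = CountingList()
    append = str_list.append  # single lookup, done before the loop
    for num in range(loop_count):
        append(str(num))
    return "".join(str_list), str_list.append_lookups
```

Each function returns the joined string together with the lookup count, so the two variants can be compared for the same `loop_count`.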

Terry J. Reedy

PS. Changing IE's View/TextSize changes the size of header fonts but not
body text, which your CSS apparently fixes at a size a bit small for me on
my current system.


Jul 18 '05 #6
Roel Schroeven <rs****************@fastmail.fm> wrote in message news:<Mc*********************@phobos.telenet-ops.be>...
> I'd like to try out another variant, but I'm unable to run your
> script... Where does that timing module come from? I can't find it anywhere.

It's George Neville-Neil's timing module. It looks like it used to be
part of the python library, but has been removed in recent versions.
I'm not sure why. Perhaps timeit is the new preferred module.
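
For anyone rerunning the tests, here is a sketch of how the standard `timeit` module could stand in for the old `timing` module. It assumes Python 3; the helper name `best_time` and the repeat/number values are chosen for illustration, and `sample_method` is a stand-in workload, not one of the methods from the page.

```python
import timeit


def sample_method(loop_count=1000):
    """Stand-in workload: join the string form of each integer."""
    return "".join(str(num) for num in range(loop_count))


def best_time(func, repeat=3, number=20):
    """Run func `number` times per trial and return the fastest trial in seconds."""
    trials = timeit.repeat(func, repeat=repeat, number=number)
    return min(trials)


if __name__ == "__main__":
    print("best of 3: %.6f s" % best_time(sample_method))
```

Taking the minimum of several trials is a common way to reduce noise from other activity on the machine.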

> The method I propose is simply
>
>     def method7():
>         return ''.join(map(str, xrange(loop_count)))
>
> I ran my own little test and it seems to be faster than method6, but I'd
> like to run it in your script for more reliable results. Also, I haven't
> done any memory measurement, only a timing.


I'll add that test and rerun the results.

Thanks for the suggestion!

Oliver
Jul 18 '05 #7
Peter Hansen <pe***@engcorp.com> wrote in message news:<s8********************@powergate.ca>...
> You left out the StringIO module (having done only the cStringIO
> version of that).

I should probably add that one just for reference. I left it out
originally because my instinct was that it would perform less well
than the string += operator. I think it uses ordinary immutable
python strings for internal storage.

> Note also that, for any of the ones which do method calls,
> you can speed up the call by saving a reference to the
> bound method in a local variable. For example, in method 4
> you can do "app_list = str_list.append" and then use
> "app_list(`num`)" instead of str_list.append(`num`). This
> saves an attribute lookup on each loop iteration. It's
> not "idiomatic" to do so except in (a) cases of optimization
> obsession, or (b) benchmarks. ;-)

I hadn't thought of this, although it makes sense. It looks like I
could do this in methods 3, 4 and 5. But I also feel that it makes
the code a little less readable.

I think the unstated goal I had was to find a method that could be
learned by python programmers and used in real programs without having
to think *too* hard about the various performance trade-offs. So, in
that spirit I should definitely measure the difference, but perhaps
not go so far as to recommend it as part of the best overall approach.

> Interesting and useful results. Thanks! :-)


Thanks!
Oliver
Jul 18 '05 #8
