473,757 Members | 10,708 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

The Running Time of += on Char Strings ?

This Is A Late Cross Post from comp.lang.pytho n. It seems the mistery
is deeper then i expected.

What is the running time of conactination on character strings.

i.e.
joe="123"
joe+="999999999 99999999"

is it Amortized Constant time? I don't think it would be O((number of
chars)^2) but i really don't know.

Teach me how to fish, where would i find out more about the

internal
representations of data types in python (and guarenteed run times, im
think of something like sgi.com 's info on the STL) . I have looked
through the docs but i don't seem to see these types of specifications.

thanks * 100
- Haz

P.S.
- Should Note that i am famliure with timeit, but understanding the
underly data structures and representations is an important thing to
know.

P.P.S

This a bit of what i think relevent discourse i have been having via a
email responder of my usenet posting.
Haz> i should have mentioned that i am familure with the timeit
Haz> function, but shurly there must be a specification in the language
Haz> of the running time (number of flops).

Nope. I can't think of an instance where it would be appropriate to specify
runtime properties of various algorithms in the language. For example, if
you were to specify that sorting of lists was O(n log n) that would
potentially preclude the choice of quicksort as an algorithm because its
worst case behavior is O(n * n) even though it is generally faster than most
other sorting algorithms.
The answere here is to use omega(n log n) or specify average and worst
cases. I truely do think that there can be a complexity specification
for the language. I mean all algorithms have a complexity and surely
data structures are choosen with the size/speed tradeoffs in mind.

For instance in the STL has a sorting algorithm and it specifies a
running time the latter way (why they can do assure this is by using
coding with concepts, but i think in the base language it could be
simpler because the data structures are known.) i.e. Its know what
data types += works on and thus it should be know what algoritms are
to be used (that is the highly optimal ones)
From SGI STL Page: sort
<snip>
Complexity
O(N log(N)) comparisons (both average and worst-case), where N is last
- first. [2]
<snip>

source : http://www.sgi.com/tech/stl/sort.html

Haz> Without knowing these specifications its hard to optimize.

No, you still need to see where your program runs slow and figure out ways
to make it run faster.


Well basically my point is that it is hard to know why a code section
is running slow unless you understand the underlying data
represenations and algorithms

For instance

matlab code:

A=[]
for i=1:N
A=[A;'a']
end

is O(N^2) operation

C code:

vector<char> A;
for(i=0; i<N; i++){
A.push_back('a' );
}

is Amortized O(N) [note that this isn't the correct wording exactly,
but in general is O(N)]

Python Code:

A=""
for i in range(N):
A+='a'

running time : ???

So if you are looking through your code in matlab or C and see this
concatination loop you know that it is a problem in the former and not
in the latter. But in python ???
Jul 18 '05 #1
3 2312
Edg Bamyasi wrote:
What is the running time of conactination on character strings.

i.e.

joe="123"
joe+="999999 99999999999"

is it Amortized Constant time? I don't think it would be O((number of
chars)^2) but i really don't know.


Strings are immutable, so
joe+="999999999 99999999"
is executed as
joe = joe + "99999999999999 999"

This means that there is
- one allocation
- two memcpy
- one deallocation (old value of joe)

My guess is that the allocations play a rather large part in the actual
timing.

Creating a large string by multiple concatenations is slow, instead you
should:
- use module cStringIO
- or add all the strings to a list and do "".join (listvariable)

How are you supposed to know? It's mostly Python folklore, some of which
has been written down in the Python Cookbook
(http://aspn.activestate.com/ASPN/Python/Cookbook/)

Daniel
Jul 18 '05 #2
Edg Bamyasi schrieb:
What is the running time of conactination on character strings.
i.e. ..>>>joe="123"
..>>>joe+="9999 9999999999999" is it Amortized Constant time? I don't think it would be O((number of
chars)^2) but i really don't know.


First of all, this idiom is generally avoided in loops (where it actually
matters). Use ''.join() which the documentation describes as optimized for
your case and as preferable over creating a larger number of immutable
strings. Note that it does not specify the run-time complexity there.

That said, Python is a dynamic, high-level language. CPython is an
implementation. IronPython and Jython are different implementations . There are
others. Many of the internal complexities are implementation specific. Some
have good reasons for this. Also, IronPython and Jython heavily rely on the
performance of the underlying run-time environment for their own performance.
The differences between the various implementations and their run-time
environments can be big enough to render performance specifications useless in
many (though possibly not all) cases.

If you need to know the exact complexity of algorithms, read them. Download
the source distribution of the Python version you want to investigate and read
the source. But remember that there is no actual specification. Do not expect
your code to run at the same speed in all Python versions and implementations .

There are examples for basic algorithms that were exchanged during the long
evolution of the CPython implementation. One is the sort algorithm. Recent 2.4
changes in the handling of lists made some common operations considerably faster.

It is a pragmatically sane approach to accept the high programming level of
Python and to not rely on the specific performance of a specific
implementation. Just use the tool that is made for your task. The information
for choosing the right tool can already be found in the documentation.

Stefan
Jul 18 '05 #3
Thanks Guys It Was Great Help and I have began to mark my code for the
''.join() string conatination optimization. Upon regoogling (when you
know the right thing to google it can make a big diffrence, having not
know how to google +=, hehe). I found this commentary and set of tests.
I find it a good conclustion to this question.

http://www.skymind.com/~ocrow/python_string/
''.join(['Thank ','you])

- Haz

Jul 18 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
4361
by: Gizmo | last post by:
hello all have been trying to write a Mid() function a bit like the one in vb. i come to compile it and there are no errors however when i run it an error accours and it says the program has to close. The odd thing is the error accours after the mid function has done its job. but if i take out the mid function the error does not accour. i have found that if i do cout <<mid(newString.GetString(),iPosition,3);
1
18006
by: Matt Garman | last post by:
What is the "best" way to copy a vector of strings to an array of character strings? By "best", I mean most elegantly/tersely written, but without any sacrifice in performance. I'm writing an application using C++ and the STL for handling my data. Unfortunately, I must interact with a (vanilla) C API. I use vectors of strings (for simplicity and less memory hassle), but the function calls for this API require arrays of character...
3
2211
by: sieg1974 | last post by:
Hi, I have made this simple program to understand char ** pointers, but I still having many questions. int main() { char ** testPointerPointerChar = 0; char * A = "string01";
68
3712
by: Roman Ziak | last post by:
Hello, I just downloaded MS Visual Studio 2005 Express Beta. When I tried to compile existing valid project, I get a lot of warnings like 'sprintf' has been deprecated, 'strcpy' has been deprecated etc. I opened STDIO.H and figured that one has to define a macro _CRT_SECURE_NO_DEPRECATE to stop these warnings. I started to search internet and found few links, and the following proposal
10
4786
by: Bart Goeman | last post by:
Hi, I have a question about how to put redundant information in data structures, initialized at compile time. This is often necessary for performance reasons and can't be done at run time (data structures are read only) Ideally one should be able to put the redundant information there automatically so no mistakes are possible, but in a lot of case I see no way how to do it.
4
10722
by: John Devereux | last post by:
Hi, I would like some advice on whether I should be using plain "chars" for strings. I have instead been using "unsigned char" in my code (for embedded systems). In general the strings contain ASCII characters in the 0-127 range, although I had thought that I might want to use the 128-255 range for special symbols or foreign character codes. This all worked OK for a long time, but a recent update to the compiler on my system has...
5
2704
by: Peter Steele | last post by:
We have an application that when it runs in the IDE in debug mode an unhandled exception is occurring in a system header file associated with STL stirngs. The actual statement that crashes is return ::memcmp(_First1, _First2, _Count); On inspecting these variables, the strings are in fact equal when the exception occurs and _Count is the right size. As a test I replaced this code in the system include file with a for loop to do the...
4
3577
by: nass | last post by:
hello everyone, i have a bit of problem reading char * strings from a buffer (a shared memory, pointed to by 'file_memory'). basically i have a structure in memory 'ShMem' that can be accessed by 2 applications (or will be at least when it is done). the structure is declared in the procedure under the pointer infoVar. members tXXL are integer lengths of the strings that as all saved (concatenated) at address of oAndS_str, which is of type...
27
3294
by: aravind | last post by:
Hi, I need to develop a menu system for a project of mine. The menu will be displayed on a character LCD display driven by ARM7 Microcontroller. For this purpose i wish to construct a structure in C which will contain a the following - struct menu { int n (no. of elements in menu); char menu_items; (This will contains the strings to be displayed on the LCD, 20 characters and n lines funcptr fptr; (Pointer to the corresponding menu...
0
9298
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9906
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
9885
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
1
7286
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6562
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5172
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5329
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
3
3399
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2698
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.