473,396 Members | 2,026 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

The Running Time of += on Char Strings ?

This Is A Late Cross Post from comp.lang.python. It seems the mistery
is deeper then i expected.

What is the running time of conactination on character strings.

i.e.
joe="123"
joe+="99999999999999999"

is it Amortized Constant time? I don't think it would be O((number of
chars)^2) but i really don't know.

Teach me how to fish, where would i find out more about the

internal
representations of data types in python (and guarenteed run times, im
think of something like sgi.com 's info on the STL) . I have looked
through the docs but i don't seem to see these types of specifications.

thanks * 100
- Haz

P.S.
- Should Note that i am famliure with timeit, but understanding the
underly data structures and representations is an important thing to
know.

P.P.S

This a bit of what i think relevent discourse i have been having via a
email responder of my usenet posting.
Haz> i should have mentioned that i am familure with the timeit
Haz> function, but shurly there must be a specification in the language
Haz> of the running time (number of flops).

Nope. I can't think of an instance where it would be appropriate to specify
runtime properties of various algorithms in the language. For example, if
you were to specify that sorting of lists was O(n log n) that would
potentially preclude the choice of quicksort as an algorithm because its
worst case behavior is O(n * n) even though it is generally faster than most
other sorting algorithms.
The answere here is to use omega(n log n) or specify average and worst
cases. I truely do think that there can be a complexity specification
for the language. I mean all algorithms have a complexity and surely
data structures are choosen with the size/speed tradeoffs in mind.

For instance in the STL has a sorting algorithm and it specifies a
running time the latter way (why they can do assure this is by using
coding with concepts, but i think in the base language it could be
simpler because the data structures are known.) i.e. Its know what
data types += works on and thus it should be know what algoritms are
to be used (that is the highly optimal ones)
From SGI STL Page: sort
<snip>
Complexity
O(N log(N)) comparisons (both average and worst-case), where N is last
- first. [2]
<snip>

source : http://www.sgi.com/tech/stl/sort.html

Haz> Without knowing these specifications its hard to optimize.

No, you still need to see where your program runs slow and figure out ways
to make it run faster.


Well basically my point is that it is hard to know why a code section
is running slow unless you understand the underlying data
represenations and algorithms

For instance

matlab code:

A=[]
for i=1:N
A=[A;'a']
end

is O(N^2) operation

C code:

vector<char> A;
for(i=0; i<N; i++){
A.push_back('a');
}

is Amortized O(N) [note that this isn't the correct wording exactly,
but in general is O(N)]

Python Code:

A=""
for i in range(N):
A+='a'

running time : ???

So if you are looking through your code in matlab or C and see this
concatination loop you know that it is a problem in the former and not
in the latter. But in python ???
Jul 18 '05 #1
3 2288
Edg Bamyasi wrote:
What is the running time of conactination on character strings.

i.e.

joe="123"
joe+="99999999999999999"

is it Amortized Constant time? I don't think it would be O((number of
chars)^2) but i really don't know.


Strings are immutable, so
joe+="99999999999999999"
is executed as
joe = joe + "99999999999999999"

This means that there is
- one allocation
- two memcpy
- one deallocation (old value of joe)

My guess is that the allocations play a rather large part in the actual
timing.

Creating a large string by multiple concatenations is slow, instead you
should:
- use module cStringIO
- or add all the strings to a list and do "".join (listvariable)

How are you supposed to know? It's mostly Python folklore, some of which
has been written down in the Python Cookbook
(http://aspn.activestate.com/ASPN/Python/Cookbook/)

Daniel
Jul 18 '05 #2
Edg Bamyasi schrieb:
What is the running time of conactination on character strings.
i.e. ..>>>joe="123"
..>>>joe+="99999999999999999" is it Amortized Constant time? I don't think it would be O((number of
chars)^2) but i really don't know.


First of all, this idiom is generally avoided in loops (where it actually
matters). Use ''.join() which the documentation describes as optimized for
your case and as preferable over creating a larger number of immutable
strings. Note that it does not specify the run-time complexity there.

That said, Python is a dynamic, high-level language. CPython is an
implementation. IronPython and Jython are different implementations. There are
others. Many of the internal complexities are implementation specific. Some
have good reasons for this. Also, IronPython and Jython heavily rely on the
performance of the underlying run-time environment for their own performance.
The differences between the various implementations and their run-time
environments can be big enough to render performance specifications useless in
many (though possibly not all) cases.

If you need to know the exact complexity of algorithms, read them. Download
the source distribution of the Python version you want to investigate and read
the source. But remember that there is no actual specification. Do not expect
your code to run at the same speed in all Python versions and implementations.

There are examples for basic algorithms that were exchanged during the long
evolution of the CPython implementation. One is the sort algorithm. Recent 2.4
changes in the handling of lists made some common operations considerably faster.

It is a pragmatically sane approach to accept the high programming level of
Python and to not rely on the specific performance of a specific
implementation. Just use the tool that is made for your task. The information
for choosing the right tool can already be found in the documentation.

Stefan
Jul 18 '05 #3
Thanks Guys It Was Great Help and I have began to mark my code for the
''.join() string conatination optimization. Upon regoogling (when you
know the right thing to google it can make a big diffrence, having not
know how to google +=, hehe). I found this commentary and set of tests.
I find it a good conclustion to this question.

http://www.skymind.com/~ocrow/python_string/
''.join(['Thank ','you])

- Haz

Jul 18 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: Gizmo | last post by:
hello all have been trying to write a Mid() function a bit like the one in vb. i come to compile it and there are no errors however when i run it an error accours and it says the program has to...
1
by: Matt Garman | last post by:
What is the "best" way to copy a vector of strings to an array of character strings? By "best", I mean most elegantly/tersely written, but without any sacrifice in performance. I'm writing an...
3
by: sieg1974 | last post by:
Hi, I have made this simple program to understand char ** pointers, but I still having many questions. int main() { char ** testPointerPointerChar = 0; char * A = "string01";
68
by: Roman Ziak | last post by:
Hello, I just downloaded MS Visual Studio 2005 Express Beta. When I tried to compile existing valid project, I get a lot of warnings like 'sprintf' has been deprecated, 'strcpy' has been...
10
by: Bart Goeman | last post by:
Hi, I have a question about how to put redundant information in data structures, initialized at compile time. This is often necessary for performance reasons and can't be done at run time (data...
4
by: John Devereux | last post by:
Hi, I would like some advice on whether I should be using plain "chars" for strings. I have instead been using "unsigned char" in my code (for embedded systems). In general the strings contain...
5
by: Peter Steele | last post by:
We have an application that when it runs in the IDE in debug mode an unhandled exception is occurring in a system header file associated with STL stirngs. The actual statement that crashes is ...
4
by: nass | last post by:
hello everyone, i have a bit of problem reading char * strings from a buffer (a shared memory, pointed to by 'file_memory'). basically i have a structure in memory 'ShMem' that can be accessed by...
27
by: aravind | last post by:
Hi, I need to develop a menu system for a project of mine. The menu will be displayed on a character LCD display driven by ARM7 Microcontroller. For this purpose i wish to construct a structure in...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.