When I pass an empty string to a function, is a new string object created,
or does Python use some global pre-created object? I know Python does
this with integer objects under a certain value. For instance, in the
following code, is a new string object created for each function call?
func(0,'')
func(1,'')
func(2,'')
func(3,'')
I tried the following commands in the interactive shell:
>>> x = ''
>>> y = ''
>>> x is y
True
>>> x = 'hello'
>>> y = 'hello'
>>> x is y
True
This leads me to believe that python does reuse existing strings, but
once the variables are removed, does the item still exist in the cache?
-Farshid
Farshid Lashkari wrote:
> When I pass an empty string to a function is a new string object
> created or does python use some global pre-created object? I know
> python does this with integer objects under a certain value. For
> instance, in the following code is a new string object created for
> each function call?
> func(0,'')
> func(1,'')
> func(2,'')
> func(3,'')
In this case, the language implementation may either create new
strings or re-use existing ones:
for immutable types, operations that compute new values
may actually return a reference to any existing object with
the same type and value, while for mutable objects this is
not allowed.
[http://docs.python.org/ref/objects.html]
> [...] This leads me to believe that python does reuse existing strings,
> but once the variables are removed, does the item still exist in the
> cache?
Either; see the same reference page.
--
--Bryan
Farshid Lashkari wrote:
> When I pass an empty string to a function is a new string object
> created or does python use some global pre-created object? I know
> python does this with integer objects under a certain value. For
> instance, in the following code is a new string object created for
> each function call?
> func(0,'')
> func(1,'')
> func(2,'')
> func(3,'')
> I tried the following commands in the interactive shell:
> >>> x = ''
> >>> y = ''
> >>> x is y
> True
> >>> x = 'hello'
> >>> y = 'hello'
> >>> x is y
> True
> This leads me to believe that python does reuse existing strings, but
> once the variables are removed, does the item still exist in the cache?
It takes far too little evidence to induce belief:
>>> a = "hello"
>>> b = "h"+"ello"
>>> a is b
False
>>> c = "hello"
>>> b is a
False
regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/
> It takes far too little evidence to induce belief:
> >>> a = "hello"
> >>> b = "h"+"ello"
> >>> a is b
> False
> >>> c = "hello"
> >>> b is a
> False
I don't understand the point of your last expression. Were you intending
this instead:
>>> c is a
True
However, the following commands add to my confusion:
>>> a = 'wtf?'
>>> b = 'wtf?'
>>> a is b
False
So how are string literals cached? Is there an explanation somewhere? Is
it some freaky voodoo, and I should just assume that a string literal
will always generate a new object?
Thanks,
Farshid
Farshid Lashkari wrote:
>> It takes far too little evidence to induce belief:
>> >>> a = "hello"
>> >>> b = "h"+"ello"
>> >>> a is b
>> False
>> >>> c = "hello"
>> >>> b is a
>> False
> I don't understand the point of your last expression. Were you
> intending this instead:
> >>> c is a
> True
Yes.
> However, the following commands add to my confusion:
> >>> a = 'wtf?'
> >>> b = 'wtf?'
> >>> a is b
> False
> So how are string literals cached? Is there an explanation somewhere?
> Is it some freaky voodoo, and I should just assume that a string
> literal will always generate a new object?
I really don't understand why it's so important: it's not a part of the
language definition at all, and therefore whatever behavior you see is
simply an artifact of the implementation you observe.
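A minimal sketch of that point, assuming Python 3 (the thread itself predates it). The helper name build_hello is made up for illustration. Equality between the two strings is guaranteed by the language; the result of the identity check is deliberately left unstated, because it can differ between implementations and versions.

```python
def build_hello():
    # Hypothetical helper: assembles the string at run time so the
    # compiler cannot simply reuse the literal used below.
    return "".join(["h", "ello"])

a = "hello"
b = build_hello()

# Value comparison: defined by the language, always True here.
print(a == b)

# Identity comparison: may be True or False depending on the
# implementation, so no program logic should depend on it.
print(a is b)
```

Code that needs to know whether two strings have the same contents should use ==, which behaves the same whether or not the objects happen to be shared.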
regards
Steve
> I really don't understand why it's so important: it's not a part of the language definition at all, and therefore whatever behavior you see is simply an artifact of the implementation you observe.
I guess I should rephrase my question in the form of an example. Should
I assume that a new string object is created in each iteration of the
following loop?
for x in xrange(1000000):
    func(x,'some string')
Or would it be better to do the following?
stringVal = 'some string'
for x in xrange(1000000):
    func(x,stringVal)
Or, like you stated, is it not important at all?
Thanks,
Farshid
Farshid Lashkari wrote:
>> I really don't understand why it's so important: it's not a part of
>> the language definition at all, and therefore whatever behavior you
>> see is simply an artifact of the implementation you observe.
> I guess I should rephrase my question in the form of an example. Should
> I assume that a new string object is created in each iteration of the
> following loop?
> for x in xrange(1000000):
>     func(x,'some string')
> Or would it be better to do the following?
> stringVal = 'some string'
> for x in xrange(1000000):
>     func(x,stringVal)
> Or, like you stated, is it not important at all?
It doesn't make a lot of difference:
>>> import dis
>>> print dis.dis.__doc__
Disassemble classes, methods, functions, or code.

    With no argument, disassemble the last traceback.

>>> help(dis.dis)
>>> dis.dis(compile("""\
... for x in xrange(1000000):
...     func(x,'some string')""", "", 'exec'))
  1           0 SETUP_LOOP              33 (to 36)
              3 LOAD_NAME                0 (xrange)
              6 LOAD_CONST               0 (1000000)
              9 CALL_FUNCTION            1
             12 GET_ITER
        >>   13 FOR_ITER                19 (to 35)
             16 STORE_NAME               1 (x)

  2          19 LOAD_NAME                2 (func)
             22 LOAD_NAME                1 (x)
             25 LOAD_CONST               1 ('some string')
             28 CALL_FUNCTION            2
             31 POP_TOP
             32 JUMP_ABSOLUTE           13
        >>   35 POP_BLOCK
        >>   36 LOAD_CONST               2 (None)
             39 RETURN_VALUE
>>> dis.dis(compile("""\
... stringVal = 'some string'
... for x in xrange(1000000):
...     func(x,stringVal)""", "", 'exec'))
  1           0 LOAD_CONST               0 ('some string')
              3 STORE_NAME               0 (stringVal)

  2           6 SETUP_LOOP              33 (to 42)
              9 LOAD_NAME                1 (xrange)
             12 LOAD_CONST               1 (1000000)
             15 CALL_FUNCTION            1
             18 GET_ITER
        >>   19 FOR_ITER                19 (to 41)
             22 STORE_NAME               2 (x)

  3          25 LOAD_NAME                3 (func)
             28 LOAD_NAME                2 (x)
             31 LOAD_NAME                0 (stringVal)
             34 CALL_FUNCTION            2
             37 POP_TOP
             38 JUMP_ABSOLUTE           19
        >>   41 POP_BLOCK
        >>   42 LOAD_CONST               2 (None)
             45 RETURN_VALUE
It just boils down to either a LOAD_CONST vs. a LOAD_NAME - either way
the string isn't duplicated.
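The same conclusion can be checked without reading bytecode. The following is a rough sketch, assuming CPython 3 (so range replaces xrange); func here is a made-up stand-in that just reports the identity of the object it receives, and the two loop functions are named for illustration only. On CPython I would expect each loop to see exactly one string object, because the literal is stored once in the function's constants table.

```python
def func(x, s):
    # Hypothetical stand-in for the thread's func(): report which
    # object was passed in.
    return id(s)

def loop_with_literal(n):
    seen = set()
    for x in range(n):
        seen.add(func(x, 'some string'))
    return seen

def loop_with_name(n):
    string_val = 'some string'
    seen = set()
    for x in range(n):
        seen.add(func(x, string_val))
    return seen

literal_ids = loop_with_literal(1000)
name_ids = loop_with_name(1000)

# The literal is kept in the compiled function's constants table.
print('some string' in loop_with_literal.__code__.co_consts)

# Number of distinct objects each loop handed to func().
print(len(literal_ids), len(name_ids))
```

This relies on how CPython compiles literals, so it illustrates the observed behavior and is not something the language promises.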
regards
Steve
> It just boils down to either a LOAD_CONST vs. a LOAD_NAME - either way the string isn't duplicated.
Great, that's exactly what I wanted to know. Thanks Steve!
-Farshid
Farshid Lashkari wrote:
> However, the following commands add to my confusion:
> >>> a = 'wtf?'
> >>> b = 'wtf?'
> >>> a is b
> False
> So how are string literals cached? Is there an explanation somewhere?
> Is it some freaky voodoo, and I should just assume that a string
> literal will always generate a new object?
A few comments (which I hope are correct, but which I hope you will read
then mostly ignore since you probably shouldn't be designing based on
this stuff anyway):
1. What you see at the interactive prompt is not necessarily what will
happen in a compiled source file. Try the above test with and without
the question mark both at the interactive prompt and in source and see.
2. The reason for the difference with ? is probably due to an
optimization relating to looking up attribute names and such in
dictionaries. wtf? is not a valid attribute, so it probably isn't
optimized the same way.
2.5. I think I had a third comment like the above but after too little
sleep last night it seems it's evaporated since I began writing...
3. As Steve or someone said, these things are implementation details
unless spelled out explicitly in the language reference, so don't rely
on them.
4. If you haven't already written your code, profiled it, found it
lacking in performance, and proved that the cause is related to whether
or not you hoisted the string literal out of the loop, you're wasting
your time, and this is a good opportunity to begin reprogramming your
brain not to optimize prematurely. IMHO. FWIW. :-)
-Peter
> A few comments (which I hope are correct, but which I hope you will read then mostly ignore since you probably shouldn't be designing based on this stuff anyway):
Thanks for the info Peter. My original question wasn't due to any
observed performance problems. I was just being curious :)
-Farshid
Farshid Lashkari wrote:
>> I really don't understand why it's so important: it's not a part of
>> the language definition at all, and therefore whatever behavior you
>> see is simply an artifact of the implementation you observe.
> I guess I should rephrase my question in the form of an example. Should
> I assume that a new string object is created in each iteration of the
> following loop?
> for x in xrange(1000000):
>     func(x,'some string')
> Or would it be better to do the following?
> stringVal = 'some string'
> for x in xrange(1000000):
>     func(x,stringVal)
> Or, like you stated, is it not important at all?
In this particular case, it's no big deal, since you use
a literal, which is something Python knows won't change.
In general, it's semantically very different to create an
object at one point and then use a reference to that over
and over in a loop, or to create a new object over and
over again in a loop. E.g.
for x in xrange(1000000):
    func(x, str(5))
vs.
stringVal = str(5)
for x in xrange(1000000):
    func(x, stringVal)
This isn't just a matter of extra function call overhead. In
the latter case, you are telling Python that all calls to
"func(x,stringVal)" use the same objects as arguments (assuming
that there aren't any assignments to x and stringVal somewhere
else in the loop). In the former case, no such guarantee can
be made from studying the loop.
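The semantic difference is easiest to observe with a mutable object, so this sketch swaps the string for a list; that substitution, and the func shown here, are assumptions made for illustration and are not from the thread. It is written for Python 3.

```python
def func(x, items):
    # Hypothetical callee that mutates its argument and reports its size.
    items.append(x)
    return len(items)

# A new list is created on every iteration, so each call sees one element.
fresh_sizes = [func(x, []) for x in range(3)]

# One list is created up front and shared by every call.
shared = []
shared_sizes = [func(x, shared) for x in range(3)]

print(fresh_sizes)   # [1, 1, 1]
print(shared_sizes)  # [1, 2, 3]
```

With immutable strings the two forms produce the same values, which is why hoisting is harmless there; the difference only shows up through object identity.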
As for interning strings, it's my understanding that current
CPython interns strings that look like identifiers, i.e. strings
that start with an ASCII letter or an underscore, followed by
zero or more ASCII letters, underscores or digits. On the other
hand, it seems id(str(5)) is persistent as well, so the current
implementation seems slightly simplified compared to the
perceived need. Anyway, this is just an implementation choice
made to improve performance, nothing to rely on.
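If sharing is actually wanted, it can be requested explicitly instead of being inferred from what the interpreter happens to do. A small sketch, assuming Python 3, where the function is sys.intern (Python 2, which this thread used, spelled it as the builtin intern). The variable names are illustrative.

```python
import sys

# Build two equal strings at run time; whether they start out as the
# same object is up to the implementation.
a = "".join(["wtf", "?"])
b = "".join(["wtf", "?"])

# sys.intern() returns the canonical copy from the interned-string table,
# so equal strings map to one object after interning.
ia = sys.intern(a)
ib = sys.intern(b)

print(a == b)    # True
print(ia is ib)  # True
```

Explicit interning is mostly useful when many equal strings are used as dictionary keys; for ordinary comparisons == remains the right tool.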