Bytes IT Community

An oddity in list comparison and element assignment

The following script puzzles me. It creates two nested lists that
compare identically. After identical element assignments, the lists
are different. In one case, a single element is replaced. In the
other, an entire column is replaced.

---------------------------------------------------------------------------------------

'''
An oddity in the behavior of lists of lists. Occurs under
Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)]
on win32.
Not tested on other platforms or builds.
'''
a = [[[1,2],[1,2]],[[1,2],[1,2]]]
b = [[range(1,3)]*2]*2
assert(a==b)
print "Initially, python reports that the lists are equal"
a[1][1]=[5]
b[1][1]=[5]
try:
    assert(a==b)
except AssertionError:
    print "After identical element assignments, the lists are not equal"
print "a is now ", a
print "b is now ", b
-------------------------------------------------------------------------------------

Here's the output on my system.

------------------------------------------------------------------------------------
Initially, python reports that the lists are equal
After identical element assignments, the lists are not equal
a is now [[[1, 2], [1, 2]], [[1, 2], [5]]]
b is now [[[1, 2], [5]], [[1, 2], [5]]]
------------------------------------------------------------------------------------

This seems contrary to one of my fundamental expectations, namely that
objects which compare equally must remain equal after identical
operations. I think what must be going on is that the 'b' list
contains replicated references instead of copies of [range(1,3)]*2.
IMO, python's == operator should detect this difference in list
structure since it leads to different behavior under element
assignments.
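The replicated-references hypothesis is easy to confirm with identity checks. A minimal sketch (using `list(range(1, 3))` to stand in for Python 2's `range(1,3)`, which returned a list):

```python
# b's two rows are the *same* list object, and every [1, 2] inside b is
# the same object too; a's sub-lists are all distinct.
a = [[[1, 2], [1, 2]], [[1, 2], [1, 2]]]
b = [[list(range(1, 3))] * 2] * 2

assert a == b                # equal by value...
assert a[0] is not a[1]      # ...but a holds distinct rows,
assert b[0] is b[1]          # while b holds two references to one row,
assert b[0][0] is b[1][1]    # and one shared [1, 2] throughout.

# Assigning to index 1 of the single shared row is therefore visible
# through both b[0] and b[1]:
b[1][1] = [5]
assert b == [[[1, 2], [5]], [[1, 2], [5]]]
```

This reproduces exactly the output shown above: `a` changes in one place, `b` changes in an entire "column".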

Mike Ellis

Jun 1 '06 #1
43 Replies


<mi*************@gmail.com> wrote:
...
operations. I think what must be going on is that the 'b' list
contains replicated references instead of copies of [range(1,3)]*2 .
Right.
IMO, python's == operator should detect this difference in list
structure since it leads to different behavior under element
assignments.


Wrong; equality does not imply any check on identity. You can consider
the definition of "list A equals list B" as:

-- len(A) == len(B), AND,
-- for each valid index i, A[i] == B[i]

This is an extremely natural definition of "equality" for containers:
"they have EQUAL items" [[in the same order, for containers for which
order is relevant]]. Nowhere in this extremely natural definition does
the IDENTITY of the items come into play.

Therefore, your expectations about the effects of item alterations (for
alterable items) are ill-founded.
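The two clauses above translate directly into code; a minimal sketch (the name `list_eq` is mine, not part of any API):

```python
def list_eq(A, B):
    # Same length, and pairwise-equal items (by ==, never by identity).
    # This mirrors what the built-in == already does for lists.
    return len(A) == len(B) and all(A[i] == B[i] for i in range(len(A)))
```

Note that the recursion into nested containers falls out for free, because `A[i] == B[i]` applies the same definition one level down.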

Try concisely expressing your "should" -- constructively, as pseudocode
that one could use to check for your "strengthened equality", not in
abstract terms of constraints -- and if (as I strongly suspect) you
cannot find a definition that is as simple, concise and natural as the
two-liner above, this might help convince you that your desired
definition would NOT be the most obvious, natural and fundamental, and
therefore would not be appropriate to pick as part of the language's
core. Indeed, it's an interesting problem to code up, if one wants any
generality (for example, identity of immutable items _whose items or
attributes are in turn immutable_ probably should not matter even for
your "strengthened" equality... but that's pretty hard to express!-).
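For contrast, here is one attempt at an operational definition of such a "strengthened equality": values must compare equal AND shared references must occur in the same places on both sides. A sketch covering nested lists only; `structurally_equal` is a hypothetical name, and cases like tuples, dicts, or the immutable-item subtlety mentioned above are deliberately ignored:

```python
def structurally_equal(a, b, fwd=None, rev=None):
    # fwd/rev build a bijection between sub-lists of a and sub-lists
    # of b: an alias on one side must map to an alias on the other.
    if fwd is None:
        fwd, rev = {}, {}
    if isinstance(a, list) and isinstance(b, list):
        if id(a) in fwd or id(b) in rev:
            # Already-seen node: it must pair with the node it was
            # paired with the first time, on both sides.
            return fwd.get(id(a)) == id(b) and rev.get(id(b)) == id(a)
        fwd[id(a)] = id(b)
        rev[id(b)] = id(a)
        return len(a) == len(b) and all(
            structurally_equal(x, y, fwd, rev) for x, y in zip(a, b))
    return a == b   # non-list leaves: ordinary equality
```

Even this sketch is several times longer than the two-liner, handles only lists, and has to thread bookkeeping state through the recursion, which rather supports the point about which definition is the natural one.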
Alex

Jun 1 '06 #2

Hi Alex,
With all due respect to your well-deserved standing in the Python
community, I'm not convinced that equality shouldn't imply invariance
under identical operations.

Perhaps the most fundamental notion in mathematics is that the left and
right sides of an equation remain identical after any operation applied
to both sides. Our experience of the physical world is similar. If I
make identical modifications to the engines of two identical
automobiles, I expect the difference in performance to be identical.
If my expectation is not met, I would assert that either the two vehicles
were not identical to begin with or that my modifications were not
performed identically.

As to containers, would you say that an envelope containing five $100
bills is the same as an envelope containing a single $100 bill and 4
xerox copies of it? If so, I'd like to engage in some envelope
exchanges with you :-)

As I see it, reference copying is a very useful performance and memory
optimization. But I don't think it should undermine the validity of
assert(a==b) as a predictor of invariance under identical operations.

Cheers,
Mike Ellis
Alex Martelli wrote:
<mi*************@gmail.com> wrote:
Wrong; equality does not imply any check on identity. You can consider
the definition of "list A equals list B" as:

-- len(A) == len(B), AND,
-- for each valid index i, A[i] == B[i]

This is an extremely natural definition of "equality" for containers:
"they have EQUAL items" [[in the same order, for containers for which
order is relevant]]. Nowhere in this extremely natural definition does
the IDENTITY of the items come into play.


Jun 1 '06 #3

Oops! The last sentence of the 2nd paragraph in my previous message should
read "If my expectation is NOT met ..."

Jun 1 '06 #4

> As to containers, would you say that envelope containing five $100
bills is the same as an envelope containing a single $100 bill and 4
xerox copies of it? If so, I'd like to engage in some envelope
exchanges with you :-)


if len(set([bill.serialnumber for bill in envelope])) != len(envelope):
    refuseMichaelsExchange()

Though the way references work, you would have an envelope
containing only 5 slips of paper that all say "I have a $100
bill"...

:)

-tkc

Jun 1 '06 #5

[mi*************@gmail.com]
...
As I see it, reference copying is a very useful performance and memory
optimization. But I don't think it should undermine the validity of
assert(a==b) as a predictor of invariance under identical operations.


So, as Alex said last time,

Try concisely expressing your "should" -- constructively, as
pseudocode that one could use to check for your "strengthened
equality", not in abstract terms of constraints -- and if (as I
strongly suspect) you cannot find a definition that is as simple,
concise and natural as the two-liner above, this might help
convince you that your desired definition would NOT be the most
obvious, natural and fundamental, and therefore would not be
appropriate to pick as part of the language's core. Indeed,
it's an interesting problem to code up, if one wants any generality
(for example, identity of immutable items _whose items or
attributes are in turn immutable_ probably should not matter even
for your "strengthened" equality... but that's pretty hard to express!-).

So try that. In reality, you can either learn to change your
expectations, or avoid virtually all object-oriented programming
languages. Object identity is generally fundamental to the intended
semantics of such languages, not just an optimization.

Think about a simpler case:

a = [1]
b = a
assert(a == b)
a.remove(1)
b.remove(1)

Oops. The last line dies with an exception, despite that a==b at the
third statement and that ".remove(1)" is applied to both a and b. If
you think a should not equal b at the third statement "because" of
this, you're going to lead a life of increasing but needless despair
;-)
Jun 1 '06 #6

mi*************@gmail.com wrote:
As to containers, would you say that envelope containing five $100
bills is the same as an envelope containing a single $100 bill and 4
xerox copies of it? If so, I'd like to engage in some envelope
exchanges with you :-)


if you spent as much time *learning* stuff as you spend making up irrelevant examples,
you might end up learning how assignments, repeat operators, and references work in
Python.

it's only hard to understand if you don't want to understand it.

</F>

Jun 1 '06 #7

mi*************@gmail.com wrote:
As to containers, would you say that envelope containing five $100
bills is the same as an envelope containing a single $100 bill and 4
xerox copies of it? If so, I'd like to engage in some envelope
exchanges with you :-)


Would you say that an envelope containing five $100 bills is equal to
an envelope containing five $100 bills with different serial numbers?

--Scott David Daniels
sc***********@acm.org
Jun 1 '06 #8

In article <11*********************@c74g2000cwc.googlegroups.com>,
<mi*************@gmail.com> wrote:
Hi Alex,
With all due respect to your well-deserved standing in the Python
community, I'm not convinced that equality shouldn't imply invariance
under identical operations.

Perhaps the most fundamental notion is mathematics is that the left and
right sides of an equation remain identical after any operation applied
to both sides. Our experience of the physical world is similar. If I
make identical modifications to the engines of two identical
automobiles, I expect the difference in performance to be identical.
If my expectation is met, I would assert that either the two vehicles
were not identical to begin with or that my modifications were not
performed identically.

As to containers, would you say that envelope containing five $100
bills is the same as an envelope containing a single $100 bill and 4
xerox copies of it? If so, I'd like to engage in some envelope
exchanges with you :-)

As I see it, reference copying is a very useful performance and memory
optimization. But I don't think it should undermine the validity of
assert(a==b) as a predictor of invariance under identical operations.


I think you end up with a totally unwieldy definition of '==' for
containers, since you have to check for _identical_ aliasing to
whatever depth it might occur, and, if you insist on equality
modeling the physical world, two lists can only be equal if:

for each corresponding element in the two lists, either the element is
a reference to the same underlying object or the corresponding
elements are references to objects which do not have and never will
have other references bound to them.

For example:

ra = ['a', 'a']
rb = ['b', 'b']

l1 = [ra, rb]
l2 = [ra, rb]

These will be equal by your definition and will preserve equality over
identical operations on l1 and l2.

l3 = [['a', 'a'], ['b', 'b']]

This list will be equal, under your definition, only so long as
nobody does anything to the references ra or rb. Your
equality test has to claim that l1 and l3 are not equal, since ra
could be changed and that's not an operation on l1 or l3.

This also leaves out the complexity of analysing nested structures -
if you have a tuple, containing tuples which contain lists, then are
those top level tuples 'equal' if there are aliases in the lists? How
many levels deep should an equality test go?

Does the more common test - to see if the elements of a sequence are
identical at the time of comparison - need a new operator or hand
coding, since most of the time programmers aren't interested in the
future equality of the structures?
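The ra/rb scenario can be run directly. A quick sketch, with l3 spelled `[['a', 'a'], ['b', 'b']]` so that it actually compares equal to l1 at the start:

```python
ra = ['a', 'a']
rb = ['b', 'b']
l1 = [ra, rb]
l2 = [ra, rb]
l3 = [['a', 'a'], ['b', 'b']]   # value-equal to l1, but no shared refs

assert l1 == l2 == l3
ra[0] = 'X'          # an operation on ra - not on l1, l2 or l3 directly
assert l1 == l2      # l1 and l2 move in lockstep (they share ra)...
assert l1 != l3      # ...while l3 silently falls out of equality
```

So under the proposed "strengthened equality", l1 == l3 would have to be False from the beginning, even though no mutation has happened yet.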

--
Jim Segrave (je*@jes-2.demon.nl)

Jun 1 '06 #9

Hi Tim,
In your example, a & b are references to the same object. I agree they
should compare equally. But please note that a==b is True at every
point in your example, even after the ValueError raised by b.remove(1).
That's good consistent behavior.

My original example is a little different. a and b never referred to
the same object. They were containers created by different expressions
with no object in common. The problem arises because the overloaded *
operator replicates references to the rows of b instead of copying
them. There's nothing wrong with that, but the fact remains that a and
b are different in a very significant way.

I agree with Alex that checking for this type of inequality is not a
trivial programming exercise. It requires (at least) a parallel
recursion that counts references within the containers being compared.
At the same time, I know that much harder programming problems have
been solved. Where I disagree with Alex is in the labeling of the
existing behavior as 'natural'.

Alternatively, it might make sense to disallow == for containers by
raising a TypeError although that would eliminate a largely useful
feature.

Realistically, I know that Python is unlikely to adopt either
alternative. It would probably break a lot of existing code. My
point in the original post was to raise what I felt was a useful topic
for discussion and to help others avoid a pitfall that cost me a couple
of hours of head-scratching.

By the way, I've been programming professionally for over 25 years and
have used at least 30 different languages. During the past few years,
Python has become my language of choice for almost everything because
it helps me deliver more productivity and value to my clients.

Cheers,
Mike
Tim Peters wrote:
Think about a simpler case:

a = [1]
b = a
assert(a == b)
a.remove(1)
b.remove(1)

Oops. The last line dies with an exception, despite that a==b at the
third statement and that ".remove(1)" is applied to both a and b.


Jun 1 '06 #10

Yes. You stated it quite precisely. I believe l1==l2 should always
return True and l1==l3 should always be False (unless l3 is reassigned
as l3=l1). Your idea of a separate operator for 'all elements have
numerically equal values at the moment of comparison' is a good one.
For want of a better name, it could be called DeepCopyEquality(a,b) and
would be equivalent to a byte-by-byte comparison of two distinct
regions in memory created by deep copies of a and b.
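As an aside, one crude approximation of such a structure-sensitive comparison already exists by accident: pickle's memo emits a back-reference the second time it meets the same object, so the byte stream records aliasing structure. This is an observed property used here for illustration, not a documented guarantee of the pickle format, and `structure_sensitive_eq` is a hypothetical name:

```python
import pickle

def structure_sensitive_eq(a, b):
    # Lists that differ only in their aliasing pattern generally
    # serialize to different byte strings, because a shared sub-object
    # pickles as a memo reference the second time around.
    return pickle.dumps(a) == pickle.dumps(b)

x = [1]
assert [x, x] == [[1], [1]]                       # plain ==: equal
assert not structure_sensitive_eq([x, x], [[1], [1]])
assert structure_sensitive_eq([[1], [1]], [[1], [1]])
```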

Cheers,
Mike

Jim Segrave wrote:
In article <11*********************@c74g2000cwc.googlegroups.com>,
<mi*************@gmail.com> wrote:
...

I think you end up with a totally unwieldy definition of '==' for
containers, since you have to check for _identical_ aliasing to
whatever depth it might occur, and, if you insist on equality
modeling the physical world, two lists can only be equal if:

for each corresponding element in the two lists, either the element is
a reference to the same underlying object or the corresponding
elements are references to objects which do not have and never will
have other references bound to them.

For example:

ra = ['a', 'a']
rb = ['b', 'b']

l1 = [ra, rb]
l2 = [ra, rb]

This will be equal by your definition and will preserve equality over
identical operations on l1 and l2

l3 = [['a', 'a'], ['b', 'b']]

This list will be identical, under your definition, so long as we
don't have anyone doing anything to the references ra or rb. Your
equality test has to claim that l1 and l3 are not equal, since ra
could be changed and that's not an operation on l1 or l3

This also leaves out the complexity of analysiing nested structures -
if you have a tuple, containing tuples which contain lists, then are
those top level tuples 'equal' if there are aliases in the lists? How
many levels deep should an equality test go?

Does the more common test, to see if the elements of a sequence are
identical at the time of comparision need a new operator or hand
coding, since most of the time programmers aren't interested in future
equality or not of the structures.

--
Jim Segrave (je*@jes-2.demon.nl)


Jun 1 '06 #11

mi*************@gmail.com wrote:
Hi Alex,
With all due respect to your well-deserved standing in the Python
community, I'm not convinced that equality shouldn't imply invariance
under identical operations.

Perhaps the most fundamental notion is mathematics is that the left and
right sides of an equation remain identical after any operation applied
to both sides. Our experience of the physical world is similar. If I
make identical modifications to the engines of two identical
automobiles, I expect the difference in performance to be identical.
If my expectation is met, I would assert that either the two vehicles
were not identical to begin with or that my modifications were not
performed identically.


But programming is not mathematics and assignment is not an equation.
How about this:

In [1]: a=3.0

In [2]: b=3

In [3]: a==b
Out[3]: True

In [4]: a/2 == b/2
Out[4]: False
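The transcript above reflects Python 2, where `/` between two ints truncates (3/2 == 1). A rough sketch of the same surprise in modern terms, spelling the old integer division as `//`:

```python
a, b = 3.0, 3
assert a == b            # equal values...
# Under Python 2, b/2 was integer division; write it as b//2 today:
assert a / 2 == 1.5
assert b // 2 == 1
assert a / 2 != b // 2   # ...yet "the same" operation drives them apart
```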

Kent
Jun 1 '06 #12

In article <11*********************@i39g2000cwa.googlegroups.com>,
<mi*************@gmail.com> wrote:
Yes. You stated it quite precisely. I believe l1==l2 should always
return True and l1==l3 should always be False. (unless l3 is reassigned
as l3=l1). Your idea of a separate operator for 'all elements have
numerically equal values at the moment of comparision' is a good one.
For want of a better name, it could be called DeepCopyEquality(a,b) and
would be equivalent to a byte-by-byte comparison of two distinct
regions in memory created by a deep copies of a and b.


The operator which works at the moment of comparison is already there
- that's what == does.
If you really think there's a need for a comparison which includes
dealing with aliasing, then it seems to me a python module with a
set of functions for comparisons would make more sense.

--
Jim Segrave (je*@jes-2.demon.nl)

Jun 1 '06 #13

Considering the number of new programmers who get bit by automatic
coercion, I wish Dennis Ritchie had made some different choices when he
designed C. But then I doubt he ever dreamed it would become so wildly
successful.

Being a curmudgeon purist I'd actually prefer it if Python raised a
TypeError on float vs integer comparisons.

Cheers,
Mike

Kent Johnson wrote:
mi*************@gmail.com wrote:
...


But programming is not mathematics and assignment is not an equation.
How about this:

In [1]: a=3.0

In [2]: b=3

In [3]: a==b
Out[3]: True

In [4]: a/2 == b/2
Out[4]: False

Kent


Jun 1 '06 #14

Yes (unless I was testing the assertion that the second envelope did
not contain counterfeits of the first)

Scott David Daniels wrote:
Would you say that envelope containing five $100 bills is equal to
an envelope containing five $100 bills with different serial numbers?


Jun 1 '06 #15

mi*************@gmail.com wrote:
With all due respect to your well-deserved standing in the Python
community, I'm not convinced that equality shouldn't imply invariance
under identical operations.


Do you really want

2 == 2.0

to be False?

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
Without victory there is no survival.
-- Winston Churchill
Jun 1 '06 #16

On Thursday, June 1, 2006, at 18:00, mi*************@gmail.com wrote:
Perhaps the most fundamental notion is mathematics is that the left and
right sides of an equation remain identical after any operation applied
to both sides.


IMHO, you are not aware that the '=' symbol of mathematics exists in Python:
it's the 'is' operator.

a is b

and then, whatever you do with a (or b), a is b remains True.

THIS is the meaning of expr1 = expr2, but in computer science this is not as
important as it is in pure logic (most languages do not even provide an 'is'
operator).

--
_____________

Maric Michaud
_____________

Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
Jun 1 '06 #17

Truthfully, I wouldn't mind it at all. In Python, I frequently write
things like
i == int(f)
or vice versa just to avoid subtle bugs that sometimes creep in when
later modifications to code change the original assumptions.

When working in C, I always set the compiler for maximum warnings and
do my damndest to make them all go away. In the long run, time spent on
rigorous coding always repays itself with interest in time saved
debugging.

Mike

Erik Max Francis wrote:
mi*************@gmail.com wrote:
With all due respect to your well-deserved standing in the Python
community, I'm not convinced that equality shouldn't imply invariance
under identical operations.


Do you really want

2 == 2.0

to be False?



Jun 1 '06 #18

On Thu, 01 Jun 2006 13:40:34 -0700
mi*************@gmail.com wrote:

#> Scott David Daniels wrote:
#> > Would you say that envelope containing five $100 bills is equal to
#> > an envelope containing five $100 bills with different serial numbers?

#> Yes (unless I was testing the assertion that the second envelope did
#> not contain counterfeits of the first)

So, what if Bank of America later decided that bills with serial
numbers containing "7" are no longer valid?

In other words, *if* you assume equality must be preserved by future
modifications, then no two different (modifiable) objects can ever be
really equal.

--
Best wishes,
Slawomir Nowaczyk
( Sl***************@cs.lth.se )

I believe that math illiteracy affects 7 out of every 5 people.

Jun 1 '06 #19

I believe that 'is' tests equality of reference, such that

>>> a = range(1,3)
>>> b = range(1,3)
>>> a is b
False

The 'is' operator tells you whether a and b refer to the same object.
What I've been discussing is whether == should test for "structural"
equality so that a and b remain equivalent under parallel mutations
(and also under single mutations to common references).

Cheers,
Mike

Maric Michaud wrote:
On Thursday, June 1, 2006, at 18:00, mi*************@gmail.com wrote:
Perhaps the most fundamental notion is mathematics is that the left and
right sides of an equation remain identical after any operation applied
to both sides.


IMHO, you are not aware that the '=' symbol of mathematics exists in python,
it's the 'is' assertion.

a is b
and then, do what you want with a (or b), a is b remains True.

THIS is the meaning of expr1 = expr2, but in computer science, this is not as
important as it is in pure logic (most languages do not even provide the 'is'
assertion).



Jun 1 '06 #20

On Thu, 01 Jun 2006 15:12:23 -0700
mi*************@gmail.com wrote:

#> I believe that 'is' tests equality of reference, such that
#>
#> >>> a = range(1,3)
#> >>> b = range(1,3)
#> >>> a is b
#> False
#>
#> The 'is' operator tells you whether a and b refer to the same object.
#> What I've been discussing is whether == should test for "structural"
#> equality so that a and b remain equivalent under parallel mutations
#> (and also under single mutations to common references)

What does "parallel mutations" mean? In particular, what should be the
results of each of the following three comparisons:

x, y, z = [1],[1],[1]
a, b = [x,y], [y,z]
c, d = [[1],[1]], [[1],[1]]
a == b
c == d
a[0].remove(1)
b[0].remove(1)
a == b
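For reference, here is how that snippet actually evaluates under Python's current semantics, with the results inlined as assertions:

```python
x, y, z = [1], [1], [1]
a, b = [x, y], [y, z]
c, d = [[1], [1]], [[1], [1]]

assert a == b and c == d     # today, both comparisons are True
a[0].remove(1)               # a[0] is x: x becomes []
b[0].remove(1)               # b[0] is y: y becomes [] - and y is a[1]!
assert a == [[], []]
assert b == [[], [1]]
assert a != b                # "identical" operations, unequal outcome
```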

So, do I understand correctly that you would like the first comparison
(a==b) to return "False" and the second comparison (c==d) to return
"True"?

--
Best wishes,
Slawomir Nowaczyk
( Sl***************@cs.lth.se )

Living on Earth may be expensive, but it includes
an annual free trip around the Sun.

Jun 1 '06 #21

<mi*************@gmail.com> wrote:
...
I agree with Alex that checking for this type of inequality is not a
trivial programming exercise. It requires (at least) a parallel
I'm not asking for ANY programming: I'm asking for a *straightforward
operational definition*. If the concept which you hanker after is NOT
subject to a straightforward operational definition, then I would rule
out that said concept could POSSIBLY be "natural".
Alternatively, it might make sense to disallow == for containers by
raising a TypeError although that would eliminate a largely useful
feature.


This may be the best example I've ever seen of Emerson's well-known
quote about foolish consistency -- except that I don't think this
behavior would be "consistent" (either wisely or foolishly) with
anything except a vague handwaving set of constraints whose
"naturalness" (assuming it's unfeasible to provide a good and
straightforward operational definition) is out of the question.
Alex
Jun 2 '06 #22

<mi*************@gmail.com> wrote:
Hi Alex,
With all due respect to your well-deserved standing in the Python
community, I'm not convinced that equality shouldn't imply invariance
under identical operations.
So, why aren't you satisfying my request? Provide a simple concrete
definition of what your idea of equality WOULD behave like. I notice
that your lack of response stands out like a sore thumb -- all you're
providing is a set of constraints you desire and a collection of
ill-founded analogies and handwaving. Traditional mathematics does not
support the concept of "change", nor the distinction between equality
and identity; the "real world" has no way to define what modifications
are "identical" except by their effects (if the results differ, either
the original equality was ill-posited or the modifications were not
"identical"). But the real world DOES have the concept of "performing
exactly the same sequence of operational steps", and, by THAT definition
of "equal modifications", then your assertion:
make identical modifications to the engines of two identical
automobiles, I expect the difference in performance to be identical.


is ill-founded -- or, rather, your *expectation* may be ill-founded.

Take two systems of any significant complexity that are similar enough
to be called "identical" by ALL observers (because trying to ascertain
the differences, if any, would inevitably perturb the systems
irretrievably by Heisenberg's effect -- i.e., there are no OBSERVABLE
differences, which by Occam's Razor requires you to posit the systems
are equal, because you cannot prove otherwise -- and entities must not
be multiplied beyond necessity, so supposing that "observably equal"
systems are indeed equal is Occam-compliant).

Now, perform "identical" (ditto) modifications: in the real world, due
to quantum effects, there WILL be sub-observable differences in what
you're doing to the first one and to the second one. If the systems are
unstable to start with, they may well amplify those differences to
observable proportions -- and there you are: the effect of the "equal"
change on "equal" system may easily become observably unequal.
Philosophically, you may classify this as an "observation" of both
systems, which reasoning backwards lead you to posit that either the
systems were NOT equal to start with or the modifications weren't...
that is, IF you also posit determinism, which, as well we know, is an
unwarrantedly strong hypothesis for systems in which the differences at
quantum level matter. Feel free to follow Einstein (and diverse
light-years away from the last few decades of physics) in positing that
there MUST exist "hidden variables" (unobservable except maybe in
destructive, irreversible ways) explaining the difference -- I'll stick
with the mainstream of physics and claim your expectation was badly
founded to start with.

I can debate epistemology with the best, but this is not really the
proper forum for this -- starting with the crucial distinction, what it
means, in mathematics OR in the real world, to state that two systems
are "equal but NOT identical"? In the end, such debates tend to prove
rather futile and unproductive, however.

In the world of programming languages, we cut through the chase by
requesting *operational* (Brouwer-ian, mathematically speaking)
definitions. Provide the *operational* definition of how you WANT
equality checking to work, contrast it with my simple two-lines one, and
THEN we can have a meaningful debate of which one is the correct one to
use in the core of a programming language that has the (blessing and
curse of) mutable data objects...
Alex
Jun 2 '06 #23

Slawomir Nowaczyk <sl*******************@student.lu.se> wrote:
On Thu, 01 Jun 2006 15:12:23 -0700
mi*************@gmail.com wrote:

#> I believe that 'is' tests equality of reference, such that
#>
#> >>> a = range(1,3)
#> >>> b = range(1,3)
#> >>> a is b
#> False
#>
#> The 'is' operator tells you whether a and b refer to the same object.
#> What I've been discussing is whether == should test for "structural"
#> equality so that a and b remain equivalent under parallel mutations
#> (and also under single mutations to common references)

What does "parallel mutations" mean? In particular, what should be the
results of each of the following three comparisons:

x, y, z = [1],[1],[1]
a, b = [x,y], [y,z]
c, d = [[1],[1]], [[1],[1]]
a == b
c == d
a[0].remove(1)
b[0].remove(1)
a == b

So, do I understand correctly that you would like first comparison
(a==b) to return "False" and second comparison (c==d) to return
"True"?


I sure hope not, since, e.g.:

ridiculous = c[0]

is not a "mutation" (so equality should still hold, right?), and then it
becomes weird to claim that

ridiculous.append('bah, humbug!')

is a "nonparallel mutation to" c and/or d.
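Running that line of thought as code makes the problem concrete: c == d breaks without c ever being named again after the reference is taken.

```python
c, d = [[1], [1]], [[1], [1]]
assert c == d
ridiculous = c[0]                 # merely taking a reference: no mutation
ridiculous.append('bah, humbug!') # an "outside" append...
assert c == [[1, 'bah, humbug!'], [1]]
assert c != d                     # ...yet it destroys c == d
```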

In fact, I'm starting to wonder if by Michaels' requirement ANY
non-*IDENTICAL* containers (with non-identical mutable items) could EVER
be deemed "equal". If he's arguing that "==" should mean exactly the
same as "is", that's even crazier than I had gauged so far.

But of course, since Michaels still refuses to provide simple,
straightforward operational definitions of what it IS that he wants, all
of this remains vague and ill-defined. See *WHY* it's so important to
provide precision rather than just the handwaving he's given so far?
Alex
Jun 2 '06 #24

Slawomir Nowaczyk <sl*******************@student.lu.se> wrote:
On Thu, 01 Jun 2006 13:40:34 -0700
mi*************@gmail.com wrote:

#> Scott David Daniels wrote:
#> > Would you say that envelope containing five $100 bills is equal to
#> > an envelope containing five $100 bills with different serial numbers?

#> Yes (unless I was testing the assertion that the second envelope did
#> not contain counterfeits of the first)

So, what if Bank of America later decided that bills with serial
numbers containing "7" are no longer valid?
Then Wachovia would no doubt be happy to take my business away from
BoA;-).

I suspect you believe BoA is some kind of "official" body -- it isn't,
just like Deutsche Bank is not one in Germany (rather, Bundesbank is).

Just to share some tidbits (about which, as an Italian now living
between San Francisco and San Jose, I'm sort of proud of...!-)...:

Bank of America is a private bank, founded in San Francisco more than
100 years ago by an Italian-American guy (Amadeo Giannini, born in San
Jose, CA, but to Italian-born parents) as "Bank of Italy", then renamed
in 1930 in part because the Italian State bank "Banca d'Italia"
objected. It rose to prominence right after the SF earthquake of 100
years ago, by opening and staffing a temporary branch to ensure
depositors could access their money when they most needed it, while most
other banks were staying closed.

In other words, *if* you assume equality must be preserved by future
modifications, then no two different (modifiable) objects can ever be
really equal.


Yes, apart from the (slight and understandable!) mistake about BoA's
role, this is an excellent example. Here, a global change (to the rule
about what banknotes are "equal" to each other, by making some of them
invalid and thus unequal to others) perturbs Michaels' desired "strong
equality definition" -- to preserve it, equality must degenerate to
identity. A Python example would be a change to the default encoding
(not officially supported but achievable through a reload(sys), hint;-)
which could easily make a bytestring equal, or not, to a Unicode string!
Alex
Jun 2 '06 #25

P: n/a
On Friday 02 June 2006 00:12, mi*************@gmail.com wrote:
I believe that 'is' tests equality of reference, such that
a = range(1,3)
b = range(1,3)
a is b

False

The 'is' operator tells you whether a and b refer to the same object.


Yeah! That's it. And you proposed a definition of identity:
for all operators op, op(a) == op(b) => a == b
This is of poor use in real life, where two things are never identical, just
comparable.
What I've been discussing is whether == should test for "structural"
equality so that a and b remain equivalent under parallel mutations
(and also under single mutations to common references)


So you wanted a comparison operator for two sequences defined like this:

seq1 == seq2 => for all e in seq1, seq2[seq1.index(e)] *is* e

!!! This would not be very useful nor consistent, I guess, and I prefer the
one used in Python:

seq1 == seq2 => for all e in seq1, seq2[seq1.index(e)] == e
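The two candidate definitions Maric contrasts can be written out directly (the helper names here are hypothetical; the second one matches what Python's == actually does for sequences):

```python
def eq_by_identity(seq1, seq2):
    # seq1 "==" seq2 iff corresponding elements are the *same object*
    return len(seq1) == len(seq2) and all(
        e1 is e2 for e1, e2 in zip(seq1, seq2))

def eq_by_value(seq1, seq2):
    # seq1 == seq2 iff corresponding elements compare equal -- Python's rule
    return len(seq1) == len(seq2) and all(
        e1 == e2 for e1, e2 in zip(seq1, seq2))

r = [1, 2]
assert eq_by_value([r, r], [[1, 2], [1, 2]])         # like Python's ==
assert not eq_by_identity([r, r], [[1, 2], [1, 2]])  # identity version differs
assert eq_by_identity([r, r], [r, r])                # same objects: both agree
```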

--
_____________

Maric Michaud
_____________

Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097
Jun 2 '06 #26

P: n/a
On Thu, 01 Jun 2006 19:16:16 -0700
al***@mac.com (Alex Martelli) wrote:

#> > What does "parallel mutations" mean? In particular, what should be the
#> > results of each of the following three comparisons:
#> >
#> > x, y, z = [1],[1],[1]
#> > a, b = [x,y], [y,z]
#> > c, d = [[1],[1]], [[1],[1]]
#> > a == b
#> > c == d
#> > a[0].remove(1)
#> > b[0].remove(1)
#> > a == b
#> >
#> > So, do I understand correctly that you would like the first comparison
#> > (a==b) to return "False" and the second comparison (c==d) to return
#> > "True"?
#>
#> I sure hope not,

So do I, but that's how I understood Michaels' words...

#> In fact, I'm starting to wonder if by Michaels' requirement ANY
#> non-*IDENTICAL* containers (with non-identical mutable items) could
#> EVER be deemed "equal". If he's arguing that "==" should mean
#> exactly the same as "is", that's even crazier than I had gauged so
#> far.

I think he explicitly said that "is" doesn't fulfill his requirements
either... but then, I am not sure as I do not understand what his
requirements actually are (they seem to make some sense for immutable
objects, but how should they generalise to mutable stuff I have no
idea).

PS. Thanks for explanation about Bank of America: I had no clue how it
works in reality, it just had a good name ;)

--
Best wishes,
Slawomir Nowaczyk
( Sl***************@cs.lth.se )

Java is clearly an example of a MOP (money-oriented programming)
-- Alexander Stepanov

Jun 2 '06 #27

P: n/a
In article <1h**************************@mac.com>,
Alex Martelli <al***@mac.com> wrote:

Just to share some tidbits (about which, as an Italian now living
between San Francisco and San Jose, I'm sort of proud of...!-)...:

Bank of America is a private bank, founded in San Francisco more than
100 years ago by an Italian-American guy (Amadeo Giannini, born in San
Jose, CA, but to Italian-born parents) as "Bank of Italy", then renamed
in 1930 in part because the Italian State bank "Banca d'Italia"
objected. It rose to prominence right after the SF earthquake of 100
years ago, by opening and staffing a temporary branch to ensure
depositors could access their money when they most needed it, while most
other banks were staying closed.


Except, of course, that BofA doesn't exist anymore. Oh, the *name*
does, but what's now called BofA is simply the current name of the bank
that acquired BofA.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there." --Steve Gonedes
Jun 2 '06 #28

P: n/a

"Aahz" <aa**@pythoncraft.com> wrote in message
news:e5**********@panix3.panix.com...
Except, of course, that BofA doesn't exist anymore. Oh, the *name*
does, but what's now called BofA is simply the current name of the bank
that acquired BofA.


In Pythonese, they performed

SomeBank.extend(BofA)
BofA = SomeBank
del SomeBank

so that id(BofA) is now what id(SomeBank) was, not what was id(the BofA I
grew up with). The name was definitely part of the acquisition value.

;-)

OT, but not completely irrelevant to a discussion of names, ids, and
values.

Terry Jan Reedy

Jun 2 '06 #29

P: n/a
Alex Martelli wrote:
Slawomir Nowaczyk <sl*******************@student.lu.se> wrote:
On Thu, 01 Jun 2006 13:40:34 -0700 mi*************@gmail.com wrote:

#> Scott David Daniels wrote:
#> > Would you say that an envelope containing five $100 bills is equal to
#> > an envelope containing five $100 bills with different serial numbers?

#> Yes (unless I was testing the assertion that the second envelope did
#> not contain counterfeits of the first)

So, what if Bank of America later decided that bills with serial
numbers containing "7" are no longer valid?


Then Wachovia would no doubt be happy to take my business away from
BoA;-).

I suspect you believe BoA is some kind of "official" body -- it
isn't, just like Deutsche Bank is not one in Germany (rather,
Bundesbank is).


Yeah, it's a funny mistake, but what he meant is: what if the
US Treasury Department declared bills with serial numbers
containing "7" invalid. That would indeed complete the analogy.

And it's a sharp example -- because money is conceived of as
fungible, one $100 bill is as good as another, so two $100 bills
compare as equal, whether they are identical or not.

Of course, the counter argument is that it's not unlike counting
a reflection of a $100 bill as another $100 and concluding that
you have $200 (you need two mirrors to double your money,
technically ;-)).

I don't think there's any way to make it "more logical" -- it's
going to break somewhere no matter what assumption you
make, so you just have to learn what's really going on in order
to avoid confusion.

Cheers,
Terry

--
Terry Hancock (ha*****@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

Jun 2 '06 #30

P: n/a
Alex Martelli wrote:
to be called "identical" by ALL observers (because trying to
ascertain the differences, if any, would inevitably perturb the
systems irretrievably by Heisenberg's effect


Not to detract from your point, but the "Heisenberg effect", if
you mean the "Heisenberg uncertainty principle" is much more
fundamental (and quantumly "spooky") than this.

You are merely talking about the observer disturbing the system
by the process of observation, which is a common problem, but
has nothing to do with Heisenberg, and AFAIK, doesn't really
have a name. It's a normal application of classical physics.

I'm sorry to nitpick, it's just that it's one of those misconceptions
that never wants to die, like thinking that gravity is caused by
magnetism or the Earth's rotation, or that you can "get too close
and be 'sucked in' by a strong gravity field", or that things are
"weightless" in orbit, because they're "too far from the Earth's
gravity".

Cheers,
Terry

--
Terry Hancock (ha*****@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

Jun 2 '06 #31

P: n/a
Perhaps a little background to my original post will defuse some of the
controversy. While working during an airline flight, I ran into an
unexpected outcome from using the * replication operator to initialize
an array of lists. When I modified a single element of the array an
entire column changed. Having no reference books or internet access
available, I tried to understand what was going on by creating some
small arrays on the command line to see if there was a difference
between explicit initialization and initialization with range() and the
* operator.
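The trap Mike describes is easy to reproduce: replication with * stores repeated references to the one row object, while explicit (or comprehension-based) initialization creates distinct rows. A sketch:

```python
row = [0, 0, 0]
aliased = [row] * 2                      # both "rows" are the SAME list
fresh = [[0, 0, 0] for _ in range(2)]    # two distinct lists

assert aliased == fresh                  # == sees no difference ...

aliased[0][0] = 99                       # ... but mutation reveals it:
fresh[0][0] = 99

assert aliased == [[99, 0, 0], [99, 0, 0]]   # whole "column" changed
assert fresh == [[99, 0, 0], [0, 0, 0]]      # only one element changed
```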

The arrays looked identical when printed and a == b returned True. Yet
the arrays were clearly not equivalent because mutating the
corresponding elements produced different outcomes. I put the problem
aside until the next day when I looked at it some more and created
the example script I posted. Just as I was about to hit the Send
button, I realized that the * operator must have been creating
references instead of copies. And then I appended the now much debated
opinion that == should have detected the difference.

(As an aside, may I point out that Python In A Nutshell states on page
46 "The result of S*n or n*S is the concatenation of n copies of S". It
might be well to put a warning in a future edition that this is not
strictly the case.)

My viewpoint is that of a working professional software consultant.
I'm essentially a pragmatist with no strong 'religious' views about
languages and methodologies. As I noted in an earlier reply, I don't
realistically expect Python to change the behavior of the == operator.
I do think that a problem arose when it was adopted from C and extended
to allow comparison of containers. In C, you can use it to compare
integers, floats, and pointers and everyone understands that p==q does
not imply *p == *q. Moreover, compilers issue warnings about
comparisons between different types.

Basically, I'm looking for simple diagnostic tools that make it easy to
understand what's really going on when code produces an unexpected
result. A 'strengthened equivalence' operator, to use your terminology
would have been useful to me.

As to constructing pseudocode for such an operator, I've appended a
working script below. The counterexamples and questions from Slawomir,
Maric, and Jim were really useful in sharpening my thinking about the
matter. I'm sure there are many ways to break it. For example, tuples
have no index method, so one would have to be written. Still, I hope it
will serve to move the discussion beyond terms like 'crazy' and
'handwaving' and 'ill-founded'. I haven't used such pejoratives in
any of my posts and would appreciate the same courtesy.

Cheers,
Mike

'''
StrongEquality -- a first cut at the definition proposed by M. Ellis.
Author: Michael F. Ellis, Ellis & Grant, Inc.
'''

def indices(item, seq):
    '''Utility function that returns a list of indices where item occurs
    in seq'''
    result = []
    for i in xrange(len(seq)):
        try:
            result.append(i + seq[i:].index(item))
        except ValueError:
            return result
    return result

def StrongEquality(a, b):
    '''True if a and b are numerically and "structurally" equal'''
    if a is b: return True
    if a != b: return False
    ## At this point we know a and b have the same length and
    ## evaluate numerically equivalent. We now need to figure out
    ## whether there are any references to identical objects in
    ## non-corresponding positions of a & b (per Slawomir's example).
    ## We also need to inspect a and b for non-matching patterns of
    ## identical references (per my example).
    ida = []; idb = []
    for i in xrange(len(a)):
        if a[i] is b[i]:
            continue
        if isinstance(a[i], (int, float, str)) and \
           isinstance(b[i], (int, float, str)):
            continue  ## we already know they're numerically equal

        ida.append(id(a[i]))
        idb.append(id(b[i]))
        ## We know that ida[n] is not idb[n] for all n because we omitted
        ## all cases where a[i] is b[i]. Therefore Slawomir's example is
        ## detected if any id appears in both lists.
        for n in ida:
            if n in idb: return False
        ## Next we test for my example. I'm sure this can be coded
        ## more elegantly ...
        for j in xrange(len(ida)):
            if indices(ida[j], ida) != indices(idb[j], idb): return False
        ## Lastly, recurse ...
        if not StrongEquality(a[i], b[i]): return False

    return True

if __name__ == '__main__':
    ## Rudimentary test cases
    assert StrongEquality(1, 1)
    assert not StrongEquality(0, 1)

    ## Slawomir's example
    x, y, z = [1], [1], [1]
    a, b = [x, y], [y, z]
    c, d = [[1], [1]], [[1], [1]]
    assert StrongEquality(c, d)
    assert a == b
    assert not StrongEquality(a, b)

    ## My example
    a = [[[1, 2], [1, 2]], [[1, 2], [1, 2]]]
    b = [[range(1, 3)] * 2] * 2
    assert a == b
    assert not StrongEquality(a, b)

    print "All tests ok."


Alex Martelli wrote:
<mi*************@gmail.com> wrote:
Hi Alex,
With all due respect to your well-deserved standing in the Python
community, I'm not convinced that equality shouldn't imply invariance
under identical operations.


So, why aren't you satisfying my request? Provide a simple concrete
definition of what your idea of equality WOULD behave like. I notice
that your lack of response stands out like a sore thumb -- all you're
providing is a set of constraints you desire and a collection of
ill-founded analogies and handwaving. Traditional mathematics does not
support the concept of "change", nor the distinction between equality
and identity; the "real world" has no way to define what modifications
are "identical" except by their effects (if the results differ, either
the original equality was ill-posited or the modifications were not
"identical"). But the real world DOES have the concept of "performing
exactly the same sequence of operational steps", and, by THAT definition
of "equal modifications", then your assertion:
make identical modifications to the engines of two identical
automobiles, I expect the difference in performance to be identical.


is ill-founded -- or, rather, your *expectation* may be ill-founded.

Take two systems of any significant complexity that are similar enough
to be called "identical" by ALL observers (because trying to ascertain
the differences, if any, would inevitably perturb the systems
irretrievably by Heisenberg's effect -- i.e., there are no OBSERVABLE
differences, which by Occam's Razor requires you to posit the systems
are equal, because you cannot prove otherwise -- and entities must not
be multiplied beyond necessity, so supposing that "observably equal"
systems are indeed equal is Occam-compliant).

Now, perform "identical" (ditto) modifications: in the real world, due
to quantum effects, there WILL be sub-observable differences in what
you're doing to the first one and to the second one. If the systems are
unstable to start with, they may well amplify those differences to
observable proportions -- and there you are: the effect of the "equal"
change on "equal" system may easily become observably unequal.
Philosophically, you may classify this as an "observation" of both
systems, which reasoning backwards lead you to posit that either the
systems were NOT equal to start with or the modifications weren't...
that is, IF you also posit determinism, which, as well we know, is an
unwarrantedly strong hypothesis for systems in which the differences at
quantum level matter. Feel free to follow Einstein (and diverse
light-years away from the last few decades of physics) in positing that
there MUST exist "hidden variables" (unobservable except maybe in
destructive, irreversible ways) explaining the difference -- I'll stick
with the mainstream of physics and claim your expectation was badly
founded to start with.

I can debate epistemology with the best, but this is not really the
proper forum for this -- starting with the crucial distinction, what it
means, in mathematics OR in the real world, to state that two systems
are "equal but NOT identical"? In the end, such debates tend to prove
rather futile and unproductive, however.

In the world of programming languages, we cut through the chase by
requesting *operational* (Brouwer-ian, mathematically speaking)
definitions. Provide the *operational* definition of how you WANT
equality checking to work, contrast it with my simple two-lines one, and
THEN we can have a meaningful debate of which one is the correct one to
use in the core of a programming language that has the (blessing and
curse of) mutable data objects...
Alex


Jun 2 '06 #32

P: n/a

<mi*************@gmail.com> wrote in message
news:11**********************@g10g2000cwb.googlegr oups.com...
(As an aside, may I point out that Python In A Nutshell states on page
46 "The result of S*n or n*S is the concatenation of n copies of S".
It would be more exact to say that S*n is [] extended with S n times,
which makes it clear that 0*S == S*0 == [] and which avoids the apparently
misleading word 'copy'. I presume the C implementation is the equivalent
of

def __mul__(inlist, times):
    result = []
    for i in range(times):
        result.extend(inlist)
    return result
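Assuming list arguments, Terry's presumed equivalent can be checked against the real operator; extend-in-a-loop reproduces both the resulting values and the crucial sharing of inner objects:

```python
def mul(inlist, times):
    # Terry's presumed equivalent of list.__mul__
    result = []
    for _ in range(times):
        result.extend(inlist)
    return result

s = [[1, 2]]
assert mul(s, 3) == s * 3 == [[1, 2], [1, 2], [1, 2]]
assert mul(s, 0) == s * 0 == []
# No copying of elements: every slot holds the *identical* inner list.
assert all(inner is s[0] for inner in s * 3)
assert all(inner is s[0] for inner in mul(s, 3))
```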

Or one could say that the result *is the same as* (not *is*) the
concatenation of n *shallow* copies of S. 'Shallow' means that each copy
of S would have the same *content* (at the id level) as S, so that the
result would contain the content of S n times, which is to say, when len(S)
== 1, n slots with each slot bound to the *identical* content of S.

When the content is immutable, the identity does not matter. When it
*is* mutable, it does.
It might be well to put a warning in a future edition that this is not
strictly the case.)


Did you mean anything else by 'not strictly the case'?

Terry Jan Reedy


Jun 3 '06 #33

P: n/a
<mi*************@gmail.com> wrote:
...
(As an aside, may I point out that Python In A Nutshell states on page
46 "The result of S*n or n*S is the concatenation of n copies of S". It
might be well to put a warning in a future edition that this is not
strictly the case.)
Can you give me an example where, say, for a sequence S,

x = S * 3

is not structurally the same as

x = copy.copy(S) + copy.copy(S) + copy.copy(S)

....? That is, where the "* 3" on a sequence is NOT the concatenation of
three copies (ordinary copies, of course!) of that sequence? I don't
think you can... and I can't repeatedly explain or point to the
distinction between normal, ordinary, shallow copies on one side, and
"deep copies" on the other, every single time in which that distinction
MIGHT be relevant (because some reader might not be aware of it); such
endless repetition would bloat the Nutshell totally away from its role
as a CONCISE desktop reference, and seriously hamper its usefulness
(particularly by DESTROYING any trace of usefulness for anybody who's
finally *GOT* this crucial bit, but not just in that way).
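Alex's challenge can be verified mechanically: for a list S, the two spellings produce equal results and, in both, the elements are ordinary (shallow) references to the very same inner objects:

```python
import copy

S = [[1, 2]]
x1 = S * 3
x2 = copy.copy(S) + copy.copy(S) + copy.copy(S)

assert x1 == x2 == [[1, 2], [1, 2], [1, 2]]
# Shallow either way: all six slots reference the single inner list.
assert all(e is S[0] for e in x1 + x2)
```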

languages and methodologies. As I noted in an earlier reply, I don't
realistically expect Python to change the behavior of the == operator.
Then you might have avoided trying to convince anybody, or even trying
to IMPLY, that in an ideal version of Python == *SHOULD* behave your way
-- Python's semantics *ARE* entirely up for rediscussion at the moment,
with an eye on the future "Python 3000" release, so this is one of the
very rare periods of the history of the language where backwards
incompatibility of a potential change is _NOT_ a blocking point.

By asserting that your version of == would be "more natural", and trying
to defend that assertion by vague handwaving references to maths and
"real world", you managed to entirely shift MY mindstate (and possibly
that of several other discussants) into one of total and absolute
opposition to the proposal -- having by now spent considerable time and
energy pondering and debating the issue, I am now entirely convinced
that a language with such an == operator instead of Python's current one
would be such a total, unadulterated disaster that I would refuse to use
that language, no matter what other "good" features it might present to
me. I've walked away from great jobs, just because they would have
required me to use some technology I just could not stand, more than
once already in my life: and this IS how strongly (and negatively) I
feel about your claim that, for built-in ==, your semantics would be in
any way preferable to Python's.

By managing to thus focus my mindset (and make me spend my energy and
time) in opposition to your claims of "more natural", you have at least
managed to ensure that I will not now lend any scrap of help or support
to your debugging needs. If you were as pragmatic as you claim to be,
this kind of consideration WOULD have weighed highly in your choices.

I.e., if you had WANTED to attract any such support and help, a
completely different attitude than that "most natural" claim would have
been ENORMOUSLY more productive -- and your continuing attempts to
debate that issue aren't helping AT ALL either:
I do think that a problem arose when it was adopted from C and extended
to allow comparison of containers. In C, you can use it to compare
integers, floats, and pointers and everyone understands that p==q does
not imply *p == *q.
If that is so, then everyone is utterly, totally, entirely, horribly
*WRONG*, because, in C, p==q ***DOES*** imply *p == *q (whenever p --
and by consequence q, given equality -- may legitimately be
dereferenced: p == q == 0 does not imply anything about *p and/or *q,
which may produce random results, crash the process, or whatever -- of
course).

You no doubt meant to say something entirely different from what you
ACTUALLY said, but I respectfully suggest you spare your breath rather
than keep trying to defend an indefensible position.

I do NOT agree, and I cannot imagine any state of the world that would
get me to agree, with your claim that "a problem arose" by allowing
equality comparison of containers in Python (many other languages allow
such comparisons, BTW; I would consider it a horrible wart if a language
claiming to be "higher level" did NOT). That you're "supporting" (HA!)
your absurd claim with an equally absurd (and obviously, wholly false,
unfounded, and misplaced) claim about C pointers doesn't "help", of
course, but even perfectly accurate claims about C (or machine code, or
Cobol, or RPG...) would be pretty much uninteresting and irrelevant.
Moreover, compilers issue warnings about
comparisons between different types.
Some do, some don't, depending on the case -- e.g., I do not believe
that even with -Wall (or the equivalent setting) any C compiler whines
about EQUALITY comparisons of signed and unsigned integers (as well they
shouldn't, of course)!

And, just as of course, this umpteenth side-path you're introducing has
really nothing to do with the case, since when you're comparing two
lists-containing-lists which are equal by Python's rules but according
to your original claims about what's "more natural" ``should not'' be,
you ARE anyway comparing objects of equal types.
Basically, I'm looking for simple diagnostic tools that make it easy to
understand what's really going on when code produces an unexpected
result. A 'strengthened equivalence' operator, to use your terminology
would have been useful to me.
A *FUNCTION* performing such checks in a debugging and diagnostics
package would have been -- and if you hadn't pushed me to spend so much
time and energy defending Python's design choices against your claims
that other choices would be "more natural", you might have gotten help
and support in developing it. But you chose to make ill-founded claims
of "more natural", and therefore you got a flamewar instead: your
choice.
will serve to move the discussion beyond terms like 'crazy' and
'handwaving' and 'ill-founded'. I haven't used such pejoratives in
any of my posts and would appreciate the same courtesy.


You claimed that Python's semantics are "contrary to one of my
fundamental expectations", "an oddity", resulting only (you said in your
second post) from a "performance and memory optimization", and tried to
justify this severe criticism of Python by vague (i.e., "handwaving")
appeals to analogies with "maths" and "the real world".

I do not believe that such scathing criticism had any sound foundations,
nor that calling it "ill-founded" is anything but a fittingly accurate
description: and a language where the == operator DID satisfy the
constraints you claimed to desire for *THE EQUALITY OPERATOR ITSELF* (as
opposed to, for some helper/checker function in a separate module for
checking and debugging) would be a crazy one indeed. If my opinions
which, I believe, accurately reflect the facts of the case, sound
"pejorative" to you, well, that's not a matter of choice of words on my
part, as much as of your choice of what to express in the first place.

Take your recent claim that in C "everyone understands that p==q does
not imply *p == *q"; which of the many ("pejorative") adjectives I
tagged your assertion with do you think are inaccurate or inappropriate?
I called it wrong, absurd, false, unfounded, and misplaced, and seasoned
the mix with a choice of adverbs including "utterly" and "horribly". It
appears to me that each of these adjectives and adverbs is appropriate
and accurate (though it's repetitious on my part to use them all, that
repetition does convey the intensity of my opinions in the matter).

This is a factual issue where it's easy to defend strongly held opinions
(which may be checked against "facts"); in matters of "should" (what
semantics "should" a certain language construct HAVE, in order to be
most natural, simplest, and most useful -- quite apart from any issues
of optimization) such ease is, alas, not given... but the fact that
verifying clashing opinions about what "should" be the case is way harder
than opinions easily checkable against "facts", does not mean that the
"should"'s (which are even more important, potentially, in their effect
on future language design!) are any less important -- on the contrary.
I do not believe I am going to follow this thread any more; I wish you
best of luck in your future endeavors -- and, if you can get back to
being the pragmatist that you claim to be, perhaps in the future you may
choose to express your debugging needs and desiderata in ways that
dispose the experts on some given technology to help and support you,
rather than to fight against the damage which, they opine, it would
cause to the future of that technology if certain ("crazy") suggestions
were to be part of it.
Alex
Jun 3 '06 #34

P: n/a
Terry Reedy <tj*****@udel.edu> wrote:
<mi*************@gmail.com> wrote in message
news:11**********************@g10g2000cwb.googlegr oups.com...
(As an aside, may I point out that Python In A Nutshell states on page
46 "The result of S*n or n*S is the concatenation of n copies of S".
It would be more exact to say that S*n is [] extended with S n times,
which makes it clear that 0*S == S*0 == [] and which avoids the apparently
misleading word 'copy'. I presume the C implementation is the equivalent


Considering that the very next (and final) sentence in that same
paragraph is "If n is zero or less than zero, the result is an empty
sequence of the same type as S", I don't think there's anything
misleading in the quoted sentence. Moreover, since the paragraph is
about sequences, not just lists, it *WOULD* be horribly wrong to use the
phrasing you suggest: "bah!"*3 is NOT a list, it's EXACTLY the
concatenation of three copies of that string -- no more, no less.
Or one could say that the result *is the same as* (not *is*) the
concatenation of n *shallow* copies of S.

I find this distinction, in this context, to be empty padding, with zero
added value on ANY plane -- including the plane of "pedantry";-).

'Shallow' means that each copy ...

I do not think it would be good to introduce the concept of "shallow" at
a point in the text which is talking about ALL sequences -- including
ones, such as strings, for which it just does not apply.

But, thanks for the suggestions, anyway!
Alex
Jun 3 '06 #35

P: n/a
Alex Martelli wrote:

[snip]

Can somebody please shut down this bot? I think it's running out of
control. It seems to be unable to understand that "don't be evil" might
be good when you're small (at least it's not very bad) but that it
becomes distinctly evil when you're big.

What is good when you're big? I really don't know and I think there's
even not many other people who do. But simply forbidding things that are
not precisely definable -the way mathematicians have been doing before
physicists shook them out of it- seems to do more harm than good.

In my opinion it looks like there is a path from rigid rule adherence to
slowly accepting more doubt and inconsistencies -because we're all
adults here- and this has something to do with letting go of things like
childish adherence to static typing and confusion between equality and
identity.

Let me qualify that last paragraph before anyone concludes I have become
dysfunctional too and will lead everyone to their destruction.

There seem to always be certain unclear parts in a programming language
and people are constantly trying out new structures in order to map some
new territory. I remember sets, generators and metaclasses. Only after
people notice problems (don't modify the thing you're iterating over)
ways are found to solve them (you can if you put everything back just at
the right moment) and finally these ways are condensed into officially
endorsed coding practices. Now we're struggling with immutability and
sequences. They're not a problem if you know what you're doing, but what
exactly is it that those who know what they're doing do? It indicates
that maybe it's the birth of a new language construct.

But why should it stop there? I expect a certain openness and
willingness to discuss controversial matters from a community even if it
were only to educate newcomers. But it might be the case that such
willingness to accept doubt, without it turning into actively seeking it
-that seems to be foolish, but who am I to judge even that- is what
makes it possible to develop higher language and cognitive structures.

Anton

'even if it means turning into lisp before moving on'
Jun 3 '06 #36

P: n/a
<an*************@gmail.com> wrote:
Alex Martelli wrote:

[snip]

Can somebody please shut down this bot? I think it's running out of


Much as you might love for somebody to "shut me down", that
(unfortunately, no doubt, from your viewpoint) is quite unlikely to
happen. Although "making predictions is always difficult, especially
about the future", the most likely course of events is that I shall
continue for a while to survive, probably in tolerable health.

BTW, and for your information: in modern Western society, publicly
wishing for some adversary's death is often considered somewhat uncouth
and rude, except perhaps in times of war or similar extremities. Being
aware of such social niceties and conventions may help: even were I to
wish that somebody crushed you underfoot like the worm you are, I would
avoid expressing such a wish in public, and likely in private too.
Alex
Jun 3 '06 #37

P: n/a
Alex Martelli wrote:
<an*************@gmail.com> wrote:

Can somebody please shut down this bot? I think it's running out of


Much as you might love for somebody to "shut me down", that
(unfortunately, no doubt, from your viewpoint) is quite unlikely to
happen. Although "making predictions is always difficult, especially
about the future", the most likely course of events is that I shall
continue for a while to survive, probably in tolerable health.


You've got that completely wrong. I was not trying to kill you but I was
trying to revive you. A process that is not evolving is dead. Stopping
it frees up valuable resources that enable it to become alive again.

Anton

'being alive is being mutable'
Jun 3 '06 #38

P: n/a
Hey Alex, lighten up! Python is a programming language -- not your
family, religion, or civil rights.
Cheers,
Mike

Alex Martelli wrote:
<mi*************@gmail.com> wrote:
...
(As an aside, may I point out that Python In A Nutshell states on page
46 "The result of S*n or n*S is the concatenation of n copies of S". It
might be well to put a warning in a future edition that this is not
strictly the case.)


Can you give me an example where, say, for a sequence S,

x = S * 3

is not structurally the same as

x = copy.copy(S) + copy.copy(S) + copy.copy(S)

...? That is, where the "* 3" on a sequence is NOT the concatenation of
three copies (ordinary copies, of course!) of that sequence? I don't
think you can... and I can't repeatedly explain or point to the
distinction between normal, ordinary, shallow copies on one side, and
"deep copies" on the other, every single time in which that distinction
MIGHT be relevant (because some reader might not be aware of it); such
endless repetition would bloat the Nutshell totally away from its role
as a CONCISE desktop reference, and seriously hamper its usefulness
(particularly by DESTROYING any trace of usefulness for anybody who's
finally *GOT* this crucial bit, but not just in that way).
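Alex's claim is easy to check directly. A quick sketch (plain Python, nothing assumed beyond the standard `copy` module):

```python
import copy

S = [[1, 2]]
a = S * 3
b = copy.copy(S) + copy.copy(S) + copy.copy(S)

assert a == b                    # structurally the same
assert a[0] is b[0] is S[0]      # all three share ONE inner list
a[0].append(99)                  # mutate the shared inner list...
assert b == [[1, 2, 99]] * 3     # ...and the change shows up in b too
```

Both forms produce three references to the *same* inner list, which is exactly what "ordinary (shallow) copies" means here.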

languages and methodologies. As I noted in an earlier reply, I don't
realistically expect Python to change the behavior of the == operator.


Then you might have avoided trying to convince anybody, or even trying
to IMPLY, that in an ideal version of Python == *SHOULD* behave your way
-- Python's semantics *ARE* entirely up for rediscussion at the moment,
with an eye on the future "Python 3000" release, so this is one of the
very rare periods of the history of the language where backwards
incompatibility of a potential change is _NOT_ a blocking point.

By asserting that your version of == would be "more natural", and trying
to defend that assertion by vague handwaving references to maths and
"real world", you managed to entirely shift MY mindstate (and possibly
that of several other discussants) into one of total and absolute
opposition to the proposal -- having by now spent considerable time and
energy pondering and debating the issue, I am now entirely convinced
that a language with such an == operator instead of Python's current one
would be such a total, unadulterated disaster that I would refuse to use
that language, no matter what other "good" features it might present to
me. I've walked away from great jobs, just because they would have
required me to use some technology I just could not stand, more than
once already in my life: and this IS how strongly (and negatively) I
feel about your claim that, for built-in ==, your semantics would be in
any way preferable to Python's.

By managing to thus focus my mindset (and make me spend my energy and
time) in opposition to your claims of "more natural", you have at least
managed to ensure that I will not now lend any scrap of help or support
to your debugging needs. If you were as pragmatic as you claim to be,
this kind of consideration WOULD have weighed highly in your choices.

I.e., if you had WANTED to attract any such support and help, a
completely different attitude than that "most natural" claim would have
been ENORMOUSLY more productive -- and your continuing attempts to
debate that issue aren't helping AT ALL either:
I do think that a problem arose when it was adopted from C and extended
to allow comparison of containers. In C, you can use it to compare
integers, floats, and pointers and everyone understands that p==q does
not imply *p == *q.


If that is so, then everyone is utterly, totally, entirely, horribly
*WRONG*, because, in C, p==q ***DOES*** imply *p == *q (whenever p --
and by consequence q, given equality -- may legitimately be
dereferenced: p == q == 0 does not imply anything about *p and/or *q,
which may produce random results, crash the process, or whatever -- of
course).
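The Python analogue of this point about C pointers is that identity implies equality (leaving aside pathological values such as NaN), while the converse does not hold; a quick sketch:

```python
p = [1, 2, 3]
q = p                  # q now refers to the same object as p
assert p is q          # analogue of C's p == q (same "address")
assert p == q          # ...so equality follows

r = [1, 2, 3]
assert p == r          # equal values,
assert p is not r      # but distinct objects: the converse doesn't hold
```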

You no doubt meant to say something entirely different from what you
ACTUALLY said, but I respectfully suggest you spare your breath rather
than keep trying to defend an indefensible position.

I do NOT agree, and I cannot imagine any state of the world that would
get me to agree, with your claim that "a problem arose" by allowing
equality comparison of containers in Python (many other languages allow
such comparisons, BTW; I would consider it a horrible wart if a language
claiming to be "higher level" did NOT). That you're "supporting" (HA!)
your absurd claim with an equally absurd (and obviously, wholly false,
unfounded, and misplaced) claim about C pointers doesn't "help", of
course, but even perfectly accurate claims about C (or machine code, or
Cobol, or RPG...) would be pretty much uninteresting and irrelevant.
Moreover, compilers issue warnings about
comparisons between different types.


Some do, some don't, depending on the case -- e.g., I do not believe
that even with -Wall (or the equivalent setting) any C compiler whines
about EQUALITY comparisons of signed and unsigned integers (as well they
shouldn't, of course)!

And, just as of course, this umpteenth side-path you're introducing has
really nothing to do with the case, since when you're comparing two
lists-containing-lists which are equal by Python's rules but according
to your original claims about what's "more natural" ``should not'' be,
you ARE anyway comparing object of equal types.
Basically, I'm looking for simple diagnostic tools that make it easy to
understand what's really going on when code produces an unexpected
result. A 'strengthened equivalence' operator, to use your terminology,
would have been useful to me.


A *FUNCTION* performing such checks in a debugging and diagnostics
package would have been -- and if you hadn't pushed me to spend so much
time and energy defending Python's design choices against your claims
that other choices would be "more natural", you might have gotten help
and support in developing it. But you chose to make ill-founded claims
of "more natural", and therefore you got a flamewar instead: your
choice.
will serve to move the discussion beyond terms like 'crazy' and
'handwaving' and 'ill-founded'. I haven't used such pejoratives in
any of my posts and would appreciate the same courtesy.


You claimed that Python's semantics are "contrary to one of my
fundamental expectations", "an oddity", resulting only (you said in your
second post) from a "performance and memory optimization", and tried to
justify this severe criticism of Python by vague (i.e., "handwaving")
appeals to analogies with "maths" and "the real world".

I do not believe that such scathing criticism had any sound foundations,
nor that calling it "ill-founded" is anything but a fittingly accurate
description: and a language where the == operator DID satisfy the
constraints you claimed to desire for *THE EQUALITY OPERATOR ITSELF* (as
opposed to, for some helper/checker function in a separate module for
checking and debugging) would be a crazy one indeed. If my opinions,
which I believe accurately reflect the facts of the case, sound
"pejorative" to you, well, that's not a matter of choice of words on my
part, as much as of your choice of what to express in the first place.

Take your recent claim that in C "everyone understands that p==q does
not imply *p == *q"; which of the many ("pejorative") adjectives I
tagged your assertion with do you think are inaccurate or inappropriate?
I called it wrong, absurd, false, unfounded, and misplaced, and seasoned
the mix with a choice of adverbs including "utterly" and "horribly". It
appears to me that each of these adjectives and adverbs is appropriate
and accurate (though it's repetitious on my part to use them all, that
repetition does convey the intensity of my opinions in the matter).

This is a factual issue where it's easy to defend strongly held opinions
(which may be checked against "facts"); in matters of "should" (what
semantics "should" a certain language construct HAVE, in order to be
most natural, simplest, and most useful -- quite apart from any issues
of optimization) such ease is, alas, not given... but the fact that
verifying clashing opinions about what "should" be the case is way harder
than opinions easily checkable against "facts", does not mean that the
"should"'s (which are even more important, potentially, in their effect
on future language design!) are any less important -- on the contrary.
I do not believe I am going to follow this thread any more; I wish you
best of luck in your future endeavors -- and, if you can get back to
being the pragmatist that you claim to be, perhaps in the future you may
choose to express your debugging needs and desiderata in ways that
dispose the experts on some given technology to help and support you,
rather than to fight against the damage which, they opine, it would
cause to the future of that technology if certain ("crazy") suggestions
were to be part of it.
Alex


Jun 3 '06 #39

P: n/a

"Alex Martelli" <al***@mac.com> wrote in message
news:1hgbjx1.1gn7haipx7x5N%al***@mac.com...
Terry Reedy <tj*****@udel.edu> wrote:
<mi*************@gmail.com> wrote in message
news:11**********************@g10g2000cwb.googlegroups.com...
> (As an aside, may I point out that Python In A Nutshell states on page
> 46 "The result of S*n or n*S is the concatenation of n copies of S".

Alex, in responding to this sentence lifted out of context by Michael, I am
responding more to him and his claim of 'not strictly the case' than to
you. What I wrote is also more a draft of a paragraph for a book of mine
(of no competition to yours) than a suggestion for a future edition of your
book.
It would be more exact to say that S*n is [] extended with S n times,
which makes it clear that 0*S == S*0 == []
and which avoids the apparently misleading word 'copy'.
A**b is often explained as 'A multiplied by itself b times' or some such.
It is more exact to say it is '1 multiplied by A b times',
which makes it clear that A**0, including 0**0, is 1.
Given the number of people, including Michael, who have posted puzzlement
based on their confusion as to what does and does not get copied, I don't
think it unfair to call 'copy' 'apparently misleading'. My intention in
this phrase is to suggest that one who misunderstands 'copy' in this
context will be misled, while one who understands just what is copied and
what is not copied will not.

The point of my code snippet was to explain/illustrate what is copied.
While it will (obviously) only execute as Python code for lists, I
believe the algorithm is generic if the initial assignment and list.extend
method are suitably interpreted.
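Terry's snippet itself is not reproduced in this post; a hypothetical reconstruction of his "[] extended with S n times" formulation might look like this:

```python
def repeat(S, n):
    """S * n expressed as: start from [] and extend with S, n times."""
    result = []
    for _ in range(n):
        result.extend(S)   # appends references to S's items, not copies
    return result

inner = [1, 2]
assert repeat([inner], 3) == [inner] * 3
assert repeat([inner], 0) == []                      # 0*S == S*0 == []
assert all(x is inner for x in repeat([inner], 3))   # items are shared
```

The wording pays off: `extend` visibly adds references to S's items, so no reader is tempted to expect copies of the items themselves.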

Of course, I presume that in the CPython code, the actual initialization is
more like (length one blank value) * (n*len(S)) followed by n slice
assignments, but the Python code for this would still be list specific and
would also be more complex, without much aiding comprehension of the
result.

Strings are a somewhat special case since the characters in the string are
not wrapped as Python objects unless and until extracted from the string.
So one cannot as easily talk about the object(s) contained in the sequence.
Considering that the very next (and final) sentence in that same
paragraph is "If n is zero or less than zero, the result is an empty
sequence of the same type as S", I don't think there's anything
misleading in the quoted sentence.
In its original context, with respect to that issue, no. But as I said, I
was responding to Michael's removed-from-context quotation and his claim
about the need for a warning. I wonder if you read down to the end, where
I asked him whether I had missed anything he might find fault with, before
you responded.
Moreover, since the paragraph is about sequences, not just lists,
This *thread* is about repetition of lists, and in particular, a list of
one (or more) lists, and the consequences of the mutability of the inner
lists, and that is the context in which I wrote.
it *WOULD* be horribly wrong to use the phrasing you suggest:
In a generic context, [] would obviously have to be replaced by 'null
sequence of the type of S', which in many cases is type(S)(). And 'extend'
would have to be interpreted generically, as something the interpreter
would do behind the scenes in type(S).__new__, although as I said before,
I don't think that is exactly what it does do.
"bah!"*3 is NOT a list,
Duh.
it's EXACTLY the
concatenation of three copies of that string -- no more, no less.
Depends what one means by 'copy'. See below for your alternate wording.
Or one could say that the result *is the same as* (not *is*) the


I find this distinction, in this context, to be empty padding, with zero
added value on ANY plane -- including the plane of "pedantry";-).


Perhaps you should tone down the overwrought emotionalism and take a look
in a mirror. In *your* response to Michael you made the *same*
distinction:
Can you give me an example where, say, for a sequence S,
x = S * 3
is not structurally the same as
x = copy.copy(S) + copy.copy(S) + copy.copy(S)
I agree that adding 'structurally' makes the 'padding' even better.
concatenation of n *shallow* copies of S. 'Shallow' means that each
copy

I do not think it would be good to introduce the concept of "shallow" at
a point in the text which is talking about ALL sequences -- including
ones, such as strings, for which it just does not apply.
Again, I was not suggesting that you do so. Adding 'shallow' was
intentionally pedantic for Michael's 'benefit'. To be clear, I was *NOT*
supporting his warning suggestion.
But, thanks for the suggestions, anyway!


We both have the goal of explaining Python better so beginners hit fewer
bumps on the road.

Terry Jan Reedy

Jun 3 '06 #40

P: n/a
Terry Reedy <tj*****@udel.edu> wrote:
it's EXACTLY the
concatenation of three copies of that string -- no more, no less.


Depends what one means by 'copy'. See below for your alternate wording.


Please give me a reasonable definition of the unadorned word "copy"
which would make this statement false. (And, just to forestall one
possible attempt: no, I cannot agree that a ``deepcopy'' is a reasonable
definition of the _unadorned_ word "copy").

Or one could say that the result *is the same as* (not *is*) the


I find this distinction, in this context, to be empty padding, with zero
added value on ANY plane -- including the plane of "pedantry";-).


Perhaps you should tone down the overwrought emotionalism and take a look
in a mirror. In *your* response to Michael you made the *same*
distinction:


I did not *DRAW* any distinction (as you did with your parenthetical
note "(not *is*)", emphasis and all) -- rather, I used one of many
reasonably interchangeable ways to word a concept. (In the Nutshell, I
always deliberately try to pick the shortest and most concise way; in my
more spontaneous writing, I strongly tend to wilder exuberance.)

So, having deeply delved into the mirror, I still fail to find any
validity in your criticism: the phrases "S*n is the same as the
concatenation of" and "S*n *is* the concatenation of", taken as definitions
of what S*n means, are such that your emphatic distinction has no added
value whatsoever -- I stand by this assertion and fail to see in it any
emotionalism, overwrought or otherwise. Care to _defend_ your
criticism, with some _objective_ explanation of why that parenthetical
was warranted (particularly the emphasis within it)? Or would you
rather continue the personal attacks against me and the unproven
accusations of "overwrought emotionalism" in particular?
Alex
Jun 4 '06 #41

P: n/a
On Sat, 03 Jun 2006 17:03:00 -0700
al***@mac.com (Alex Martelli) wrote:

#> Terry Reedy <tj*****@udel.edu> wrote:
#>
#> > Depends what one means by 'copy'. See below for your alternate wording.
#>
#> Please give me a reasonable definition of the unadorned word "copy"
#> which would make this statement false. (And, just to forestall one
#> possible attempt: no, I cannot agree that a ``deepcopy'' is a reasonable
#> definition of the _unadorned_ word "copy").

Actually, when *I* think about the word "copy", I have in mind what
happens with files... and to me the semantics of []*3 is more like
symbolic linking, not copying. While I, personally, understand the
sentence in question "The result of S*n or n*S is the concatenation of
n copies of S" correctly, I *do* see how it might be misunderstood by
others.

Not that I know how to express it better :-(

--
Best wishes,
Slawomir Nowaczyk
( Sl***************@cs.lth.se )

Don't wake me for the end of the world unless it has very good
special effects -- Roger Zelazny

Jun 5 '06 #42

P: n/a
mi*************@gmail.com wrote:
Yes. You stated it quite precisely. I believe l1==l2 should always
return True and l1==l3 should always be False (unless l3 is reassigned
as l3=l1). Your idea of a separate operator for 'all elements have
numerically equal values at the moment of comparison' is a good one.
For want of a better name, it could be called DeepCopyEquality(a,b) and
would be equivalent to a byte-by-byte comparison of two distinct
regions in memory created by deep copies of a and b.
I suspect the word you are grasping for is "isomorphic", since your
complaint appears to be that two non-isomorphic lists can compare as equal.
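As a sketch of what such an isomorphism test could look like (hypothetical code, handling only nested lists; tuples, dicts, and other containers would need the same treatment):

```python
def isomorphic(a, b, amap=None, bmap=None):
    """Equality that also requires matching internal sharing structure.

    amap/bmap record which sub-list of `a` has been paired with which
    sub-list of `b`; shared references must correspond on both sides.
    """
    if amap is None:
        amap, bmap = {}, {}
    if isinstance(a, list) and isinstance(b, list):
        if id(a) in amap or id(b) in bmap:
            return amap.get(id(a)) == id(b) and bmap.get(id(b)) == id(a)
        amap[id(a)] = id(b)
        bmap[id(b)] = id(a)
        return len(a) == len(b) and all(
            isomorphic(x, y, amap, bmap) for x, y in zip(a, b))
    return a == b

row = [1, 2]
shared = [row, row]             # ONE row, referenced twice (like [row]*2)
separate = [[1, 2], [1, 2]]     # two independent rows

assert shared == separate                 # plain == calls them equal
assert not isomorphic(shared, separate)   # but sharing structure differs
assert isomorphic(shared, [row, row])
```

This is exactly the check that would have distinguished the original poster's `a` and `b` before any mutation took place.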

He then later said: Considering the number of new programmers who get bit by automatic
coercion, I wish Dennis Ritchie had made some different choices when he
designed C. But then I doubt he ever dreamed it would become so wildly
successful.
So he designed it badly because he didn't anticipate its ubiquity? Give
me a break. Every language designer regrets some of their decisions:
it's almost a given for design of any kind, since one makes compromises
without realising that they are compromises until usage reveals them.
Being a curmudgeon purist I'd actually prefer it if Python raised a
TypeError on float vs integer comparisons.

That's taking purity just a little too far for my taste.

Looking at how this thread developed (if such an unedifying process can
be described as "development") I hope you'll phrase future posts a
little more carefully.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Love me, love my blog http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Jun 5 '06 #43

P: n/a
Slawomir Nowaczyk <sl*******************@student.lu.se> wrote:
On Sat, 03 Jun 2006 17:03:00 -0700
al***@mac.com (Alex Martelli) wrote:

#> Terry Reedy <tj*****@udel.edu> wrote:
#>
#> > Depends what one means by 'copy'. See below for your alternate wording.
#>
#> Please give me a reasonable definition of the unadorned word "copy"
#> which would make this statement false. (And, just to forestall one
#> possible attempt: no, I cannot agree that a ``deepcopy'' is a reasonable
#> definition of the _unadorned_ word "copy").

Actually, when *I* think about the word "copy", I have in mind what
happens with files...
Sure! In particular, to reproduce the concept of an object containing
references to other objects, imagine that the file is a .tar, .dmg (on
MacOSX), or other kind of "container"/"archive" kind of file, and one of
the items in its contents is a symbolic link.

When you copy the archive file, both the original and the copy now
contain symbolic links to the SAME target.
and to me the semantics of []*3 is more like
symbolic linking, not copying.
??? an _assignment_ in Python can be said to be "like symbolic linking,
not copying" -- and that's a crucial part of Python's semantics, of
course. But the Sequence*N operation has nothing to do with creating
symbolic links; it may (of course) _copy_ such links, if they're present
in the sequence, just like copying a container file copies symbolic
links it may contain -- in each case one can end up with symbolic links
to the same target. The analogy between files and Python objects is of
course not exact (partly because filesystems normally distinguish
between directories, which only contain references to files, and
"ordinary" files, that don't -- the Composite Design Pattern proceeds,
often fruitfully, by abstracting away this distinction), but in as far
as it holds, it points roughly in the right direction. (GNU's cp offers
a -R switch to explicitly perform a "recursive copy", with other
switches such as -H, -L, -P to affect what happens in that case to
symlinks -- this -R is more akin to "deep copying", except that Python's
deepcopy is simpler, and always "recurses to the hilt").
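The distinction Alex draws, assignment as "linking" versus copying that merely duplicates the links, can be shown in a few lines (a minimal sketch):

```python
a = [[1, 2], [1, 2]]

b = a        # "symbolic link": b is just another name for the same list
c = a[:]     # a copy: a NEW outer list... whose items are still shared

assert b is a
assert c is not a
assert c[0] is a[0]                # the copy references the same inner lists

a[0].append(3)                     # mutate a shared inner list
assert c == [[1, 2, 3], [1, 2]]    # the change shows through the copy
```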
While I, personally, understand the
sentence in question "The result of S*n or n*S is the concatenation of
n copies of S" correctly, I *do* see how it might be misunderstood by
others.

Not that I know how to express it better :-(


I do find it interesting that the concept of "copy" causes such trouble,
even when the analogies used (filecopying and symlinks) would tend to
point in the right direction. The "real-life" analogy of copying a list
also points in the right direction: if I have a list of the writings I
hold on my library's top shelf, when I copy the list, both the original
and the copy point to _exactly the same_ writings -- not to separate
copies of each. If I asked an employee to "please copy this list" I
would be astonished if he or she by default also copied each other
writing that is an _item_ on the list -- surely I would expect such huge
amounts of work to happen only when explicitly requested, and the
default meaning of "copy" to be therefore ``shallow'' (wherever
applicable, that is, when copying something that references other
things). It would be interesting to study the root of the confusion in
more detail (although it's unlikely, as you indicate, that such study
would yield a different definition, concise and simple enough to be used
in a concise reference work, it would still help authors of tutorials).
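The library-list analogy maps directly onto the `copy` module (a minimal sketch, using made-up book titles):

```python
import copy

shelf = ['Ulysses', 'Dubliners']   # the "writings" themselves
catalogue = [shelf]                # a list that merely references them

shallow = copy.copy(catalogue)     # copy the list itself
deep = copy.deepcopy(catalogue)    # also copy everything it references

assert shallow[0] is shelf         # shallow copy: the same writings
assert deep[0] is not shelf        # deep copy: fresh copies of the writings
assert deep == catalogue           # ...equal in value nonetheless
```

The shallow behaviour is the default throughout Python (`copy.copy`, `list(x)`, `x[:]`, and `S*n` alike); the "huge amount of work" of duplicating every referenced object happens only on explicit request via `copy.deepcopy`.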
Alex

Jun 5 '06 #44
