congratulations on (ostensibly) discovering the Barber's paradox (if
the village barber shaves all and only those who don't shave
themselves, who shaves the barber?
http://en.wikipedia.org/wiki/Barber_paradox) in python! :-D

as far as i can see, your complaint is not just that any string X
contains itself, but that a string X can contain another string Y (i.e.
an object of class string can contain another object of class string) -
where you take "contain", as expressed by the operator "in", to be the
set-theory operator, when in fact its meaning for strings is "has a
substring".

therefore your grudge is not just with

'a' in 'a'

but also with

'a' in 'abcd'
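
to make the difference concrete, here is a small sketch (Python 3 syntax; the variable names are made up for illustration) comparing "in" on a string with "in" on a list of the same characters:

```python
# For strings, "in" is a substring test, not an element test.
text = 'abcd'
chars = list(text)                # ['a', 'b', 'c', 'd']

single_in_text = 'a' in text      # 'a' is a substring of 'abcd'
pair_in_text = 'ab' in text       # 'ab' is a substring too
pair_in_chars = 'ab' in chars     # no list element equals 'ab'
self_in_self = text in text       # every string is a substring of itself

print(single_in_text, pair_in_text, pair_in_chars, self_in_self)
```

only the list test behaves like set membership; the string tests answer "is this a substring?".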

here is an excerpt from the reference manual:

----------------------------------------

The operators in and not in test for set membership. x in s evaluates

to true if x is a member of the set s, and false otherwise. x not in s

returns the negation of x in s. The set membership test has

traditionally been bound to sequences; an object is a member of a set

if the set is a sequence and contains an element equal to that object.

However, it is possible for an object to support membership tests

without being a sequence. In particular, dictionaries support

membership testing as a nicer way of spelling key in dict; other

mapping types may follow suit.

For the list and tuple types, x in y is true if and only if there

exists an index i such that x == y[i] is true.

For the Unicode and string types, x in y is true if and only if x is a

substring of y. An equivalent test is y.find(x) != -1. Note, x and y

need not be the same type; consequently, u'ab' in 'abc' will return

True. Empty strings are always considered to be a substring of any

other string, so "" in "abc" will return True.

----------------------------------------
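
the two rules quoted above can be written out by hand. this is only a sketch for ordinary values (the helper names in_sequence and in_string are mine, not from the manual), checked against the built-in operator:

```python
def in_sequence(x, y):
    """The list/tuple rule as quoted: some index i has x == y[i]."""
    return any(x == y[i] for i in range(len(y)))


def in_string(x, y):
    """The string rule as quoted: x is a substring of y."""
    return y.find(x) != -1


items = ['a', 'bc', 3]
# Compare the hand-written list rule with the built-in operator.
list_results = [(in_sequence(x, items), x in items) for x in ('a', 'b', 'bc', 3)]

# Compare the hand-written string rule with the built-in operator.
string_results = [(in_string(x, 'abc'), x in 'abc') for x in ('', 'a', 'bc', 'ac')]

print(list_results)
print(string_results)
```

note that 'b' is found in the string 'abc' but not in the list, which only holds 'bc' as a whole element.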

it is apparent that "in" was overridden for strings for convenience's
sake, not to get freaky with the theory of sets.

what can you do about it? well, you can check for the string type
specifically, but there are no guarantees in life: someone else can
define a new type whose "in" behaves the same way: say "interval(x,y)",
where "interval(x,y) in interval(a,b)" checks whether [x,y] is a
sub-interval of [a,b] - very intuitive - but there you have the problem
again!
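
a minimal sketch of such a type, to show how little it takes. the Interval class below is hypothetical (it is not a standard library type), and treating a plain number as a point is my own assumption:

```python
class Interval:
    """Hypothetical closed interval [lo, hi]; for illustration only."""

    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    def __contains__(self, other):
        # "in" here means "is a sub-interval of", mirroring the
        # substring meaning that "in" has for strings.
        if isinstance(other, Interval):
            return self.lo <= other.lo and other.hi <= self.hi
        # Assumption: a plain number is treated as a single point.
        return self.lo <= other <= self.hi


outer = Interval(0, 10)
inner = Interval(2, 5)

print(inner in outer)   # sub-interval test
print(outer in outer)   # an interval "contains" itself, like 'a' in 'a'
print(outer in inner)
```

with this definition every interval is "in" itself, so a check aimed only at strings would not catch it.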

or you can specifically check whether the objects are from a
"semantically supported group" of classes - but that will hamper
automatic extension when new types are introduced.
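
one way that second option might look, as a sketch in Python 3: the CONTAINER_TYPES whitelist and the flatten function are hypothetical names, and which types go on the whitelist is an assumption you would adapt to your own code.

```python
# Hypothetical whitelist: only these types are descended into.
CONTAINER_TYPES = (list, tuple)


def flatten(item):
    """Yield the atoms of a nested structure, depth first."""
    if isinstance(item, CONTAINER_TYPES):
        for child in item:
            yield from flatten(child)
    else:
        # Strings (and any type not on the whitelist) are treated as
        # atoms, so a one-character string is never iterated into itself.
        yield item


nested = ['a', ['bc', ('d', [1, 2])], 'e']
flat = list(flatten(nested))
print(flat)
```

the recursion terminates on strings because str is not on the whitelist; the price is that a new container type is silently treated as an atom until someone adds it to CONTAINER_TYPES.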

- Nas

WENDUM Denis 47.76.11 (agent) wrote:

While testing recursive algorithms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Python) characters contain themselves. See session:

>>> 'a' is 'a'
True
>>> 'a' in 'a'
True
>>> 'a' in ['a']
True
>>> ....

Because they lead to paradoxes and loops, objects which contain
themselves (and other kinds of monsters) are killed off in set theory by
the Axiom of Foundation :=)

But let's get back to more earthly matters. After googling for "Python
strings FAQ", I couldn't find any clue in a Python FAQ about why this
design choice was made, or how to avoid falling into this trap without
having to litter my code with tests for stringiness each time I process
a generic list of items.

Any hints would be appreciated.