473,327 Members | 1,896 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,327 software developers and data experts.

Characters contain themselves?


While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.See session:

system prompt% python
Python 2.3.5 (#2, Feb 9 2005, 00:38:15)
[GCC 3.3.5 (Debian 1:3.3.5-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
....
'a' is 'a' True 'a' in 'a' True 'a' in ['a'] True ....


Leading to paradoxes and loops objects which contain themselves (and
other kinds of monsters) are killed in set theory with the Axiom of
Foundation:=)

But let's go back to more earthly matters. I couldn't find any clue in a
python FAQ after having googled with the following "Python strings FAQ"
about why this design choice and how to avoid falling in this trap
without having to litter my code everywhere with tests for stringiness
each time I process a generic list of items.

Any hints would be appreciated.

Apr 7 '06 #1
14 1424
WENDUM Denis 47.76.11 (agent):
While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.


No, strings contain characters. And 'a' is a string consisting of one
character.

"The items of a string are characters. There is no separate character
type; a character is represented by a string of one item."
http://docs.python.org/ref/types.html

(One item of what type, one might ask)

--
René Pijlman
Apr 7 '06 #2
gry
In fact, not just characters, but strings contain themselves:
'abc' in 'abc' True

This is a very nice(i.e. clear and concise) shortcut for:
'the rain in spain stays mainly'.find('rain') != -1

True

Which I always found contorted and awkward.

Could you be a bit more concrete about your complaint?

-- George
[Thanks, I did enjoy looking up the Axiom of Foundation!]

Apr 7 '06 #3
Rene Pijlman <re********************@my.address.is.invalid> writes:
WENDUM Denis 47.76.11 (agent):
While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.


No, strings contain characters. And 'a' is a string consisting of one
character.

"The items of a string are characters. There is no separate character
type; a character is represented by a string of one item."
http://docs.python.org/ref/types.html

(One item of what type, one might ask)


Good point. ". . .represented by a string of length one" would be
better.
--
Mark Jackson - http://www.alumni.caltech.edu/~mjackson
An information system based on theory isolated from reality
is bound to fail. - Mitch Kabay
Apr 7 '06 #4
WENDUM Denis 47.76.11 (agent) a écrit :

While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.


There is *no* character type in Python. 'a' is a string, and of course
the string 'a' is a substring of string 'a'.
Apr 7 '06 #5
On Fri, 07 Apr 2006 15:50:53 +0200, WENDUM Denis 47.76.11 (agent) wrote:

While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.
A "character" is just a string of length 1.
[snip] Leading to paradoxes and loops objects which contain themselves (and
other kinds of monsters) are killed in set theory with the Axiom of
Foundation:=)
Let me see if I understand you... you would like behaviour like this?
'a' in 'bbabb' True 'a' in 'bbab' True 'a' in 'bab' True 'a' in 'ab' True 'a' in 'a' False # not the real result

Ain't going to happen.

But let's go back to more earthly matters. I couldn't find any clue in a
python FAQ after having googled with the following "Python strings FAQ"
about why this design choice and how to avoid falling in this trap
without having to litter my code everywhere with tests for stringiness
each time I process a generic list of items.

Any hints would be appreciated.


I think that if you want to do work with sets, use sets, not strings.

In Python 2.3, you need:
import sets


In Python 2.4, Sets (note the initial capital) are a built-in.

--
Steven.

Apr 8 '06 #6
On Fri, 07 Apr 2006 15:50:53 +0200, WENDUM Denis 47.76.11 (agent) wrote:
But let's go back to more earthly matters. I couldn't find any clue in a
python FAQ after having googled with the following "Python strings FAQ"
about why this design choice and how to avoid falling in this trap
without having to litter my code everywhere with tests for stringiness
each time I process a generic list of items.

Any hints would be appreciated.


Here is another suggestion.

If you absolutely must use strings rather than sets or lists, create a
custom string class:
class Mystr(str): .... def __contains__(self, other):
.... if self == other: return False
.... return super(Mystr, self).__contains__(other)
.... s = Mystr("hello world")
"x" in s False "w" in s True "hello" in s True "hello world" in s

False
Hope this helps.
--
Steven.

Apr 8 '06 #7

congratulations for (ostensibly) discovering the Barber's paradox (if
the village barber shaves all and only those who don't shave
tehmselves, who shaves the barber?
http://en.wikipedia.org/wiki/Barber_paradox) in python ! :-D

as far as i see it, you complaint is not just that any string X
contains itself but that string X can contain another string Y (i.e.
object of class string to contain another of class string) - where you
understand "contain" as per the operator "in" to be set-theory
operator, when in fact the meaning put for strings is instead "has a
substring".

therefore your grudge is not just with
'a' in 'a'
but also with
'a' in 'abcd'

here is excerpt from the reference manual:
----------------------------------------
The operators in and not in test for set membership. x in s evaluates
to true if x is a member of the set s, and false otherwise. x not in s
returns the negation of x in s. The set membership test has
traditionally been bound to sequences; an object is a member of a set
if the set is a sequence and contains an element equal to that object.
However, it is possible for an object to support membership tests
without being a sequence. In particular, dictionaries support
membership testing as a nicer way of spelling key in dict; other
mapping types may follow suit.

For the list and tuple types, x in y is true if and only if there
exists an index i such that x == y[i] is true.

For the Unicode and string types, x in y is true if and only if x is a
substring of y. An equivalent test is y.find(x) != -1. Note, x and y
need not be the same type; consequently, u'ab' in 'abc' will return
True. Empty strings are always considered to be a substring of any
other string, so "" in "abc" will return True.
----------------------------------------

it is apparent "in" was overriden for strings for convenience's sake,
not to get freaky on the therory of sets.

what can you do about it? well, you can check for string type
specifically but there are no guarantees in life: someone else can
define new type with "in" that behaves like that: say "interval(x,y)",
where "interval(x,y) in interval(a,b)" checks if [x,y] is a
sub-interval of [a,b] - very intuitive - but there you have the problem
again!

or you can specifically check if the objects are from a "semanthically
supported group" of classes - but that will hamper authomatic extension
by introducing new types.

- Nas

WENDUM Denis 47.76.11 (agent) wrote:
While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.See session:
>>> 'a' is 'a' True >>> 'a' in 'a' True >>> 'a' in ['a'] True >>> ....


Leading to paradoxes and loops objects which contain themselves (and
other kinds of monsters) are killed in set theory with the Axiom of
Foundation:=)

But let's go back to more earthly matters. I couldn't find any clue in a
python FAQ after having googled with the following "Python strings FAQ"
about why this design choice and how to avoid falling in this trap
without having to litter my code everywhere with tests for stringiness
each time I process a generic list of items.

Any hints would be appreciated.


Apr 8 '06 #8
WENDUM Denis 47.76.11 (agent) wrote:
While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.See session:


Strings are sequences and this is a problem for recursive functions that
process nested lists. You do need to test for stringiness somehow. IIRC
the usual solutions are

- explicit test for string: isinstance(xx, basestring)

- explicit test for list: isinstance(xx, list) - this fails for
user-defined sequences that don't subclass list

- test for __iter__ attribute. IIUC this relies on an implementation
detail that strings don't define __iter__:

In [6]: hasattr('aa', '__iter__')
Out[6]: False

In [7]: hasattr([], '__iter__')
Out[7]: True

Kent
Apr 8 '06 #9
WENDUM Denis 47.76.11 (agent) wrote:

While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves.


Note that the empty string is contained in all strings, including itself.
bool('') False '' in ''

True
Apr 10 '06 #10


gr*@ll.mit.edu wrote:
In fact, not just characters, but strings contain themselves:

'abc' in 'abc'
True

This is a very nice(i.e. clear and concise) shortcut for:

'the rain in spain stays mainly'.find('rain') != -1
True

Which I always found contorted and awkward.

Could you be a bit more concrete about your complaint?

-- George
[Thanks, I did enjoy looking up the Axiom of Foundation!] Here is the minimal context triggering an infinite descent in recursion.
From the answers I've got it seems I'll have to check if I'm iterating
on a string or on another kind of "list". python

Python 2.3.5 (#2, Feb 9 2005, 00:38:15)
[GCC 3.3.5 (Debian 1:3.3.5-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
import sys
sys.setrecursionlimit(10)
def f(List): .... try: # assuming List supports iteration
.... new=[f(item) for item in List]
.... except:# if no iteration is possible
.... new=1
.... return new
.... f([1, [2,3]]) # f substitutes each non iterable item by 1 (silly butshows the problem) [1, [1, 1]] f([1, [2.3, (5,6,7)]]) [1, [1, [1, 1, 1]]] f('bac') [[[[[[[[1]]]]]]], [[[[[[[1]]]]]]], [[[[[[[1]]]]]]]] sys.setrecursionlimit(5)
f('bac') [[[1]], [[1]], [[1]]] # each item in 'bac' is a character,

ie. a string of length one on wich one can iterate
and so triggers an infinite descent of recursion.
...

Apr 10 '06 #11
WENDUM Denis 47.76.11 (agent) wrote:
... From the answers I've got it seems I'll have to check if I'm
iterating on a string or on another kind of "list"....
import sys
sys.setrecursionlimit(10)
def f(List): ... try: # assuming List supports iteration
... new=[f(item) for item in List]
... except:# if no iteration is possible
... new=1
... return new f([1, [2,3]]) # f substitutes each non iterable item by 1 (silly
butshows the problem) [1, [1, 1]] f([1, [2.3, (5,6,7)]]) [1, [1, [1, 1, 1]]] f('bac') [[[[[[[[1]]]]]]], [[[[[[[1]]]]]]], [[[[[[[1]]]]]]]] sys.setrecursionlimit(5)
f('bac') [[[1]], [[1]], [[1]]] # each item in 'bac' is a character,

ie. a string of length one on wich one can iterate
and so triggers an infinite descent of recursion.


I'd rewrite your 'f' as follows:
def copier(List):
try: # assuming List supports iteration
new = [f(item) for item in List]
except TypeError: # Non-iterables get replaced
return '?'
except RuntimeError: # Recursion failures get shown
return 'Recursion'
return new

Note, even if you got your wish and somehow you couldn't have an issue
with characters in strings, you would have to deal with this:

simple_list = [3.1415]
container = [simple_list]
simple_list.append(container)

Your function (and mine) assumes the argument is a DAG (Directed Acyclic
Graph), but there is no such guarantee about data structures in python.

--Scott David Daniels
sc***********@acm.org
Apr 10 '06 #12
WENDUM Denis 47.76.11 (agent) wrote:
While testing recursive algoritms dealing with generic lists I stumbled
on infinite loops which were triggered by the fact that (at least for my
version of Pyton) characters contain themselves. [snip] Leading to paradoxes and loops objects which contain themselves (and
other kinds of monsters) are killed in set theory with the Axiom of
Foundation:=)


You could always use an "is-proper-subset-of" function, which is closer
to the intent of your algorithm. Using Jamitzky's very clever infix
recipe [1], you can even write it as an infix operator:

#Jamitzky's infix-operator class, abbreviated
class Infix:
def __init__(self, function):
self.function = function
def __ror__(self, other):
return Infix(lambda x, self=self, other=other:
self.function(other, x))
def __or__(self, other):
return self.function(other)
def __call__(self, value1, value2):
return self.function(value1, value2)

# define our is-proper-subset operator...
ips = Infix(lambda a, b: (a is not b) and (a in b))
print 'a' |ips| 'a' False print 'a' |ips| 'abc' True print '' |ips| ''

False

Of course, |ips| looks like Lips-L, which is odd; but you could pick a
better name. ;-)

Graham

[1] http://aspn.activestate.com/ASPN/Coo.../Recipe/384122

Apr 10 '06 #13
Graham Fawcett <gr************@gmail.com> wrote:
You could always use an "is-proper-subset-of" function, which is closer
to the intent of your algorithm. Using Jamitzky's very clever infix
recipe [1], you can even write it as an infix operator:

#Jamitzky's infix-operator class, abbreviated
class Infix:
[ ... ]

# define our is-proper-subset operator...
ips = Infix(lambda a, b: (a is not b) and (a in b))
print 'a' |ips| 'a'False print 'a' |ips| 'abc'True
Unfortunately:
print 'abc' |ips| 'abc' False print 'a'+'bc' |ips| 'abc' True

Which might not be what you want. On the other hand, it's a simple
fix:
ips = Infix(lambda a, b: (a != b) and (a in b))
print 'a' |ips| 'abc' True
print 'a'+'bc' |ips| 'abc'

False
[1] http://aspn.activestate.com/ASPN/Coo.../Recipe/384122


--
\S -- si***@chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
___ | "Frankly I have no feelings towards penguins one way or the other"
\X/ | -- Arthur C. Clarke
her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump
Apr 11 '06 #14
Sion Arrowsmith wrote:
Unfortunately:
print 'a'+'bc' |ips| 'abc' True

Which might not be what you want. On the other hand, it's a simple
fix: ips = Infix(lambda a, b: (a != b) and (a in b))
print 'a'+'bc' |ips| 'abc'


Ah, good point.

Graham

Apr 11 '06 #15

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

13
by: Xah Lee | last post by:
the Journey of Foreign Characters thru Internet Xah Lee, 20051101 There's a bunch of confusions about the display of non-ascii characters such as the bullet "•". These confusions are...
35
by: David Mathog | last post by:
Every so often one of my fgets() based programs encounters an input file containing embedded nulls. fgets is happy to read these but the embedded nulls subsequently cause problems elsewhere in...
3
by: john | last post by:
I need to produce a RTF-document which is filled with data from a database. I've created a RTF-document in WordPad (a template, so to speak) which contains 'placeholders', for example '<dd01>',...
8
by: david.lindsay.green | last post by:
Hello all, I am quite new a web scripting and making web pages in general and I have stumbled across a problem I have as yet been unable to solve. I am trying to take the contents of a textarea box...
3
by: Thomas | last post by:
Hi All, i am getting this error once i submit my page to a payment gateway and when it returns the reponse back. Redirect URI cannot contain newline characters. Description: An unhandled...
10
by: Lawrence Krubner | last post by:
Once upon a time, there were no garbage characters on this page: http://www.teamlalala.com/blog/category/css/ Now there are. For instance: The 2nd paragraph from page 114 of “The Zen Of...
4
by: Jonathan | last post by:
Hi, Can you please tell me how can I printf 'n' characters of from a char* to a file. I notice there is a snprinf, but I don't see one corresponding to fprintf? I am thinking if i can do...
0
by: AAaron123 | last post by:
Been playing with asp:changepassword and have it looking OK except that I can't elininate or change the title at the top that says "Change Your Password". It's a repeat of my pages title. ...
13
by: Liang Chen | last post by:
Hope you all had a nice weekend. I have a question that I hope someone can help me out. I want to run a Python program that uses Tkinter for the user interface (GUI). The program allows me to type...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.