Some thougts on cartesian products

In Python, it is possible to multiply a string with a number:

"hello"*3 'hellohellohello'

However, you can't multiply a string with another string:
'hello'*'world' Traceback (most recent call last):
File "<interactive input>", line 1, in ?
TypeError: can't multiply sequence by non-int

Sometimes I was missing such a feature.
What I expect as the result is the "cartesian product" of the strings.

Here is a simple implementation of new list and string objects that
explains what I mean. It also implements powers of lists and strings:

class plist(list):
"""list with cartesian product list"""

def __mul__(self, other):
if isinstance(other, pstr):
return plist([s+o for s in self for o in other])
if hasattr(other, '__getitem__'):
return plist([[s, o] for s in self for o in other])
else:
return list(self)*other

def __pow__(self, other):
if isinstance(other, int) and other > 0:
if other == 1:
return self
return self * self**(other-1)

class pstr(str):
"""str with cartesian product list"""

def __mul__(self, other):
if hasattr(other, '__getitem__'):
return plist([s+o for s in self for o in other])
else:
return str(self)*other

def __pow__(self, other):
if isinstance(other, int) and other > 0:
if other == 1:
return self
return self * self**(other-1)

With these new strings you can do the following:
pstr("ab")*pstr("cd")*pstr("ef") ['ace', 'acf', 'ade', 'adf', 'bce', 'bcf', 'bde', 'bdf']
print pstr("abcdefgh")*pstr("12345678") ['a1', 'a2', ..., 'a8', 'b1', 'b2', ..., 'b8',
...., ..., ..., 'h1', 'h2', ..., 'h8']
print pstr("ACGU")**3

['AAA', 'AAC', 'AAG', 'AAU', 'ACA', 'ACC', 'ACG', ...,
...., 'UGC', 'UGG', 'UGU', 'UUA', 'UUC', 'UUG', 'UUU']

I think this can be quite handy at times and save some typing.

If Python strings had this ability, you could even outdo the
117 byte solution in the recent shortest Python coding contest
(http://www.pycontest.net), as follows:

j=''.join;seven_seg=lambda x:j(j(\
(' |'*'_ '*' |')[ord('|B¬@z”(ÀD°'[int(d)])%e]\
for d in x)+'\n'for e in(4,9,7))

This has only 110 bytes.

Or you could write a simple password cracker like that:

def crack(crypted, alphabet):
for passwd in alphabet**4:
if crypt(passwd, crypted[:2]) == crypted:
return passwd

And call it with alphabet = string.lowercase, for example.

Cartesian products may be generally interesting for iterables:

def imul(iterable1, iterable2):
"""cartesian product of two iterables"""
for object1 in iterable1:
for object2 in iterable2:
if isinstance(object1, basestring) and \
isinstance(object2, basestring):
yield object1 + object2
else:
yield (object1, object2)

def ipow(iterable, number):
"""cartesian product power of an iterable"""
if number == 1:
for object in iterable:
yield object
elif number > 1:
for object1 in iterable:
for object2 in ipow(iterable, number-1):
yield object1 + object2

class istr(str):
"""str with iterable cartesian product"""

def __mul__(self, other):
if isinstance(other, str):
return imul(self, other)
else:
return str(self)*other

def __pow__(self, other):
return ipow(self, other)

I was wondering if similar functionality could be added in some way to
Python. I noticed that Python has a lot of aggregate functions that can
"reduce" given collection objects (like reduce, filter, min, max, sum,
hash) and functions that keep the same size (like map, sort or zip), but
few functions that can "inflate" the given objects (like range and
enumerate). I know, such functions are dangerous because they also
inflate time and memory consumed by the program. Still, sometimes they
can make sense, whenever you for some reason simply *have* to walk
through all the combinations.

Jan 22 '06 #1

Subscribe Post Reply

4120

Giovanni Bajo

Christoph Zwerschke wrote:

Sometimes I was missing such a feature.
What I expect as the result is the "cartesian product" of the strings.

I've been thinking of it as well. I'd like it for lists too:

range(3)**2

[(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]

--
Giovanni Bajo

Jan 22 '06 #2

Kay Schluehr

Giovanni Bajo wrote:

Christoph Zwerschke wrote:
Sometimes I was missing such a feature.
What I expect as the result is the "cartesian product" of the strings.

I've been thinking of it as well. I'd like it for lists too:
range(3)**2

[(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]

--
Giovanni Bajo

But why isn't this interpreted as [0, 1, 4] like it is in Mathematica?

I would prefer a polymorphic distribute(*args) function ( or generator
) that acts on tuples of listlike objects of equal size. It could be
extended to distribute(*args[,default]) by a single default argument
that is inserted in the distribution table when two listlike objects
have not the same size.

Kay

Jan 22 '06 #3

Alex Martelli

Kay Schluehr <ka**********@gmx.net> wrote:
...

> range(3)**2 [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]

... But why isn't this interpreted as [0, 1, 4] like it is in Mathematica?

Since range(3)*2 is [0, 1, 2, 0, 1, 2], it would be horribly, painfully
inconsistent if **2 was interpreted as "square each item".
Alex

Jan 22 '06 #4

Some thougts on cartesian products

Similar topics