By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
459,199 Members | 1,725 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 459,199 IT Pros & Developers. It's quick & easy.

Some thougts on cartesian products

P: n/a
In Python, it is possible to multiply a string with a number:
"hello"*3 'hellohellohello'

However, you can't multiply a string with another string:
'hello'*'world' Traceback (most recent call last):
File "<interactive input>", line 1, in ?
TypeError: can't multiply sequence by non-int

Sometimes I was missing such a feature.
What I expect as the result is the "cartesian product" of the strings.

Here is a simple implementation of new list and string objects that
explains what I mean. It also implements powers of lists and strings:

class plist(list):
"""list with cartesian product list"""

def __mul__(self, other):
if isinstance(other, pstr):
return plist([s+o for s in self for o in other])
if hasattr(other, '__getitem__'):
return plist([[s, o] for s in self for o in other])
else:
return list(self)*other

def __pow__(self, other):
if isinstance(other, int) and other > 0:
if other == 1:
return self
return self * self**(other-1)

class pstr(str):
"""str with cartesian product list"""

def __mul__(self, other):
if hasattr(other, '__getitem__'):
return plist([s+o for s in self for o in other])
else:
return str(self)*other

def __pow__(self, other):
if isinstance(other, int) and other > 0:
if other == 1:
return self
return self * self**(other-1)

With these new strings you can do the following:
pstr("ab")*pstr("cd")*pstr("ef") ['ace', 'acf', 'ade', 'adf', 'bce', 'bcf', 'bde', 'bdf']
print pstr("abcdefgh")*pstr("12345678") ['a1', 'a2', ..., 'a8', 'b1', 'b2', ..., 'b8',
...., ..., ..., 'h1', 'h2', ..., 'h8']
print pstr("ACGU")**3

['AAA', 'AAC', 'AAG', 'AAU', 'ACA', 'ACC', 'ACG', ...,
...., 'UGC', 'UGG', 'UGU', 'UUA', 'UUC', 'UUG', 'UUU']

I think this can be quite handy at times and save some typing.

If Python strings had this ability, you could even outdo the
117 byte solution in the recent shortest Python coding contest
(http://www.pycontest.net), as follows:

j=''.join;seven_seg=lambda x:j(j(\
(' |'*'_ '*' |')[ord('|B@z(D'[int(d)])%e]\
for d in x)+'\n'for e in(4,9,7))

This has only 110 bytes.

Or you could write a simple password cracker like that:

def crack(crypted, alphabet):
for passwd in alphabet**4:
if crypt(passwd, crypted[:2]) == crypted:
return passwd

And call it with alphabet = string.lowercase, for example.

Cartesian products may be generally interesting for iterables:

def imul(iterable1, iterable2):
"""cartesian product of two iterables"""
for object1 in iterable1:
for object2 in iterable2:
if isinstance(object1, basestring) and \
isinstance(object2, basestring):
yield object1 + object2
else:
yield (object1, object2)

def ipow(iterable, number):
"""cartesian product power of an iterable"""
if number == 1:
for object in iterable:
yield object
elif number > 1:
for object1 in iterable:
for object2 in ipow(iterable, number-1):
yield object1 + object2

class istr(str):
"""str with iterable cartesian product"""

def __mul__(self, other):
if isinstance(other, str):
return imul(self, other)
else:
return str(self)*other

def __pow__(self, other):
return ipow(self, other)

I was wondering if similar functionality could be added in some way to
Python. I noticed that Python has a lot of aggregate functions that can
"reduce" given collection objects (like reduce, filter, min, max, sum,
hash) and functions that keep the same size (like map, sort or zip), but
few functions that can "inflate" the given objects (like range and
enumerate). I know, such functions are dangerous because they also
inflate time and memory consumed by the program. Still, sometimes they
can make sense, whenever you for some reason simply *have* to walk
through all the combinations.
Jan 22 '06 #1
Share this Question
Share on Google+
44 Replies


P: n/a
Christoph Zwerschke wrote:
Sometimes I was missing such a feature.
What I expect as the result is the "cartesian product" of the strings.


I've been thinking of it as well. I'd like it for lists too:
range(3)**2

[(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]

--
Giovanni Bajo
Jan 22 '06 #2

P: n/a

Giovanni Bajo wrote:
Christoph Zwerschke wrote:
Sometimes I was missing such a feature.
What I expect as the result is the "cartesian product" of the strings.


I've been thinking of it as well. I'd like it for lists too:
range(3)**2

[(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]

--
Giovanni Bajo


But why isn't this interpreted as [0, 1, 4] like it is in Mathematica?

I would prefer a polymorphic distribute(*args) function ( or generator
) that acts on tuples of listlike objects of equal size. It could be
extended to distribute(*args[,default]) by a single default argument
that is inserted in the distribution table when two listlike objects
have not the same size.

Kay

Jan 22 '06 #3

P: n/a
Kay Schluehr <ka**********@gmx.net> wrote:
...
> range(3)**2 [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]

... But why isn't this interpreted as [0, 1, 4] like it is in Mathematica?


Since range(3)*2 is [0, 1, 2, 0, 1, 2], it would be horribly, painfully
inconsistent if **2 was interpreted as "square each item".
Alex
Jan 22 '06 #4

P: n/a
Kay Schluehr schrieb:
But why isn't this interpreted as [0, 1, 4] like it is in Mathematica?


Because we are thinking of a cartesian product. If you have lists of
numbers, then there are mane more ways to define a product: tensor
product, vector product, scalar product, componentwise product...

The cartesian product is a much more generic concept, you can have it
already for sets:

class sset(set):

def __mul__(self, other):
return sset((a,b) for a in self for b in other)

def __pow__(self, other):
if isinstance(other, int) and other > 0:
if other == 1:
return self
elif other == 2:
return self*self
else:
return sset(a + (b,) \
for a in self**(other-1) for b in self)

Example:

for x in sorted(sset("ACGU")**3):
print ''.join(x)

AAA
AAC
AAG
AAU
ACA
ACC
..
..
..
UGU
UUA
UUC
UUG
UUU

Now as I'm thinking about it, wouldn't it be nice to have the cartesian
products on Python sets? Maybe also a method that returns the power set
of a set (the set of all subsets), or the set of all subsets with a
given length. You could get a 6/49 lotto tip with something like:

choice(set(range(49)).powerset(6))

-- Christoph
Jan 22 '06 #5

P: n/a
Alex Martelli wrote:
Kay Schluehr <ka**********@gmx.net> wrote:
range(3)**2
But why isn't this interpreted as [0, 1, 4] like it is in Mathematica?


Since range(3)*2 is [0, 1, 2, 0, 1, 2], it would be horribly, painfully
inconsistent if **2 was interpreted as "square each item".


Yes. Python does not interpreate the product of a list with a number as
a scalar product. Otherwise range(3)*2 should be [0, 1, 4] as well.

For doing such things I would use a vector subtype of list.

-- Christoph
Jan 22 '06 #6

P: n/a
Christoph Zwerschke <ci**@online.de> wrote:
...
given length. You could get a 6/49 lotto tip with something like:

choice(set(range(49)).powerset(6))


And that would be better than the current random.sample(range(49),6) in
WHAT ways, exactly...?
Alex
Jan 22 '06 #7

P: n/a
al***@mail.comcast.net (Alex Martelli) writes:
given length. You could get a 6/49 lotto tip with something like:
choice(set(range(49)).powerset(6))


And that would be better than the current random.sample(range(49),6) in
WHAT ways, exactly...?


I think the first one would be incorrect since it samples with
replacement. At least, it looks like it does.
Jan 22 '06 #8

P: n/a
On Sun, 22 Jan 2006 18:29:45 +0100, Christoph Zwerschke wrote:
Alex Martelli wrote:
Kay Schluehr <ka**********@gmx.net> wrote:
range(3)**2
But why isn't this interpreted as [0, 1, 4] like it is in Mathematica?


Since range(3)*2 is [0, 1, 2, 0, 1, 2], it would be horribly, painfully
inconsistent if **2 was interpreted as "square each item".


Yes. Python does not interpreate the product of a list with a number as
a scalar product. Otherwise range(3)*2 should be [0, 1, 4] as well.

For doing such things I would use a vector subtype of list.


Not everything needs to be a separate class! Why create a magic class for
every piece of functionality you want? Just create functions that operate
on existing classes!

Instead of a class that supports cartesian products, make a function that
takes two sequences and returns the cartesian product of them. (This will
likely be best implemented as a generator.) If you write it properly,
which is to say if you don't go out of your way to break it, this function
will *automatically* work on any sequence type, lists, tuples, strings,
and things you and I haven't even thought of.

What advantage is there to creating a "list with cartesian product"
subclass of list?

--
Steven.

Jan 22 '06 #9

P: n/a
Alex Martelli schrieb:
Christoph Zwerschke <ci**@online.de> wrote:
...
given length. You could get a 6/49 lotto tip with something like:

choice(set(range(49)).powerset(6))
And that would be better than the current random.sample(range(49),6) in
WHAT ways, exactly...?


You're right, random.sample(range(49),6) does the same and much faster.
I just didn't think of it (it is new since Python 2.3).

What if you need 12 different tips for your lotto ticket?

s = set(range(49)).powerset(6)
for x in range(10):
print s.pop()

But the real disadvantage of this idea is the memory consumed and the
time to set up that memory: set(range(49)).powerset(6) has a cardinality
of about 13 million entries! You PC would start to swap just for getting
a lotto tip...

-- Christoph
Jan 22 '06 #10

P: n/a
Steven D'Aprano wrote:
On Sun, 22 Jan 2006 18:29:45 +0100, Christoph Zwerschke wrote:
For doing such things I would use a vector subtype of list.


Not everything needs to be a separate class! Why create a magic class for
every piece of functionality you want? Just create functions that operate
on existing classes!

What advantage is there to creating a "list with cartesian product"
subclass of list?


Principally, you're right (see also my example with iterators).

But I can still see two reasons for classes:

1) That function would have to make a lot of case distinctions (check
the types of operands). If you have a class, you already know the type
of the operands (at least one).

2) It allows you to write a*b instead of mul(a,b) which looks nicer.

-- Christoph
Jan 22 '06 #11

P: n/a
Paul Rubin schrieb:
al***@mail.comcast.net (Alex Martelli) writes:
given length. You could get a 6/49 lotto tip with something like:
choice(set(range(49)).powerset(6))

And that would be better than the current random.sample(range(49),6) in
WHAT ways, exactly...?


I think the first one would be incorrect since it samples with
replacement. At least, it looks like it does.


No, the elements of the powerset would be sets with 6 elements each, not
tuples. So technically, it would be correct. Just horribly inefficient.

-- Christoph
Jan 22 '06 #12

P: n/a
Christoph Zwerschke <ci**@online.de> writes:
No, the elements of the powerset would be sets with 6 elements each,
not tuples. So technically, it would be correct. Just horribly
inefficient.


Oh I see, not the Cartesian product. Yeah, it would be silly in
practice.
Jan 22 '06 #13

P: n/a
Steven D'Aprano <st***@REMOVETHIScyber.com.au> wrote:
...
What advantage is there to creating a "list with cartesian product"
subclass of list?


Essentially, syntax sugar -- for some people, being able to code a*b
rather than product(a,b) takes on a huge significance; Python chooses to
support this syntax variation by special methods in classes, and thus
encourages people to create classes if they're keen on the syntax.
Alex
Jan 22 '06 #14

P: n/a
Christoph Zwerschke <ci**@online.de> wrote:
Alex Martelli schrieb:
Christoph Zwerschke <ci**@online.de> wrote:
...
given length. You could get a 6/49 lotto tip with something like:

choice(set(range(49)).powerset(6))
And that would be better than the current random.sample(range(49),6) in
WHAT ways, exactly...?


You're right, random.sample(range(49),6) does the same and much faster.
I just didn't think of it (it is new since Python 2.3).


Yep, it's one of 2.3's many little gems. Still, builtin set is new in
2.4, so it's hardly a "more classic" approach;-).

What if you need 12 different tips for your lotto ticket?
You probably want random ones, then...
s = set(range(49)).powerset(6)
for x in range(10):
print s.pop()
This is very systematic, not random;-). Still, using random.sample on s
would indeed produce 12 random different tips!-)
But the real disadvantage of this idea is the memory consumed and the
time to set up that memory: set(range(49)).powerset(6) has a cardinality
of about 13 million entries! You PC would start to swap just for getting
a lotto tip...


Well then, you need a new PC, 64-bit and with as many GB of RAM as
required, no?-)

Until you get such a PC, something like:

tips = set()
while len(tips) < 10:
tip = frozenzet(random.sample(range(49), 6))
tips.add(tip)

will have to suffice, I guess;-).
Alex
Jan 22 '06 #15

P: n/a
Generally, if you could multiply strings in the above fashion, you could
spare one more more sub loops, as in this example:

for f in ('index', 'default')*('.html', '.htm', '.shtml'):
if exists(f):
break

In this case, it would be not really be better than that:

for f in 'index', 'default':
for e in '.html', '.htm', '.shtml':
if exists(f+e):
break

The password cracking was already a better example.

But this would be only efficient if the product of two strings is
returned as a generator, not a list. And I don't know if you really
would expect that, since multiplying a string or a list with a number
does not change its type.

-- Christoph
Jan 22 '06 #16

P: n/a
On Sun, 22 Jan 2006 10:41:39 -0800, Alex Martelli wrote:
Steven D'Aprano <st***@REMOVETHIScyber.com.au> wrote:
...
What advantage is there to creating a "list with cartesian product"
subclass of list?


Essentially, syntax sugar -- for some people, being able to code a*b
rather than product(a,b) takes on a huge significance; Python chooses to
support this syntax variation by special methods in classes, and thus
encourages people to create classes if they're keen on the syntax.


I beg to differ: Python *allows* people to create classes if they're keen
on the syntax (and I can sympathise with that like), but Python
*encourages* by example generic tools that operate on as many different
types as makes sense.
--
Steven.

Jan 22 '06 #17

P: n/a
Alex Martelli schrieb:
s = set(range(49)).powerset(6)
for x in range(10):
print s.pop()
This is very systematic, not random;-). Still, using random.sample on s
would indeed produce 12 random different tips!-)


Right, that would be systematic. What I wanted to write was:

s = set(range(49)).powerset(6)
for x in range(10):
c = choice(s)
print c
s.remove(c)

Of course you could also use random.sample again:
random.sample(set(range(49)).powerset(6), 10)
But this would be just as inefficient.
tips = set()
while len(tips) < 10:
tip = frozenzet(random.sample(range(49), 6))
tips.add(tip)


Yep, that's better. The amount of hand coding is even the same as above.

-- Christoph
Jan 22 '06 #18

P: n/a
Steven D'Aprano wrote:
I beg to differ: Python *allows* people to create classes if they're keen
on the syntax (and I can sympathise with that like), but Python
*encourages* by example generic tools that operate on as many different
types as makes sense.


But these generic tools (say the "pow" function) in turn simply use the
methods defined by the classes.

-- Christoph

Jan 22 '06 #19

P: n/a
On Sun, 22 Jan 2006 19:12:49 +0100, Christoph Zwerschke wrote:
Steven D'Aprano wrote:
On Sun, 22 Jan 2006 18:29:45 +0100, Christoph Zwerschke wrote:
For doing such things I would use a vector subtype of list.
Not everything needs to be a separate class! Why create a magic class for
every piece of functionality you want? Just create functions that operate
on existing classes!
>
> What advantage is there to creating a "list with cartesian product"
> subclass of list?


Principally, you're right (see also my example with iterators).

But I can still see two reasons for classes:

1) That function would have to make a lot of case distinctions (check
the types of operands). If you have a class, you already know the type
of the operands (at least one).


If you are happy to always return a list of tuples regardless of what the
two operands are, generators make it so easy it is shameful. Even if you
want a special case of two string arguments returning a string, it is
hardly any more difficult:

def cartprod(A, B):
if type(A) == type(B) == str:
convert = lambda obj: "".join(list(obj))
else:
convert = lambda obj: obj # do nothing
for a in A:
for b in B:
yield convert((a, b))

Notice that the *only* reason we look at the type of the arguments is
because we want two strings to return a string. If we don't care about
that, the generator is less than half the size:

def cartprod(A, B):
for a in A:
for b in B:
yield (a, b)

That's a generator giving you the products one at a time. For the
benefit of anyone out there who doesn't know about generators, you use it
like this:
cprods = cartprod([1,2,3], "abc")
for t in cprods: .... print t
....
(1, 'a')
(1, 'b')
(2, 'a')
(2, 'b')
(3, 'a')
(3, 'b')

If you want them all at once, memory allowing, you do this:
cprods = cartprod([1,2,3], "abc")
list(cprods)

[(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]

2) It allows you to write a*b instead of mul(a,b) which looks nicer.


But not as clear as cartprod(a,b) or even cartesian_product(a,b).
--
Steven.

Jan 22 '06 #20

P: n/a
Steven D'Aprano wrote:
If you are happy to always return a list of tuples regardless of what the
two operands are, generators make it so easy it is shameful. Even if you
want a special case of two string arguments returning a string, it is
hardly any more difficult:

def cartprod(A, B):
if type(A) == type(B) == str:
convert = lambda obj: "".join(list(obj))
else:
convert = lambda obj: obj # do nothing
for a in A:
for b in B:
yield convert((a, b))


I didn't deny that it's handy; my "imul" example was pretty much the
same. If you don't want to use the a*b syntax, it's a good solution.

-- Christoph
Jan 22 '06 #21

P: n/a
Christoph Zwerschke wrote:
In Python, it is possible to multiply a string with a number:
>>> "hello"*3 'hellohellohello'
Which is really useful.
However, you can't multiply a string with another string:
>>> 'hello'*'world'
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
TypeError: can't multiply sequence by non-int

Sometimes I was missing such a feature.
What I expect as the result is the "cartesian product" of the strings.


There's no such thing; you'd have to define it first. Are duplicates
significant? Order?

What you seem to want is easy enough:

[a + b for a in 'hello' for b in 'world']

[...] Cartesian products may be generally interesting for iterables:


And maybe you want the result to be a generator:

(a + b for a in 'hello' for b in 'world')
New language features should be widely useful, and difficult or
awkward to code in Python as it is. All-combinations-of-sequences
is trivial to code and rarely needed.
--
--Bryan
Jan 23 '06 #22

P: n/a
Bryan Olson wrote:
What you seem to want is easy enough:
[a + b for a in 'hello' for b in 'world']
Yes, but 'hello'*'world' is easier.

And how would you write '01'**8 ? This should give you all bytes in
binary code.
And maybe you want the result to be a generator:
(a + b for a in 'hello' for b in 'world')
That may be the main problem to decide whether the cartesian product
should return a generator or a list. A list would be more intuitive and
can be accessed directly. For instance,

('01'**8)[42]

would give you the binary code of 42 (though in an inefficient way).
But because the inflationary character of cartesian products, a
generator would be also desirable (though you wouldn't expect '01'**2 to
be a generator, since '01'*2 isn't a generator either).
New language features should be widely useful, and difficult or
awkward to code in Python as it is. All-combinations-of-sequences
is trivial to code and rarely needed.


That's the other problem. The uses cases (like the password cracker
example) are very limited and in these cases you can either write nested
loops or write your own cartesian product.

-- Christoph

BTW: What is the shortest way to get the binary representation of a
number in Python? Is there really not something like itoa() anywhere in
the standard libs?
Jan 23 '06 #23

P: n/a
Christoph Zwerschke wrote:
[...]
That may be the main problem to decide whether the cartesian product
should return a generator or a list.
The Cartesion product is a set.

[...] That's the other problem. The uses cases (like the password cracker
example) are very limited and in these cases you can either write nested
loops or write your own cartesian product.


Cartesian product is one of the essential operations of
relational algebra; in that context it's widely useful.
By itself, it's usually not what one wants.
--
--Bryan
Jan 23 '06 #24

P: n/a
On Mon, 23 Jan 2006 01:25:36 +0000, Bryan Olson wrote:
Sometimes I was missing such a feature.
What I expect as the result is the "cartesian product" of the strings.


There's no such thing; you'd have to define it first. Are duplicates
significant? Order?


Google "cartesian product" and hit "I'm feeling lucky".

Or go here: http://mathworld.wolfram.com/CartesianProduct.html

Still think there is no such thing?

--
Steven.

Jan 23 '06 #25

P: n/a
On Mon, 23 Jan 2006 10:36:55 +0000, Bryan Olson wrote:
Christoph Zwerschke wrote:
[...]
That may be the main problem to decide whether the cartesian product
should return a generator or a list.
The Cartesion product is a set.


And the generalization of mathematical sets in Python can be built-in
sets, lists or tuples, depending on what you need.

Given that cartesian products tend to be *extremely* large, some sort of
iterator is the only practical solution -- even if that's not
mathematically pure.

[...]
That's the other problem. The uses cases (like the password cracker
example) are very limited and in these cases you can either write nested
loops or write your own cartesian product.


Cartesian product is one of the essential operations of
relational algebra; in that context it's widely useful.
By itself, it's usually not what one wants.


Google on "Cartesian product python" and you will find thousands of hits.
This is something that keeps coming up over and over again.

Personally, I think cartesian products, together with permutations and
combinations, belong in a module, not built-ins.

--
Steven.

Jan 23 '06 #26

P: n/a
Bryan Olson wrote:
Christoph Zwerschke wrote:
[...]
That may be the main problem to decide whether the cartesian product
should return a generator or a list.


The Cartesion product is a set.


Sets are iterable, so that isn't an answer. If it's a set, then it
should either be a generator or a set.

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
Nothing is potent against love save impotence.
-- Samuel Butler
Jan 23 '06 #27

P: n/a
Bryan Olson schrieb:
Christoph Zwerschke wrote:
[...]
That may be the main problem to decide whether the cartesian product
should return a generator or a list.
The Cartesion product is a set.


Of course it is a set. But if the factors of the product have a total
order (as in the case of strings, tuples or lists), this carries over to
the cartesian product (lexicographical order). The implementation should
reflect that by giving you a list or a generator, not a set.

Only if I build the cartesian product of sets I would expect a set.

"ab"*"cd" should be "ac", "ad", "bc", "bd", in this order.
Cartesian product is one of the essential operations of
relational algebra; in that context it's widely useful.
By itself, it's usually not what one wants.


Usually not. But sometimes you may want it. That's why SQL defines has
the "cross join" clause. Usually it is not needed and should be avoided,
but sometimes it may make sense.

-- Christoph
Jan 23 '06 #28

P: n/a
Steven D'Aprano wrote:
Bryan Olson wrote:
Christoph Zwerschke wrote:
[...]
That may be the main problem to decide whether the cartesian product
should return a generator or a list.


The Cartesion product is a set.


And the generalization of mathematical sets in Python can be built-in
sets, lists or tuples, depending on what you need.

Given that cartesian products tend to be *extremely* large, some sort of
iterator is the only practical solution -- even if that's not
mathematically pure.


Query languages have included Cartesian product for decades. Their
solution is to optimize expressions to avoid building, or even
iterating over, big Cartesian products.

[...]
That's the other problem. The uses cases (like the password cracker
example) are very limited and in these cases you can either write nested
loops or write your own cartesian product.


Cartesian product is one of the essential operations of
relational algebra; in that context it's widely useful.
By itself, it's usually not what one wants.


Google on "Cartesian product python" and you will find thousands of hits.
This is something that keeps coming up over and over again.


It keeps coming up in exercises and examples. How many of your
Google results note an application that actually calls for a
Cartesian product enumerator? I wrote one an old thread, and so
far I've written it one more time than I've used it.
--
--Bryan
Jan 23 '06 #29

P: n/a
Steven D'Aprano wrote:
Bryan Olson wrote:

[Christoph Zwerschke had written:]
What I expect as the result is the "cartesian product" of the strings.


There's no such thing; you'd have to define it first. Are duplicates
significant? Order?

Google "cartesian product" and hit "I'm feeling lucky".

Or go here: http://mathworld.wolfram.com/CartesianProduct.html

Still think there is no such thing?


Uh, yes.

The Cartesian product of two sets A and B (also called the
product set, set direct product, or cross product) is defined to
be the set of [...]

All sets, no strings. What were you looking at?

(I'm also happy to see someone agrees with my capitalization of
"Cartesian".)
--
--Bryan
Jan 23 '06 #30

P: n/a
Bryan Olson wrote:
There's no such thing; you'd have to define it first. Are duplicates
significant? Order?


That's all trivial isn't it? A string is a set of pairs (i,c) where i
is an integer number, the index, with 0<=i<j<=card(string)-1, for
(i,c), (j,d) in string and c is a character. Since there is a natural
order on integer numbers we can induce this order on the string, so
that it is an ordered set. For representation purposes we can omit i
since i is simply the position of c in the sequence. Same holds for
lists, tuples or other sequences.

Kay

Jan 23 '06 #31

P: n/a
On Mon, 23 Jan 2006 18:17:08 +0000, Bryan Olson wrote:
Steven D'Aprano wrote:
Bryan Olson wrote:

[Christoph Zwerschke had written:]
What I expect as the result is the "cartesian product" of the strings.

There's no such thing; you'd have to define it first. Are duplicates
significant? Order?

Google "cartesian product" and hit "I'm feeling lucky".

Or go here: http://mathworld.wolfram.com/CartesianProduct.html

Still think there is no such thing?


Uh, yes.

The Cartesian product of two sets A and B (also called the
product set, set direct product, or cross product) is defined to
be the set of [...]

All sets, no strings. What were you looking at?


You've just quoted a definition of Cartesian product [yes, you are right
about capitalisation, my bad]. How can you say with a straight face that
there is no such thing as a Cartesian product?

The question of sets versus strings is a red herring. Addition, in
mathematics, is defined on reals. No computer yet made can do addition on
reals. Should you declare that there is no such thing as addition because
reals and floats are different?

If people wish to extend Cartesian products to work on sequences, not just
sets, then to my mind that's a pretty obvious and sensible generalisation.

--
Steven.

Jan 23 '06 #32

P: n/a

Steven D'Aprano wrote:
On Mon, 23 Jan 2006 18:17:08 +0000, Bryan Olson wrote:
Steven D'Aprano wrote:
Bryan Olson wrote:

[Christoph Zwerschke had written:]
>What I expect as the result is the "cartesian product" of the strings.

There's no such thing; you'd have to define it first. Are duplicates
significant? Order?
Google "cartesian product" and hit "I'm feeling lucky".

Or go here: http://mathworld.wolfram.com/CartesianProduct.html

Still think there is no such thing?


Uh, yes.

The Cartesian product of two sets A and B (also called the
product set, set direct product, or cross product) is defined to
be the set of [...]

All sets, no strings. What were you looking at?


You've just quoted a definition of Cartesian product [yes, you are right
about capitalisation, my bad]. How can you say with a straight face that
there is no such thing as a Cartesian product?

The question of sets versus strings is a red herring. Addition, in
mathematics, is defined on reals. No computer yet made can do addition on
reals. Should you declare that there is no such thing as addition because
reals and floats are different?

If people wish to extend Cartesian products to work on sequences, not just
sets, then to my mind that's a pretty obvious and sensible generalisation.


I use Cartesian Products all the time in MS-Access, typically the
product of a set (of records) with itself. I really like your generator
because I can now do this stuff directly in Python. And I usually
want just a subset of the Cartesian Product, so I modified your
generator to produce what I want:

def cartprod(A, B, perm=True, repl=True):
if type(A) == type(B) == str:
convert = lambda obj: "".join(list(obj))
else:
convert = lambda obj: obj # do nothing
for a in A:
for b in B:
if perm and repl: # permutation with replacement
yield convert((a, b))
if (not perm) and repl: # combination with replacement
if b>=a:
yield convert((a,b))
if perm and (not repl): # permutation w/o replacement
if b!=a:
yield convert((a,b))
if (not perm) and (not repl): # combination w/o replacement
if b>a:
yield convert((a,b))
print
print 'For "abc"x"abc":'

print
cprods = cartprod("abc", "abc", True, True)
print 'permutation with replacement', list(cprods)

print
cprods = cartprod("abc", "abc", False, True)
print 'combination with replacement', list(cprods)

print
cprods = cartprod("abc", "abc", True, False)
print 'permutation w/o replacement', list(cprods)

print
cprods = cartprod("abc", "abc", False, False)
print 'combination w/o replacement', list(cprods)

"""

For "abc"x"abc":

permutation with replacement ['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca',
'cb', 'cc']

combination with replacement ['aa', 'ab', 'ac', 'bb', 'bc', 'cc']

permutation w/o replacement ['ab', 'ac', 'ba', 'bc', 'ca', 'cb']

combination w/o replacement ['ab', 'ac', 'bc']

"""

Thanks for showing me an easy way to do this.

--
Steven.


Jan 23 '06 #33

P: n/a
Bryan Olson schrieb:
Still think there is no such thing?


Uh, yes.

The Cartesian product of two sets A and B (also called the
product set, set direct product, or cross product) is defined to
be the set of [...]

All sets, no strings. What were you looking at?


Not only sets. This goes on (anyway "everything is a set"). You can also
have the Cartesian product of functions. And you can think of a string
as a function from a countable index set I to the set of all characters
C. So the Cartesian product of two strings will become a function from
IxI to CxC. Since IxX is countable again, this is equivalent to a tuple
of 2-tuples of characters which you can also interpret as a tuple of
strings with 2 chars:

"ab" x "cd" = ("ac", "ad", "bc", "bd")

Do I have eliminated all remaining clarities now? :-)

-- Christoph
Jan 24 '06 #34

P: n/a
Steven D'Aprano wrote:
On Mon, 23 Jan 2006 18:17:08 +0000, Bryan Olson wrote:

Steven D'Aprano wrote:
Bryan Olson wrote:
[Christoph Zwerschke had written:]
>What I expect as the result is the "cartesian product" of the strings.

There's no such thing; you'd have to define it first. Are duplicates
significant? Order?
Google "cartesian product" and hit "I'm feeling lucky".

Or go here: http://mathworld.wolfram.com/CartesianProduct.html

Still think there is no such thing?


Uh, yes.

The Cartesian product of two sets A and B (also called the
product set, set direct product, or cross product) is defined to
be the set of [...]

All sets, no strings. What were you looking at?


You've just quoted a definition of Cartesian product [yes, you are right
about capitalisation, my bad]. How can you say with a straight face that
there is no such thing as a Cartesian product?


Ah, I think I see. I said there's no such thing as the Cartesian
product of stings. You thought I claimed that there's no such thing
as Cartesian product at all. Try re-reading what I wrote and what
your chosen reference says.

The question of sets versus strings is a red herring. Addition, in
mathematics, is defined on reals. No computer yet made can do addition on
reals. Should you declare that there is no such thing as addition because
reals and floats are different?

If people wish to extend Cartesian products to work on sequences, not just
sets, then to my mind that's a pretty obvious and sensible generalisation.


You lost me. Are you advocating implementing it without defining
it? Or are you saying the the significance of duplicates and order
are so obvious that those were silly questions?
--
--Bryan
Jan 24 '06 #35

P: n/a
Kay Schluehr wrote:
Bryan Olson wrote:

There's no such thing; you'd have to define it first. Are duplicates
significant? Order?

That's all trivial isn't it? A string is a set of pairs (i,c) where i
is an integer number, the index, with 0<=i<j<=card(string)-1, for
(i,c), (j,d) in string and c is a character. Since there is a natural
order on integer numbers we can induce this order on the string, so
that it is an ordered set. For representation purposes we can omit i
since i is simply the position of c in the sequence. Same holds for
lists, tuples or other sequences.


Any of the combinations of duplicates and order being significant
or insignificant are easy to define, and easy to specify when
significant. Anticipating the implications of the choices is the
tricky part.
--
--Bryan
Jan 24 '06 #36

P: n/a
Christoph Zwerschke wrote:
Bryan Olson schrieb:
Still think there is no such thing?

Uh, yes.

The Cartesian product of two sets A and B (also called the
product set, set direct product, or cross product) is defined to
be the set of [...]

All sets, no strings. What were you looking at?

Not only sets. This goes on (anyway "everything is a set"). You can also
have the Cartesian product of functions. And you can think of a string
as a function from a countable index set I to the set of all characters
C. So the Cartesian product of two strings will become a function from
IxI to CxC. Since IxX is countable again, this is equivalent to a tuple
of 2-tuples of characters which you can also interpret as a tuple of
strings with 2 chars:

"ab" x "cd" = ("ac", "ad", "bc", "bd")

Do I have eliminated all remaining clarities now? :-)

-- Christoph


Christoph,
i think you raised a great issue: a lack of efficient support for
"combining" objects. Any language, if has smth to do with reality,
needs that kind of functionality. The combination dynamics, or growth
(multiplication) dynamics is a critically important functionality in
chemistry, physics, biology. It probably may be emulated by standard
means such as lists and dictionaries. If such support is available
though, this is a sign of mature language designed to cover the
realistic processes with rich growth/combination dynamics.
For instance, the dynamics of aperiodic growth that generates a 3D
aperiodic arrays/structures with the controllable "bits" in each unit to
be configured by dynamic masks to match the environmental ("boundary")
conditions would be a significant step in building next-generation
languages/silicon to support synthesis of realistic 3D structures (and
functions). Accordingly, the command "line" may need to be 2D and the
interpreter be designed to handle/understand not only a (command) text.
Just reflecting aloud..
val
Jan 24 '06 #37

P: n/a

Christoph Zwerschke wrote:
Now as I'm thinking about it, wouldn't it be nice to have the cartesian
products on Python sets? Maybe also a method that returns the power set
of a set (the set of all subsets), or the set of all subsets with a
given length.


For defining powersets it might suffice to include sets as exponents.
If A is a set then
Set([0,1])**A is the set of maps from A into {0,1}. The kernels of
those maps ( preimages of 0 in A ) is a subset classifier of A.

Kay

Jan 24 '06 #38

P: n/a
Christoph Zwerschke wrote:
Bryan Olson schrieb:
Still think there is no such thing?

Uh, yes.

The Cartesian product of two sets A and B (also called the
product set, set direct product, or cross product) is defined to
be the set of [...]

All sets, no strings. What were you looking at?


Not only sets.


Snipping is not going to make the facts go away. I did not
choose the reference at issue in this strand:

http://mathworld.wolfram.com/CartesianProduct.html
This goes on (anyway "everything is a set").
The claim "everything is a set" falls into the category of
'not even wrong'. Whatever semantics Python adopts, it must
be well-defined.

Watch things not be sets:

x = [1, 1, 2]
y = [1, 2]
print x == y
print set(x) == set(y)

You can also
have the Cartesian product of functions. And you can think of a string
as a function from a countable index set I to the set of all characters
C. So the Cartesian product of two strings will become a function from
IxI to CxC. Since IxX is countable again, this is equivalent to a tuple
of 2-tuples of characters which you can also interpret as a tuple of
strings with 2 chars:

"ab" x "cd" = ("ac", "ad", "bc", "bd")
I really did try to raise the real issues. I cannot make you answer,
but the question remains: are duplicate and order significant in
what you call "Cartesian product" or they not? Can you show that
your proposed language extensions are useful and consistent in
some reasonable sense?
Do I have eliminated all remaining clarities now? :-)


Yes. Good one. Sure.
--
--Bryan
Jan 24 '06 #39

P: n/a
Bryan Olson wrote:
The claim "everything is a set" falls into the category of
'not even wrong'.
No, it falls into the category of the most fundamental Mathematical
concepts. You actually *define* tuples as sets, or functions as sets or
relations as sets, or even all kinds of numbers and other things which
exist in the heads of Mathematicians as sets.
Watch things not be sets:

x = [1, 1, 2]
y = [1, 2]
print x == y
print set(x) == set(y)
Python tuples and lists are of course not the same as Python sets.
But mathematically, you can understand them as sets anyway and associate
every Python tuple with a Python set. The naive approach to understand a
tuple as the set of its values which is done by casting to set() does
not work, as you rightly noticed. The associated set to a Python tuple
or list x would be set(enumerate(x)), not set(x).

Generally, two approaches are common for constructing tuples as sets:

(A) Think of an n-tuple as a function on the index set, range(n). Then
remember a function is a special relation is a set.

(1, 2, 2) would correspond to the set {(0, 1), (1, 1), (1, 2)}
(1, 2) would correspond to the set {(0, 1), (1, 2)}

In Python, the tuple or list x would correspond to set(enumerate(x)).

As a sidemark, another common approach is this:

(B) Define the set corresponding to (1, 2) as {1, 2}. Define the set
corresponding to (1, 2, 2) as {{1, 2}, 2}, the set corresponding to (1,
2, 2, 4) as {{{1, 2}, 2}, 4} and so on.
I really did try to raise the real issues. I cannot make you answer,
but the question remains: are duplicate and order significant in
what you call "Cartesian product" or they not? Can you show that
your proposed language extensions are useful and consistent in
some reasonable sense?


I already tried to answer. It is not what "I call" Cartesian product.
Since functions are sets in Mathematics, once you have a Cartesian
product on sets, there is a natural (canonical) way to define a
Cartesian product on functions as well. So there is also a canonical way
to define a Cartesian product on tuples, if you interpret tuples as
functions via (A). And there is a canonical way to understand the
resulting sets as tuples again (by the lexicographical order of the
index set).

So the cartesian product of a string, tuple, or list is well-defined
including its order.

The only ambiguity is whether the result should be a generator or a
tuple, and in the case of strings whether the elements in the result
should be returned as tuples,
"ab"*"cd" = ("a", c"), ("a", "d"), ("b", "c"), ("b", "d")
or concatenated as strings:
"ab"*"cd" = "ac", "ad", "bc", "bd"

In any way, there is no dispute about duplicates or ordering. This is
all canonical and well-defined.

Concerning the use, I admit there is no really frequent use, but in some
occasions it may be useful and I already gave some examples.

-- Christoph
Jan 24 '06 #40

P: n/a
Bryan Olson wrote:
The claim "everything is a set" falls into the category of
'not even wrong'.
No, it falls into the category of the most fundamental Mathematical
concepts. You actually *define* tuples as sets, or functions as sets or
relations as sets, or even all kinds of numbers and other things, which
exist only in the heads of Mathematicians, as sets.
Watch things not be sets:

x = [1, 1, 2]
y = [1, 2]
print x == y
print set(x) == set(y)
Python tuples and lists are of course not the same as Python sets. But
mathematically, you can understand them as sets anyway and associate
every Python tuple with a Python set. The naive approach to understand a
list as the set of its values which is done by casting with set() does
not work, as you rightly noticed. The associated set to a Python tuple
or list x would be set(enumerate(x)), not set(x).

Generally, two approaches are common for constructing tuples as sets:

(A) Think of an n-tuple as a function on the index set, range(n). Then
remember a function is a special relation is a set.

(1, 2, 2) would correspond to the set {(0, 1), (1, 2), (2, 2)}
(1, 2) would correspond to the set {(0, 1), (1, 2)}

In Python, the tuple or list x would correspond to set(enumerate(x)).

As a sidemark, another common approach is this:

(B) Define the set corresponding to (1, 2) as {{1}, 2}. Define the set
corresponding to (1, 2, 2) as {{{1}, 2}, 2}, the set corresponding to
(1, 2, 2, 4) as {{{{1}, 2}, 2}, 4} and so on.
I really did try to raise the real issues. I cannot make you answer,
but the question remains: are duplicate and order significant in
what you call "Cartesian product" or they not? Can you show that
your proposed language extensions are useful and consistent in
some reasonable sense?


I already tried to answer. It is not what "I call" Cartesian product. If
there is a canonical set representation of something like a function or
a tuple you immediately have a well-defined Cartesian product on these
things, and this would be also called Cartesian product. A Cartesian
product of functions and tuples is a well-defined mathematical concept.
The Cartesian product of functions is even a function again, and (via
lexicographical order of the index set) you can also interpret the
Cartesian product of tuples as a tuple again (this was probably the
point where you had doubts, but I already tried to explain).

The only ambiguity is whether the result should be a generator or a
tuple, and in the case of strings whether the elements in the result
should be returned as tuples,
"ab"*"cd" = ("a", c"), ("a", "d"), ("b", "c"), ("b", "d")
or concatenated as strings:
"ab"*"cd" = "ac", "ad", "bc", "bd"

In any way, there is no dispute about duplicates or ordering. This is
all canonical and well-defined.

Concerning the use, I admit there is no really frequent use, but in some
occasions it may be useful and I already gave some examples.

-- Christoph
Jan 24 '06 #41

P: n/a
Christoph Zwerschke wrote:
Bryan Olson wrote:
The claim "everything is a set" falls into the category of
'not even wrong'.


No, it falls into the category of the most fundamental Mathematical
concepts. You actually *define* tuples as sets, or functions as sets or
relations as sets, or even all kinds of numbers and other things which
exist in the heads of Mathematicians as sets.


No, wrong ... or well, obviously ... or uh ... sorry.
I've lost track of what we're arguing. If you want to
stand behind "everything is a set," I can certainly present
a case to the contrary.

Is there anything I actually claimed that you are prepared to
argue against? Cite me, and I'll defend or retract, or at least
re-phrase.

I definitely did make specific claims. I'll list them if you want,
and I'll listen to evidence against them, should you choose to
present any. I expect that neither of us wants to devote our
energy to flaming on in violent agreement.
--
--Bryan
Jan 24 '06 #42

P: n/a
I think this has been discussed thoroughgoing enough. All I wanted to
say is that this proposal of building "Cartesian products" of strings is
well-defined, in line with the mathematical concept of Cartesian
products, and can be *sometimes* useful. I hope we agree on this, but if
not we don't need to go through this again...

Peace,
Christoph
Jan 24 '06 #43

P: n/a

"Christoph Zwerschke" <ci**@online.de> wrote in message
news:dr**********@online.de...
Bryan Olson wrote:
The claim "everything is a set" falls into the category of
'not even wrong'.


No, it falls into the category of the most fundamental Mathematical
concepts. You actually *define* tuples as sets, or functions as sets or
relations as sets, or even all kinds of numbers and other things which
exist in the heads of Mathematicians as sets.


You might so define, but Bryan and others might not. The
philosophical/methodological idea that 'everything is a set' has been very
fruitful but it is not a fact. Alternative ideas are 'everything is a
function' and 'everything is defined by axioms'. As for set theory itself,
there are multiple nonequivalent consistent theories for uncountable sets
and therefore multiple definitions of what a set it. And to me, Russel's
'paradox' is a proof by negation that there is at least one collection that
is *not* a set.

Back to the thread topic: The cross-catenation operator (a generalization
of 'cartesian product' to sequences) is sometimes very useful. Recursive
definitions of combinatorial functions provide several examples. Whether
it should be built into the language or left to subclassers is a different
issue.

Terry J. Reedy

Jan 25 '06 #44

P: n/a
Terry Reedy wrote:
You might so define, but Bryan and others might not. The
philosophical/methodological idea that 'everything is a set' has been very
fruitful but it is not a fact. Alternative ideas are 'everything is a
function' and 'everything is defined by axioms'.


according to google, ideas such as

everything is illuminated
everything is a miracle
everything is an object

and even

everything is a fscking dns problem

seems to be a bit more common.

</F>

Jan 25 '06 #45

This discussion thread is closed

Replies have been disabled for this discussion.