459,199 Members | 1,725 Online
Need help? Post your question and get tips & solutions from a community of 459,199 IT Pros & Developers. It's quick & easy.

# Some thougts on cartesian products

 P: n/a In Python, it is possible to multiply a string with a number: "hello"*3 'hellohellohello' However, you can't multiply a string with another string: 'hello'*'world' Traceback (most recent call last): File "", line 1, in ? TypeError: can't multiply sequence by non-int Sometimes I was missing such a feature. What I expect as the result is the "cartesian product" of the strings. Here is a simple implementation of new list and string objects that explains what I mean. It also implements powers of lists and strings: class plist(list): """list with cartesian product list""" def __mul__(self, other): if isinstance(other, pstr): return plist([s+o for s in self for o in other]) if hasattr(other, '__getitem__'): return plist([[s, o] for s in self for o in other]) else: return list(self)*other def __pow__(self, other): if isinstance(other, int) and other > 0: if other == 1: return self return self * self**(other-1) class pstr(str): """str with cartesian product list""" def __mul__(self, other): if hasattr(other, '__getitem__'): return plist([s+o for s in self for o in other]) else: return str(self)*other def __pow__(self, other): if isinstance(other, int) and other > 0: if other == 1: return self return self * self**(other-1) With these new strings you can do the following: pstr("ab")*pstr("cd")*pstr("ef") ['ace', 'acf', 'ade', 'adf', 'bce', 'bcf', 'bde', 'bdf'] print pstr("abcdefgh")*pstr("12345678") ['a1', 'a2', ..., 'a8', 'b1', 'b2', ..., 'b8', ...., ..., ..., 'h1', 'h2', ..., 'h8'] print pstr("ACGU")**3 ['AAA', 'AAC', 'AAG', 'AAU', 'ACA', 'ACC', 'ACG', ..., ...., 'UGC', 'UGG', 'UGU', 'UUA', 'UUC', 'UUG', 'UUU'] I think this can be quite handy at times and save some typing. If Python strings had this ability, you could even outdo the 117 byte solution in the recent shortest Python coding contest (http://www.pycontest.net), as follows: j=''.join;seven_seg=lambda x:j(j(\ (' |'*'_ '*' |')[ord('|B¬@z”(ÀD°'[int(d)])%e]\ for d in x)+'\n'for e in(4,9,7)) This has only 110 bytes. Or you could write a simple password cracker like that: def crack(crypted, alphabet): for passwd in alphabet**4: if crypt(passwd, crypted[:2]) == crypted: return passwd And call it with alphabet = string.lowercase, for example. Cartesian products may be generally interesting for iterables: def imul(iterable1, iterable2): """cartesian product of two iterables""" for object1 in iterable1: for object2 in iterable2: if isinstance(object1, basestring) and \ isinstance(object2, basestring): yield object1 + object2 else: yield (object1, object2) def ipow(iterable, number): """cartesian product power of an iterable""" if number == 1: for object in iterable: yield object elif number > 1: for object1 in iterable: for object2 in ipow(iterable, number-1): yield object1 + object2 class istr(str): """str with iterable cartesian product""" def __mul__(self, other): if isinstance(other, str): return imul(self, other) else: return str(self)*other def __pow__(self, other): return ipow(self, other) I was wondering if similar functionality could be added in some way to Python. I noticed that Python has a lot of aggregate functions that can "reduce" given collection objects (like reduce, filter, min, max, sum, hash) and functions that keep the same size (like map, sort or zip), but few functions that can "inflate" the given objects (like range and enumerate). I know, such functions are dangerous because they also inflate time and memory consumed by the program. Still, sometimes they can make sense, whenever you for some reason simply *have* to walk through all the combinations. Jan 22 '06 #1
44 Replies

 P: n/a Christoph Zwerschke wrote: Sometimes I was missing such a feature. What I expect as the result is the "cartesian product" of the strings. I've been thinking of it as well. I'd like it for lists too: range(3)**2 [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] -- Giovanni Bajo Jan 22 '06 #2

 P: n/a Giovanni Bajo wrote: Christoph Zwerschke wrote: Sometimes I was missing such a feature. What I expect as the result is the "cartesian product" of the strings. I've been thinking of it as well. I'd like it for lists too: range(3)**2 [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] -- Giovanni Bajo But why isn't this interpreted as [0, 1, 4] like it is in Mathematica? I would prefer a polymorphic distribute(*args) function ( or generator ) that acts on tuples of listlike objects of equal size. It could be extended to distribute(*args[,default]) by a single default argument that is inserted in the distribution table when two listlike objects have not the same size. Kay Jan 22 '06 #3

 P: n/a Kay Schluehr wrote: ...> range(3)**2 [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)] ... But why isn't this interpreted as [0, 1, 4] like it is in Mathematica? Since range(3)*2 is [0, 1, 2, 0, 1, 2], it would be horribly, painfully inconsistent if **2 was interpreted as "square each item". Alex Jan 22 '06 #4

 P: n/a Kay Schluehr schrieb: But why isn't this interpreted as [0, 1, 4] like it is in Mathematica? Because we are thinking of a cartesian product. If you have lists of numbers, then there are mane more ways to define a product: tensor product, vector product, scalar product, componentwise product... The cartesian product is a much more generic concept, you can have it already for sets: class sset(set): def __mul__(self, other): return sset((a,b) for a in self for b in other) def __pow__(self, other): if isinstance(other, int) and other > 0: if other == 1: return self elif other == 2: return self*self else: return sset(a + (b,) \ for a in self**(other-1) for b in self) Example: for x in sorted(sset("ACGU")**3): print ''.join(x) AAA AAC AAG AAU ACA ACC .. .. .. UGU UUA UUC UUG UUU Now as I'm thinking about it, wouldn't it be nice to have the cartesian products on Python sets? Maybe also a method that returns the power set of a set (the set of all subsets), or the set of all subsets with a given length. You could get a 6/49 lotto tip with something like: choice(set(range(49)).powerset(6)) -- Christoph Jan 22 '06 #5

 P: n/a Alex Martelli wrote: Kay Schluehr wrote: range(3)**2 But why isn't this interpreted as [0, 1, 4] like it is in Mathematica? Since range(3)*2 is [0, 1, 2, 0, 1, 2], it would be horribly, painfully inconsistent if **2 was interpreted as "square each item". Yes. Python does not interpreate the product of a list with a number as a scalar product. Otherwise range(3)*2 should be [0, 1, 4] as well. For doing such things I would use a vector subtype of list. -- Christoph Jan 22 '06 #6

 P: n/a Christoph Zwerschke wrote: ... given length. You could get a 6/49 lotto tip with something like: choice(set(range(49)).powerset(6)) And that would be better than the current random.sample(range(49),6) in WHAT ways, exactly...? Alex Jan 22 '06 #7

 P: n/a al***@mail.comcast.net (Alex Martelli) writes: given length. You could get a 6/49 lotto tip with something like: choice(set(range(49)).powerset(6)) And that would be better than the current random.sample(range(49),6) in WHAT ways, exactly...? I think the first one would be incorrect since it samples with replacement. At least, it looks like it does. Jan 22 '06 #8

 P: n/a On Sun, 22 Jan 2006 18:29:45 +0100, Christoph Zwerschke wrote: Alex Martelli wrote: Kay Schluehr wrote: range(3)**2 But why isn't this interpreted as [0, 1, 4] like it is in Mathematica? Since range(3)*2 is [0, 1, 2, 0, 1, 2], it would be horribly, painfully inconsistent if **2 was interpreted as "square each item". Yes. Python does not interpreate the product of a list with a number as a scalar product. Otherwise range(3)*2 should be [0, 1, 4] as well. For doing such things I would use a vector subtype of list. Not everything needs to be a separate class! Why create a magic class for every piece of functionality you want? Just create functions that operate on existing classes! Instead of a class that supports cartesian products, make a function that takes two sequences and returns the cartesian product of them. (This will likely be best implemented as a generator.) If you write it properly, which is to say if you don't go out of your way to break it, this function will *automatically* work on any sequence type, lists, tuples, strings, and things you and I haven't even thought of. What advantage is there to creating a "list with cartesian product" subclass of list? -- Steven. Jan 22 '06 #9

 P: n/a Alex Martelli schrieb: Christoph Zwerschke wrote: ... given length. You could get a 6/49 lotto tip with something like: choice(set(range(49)).powerset(6)) And that would be better than the current random.sample(range(49),6) in WHAT ways, exactly...? You're right, random.sample(range(49),6) does the same and much faster. I just didn't think of it (it is new since Python 2.3). What if you need 12 different tips for your lotto ticket? s = set(range(49)).powerset(6) for x in range(10): print s.pop() But the real disadvantage of this idea is the memory consumed and the time to set up that memory: set(range(49)).powerset(6) has a cardinality of about 13 million entries! You PC would start to swap just for getting a lotto tip... -- Christoph Jan 22 '06 #10

 P: n/a Steven D'Aprano wrote: On Sun, 22 Jan 2006 18:29:45 +0100, Christoph Zwerschke wrote: For doing such things I would use a vector subtype of list. Not everything needs to be a separate class! Why create a magic class for every piece of functionality you want? Just create functions that operate on existing classes! What advantage is there to creating a "list with cartesian product" subclass of list? Principally, you're right (see also my example with iterators). But I can still see two reasons for classes: 1) That function would have to make a lot of case distinctions (check the types of operands). If you have a class, you already know the type of the operands (at least one). 2) It allows you to write a*b instead of mul(a,b) which looks nicer. -- Christoph Jan 22 '06 #11

 P: n/a Paul Rubin schrieb: al***@mail.comcast.net (Alex Martelli) writes: given length. You could get a 6/49 lotto tip with something like: choice(set(range(49)).powerset(6)) And that would be better than the current random.sample(range(49),6) in WHAT ways, exactly...? I think the first one would be incorrect since it samples with replacement. At least, it looks like it does. No, the elements of the powerset would be sets with 6 elements each, not tuples. So technically, it would be correct. Just horribly inefficient. -- Christoph Jan 22 '06 #12

 P: n/a Christoph Zwerschke writes: No, the elements of the powerset would be sets with 6 elements each, not tuples. So technically, it would be correct. Just horribly inefficient. Oh I see, not the Cartesian product. Yeah, it would be silly in practice. Jan 22 '06 #13

 P: n/a Steven D'Aprano wrote: ... What advantage is there to creating a "list with cartesian product" subclass of list? Essentially, syntax sugar -- for some people, being able to code a*b rather than product(a,b) takes on a huge significance; Python chooses to support this syntax variation by special methods in classes, and thus encourages people to create classes if they're keen on the syntax. Alex Jan 22 '06 #14

 P: n/a Christoph Zwerschke wrote: Alex Martelli schrieb: Christoph Zwerschke wrote: ... given length. You could get a 6/49 lotto tip with something like: choice(set(range(49)).powerset(6)) And that would be better than the current random.sample(range(49),6) in WHAT ways, exactly...? You're right, random.sample(range(49),6) does the same and much faster. I just didn't think of it (it is new since Python 2.3). Yep, it's one of 2.3's many little gems. Still, builtin set is new in 2.4, so it's hardly a "more classic" approach;-). What if you need 12 different tips for your lotto ticket? You probably want random ones, then... s = set(range(49)).powerset(6) for x in range(10): print s.pop() This is very systematic, not random;-). Still, using random.sample on s would indeed produce 12 random different tips!-) But the real disadvantage of this idea is the memory consumed and the time to set up that memory: set(range(49)).powerset(6) has a cardinality of about 13 million entries! You PC would start to swap just for getting a lotto tip... Well then, you need a new PC, 64-bit and with as many GB of RAM as required, no?-) Until you get such a PC, something like: tips = set() while len(tips) < 10: tip = frozenzet(random.sample(range(49), 6)) tips.add(tip) will have to suffice, I guess;-). Alex Jan 22 '06 #15

 P: n/a Generally, if you could multiply strings in the above fashion, you could spare one more more sub loops, as in this example: for f in ('index', 'default')*('.html', '.htm', '.shtml'): if exists(f): break In this case, it would be not really be better than that: for f in 'index', 'default': for e in '.html', '.htm', '.shtml': if exists(f+e): break The password cracking was already a better example. But this would be only efficient if the product of two strings is returned as a generator, not a list. And I don't know if you really would expect that, since multiplying a string or a list with a number does not change its type. -- Christoph Jan 22 '06 #16

 P: n/a On Sun, 22 Jan 2006 10:41:39 -0800, Alex Martelli wrote: Steven D'Aprano wrote: ... What advantage is there to creating a "list with cartesian product" subclass of list? Essentially, syntax sugar -- for some people, being able to code a*b rather than product(a,b) takes on a huge significance; Python chooses to support this syntax variation by special methods in classes, and thus encourages people to create classes if they're keen on the syntax. I beg to differ: Python *allows* people to create classes if they're keen on the syntax (and I can sympathise with that like), but Python *encourages* by example generic tools that operate on as many different types as makes sense. -- Steven. Jan 22 '06 #17

 P: n/a Alex Martelli schrieb: s = set(range(49)).powerset(6) for x in range(10): print s.pop() This is very systematic, not random;-). Still, using random.sample on s would indeed produce 12 random different tips!-) Right, that would be systematic. What I wanted to write was: s = set(range(49)).powerset(6) for x in range(10): c = choice(s) print c s.remove(c) Of course you could also use random.sample again: random.sample(set(range(49)).powerset(6), 10) But this would be just as inefficient. tips = set() while len(tips) < 10: tip = frozenzet(random.sample(range(49), 6)) tips.add(tip) Yep, that's better. The amount of hand coding is even the same as above. -- Christoph Jan 22 '06 #18

 P: n/a Steven D'Aprano wrote: I beg to differ: Python *allows* people to create classes if they're keen on the syntax (and I can sympathise with that like), but Python *encourages* by example generic tools that operate on as many different types as makes sense. But these generic tools (say the "pow" function) in turn simply use the methods defined by the classes. -- Christoph Jan 22 '06 #19

 P: n/a On Sun, 22 Jan 2006 19:12:49 +0100, Christoph Zwerschke wrote: Steven D'Aprano wrote: On Sun, 22 Jan 2006 18:29:45 +0100, Christoph Zwerschke wrote: For doing such things I would use a vector subtype of list. Not everything needs to be a separate class! Why create a magic class for every piece of functionality you want? Just create functions that operate on existing classes! > > What advantage is there to creating a "list with cartesian product" > subclass of list? Principally, you're right (see also my example with iterators). But I can still see two reasons for classes: 1) That function would have to make a lot of case distinctions (check the types of operands). If you have a class, you already know the type of the operands (at least one). If you are happy to always return a list of tuples regardless of what the two operands are, generators make it so easy it is shameful. Even if you want a special case of two string arguments returning a string, it is hardly any more difficult: def cartprod(A, B): if type(A) == type(B) == str: convert = lambda obj: "".join(list(obj)) else: convert = lambda obj: obj # do nothing for a in A: for b in B: yield convert((a, b)) Notice that the *only* reason we look at the type of the arguments is because we want two strings to return a string. If we don't care about that, the generator is less than half the size: def cartprod(A, B): for a in A: for b in B: yield (a, b) That's a generator giving you the products one at a time. For the benefit of anyone out there who doesn't know about generators, you use it like this: cprods = cartprod([1,2,3], "abc") for t in cprods: .... print t .... (1, 'a') (1, 'b') (2, 'a') (2, 'b') (3, 'a') (3, 'b') If you want them all at once, memory allowing, you do this: cprods = cartprod([1,2,3], "abc") list(cprods) [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')] 2) It allows you to write a*b instead of mul(a,b) which looks nicer. But not as clear as cartprod(a,b) or even cartesian_product(a,b). -- Steven. Jan 22 '06 #20

 P: n/a Steven D'Aprano wrote: If you are happy to always return a list of tuples regardless of what the two operands are, generators make it so easy it is shameful. Even if you want a special case of two string arguments returning a string, it is hardly any more difficult: def cartprod(A, B): if type(A) == type(B) == str: convert = lambda obj: "".join(list(obj)) else: convert = lambda obj: obj # do nothing for a in A: for b in B: yield convert((a, b)) I didn't deny that it's handy; my "imul" example was pretty much the same. If you don't want to use the a*b syntax, it's a good solution. -- Christoph Jan 22 '06 #21

 P: n/a Christoph Zwerschke wrote: In Python, it is possible to multiply a string with a number: >>> "hello"*3 'hellohellohello' Which is really useful. However, you can't multiply a string with another string: >>> 'hello'*'world' Traceback (most recent call last): File "", line 1, in ? TypeError: can't multiply sequence by non-int Sometimes I was missing such a feature. What I expect as the result is the "cartesian product" of the strings. There's no such thing; you'd have to define it first. Are duplicates significant? Order? What you seem to want is easy enough: [a + b for a in 'hello' for b in 'world'] [...] Cartesian products may be generally interesting for iterables: And maybe you want the result to be a generator: (a + b for a in 'hello' for b in 'world') New language features should be widely useful, and difficult or awkward to code in Python as it is. All-combinations-of-sequences is trivial to code and rarely needed. -- --Bryan Jan 23 '06 #22

 P: n/a Bryan Olson wrote: What you seem to want is easy enough: [a + b for a in 'hello' for b in 'world'] Yes, but 'hello'*'world' is easier. And how would you write '01'**8 ? This should give you all bytes in binary code. And maybe you want the result to be a generator: (a + b for a in 'hello' for b in 'world') That may be the main problem to decide whether the cartesian product should return a generator or a list. A list would be more intuitive and can be accessed directly. For instance, ('01'**8)[42] would give you the binary code of 42 (though in an inefficient way). But because the inflationary character of cartesian products, a generator would be also desirable (though you wouldn't expect '01'**2 to be a generator, since '01'*2 isn't a generator either). New language features should be widely useful, and difficult or awkward to code in Python as it is. All-combinations-of-sequences is trivial to code and rarely needed. That's the other problem. The uses cases (like the password cracker example) are very limited and in these cases you can either write nested loops or write your own cartesian product. -- Christoph BTW: What is the shortest way to get the binary representation of a number in Python? Is there really not something like itoa() anywhere in the standard libs? Jan 23 '06 #23

 P: n/a Christoph Zwerschke wrote: [...] That may be the main problem to decide whether the cartesian product should return a generator or a list. The Cartesion product is a set. [...] That's the other problem. The uses cases (like the password cracker example) are very limited and in these cases you can either write nested loops or write your own cartesian product. Cartesian product is one of the essential operations of relational algebra; in that context it's widely useful. By itself, it's usually not what one wants. -- --Bryan Jan 23 '06 #24

 P: n/a On Mon, 23 Jan 2006 01:25:36 +0000, Bryan Olson wrote: Sometimes I was missing such a feature. What I expect as the result is the "cartesian product" of the strings. There's no such thing; you'd have to define it first. Are duplicates significant? Order? Google "cartesian product" and hit "I'm feeling lucky". Or go here: http://mathworld.wolfram.com/CartesianProduct.html Still think there is no such thing? -- Steven. Jan 23 '06 #25

 P: n/a On Mon, 23 Jan 2006 10:36:55 +0000, Bryan Olson wrote: Christoph Zwerschke wrote: [...] That may be the main problem to decide whether the cartesian product should return a generator or a list. The Cartesion product is a set. And the generalization of mathematical sets in Python can be built-in sets, lists or tuples, depending on what you need. Given that cartesian products tend to be *extremely* large, some sort of iterator is the only practical solution -- even if that's not mathematically pure. [...] That's the other problem. The uses cases (like the password cracker example) are very limited and in these cases you can either write nested loops or write your own cartesian product. Cartesian product is one of the essential operations of relational algebra; in that context it's widely useful. By itself, it's usually not what one wants. Google on "Cartesian product python" and you will find thousands of hits. This is something that keeps coming up over and over again. Personally, I think cartesian products, together with permutations and combinations, belong in a module, not built-ins. -- Steven. Jan 23 '06 #26

 P: n/a Bryan Olson wrote: Christoph Zwerschke wrote: [...] That may be the main problem to decide whether the cartesian product should return a generator or a list. The Cartesion product is a set. Sets are iterable, so that isn't an answer. If it's a set, then it should either be a generator or a set. -- Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/ San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis Nothing is potent against love save impotence. -- Samuel Butler Jan 23 '06 #27

 P: n/a Bryan Olson schrieb: Christoph Zwerschke wrote: [...] That may be the main problem to decide whether the cartesian product should return a generator or a list. The Cartesion product is a set. Of course it is a set. But if the factors of the product have a total order (as in the case of strings, tuples or lists), this carries over to the cartesian product (lexicographical order). The implementation should reflect that by giving you a list or a generator, not a set. Only if I build the cartesian product of sets I would expect a set. "ab"*"cd" should be "ac", "ad", "bc", "bd", in this order. Cartesian product is one of the essential operations of relational algebra; in that context it's widely useful. By itself, it's usually not what one wants. Usually not. But sometimes you may want it. That's why SQL defines has the "cross join" clause. Usually it is not needed and should be avoided, but sometimes it may make sense. -- Christoph Jan 23 '06 #28

 P: n/a Steven D'Aprano wrote: Bryan Olson wrote:Christoph Zwerschke wrote:[...]That may be the main problem to decide whether the cartesian productshould return a generator or a list.The Cartesion product is a set. And the generalization of mathematical sets in Python can be built-in sets, lists or tuples, depending on what you need. Given that cartesian products tend to be *extremely* large, some sort of iterator is the only practical solution -- even if that's not mathematically pure. Query languages have included Cartesian product for decades. Their solution is to optimize expressions to avoid building, or even iterating over, big Cartesian products. [...]That's the other problem. The uses cases (like the password crackerexample) are very limited and in these cases you can either write nestedloops or write your own cartesian product.Cartesian product is one of the essential operations ofrelational algebra; in that context it's widely useful.By itself, it's usually not what one wants. Google on "Cartesian product python" and you will find thousands of hits. This is something that keeps coming up over and over again. It keeps coming up in exercises and examples. How many of your Google results note an application that actually calls for a Cartesian product enumerator? I wrote one an old thread, and so far I've written it one more time than I've used it. -- --Bryan Jan 23 '06 #29

 P: n/a Steven D'Aprano wrote: Bryan Olson wrote: [Christoph Zwerschke had written:]What I expect as the result is the "cartesian product" of the strings.There's no such thing; you'd have to define it first. Are duplicatessignificant? Order? Google "cartesian product" and hit "I'm feeling lucky". Or go here: http://mathworld.wolfram.com/CartesianProduct.html Still think there is no such thing? Uh, yes. The Cartesian product of two sets A and B (also called the product set, set direct product, or cross product) is defined to be the set of [...] All sets, no strings. What were you looking at? (I'm also happy to see someone agrees with my capitalization of "Cartesian".) -- --Bryan Jan 23 '06 #30

 P: n/a Bryan Olson wrote: There's no such thing; you'd have to define it first. Are duplicates significant? Order? That's all trivial isn't it? A string is a set of pairs (i,c) where i is an integer number, the index, with 0<=i

 P: n/a On Mon, 23 Jan 2006 18:17:08 +0000, Bryan Olson wrote: Steven D'Aprano wrote: Bryan Olson wrote: [Christoph Zwerschke had written:]What I expect as the result is the "cartesian product" of the strings.There's no such thing; you'd have to define it first. Are duplicatessignificant? Order? Google "cartesian product" and hit "I'm feeling lucky". Or go here: http://mathworld.wolfram.com/CartesianProduct.html Still think there is no such thing? Uh, yes. The Cartesian product of two sets A and B (also called the product set, set direct product, or cross product) is defined to be the set of [...] All sets, no strings. What were you looking at? You've just quoted a definition of Cartesian product [yes, you are right about capitalisation, my bad]. How can you say with a straight face that there is no such thing as a Cartesian product? The question of sets versus strings is a red herring. Addition, in mathematics, is defined on reals. No computer yet made can do addition on reals. Should you declare that there is no such thing as addition because reals and floats are different? If people wish to extend Cartesian products to work on sequences, not just sets, then to my mind that's a pretty obvious and sensible generalisation. -- Steven. Jan 23 '06 #32

 P: n/a Steven D'Aprano wrote: On Mon, 23 Jan 2006 18:17:08 +0000, Bryan Olson wrote: Steven D'Aprano wrote: Bryan Olson wrote: [Christoph Zwerschke had written:]>What I expect as the result is the "cartesian product" of the strings.There's no such thing; you'd have to define it first. Are duplicatessignificant? Order? Google "cartesian product" and hit "I'm feeling lucky". Or go here: http://mathworld.wolfram.com/CartesianProduct.html Still think there is no such thing? Uh, yes. The Cartesian product of two sets A and B (also called the product set, set direct product, or cross product) is defined to be the set of [...] All sets, no strings. What were you looking at? You've just quoted a definition of Cartesian product [yes, you are right about capitalisation, my bad]. How can you say with a straight face that there is no such thing as a Cartesian product? The question of sets versus strings is a red herring. Addition, in mathematics, is defined on reals. No computer yet made can do addition on reals. Should you declare that there is no such thing as addition because reals and floats are different? If people wish to extend Cartesian products to work on sequences, not just sets, then to my mind that's a pretty obvious and sensible generalisation. I use Cartesian Products all the time in MS-Access, typically the product of a set (of records) with itself. I really like your generator because I can now do this stuff directly in Python. And I usually want just a subset of the Cartesian Product, so I modified your generator to produce what I want: def cartprod(A, B, perm=True, repl=True): if type(A) == type(B) == str: convert = lambda obj: "".join(list(obj)) else: convert = lambda obj: obj # do nothing for a in A: for b in B: if perm and repl: # permutation with replacement yield convert((a, b)) if (not perm) and repl: # combination with replacement if b>=a: yield convert((a,b)) if perm and (not repl): # permutation w/o replacement if b!=a: yield convert((a,b)) if (not perm) and (not repl): # combination w/o replacement if b>a: yield convert((a,b)) print print 'For "abc"x"abc":' print cprods = cartprod("abc", "abc", True, True) print 'permutation with replacement', list(cprods) print cprods = cartprod("abc", "abc", False, True) print 'combination with replacement', list(cprods) print cprods = cartprod("abc", "abc", True, False) print 'permutation w/o replacement', list(cprods) print cprods = cartprod("abc", "abc", False, False) print 'combination w/o replacement', list(cprods) """ For "abc"x"abc": permutation with replacement ['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc'] combination with replacement ['aa', 'ab', 'ac', 'bb', 'bc', 'cc'] permutation w/o replacement ['ab', 'ac', 'ba', 'bc', 'ca', 'cb'] combination w/o replacement ['ab', 'ac', 'bc'] """ Thanks for showing me an easy way to do this. -- Steven. Jan 23 '06 #33

 P: n/a Bryan Olson schrieb: Still think there is no such thing? Uh, yes. The Cartesian product of two sets A and B (also called the product set, set direct product, or cross product) is defined to be the set of [...] All sets, no strings. What were you looking at? Not only sets. This goes on (anyway "everything is a set"). You can also have the Cartesian product of functions. And you can think of a string as a function from a countable index set I to the set of all characters C. So the Cartesian product of two strings will become a function from IxI to CxC. Since IxX is countable again, this is equivalent to a tuple of 2-tuples of characters which you can also interpret as a tuple of strings with 2 chars: "ab" x "cd" = ("ac", "ad", "bc", "bd") Do I have eliminated all remaining clarities now? :-) -- Christoph Jan 24 '06 #34

 P: n/a Kay Schluehr wrote: Bryan Olson wrote:There's no such thing; you'd have to define it first. Are duplicatessignificant? Order? That's all trivial isn't it? A string is a set of pairs (i,c) where i is an integer number, the index, with 0<=i

 P: n/a Christoph Zwerschke wrote: Bryan Olson schrieb: Still think there is no such thing? Uh, yes. The Cartesian product of two sets A and B (also called the product set, set direct product, or cross product) is defined to be the set of [...] All sets, no strings. What were you looking at? Not only sets. This goes on (anyway "everything is a set"). You can also have the Cartesian product of functions. And you can think of a string as a function from a countable index set I to the set of all characters C. So the Cartesian product of two strings will become a function from IxI to CxC. Since IxX is countable again, this is equivalent to a tuple of 2-tuples of characters which you can also interpret as a tuple of strings with 2 chars: "ab" x "cd" = ("ac", "ad", "bc", "bd") Do I have eliminated all remaining clarities now? :-) -- Christoph Christoph, i think you raised a great issue: a lack of efficient support for "combining" objects. Any language, if has smth to do with reality, needs that kind of functionality. The combination dynamics, or growth (multiplication) dynamics is a critically important functionality in chemistry, physics, biology. It probably may be emulated by standard means such as lists and dictionaries. If such support is available though, this is a sign of mature language designed to cover the realistic processes with rich growth/combination dynamics. For instance, the dynamics of aperiodic growth that generates a 3D aperiodic arrays/structures with the controllable "bits" in each unit to be configured by dynamic masks to match the environmental ("boundary") conditions would be a significant step in building next-generation languages/silicon to support synthesis of realistic 3D structures (and functions). Accordingly, the command "line" may need to be 2D and the interpreter be designed to handle/understand not only a (command) text. Just reflecting aloud.. val Jan 24 '06 #37

 P: n/a Christoph Zwerschke wrote: Now as I'm thinking about it, wouldn't it be nice to have the cartesian products on Python sets? Maybe also a method that returns the power set of a set (the set of all subsets), or the set of all subsets with a given length. For defining powersets it might suffice to include sets as exponents. If A is a set then Set([0,1])**A is the set of maps from A into {0,1}. The kernels of those maps ( preimages of 0 in A ) is a subset classifier of A. Kay Jan 24 '06 #38

 P: n/a Christoph Zwerschke wrote: Bryan Olson schrieb: Still think there is no such thing? Uh, yes. The Cartesian product of two sets A and B (also called the product set, set direct product, or cross product) is defined to be the set of [...] All sets, no strings. What were you looking at? Not only sets. Snipping is not going to make the facts go away. I did not choose the reference at issue in this strand: http://mathworld.wolfram.com/CartesianProduct.html This goes on (anyway "everything is a set"). The claim "everything is a set" falls into the category of 'not even wrong'. Whatever semantics Python adopts, it must be well-defined. Watch things not be sets: x = [1, 1, 2] y = [1, 2] print x == y print set(x) == set(y) You can also have the Cartesian product of functions. And you can think of a string as a function from a countable index set I to the set of all characters C. So the Cartesian product of two strings will become a function from IxI to CxC. Since IxX is countable again, this is equivalent to a tuple of 2-tuples of characters which you can also interpret as a tuple of strings with 2 chars: "ab" x "cd" = ("ac", "ad", "bc", "bd") I really did try to raise the real issues. I cannot make you answer, but the question remains: are duplicate and order significant in what you call "Cartesian product" or they not? Can you show that your proposed language extensions are useful and consistent in some reasonable sense? Do I have eliminated all remaining clarities now? :-) Yes. Good one. Sure. -- --Bryan Jan 24 '06 #39

 P: n/a Bryan Olson wrote: The claim "everything is a set" falls into the category of 'not even wrong'. No, it falls into the category of the most fundamental Mathematical concepts. You actually *define* tuples as sets, or functions as sets or relations as sets, or even all kinds of numbers and other things which exist in the heads of Mathematicians as sets. Watch things not be sets: x = [1, 1, 2] y = [1, 2] print x == y print set(x) == set(y) Python tuples and lists are of course not the same as Python sets. But mathematically, you can understand them as sets anyway and associate every Python tuple with a Python set. The naive approach to understand a tuple as the set of its values which is done by casting to set() does not work, as you rightly noticed. The associated set to a Python tuple or list x would be set(enumerate(x)), not set(x). Generally, two approaches are common for constructing tuples as sets: (A) Think of an n-tuple as a function on the index set, range(n). Then remember a function is a special relation is a set. (1, 2, 2) would correspond to the set {(0, 1), (1, 1), (1, 2)} (1, 2) would correspond to the set {(0, 1), (1, 2)} In Python, the tuple or list x would correspond to set(enumerate(x)). As a sidemark, another common approach is this: (B) Define the set corresponding to (1, 2) as {1, 2}. Define the set corresponding to (1, 2, 2) as {{1, 2}, 2}, the set corresponding to (1, 2, 2, 4) as {{{1, 2}, 2}, 4} and so on. I really did try to raise the real issues. I cannot make you answer, but the question remains: are duplicate and order significant in what you call "Cartesian product" or they not? Can you show that your proposed language extensions are useful and consistent in some reasonable sense? I already tried to answer. It is not what "I call" Cartesian product. Since functions are sets in Mathematics, once you have a Cartesian product on sets, there is a natural (canonical) way to define a Cartesian product on functions as well. So there is also a canonical way to define a Cartesian product on tuples, if you interpret tuples as functions via (A). And there is a canonical way to understand the resulting sets as tuples again (by the lexicographical order of the index set). So the cartesian product of a string, tuple, or list is well-defined including its order. The only ambiguity is whether the result should be a generator or a tuple, and in the case of strings whether the elements in the result should be returned as tuples, "ab"*"cd" = ("a", c"), ("a", "d"), ("b", "c"), ("b", "d") or concatenated as strings: "ab"*"cd" = "ac", "ad", "bc", "bd" In any way, there is no dispute about duplicates or ordering. This is all canonical and well-defined. Concerning the use, I admit there is no really frequent use, but in some occasions it may be useful and I already gave some examples. -- Christoph Jan 24 '06 #40

 P: n/a Christoph Zwerschke wrote: Bryan Olson wrote: The claim "everything is a set" falls into the category of 'not even wrong'. No, it falls into the category of the most fundamental Mathematical concepts. You actually *define* tuples as sets, or functions as sets or relations as sets, or even all kinds of numbers and other things which exist in the heads of Mathematicians as sets. No, wrong ... or well, obviously ... or uh ... sorry. I've lost track of what we're arguing. If you want to stand behind "everything is a set," I can certainly present a case to the contrary. Is there anything I actually claimed that you are prepared to argue against? Cite me, and I'll defend or retract, or at least re-phrase. I definitely did make specific claims. I'll list them if you want, and I'll listen to evidence against them, should you choose to present any. I expect that neither of us wants to devote our energy to flaming on in violent agreement. -- --Bryan Jan 24 '06 #42

 P: n/a I think this has been discussed thoroughgoing enough. All I wanted to say is that this proposal of building "Cartesian products" of strings is well-defined, in line with the mathematical concept of Cartesian products, and can be *sometimes* useful. I hope we agree on this, but if not we don't need to go through this again... Peace, Christoph Jan 24 '06 #43

 P: n/a "Christoph Zwerschke" wrote in message news:dr**********@online.de... Bryan Olson wrote: The claim "everything is a set" falls into the category of 'not even wrong'. No, it falls into the category of the most fundamental Mathematical concepts. You actually *define* tuples as sets, or functions as sets or relations as sets, or even all kinds of numbers and other things which exist in the heads of Mathematicians as sets. You might so define, but Bryan and others might not. The philosophical/methodological idea that 'everything is a set' has been very fruitful but it is not a fact. Alternative ideas are 'everything is a function' and 'everything is defined by axioms'. As for set theory itself, there are multiple nonequivalent consistent theories for uncountable sets and therefore multiple definitions of what a set it. And to me, Russel's 'paradox' is a proof by negation that there is at least one collection that is *not* a set. Back to the thread topic: The cross-catenation operator (a generalization of 'cartesian product' to sequences) is sometimes very useful. Recursive definitions of combinatorial functions provide several examples. Whether it should be built into the language or left to subclassers is a different issue. Terry J. Reedy Jan 25 '06 #44

 P: n/a Terry Reedy wrote: You might so define, but Bryan and others might not. The philosophical/methodological idea that 'everything is a set' has been very fruitful but it is not a fact. Alternative ideas are 'everything is a function' and 'everything is defined by axioms'. according to google, ideas such as everything is illuminated everything is a miracle everything is an object and even everything is a fscking dns problem seems to be a bit more common. Jan 25 '06 #45

### This discussion thread is closed

Replies have been disabled for this discussion.

### Similar topics

Browse more Python Questions on Bytes