# [perl-python] generic equivalence partition

another functional exercise with lists.

Here's the perl documentation. I'll post a perl and the translated python version in 48 hours.

=pod

parti(aList, equalFunc)

given a list aList of n elements, we want to return a list that is a range of numbers from 1 to n, partitioned by the predicate function of equivalence equalFunc. (a predicate function is a function that takes two arguments, and returns either True or False.)

Note: a mathematical aspect: there are certain mathematical constraints on a function that checks equivalence. That is to say, if a==b, then b==a. If a==b and b==c, then a==c. And, a==a. If an equivalence function does not satisfy these, it is inconsistent and basically gives meaningless results.

example:

    parti([['x','x','x','1'],
    ['x','x','x','2'],
    ['x','x','x','2'],
    ['x','x','x','2'],
    ['x','x','x','3'],
    ['x','x','x','4'],
    ['x','x','x','5'],
    ['x','x','x','5']], sub {$_[0]->[3] == $_[1]->[3]} )

returns

    [[1],['2','3','4'],['5'],['6'],['7','8']];

=cut

In the example given, the input list's elements are lists of 4 elements, and the equivalence function is one that returns True if the last items are the same.

Note that this is a generic function. The input can be a list whose elements are of any type. What "parti" does is to return a partitioned range of numbers that tells us which input elements are equivalent to which, according to the predicate given. For example, in the given example, it tells us that the 2nd, 3rd, 4th elements are equivalent. And they are equivalent as measured by the predicate function given, which basically tests if their last items are the same integer. (note that if we want to view the result as indexes, then it is a 1-based index, i.e. counting starts at 1.)

PS if you didn't realize yet, nested lists/dictionaries in perl are a complete pain in the ass.

PS note that the code "sub {$_[0]->[3] == $_[1]->[3]}" is what's called the lambda form, in Perl.

Xah
xa*@xahlee.org
http://xahlee.org/PageTwo_dir/more.html

Jul 18 '05 #1

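The three constraints in the note above (a==a, a==b implies b==a, and a==b with b==c implies a==c) can be spot-checked on a finite sample before a predicate is handed to parti. What follows is only a minimal sketch in Python 3: `is_equivalence_on`, `same_last` and `close_enough` are hypothetical names made up for illustration, and passing the check on a sample does not prove the predicate is consistent on every possible input.

```python
from itertools import product

def is_equivalence_on(sample, equal_func):
    """Return True if equal_func is reflexive, symmetric and transitive on sample."""
    # Reflexive: every element is equivalent to itself.
    if not all(equal_func(a, a) for a in sample):
        return False
    # Symmetric: the order of the two arguments must not matter.
    for a, b in product(sample, repeat=2):
        if equal_func(a, b) != equal_func(b, a):
            return False
    # Transitive: a==b and b==c must imply a==c.
    for a, b, c in product(sample, repeat=3):
        if equal_func(a, b) and equal_func(b, c) and not equal_func(a, c):
            return False
    return True

# Predicate from the example: equal when the last items match.
same_last = lambda a, b: a[-1] == b[-1]
# A tempting but inconsistent predicate: 1~2 and 2~3, yet 1 and 3 differ by 2.
close_enough = lambda a, b: abs(a - b) <= 1

rows = [['x', 'x', 'x', '1'], ['x', 'x', 'x', '2'], ['x', 'x', 'x', '2']]
print(is_equivalence_on(rows, same_last))          # True
print(is_equivalence_on([1, 2, 3], close_enough))  # False
```

The check is cubic in the sample size, so it is only meant for a handful of representative inputs.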
 this is the first thing that came to my mind. i'm sure there are more clever ways to do this.

    elements = [['x', 'x', 'x', '1'],
                ['x', 'x', 'x', '2'],
                ['x', 'x', 'x', '2'],
                ['x', 'x', 'x', '2'],
                ['x', 'x', 'x', '3'],
                ['x', 'x', 'x', '4'],
                ['x', 'x', 'x', '5'],
                ['x', 'x', 'x', '5']]

    # map each distinct last item to the 1-based positions where it occurs
    pos = {}
    for i, element in enumerate(elements):
        pos.setdefault(element[-1], []).append(i+1)

    # sorted() works whether values() returns a list or a view
    p = sorted(pos.values())
    print(p)

    [[1], [2, 3, 4], [5], [6], [7, 8]]

bryan

On Thu, 24 Feb 2005 17:48:47 -0800, Bryan wrote:

> Xah Lee wrote:
>
> > another functional exercise with lists.
> >
> > Here's the perl documentation. I'll post a perl and the translated python version in 48 hours.
> >
> > =pod
> >
> > parti(aList, equalFunc)
> >
> > given a list aList of n elements, we want to return a list that is a range of numbers from 1 to n, partition by the predicate function of equivalence equalFunc. (a predicate function is a function that takes two arguments, and returns either True or False.)
> >
> > [snip]
> >
> > example:
> >
> >     parti([['x','x','x','1'],
> >     ['x','x','x','2'],
> >     [snip]
> >     ['x','x','x','5']], sub {$_[0]->[3] == $_[1]->[3]} )
> >
> > returns
> >
> >     [[1],['2','3','4'],['5'],['6'],['7','8']];
> >
> > =cut
> >
> > In the example given, the input list's elements are lists of 4 elements, and the equivalence function is one that returns True if the last item are the same.
> >
> > [snip]
>
> this is the first thing that came to my mind. i'm sure there are more clever ways to do this.
>
>     elements = [['x', 'x', 'x', '1'],
>     [snip]
>     ['x', 'x', 'x', '5']]
>
>     pos = {}
>     for i, element in enumerate(elements):
>         pos.setdefault(element[-1], []).append(i+1)
>
>     p = pos.values()
>     p.sort()
>
>     [[1], [2, 3, 4], [5], [6], [7, 8]]

Bryan: Bzzzt. Xah was proposing a GENERAL function. You have HARDWIRED his (simplistic) example.

Xah: Bzzzt. Too close to your previous exercise.

Jul 18 '05 #3

In article <11**********************@f14g2000cwb.googlegroups .com>, "Xah Lee" wrote:

> parti(aList, equalFunc)
>
> given a list aList of n elements, we want to return a list that is a range of numbers from 1 to n, partition by the predicate function of equivalence equalFunc. (a predicate function is a function that takes two arguments, and returns either True or False.)

In Python it is much more natural to use ranges from 0 to n-1.

In the worst case, this is going to have to take quadratic time (consider an equalFunc that always returns false) so we might as well do something really simple rather than trying to be clever.

    def parti(aList, equalFunc):
        eqv = []
        for i in range(len(aList)):
            print(i, eqv)
            for L in eqv:
                # L is a list of indices; compare against its first member
                if equalFunc(aList[i], aList[L[0]]):
                    L.append(i)
                    break
            else:
                # no existing class matched, so start a new one
                eqv.append([i])
        return eqv

If you really want the ranges to be 1 to n, add one to each number in the returned list-of-lists.

--
David Eppstein
Computer Science Dept., Univ. of California, Irvine
http://www.ics.uci.edu/~eppstein/

Jul 18 '05 #4

David Eppstein wrote:

>     def parti(aList, equalFunc):
>         eqv = []
>         for i in range(len(aList)):
>             print(i, eqv)
>             for L in eqv:
>                 if equalFunc(aList[i], aList[L[0]]):
>                     L.append(i)
>                     break
>             else:
>                 eqv.append([i])
>         return eqv

Um, take out the print, that was just there for me to debug my code.

--
David Eppstein
Computer Science Dept., Univ. of California, Irvine
http://www.ics.uci.edu/~eppstein/

Jul 18 '05 #5
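To make the quadratic worst case concrete, the simple approach can be run with a predicate wrapped in a call counter. This is a sketch in Python 3 under a few assumptions: each element is compared against the first member of every existing class, indices are 0-based, and `count_calls` is a hypothetical helper that is not part of any post in this thread.

```python
def parti(a_list, equal_func):
    """Partition the indices 0..n-1 of a_list into equivalence classes."""
    classes = []
    for i, item in enumerate(a_list):
        for cls in classes:
            # One comparison per existing class, against its first member.
            if equal_func(item, a_list[cls[0]]):
                cls.append(i)
                break
        else:
            # No class matched: item starts a class of its own.
            classes.append([i])
    return classes

def count_calls(func):
    """Wrap a two-argument predicate; wrapper.calls counts invocations."""
    def wrapper(a, b):
        wrapper.calls += 1
        return func(a, b)
    wrapper.calls = 0
    return wrapper

elements = [['x', 'x', 'x', c] for c in '12223455']

# Worst case: nothing is ever equivalent, so element i costs i comparisons.
never = count_calls(lambda a, b: False)
print(parti(elements, never))  # [[0], [1], [2], [3], [4], [5], [6], [7]]
print(never.calls)             # 28, i.e. 8*7/2

# The example predicate needs fewer calls because classes are shared.
same_last = count_calls(lambda a, b: a[-1] == b[-1])
print(parti(elements, same_last))  # [[0], [1, 2, 3], [4], [5], [6, 7]]
print(same_last.calls)             # 19
```

Adding one to every index, as suggested above, gives the 1-based result from the original specification.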

David Eppstein wrote:

> In article <11**********************@f14g2000cwb.googlegroups .com>, "Xah Lee" wrote:
>
> > given a list aList of n elements, we want to return a list that is a range of numbers from 1 to n, partition by the predicate function of equivalence equalFunc.
>
> In the worst case, this is going to have to take quadratic time (consider an equalFunc that always returns false) so we might as well do something really simple rather than trying to be clever.
>
>     def parti(aList, equalFunc):
>         eqv = []
>         for i in range(len(aList)):
>             print(i, eqv)
>             for L in eqv:
>                 if equalFunc(aList[i], aList[L[0]]):
>                     L.append(i)
>                     break
>             else:
>                 eqv.append([i])
>         return eqv

Unless we can inspect the predicate function and derive a hash function such that hash(a) == hash(b) => predicate(a,b) is True. Then the partition can take linear time, i.e.,

    def equal(a, b):
        return a[-1] == b[-1]

    def hashFunc(obj):
        return hash(obj[-1])

    def parti(aList, hashFunc):
        eqv = {}
        for i, obj in enumerate(aList):
            eqv.setdefault(hashFunc(obj), []).append(i)
        return eqv.values()

In the case where the predicate is a "black box", would a logistic regression over a sample of inputs enable a hash function to be derived experimentally?

Michael

Jul 18 '05 #6
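A variation on the linear-time idea is to use the key value itself as the dictionary key rather than its hash. The sketch below is in Python 3 and assumes that two elements are equivalent exactly when `key_func` returns equal, hashable values for them; `parti_by_key` and `key_func` are hypothetical names. Storing the key rather than `hash(key)` means that two different keys which happen to share a hash value (in CPython, for instance, -1 and -2 do) still end up in separate classes.

```python
def parti_by_key(a_list, key_func):
    """Group the 0-based indices of a_list by key_func in one pass."""
    groups = {}
    for i, obj in enumerate(a_list):
        # The dict compares keys by equality, so colliding hashes are harmless.
        groups.setdefault(key_func(obj), []).append(i)
    # Since Python 3.7, dicts keep insertion order, so classes come out in
    # order of first appearance.
    return list(groups.values())

elements = [['x', 'x', 'x', c] for c in '12223455']
last_item = lambda row: row[-1]

print(parti_by_key(elements, last_item))
# [[0], [1, 2, 3], [4], [5], [6, 7]]

# Two different keys stay apart even if their hash values coincide.
print(parti_by_key([[-1], [-2]], last_item))
# [[0], [1]]
```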

David Eppstein writes:

> In article <11**********************@f14g2000cwb.googlegroups .com>, "Xah Lee" wrote:
>
> > parti(aList, equalFunc)
> >
> > given a list aList of n elements, we want to return a list that is a range of numbers from 1 to n, partition by the predicate function of equivalence equalFunc. (a predicate function is a function that takes two arguments, and returns either True or False.)
>
> In Python it is much more natural to use ranges from 0 to n-1.
>
> In the worst case, this is going to have to take quadratic time (consider an equalFunc that always returns false) so we might as well do something really simple rather than trying to be clever.

As you say, with the spec as it stands, you can't do better than quadratic time (although it's O(n*m) where m is the number of partitions, rather than O(n^2)).

You can do a lot better if you can use a "key" function, rather than an "equivalence" function, much as list.sort has a "key" argument, and itertools.groupby (which is pretty close in function to this partitioning problem) uses a key argument.

In fact, I'd have difficulty thinking of an example where I'd want a partition function as specified, in Python. In Perl, it makes a lot of sense, as Perl's array indexing operations lend themselves to slinging round lists of indices like this. But in Python, I'd be far more likely to use list.sort followed by itertools.groupby - sort is stable (so doesn't alter the relative order within equivalence classes), and groupby then picks out the equivalence classes:

    from itertools import groupby
    from operator import itemgetter
    from pprint import pprint

    elements = [['x', 'x', 'x', '1'],
                ['x', 'x', 'x', '2'],
                ['x', 'x', 'x', '2'],
                ['x', 'x', 'x', '2'],
                ['x', 'x', 'x', '3'],
                ['x', 'x', 'x', '4'],
                ['x', 'x', 'x', '5'],
                ['x', 'x', 'x', '5']]

    # No need to sort here, as the elements are already sorted!

    pprint([(k, list(v)) for k, v in groupby(elements, itemgetter(3))])

    [('1', [['x', 'x', 'x', '1']]),
     ('2', [['x', 'x', 'x', '2'], ['x', 'x', 'x', '2'], ['x', 'x', 'x', '2']]),
     ('3', [['x', 'x', 'x', '3']]),
     ('4', [['x', 'x', 'x', '4']]),
     ('5', [['x', 'x', 'x', '5'], ['x', 'x', 'x', '5']])]

If you avoid the sort, the whole thing is highly memory efficient, as well, because by using iterators, we don't ever take a copy of the original list.

Having cleverly redefined the question so that it fits the answer I wanted to give, I'll shut up now :-)

Paul.
--
To attain knowledge, add things every day; to attain wisdom, remove things every day. -- Lao-Tse

Jul 18 '05 #7
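The sort-then-groupby idea can also produce the 1-based index partition from the original specification, and then it does not depend on the input already being sorted. A minimal Python 3 sketch, assuming the key values can be ordered with `<`; `parti_sorted` is a hypothetical name, and it sorts a list of indices rather than the elements themselves.

```python
from itertools import groupby
from operator import itemgetter

def parti_sorted(a_list, key_func):
    """Return classes of 1-based indices, ordered by key value."""
    key_of_index = lambda i: key_func(a_list[i])
    # sorted() is stable, so indices stay in ascending order within a class.
    order = sorted(range(len(a_list)), key=key_of_index)
    # groupby only merges adjacent items, which the sort has just guaranteed.
    return [[i + 1 for i in group] for _, group in groupby(order, key=key_of_index)]

rows = [['x', '2'], ['x', '1'], ['x', '2']]
print(parti_sorted(rows, itemgetter(-1)))  # [[2], [1, 3]]
```

The classes come out in key order rather than in order of first appearance, which is the price of going through the sort.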

    # the following solution is submitted by
    # Sean Gugler and David Eppstein independently
    # 20050224.

    def parti(aList, equalFunc):
        result = []
        for i in range(len(aList)):
            for s in result:
                # s is a list of indices; compare against its first member
                if equalFunc( aList[i], aList[s[0]] ):
                    s.append(i)
                    break
            else:
                result.append( [i] )
        return [[x+1 for x in L] for L in result] # add 1 to all numbers

---------------

as for my original perl code, i realized it is written to work on a sorted input. Here it is and the translated Python code.

    # perl
    sub parti($$) {
        my @li = @{$_[0]};
        my $sameQ = $_[1];

        my @tray=(1);
        my @result;

        for (my $i=1; $i <= ((scalar @li)-1); $i++) {
            if (&$sameQ($li[$i-1], $li[$i])) {
                push @tray, $i+1
            } else {
                push @result, [@tray];
                @tray=($i+1);
            }
        }
        push @result, [@tray];
        return \@result;
    }

    # python
    def parti(li, sameQ):
        tray = [1]
        result = []

        # each element is compared only with the one just before it
        for i in range(1, len(li)):
            if sameQ(li[i-1], li[i]):
                tray.append(i+1)
            else:
                result.append(tray)
                tray = [i+1]
        result.append(tray)
        return result

http://xahlee.org/perl-python/gen_parti_by_equiv.html

Xah
xa*@xahlee.org
http://xahlee.org/PageTwo_dir/more.html

Jul 18 '05 #8
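The remark about sorted input matters because the run-based version only compares neighbours. A small Python 3 sketch of the difference follows; `parti_runs` is a hypothetical name for the adjacent-run translation, and the guard for an empty list is an added assumption that the posted code does not have.

```python
def parti_runs(li, same_q):
    """Group 1-based indices into runs of adjacent equivalent elements."""
    if not li:
        # Assumption: an empty input has no classes at all.
        return []
    tray, result = [1], []
    for i in range(1, len(li)):
        if same_q(li[i - 1], li[i]):
            tray.append(i + 1)
        else:
            # The neighbour differs, so the current run ends here.
            result.append(tray)
            tray = [i + 1]
    result.append(tray)
    return result

same_last = lambda a, b: a[-1] == b[-1]
sorted_rows = [['x', '1'], ['x', '2'], ['x', '2']]
unsorted_rows = [['x', '2'], ['x', '1'], ['x', '2']]

print(parti_runs(sorted_rows, same_last))    # [[1], [2, 3]]
print(parti_runs(unsorted_rows, same_last))  # [[1], [2], [3]]
```

On the unsorted list the first and third rows are equivalent but land in separate groups, which is why the general solution compares each element against every existing class.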

