Introduction
This week I'll write a bit about generics (those funny angular brackets). I need
an example and decided to use sets and some of their operations. This weeks'
article discusses one set class and two interesting operations on the set:
combinations and set partitioning. First a description of the algorithms is
given and then we'll have some fun with the generic implementation of them.
Combinations
For a given set S = [x, y, z] there are eight sets representing all possible
combinations of those three distinct elements:
- []
- [x]
- [y]
- [z]
- [x, y]
- [x, z]
- [y, z]
- [x, y, z]
Given that set S, how can we find all these combinations? Here's how: suppose
we can find all combinations of set S'= [y, z] which are: [], [y], [z] and [y, z]
Of course these subsets are combinations of the original set S as well, but
there's no 'x' to be found in them (of course not, because I left it out).
Simply add that element 'x' to all the subsets that make up the combinations
of set [y, z]; This is the result:
[x], [x, y], [x, z] and [x, y, z]
Together with the other four subsets we have constructed all eight combinations
of set S= [x, y, z]
But how did we find all combinations of set S'= [y, z]? We apply a similar
recursive reasoning here: suppose we can find all combinations of set S''= [z];
they are [] and [c]; they are part of all combinations of set S'= [y, z] and
we add element 'y' to it: [y], [y, z].
This may seem futile but here's how we find all combinations of set S''= [z]:
we find all combinations of the empty set, which is the empty set itself: [],
and we add element [z] to all the combinations (just one), which makes [z].
For a set with n elements there are 2^n combinations. This can easily be seen:
the empty set (zero elements) has one combination: the empty set itself. If
we add an element to the set we end up with twice the number of combinations:
the original combinations plus the original combinations with that new element
added to every combination. A set of one element has two combinations, a set of
two elements has four combinations etc. etc.
The recursive algorithm described above generates them all.
Partitions
A partition of a set S are one or more disjunct non-empty subsets and the union
of those sets form the set S again. For the set S= [x, y, z] the partitions are:
- [x, y, z]
- [x], [y, z]
- [x, y], [z]
- [x, z], [y]
- [x], [y], [z]
How do we find all partitions of a set? We apply a similar recursive reasoning:
suppose we can find all partitions of set S'= [y, z] which is:
- [y, z]
- [y], [z]
First we add the set [x] to every partition of set S':
- [x], [y, z]
- [x], [y], [z]
Next we add element 'x' to every set in the partitions of S', one by one:
- [x, y, z]
- [x, y], [z]
- [y], [x, z]
We end up with the same five partitions again. Try to find all partitions of
set S'= [y, z] given the partitions of set S''= [z]. The partitions of a single
element set is the single element set itself, so all partitions of set S'' is
the set S''= [z] itself.
By definition all partitions of an empty set is the empty set itself.
Sets
As you might have noticed by now this article is about sets. The Java programming
language has Set implementations available. I'm going to use the HashSet
implementation for the example in this article. I am going to change the behaviour
of the implementation a bit though: I want 'COW' sets. 'COW' is an acronym for
'Copy On Write'. COW implies that the set copies itself when it needs to be
altered; the copy is going to be altered instead. Let's get our hands dirty.
Here are the constructors of our (generic) set:
Expand|Select|Wrap|Line Numbers
- public class S<T> extends HashSet<T> {
- private static final long serialVersionUID = -8875179328198507416L;
- public S(Collection<T> collection) { super(collection); }
- public S(T arg) { this.add(arg); }
- public S(T ... args) { super(Arrays.asList(args)); }
- ...
constructors: a form of a copy constructor taking a Collection<T> as its
argument; a constructor that takes a single element T as its argument and
a constructor that takes a (possibly empty) list of T arguements.
The last constructor allows us to write new S<T>() i.e. no elements at all
which results in the empty set.
I included the second constructor again to keep Eclipse's mouth shut. I need
sets of sets as in:
Expand|Select|Wrap|Line Numbers
- S<S<T>> set= new S<S<T>>(new S<T>());
deduce that given the last constructor a generic array of type S<T> needs to be
constructed to put that single argument in. A type S<T> is not a type T so
Eclipse warns about its own creativity. If I hadn't included that second
constructor everything would still be fine but I just don't like warnings.
The last constructor simply turns the varargs array to a List (which is a
collection too) so a superclass constructor can handle it all.
The Set interface defines 'add()' and 'addAll()' methods that return true or false
indicating whether or not the element(s) could be added. The same applies to the
'remove' and 'removeAll' methods. I don't want that behaviour, I want methods
that do the same but return the new set instead. Here goes:
Expand|Select|Wrap|Line Numbers
- private S<T> unionHere(T arg) {
- add(arg);
- return this;
- }
- public S<T> union(T arg) {
- return new S<T>(this).unionHere(arg);
- }
- public S<T> union(Collection<T> collection) {
- S<T> copy= new S<T>(this);
- copy.addAll(collection);
- return copy;
- }
- public S<T> difference(T arg) {
- S<T> copy= new S<T>(this);
- copy.remove(arg);
- return copy;
- }
- public S<T> difference(Collection<T> collection) {
- S<T> copy= new S<T>(this);
- copy.removeAll(collection);
- return copy;
- }
the set itself; if they want to do that they can fiddle with the superclass
methods that change the set itself.
The public methods above simply copy the set and apply the wanted operation on
the copy instead and return the copied set instead of returning a boolean value.
Note that all methods are generic: they don't care about what type T actually is.
So far so good, but I haven't done much yet. Here are the other set operations:
intersection, difference and symmetric difference. First a few examples of the
operations; A= [x, y, z], B=[y, z, t]
- intersection: [y, z]
- difference: [x]
- symmetric difference: [x, t]
The intersection of A and B are the elements that occur in both A and B. The
difference of A and B are those elements that occur in A but not in B. The
symmetric difference of A and B are those elements in A or B bot not in both.
The symmetric difference plus the intersection of two sets is the union of those
two sets. The symmetric difference of sets A and B is the union of the difference
of A and B and the difference of B and A.
Here's the source code:
Expand|Select|Wrap|Line Numbers
- public S<T> difference(Collection<T> collection) {
- S<T> copy= new S<T>(this);
- copy.removeAll(collection);
- return copy;
- }
- public S<T> symDifference(Collection<T> collection) {
- return difference(collection).union(new S<T>(collection).difference(this));
- }
- public S<T> intersection(Collection<T> collection) {
- S<T> copy= new S<T>(this);
- copy.retainAll(collection);
- return copy;
- }
copy of the set and apply the operation on the copy of the set.
Combinations
The implementation of the 'combinations' method closely follows the description
in the paragraph above. The method returns a set of sets. The sets being the
individual combinations of the original set.
First the method constructs a set that contains the empty set. If the original
set isn't empty it walks over each element, removes it and generates all
combinations of the reduced set. It copies the combinations to a 'result' set
and copies them again after it had added that current element to the sets again.
Here's the code:
Expand|Select|Wrap|Line Numbers
- public S<S<T>> combinations() {
- S<S<T>> result= new S<S<T>>(new S<T>());
- for (T current : this) {
- S<S<T>> comb= difference(current).combinations();
- result.addAll(comb);
- for (S<T> set : comb)
- result.unionHere(set.union(current));
- }
- return result;
- }
of a true/false value. Not the generic type S<S<T>>, i.e. a set or sets of type T.
The method itself is defined in a generic definition of class S. The generic
type is T, so a (partial) specialization of S<S<T>> is valid too. In other
words: if, say, a definition of the union method for a type T is valid, then
automatically it is valid for a generic type S<T>. The above code makes use of
Java's generic behaviour.
I'm going to (ab)use that feature even more for the next algorithm implementation:
Partitions
This method uses the same implementation strategy: it recursively invokes itself
with a smaller set and uses the return value to collect the partitions of the
current set in an set of partitions. Here it is:
Expand|Select|Wrap|Line Numbers
- public S<S<S<T>>> partitions() {
- S<S<S<T>>> result= new S<S<S<T>>>();
- if (size() < 2)
- return result.unionHere(new S<S<T>>(this));
- T current= iterator().next();
- S<S<S<T>>> part= difference(current).partitions();
- for (S<S<T>> set : part) {
- result.unionHere(set.union(new S<T>(current)));
- for (S<T> elem : set) {
- S<S<T>> copy= set.difference(elem);
- result.unionHere(copy.unionHere(elem.union(current)));
- }
- }
- return result;
- }
sets; the last sets contain elements of type T.
For sets containing at most one element a set containing a set that contains
that (almost) empty set is returned. Otherwise the reasoning as explained in
the paragraph above is applied and the filled 'result' set is returned.
As you can see, in a generic definition of a method for type <T> a type S<T>,
S<S<T>> and even S<S<S<T>>> are defined automatically because an S<T>, S<S<T>>
and an S<S<S<T>>> are pure generic types themselves as well.
This makes it possible to find intersections, combinations or whatever not just
for sets of, say, Strings, i.e. S<String> but also for sets of sets of Strings,
i.e. S<S<String>> or whatever elements in a set you want.
Examples
After all this programming I'd like to see some results; I tested the above
methods like this:
Expand|Select|Wrap|Line Numbers
- public static void main(String[] args) {
- S<String> s= new S<String>("x", "y", "z");
- S<String> t= new S<String>("y", "z", "t");
- System.out.println("union : "+s.union(t));
- System.out.println("intersection : "+s.intersection(t));
- System.out.println("difference : "+s.difference(t));
- System.out.println("symDifference: "+s.symDifference(t));
- System.out.println("combinations : "+s.combinations());
- System.out.println("partitions : "+s.partitions());
- }
Expand|Select|Wrap|Line Numbers
- union : [t, z, y, x]
- intersection : [z, y]
- difference : [x]
- symDifference: [t, x]
- combinations : [[], [z, x], [z, y], [z], [y], [z, y, x], [x], [y, x]]
- partitions : [[[z, x], [y]], [[z, y], [x]], [[z], [y], [x]], [[z, y, x]], [[z], [y, x]]]
The chapter described a bit of set juggling and fiddling with the generics
feature of Java. I'm sure you'll have to read this little article a couple of
times to fully appreciate the generic feature and the recursive implementation
of both operations.
I hope to see you next week and,
kind regards,
Jos