# Set operations on object attributes question

Hi,

I have run into something I would like to do, but am not sure how to
code it up. I would like to perform 'set-like' operations (union,
intersection, etc) on a set of objects, but have the set operations
based on an attribute of the object, rather than the whole object.

For instance, say I have (pseudo-code):

LoTuples1 = [(1,1,0),(1,2,1),(1,3,3)]
Set1=set(LoTuples1)
LoTuples2 = [(2,1,3),(2,2,4),(2,3,2)]
Set2=set(LoTuples2)

What I would like to be able to do is:

Set3 = Set1.union(Set2)
Set3.intersection(Set2, <use object[2]>)

to return:
set([(2,1,3), (1,3,3)])

How can one do this operation?

Thanks,
Duane

Oct 23 '07 #1
Put your data in a class, and implement __hash__ and __eq__.
Finally, put your objects in sets.
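
A minimal sketch of that suggestion, assuming the third tuple field is the attribute to compare on. The class name `Keyed` and its field names are made up for illustration. Note that with this approach two objects sharing a key count as the same set member, so a set operation keeps only one of them:

```python
class Keyed:
    """Hypothetical wrapper: equality and hashing look only at `key`."""

    def __init__(self, a, b, key):
        self.a = a
        self.b = b
        self.key = key

    def __eq__(self, other):
        if not isinstance(other, Keyed):
            return NotImplemented
        return self.key == other.key

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.key)

    def __repr__(self):
        return "Keyed(%r, %r, %r)" % (self.a, self.b, self.key)


set1 = set(Keyed(*t) for t in [(1, 1, 0), (1, 2, 1), (1, 3, 3)])
set2 = set(Keyed(*t) for t in [(2, 1, 3), (2, 2, 4), (2, 3, 2)])

# Only one object with key 3 survives the intersection.
common = set1 & set2
print(len(common), [obj.key for obj in common])
```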

Albert

Oct 23 '07 #2
Conceptually, there is more than one operation going on. First,
finding the attributes shared in both sets:
ca = set(t[2] for t in LoTuples1) & set(t[2] for t in LoTuples2)
which gives:
set([3])

Second, find any tuple which has that attribute (including multiple
results for the same attribute):
set(t for t in (LoTuples1 + LoTuples2) if t[2] in ca)
which returns:
set([(2, 1, 3), (1, 3, 3)])

Wanting multiple results for the same attribute value (i.e. both
(2,1,3) and (1,3,3) have 3 at index 2) is why multiple
steps are needed; otherwise, the behavior of intersection() is to
return a single representative of the equivalence class.
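
The two steps can be wrapped in one helper. This is only a sketch: the function name `shared_by_key` and its `key` parameter are assumptions for illustration (modeled on the key= argument to sorted()), not an existing set method.

```python
def shared_by_key(seq1, seq2, key):
    """Return every item from either input whose key occurs in both inputs."""
    # Step 1: key values common to both inputs.
    common = set(key(t) for t in seq1) & set(key(t) for t in seq2)
    # Step 2: every item, from either input, that carries one of those keys.
    return set(t for t in list(seq1) + list(seq2) if key(t) in common)


LoTuples1 = [(1, 1, 0), (1, 2, 1), (1, 3, 3)]
LoTuples2 = [(2, 1, 3), (2, 2, 4), (2, 3, 2)]

result = shared_by_key(LoTuples1, LoTuples2, key=lambda t: t[2])
print(sorted(result))  # [(1, 3, 3), (2, 1, 3)]
```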
Raymond

Oct 23 '07 #3
Hi,

Thanks for the response! (See below for more discussion)

> Conceptually, there is more than one operation going on. First,
> finding the attributes shared in both sets:
> ca = set(t[2] for t in LoTuples1) & set(t[2] for t in LoTuples2)
> which gives:
> set([3])

In my use case, I already know object[2] is the key I wish to use, so
I could do this without the first step.
I am thinking of object[n] as the 'key' I wish to operate on, and the
other items in the tuple as useful attributes I'll need in the end.

> Second, find any tuple which has that attribute (including multiple
> results for the same attribute):
> set(t for t in (LoTuples1 + LoTuples2) if t[2] in ca)
> which returns:
> set([(2, 1, 3), (1, 3, 3)])
>
> Wanting multiple results for the same attribute value (i.e. both
> (2,1,3) and (1,3,3) have 3 at index 2) is why multiple
> steps are needed; otherwise, the behavior of intersection() is to
> return a single representative of the equivalence class.

This second operation is really much like what I came up with before I
started looking into exploiting the power of sets
(this same operation can be done strictly with lists, right?).

Since what I _really_ wanted from this was the intersection of the
objects (based on attribute 2), I was looking for a set-based
solution, kinda like a decorate - <set operation> - undecorate
pattern. Perhaps the problem does not fall into that category.

Thanks for your help,
Duane

Oct 23 '07 #4
> Since what I _really_ wanted from this was the intersection of the
> objects (based on attribute 2), I was looking for a set-based
> solution, kinda like a decorate - <set operation> - undecorate
> pattern. Perhaps the problem does not fall into that category.

The "kinda" part is where the idea falls down. If you've decorated
the inputs with a key function (like the key= argument to sorted()),
then the intersection step will return only a single element result,
not both matches as you specified in your original request. IOW,
you cannot get set([(2, 1, 3), (1, 3, 3)]) as a result if both
set members are to be treated as equal.

It would be rather like writing set([1]).intersection(set([1.0]))
and expecting to get set([1, 1.0]) as a result. Does your
intuition support having an intersection return a set larger
than either of the two inputs?
Raymond

Oct 23 '07 #5
Decorating and undecorating do not apply to your example. If all the
input were decorated with new equality and hash methods, then the
intersection step would return just a single element, not the two that
you specified.

Compare this with ints and floats, which already have cross-type
equality and hash functions. Running set([1]).intersection([1.0]),
would you expect set([1.0]) or set([1, 1.0])? Your example specified
the latter behavior, which has the unexpected result that the
set intersection returns more elements than exist in either input.
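
The int/float comparison is easy to check directly. A small sketch follows; it only checks the size of the result, since which of the two equal values ends up in the result is an implementation detail that should not be relied on:

```python
ints = set([1])
floats = set([1.0])

# 1 and 1.0 compare equal and hash alike, so a set treats them as one member.
same_member = (1 == 1.0) and (hash(1) == hash(1.0))

result = ints.intersection(floats)

# The intersection holds a single representative, never both values.
print(same_member, len(result))  # True 1
```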
Raymond

Oct 23 '07 #6

