histogram type thingy for (unique) dict items

hi. I've been banging my head against this one for a while and have asked
around, and i am throwing this one out there in the hopes that someone can
shed some light on what has turned out to be a tough problem for me (though
i am getting closer).

i have been mucking with a lot of data in a dictionary that looks
like:

events = { (2, 1, 0) : [8, 3, 5, 4],
           (2, 7, 0) : [4, 3, 2, 2],
           (2, 14, 0) : [8, 3, 5, 4],
           (2, 18, 0) : [10, 2, 8, 7],
           (2, 20, 0) : [10, 0, 5, 7],
           (2, 22, 0) : [10, 2, 8, 7],
           (2, 24, 0) : [7, 9, 3, 8],
           (2, 28, 0) : [10, 0, 5, 7],
           (2, 29, 0) : [10, 11],
           (2, 30, 0) : [8, 3, 5, 4],
           (2, 32, 0) : [5, 0, 10, 7],
           (2, 34, 0) : [8, 3, 7, 9],
           (2, 36, 0) : [5, 4, 3, 1],
           (2, 36, 1) : [5, 4, 3, 1, 7], # GNA
           (2, 37, 0) : [0, 8, 2, 4, 9, 10, 1],
           (2, 37, 1) : [0, 8, 2, 4, 9, 10, 1, 6], # GNA
           (2, 39, 0) : [8, 10, 1, 9],
           (2, 39, 1) : [8, 10, 1, 9, 7], # GNA
           (2, 41, 0) : [2, 0, 3, 1],
           (2, 41, 1) : [2, 0, 3, 1, 6], # GNA
           # ~~~~~~~~~~~~~~~~~~~~ page 3 ~~~~~~~~~~~~~~~~~~~~
           (3, 43, 0) : [3, 2, 4],
           (3, 44, 0) : [0, 8, 2, 4, 9, 10, 1],
           (3, 44, 1) : [0, 8, 2, 4, 9, 10, 1, 6] } # GNA

pages and pages of it, this is just a tiny slice...

The tuple (key) represents a point in time (page, line, event) and the
lists are my values for that time. It happens that many times have the
same data values [that is, the list items are the same]. What i want to do
now is take each _unique_ value list (there are only 120 or so of them, as
my unique.py function reports) and find out where it occurs, so that i get
a kind of histogram with the values and a list of all the places each one
occurs. A further wrinkle (the one i am really stuck on) is that i want
just one sorted value list per set of values, not several different
entries for what is the same input set in a different order; as things
stand, my time/value pairs are not all accounted for properly. Ideally all
the keys (locations/times) that fit the same values would be grouped
together, so that the input of:

foo = { (5, 138, 1) : [ 0, 2, 7 ],
        (7, 264, 1) : [ 0, 2, 7 ],
        (9, 367, 0) : [ 0, 2, 7 ],
        (5, 156, 1) : [ 0, 7, 2 ],
        (8, 315, 1) : [ 0, 7, 2 ],
        (8, 317, 1) : [ 0, 7, 2 ] }

would give me the unique sorted value (0, 2, 7) with all the applicable
locations:

(0, 2, 7) --> [(7, 264, 1), (5, 138, 1), (5, 156, 1), (8, 315, 1), (8, 317, 1), (9, 367, 0)]

instead i get some locations listed for sets:
(0, 2, 7)
(0, 7, 2)
(2, 0, 7)
(2, 7, 0)
(7, 0, 2)

etc. though for my purposes these are the same and i want them all to
be together.

i know that if it was a list on input i can do this:

# Now sort each individual set, but we don't want to have duplicate
# elements either, so we call unique here too
for each_event2 in all_events2:
    ndupes = unique(each_event2)
    ndupes.sort()
    outlist.append(ndupes)
# find only the unique sets
unique_items = unique(outlist)
unique_item_count = 1

but i am not sure how to handle this in the dictionary scenario, since in
this case i am almost treating values as keys and vice versa, and need
only unique sorted items, so that all the times that show up for

(0, 2, 7)
(0, 7, 2)
(2, 0, 7)
(2, 7, 0)
(7, 0, 2) ...

are together and accounted for (ideally, i'd like to count the items
too).

In other words i want a sort of histogram of each unique event telling me
at what keys it happens and how many of each unique event there are.

i've searched http://aspn.activestate.com/ASPN/Mai...d/python-Tutor
and the cookbook and i don't see anything, except stuff like
http://aspn.activestate.com/ASPN/Pyt...k/Recipe/52306
and a google search turns up some crazy, incomprehensible, very advanced
bells-and-whistles graphics things that are a heck of a lot more than i
can wrap my pea-sized brain around and way more than i need.

So, if anyone has such a beast or wants to help me get started i would
be grateful. As you can see my code is getting more and more tangled
up:
def histo(input):
    events = input.copy()
    all_events = events.values()[:] # make new lists to be safe.
    outlist = []
    # Now sort each individual set, but we don't want to have duplicate
    # elements either so we call unique here too
    for each_event in all_events:
        ndupes = unique(each_event)
        ndupes.sort()
        outlist.append(ndupes)
    # find only the unique sets
    unique_items = unique(outlist)
    unique_item_count = 1
    line_num = 1
    print '=' * 62, '\n -+ ---------- +- sets & their locations',
    print '... (', len(unique_items), ') events :\n', '-' * 62
    print "line#\tset\tLocations:"
    n = {}
    for key in events.keys():
        try:
            n[tuple(events[key])].append(key)
        except KeyError:
            n[tuple(events[key])] = [key]
    nsortkeys = n.keys() # first make a copy of the keys
    nsortkeys.sort() # now sort that copy in place
    foobar = n.keys()
    print foobar
    foo2 = unique(foobar)
    for xx in foo2:
        print 'xx = ', xx
    for key, value in n.items():
        print line_num,"\t", str(key), "\t", str(value)
        line_num = line_num + 1
    ####

cheers,
kevin
Jul 18 '05 #1
>>>>> "kevin" == kevin parks <kp**@lycos.com> writes:

kevin> hi. I've been banging my head against this one a while and
kevin> have asked around, and i am throwing this one out there in
kevin> the hopes that some one can shed some light on what has
kevin> turned out to be a tough problem for me (though i am
kevin> getting closer).

kevin> i have been mucking with a lot of data in a dictionary that
kevin> looks like:

If I'm understanding you correctly, something like the following
should speed you on your way:

foo = { (5, 138, 1) : [ 0, 2, 7 ],
        (7, 264, 1) : [ 0, 2, 7 ],
        (9, 367, 0) : [ 0, 2, 7 ],
        (5, 156, 1) : [ 0, 7, 2 ],
        (8, 315, 1) : [ 0, 7, 2 ],
        (8, 317, 1) : [ 0, 7, 2 ],
        (1,2,3) : [4,5,6],}

# reverse the dict so that the values point to a list of keys which
# share that value, ignoring the order of the value
rd = {} # reverse dict
for key, val in foo.items():
    val.sort() # note: this sorts the lists stored in foo in place
    rd.setdefault(tuple(val),[]).append(key)

for key,val in rd.items():
    print key,val

# make a count dictionary
countd = {}
for key, val in rd.items():
    countd[key] = len(val)
    print key, countd[key]

You can easily sort the count dictionary - search google for order
dictionary by values.
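
For what it's worth, here is a rough sketch of that last step, ordering the unique events by how often they occur. It is written in Python 3 syntax rather than the Python 2 used above, and the contents of rd here are made-up sample data, not the real events; the names by_count, event and locations are just illustrative.

```python
# Hypothetical sample data: unique sorted event -> list of locations
rd = {
    (0, 2, 7): [(5, 138, 1), (7, 264, 1), (9, 367, 0), (5, 156, 1)],
    (4, 5, 6): [(1, 2, 3)],
    (1, 3, 4, 5): [(2, 36, 0), (3, 50, 0)],
}

# Count how many locations each unique event has
countd = {event: len(locations) for event, locations in rd.items()}

# Sort the (event, count) pairs by count, most frequent first;
# ties are broken by the event tuple itself so the order is stable
by_count = sorted(countd.items(), key=lambda item: (-item[1], item[0]))

for event, count in by_count:
    print(event, count, rd[event])
```

Sorting the items() pairs with a key function avoids building a second "inverted" dictionary, which would lose events that happen to share the same count.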

Hope this helps,
JDH

Jul 18 '05 #2
John hit it on the nose. That is how I would do it.

- Josiah
Jul 18 '05 #3
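
For readers on a current Python, the same reverse-dictionary idea might look roughly like the sketch below. It is only one possible variant under a few assumptions: it uses Python 3 and collections.defaultdict instead of setdefault, it uses set() as a stand-in for kevin's unique() helper (whose code is not shown in the thread) to drop repeated elements inside an event, and it builds the sorted key with sorted() so the lists in the input dictionary are left untouched. The name event_histogram is made up for illustration.

```python
from collections import defaultdict

def event_histogram(events):
    """Map each unique event (sorted, de-duplicated tuple of values)
    to the sorted list of keys (locations) where it occurs."""
    locations = defaultdict(list)
    for where, values in events.items():
        # sorted() returns a new list, so `values` is not modified;
        # set() stands in for the thread's unique() helper (an assumption)
        canonical = tuple(sorted(set(values)))
        locations[canonical].append(where)
    return {event: sorted(keys) for event, keys in locations.items()}

foo = {(5, 138, 1): [0, 2, 7],
       (7, 264, 1): [0, 2, 7],
       (9, 367, 0): [0, 2, 7],
       (5, 156, 1): [0, 7, 2],
       (8, 315, 1): [0, 7, 2],
       (8, 317, 1): [0, 7, 2],
       (1, 2, 3): [4, 5, 6]}

hist = event_histogram(foo)
for event in sorted(hist):
    # event, how many times it occurs, and where
    print(event, len(hist[event]), hist[event])
```

If repeated elements inside an event should count as distinct, dropping the set() call leaves the grouping otherwise the same.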
