histogram type thingy for (unique) dict items

hi. I've been banging my head against this one for a while and have
asked around, and i am throwing this one out there in the hopes that
someone can shed some light on what has turned out to be a tough
problem for me (though i am getting closer).

i have been mucking with a lot of data in a dictionary that looks
like:

events = { (2, 1, 0) : [8, 3, 5, 4],
           (2, 7, 0) : [4, 3, 2, 2],
           (2, 14, 0) : [8, 3, 5, 4],
           (2, 18, 0) : [10, 2, 8, 7],
           (2, 20, 0) : [10, 0, 5, 7],
           (2, 22, 0) : [10, 2, 8, 7],
           (2, 24, 0) : [7, 9, 3, 8],
           (2, 28, 0) : [10, 0, 5, 7],
           (2, 29, 0) : [10, 11],
           (2, 30, 0) : [8, 3, 5, 4],
           (2, 32, 0) : [5, 0, 10, 7],
           (2, 34, 0) : [8, 3, 7, 9],
           (2, 36, 0) : [5, 4, 3, 1],
           (2, 36, 1) : [5, 4, 3, 1, 7], # GNA
           (2, 37, 0) : [0, 8, 2, 4, 9, 10, 1],
           (2, 37, 1) : [0, 8, 2, 4, 9, 10, 1, 6], # GNA
           (2, 39, 0) : [8, 10, 1, 9],
           (2, 39, 1) : [8, 10, 1, 9, 7], # GNA
           (2, 41, 0) : [2, 0, 3, 1],
           (2, 41, 1) : [2, 0, 3, 1, 6], # GNA
           # ~~~~~~~~~~~~~~~~~~~~ page 3 ~~~~~~~~~~~~~~~~~~~~
           (3, 43, 0) : [3, 2, 4],
           (3, 44, 0) : [0, 8, 2, 4, 9, 10, 1],
           (3, 44, 1) : [0, 8, 2, 4, 9, 10, 1, 6] } # GNA

pages and pages of it, this is just a tiny slice...

The tuple (key) represents a point in time (page, line, event) and the
lists are my values for that time. It happens that many times have the
same data values [that is, the list items are the same]... what i want
to do now is take each _unique_ value list (there are only 120 or so
of them, as my unique.py function reports) and find out where it
occurs, so that i would get a kind of histogram with the values and a
list of all the places each one occurs. A further wrinkle (the one i
am really stuck on) is that i want just one sorted value list, and not
several different entries for what is the same input set in a
different order; as it stands, not all my time/value pairs are
accounted for properly. Ideally all the keys (locations/times) that
fit the same values would be grouped together, so that the input of:

foo = { (5, 138, 1) : [ 0, 2, 7 ],
        (7, 264, 1) : [ 0, 2, 7 ],
        (9, 367, 0) : [ 0, 2, 7 ],
        (5, 156, 1) : [ 0, 7, 2 ],
        (8, 315, 1) : [ 0, 7, 2 ],
        (8, 317, 1) : [ 0, 7, 2 ] }

would give me the unique sorted value (0, 2, 7) with all the
applicable locations:

(0, 2, 7) --> [(7, 264, 1), (5, 138, 1), (5, 156, 1), (8, 315, 1), (8, 317, 1), (9, 367, 0)]

instead i get some locations listed for sets:
(0, 2, 7)
(0, 7, 2)
(2, 0, 7)
(2, 7, 0)
(7, 0, 2)

etc. though for my purposes these are the same and i want them all to
be together.
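
For what it's worth, the orderings listed above all collapse to a single key once each list is reduced to a canonical form. A minimal sketch, assuming Python 3; canonical() is a hypothetical helper name, and using set() to drop repeated elements is only an assumption about what unique() does:

```python
def canonical(values):
    """Return an order-independent, hashable form of a value list."""
    # set() drops repeated elements (assumed to match unique());
    # sorted() fixes the order; tuple() makes the result usable as a dict key.
    return tuple(sorted(set(values)))


orderings = [[0, 2, 7], [0, 7, 2], [2, 0, 7], [2, 7, 0], [7, 0, 2]]
keys = {canonical(v) for v in orderings}
print(keys)  # {(0, 2, 7)}
```

A tuple is used rather than a list because lists cannot be dictionary keys.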

i know that if it was a list on input i can do this:

# Now sort each individual set, but we don't want to have duplicate
# elements either so we call unique here too
for each_event2 in all_events2:
    ndupes = unique(each_event2)
    ndupes.sort()
    outlist.append(ndupes)
# find only the unique sets
unique_items = unique(outlist)
unique_item_count = 1

but i am not sure how to handle this in the dictionary scenario, since
in this case i am almost treating values as keys and vice versa, and
need only the unique sorted items, so that all the times that show up
for

(0, 2, 7)
(0, 7, 2)
(2, 0, 7)
(2, 7, 0)
(7, 0, 2) ...

are together and accounted for (ideally, i'd like to count the items
too).

In other words i want a sort of histogram of each unique event,
telling me at what keys it happens and how many of each unique event
there are.

i've searched
http://aspn.activestate.com/ASPN/Mai...d/python-Tutor
and the cookbook and i don't see anything, except stuff like
http://aspn.activestate.com/ASPN/Pyt...k/Recipe/52306
and a google search turns up some crazy, incomprehensible, very
advanced bells-and-whistles graphics things that are a heck of a lot
more than i can wrap my pea-sized brain around and way more than i
need.

So, if anyone has such a beast or wants to help me get started i would
be grateful. As you can see my code is getting more and more tangled
up:

def histo(input):
    events = input.copy()
    all_events = events.values()[:] # make new lists to be safe.
    outlist = []
    # Now sort each individual set, but we don't want to have duplicate
    # elements either so we call unique here too
    for each_event in all_events:
        ndupes = unique(each_event)
        ndupes.sort()
        outlist.append(ndupes)
    # find only the unique sets
    unique_items = unique(outlist)
    unique_item_count = 1
    line_num = 1
    print '=' * 62, '\n -+ ---------- +- sets & their locations',
    print '... (', len(unique_items), ') events :\n', '-' * 62
    print "line#\tset\tLocations:"
    n = {}
    for key in events.keys():
        try:
            n[tuple(events[key])].append(key)
        except KeyError:
            n[tuple(events[key])] = [key]
    nsortkeys = n.keys() # first make a copy of the keys
    nsortkeys.sort() # now sort that copy in place
    foobar = n.keys()
    print foobar
    foo2 = unique(foobar)
    for xx in foo2:
        print 'xx = ', xx
    for key, value in n.items():
        print line_num,"\t", str(key), "\t", str(value)
        line_num = line_num + 1
    ####

cheers,
kevin
Jul 18 '05 #1
>>>>> "kevin" == kevin parks <kp**@lycos.com> writes:

kevin> hi. I've been banging my head against this one a while and
kevin> have asked around, and i am throwing this one out there in
kevin> the hopes that some one can shed some light on what has
kevin> turned out to be a tough problem for me (though i am
kevin> getting closer).

kevin> i have been mucking with a lot of data in a dictionary that
kevin> looks like:

If I'm understanding you correctly, something like the following
should speed you on your way:

foo = { (5, 138, 1) : [ 0, 2, 7 ],
        (7, 264, 1) : [ 0, 2, 7 ],
        (9, 367, 0) : [ 0, 2, 7 ],
        (5, 156, 1) : [ 0, 7, 2 ],
        (8, 315, 1) : [ 0, 7, 2 ],
        (8, 317, 1) : [ 0, 7, 2 ],
        (1,2,3) : [4,5,6],}

# reverse the dict so that the values point to a list of keys which
# share that value, ignoring the order of the value
rd = {} # reverse dict
for key, val in foo.items():
    val.sort() # note: this sorts the lists stored in foo in place
    rd.setdefault(tuple(val),[]).append(key)

for key,val in rd.items():
    print key,val

# make a count dictionary
countd = {}
for key, val in rd.items():
    countd[key] = len(val)
    print key, countd[key]

You can easily sort the count dictionary - search google for order
dictionary by values.
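
A minimal sketch of that last step, assuming Python 3 (the code above is Python 2). The names group_by_value and by_count are hypothetical, and sorted() is swapped in for val.sort() so that the lists in foo are left as they were:

```python
from collections import defaultdict

foo = {
    (5, 138, 1): [0, 2, 7],
    (7, 264, 1): [0, 2, 7],
    (9, 367, 0): [0, 2, 7],
    (5, 156, 1): [0, 7, 2],
    (8, 315, 1): [0, 7, 2],
    (8, 317, 1): [0, 7, 2],
    (1, 2, 3): [4, 5, 6],
}


def group_by_value(events):
    """Map each sorted value tuple to the sorted list of keys where it occurs."""
    grouped = defaultdict(list)
    for key, val in events.items():
        # sorted() returns a new list, so the lists in `events` are untouched
        grouped[tuple(sorted(val))].append(key)
    return {vals: sorted(keys) for vals, keys in grouped.items()}


rd = group_by_value(foo)

# Most frequent value set first; ties are broken by the value tuple itself.
by_count = sorted(rd.items(), key=lambda item: (-len(item[1]), item[0]))

for vals, keys in by_count:
    print(len(keys), vals, keys)
```

Sorting the (value, keys) pairs on the negated length of the key list puts the most common value set first, which is one way to read the result as a histogram.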

Hope this helps,
JDH

Jul 18 '05 #2
John hit it on the nose. That is how I would do it.

- Josiah
Jul 18 '05 #3
