histogram type thingy for (unique) dict items

kevin parks

hi. I've been banging my head against this one for a while and have asked
around, and i am throwing this one out there in the hopes that someone can
shed some light on what has turned out to be a tough problem for me (though
i am getting closer).

i have been mucking with a lot of data in a dictionary that looks
like:

events = { (2, 1, 0) : [8, 3, 5, 4],
           (2, 7, 0) : [4, 3, 2, 2],
           (2, 14, 0) : [8, 3, 5, 4],
           (2, 18, 0) : [10, 2, 8, 7],
           (2, 20, 0) : [10, 0, 5, 7],
           (2, 22, 0) : [10, 2, 8, 7],
           (2, 24, 0) : [7, 9, 3, 8],
           (2, 28, 0) : [10, 0, 5, 7],
           (2, 29, 0) : [10, 11],
           (2, 30, 0) : [8, 3, 5, 4],
           (2, 32, 0) : [5, 0, 10, 7],
           (2, 34, 0) : [8, 3, 7, 9],
           (2, 36, 0) : [5, 4, 3, 1],
           (2, 36, 1) : [5, 4, 3, 1, 7], # GNA
           (2, 37, 0) : [0, 8, 2, 4, 9, 10, 1],
           (2, 37, 1) : [0, 8, 2, 4, 9, 10, 1, 6], # GNA
           (2, 39, 0) : [8, 10, 1, 9],
           (2, 39, 1) : [8, 10, 1, 9, 7], # GNA
           (2, 41, 0) : [2, 0, 3, 1],
           (2, 41, 1) : [2, 0, 3, 1, 6], # GNA
           # ~~~~~~~~~~~~~~~~~~~~ page 3 ~~~~~~~~~~~~~~~~~~~~
           (3, 43, 0) : [3, 2, 4],
           (3, 44, 0) : [0, 8, 2, 4, 9, 10, 1],
           (3, 44, 1) : [0, 8, 2, 4, 9, 10, 1, 6] } # GNA

pages and pages of it, this is just a tiny slice...

The tuple (key) represents a point in time (page, line, event) and the
lists are my values for that time. It happens that many times have the same
data values [that is, the list items are the same]... what i want to do now
is take each _unique_ value list (there are only 120 or so of them, as my
unique.py function reports) and find out where each one occurs, so that i
would get a kind of histogram with the values and a list of all the places
each one occurs. A further wrinkle (the one i am really stuck on) is that i
want just one sorted value list, and not several different entries for what
is the same input set in a different order; as it stands, not all my
time/value pairs are accounted for properly. Ideally all the keys
(locations/times) that fit the same values would be grouped together, so
that the input of:

foo = { (5, 138, 1) : [ 0, 2, 7 ],
        (7, 264, 1) : [ 0, 2, 7 ],
        (9, 367, 0) : [ 0, 2, 7 ],
        (5, 156, 1) : [ 0, 7, 2 ],
        (8, 315, 1) : [ 0, 7, 2 ],
        (8, 317, 1) : [ 0, 7, 2 ] }

would give me the unique sorted value (0, 2, 7) with all the applicable
locations:

(0, 2, 7) --> [(7, 264, 1), (5, 138, 1), (5, 156, 1), (8, 315, 1), (8, 317, 1), (9, 367, 0)]

instead i get some locations listed for sets:
(0, 2, 7)
(0, 7, 2)
(2, 0, 7)
(2, 7, 0)
(7, 0, 2)

etc. though for my purposes these are the same and i want them all to
be together.
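
The requirement above comes down to picking one canonical form for every ordering of the same values. A minimal sketch in modern Python 3 (the helper name `canonical` is made up for illustration and is not part of the original post):

```python
from itertools import permutations

def canonical(values):
    """Return an order-independent, hashable key for a list of values."""
    # sorted() fixes the order; tuple() makes the result usable as a dict key.
    return tuple(sorted(values))

# Every ordering of [0, 2, 7] collapses to the same key.
keys = {canonical(p) for p in permutations([0, 2, 7])}
print(keys)  # {(0, 2, 7)}
```

Because lists are not hashable, the tuple conversion is what lets the canonical form serve as a dictionary key.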

i know that if it was a list on input i can do this:

# Now sort each individual set, but we don't want to have duplicate
# elements either so we call unique here too
for each_event2 in all_events2:
    ndupes = unique(each_event2)
    ndupes.sort()
    outlist.append(ndupes)
# find only the unique sets
unique_items = unique(outlist)
unique_item_count = 1

but i am not sure how to handle this in the dictionary scenario, since in
this case i am almost treating values as keys and vice versa, and need only
unique sorted items, so that all the times that show up for

(0, 2, 7)
(0, 7, 2)
(2, 0, 7)
(2, 7, 0)
(7, 0, 2) ...

are together and accounted for (ideally, i'd like to count the items too)

In other words i want a sort of histogram of each unique event telling me
at what keys they happen and how many of each unique event there are.
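
One possible way to build such a histogram is sketched below in modern Python 3, not the Python 2 used elsewhere in this thread. The function name `locations_by_event` is invented for this example, and the use of `set()` to drop repeated elements is an assumption standing in for the unique.py helper, which is not shown in the post.

```python
from collections import defaultdict

def locations_by_event(events):
    """Map each unique, sorted value set to the sorted list of keys where it occurs."""
    grouped = defaultdict(list)
    for location, values in events.items():
        # set() drops repeated elements, sorted() fixes the order,
        # tuple() makes the result hashable so it can be a dict key.
        grouped[tuple(sorted(set(values)))].append(location)
    return {event: sorted(locs) for event, locs in grouped.items()}

foo = {(5, 138, 1): [0, 2, 7],
       (7, 264, 1): [0, 2, 7],
       (9, 367, 0): [0, 2, 7],
       (5, 156, 1): [0, 7, 2],
       (8, 315, 1): [0, 7, 2],
       (8, 317, 1): [0, 7, 2]}

histogram = locations_by_event(foo)
for event, locs in sorted(histogram.items()):
    print(event, len(locs), locs)
# (0, 2, 7) 6 [(5, 138, 1), (5, 156, 1), (7, 264, 1), (8, 315, 1), (8, 317, 1), (9, 367, 0)]
```

This version leaves the lists in the input dictionary untouched, since `sorted()` returns a new list instead of sorting in place.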

i've searched http://aspn.activestate.com/ASPN/Mai...d/python-Tutor
and the cookbook and i don't see anything, except stuff like
http://aspn.activestate.com/ASPN/Pyt...k/Recipe/52306
and a google search turns up some crazy, incomprehensible, very advanced
bells-and-whistles graphics things that are a heck of a lot more than i can
wrap my pea-sized brain around and way more than i need.

So, if anyone has such a beast or wants to help me get started i would
be grateful. As you can see my code is getting more and more tangled
up:

def histo(input):
    events = input.copy()
    all_events = events.values()[:] # make new lists to be safe.
    outlist = []
    # Now sort each individual set, but we don't want to have duplicate
    # elements either so we call unique here too
    for each_event in all_events:
        ndupes = unique(each_event)
        ndupes.sort()
        outlist.append(ndupes)
    # find only the unique sets
    unique_items = unique(outlist)
    unique_item_count = 1
    line_num = 1
    print '=' * 62, '\n -+ ---------- +- sets & their locations',
    print '... (', len(unique_items), ') events :\n', '-' * 62
    print "line#\tset\tLocations:"
    n = {}
    for key in events.keys():
        try:
            n[tuple(events[key])].append(key)
        except KeyError:
            n[tuple(events[key])] = [key]
    nsortkeys = n.keys() # first make a copy of the keys
    nsortkeys.sort() # now sort that copy in place
    foobar = n.keys()
    print foobar
    foo2 = unique(foobar)
    for xx in foo2:
        print 'xx = ', xx
    for key, value in n.items():
        print line_num,"\t", str(key), "\t", str(value)
        line_num = line_num + 1
####

cheers,
kevin

Jul 18 '05 #1

John Hunter

>>>>> "kevin" == kevin parks <kp**@lycos.com> writes:

kevin> hi. I've been banging my head against this one a while and
kevin> have asked around, and i am throwing this one out there in
kevin> the hopes that some one can shed some light on what has
kevin> turned out to be a tough problem for me (though i am
kevin> getting closer).

kevin> i have been mucking with a lot of data in a dictionary that
kevin> looks like:

If I'm understanding you correctly, something like the following
should speed you on your way:

foo = { (5, 138, 1) : [ 0, 2, 7 ],
        (7, 264, 1) : [ 0, 2, 7 ],
        (9, 367, 0) : [ 0, 2, 7 ],
        (5, 156, 1) : [ 0, 7, 2 ],
        (8, 315, 1) : [ 0, 7, 2 ],
        (8, 317, 1) : [ 0, 7, 2 ],
        (1, 2, 3) : [ 4, 5, 6 ], }

# reverse the dict so that the values point to a list of keys which
# share that value, ignoring the order of the value
rd = {} # reverse dict
for key, val in foo.items():
    val.sort() # note: this sorts the lists stored in foo in place
    rd.setdefault(tuple(val),[]).append(key)

for key,val in rd.items():
    print key,val

# make a count dictionary
countd = {}
for key, val in rd.items():
    countd[key] = len(val)
    print key, countd[key]

You can easily sort the count dictionary - search google for "order
dictionary by values".
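
For reference, a minimal sketch of ordering a count dictionary by its values in modern Python 3. The contents of `countd` below are the counts the example data above would produce; the name `by_count` is introduced only for this illustration.

```python
# Counts per unique sorted event, as produced from the example data above.
countd = {(0, 2, 7): 6, (4, 5, 6): 1}

# sorted() over the (key, value) pairs, ordered by the value, largest first.
by_count = sorted(countd.items(), key=lambda item: item[1], reverse=True)

for event, count in by_count:
    print(count, event)
# 6 (0, 2, 7)
# 1 (4, 5, 6)
```

The result is a list of (event, count) pairs; the dictionary itself is left unchanged.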

Hope this helps,
JDH

Jul 18 '05 #2

Josiah Carlson

John hit it on the nose. That is how I would do it.

- Josiah

Jul 18 '05 #3
