Complex Nested Dictionaries

To list,

I'm trying to figure out the best approach to the following problem:

I have four variables:
1) headlines
2) times
3) states
4) zones

At this time, I'm thinking of creating a dictionary, headlinesDB, that
stores different headlines and their associated time(s), state(s), and
zone(s). The complexity is that each headline can have one or more times,
one or more states, and one or more zones. However, there can only be 1
zone per time, and 1 zone per state. What is the best way to tackle this
particular problem?

Here's an example of the complexity:

Let's say we have a "High Wind Warning" for our headline or hazard. In
addition, there are currently two "High Wind Warnings" in effect. The first
goes from Tonight through Friday morning (i.e., I'll probably store the
begin/end times in seconds from 1/1/1970). It affects three counties all in
the state of Oregon: ORZ047, ORZ048, and ORZ049. The second High Wind
Warning is in effect from Friday at Noon through Friday evening. It affects
two counties in two separate states: ORZ044 in Oregon and WAZ028 in
Washington. Here's the flow chart:

High Wind Warning --> time1 --> state1 --> zone1, zone2, zone3
|
--> time2 --> state1 --> zone4
--> state2 --> zone5

Keep in mind, each headline or hazard can have multiple times. Each time
will have one or more states with each state containing one or more zones.
Is there a better way than a dictionary? As mentioned above, the headline
or hazard is the key I'll be extracting all the information from.

Thanks in advance,

Tom
Jul 18 '05 #1
T. Earle wrote:
To list,

I'm trying to figure out the best approach to the following problem:

I have four variables:
1) headlines
2) times
3) states
4) zones

At this time, I'm thinking of creating a dictionary, headlinesDB, that
stores different headlines and their associated time(s), state(s), and
zone(s). The complexity is that each headline can have one or more times,
one or more states, and one or more zones. However, there can only be 1
zone per time, and 1 zone per state. What is the best way to tackle this
particular problem?

Here's an example of the complexity:

Let's say we have a "High Wind Warning" for our headline or hazard. In
addition, there are currently two "High Wind Warnings" in effect. The first
goes from Tonight through Friday morning (i.e., I'll probably store the
begin/end times in seconds from 1/1/1970). It affects three counties all in
the state of Oregon: ORZ047, ORZ048, and ORZ049. The second High Wind
Warning is in effect from Friday at Noon through Friday evening. It affects
two counties in two separate states: ORZ044 in Oregon and WAZ028 in
Washington. Here's the flow chart:

High Wind Warning --> time1 --> state1 --> zone1, zone2, zone3
|
--> time2 --> state1 --> zone4
--> state2 --> zone5

Keep in mind, each headline or hazard can have multiple times. Each time
will have one or more states with each state containing one or more zones.
Is there a better way than a dictionary? As mentioned above, the headline
or hazard is the key I'll be extracting all the information from.

Thanks in advance,

Tom


I'd recommend the mx.DateTime package for storing the times instead of
seconds. That module includes many useful functions you may need, so give
it a look.
http://www.egenix.com/files/python/mxDateTime.html
Secondly, although I am not 100% sure about the stated problem, I would
recommend that instead of nested dictionaries you use a tuple as a
key, say, headlines[(time, state, zone)] = someValue
From what you say above, it would seem that this would create a unique
key for all the mentioned situations.
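
A minimal sketch of the tuple-key idea, assuming the headline is folded into the key as well so that one flat dictionary can hold every warning. The name `warnings_by_key`, the placeholder time labels, and the `True` values are illustrative assumptions, not part of the original suggestion.

```python
# Flat dictionary keyed by (headline, time, state, zone) tuples.
# The key layout and the sample values are assumptions for illustration.
warnings_by_key = {}

records = [
    ("High Wind Warning", "time1", "OR", "ORZ047"),
    ("High Wind Warning", "time1", "OR", "ORZ048"),
    ("High Wind Warning", "time1", "OR", "ORZ049"),
    ("High Wind Warning", "time2", "OR", "ORZ044"),
    ("High Wind Warning", "time2", "WA", "WAZ028"),
]

for headline, time, state, zone in records:
    # Tuples are hashable, so they can be used as dictionary keys directly.
    warnings_by_key[(headline, time, state, zone)] = True

# Collecting everything for one headline and time means scanning the keys.
time2_zones = sorted(
    zone
    for (headline, time, state, zone) in warnings_by_key
    if headline == "High Wind Warning" and time == "time2"
)
```

The trade-off is that every key is unique and easy to test with `in`, but grouping by headline, time, or state requires a pass over all keys.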

Jul 18 '05 #2
In article <40******@news.bmi.net>, "T. Earle" <tn*********@bmi.net>
wrote:
...
High Wind Warning --> time1 --> state1 --> zone1, zone2, zone3
|
--> time2 --> state1 --> zone4
--> state2 --> zone5

Keep in mind, each headline or hazard can have multiple times. Each time
will have one or more states with each state containing one or more zones.
Is there a better way than a dictionary? As mentioned above, the headline
or hazard is the key I'll be extracting all the information from.


If you really only want to look up data by headline, then a dictionary
of dictionaries or nested lists or some other kind of collection is easy
and should suffice. For instance:
    warndict["High Wind Warning"] = (
        (time1, {
            state1: (zone1, zone2, zone3),
            state2: (zone1, zone3),
        }),
        (time2, {...}),
    )

However, I suspect you will also want to be able to locate data by
state, time or zone. If that is true, I really think you should consider
storing the data in a relational database. It sounds like a perfect
match to your problem. Python has some nice interfaces to various
databases (including PostgreSQL and MySQL).
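
As a rough sketch of what the relational approach could look like, the example below uses SQLite through the standard-library `sqlite3` module. That choice is an assumption made so the example is self-contained; the post itself names PostgreSQL and MySQL. The table name, column names, and the placeholder epoch seconds are also hypothetical.

```python
import sqlite3

# In-memory database; table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warning ("
    " headline TEXT, start_time INTEGER, end_time INTEGER,"
    " state TEXT, zone TEXT)"
)

# Placeholder begin/end times in seconds; real values would come from the feed.
rows = [
    ("High Wind Warning", 1000, 2000, "OR", "ORZ047"),
    ("High Wind Warning", 1000, 2000, "OR", "ORZ048"),
    ("High Wind Warning", 1000, 2000, "OR", "ORZ049"),
    ("High Wind Warning", 3000, 4000, "OR", "ORZ044"),
    ("High Wind Warning", 3000, 4000, "WA", "WAZ028"),
]
conn.executemany("INSERT INTO warning VALUES (?, ?, ?, ?, ?)", rows)


def zones_for_state(state):
    """Return the sorted, distinct zones that have any warning in a state."""
    cur = conn.execute(
        "SELECT DISTINCT zone FROM warning WHERE state = ? ORDER BY zone",
        (state,),
    )
    return [row[0] for row in cur]


def headlines_for_zone(zone):
    """Return the distinct headlines in effect for one zone."""
    cur = conn.execute(
        "SELECT DISTINCT headline FROM warning WHERE zone = ? ORDER BY headline",
        (zone,),
    )
    return [row[0] for row in cur]
```

With one row per (headline, time, state, zone) combination, lookups by state, time, or zone become a matter of changing the WHERE clause rather than restructuring nested dictionaries.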

-- Russell

P.S. if you do go with the dictionary, note that it is very easy to make
a variant dictionary that defines
a[key] = foo
to mean "if list a[key] exists, then append foo to that list, otherwise
create a new list with foo as its only element" (in fact my RO package
contains just such a class: RO.Alg.MultiDict -- see <http://www.astro.washington.edu/owen/ROPython.html>)
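
One way such a variant dictionary might be written is sketched below. `AppendDict` is a hypothetical name and this is not the RO.Alg.MultiDict implementation, only an illustration of the "append if the list exists, otherwise create it" behavior described above.

```python
class AppendDict(dict):
    """dict whose item assignment appends to a list instead of overwriting.

    Hypothetical stand-in for the variant dictionary described in the post;
    it is not the RO.Alg.MultiDict implementation.
    """

    def __setitem__(self, key, value):
        if key in self:
            # Key already present: append to the existing list.
            dict.__getitem__(self, key).append(value)
        else:
            # First value for this key: start a new one-element list.
            dict.__setitem__(self, key, [value])


zones_by_state = AppendDict()
zones_by_state["OR"] = "ORZ047"
zones_by_state["OR"] = "ORZ048"
zones_by_state["WA"] = "WAZ028"
```

Note that only `a[key] = foo` assignments go through this logic; the `dict` constructor and `update()` bypass an overridden `__setitem__`.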
Jul 18 '05 #3
Russell,
If you really only want to look up data by headline, then a dictionary
of dictionaries or nested lists or some other kind of collection is easy
and should suffice. For instance:
    warndict["High Wind Warning"] = (
        (time1, {
            state1: (zone1, zone2, zone3),
            state2: (zone1, zone3),
        }),
        (time2, {...}),
    )

This definitely seems to be the structure I've been looking for or at least
have in mind. Since I'm no expert, could you offer some code examples on how to
create this structure on the fly?

However, I suspect you will also want to be able to locate data by
state, time or zone. If that is true, I really think you should consider
storing the data in a relational database. It sounds like a perfect
match to your problem. Python has some nice interfaces to various
databases (including PostgreSQL and MySQL).


My first inclination was to go with a database; however, I thought about it
and concluded there may be too much variability each time the program is
executed. For example, there will be times when there are no headlines;
other times, there will be numerous headlines. Because of this variability,
the database would have to be created from scratch each time the program is
run. As a result, would a database still be the right choice?

I really appreciate your help and suggestions

T. Earle
Jul 18 '05 #4
T. Earle wrote:
Russell,

If you really only want to look up data by headline, then a dictionary
of dictionaries or nested lists or some other kind of collection is easy
and should suffice. For instance:
    warndict["High Wind Warning"] = (
        (time1, {
            state1: (zone1, zone2, zone3),
            state2: (zone1, zone3),
        }),
        (time2, {...}),
    )

This definitely seems to be the structure I've been looking for or at least
have in mind. Since I'm no expert, could you offer some code examples on how to
create this structure on the fly?

For something very much (but not quite) like the above:

    warndict['High Wind Warning'] = {
        time1: {
            state1: [zone1, zone2, zone3],
            state2: [zone1, zone3]},
        time2: {...},
        ...}

can be built with something like:
    warndict = {}
    for headline, time, state, zone in somesource:
        # setdefault returns the existing value, or stores and returns the default
        timedict = warndict.setdefault(headline, {})
        statedict = timedict.setdefault(time, {})
        stateentry = statedict.setdefault(state, [])
        stateentry.append(zone)

My first inclination was to go with a database; however, I thought about it
and concluded there may be too much variability each time the program is
executed. For example, there will be times when there are no headlines;
other times, there will be numerous headlines. Because of this variability,
the database would have to be created from scratch each time the program is
run. As a result, would a database still be the right choice?

It really depends on the volume of data and the kinds of searches.
Anything under a thousand or so entries will be searchable by simple
brute force in reasonable time, so internal data structures may well
be the way to go.
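
A brute-force search over the nested structure might look like the sketch below. The sample `warndict`, the placeholder time labels, and the helper name `find_by_zone` are assumptions for illustration.

```python
# Sample data in the headline -> time -> state -> [zones] layout shown above.
# The time labels and state codes are placeholders.
warndict = {
    "High Wind Warning": {
        "time1": {"OR": ["ORZ047", "ORZ048", "ORZ049"]},
        "time2": {"OR": ["ORZ044"], "WA": ["WAZ028"]},
    },
}


def find_by_zone(warndict, wanted_zone):
    """Return (headline, time, state) for every entry that lists wanted_zone."""
    return [
        (headline, time, state)
        for headline, timedict in warndict.items()
        for time, statedict in timedict.items()
        for state, zones in statedict.items()
        if wanted_zone in zones
    ]
```

Every lookup that is not by headline walks the whole structure, which is fine for small data sets and is the cost the relational alternative avoids.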

--
-Scott David Daniels
Sc***********@Acm.Org
Jul 18 '05 #5
Scott,

I really appreciate your help and code. It really helps me to understand
the underlying solution to my problem. I have another question though,
what's the best way to test if the headline already exists? If it does not,
I need to create it along with the required associated data; however, if it
already exists, I need to test to ensure I'm not already adding data that's
already there (e.g., time and/or state already exists). Basically, I
envision, if the state already exists all I need to do is add the new zone.
I probably should check to make sure the zone doesn't already exist too.
Any help would be greatly appreciated. I believe it would be similar to
what Russell mentioned in his previous response:

"if list a[key] exists, then append; otherwise, create a new list"

Would it be possible to supply a code snippet of this logic to get me
started? What are state and time? Is it possible to use the "key" keyword
on these variables to test for their existence? I apologize for my lack of
knowledge in this particular realm of programming in Python. Nested
dictionaries have always given me trouble.

Thanks,

T. Earle

    warndict['High Wind Warning'] = {
        time1: {
            state1: [zone1, zone2, zone3],
            state2: [zone1, zone3]},
        time2: {...},
        ...}

can be built with something like:
    warndict = {}
    for headline, time, state, zone in somesource:
        timedict = warndict.setdefault(headline, {})
        statedict = timedict.setdefault(time, {})
        stateentry = statedict.setdefault(state, [])
        stateentry.append(zone)

Jul 18 '05 #6
T. Earle wrote:
Scott,

I really appreciate your help and code. It really helps me to understand
the underlying solution to my problem. I have another question though,
what's the best way to test if the headline already exists? If it does not,
I need to create it along with the required associated data; however, if it
already exists, I need to test to ensure I'm not already adding data that's
already there (e.g., time and/or state already exists). Basically, I
envision, if the state already exists all I need to do is add the new zone.
I probably should check to make sure the zone doesn't already exist too.
Any help would be greatly appreciated. I believe it would be similar to
what Russell mentioned in his previous response:

"if list a[key] exists, then append; otherwise, create a new list"

Would it be possible to supply a code snippet of this logic to get me
started? What are state and time? Is it possible to use the "key" keyword
on these variables to test for their existence? I apologize for my lack of
knowledge in this particular realm of programming in Python. Nested
dictionaries have always given me trouble.

Thanks,

T. Earle


Check out "MultiDict.py" at "http://members.tripod.com/~edcjones/".

Ed Jones
Jul 18 '05 #7
T. Earle wrote:
"if list a[key] exists, then append; otherwise, create a new list"


When a is a dict, key is the required key, and object is the value you want
to insert:

a.setdefault(key,[]).append(object)
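
Applied to the nested headline structure, the same idiom could be chained once per level, with a membership test to skip zones that are already stored. This is a minimal sketch; the function name `add_zone` and the sample values are assumptions.

```python
def add_zone(warndict, headline, time, state, zone):
    """Add zone under headline/time/state, creating missing levels as needed.

    Existing headlines, times, and states are reused; a zone that is already
    present is not added a second time.
    """
    zones = (
        warndict.setdefault(headline, {})
        .setdefault(time, {})
        .setdefault(state, [])
    )
    if zone not in zones:
        zones.append(zone)
    return zones


warndict = {}
add_zone(warndict, "High Wind Warning", "time1", "OR", "ORZ047")
add_zone(warndict, "High Wind Warning", "time1", "OR", "ORZ048")
add_zone(warndict, "High Wind Warning", "time1", "OR", "ORZ047")  # duplicate, ignored
```

To test for existence without creating anything, the `in` operator works at each level, for example `"High Wind Warning" in warndict`.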

--Irmen
Jul 18 '05 #8
has
"T. Earle" <tn*********@bmi.net> wrote in message news:<40******@news.bmi.net>...
To list,

I'm trying to figure out the best approach to the following problem:

I have four variables:
1) headlines
2) times
3) states
4) zones

At this time, I'm thinking of creating a dictionary, headlinesDB, that
stores different headlines and their associated time(s), state(s), and
zone(s). The complexity is that each headline can have one or more times,
one or more states, and one or more zones. However, there can only be 1
zone per time, and 1 zone per state. What is the best way to tackle this
particular problem?


Shake out non-essential complexity first. Not really up on relational
DBs and stuff, so take my attempts at table design with a pinch of
salt, but think I'd break your problem down something like this:
- Hazard Type Table
TYPE
High Wind Warning
Tornado Warning
Blizzard Warning

- Hazard Event Table
ID  TYPE               START                END                  ZONES
1   High Wind Warning  2004-03-01-22-00-00  2004-03-02-08-00-00  [ORZ047, ORZ048, ORZ049]
2   High Wind Warning  2004-03-02-12-00-00  2004-03-02-20-00-00  [ORZ044, WAZ028]

- Zone Table
ZONE    STATE
ORZ044  Oregon
ORZ047  Oregon
ORZ048  Oregon
ORZ049  Oregon
WAZ028  Washington

Note this organises around individual hazard 'events', rather than
hazard types, making it easier to see what's going on. Also,
because Zones already identify their States, there's no need to put
State info into hazard events. (State names, if you need them, can be
looked up separately.)

How you actually implement it - as a relational DB/a list of
HazardEvent instances stuffed into a list and brute-force searched via
list comprehensions/nested dicts and lists - really depends on how
you're going to manipulate it, how much flexibility/simplicity you
need, etc.
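
For the "list of HazardEvent instances searched via list comprehensions" option, a rough sketch might look like this. The field names, the `ZONE_STATE` lookup, and the helper `events_in_state` are assumptions modelled on the tables above.

```python
from dataclasses import dataclass

# Zone -> state lookup, mirroring the Zone table above.
ZONE_STATE = {
    "ORZ044": "Oregon",
    "ORZ047": "Oregon",
    "ORZ048": "Oregon",
    "ORZ049": "Oregon",
    "WAZ028": "Washington",
}


@dataclass
class HazardEvent:
    """One row of the Hazard Event table (field names are illustrative)."""
    hazard_type: str
    start: str
    end: str
    zones: list


events = [
    HazardEvent("High Wind Warning", "2004-03-01-22-00-00",
                "2004-03-02-08-00-00", ["ORZ047", "ORZ048", "ORZ049"]),
    HazardEvent("High Wind Warning", "2004-03-02-12-00-00",
                "2004-03-02-20-00-00", ["ORZ044", "WAZ028"]),
]


def events_in_state(events, state):
    """Brute-force search: events with at least one zone in the given state."""
    return [
        event for event in events
        if any(ZONE_STATE[zone] == state for zone in event.zones)
    ]
```

Because each zone already implies its state, the events never store state names; they are derived from `ZONE_STATE` at query time.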

HTH

has
Jul 18 '05 #9
In article <40******@news.bmi.net>, "T. Earle" <tn*********@bmi.net>
wrote:
Russell,
If you really only want to look up data by headline, then a dictionary
of dictionaries or nested lists or some other kind of collection is easy
and should suffice. For instance:
    warndict["High Wind Warning"] = (
        (time1, {
            state1: (zone1, zone2, zone3),
            state2: (zone1, zone3),
        }),
        (time2, {...}),
    )


This definitely seems to be the structure I've been looking for or at least
have in mind. Since I'm no expert, could you offer some code examples on how to
create this structure on the fly?
However, I suspect you will also want to be able to locate data by
state, time or zone. If that is true, I really think you should consider
storing the data in a relational database. It sounds like a perfect
match to your problem. Python has some nice interfaces to various
databases (including PostgreSQL and MySQL).


My first inclination was to go with a database; however, I thought about it
and concluded there may be too much variability each time the program is
executed. For example, there will be times when there are no headlines;
other times, there will be numerous headlines. Because of this variability,
the database would have to be created from scratch each time the program is
run. As a result, would a database still be the right choice?

I really appreciate your help and suggestions


Regarding a database: if you are mainly interested in fairly current
events (rather than being able to go back and search for old events) and
you don't have a huge # of events, then a database does seem "overkill".

However, if you have a lot of events or want to do a lot of searching,
it may be worth keeping a database around. If you use a database, I
recommend creating only one of them. Just add new events, and
occasionally purge old data if you don't care about it anymore.

Here is some sample code (untested) to create the structure shown above.
I assume a simple (for me) structure for the input data; modify
addHeadline accordingly if your data needs more massaging first.

This code exposes the internal data, because the class is itself the
dictionary of data. Whether or not this is a good idea depends on how
you want to search for data. If the built in dict methods are of
interest, then you are all set. If not, I would make HeadDict *contain*
a dict instead of *being* a dict, then write your own methods to
retrieve data.

- Download the RO package from http://astro.washington.edu/owen and install it in site-packages
or anywhere on your PythonPath. RO includes RO.Alg.ListDict, which
supports a dictionary whose values are a list and for which the
expression md[key] = value appends "value" to the list associated with
"key", creating a new list if "key" doesn't already exist.

    import RO.Alg

    class HeadDict(RO.Alg.ListDict):
        def addHeadline(self, headline, time, stateZoneList):
            """Add a headline for a given time. stateZoneList is of the form:
            ((state1, zones_for_state1), (state2, zones_for_state2), ...)
            """
            stateZoneDict = dict(stateZoneList)
            # ListDict appends on assignment, so repeated headlines accumulate.
            self[headline] = (time, stateZoneDict)

    warnDict = HeadDict()
    warnDict.addHeadline("High Wind Warning", time1, stateZoneList1)
    warnDict.addHeadline("High Wind Warning", time2, stateZoneList2)
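
For the alternative mentioned above, a class that *contains* a dict rather than *being* one, a standard-library-only sketch could look like the following. `HeadlineStore` and its method names are hypothetical, and the class does not depend on or reproduce the RO package.

```python
class HeadlineStore:
    """Holds headline data in a private dict instead of subclassing dict.

    Hypothetical stdlib-only sketch; it does not use the RO package.
    """

    def __init__(self):
        self._data = {}

    def add_headline(self, headline, time, state_zone_list):
        """state_zone_list has the form ((state1, zones1), (state2, zones2), ...)."""
        state_zones = {state: list(zones) for state, zones in state_zone_list}
        # Append to the headline's list, creating the list on first use.
        self._data.setdefault(headline, []).append((time, state_zones))

    def times(self, headline):
        """Return the times recorded for a headline, in insertion order."""
        return [time for time, _ in self._data.get(headline, [])]

    def zones(self, headline):
        """Return every distinct zone recorded for a headline, sorted."""
        return sorted({
            zone
            for _, state_zones in self._data.get(headline, [])
            for zones in state_zones.values()
            for zone in zones
        })


store = HeadlineStore()
store.add_headline("High Wind Warning", "time1",
                   (("OR", ("ORZ047", "ORZ048", "ORZ049")),))
store.add_headline("High Wind Warning", "time2",
                   (("OR", ("ORZ044",)), ("WA", ("WAZ028",))))
```

Keeping the dict private means callers only see the retrieval methods, so the internal layout can change later without breaking them.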

-- Russell
Jul 18 '05 #10
