To list,
I'm trying to figure out the best approach to the following problem:
I have four variables:
1) headlines
2) times
3) states
4) zones
At this time, I'm thinking of creating a dictionary, headlinesDB, that
stores different headlines and their associated time(s), state(s), and
zone(s). The complexity is that each headline can have one or more times,
one or more states, and one or more zones. However, there can only be 1
zone per time, and 1 zone per state. What is the best way to tackle this
particular problem?
Here's an example of the complexity:
Let's say we have a "High Wind Warning" for our headline or hazard. In
addition, there are currently two "High Wind Warnings" in effect. The first
goes from Tonight through Friday morning (i.e., I'll probably store the
begin/end times in seconds from 1/1/1970). It affects three counties all in
the state of Oregon: ORZ047, ORZ048, and ORZ049. The second High Wind
Warning is in effect from Friday at Noon through Friday evening. It affects
two counties in two separate states: ORZ044 in Oregon and WAZ028 in
Washington. Here's the flow chart:
High Wind Warning --> time1 --> state1 --> zone1, zone2, zone3
                  |
                  --> time2 --> state1 --> zone4
                            --> state2 --> zone5
Keep in mind, each headline or hazard can have multiple times. Each time
will have one or more states with each state containing one or more zones.
Is there a better way than a dictionary? As mentioned above, the headline
or hazard is the key I'll be extracting all the information from.
Thanks in advance,
Tom
I'd recommend the mx.DateTime package for storing the times instead of
seconds. That module includes many useful functions you may need, so give
it a look: http://www.egenix.com/files/python/mxDateTime.html
Secondly, although I am not 100% sure about the stated problem, I would
recommend that instead of nested dictionaries you use a tuple as the
key, say, headlines[(time, state, zone)] = someValue
From what you say above, it would seem that this would create a unique
key for all the mentioned situations.
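For illustration, here is a minimal sketch of the tuple-key idea. The sample records, the "time1"/"time2" placeholder labels, and the "OR"/"WA" state codes are assumptions made up for this example; only the zone names come from the original question.

```python
# Keys are (time, state, zone) tuples; values are the headline text.
# "time1" and "time2" are placeholder labels, not real timestamps.
records = [
    ("time1", "OR", "ORZ047", "High Wind Warning"),
    ("time1", "OR", "ORZ048", "High Wind Warning"),
    ("time1", "OR", "ORZ049", "High Wind Warning"),
    ("time2", "OR", "ORZ044", "High Wind Warning"),
    ("time2", "WA", "WAZ028", "High Wind Warning"),
]

headlines = {}
for time, state, zone, headline in records:
    headlines[(time, state, zone)] = headline

# An exact lookup needs the full key.
print(headlines[("time2", "WA", "WAZ028")])

# Any partial lookup (here: all zones in one state) has to scan every key.
oregon_zones = sorted(zone for (time, state, zone) in headlines if state == "OR")
print(oregon_zones)
```

The flat key keeps insertion trivial, but note the trade-off: finding everything for one headline or one state means walking all the keys.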
In article <40******@news.bmi.net>, "T. Earle" <tn*********@bmi.net>
wrote: ... High Wind Warning --> time1 --> state1 --> zone1, zone2, zone3 | --> time2 --> state1 --> zone4 --> state2 --> zone5
Keep in mind, each headline or hazard can have multiple times. Each time will have one or more states with each state containing one or more zones. Is there a better way than a dictionary. As mentioned above, the headline or hazard is the key I'll be extracting all the information from.
If you really only want to look up data by headline, then a dictionary
of dictionaries or nested lists or some other kind of collection is easy
and should suffice. For instance:
warndict["High Wind Warning"] = (
    (time1, {
        state1: (zone1, zone2, zone3),
        state2: (zone1, zone3),
    }),
    (time2, {...}),
)
However, I suspect you will also want to be able to locate data by
state, time or zone. If that is true, I really think you should consider
storing the data in a relational database. It sounds like a perfect
match to your problem. Python has some nice interfaces to various
databases (including PostgreSQL and MySQL).
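As a rough sketch of the relational approach, the example below uses Python's standard-library sqlite3 module with an in-memory database rather than PostgreSQL or MySQL. That substitution, the single-table layout, the column names, and the sample rows are all assumptions chosen to keep the example self-contained.

```python
import sqlite3

# An in-memory database is rebuilt on every run, so there is nothing to maintain.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warnings (headline TEXT, start_time TEXT, state TEXT, zone TEXT)"
)

# Hypothetical sample rows; "time1"/"time2" stand in for real timestamps.
rows = [
    ("High Wind Warning", "time1", "OR", "ORZ047"),
    ("High Wind Warning", "time1", "OR", "ORZ048"),
    ("High Wind Warning", "time1", "OR", "ORZ049"),
    ("High Wind Warning", "time2", "OR", "ORZ044"),
    ("High Wind Warning", "time2", "WA", "WAZ028"),
]
conn.executemany("INSERT INTO warnings VALUES (?, ?, ?, ?)", rows)

# Look up by state instead of by headline.
oregon = conn.execute(
    "SELECT zone FROM warnings WHERE state = ? ORDER BY zone", ("OR",)
).fetchall()
print(oregon)

# Look up by zone.
hits = conn.execute(
    "SELECT headline, start_time FROM warnings WHERE zone = ?", ("WAZ028",)
).fetchall()
print(hits)
```

The point of the sketch is only that the same rows can be queried by any column without restructuring the data.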
-- Russell
P.S. if you do go with the dictionary, note that it is very easy to make
a variant dictionary that defines
a[key] = foo
to mean "if list a[key] exists, then append foo to that list, otherwise
create a new list with foo as its only element" (in fact my RO package
contains just such a class: RO.Alg.MultiDict -- see <http://www.astro.washington.edu/owen/ROPython.html>)
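A dictionary with that append-on-assignment behaviour could be sketched roughly as follows. The class name AppendDict is hypothetical and this is not the RO.Alg.MultiDict implementation, just one plain-Python way to get the described effect.

```python
class AppendDict(dict):
    """Hypothetical stand-in: a[key] = foo appends foo to a list at a[key]."""

    def __setitem__(self, key, value):
        if key in self:
            # The list already exists, so append to it.
            self[key].append(value)
        else:
            # First value for this key: create a new one-element list.
            dict.__setitem__(self, key, [value])


a = AppendDict()
a["High Wind Warning"] = "time1"
a["High Wind Warning"] = "time2"
a["Tornado Warning"] = "time3"
print(a)
```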
Russell,
This definitely seems to be the structure I've been looking for, or at least
have in mind. Since I'm no expert, could you offer some code examples on how to
create this structure on the fly?
My first inclination was to go with a database; however, I thought about it
and concluded there may be too much variability each time the program is
executed. For example, there will be times when there are no headlines;
other times, there will be numerous headlines. Because of this variability,
the database would have to be created from scratch each time the program is
run. As a result, would a database still be the right choice?
I really appreciate your help and suggestions.
T. Earle
For something very much (but not quite) like the above:
warndict['High Wind Warning'] = {
    time1: {
        state1: [zone1, zone2, zone3],
        state2: [zone1, zone3]},
    time2: {...},
    ...}
can be built with something like:
warndict = {}
for headline, time, state, zone in somesource:
    timedict = warndict.setdefault(headline, {})
    statedict = timedict.setdefault(time, {})
    stateentry = statedict.setdefault(state, [])
    stateentry.append(zone)
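To try this out, here is a small runnable version. The contents of somesource are made-up sample tuples based on the High Wind Warning example from the original question; "time1"/"time2" and "OR"/"WA" are placeholder values.

```python
# Hypothetical input: one (headline, time, state, zone) tuple per zone.
somesource = [
    ("High Wind Warning", "time1", "OR", "ORZ047"),
    ("High Wind Warning", "time1", "OR", "ORZ048"),
    ("High Wind Warning", "time1", "OR", "ORZ049"),
    ("High Wind Warning", "time2", "OR", "ORZ044"),
    ("High Wind Warning", "time2", "WA", "WAZ028"),
]

warndict = {}
for headline, time, state, zone in somesource:
    # Each setdefault returns the existing entry or creates an empty one.
    timedict = warndict.setdefault(headline, {})
    statedict = timedict.setdefault(time, {})
    statedict.setdefault(state, []).append(zone)

# Walk the nested structure: headline -> time -> state -> zones.
for time, statedict in warndict["High Wind Warning"].items():
    for state, zones in statedict.items():
        print(time, state, zones)
```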
It really depends on the volume of data and the kinds of searches.
Anything under a thousand or so entries will be searchable by simple
brute force in reasonable time, so internal data structures may well
be the way to go.
--
-Scott David Daniels Sc***********@Acm.Org
Scott,
I really appreciate your help and code. It really helps me to understand
the underlying solution to my problem. I have another question though:
what's the best way to test if the headline already exists? If it does not,
I need to create it along with the required associated data; however, if it
already exists, I need to make sure I'm not adding data that's
already there (e.g., the time and/or state already exists). Basically, I
envision that if the state already exists, all I need to do is add the new zone.
I probably should check to make sure the zone doesn't already exist too.
Any help would be greatly appreciated. I believe it would be similar to
what Russell mentioned in his previous response:
"if list a[key] exists, then append; otherwise, create a new list"
Would it be possible to supply a code snippet of this logic to get me
started? What are state and time? Is it possible to use the "key" keyword
on these variables to test for their existence? I apologize for my lack of
knowledge in this particular realm of programming in Python. Nested
dictionaries have always given me trouble.
Thanks,
T. Earle
Check out "MultiDict.py" at "http://members.tripod.com/~edcjones/".
Ed Jones
T. Earle wrote: "if list a[key] exists, then append; otherwise, create a new list"
When a is a dict, key is the required key, and object is the value you want
to insert:
    a.setdefault(key, []).append(object)
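Applied to the nested headline structure, the same idiom plus a membership test with the "in" operator might look like the sketch below. The helper name add_zone and the sample values are hypothetical; the zones are kept in a list here, though a set would drop duplicates on its own.

```python
def add_zone(warndict, headline, time, state, zone):
    """Add one zone, creating any missing levels and skipping duplicates.

    Returns True if the zone was added, False if it was already present.
    """
    zones = (
        warndict.setdefault(headline, {})
        .setdefault(time, {})
        .setdefault(state, [])
    )
    if zone in zones:
        return False
    zones.append(zone)
    return True


warndict = {}
add_zone(warndict, "High Wind Warning", "time1", "OR", "ORZ047")
add_zone(warndict, "High Wind Warning", "time1", "OR", "ORZ048")
added_again = add_zone(warndict, "High Wind Warning", "time1", "OR", "ORZ047")

# "in" tests whether a key exists at any level of the structure.
print("High Wind Warning" in warndict)
print("time2" in warndict["High Wind Warning"])
print(added_again)
```

Because setdefault only creates an entry when the key is missing, no explicit "does the headline exist yet?" branch is needed before adding data.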
--Irmen
Shake out non-essential complexity first. Not really up on relational
DBs and stuff, so take my attempts at table design with a pinch of
salt, but think I'd break your problem down something like this:
- Hazard Type Table

  TYPE
  High Wind Warning
  Tornado Warning
  Blizzard Warning

- Hazard Event Table

  ID  TYPE               START                END                  ZONES
  1   High Wind Warning  2004-03-01-22-00-00  2004-03-02-08-00-00  [ORZ047, ORZ048, ORZ049]
  2   High Wind Warning  2004-03-02-12-00-00  2004-03-02-20-00-00  [ORZ044, WAZ028]

- Zone Table

  ZONE    STATE
  ORZ044  Oregon
  ORZ047  Oregon
  ORZ048  Oregon
  ORZ049  Oregon
  WAZ028  Washington
Note this organises around individual hazard 'events', rather than
hazard types, making it easier to see what's going on. Also,
because Zones already identify their States, there's no need to put
State info into hazard events. (State names, if you need them, can be
looked up separately.)
How you actually implement it - as a relational DB/a list of
HazardEvent instances stuffed into a list and brute-force searched via
list comprehensions/nested dicts and lists - really depends on how
you're going to manipulate it, how much flexibility/simplicity you
need, etc.
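As one possible reading of the "list of HazardEvent instances" option, here is a minimal sketch using a dataclass and list comprehensions. The field names, the ZONE_STATE lookup dict, and the events_in_state helper are assumptions for illustration; the data is taken from the tables above.

```python
from dataclasses import dataclass

# Zone table: each zone already identifies its state.
ZONE_STATE = {
    "ORZ044": "Oregon",
    "ORZ047": "Oregon",
    "ORZ048": "Oregon",
    "ORZ049": "Oregon",
    "WAZ028": "Washington",
}


@dataclass
class HazardEvent:
    hazard_type: str
    start: str
    end: str
    zones: list


events = [
    HazardEvent("High Wind Warning", "2004-03-01-22-00-00",
                "2004-03-02-08-00-00", ["ORZ047", "ORZ048", "ORZ049"]),
    HazardEvent("High Wind Warning", "2004-03-02-12-00-00",
                "2004-03-02-20-00-00", ["ORZ044", "WAZ028"]),
]


def events_in_state(state):
    """Brute-force search: keep events with at least one zone in the state."""
    return [e for e in events if any(ZONE_STATE[z] == state for z in e.zones)]


washington = events_in_state("Washington")
oregon = events_in_state("Oregon")
print(len(washington), len(oregon))
```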
HTH
has
Regarding a database: if you are mainly interested in fairly current
events (rather than being able to go back and search for old events) and
you don't have a huge # of events, then a database does seem "overkill".
However, if you have a lot of events or want to do a lot of searching,
it may be worth keeping a database around. If you use a database, I
recommend creating only one of them. Just add new events, and
occasionally purge old data if you don't care about it anymore.
Here is some sample code (untested) to create the structure shown above.
I assume a simple (for me) structure for the input data; modify
addHeadline accordingly if your data needs more massaging first.
This code exposes the internal data, because the class is itself the
dictionary of data. Whether or not this is a good idea depends on how
you want to search for data. If the built in dict methods are of
interest, then you are all set. If not, I would make HeadDict *contain*
a dict instead of *being* a dict, then write your own methods to
retrieve data.
- Download the RO package from http://astro.washington.edu/owen and install it in site-packages
or anywhere on your PythonPath. RO includes RO.Alg.ListDict, which
supports a dictionary whose values are a list and for which the
expression md[key] = value appends "value" to the list associated with
"key", creating a new list if "key" doesn't already exist.
import RO.Alg

class HeadDict(RO.Alg.ListDict):
    def addHeadline(self, headline, time, stateZoneList):
        """Add a headline for a given time. stateZoneList is of the form:
        ((state1, zones_for_state1), (state2, zones_for_state2), ...)
        """
        stateZoneDict = dict(stateZoneList)
        # ListDict appends on assignment, so each call adds one (time, zones) entry.
        self[headline] = (time, stateZoneDict)

warnDict = HeadDict()
warnDict.addHeadline("High Wind Warning", time1, stateZoneList1)
warnDict.addHeadline("High Wind Warning", time2, stateZoneList2)
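The alternative mentioned above, where the class contains a dict instead of being one, could be sketched as follows without the RO package. The class name HeadStore, its method names, and the sample data are hypothetical; this is a plain-Python stand-in, not part of RO.

```python
class HeadStore:
    """Hypothetical composition-based variant: holds a dict internally."""

    def __init__(self):
        # headline -> list of (time, {state: zones}) entries
        self._data = {}

    def add_headline(self, headline, time, state_zone_list):
        """state_zone_list has the form ((state1, zones1), (state2, zones2), ...)."""
        self._data.setdefault(headline, []).append((time, dict(state_zone_list)))

    def times_for(self, headline):
        """Return all times recorded for a headline (empty list if none)."""
        return [time for time, _ in self._data.get(headline, [])]

    def zones_for(self, headline, state):
        """Return every zone in the given state across all times of a headline."""
        return [
            zone
            for _, state_zones in self._data.get(headline, [])
            for zone in state_zones.get(state, ())
        ]


store = HeadStore()
store.add_headline("High Wind Warning", "time1",
                   [("OR", ("ORZ047", "ORZ048", "ORZ049"))])
store.add_headline("High Wind Warning", "time2",
                   [("OR", ("ORZ044",)), ("WA", ("WAZ028",))])
print(store.times_for("High Wind Warning"))
print(store.zones_for("High Wind Warning", "OR"))
```

Hiding the dict this way means callers only see the retrieval methods, so the internal layout can change later without breaking them.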
-- Russell