Array of dict or lists or ....?
I can't figure out how to set up a Python data structure to read in data
that looks something like this (albeit somewhat simplified and contrived):
States
    Counties
        Schools
            Classes
                Max Allowed Students
                Current enrolled Students
Nebraska, Wabash, Newville, Math, 20, 0
Nebraska, Wabash, Newville, Gym, 400, 0
Nebraska, Tingo, Newfille, Gym, 400, 0
Ohio, Dinger, OldSchool, English, 10, 0
With each line I read in, I would create a hash entry and increment the
number of enrolled students.
I wrote a routine in Perl using arrays of hash tables (the syntax was a
bear) that let me read in the data. Because the arrays of hash tables
pointed to further arrays of hash tables, almost everything was assigned
dynamically.
I was able to fill in the hash tables and determine if any school class
(e.g. Gym) had exceeded the number of max students or if no students had
enrolled.
No, this is not a classroom project. I really need this for my job.
I'm converting my Perl program to Python and this portion has me stumped.
The reason I'm converting a perfectly working program is that no one
else here knows Perl (or Python, for that matter), but I believe that
someone new would learn Python more quickly than Perl, and the Perl
program has become huge and is continuously growing.

re: Array of dict or lists or ....?
> I can't figure out how to set up a Python data structure to read in data
> that looks something like this (albeit somewhat simplified and contrived):
>
> States
>     Counties
>         Schools
>             Classes
>                 Max Allowed Students
>                 Current enrolled Students
>
> Nebraska, Wabash, Newville, Math, 20, 0
> Nebraska, Wabash, Newville, Gym, 400, 0
> Nebraska, Tingo, Newfille, Gym, 400, 0
> Ohio, Dinger, OldSchool, English, 10, 0
>
> With each line I read in, I would create a hash entry and increment the
> number of enrolled students.

A Python version of what you describe:

  class TooManyAttendants(Exception): pass

  class Attendence(object):
    def __init__(self, max):
      self.max = int(max)
      self.total = 0
    def accrue(self, other):
      self.total += int(other)
      if self.total > self.max: raise TooManyAttendants
    def __str__(self):
      return "%s/%s" % (self.max, self.total)
    __repr__ = __str__

  data = {}
  for i, line in enumerate(file("input.txt")):
    print line,
    state, county, school, cls, max_students, enrolled = map(
      lambda s: s.strip(),
      line.rstrip("\r\n").split(",")
      )
    try:
      # one dict level each for state, county and school
      data.setdefault(
        state, {}).setdefault(
        county, {}).setdefault(
        school, {}).setdefault(
        cls, Attendence(max_students)).accrue(enrolled)
    except TooManyAttendants:
      print "Too many Attendants in line %i" % (i + 1)
  print repr(data)

You can then access things like

  a = data["Nebraska"]["Wabash"]["Newville"]["Math"]
  print a.max, a.total

If capitalization varies, you may have to do something like

  data.setdefault(
    state.upper(), {}).setdefault(
    county.upper(), {}).setdefault(
    school.upper(), {}).setdefault(
    cls.upper(), Attendence(max_students)).accrue(enrolled)

to make sure they're normalized into the same groupings.

-tkc

re: Array of dict or lists or ....?
Tim Chase:
> __repr__ = __str__

I don't know if that's a good practice.

> try:
>   data.setdefault(
>     state, {}).setdefault(
>     county, {}).setdefault(
>     school, {}).setdefault(
>     cls, Attendence(max_students)).accrue(enrolled)
> except TooManyAttendants:

I suggest decompressing that part a little, to make it more
readable.
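
For illustration, one way to decompress the chain is to give each level a name as it is created. This is only a sketch in Python 3 syntax: the helper name `record` and the plain-dict entry with "max" and "total" keys are assumptions made for this example, standing in for the Attendence class from the quoted post.

```python
def record(data, state, county, school, cls, max_students, enrolled):
    """Add one input row to the nested dicts and return the class entry."""
    counties = data.setdefault(state, {})
    schools = counties.setdefault(county, {})
    classes = schools.setdefault(school, {})
    # Create the entry the first time this class is seen, then update it.
    entry = classes.setdefault(cls, {"max": int(max_students), "total": 0})
    entry["total"] += int(enrolled)
    return entry


data = {}
record(data, "Nebraska", "Wabash", "Newville", "Math", 20, 0)
record(data, "Nebraska", "Wabash", "Newville", "Math", 20, 3)
entry = record(data, "Nebraska", "Wabash", "Newville", "Gym", 400, 0)
print(data["Nebraska"]["Wabash"]["Newville"]["Math"])
```

Each intermediate name says which level of the hierarchy it holds, at the cost of a few more lines than the chained version.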

Bye,
bearophile

re: Array of dict or lists or ....?
>> __repr__ = __str__
>
> I don't know if that's a good practice.

I've seen it in a couple places, and it's pretty explicit what
it's doing.

>> try:
>>   data.setdefault(
>>     state, {}).setdefault(
>>     county, {}).setdefault(
>>     school, {}).setdefault(
>>     cls, Attendence(max_students)).accrue(enrolled)
>> except TooManyAttendants:
>
> I suggest to decompress that part a little, to make it a little more
> readable.

I played around with the formatting and didn't really like any of
the versions I came up with. My other possible alternatives were:

  try:
    data \
      .setdefault(state, {}) \
      .setdefault(county, {}) \
      .setdefault(school, {}) \
      .setdefault(cls, Attendence(max_students)) \
      .accrue(enrolled)
  except TooManyAttendants:

or

  try:
    (data
      .setdefault(state, {})
      .setdefault(county, {})
      .setdefault(school, {})
      .setdefault(cls, Attendence(max_students))
      ).accrue(enrolled)
  except TooManyAttendants:

Both accentuate the setdefault() calls grouped with their
parameters, which can be helpful. Which one is "better" is a
matter of personal preference:

 * no extra characters but hard to read
 * backslashes, or
 * an extra pair of parens

-tkc

re: Array of dict or lists or ....?
On Mon, 06 Oct 2008 22:52:29 -0300, Tim Chase
<python.list@tim.thechases.com> wrote:

>>> __repr__ = __str__

[bearophileHUGS@lycos.com wrote]
>> I don't know if that's a good practice.

> I've seen it in a couple places, and it's pretty explicit what it's
> doing.

__repr__ is used as a fallback for __str__, so just defining __repr__ (and
leaving out __str__) is enough.
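
A quick way to see the fallback, as a sketch in Python 3 syntax (the class name `OnlyRepr` is made up for this example):

```python
class OnlyRepr(object):
    """Defines __repr__ and leaves __str__ to the inherited default."""

    def __init__(self, max_students, total):
        self.max = max_students
        self.total = total

    def __repr__(self):
        return "%s/%s" % (self.max, self.total)


a = OnlyRepr(20, 0)
print(repr(a))  # calls __repr__ directly
print(str(a))   # the inherited __str__ falls back to __repr__
print([a])      # containers display their items with repr()
```

The reverse does not hold: a class that defines only __str__ still gets the default `<... object at 0x...>` text from repr().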

-- 
Gabriel Genellina

re: Array of dict or lists or ....?
Tim Chase <python.list@tim.thechases.com> writes:

>>> __repr__ = __str__
>>
>> I don't know if that's a good practice.
>
> I've seen it in a couple places, and it's pretty explicit what it's
> doing.

But what's the point? Simply define __repr__, and both repr and str
will pick it up.

re: Array of dict or lists or ....?
Dennis Lee Bieber wrote:
> On Mon, 06 Oct 2008 19:45:07 -0400, Pat <Pat@junk.com> declaimed the
> following in comp.lang.python:
>
>> I can't figure out how to set up a Python data structure to read in data
>> that looks something like this (albeit somewhat simplified and contrived):
>>
>> States
>>     Counties
>>         Schools
>>             Classes
>>                 Max Allowed Students
>>                 Current enrolled Students
>>
>> Nebraska, Wabash, Newville, Math, 20, 0
>> Nebraska, Wabash, Newville, Gym, 400, 0
>> Nebraska, Tingo, Newfille, Gym, 400, 0
>> Ohio, Dinger, OldSchool, English, 10, 0
>
> <snip>
>
> The structure looks more suited to a database -- maybe SQLite since
> the interface is supplied with the newer versions of Python (and
> available for older versions).

I don't understand why I need a database when it should just be a matter
of defining the data structure. I used a fictional example to make it
easier to (hopefully) convey how the data is laid out.
One of the routines in the actual program checks a few thousand
computers to verify that certain processes are running. I didn't want
to complicate my original question by going through all of the gory
details (multiple userids running many processes with some of the
processes having the same name). To save time, I fork a process for
each computer that I'm checking. It seems to me that banging away at a
database would greatly slow down the program and make the program more
complicated.
The Perl routine works fine and I'd like to emulate that behavior, but
since I've just started learning Python I don't know the syntax for
designing the data structure. I would really appreciate it if someone
could point me in the right direction.

re: Array of dict or lists or ....?
Would the following be a suitable data structure:
....
struct = {}
struct["Nebraska"] = "Wabash"
struct["Nebraska"]["Wabash"] = "Newville"
struct["Nebraska"]["Wabash"]["Newville"]["topics"] = "Math"
struct["Nebraska"]["Wabash"]["Newville"]["Math"]["Max Allowed Students"] = 20
struct["Nebraska"]["Wabash"]["Newville"]["Math"]["Current enrolled Students"] = 0
....
Have an easy Yom Kippur,
Ron.

re: Array of dict or lists or ....?
On Oct 7, 10:16 am, "Barak, Ron" <Ron.Ba...@lsi.com> wrote:
> Would the following be a suitable data structure:
> ...
> struct = {}
> struct["Nebraska"] = "Wabash"
> struct["Nebraska"]["Wabash"] = "Newville"
> struct["Nebraska"]["Wabash"]["Newville"]["topics"] = "Math"
> struct["Nebraska"]["Wabash"]["Newville"]["Math"]["Max Allowed Students"] = 20
> struct["Nebraska"]["Wabash"]["Newville"]["Math"]["Current enrolled Students"] = 0
> ...

That's not quite right as stated.

>>> struct = {}
>>> struct["Nebraska"] = "Wabash"
>>> struct["Nebraska"]["Wabash"] = "Newville"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

The second statement binds struct["Nebraska"] to the string "Wabash", so
the third one tries to assign into a string rather than into a nested dict.

re: Array of dict or lists or ....?
> -----Original Message-----
> From: python-list-bounces+jr9445=att.com@python.org
> [mailto:python-list-bounces+jr9445=att.com@python.org] On Behalf Of Pat
> Sent: Tuesday, October 07, 2008 10:16 PM
> To: python-list@python.org
> Subject: Re: Array of dict or lists or ....?
>
> The Perl routine works fine and I'd like to emulate that behavior but
> since I've just started learning Python I don't know the syntax for
> designing the data structure. I would really appreciate it if someone
> could point me in the right direction.

states = {}

if 'georgia' not in states:
    states['georgia'] = {}

states['georgia']['fulton'] = {}
states['georgia']['fulton']['ps101'] = {}
states['georgia']['fulton']['ps101']['math'] = {}
states['georgia']['fulton']['ps101']['math']['max'] = 100
states['georgia']['fulton']['ps101']['math']['current'] = 33

states['georgia']['dekalb'] = {}
states['georgia']['dekalb']['ps202'] = {}
states['georgia']['dekalb']['ps202']['english'] = {}
states['georgia']['dekalb']['ps202']['english']['max'] = 500
states['georgia']['dekalb']['ps202']['english']['current'] = 44

print states
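
The repeated `= {}` lines can probably be folded into a small helper that walks a key path and creates any missing level on the way. This is a sketch only, in Python 3 syntax; `ensure_path` is a hypothetical name for this example, not something from the standard library.

```python
def ensure_path(root, *keys):
    """Return the dict at root[k1][k2]..., creating missing levels."""
    node = root
    for key in keys:
        node = node.setdefault(key, {})
    return node


states = {}

math_class = ensure_path(states, 'georgia', 'fulton', 'ps101', 'math')
math_class['max'] = 100
math_class['current'] = 33

english_class = ensure_path(states, 'georgia', 'dekalb', 'ps202', 'english')
english_class['max'] = 500
english_class['current'] = 44

# Asking for an existing path returns the same dict instead of a new one.
same = ensure_path(states, 'georgia', 'fulton', 'ps101', 'math')
print(same is math_class)
print(states['georgia']['dekalb']['ps202']['english']['current'])
```

Because setdefault() only creates a level when it is missing, calling the helper twice for the same path does not wipe out data that is already stored there.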

re: Array of dict or lists or ....?
On Oct 7, 10:15 pm, Pat <P...@junk.net> wrote:
> Dennis Lee Bieber wrote:
>> On Mon, 06 Oct 2008 19:45:07 -0400, Pat <P...@junk.com> declaimed the
>> following in comp.lang.python:
>>
>>> I can't figure out how to set up a Python data structure to read in data
>>> that looks something like this (albeit somewhat simplified and contrived):
>>>
>>> States
>>>     Counties
>>>         Schools
>>>             Classes
>>>                 Max Allowed Students
>>>                 Current enrolled Students
>>>
>>> Nebraska, Wabash, Newville, Math, 20, 0
>>> Nebraska, Wabash, Newville, Gym, 400, 0
>>> Nebraska, Tingo, Newfille, Gym, 400, 0
>>> Ohio, Dinger, OldSchool, English, 10, 0
>>
>> The structure looks more suited to a database -- maybe SQLite since
>> the interface is supplied with the newer versions of Python (and
>> available for older versions).

Seconded.

> I don't understand why I need a database when it should just be
> a matter of defining the data structure.

Picking an appropriate data structure depends on the kind of
functionality you want to provide. So far you basically described just
one requirement: keep a tally of how many students are in each class
and compare it to the max allowed (and zero). If that's the only kind
of query you want to run against your data, there's no reason to index
each state, county, or school separately; all you care about are
classes. A simple data structure that perfectly satisfies the
requirement could then be:

# mapping of {class-info : (max,enrolled)}
data = {
    ('Nebraska', 'Wabash', 'Newville', 'Math') : (20, 0),
    ('Nebraska', 'Wabash', 'Newville', 'Gym') : (400, 0),
    ('Nebraska', 'Tingo', 'Newville', 'Gym') : (400, 0),
    ('Ohio', 'Dinger', 'OldSchool', 'English') : (10, 0),
}

Of course this data structure is pretty bad at answering a query like
"how many classes are there in Nebraska" or "what's the average number
of enrolled students in Newville". The more general information you
might want to get from the data, the more obvious it becomes that you
need a real database.
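
To make that concrete, here is a rough sketch (Python 3 syntax) of filling such a mapping from the sample lines and running the one required check. The function name `load` is an assumption for this example, the values are lists rather than tuples so the tally can be updated in place, and the last sample row is invented so that one class goes over its limit.

```python
import csv
import io

# Sample rows from the original post; a real program would open a file.
# The final row is made up for this sketch to push Math over its limit.
SAMPLE = """\
Nebraska, Wabash, Newville, Math, 20, 0
Nebraska, Wabash, Newville, Gym, 400, 0
Nebraska, Tingo, Newfille, Gym, 400, 0
Ohio, Dinger, OldSchool, English, 10, 0
Nebraska, Wabash, Newville, Math, 20, 25
"""


def load(lines):
    """Build {(state, county, school, class): [max, enrolled]}."""
    data = {}
    for row in csv.reader(lines, skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        key = tuple(field.strip() for field in row[:4])
        max_students, enrolled = int(row[4]), int(row[5])
        entry = data.setdefault(key, [max_students, 0])
        entry[1] += enrolled
    return data


data = load(io.StringIO(SAMPLE))
over = sorted(k for k, (mx, n) in data.items() if n > mx)
empty = sorted(k for k, (mx, n) in data.items() if n == 0)
# A per-state question has to scan every key.
nebraska_classes = sum(1 for key in data if key[0] == "Nebraska")

print("over capacity:", over)
print("no students:", empty)
print("classes in Nebraska:", nebraska_classes)
```

The per-state count works, but only by walking the whole dict, which is the weakness described above.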
HTH,
George

re: Array of dict or lists or ....?
George Sakkis <george.sakkis@gmail.com> writes:

> On Oct 7, 10:15 pm, Pat <P...@junk.net> wrote:
>> I don't understand why I need a database when it should just be a
>> matter of defining the data structure.
>
> Picking an appropriate data structure depends on the kind of
> functionality you want to provide.
> […]
> The more general information you might want to get from the data,
> the more obvious it becomes that you need a real database.

Thanks very much for posting this answer; I tried to do something
similar but couldn't get at the essential points the way you did here.

Perhaps the original poster is confusing “you should use a database”
with “you should use a database stored in a fully-concurrent
dedicated database management system”.

Far from it: with Python 2.5 you have SQLite (in the ‘sqlite3’
module), which would be ideal for implementing a powerful relational
SQL database used directly by one program instance, without needing a
full-blown database management system in a separately-administrated
server application.
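
As a rough illustration of how little setup that takes, here is a sketch using an in-memory SQLite database (Python 3 syntax). The table name `classes` and the column names are assumptions chosen for this example, and the enrollment update is invented to show a tally changing.

```python
import sqlite3

rows = [
    ("Nebraska", "Wabash", "Newville", "Math", 20, 0),
    ("Nebraska", "Wabash", "Newville", "Gym", 400, 0),
    ("Nebraska", "Tingo", "Newfille", "Gym", 400, 0),
    ("Ohio", "Dinger", "OldSchool", "English", 10, 0),
]

conn = sqlite3.connect(":memory:")  # private to this process, no server
conn.execute(
    "CREATE TABLE classes ("
    " state TEXT, county TEXT, school TEXT, class_name TEXT,"
    " max_students INTEGER, enrolled INTEGER,"
    " PRIMARY KEY (state, county, school, class_name))"
)
conn.executemany("INSERT INTO classes VALUES (?, ?, ?, ?, ?, ?)", rows)

# Enroll three students in one class (made-up numbers).
conn.execute(
    "UPDATE classes SET enrolled = enrolled + 3"
    " WHERE state = ? AND county = ? AND school = ? AND class_name = ?",
    ("Nebraska", "Wabash", "Newville", "Math"),
)

# General questions become one-line queries.
per_state = conn.execute(
    "SELECT state, COUNT(*) FROM classes GROUP BY state ORDER BY state"
).fetchall()
problems = conn.execute(
    "SELECT state, county, school, class_name FROM classes"
    " WHERE enrolled > max_students OR enrolled = 0"
    " ORDER BY state, county, school, class_name"
).fetchall()

print(per_state)
print(problems)
conn.close()
```

Swapping ":memory:" for a file name would keep the data between runs; whether that suits a program that forks one process per checked computer is a separate question this sketch does not address.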

-- 
 \   “Patience, n. A minor form of despair, disguised as a virtue.” |
  `\           -- Ambrose Bierce, _The Devil's Dictionary_, 1906 |
_o__)                                                            |
Ben Finney

re: Array of dict or lists or ....?
On Tue, 07 Oct 2008 23:15:54 -0300, Pat <Pat@junk.net> wrote:

> Dennis Lee Bieber wrote:
>> On Mon, 06 Oct 2008 19:45:07 -0400, Pat <Pat@junk.com> declaimed the
>> following in comp.lang.python:
>>
>>> I can't figure out how to set up a Python data structure to read in
>>> data that looks something like this (albeit somewhat simplified and
>>> contrived):
>>>
>>> States
>>>     Counties
>>>         Schools
>>>             Classes
>>>                 Max Allowed Students
>>>                 Current enrolled Students
>>>
>>> Nebraska, Wabash, Newville, Math, 20, 0
>>> Nebraska, Wabash, Newville, Gym, 400, 0
>>> Nebraska, Tingo, Newfille, Gym, 400, 0
>>> Ohio, Dinger, OldSchool, English, 10, 0
>>
>> <snip>
>>
>> The structure looks more suited to a database -- maybe SQLite since
>> the interface is supplied with the newer versions of Python (and
>> available for older versions).
>
> I don't understand why I need a database when it should just be a matter
> of defining the data structure. I used a fictional example to make it
> easier to (hopefully) convey how the data is laid out.

You don't need a full-blown-multiuser-concurrent-petabyte-capable-server
database, just one that does the job. SQLite is very small and comes with
Python 2.5.

> The Perl routine works fine and I'd like to emulate that behavior but
> since I've just started learning Python I don't know the syntax for
> designing the data structure. I would really appreciate it if someone
> could point me in the right direction.

So none of the previously posted alternatives worked for you?

-- 
Gabriel Genellina

re: Array of dict or lists or ....?
Pat wrote:
> I can't figure out how to set up a Python data structure to read in data
> that looks something like this (albeit somewhat simplified and contrived):
>
> States
>     Counties
>         Schools
>             Classes
>                 Max Allowed Students
>                 Current enrolled Students
>
> Nebraska, Wabash, Newville, Math, 20, 0
> Nebraska, Wabash, Newville, Gym, 400, 0
> Nebraska, Tingo, Newfille, Gym, 400, 0
> Ohio, Dinger, OldSchool, English, 10, 0
>
> With each line I read in, I would create a hash entry and increment the
> number of enrolled students.

You might want something like this:

>>> import collections, functools
>>> int_dict = functools.partial(collections.defaultdict, int)
>>> curr = functools.partial(collections.defaultdict, int)
>>> # builds a dict-maker where t = curr(); t['name'] += 1 "works"
>>> for depth in range(4):
...     # add a layer with a default of the preceding "type"
...     curr = functools.partial(collections.defaultdict, curr)
...
>>> base = curr() # actually make one
>>> base['Nebraska']['Wabash']['Newville']['Math']['max'] = 20
>>> base['Nebraska']['Wabash']['Newville']['Math']['curr'] += 1
>>> base['Nebraska']['Wabash']['Newville']['Math']['curr']
1
>>> base['Nebraska']['Wabash']['Newville']['English']['curr']
0
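
If the depth is not known in advance, a related idiom is a defaultdict whose factory is the function that builds it. This is a different technique from the fixed-depth version above, shown only as a sketch in Python 3 syntax; the name `tree` is an assumption for this example.

```python
from collections import defaultdict


def tree():
    """A dict whose missing keys spring into existence as sub-trees."""
    return defaultdict(tree)


base = tree()
leaf = base['Nebraska']['Wabash']['Newville']['Math']
leaf['max'] = 20
# Unlike the fixed-depth version, the innermost level has no int default,
# so the counter starts from an explicit 0 via get().
leaf['curr'] = leaf.get('curr', 0) + 1

print(leaf['curr'])
# A membership test does not create the key; indexing would.
print('English' in base['Nebraska']['Wabash']['Newville'])
```

With either variant, merely reading a missing key with square brackets creates it, so checks for "does this class exist" are safer with `in` or `.get()`.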

--Scott David Daniels
Scott.Daniels@Acm.Org