
head for grouped data - looking for best practice

An old, very old computing problem: I want to "print" grouped data with
header information, that is:
eingabe=[
("Stuttgart","70197","Fernsehturm","20"),
("Stuttgart","70197","Brotmuseum","123"),
("Stuttgart","70197","Porsche","123123"),
("Leipzig","01491","Messe","91822"),
("Leipzig","01491","Schabidu","9181231"),
]

should give (the parentheses are not important):

'Stuttgart', '70197'
--data-- ('Fernsehturm', '20')
--data-- ('Brotmuseum', '123')
--data-- ('Porsche', '123123')
'Leipzig', '01491'
--data-- ('Messe', '91822')
--data-- ('Schabidu', '9181231')

my first approach was:
from itertools import groupby
from operator import itemgetter

for key, bereich in groupby(eingabe,itemgetter(0)):
    print "Area:",key
    headnotprinted=True
    for data in bereich:
        if headnotprinted:
            headnotprinted=False
            print "additional head info", data[1]
        print "--data--", data[2:]

leading to:

Area: Stuttgart
additional head info 70197
--data-- ('Fernsehturm', '20')
--data-- ('Brotmuseum', '123')
--data-- ('Porsche', '123123')
Area: Leipzig
additional head info 01491
--data-- ('Messe', '91822')
--data-- ('Schabidu', '9181231')
which is quite what I expected. But ...
    if headnotprinted:
        headnotprinted=False
        print "additional head info", data[1]

REALLY looks patched, not programmed.

my second try was:
def getdoublekey(row):
    return row[0:2]

for key, bereich in groupby(eingabe,getdoublekey):
    print "Area:",key
    for data in bereich:
        print "--data--", data[2:]

which indeed leads to the expected result, while looking less "hacky".
On the other hand, that "getdoublekey" is not very flexible; when
doing the same with 3 columns forming the head information, I have to
define the next function...

def gettriplekey(row):
    return (row[1], row[0], ---yadda yadda yadda

So, what is the recommended best practice for this common problem in
Python?

Harald

Jul 18 '05 #1
4 Replies


> which indeed leads to the expected result, while looking less "hacky".
> On the other hand, that "getdoublekey" is not very flexible; when
> doing the same with 3 columns forming the head information, I have to
> define the next function...


Make getdoublekey something like this (untested):

def get_key(f=0,t=1):
    def _get_key(list_value):
        return list_value[f:t]
    return _get_key

Then use it like this:

for key, bereich in groupby(eingabe,get_key(t=key_size)):
    print "Area:",key
    for data in bereich:
        print "--data--", data[key_size:]
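
A minimal sketch of this key-factory idea in Python 3 syntax (the thread itself uses Python 2 print statements). The helper name format_groups and the choice to collect the output lines in a list are assumptions made for illustration; only get_key and the sample data come from the thread.

```python
from itertools import groupby

eingabe = [
    ("Stuttgart", "70197", "Fernsehturm", "20"),
    ("Stuttgart", "70197", "Brotmuseum", "123"),
    ("Stuttgart", "70197", "Porsche", "123123"),
    ("Leipzig", "01491", "Messe", "91822"),
    ("Leipzig", "01491", "Schabidu", "9181231"),
]

def get_key(f=0, t=1):
    """Return a key function that extracts the slice row[f:t]."""
    def _get_key(row):
        return row[f:t]
    return _get_key

def format_groups(rows, key_size):
    """Hypothetical helper: build the report lines for rows grouped by
    their first key_size columns."""
    lines = []
    for key, bereich in groupby(rows, get_key(t=key_size)):
        lines.append("Area: %s" % (key,))
        for data in bereich:
            lines.append("--data-- %s" % (data[key_size:],))
    return lines

report = format_groups(eingabe, key_size=2)
print("\n".join(report))
```

Changing key_size to 3 would move the third column into the header without defining another key function.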

--
Regards,

Diez B. Roggisch
Jul 18 '05 #2

Harald Massa wrote:
> def getdoublekey(row):
>     return row[0:2]
>
> for key, bereich in groupby(eingabe,getdoublekey):
>     print "Area:",key
>     for data in bereich:
>         print "--data--", data[2:]
>
> which indeed leads to the expected result, while looking less "hacky".
> On the other hand, that "getdoublekey" is not very flexible; when
> doing the same with 3 columns forming the head information, I have to
> define the next function...


Function creation is cheap and easily understood by someone reading your
code -- so you may already have the best solution. If Raymond Hettinger's
recent suggestion on python-dev makes it into Python 2.5,
itemgetter()/attrgetter() could grow support for the extraction of
multiple attributes/items.
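
For readers on a later interpreter: from Python 2.5 onward, operator.itemgetter accepts several indices and returns a tuple, so the multi-column key needs no hand-written function. A small sketch in Python 3 syntax, where the variable names head and groups are chosen only for illustration:

```python
from itertools import groupby
from operator import itemgetter

eingabe = [
    ("Stuttgart", "70197", "Fernsehturm", "20"),
    ("Stuttgart", "70197", "Brotmuseum", "123"),
    ("Stuttgart", "70197", "Porsche", "123123"),
    ("Leipzig", "01491", "Messe", "91822"),
    ("Leipzig", "01491", "Schabidu", "9181231"),
]

# itemgetter(0, 1) returns the tuple (row[0], row[1]) for each row.
head = itemgetter(0, 1)

# Collect each header together with the remaining columns of its rows.
groups = [(key, [row[2:] for row in bereich])
          for key, bereich in groupby(eingabe, head)]

for key, data_rows in groups:
    print("Area:", key)
    for data in data_rows:
        print("--data--", data)
```

The indices need not be contiguous or ordered, so a key such as (row[1], row[0]) can be written as itemgetter(1, 0).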

Anyway, here is a generalized getter factory that tries to handle all the
common cases in an intuitive way. For example, you can create itemgetters using
the [] notation:

>>> extract[::3](range(5))
[0, 3]
>>> extract[3](range(5))
3
>>> extract[0,3,4](range(5))
(0, 3, 4)
>>> import os
>>> extract.path(os)
<module 'posixpath' from '/somewhere/posixpath.pyc'>

Peter

import itertools
import operator

def tuple_itemgetter(*keys):
    """Create a function that extracts a tuple of items from an
    indexable object.
    """
    # helper for extract
    getters = map(operator.itemgetter, keys)
    def get(obj):
        return tuple(get(obj) for get in getters)
    return get

def tuple_attrgetter(*names):
    """Create a function that extracts a tuple of attributes from an object.
    """
    # helper for extract
    getters = map(operator.attrgetter, names)
    def get(obj):
        return tuple(get(obj) for get in getters)
    return get

class extract(object):
    """Present unified access to the creation of
    attribute and item getters.
    """
    def __getitem__(self, index):
        if isinstance(index, tuple):
            return tuple_itemgetter(*index)
        return operator.itemgetter(index)
    def __getattribute__(self, name):
        return operator.attrgetter(name)
    def __call__(self, *names):
        return tuple_attrgetter(*names)

extract = extract() # we only ever need one instance

if __name__ == "__main__":
    # the demo is an anglo-german hotchpotch, really:
    eingabe=[
        ("Stuttgart","70197","Fernsehturm","20"),
        ("Stuttgart","70197","Brotmuseum","123"),
        ("Stuttgart","70197","Porsche","123123"),
        ("Leipzig","01491","Messe","91822"),
        ("Leipzig","01491","Schabidu","9181231"),
    ]
    class Site(object):
        def __init__(self, stadt, plz, name, nummer):
            self.stadt = stadt
            self.plz = plz
            self.name = name
            self.nummer = nummer
        def __str__(self):
            return "Site(stadt=%r, plz=%r, name=%r, nummer=%r)" % (
                self.stadt, self.plz, self.name, self.nummer)
        __repr__ = __str__

    def show(iterable, groupkey):
        print "-" * 20
        for group, items in itertools.groupby(iterable, groupkey):
            print group
            for item in items:
                print "\t", item

    show(eingabe, extract[1])
    show(eingabe, extract[0, 1, 0:2])
    show(eingabe, extract[0:2])
    show((Site(*e) for e in eingabe), extract("stadt", "plz"))
    show((Site(*e) for e in eingabe), extract.stadt)
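
One caveat if the helpers above are run under Python 3 rather than the Python 2 of this thread: map returns a one-shot iterator there, so the stored getters would be used up after the first call. A hedged sketch of tuple_itemgetter adapted for that case; the list comprehension and the renamed loop variable g are changes made for this illustration, not part of the original post.

```python
import operator

def tuple_itemgetter(*keys):
    """Create a function that extracts a tuple of items from an
    indexable object (Python 3 adaptation)."""
    # Build a real list so the getters can be reused on every call.
    getters = [operator.itemgetter(k) for k in keys]
    def get(obj):
        return tuple(g(obj) for g in getters)
    return get

row = ("Stuttgart", "70197", "Fernsehturm", "20")
get = tuple_itemgetter(0, 1, slice(0, 2))

# Calling the getter twice gives the same result both times.
first = get(row)
second = get(row)
print(first)
print(second)
```

The same change would apply to tuple_attrgetter.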

Jul 18 '05 #3

Harald Massa wrote:
> def getdoublekey(row):
>     return row[0:2]
>
> for key, bereich in groupby(eingabe,getdoublekey):
>     print "Area:",key
>     for data in bereich:
>         print "--data--", data[2:]


Why don't you just pass a slice to itemgetter?

py> eingabe=[
...     ("Stuttgart","70197","Fernsehturm","20"),
...     ("Stuttgart","70197","Brotmuseum","123"),
...     ("Stuttgart","70197","Porsche","123123"),
...     ("Leipzig","01491","Messe","91822"),
...     ("Leipzig","01491","Schabidu","9181231"),
... ]
py> from itertools import groupby
py> from operator import itemgetter
py> for key, bereich in groupby(eingabe, itemgetter(slice(0, 2))):
...     print "Area:", key
...     for data in bereich:
...         print "--data--", data[2:]
...
Area: ('Stuttgart', '70197')
--data-- ('Fernsehturm', '20')
--data-- ('Brotmuseum', '123')
--data-- ('Porsche', '123123')
Area: ('Leipzig', '01491')
--data-- ('Messe', '91822')
--data-- ('Schabidu', '9181231')
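
The slice-based key can also be wrapped so that grouping is separated from printing. This is only a sketch in Python 3 syntax; the generator name grouped and its key_size parameter are hypothetical and not from the thread.

```python
from itertools import groupby
from operator import itemgetter

eingabe = [
    ("Stuttgart", "70197", "Fernsehturm", "20"),
    ("Stuttgart", "70197", "Brotmuseum", "123"),
    ("Stuttgart", "70197", "Porsche", "123123"),
    ("Leipzig", "01491", "Messe", "91822"),
    ("Leipzig", "01491", "Schabidu", "9181231"),
]

def grouped(rows, key_size):
    """Yield (head, data_rows) for each run of consecutive rows that
    share their first key_size columns."""
    head = itemgetter(slice(0, key_size))
    for key, bereich in groupby(rows, head):
        yield key, [row[key_size:] for row in bereich]

result = list(grouped(eingabe, 2))
for key, data_rows in result:
    print("Area:", key)
    for data in data_rows:
        print("--data--", data)
```

Because the header width is a parameter, a three-column header only needs grouped(eingabe, 3).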

STeVe
Jul 18 '05 #4

Steve,

> Why don't you just pass a slice to itemgetter?
>
> py> for key, bereich in groupby(eingabe, itemgetter(slice(0, 2))):

Wow, that is great! That makes it really simple; I just have to structure
the SQL to get a real "cut first, serve first" structure.
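
A related point when structuring the query: itertools.groupby only merges consecutive rows with equal keys, so the rows should arrive ordered by the header columns (for instance via ORDER BY in the SQL, or by sorting in Python). A small sketch in Python 3 syntax with deliberately shuffled sample rows; the variable names are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows in which the two Stuttgart entries are NOT adjacent.
rows = [
    ("Stuttgart", "70197", "Fernsehturm", "20"),
    ("Leipzig", "01491", "Messe", "91822"),
    ("Stuttgart", "70197", "Porsche", "123123"),
]

head = itemgetter(slice(0, 2))

# Without sorting, Stuttgart shows up as two separate groups.
unsorted_keys = [key for key, _ in groupby(rows, head)]

# Sorting by the same key first yields one group per header.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=head), head)]

print(unsorted_keys)
print(sorted_keys)
```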

Thanks to all who helped!

Also, the "function factory function" by Diez was very helpful, and
Peter's classes looked really impressive!

Thanks again... now my code will be even clearer.

Harald
Jul 18 '05 #5
