473,378 Members | 1,639 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,378 software developers and data experts.

Delete items in nested dictionary based on value.

I've got a problem that I can't seem to get my head around and hoped
somebody might help me out a bit:

I've got a dictionary, A, that is arbitarily large and may contains
ints, None and more dictionaries which themselves may contain ints,
None and more dictionaries. Each of the sub-dictionaries is also
arbitrarily large. When pretty printing A, in the context I'm using A
for, it's rather helpful to remove all key:value pairs where value is
None. So I'd like to be able to define a function dict_sweep to recurse
through A and del all the keys with None as a value.

I feel like the solution is right on the tip of my tounge, so to speak,
but I just can't quite get it. Anyway, here's the best I've come up
with:

def dict_sweep(dictionary, key_value):
"""Sweep through dictionary, deleting any key:value where
value is None.

"""
# Find key at position key_value.
key = dictionary.keys()[key_value]

# If the key value is a dictionary we want to move into it.
# Therefore we call rec_rem on dictionary[key] with a
# key_value of 0
if type(dictionary[key]) is type({}):
dict_sweep(dictionary[key], 0)

# If the value isn't a dictionary we care only that it's not None.
# If it is None we nuke it and call rec_rem on the current
# dictionary with key_value incrimented by one.
elif dictionary[key] is None:
del dictionary[key]
dict_sweep(dictionary, key_value+1)

# If the value isn't a dictionary and the value isn't None then
# we still want to walk through to the next value in the
# dictionary, so call rec_rem with key_value+1
else:
dict_sweep(dictionary, key_value+1)

Assuming dict_sweep worked perfectly it would take input like this:

A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}

and output this:

A_out = {1: {2: 2, 3: {2: 2}}, 2: 2}

B_out = {2:2}

This dict_sweep above obviously doesn't work and I'm rather afraid of
hitting python's recursion limit. Does anyone see a way to modify
dict_sweep in it's current state to perform dictionary sweeps like
this? What about a non-recursive implementation of dict_sweep?

Sep 13 '06 #1
8 13235
My first try, not much tested:

def clean(d):
for key,val in d.items():
if isinstance(val, dict):
val = clean(val)
if not val:
del d[key]
return d

a = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
print clean(a) # Out: {1: {2: 2, 3: {2: 2}}, 2: 2}
b = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: None}
print clean(b) # Out: {2: 2}

Recursivity overflow problem: you can increase the recursivity limit,
of if you think you need it after some tests on your real data, then
you can use a stack (a python list, using append and pop methods) to
simulate recursivity. You probably don't have 300+ levels of nesting.

Bye,
bearophile

Sep 13 '06 #2
Better:

def clean(d):
for key,val in d.items():
if isinstance(val, dict):
val = clean(val)
if val is None or val == {}:
del d[key]
return d

a = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
print clean(a) # Out: {1: {2: 2, 3: {2: 2}}, 2: 2}
b = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: None}
print clean(b) # Out: {2: 2}
c = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: 0}
print clean(c) # Out: {2: 2, 3: 0}

Sep 13 '06 #3
Brian L. Troutwine wrote:
I've got a problem that I can't seem to get my head around and hoped
somebody might help me out a bit:

I've got a dictionary, A, that is arbitarily large and may contains
ints, None and more dictionaries which themselves may contain ints,
None and more dictionaries. Each of the sub-dictionaries is also
arbitrarily large. When pretty printing A, in the context I'm using A
for, it's rather helpful to remove all key:value pairs where value is
None. So I'd like to be able to define a function dict_sweep to recurse
through A and del all the keys with None as a value.

I feel like the solution is right on the tip of my tounge, so to speak,
but I just can't quite get it. Anyway, here's the best I've come up
with:
How about this:

def dict_sweep(dictionary):
for k, v in dictionary.items():
if v is None:
del dictionary[k]
if type(v) == type({}):
dict_sweep(v)
#if len(v) == 0:
# del dictionary[k]
A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}}

dict_sweep(A_in)
dict_sweep(B_in)
print A_in
print B_in
>>{1: {2: 2, 3: {2: 2}}, 2: 2}
{1: {2: 2}}

It doesn't produce the output you expect for B_in, but your brackets didn't
match and I'm not sure where the extra close should be. Also, I suspect
you're wanting to delete empty dictionaries but I don't see that mentioned
in the text. If that is so, include the two commented statements.
--
Dale Strickland-Clark
Riverhall Systems - www.riverhall.co.uk

Sep 13 '06 #4

be************@lycos.com wrote:
Better:

def clean(d):
for key,val in d.items():
if isinstance(val, dict):
val = clean(val)
if val is None or val == {}:
del d[key]
return d
Can you delete values from a dictionary whilst iterating over its items
?

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
>
a = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}
print clean(a) # Out: {1: {2: 2, 3: {2: 2}}, 2: 2}
b = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: None}
print clean(b) # Out: {2: 2}
c = {1: {1: {1: None, 2: {1: None}}}, 2: 2, 3: 0}
print clean(c) # Out: {2: 2, 3: 0}
Sep 13 '06 #5
Fuzzyman:
Can you delete values from a dictionary whilst iterating over its items ?
Try the code, it works. I am not iterating on the dict, I am not using
dict.iteritems(), that produces a lazy iterable, I am using
dict.items() that produces a list of (key,value) pairs. And I am not
removing elements from that list :-)

Hugs,
bearophile

Sep 13 '06 #6
>
Assuming dict_sweep worked perfectly it would take input like this:

A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

B_in = {1: {1: {1: None, 2: {1: None}}, 2: 2, 3: None}

and output this:

A_out = {1: {2: 2, 3: {2: 2}}, 2: 2}

B_out = {2:2}

This dict_sweep above obviously doesn't work and I'm rather afraid of
hitting python's recursion limit. Does anyone see a way to modify
dict_sweep in it's current state to perform dictionary sweeps like
this? What about a non-recursive implementation of dict_sweep?
How about this:

A_in = {1: {2: 2, 3: {1: None, 2: 2}}, 2: 2, 3: None}

def dict_sweep(dictin):

for key in dictin.keys():
if dictin.get(key) == None:
del dictin[key]
elif type(dictin[key]) is dict:
dictin[key] = dict_sweep(dictin[key])
return dictin

A_out = dict_sweep(A_in)
print A_out


Running the above returns:

{1: {2: 2, 3: {2: 2}}, 2: 2}

Regards,

John


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Sep 14 '06 #7
Would it not be easier to modify the pretty-printer to ignore such
entries rather than have to make a duplicate (I presume you have reason
to need the original dictionary unchanged) dictionary and then delete
entries that match your "ignore" criteria?
Good question. The data I'm handling relates to books: their ISBN,
pricing information, weight, statistical data for each page, citations
in other books to certain pages, those books statistical data, so on
and so forth. It's too much data, really, but this is what I'm tasked
with handling.

Currently I'm just pretty printing the data, but I'm really cheating
right now. The tool I'm writing is supposed to just collect the data,
strip out all the extranious data based on filters defined at the start
of the collection process and then output all that data to stdout,
where it will be caught by a tool to properly format and display it.

Yesterday morning I got into work, went into deep hack mode, wrote the
entire tool and at the end of the day could not, for the life of me,
figure out the recursion I needed. I suppose I'd spent myself. Who
knows?

Anyway, you are right in that if I did indeed need the original
dictionary unchanged it would be much, much easier to modify the
pretty-printer.

Dennis Lee Bieber wrote:
On 13 Sep 2006 16:08:37 -0700, "Brian L. Troutwine"
<go*************@gmail.comdeclaimed the following in comp.lang.python:
arbitrarily large. When pretty printing A, in the context I'm using A
for, it's rather helpful to remove all key:value pairs where value is
None. So I'd like to be able to define a function dict_sweep to recurse
through A and del all the keys with None as a value.

I feel like the solution is right on the tip of my tounge, so to speak,
but I just can't quite get it. Anyway, here's the best I've come up
with:
Would it not be easier to modify the pretty-printer to ignore such
entries rather than have to make a duplicate (I presume you have reason
to need the original dictionary unchanged) dictionary and then delete
entries that match your "ignore" criteria?
--
Wulfraed Dennis Lee Bieber KD6MOG
wl*****@ix.netcom.com wu******@bestiaria.com
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: we******@bestiaria.com)
HTTP://www.bestiaria.com/
Sep 14 '06 #8
This is a general reply to all.

Thanks for all the suggestions. I hadn't really thought about filtering
empty dictionaries because the data I'm processing won't have them, but
it does make for a much nicer, more general filter. I did end up using
bearophileH's code, but that's mostly because he got there first. Each
solution functioned as advertised.

Again, thanks for all the help.

Sep 14 '06 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
by: Charlotte Henkle | last post by:
Hello; I'm pondering how to count the number of times an item appears in total in a nested list. For example: myList=,,] I'd like to know that a appeared three times, and b appeared twice,...
12
by: Donnal Walter | last post by:
The following method is defined in one of my classes: def setup(self, items={}): """perform setup based on a dictionary of items""" if 'something' in items: value = items # now do something...
6
by: B0nj | last post by:
I've got a class in which I want to implement a property that operates like an indexer, for the various colors associated with the class. For instance, I want to be able to do 'set' operations...
2
by: Antoine | last post by:
I would like to construct my own list of items in a grid/ table/ item list layout but I have a problem. I want to add a sort of index row based on time, such as there might be blank values. Sure...
3
by: Jake Emerson | last post by:
I'm attempting to build a process that helps me to evaluate the performance of weather stations. The script below operates on an MS Access database, brings back some data, and then loops through to...
1
by: georgehernando | last post by:
I am looking for an efficient way to delete the elements of a Python dictionary based on an evaluation of the element's value. Delete all elements with value bigger than x. Thanks
12
by: Rich Shepard | last post by:
I want to code what would be nested "for" loops in C, but I don't know the most elegant way of doing the same thing in python. So I need to learn how from you folks. Here's what I need to do: build...
16
by: IamIan | last post by:
Hello, I'm writing a simple FTP log parser that sums file sizes as it runs. I have a yearTotals dictionary with year keys and the monthTotals dictionary as its values. The monthTotals dictionary...
0
by: Gary Herron | last post by:
Dan Upton wrote: Yes. Create a list of keys, and loop through it: pids = procs_dict.keys() for pid in pids: if procs_dict.poll() != None # do the counter updates del procs_dict Then the...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.