RE: Dictionary of Dicts question

Joe had a good point! Let me describe the problem I'm trying to solve so the list can offer some suggestions.

I have two text files. Each file contains data like this:

Test file 1234 4567 8975

I want to compare the numbers in each text file. The data set (i.e. the numbers) has a unique identifier: the "test" + the "file". The text files are similar, but may not be exactly the same.

My initial idea was to read the text files and turn each line into a dictionary entry. A dict for each text file. Then walk through the dicts and compare the numbers.

If anyone has a better idea, I'd love to hear it.

-----Original Message-----
From: Joe Strout [mailto:jo*@strout.net]
Sent: Thursday, October 16, 2008 1:49 PM
To: John Townsend
Subject: Re: Dictionary of Dicts question

On Oct 16, 2008, at 1:19 PM, John Townsend wrote:
> Accessing values is pretty straightforward (nice change from my Perl
> days). For example:
>
> myDict['TestName']['FileName']['ct_shutdown']
>
> in Python interpreter yields
>
> 9021

FWIW, I'd recommend you scrap this dict-of-dicts structure and instead
define a class (or three). You'll know you have it right when instead
of the above, you're typing something like:

myDict['TestName'].FileName.ct_shutdown

(The division might be a little different -- it's not entirely clear
to me what in your sample code is just place-holder names, and what is
meant literally.)
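
A rough illustration of what Joe may have in mind, written for Python 3. Joe leaves the exact division open, so the class names `FileResult` and `ResultsForTest` and the `files` mapping are assumptions made for this sketch, not his design. File names such as `AllAdjustments.psd` are not valid attribute names, so the sketch keeps them as dict keys and uses attributes only for the measurements.

```python
from dataclasses import dataclass, field


@dataclass
class FileResult:
    """Measurements for one file (field name taken from the thread)."""
    ct_shutdown: int = 0


@dataclass
class ResultsForTest:
    """All file results for one test, keyed by file name (hypothetical layout)."""
    files: dict = field(default_factory=dict)


results = {"TestName": ResultsForTest()}
results["TestName"].files["FileName"] = FileResult(ct_shutdown=9021)
# Adding a second file does not disturb the first one.
results["TestName"].files["NewFileName"] = FileResult()

value = results["TestName"].files["FileName"].ct_shutdown
```

With this layout, adding a file is a method-free dict assignment on `files`, and a typo in a measurement name fails immediately as an `AttributeError` instead of silently creating a new key.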

> However, when I try to add, let's say, a new FileName entry, I end
> up replacing the previous FileName entry.
>
> In Python interpreter, I try:
>
> myDict['TestName'] = {'NewFileName': {}, }

Well, yes, this code says "stuff the new dictionary {'NewFileName': {}}
into myDict under 'TestName', replacing whatever was under 'TestName'
before."

If that's not what you want, then don't do that -- assign to
myDict['TestName2'] or some such.

> So, how do I add a new entry without replacing the old entry?

You don't; that's the whole point of dictionaries. A dictionary maps
keys to values. You can't have two values for the same key. You
could have a key map to a list, but I suspect we're straying a bit far
now from whatever it is you're trying to accomplish.
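
To make the distinction concrete, here is a small Python 3 sketch using the placeholder names from the question. The extra keys (`OtherTest`, `SomeFile`) are made up for illustration. It contrasts assigning to the outer key, which replaces the whole inner dict, with assigning to a key of the inner dict, which adds an entry beside the existing one.

```python
# Placeholder data mirroring the names used in the question.
my_dict = {"TestName": {"FileName": {"ct_shutdown": 9021}}}

# Assigning to the outer key rebinds it, so the old inner dict is discarded.
# (Done on a shallow copy here so the original stays intact for comparison.)
replaced = dict(my_dict)
replaced["TestName"] = {"NewFileName": {}}

# Assigning to a key of the inner dict adds an entry next to the old one.
my_dict["TestName"]["NewFileName"] = {}

# setdefault creates missing levels on the way down without overwriting
# anything that is already there.
my_dict.setdefault("OtherTest", {}).setdefault("SomeFile", {})["ct_shutdown"] = 1
```

Each key still maps to exactly one value; the second form works because the value under `'TestName'` is itself a dict that can grow.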

If you can explain what it is you're actually trying to accomplish
(without reference to implementation details like dictionaries and
entries), maybe we can suggest a suitable approach.

Best,
- Joe

Oct 16 '08 #1
Here are some sample lines.

Text file 1 contains:

DescribeImage AllAdjustments.psd 0.668000012636 0.046 0.426 0.06475 0.06475 0.005875
DescribeImage All_Options_Multi.psd 0.552750021219 0.046 0.355875 0.01525 0.017125 0.0
DescribeImage All_Options_Quad.psd 0.57025000453 0.046 0.314875 0.058375 0.058375 0.007875
DescribeImage Apple_RGB.psd 0.538999974728 0.046 0.315 0.04675 0.04875 0.0

Text file 2 contains:

DescribeImage AllAdjustments.psd 0.7889 0.056 0.786 0.0665 0.06476 0.999
DescribeImage All_Options_Multi.psd 0.5527500421419 0.43154312 0.4443 0.43124 0.017125 0.0
DescribeImage All_Options_Quad.psd 0.5702503200453 0.046 0.34 0.058375 0.4342 0.43214

Lines are tab delimited. Note, in this example text file 2 contains three lines, while text file 1 contains four lines.

Where there are matching lines in each text file (e.g. "DescribeImage AllAdjustments.psd" exists in both files), I want to compare each of the numbers from that line to the numbers in the corresponding line.

If there is no corresponding line (as with "DescribeImage Apple_RGB.psd"), skip the comparison test.

I hope this helps describe my problem.

-----Original Message-----
From: py************************************@python.org [mailto:py************************************@python.org] On Behalf Of Dennis Lee Bieber
Sent: Thursday, October 16, 2008 2:44 PM
To: py*********@python.org
Subject: Re: Dictionary of Dicts question

On Thu, 16 Oct 2008 14:05:16 -0700, John Townsend <jt******@adobe.com>
declaimed the following in comp.lang.python:

> I have two text files. Each file contains data like this:
>
> Test file 1234 4567 8975
>
> I want to compare the numbers in each text file. The data set (i.e. the numbers) has a unique identifier: the "test" + the "file". The text files are similar, but may not be exactly the same.

This would be easier if you gave a few sample lines from each file
and the expected output of your processing of those lines.

> My initial idea was to read the text files and turn each line into a dictionary entry. A dict for each text file. Then walk through the dicts and compare the numbers.
>
> If anyone has a better idea, I'd love to hear it.

Use a relational database and "GROUP BY" selections?
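
Dennis does not spell out a schema or a query, so the following Python 3 sketch is only one possible reading of his suggestion. The table name `results`, its columns, and the row values are all made up for illustration, and each row carries a single number to keep it short. It uses the standard-library `sqlite3` module with an in-memory database.

```python
import sqlite3

# Made-up rows: (source file number, test name, file name, one number from the line).
rows = [
    (1, "DescribeImage", "AllAdjustments.psd", 0.5),
    (1, "DescribeImage", "Apple_RGB.psd", 0.25),
    (2, "DescribeImage", "AllAdjustments.psd", 0.75),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (src INTEGER, test TEXT, file TEXT, value REAL)")
conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)

# Group by the (test, file) identifier and keep only groups seen in both files.
spread = conn.execute(
    """
    SELECT test, file, MAX(value) - MIN(value)
    FROM results
    GROUP BY test, file
    HAVING COUNT(DISTINCT src) = 2
    """
).fetchall()
conn.close()
```

The `HAVING` clause plays the role of "skip lines with no counterpart": `Apple_RGB.psd` appears in only one source file, so its group is dropped.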
--
Wulfraed Dennis Lee Bieber KD6MOG
wl*****@ix.netcom.com wu******@bestiaria.com
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: we******@bestiaria.com)
HTTP://www.bestiaria.com/
Oct 16 '08 #2
On Oct 16, 11:03 pm, John Townsend <jtown...@adobe.com> wrote:
> Here are some sample lines.
>
> Text file 1 contains:
>
> DescribeImage AllAdjustments.psd 0.668000012636 0.046 0.426 0.06475 0.06475 0.005875
> DescribeImage All_Options_Multi.psd 0.552750021219 0.046 0.355875 0.01525 0.017125 0.0
> DescribeImage All_Options_Quad.psd 0.57025000453 0.046 0.314875 0.058375 0.058375 0.007875
> DescribeImage Apple_RGB.psd 0.538999974728 0.046 0.315 0.04675 0.04875 0.0
>
> Text file 2 contains:
>
> DescribeImage AllAdjustments.psd 0.7889 0.056 0.786 0.0665 0.06476 0.999
> DescribeImage All_Options_Multi.psd 0.5527500421419 0.43154312 0.4443 0.43124 0.017125 0.0
> DescribeImage All_Options_Quad.psd 0.5702503200453 0.046 0.34 0.058375 0.4342 0.43214
>
> Lines are tab delimited. Note, in this example text file 2 contains three lines, while text file 1 contains four lines.
>
> Where there are matching lines in each text file (e.g. "DescribeImage AllAdjustments.psd" exists in both files), I want to compare each of the numbers from that line to the numbers in the corresponding line.
>
> If there is not corresponding line (like "DescribeImage Apple_RGB.psd"), skip the comparison test.
>
> I hope this helps describe my problem.

If the first 2 fields are unique when combined, then you could use
that as the key:

    data = {}
    for line in open(path):
        fields = line.split("\t")
        data[tuple(fields[:2])] = fields[2:]
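
One reason a tuple makes a safe key here: the two fields stay separate, so different pairs can never be mistaken for each other. A short Python 3 sketch follows; the field values are invented purely to show the effect and do not come from the sample files.

```python
# Invented field values, chosen so that plain concatenation collides.
k1 = ("ab", "c")
k2 = ("a", "bc")

tuple_keys_differ = k1 != k2                        # tuples compare field by field
concat_keys_collide = "".join(k1) == "".join(k2)    # both become "abc"
tab_joined_differ = "\t".join(k1) != "\t".join(k2)  # keeping the delimiter avoids the clash
```

For tab-delimited input, a string key that keeps the tab between the two fields is as unambiguous as the tuple, since neither field can contain a tab.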
Oct 17 '08 #3
MRAB:
>     for line in open(path):
>         fields = line.split("\t")
>         data[tuple(fields[:2])] = fields[2:]

Keeping the key as a string may have some memory/performance
advantages (not tested):

    from itertools import islice

    for line in open(path):
        fields = line.split("\t")
        # Python 2: map returns a list of floats.
        data[fields[0] + fields[1]] = map(float, islice(fields, 2, None))

Or probably faster (not tested):

    for line in open(path):
        # Split off the six numbers from the right; parts[0] keeps both name fields.
        parts = line.rsplit("\t", 6)
        data[parts[0]] = map(float, islice(parts, 1, None))

Or (not tested):

    for line in open(path):
        parts = line.rsplit("\t", 6)
        data[parts[0]] = [float(parts[i]) for i in xrange(1, 7)]

A built-in lazy xsplit/xsplitr method, if Python had one, would probably
help significantly here.
If the FP numbers are really precise then you can compare them as
strings too, but that's quite unsafe.
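
A small Python 3 sketch of why string comparison is unsafe and what a tolerance-based comparison could look like. The tolerance `rel_tol=1e-6` is an arbitrary choice for illustration; the thread never says how close two numbers must be to count as equal. The two long values are taken from the All_Options_Multi.psd sample lines.

```python
import math

# The same value written two ways: equal as floats, different as text.
same_text = "0.046" == "0.0460"
same_value = float("0.046") == float("0.0460")

# First number of the All_Options_Multi.psd line in each sample file.
a = 0.552750021219
b = 0.5527500421419
exactly_equal = a == b
close_enough = math.isclose(a, b, rel_tol=1e-6)  # tolerance is an assumption
```

`math.isclose` is in the standard library from Python 3.5 onward; with a relative tolerance it treats the two sample values as matching even though they differ in the eighth decimal place.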

Bye,
bearophile
Oct 17 '08 #4
On Oct 16, 5:05 pm, John Townsend <jtown...@adobe.com> wrote:
> Joe had a good point! Let me describe what problem I'm trying to solve and the list can recommend some suggestions.
>
> I have two text files. Each file contains data like this:
>
> Test file 1234 4567 8975
>
> I want to compare the numbers in each text file. The data set (i.e. the numbers) has a unique identifier: the "test" + the "file". The text files are similar, but may not be exactly the same.
>
> My initial idea was to read the text files and turn each line into a dictionary entry. A dict for each text file. Then walk through the dicts and compare the numbers.
>
> If anyone has a better idea, I'd love to hear it.

If getting the diffs is the only thing you want to do with the data,
your idea is good enough. For more open-ended data manipulation you
should move to a database, as in Dennis's reply, but for a single
simple task a DB (even sqlite) is probably overkill.

Here's a quick solution using the dicts approach:

    # Python 2 (print statement; map returns a list)
    import sys
    from operator import sub

    def read_data(path):
        # Map (test, file) -> list of floats for each tab-delimited line.
        data = {}
        for line in open(path):
            fields = line.split('\t')
            data[tuple(fields[:2])] = map(float, fields[2:])
        return data

    d1 = read_data(sys.argv[1])
    d2 = read_data(sys.argv[2])
    for key in d1:
        if key in d2:  # skip lines that have no counterpart
            diffs = map(sub, d1[key], d2[key])
            print key, diffs
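
For readers on Python 3, where `print` is a function and `map` returns an iterator, the same approach might look like the sketch below. The helper names `parse_lines` and `diff_shared` are hypothetical, the sample rows use invented numbers, and the input is a list of strings rather than files so the example runs on its own.

```python
from operator import sub


def parse_lines(lines):
    """Map (test, file) -> list of floats for each tab-delimited line."""
    data = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        data[tuple(fields[:2])] = [float(x) for x in fields[2:]]
    return data


def diff_shared(d1, d2):
    """Per-number differences for keys present in both dicts."""
    return {key: list(map(sub, d1[key], d2[key])) for key in d1.keys() & d2.keys()}


# Invented rows in the same shape as the sample files.
file1 = ["DescribeImage\tAllAdjustments.psd\t0.5\t0.046\n",
         "DescribeImage\tApple_RGB.psd\t0.25\t0.046\n"]
file2 = ["DescribeImage\tAllAdjustments.psd\t0.75\t0.046\n"]

diffs = diff_shared(parse_lines(file1), parse_lines(file2))
for key, values in diffs.items():
    print(key, values)
```

To run it against real files, pass open file objects instead of the lists, for example `parse_lines(open(path))`. The key intersection `d1.keys() & d2.keys()` does the "skip lines with no counterpart" step in one expression.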

HTH,
George
Oct 17 '08 #5
