dictionary comparison

I'm trying to compare Sun patch levels on a server to those that Sun
is recommending. For those that aren't familiar with Sun patch
numbering, here is a quick rundown.

A patch number shows up like this:
113680-03
^^^^^^ ^^
patch# revision

What I want to do is make a list. I want to show what server x has
versus what sun recommends, and if the patch exists, but the revision
is different, I want to show that difference.

Here are some sample patches that sun recommends:
117000-05
116272-03
116276-01
116278-01
116378-02
116455-01
116602-01
116606-01

Here are some sample patches that server x has:
117000-01
116272-02
116272-01
116602-02

So there are some that are the same, some that sun recommends that
server x doesn't have, and some where the patch is the same but the
revision is different.

I've thrown the data into dictionaries, but I just can't seem to figure
out how I should actually compare the data and present it. Here's what
I have so far (the split is in place because there is actually a lot
more data in the file, so I split it out so I just get the patch number
and revision). So I end up with (for example) 116272-01, then split so
field[0] is 116272 and field[1] is 01.

def sun():
    sun = open('sun-patchlist', 'r')
    for s in sun:
        sun_fields = s.split(None, 7)
        for sun_field in sun_fields:
            sun_field = sun_field.strip()
        sun_patch = {}
        sun_patch['number'] = sun_fields[0]
        sun_patch['rev'] = sun_fields[1]
        print sun_patch['number'], sun_patch['rev']
    sun.close()

def serverx():
    serverx = open('serverx-patchlist', 'r')
    for p in serverx:
        serverx_fields = p.split(None, 7)
        for serverx_field in serverx_fields:
            serverx_field = serverx_field.strip()
        serverx_patch = {}
        serverx_patch['number'] = serverx_fields[0]
        serverx_patch['rev'] = serverx_fields[1]
        print serverx_patch['number'], serverx_patch['rev']
    serverx.close()

if __name__=='__main__':
    sun()
    serverx()

Right now I'm just printing the data, just to be sure that each
dictionary contains the correct data, which it does. But now I need
the comparison and I just can't seem to figure it out. I could
probably write this in perl or a shell script, but I'm trying really
hard to force myself to learn Python so I want this to be a python
script, created with only built-in modules.

Any help would be greatly appreciated,
Rick

Jul 19 '05 #1
On 5 May 2005 08:19:31 -0700, rickle <de*******@gmail.com> wrote:
[...]

The first thing you should notice about this code is that you copied a
good amount of code between functions; this should be a huge warning
bell that something can be abstracted out into a function. In this
case, it's the parsing of the patch files.

Also, you should see that you're creating a new dictionary every
iteration through the loop, and furthermore, you're not returning it
at the end of your function. Thus, it's destroyed when the function
exits and it goes out of scope.

<snip>

Anyway, since you at least made an effort, here's some totally
untested code that should (I think) do something close to what you're
looking for:

def parse_patch_file(f):
    # Map patch number -> revision; a later line for the same patch
    # overwrites an earlier one.
    patches = {}
    for line in f:
        patch, rev = line.strip().split('-')
        patches[patch] = rev
    return patches

def diff_patches(sun, serverx):
    # Report patches that appear on only one side.
    for patch in sun:
        if not serverx.has_key(patch):
            print "Sun recommends patch %s" % patch
    for patch in serverx:
        if not sun.has_key(patch):
            print "Serverx has unnecessary patch %s" % patch

def diff_revs(sun, serverx):
    # Report patches present on both sides with different revisions.
    # Revisions are strings such as '03', so they are formatted with %s.
    for patch, rev in sun.iteritems():
        if serverx.has_key(patch) and rev != serverx[patch]:
            print "Sun recommends rev %s of patch %s; serverx has rev %s"\
                % (rev, patch, serverx[patch])

if __name__ == '__main__':
    sun = parse_patch_file(open('sun-patchlist'))
    serverx = parse_patch_file(open('serverx-patchlist'))
    diff_patches(sun, serverx)
    diff_revs(sun, serverx)

Hope this helps.

Peace
Bill Mill
bill.mill at gmail.com
Jul 19 '05 #2
rickle wrote:
[...]

Well, it seems that what you're asking is more of a generic programming
question than anything specific to Python - if you can think of how
you'd solve this in Perl, for example, then a Python solution along the
same lines would work just as well. I'm not sure if there was some
specific issue with Python that was confusing you - if so, perhaps you
could state it more explicitly.

To address the problem itself, there are a few things about your
approach in the above code that I find puzzling. First of all, the
sun() and serverx() functions are identical, except for the name of the
file they open. This kind of code duplication is bad practice in
Python, Perl, or any other language (even shell scripting perhaps,
although I wouldn't really know) - you should definitely use a single
function that takes a filename as an argument instead.

Second, you are creating a new dictionary inside every iteration of the
for loop, one for every patch in the file; each dictionary you create
contains one patch number and one revision number. This data is
printed, and thereafter ignored (and thus will be reclaimed by Python's
garbage collector). Hence you're not actually storing it for later use.
I don't know whether this was because you were unsure how to proceed to
comparing the two datasets; however, I think what you probably
wanted was to have a single dictionary that keeps track of all the
patches in the file. You need to define this outside the for loop; and,
if you want to use it outside the body of the function, you'll need to
return it. Also, rather than have a dictionary of two values, keyed by
strings, I'd suggest a dictionary mapping patch numbers to their
corresponding revision numbers is what you want.

Once you've got two dictionaries - one for the server's
patches, and one for Sun's recommended patches - you can compare the
two sets of data by going through Sun's patches, checking if the
server has that patch, and if so, calculating the difference in
revision numbers.

So here's a rough idea of how I'd suggest modifying what you've got to
get the intended result:

def patchlevels(filename):
    # Return a dict mapping patch number -> revision for one patch file.
    patchfile = open(filename, 'r')
    patch_dict = {}
    for line in patchfile:
        fields = line.split(None, 7)
        for field in fields:
            field = field.strip()
        number = fields[0]
        rev = fields[1]
        patch_dict[number] = rev
        # print number, patch_dict[number]
    patchfile.close()
    return patch_dict

if __name__=='__main__':
    sun = patchlevels('sun-patchlist')
    serverx = patchlevels('serverx-patchlist')
    print "Sun recommends:\t\t", "Server has:\n"
    for patch in sun:
        if patch in serverx:
            rev = serverx[patch]
            # negative diff: the server is behind Sun's recommendation
            diff = int(rev) - int(sun[patch])
            serverhas = "Revision: %s Difference: %s" % (rev, diff)
        else:
            serverhas = "Does not have this patch"
        print patch, sun[patch], "\t\t", serverhas

I've tried to stay as close to your code as possible and not introduce
new material, although I have had to use the built-in function int to
convert the revision numbers from strings to integers in order to
subtract one from the other; also, I used C printf-style string
formatting on the line after. I hope it's reasonably obvious what these
things do.

For the sample data you gave, this outputs:

Sun recommends:         Server has:

116276 01               Does not have this patch
116378 02               Does not have this patch
116272 03               Revision: 01 Difference: -2
116278 01               Does not have this patch
116602 01               Revision: 02 Difference: 1
116606 01               Does not have this patch
116455 01               Does not have this patch
117000 05               Revision: 01 Difference: -4

Here negative differences mean the server's version of the patch is out
of date, whereas zero or positive differences mean it's as recent as
Sun's recommendation or newer. You could change the nature of the output to
whatever your own preference is easily enough. Or, if you want to store
the data in some other structure like a list for further processing,
instead of just printing it, that's also pretty simple to do.
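
As a minimal sketch of that last point, the comparison can return a list of tuples instead of printing. This is written in Python 3 syntax (the thread's own code is Python 2), and the function name compare_patches and the tuple layout are illustrative assumptions rather than anything from the code above.

```python
def compare_patches(sun, serverx):
    """Return a list of (patch, sun_rev, server_rev, diff) tuples.

    Both arguments map patch number -> revision string.
    server_rev and diff are None when the server lacks the patch.
    """
    results = []
    for patch in sorted(sun):
        if patch in serverx:
            diff = int(serverx[patch]) - int(sun[patch])
            results.append((patch, sun[patch], serverx[patch], diff))
        else:
            results.append((patch, sun[patch], None, None))
    return results


# A subset of the sample data from the original post.
sun = {"117000": "05", "116272": "03", "116276": "01", "116602": "01"}
serverx = {"117000": "01", "116272": "01", "116602": "02"}

report = compare_patches(sun, serverx)

# Further processing: keep only patches where the server is behind.
outdated = [row for row in report if row[3] is not None and row[3] < 0]
for patch, sun_rev, server_rev, diff in outdated:
    print(patch, "is", -diff, "revision(s) behind")
```

Once the rows are in a list, filtering or sorting them (as with outdated here) is a one-liner, which is the main reason to separate the comparison from the printing.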

This code isn't exactly a work of art; I could have put more effort
into a sensible name for the function and variables, made it more
'pythonic' (e.g. by using a list comprehension in place of the
whitespace-stripping for loop), etc.; but I think it achieves the
desired result, or something close to it, right?
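
For illustration, here is roughly what the list-comprehension variant might look like, in Python 3 syntax. It assumes the patch number and revision are the first two whitespace-separated columns of each line, as the split(None, 7) call suggests; the helper name parse_line, the sample lines, and the blank-line check are assumptions added for this sketch.

```python
def parse_line(line):
    """Split a line into at most 8 whitespace-separated, stripped fields."""
    # split(None, 7) leaves trailing whitespace on the last field when the
    # split limit is reached, so the strip() is a safeguard for that case.
    return [field.strip() for field in line.split(None, 7)]


def patchlevels(lines):
    """Map patch number -> revision from an iterable of text lines."""
    patch_dict = {}
    for line in lines:
        fields = parse_line(line)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        patch_dict[fields[0]] = fields[1]
    return patch_dict


# Hypothetical sample lines: patch number, revision, then other columns.
sample = ["116272 03 some description\n", "\n", "116602 01 other text  \n"]
print(patchlevels(sample))
```

Unlike the for loop, which rebinds a temporary name and leaves the list unchanged, the comprehension builds a new list that actually holds the stripped values.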

Let me know if I was on completely the wrong track.

Jul 19 '05 #3
Bill and Jordan, thank you both kindly. I'm not too well versed in
functions in python and that's exactly what I needed. I could see I
was doing something wrong in my original attempt, but I didn't know how
to correct it.

It's working like a charm now, thank you both very much.
-Rick

Jul 19 '05 #4
On Thursday 05 May 2005 10:20 am, so sayeth rickle:
[...]


I thought I'd throw this in to show some things in Python that make such comparisons very easy to write, and also to recommend using the patch as key and version as value in the dict:

Note that the meat of the code is really about 4 lines because of (module) sets and list comprehension. Everything else is window dressing.

James

===================================

#!/usr/bin/env python

from sets import Set

# pretending these stripped from file
recc_ary = ["117000-05", "116272-03", "116276-01", "116278-01", "116378-02", "116455-01", "116602-01", "116606-01"]
serv_ary = ["117000-01", "116272-02", "116272-01", "116602-02"]
# use patch as value and version as key
recc_dct = dict([x.split("-") for x in recc_ary])
serv_dct = dict([x.split("-") for x in serv_ary])

# use Set to see if patches overlap
overlap = Set(recc_dct.keys()).intersection(serv_dct.keys())

# find differences (change comparison operator to <,>,<=,>=, etc.)
diffs = [patch for patch in overlap if recc_dct[patch] != serv_dct[patch]]

# print a pretty report
for patch in diffs:
    print "recommended patch for %s (%s) is not server patch (%s)" % \
        (patch, recc_dct[patch], serv_dct[patch])
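
For readers on a current interpreter, the same idea can be sketched in Python 3, where the sets module no longer exists and dictionary key views support set operations directly. The missing set and the sorted output are additions for this sketch, not part of the code above.

```python
recc_ary = ["117000-05", "116272-03", "116276-01", "116278-01",
            "116378-02", "116455-01", "116602-01", "116606-01"]
serv_ary = ["117000-01", "116272-02", "116272-01", "116602-02"]

# Patch number is the key, revision is the value. A repeated patch keeps
# only the last revision listed (here 116272 ends up as "01").
recc_dct = dict(x.split("-") for x in recc_ary)
serv_dct = dict(x.split("-") for x in serv_ary)

# dict key views behave like sets, so no Set class is needed
overlap = recc_dct.keys() & serv_dct.keys()
missing = recc_dct.keys() - serv_dct.keys()

# patches present on both sides whose revisions differ
diffs = sorted(p for p in overlap if recc_dct[p] != serv_dct[p])

for patch in diffs:
    print("recommended rev for %s (%s) is not server rev (%s)"
          % (patch, recc_dct[patch], serv_dct[patch]))
print("not installed:", sorted(missing))
```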
--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
Jul 19 '05 #5
On 5 May 2005 08:19:31 -0700, "rickle" <de*******@gmail.com> wrote:
[...]

In place of sun_rec.splitlines() and x_has.splitlines() you can substitute
open('sun-patchlist') and open('serverx-patchlist') respectively,
and you can wrap it all in some routine for your convenience etc.
For each recommended patch, this shows which recommended revs are present,
which are missing, and which unrecommended revs the server has.
I added some test data to illustrate. You might want to make the input a little more forgiving about
e.g. blank lines, or raise exceptions for what's not allowed or expected.

----< sunpatches.py >--------------------------------------------------------------
#Here are some sample patches that sun recommends:
sun_rec = """\
117000-05
116272-03
116276-01
116278-01
116378-02
116455-01
116602-01
116606-01
testok-01
testok-02
testok-03
test_0-01
test_0-02
test_0-03
test_2-01
test_2-02
test_2-03
test23-02
test23-03
"""

#Here are some sample patches that server x has:
x_has = """\
117000-01
116272-02
116272-01
116602-02
testok-01
testok-02
testok-03
test_2-01
test_2-02
test23-01
test23-02
test23-03
"""

def mkdict(lineseq):
    # Map patch -> set of all revisions seen for that patch.
    dct = {}
    for line in lineseq:
        # strip() drops the trailing newline when reading from an open file
        patch, rev = line.strip().split('-')
        dct.setdefault(patch, set()).add(rev)
    return dct

dct_x_has = mkdict(x_has.splitlines()) # or e.g., mkdict(open('serverx-patchlist'))
dct_sun_rec = mkdict(sun_rec.splitlines())

for sunpatch, sunrevs in sorted(dct_sun_rec.items()):
    xrevs = dct_x_has.get(sunpatch, set())
    print 'patch %s: recommended revs %s, missing %s, actual other %s'%(
        sunpatch, map(str,sunrevs&xrevs) or '(none)',
        map(str,sunrevs-xrevs) or '(none)', map(str,xrevs-sunrevs) or '(none)')
----------------------------------------------------------------------------------
Result:

[12:51] C:\pywk\clp>py24 sunpatches.py
patch 116272: recommended revs (none), missing ['03'], actual other ['02', '01']
patch 116276: recommended revs (none), missing ['01'], actual other (none)
patch 116278: recommended revs (none), missing ['01'], actual other (none)
patch 116378: recommended revs (none), missing ['02'], actual other (none)
patch 116455: recommended revs (none), missing ['01'], actual other (none)
patch 116602: recommended revs (none), missing ['01'], actual other ['02']
patch 116606: recommended revs (none), missing ['01'], actual other (none)
patch 117000: recommended revs (none), missing ['05'], actual other ['01']
patch test23: recommended revs ['02', '03'], missing (none), actual other ['01']
patch test_0: recommended revs (none), missing ['02', '03', '01'], actual other (none)
patch test_2: recommended revs ['02', '01'], missing ['03'], actual other (none)
patch testok: recommended revs ['02', '03', '01'], missing (none), actual other (none)

Oops, didn't put multiple revs in sort order. Oh well, you can do that if you like.
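
A minimal sketch of those two loose ends (forgiving input and sorted revs), in Python 3 syntax. The show helper, the ValueError message, and the use of rpartition in place of split are choices made for this sketch, not part of the script above.

```python
def mkdict(lineseq):
    """Map patch -> set of revisions, skipping blank lines.

    Raises ValueError on a non-blank line that is not 'patch-rev'.
    """
    dct = {}
    for lineno, line in enumerate(lineseq, 1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        # rpartition splits on the last '-' only
        patch, sep, rev = line.rpartition("-")
        if not sep or not patch or not rev:
            raise ValueError("line %d: expected 'patch-rev', got %r"
                             % (lineno, line))
        dct.setdefault(patch, set()).add(rev)
    return dct


def show(revs):
    """Return the revisions as a sorted list, or '(none)' if empty."""
    return sorted(revs) or "(none)"


sun = mkdict(["116272-03\n", "\n", "test_2-01", "test_2-02", "test_2-03"])
has = mkdict(["116272-02", "116272-01", "test_2-02", "test_2-01"])

for patch, sunrevs in sorted(sun.items()):
    xrevs = has.get(patch, set())
    print("patch %s: recommended revs %s, missing %s, actual other %s" % (
        patch, show(sunrevs & xrevs), show(sunrevs - xrevs),
        show(xrevs - sunrevs)))
```

Sorting the revisions as strings gives the expected order only while they are zero-padded to the same width, as in the sample data.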

Regards,
Bengt Richter
Jul 19 '05 #6
On Thu, 5 May 2005 10:37:23 -0700, James Stroud <js*****@mbi.ucla.edu> wrote:
[...]
We had the same impulse ;-)
(see my other post in this thread)

> # use patch as value and version as key
??? seems the other way around (as it should be?)
> recc_dct = dict([x.split("-") for x in recc_ary])
> serv_dct = dict([x.split("-") for x in serv_ary])

But what about multiple revs for the same patch?

Regards,
Bengt Richter
Jul 19 '05 #7
On Thursday 05 May 2005 01:18 pm, so sayeth Bengt Richter:
> On Thu, 5 May 2005 10:37:23 -0700, James Stroud <js*****@mbi.ucla.edu>
> wrote: [...]
> We had the same impulse ;-)
> (see my other post in this thread)
>
> > # use patch as value and version as key
>
> ??? seems the other way around (as it should be?)

Sorry, typo in the comment.

> > recc_dct = dict([x.split("-") for x in recc_ary])
> > serv_dct = dict([x.split("-") for x in serv_ary])
>
> But what about multiple revs for the same patch?

My Bad...

serv_dct = dict([(a, max([z for y, z in [f.split("-") for f in serv_ary] if a == y]))
                 for a, b in [g.split("-") for g in serv_ary]])

;o)
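
The nested comprehension above rescans the whole list for every entry and compares revisions as strings. A plainer way to get the same "highest rev per patch" result might look like the following Python 3 sketch; the function name latest_revs is made up for this example, and the numeric comparison is a deliberate change from the string comparison that max() performs above.

```python
def latest_revs(entries):
    """Map each patch number to its highest revision string."""
    latest = {}
    for entry in entries:
        patch, rev = entry.split("-")
        # compare numerically so that e.g. "10" would beat "9"
        if patch not in latest or int(rev) > int(latest[patch]):
            latest[patch] = rev
    return latest


serv_ary = ["117000-01", "116272-02", "116272-01", "116602-02"]
serv_dct = latest_revs(serv_ary)
print(serv_dct)
```

This makes one pass over the list and keeps the original zero-padded revision strings as the stored values.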

James

Jul 19 '05 #8
