File compare

I have two files
file1 in format
<id> <val1> <test1> <test2>
'AA' 1 T T
'AB' 1 T F

file2 same as file1
<id> <val1> <test1> <test2>
'AA' 1 T T
'AB' 1 T T

Also, the compare should be based on id. So it should look for the line
starting with id 'AA' (for example) and then match it against the line
with the same id in the second file.

so this is what I am looking for:
1. read both files.
2. read the id of the first line in file1 and check if that line matches
the line with the same id in file2.
3. repeat step 2 for all lines in file1.
4. return the fraction of lines that match, i.e. if one line matches and
one line doesn't then return 0.5 or 50%

I wrote a boolean version ..as a start

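import re

def getdata(f):
    try:
        f1 = open(f, 'r')
        data = []
        for eachline in f1.readlines():
            # keep the text before the first run of 2+ whitespace characters,
            # with surrounding whitespace (including the newline) removed
            data.append(re.split(r'\s\s+', eachline)[0].strip())
        f1.close()
        return data
    except IOError:
        raise IOError("Invalid File Input")

if __name__ == '__main__':

    data1 = getdata('file1')
    data2 = getdata('file2')

    # direct line-by-line comparison; ids are not used yet
    if data1 == data2:
        print("True")
    else:
        print("False")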

hope I am clear...

Oct 12 '05 #1
Note that the code I wrote won't do the compare based on id, which is what
I am looking for. It just does a direct file-to-file compare.

Oct 12 '05 #2
Sounds a little like "homework", but I'll help you out.
There are lots of ways, but this works.

import sys

class fobject:
    def __init__(self, inputfilename):
        try:
            fp = open(inputfilename, 'r')
            self.lines = fp.readlines()
            fp.close()
        except IOError:
            print("Unable to open and read inputfilename=%s" % inputfilename)
            sys.exit(3)

        # map each id (quotes removed) to the rest of its line
        self.datadict = {}
        for line in self.lines:
            line = line.strip()
            if not line:
                continue
            key, values = line.split(' ', 1)
            self.datadict[key.strip("'")] = values

    def keys(self):
        return self.datadict.keys()

    def compare(self, otherobject):
        # fraction of otherobject's ids whose values match this object's;
        # raises KeyError if otherobject has an id this object lacks
        keys = otherobject.keys()
        match = 0
        for key in keys:
            if self.datadict[key] == otherobject.datadict[key]:
                match += 1

        return float(match) / float(len(keys))

if __name__ == "__main__":
    f1 = fobject(r'f:\syscon\python\zbkup\f1.txt')
    f2 = fobject(r'f:\syscon\python\zbkup\f2.txt')
    print(f1.compare(f2))

Larry Bates

Oct 12 '05 #3
Not for homework. But anyway thanks much...

Oct 13 '05 #4
PyPK wrote:
I have two files
file1 in format
<id> <val1> <test1> <test2>
'AA' 1 T T
'AB' 1 T F

file2 same as file1
<id> <val1> <test1> <test2>
'AA' 1 T T
'AB' 1 T T

Also, the compare should be based on id. So it should look for the line
starting with id 'AA' (for example) and then match it against the line
with the same id in the second file.


See the recent thread with subject line "List performance and CSV".
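
That thread is not reproduced here, so the following is only a guess at the kind of approach it points to: reading the files with the standard `csv` module, using a space as the delimiter and a single quote as the quote character. The function name `read_records` and the in-memory sample are made up for illustration, and the code is written for Python 3.

```python
import csv
import io


def read_records(stream):
    """Return a dict mapping each id to the list of its remaining fields."""
    reader = csv.reader(stream, delimiter=" ", quotechar="'",
                        skipinitialspace=True)
    records = {}
    for row in reader:
        # Skip blank lines and the "<id> <val1> ..." header line.
        if not row or row[0].startswith("<"):
            continue
        records[row[0]] = row[1:]
    return records


sample = io.StringIO("<id> <val1> <test1> <test2>\n'AA' 1 T T\n'AB' 1 T F\n")
print(read_records(sample))
# {'AA': ['1', 'T', 'T'], 'AB': ['1', 'T', 'F']}
```

The reader removes the quotes around the id, so the resulting dict can be compared key by key without any extra string stripping.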
Oct 14 '05 #5
But what if
case 1: the number of keys in f1 > f2, and
case 2: the number of keys in f1 < f2?
Shouldn't we get 1.1 in case 1 and 0.9 in case 2? It errors out with a
KeyError.
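
For what it's worth, the KeyError comes from compare() looping over the other object's keys and indexing self.datadict with each one, so it fails as soon as f2 has an id that f1 lacks. When f1 has the extra ids there is no error, but those ids are silently ignored. A match count divided by a key count also cannot go above 1, so a value like 1.1 would need a different definition. One possible way to handle both cases is sketched below; the name `compare_records`, the choice to divide by the number of distinct ids in either file, and treating two empty inputs as identical are all assumptions, not something settled in this thread.

```python
def compare_records(first, second):
    """Compare two id -> values dicts.

    Returns (ratio, only_in_first, only_in_second), where ratio is the
    number of ids with identical values divided by the number of distinct
    ids seen in either dict.
    """
    all_ids = set(first) | set(second)
    if not all_ids:
        # Assumption: two empty inputs are treated as identical.
        return 1.0, [], []
    matches = sum(1 for key in all_ids
                  if key in first and key in second
                  and first[key] == second[key])
    only_in_first = sorted(set(first) - set(second))
    only_in_second = sorted(set(second) - set(first))
    return float(matches) / len(all_ids), only_in_first, only_in_second


f1 = {"AA": "1 T T", "AB": "1 T F", "AC": "1 F F"}
f2 = {"AA": "1 T T", "AB": "1 T T"}
ratio, missing_from_f2, missing_from_f1 = compare_records(f1, f2)
print(ratio, missing_from_f2, missing_from_f1)
```

Here only 'AA' matches out of three distinct ids, and 'AC' is reported as missing from f2 instead of raising an exception. Swapping the arguments gives the same ratio with the two lists exchanged.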

Oct 14 '05 #6
