Bytes IT Community

file compare the hard way please help

P: 2
Hi all, this is my first post, and I'm sorry, I'm a noob.

I've been working on this for a couple of days and I can't seem to get it. I'm sure this is probably a very simple problem, but it eludes me.

I need to do this in python on a Linux box. Here is the sequence of events.

Open a tcpdump file named something like "webdump.txt". Here is a sample of that file:

11:30:07.830643 00:b0:64:19:86:f0 > 01:00:0c:cc:cc:cc snap ui/C len=35
0x0000: 0100 0ccc cccc 00b0 6419 86f0 0022 aaaa ........d...."..
0x0010: 0300 000c 2004 0100 0100 0500 0002 0005 ................
0x0020: 0400 0300 05a5 0004 000a 00b0 6419 86f0 ............d...
0x0030: 0001 42dc 861d 805c 0000 1400 ..B....\....
11:30:07.830722 00:b0:64:19:86:f0 > 01:00:0c:00:00:00 snap ui/C len=69
0x0000: 0100 0c00 0000 00b0 6419 86f0 0050 aaaa ........d....P..
0x0010: 0300 b064 0003 0084 0000 0100 0ccc cccc ...d............
0x0020: 00b0 6419 86f0 0032 aaaa 0300 000c 2004 ..d....2........
0x0030: 0100 0100 0500 0002 0005 0400 0300 05a5 ................
0x0040: 0004 000a 00b0 6419 86f0 0000 0000 0000 ......d.........
0x0050: 0000 0000 0000 0000 0000 30db e516 ..........0...

Break the file up so that each packet is in its own array, list, or container, also removing the line-end \n and \t characters. (For people who don't work with tcpdump: each packet starts with a timestamp, and as you can see, a packet can be anywhere from a few lines to many lines.)
Next, open another tcpdump file and look in it for a match for each packet from the first file (the packets are now in arrays). If there is a match, print "match successful"; if there isn't, print the packet and "match not found". I would have used filecmp, but the information in file 1 and file 2 may be in a different order.
So far I have opened the file and put each packet into a stack with:

f = open('webdump.txt', 'r')
for line in f.read().split('11:'):
    stack = [line]
    # stack.remove('\n')
    print(stack)
f.close()
This doesn't take the \n and \t off (which I'm not sure really matters, as long as I can make a match), and it requires me to change the code every time I run it unless I only capture my dumps during the 11 o'clock hour. It also removes the "11:" from each packet, so the output looks something like this:

['']
['30:07.830643 00:b0:64:19:86:f0 > 01:00:0c:cc:cc:cc snap ui/C len=35\n\t0x0000: 0100 0ccc cccc 00b0 6419 86f0 0022 aaaa ........d...."..\n\t0x0010: 0300 000c 2004 0100 0100 0500 0002 0005 ................\n\t0x0020: 0400 0300 05a5 0004 000a 00b0 6419 86f0 ............d...\n\t0x0030: 0001 42dc 861d 805c 0000 1400 ..B....\\....\n']
['30:07.830722 00:b0:64:19:86:f0 > 01:00:0c:00:00:00 snap ui/C len=69\n\t0x0000: 0100 0c00 0000 00b0 6419 86f0 0050 aaaa ........d....P..\n\t0x0010: 0300 b064 0003 0084 0000 0100 0ccc cccc ...d............\n\t0x0020: 00b0 6419 86f0 0032 aaaa 0300 000c 2004 ..d....2........\n\t0x0030: 0100 0100 0500 0002 0005 0400 0300 05a5 ................\n\t0x0040: 0004 000a 00b0 6419 86f0 0000 0000 0000 ......d.........\n\t0x0050: 0000 0000 0000 0000 0000 30db e516 ..........0...\n']
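One way to avoid hard-coding the hour is to split on the whole timestamp pattern rather than on '11:'. The following is only a minimal sketch: the function name `split_packets` and the shortened sample text are made up for illustration, and it assumes every packet header starts with an HH:MM:SS.microseconds timestamp followed by a space, as in the sample above. It keeps the timestamp and collapses each \n\t into a single space.

```python
import re

# Assumed header format: "11:30:07.830643 " at the start of a line
TS = r'\d{2}:\d{2}:\d{2}\.\d{6} '

# One packet = a header line plus every following line that does
# not itself start with a timestamp
PACKET = re.compile(r'^' + TS + r'.*(?:\n(?!' + TS + r').*)*', re.MULTILINE)

def split_packets(text):
    """Return one whitespace-normalized string per packet (hypothetical helper)."""
    return [' '.join(chunk.split()) for chunk in PACKET.findall(text)]

# Shortened, made-up sample in the same layout as the dump above
sample = (
    "11:30:07.830643 00:b0:64:19:86:f0 > 01:00:0c:cc:cc:cc snap ui/C len=35\n"
    "\t0x0000: 0100 0ccc\n"
    "\t0x0010: 0300 000c\n"
    "12:01:02.000003 00:b0:64:19:86:f0 > 01:00:0c:00:00:00 snap ui/C len=69\n"
    "\t0x0000: 0100 0c00\n"
)

packets = split_packets(sample)
for p in packets:
    print(p)
```

Because the pattern matches any hour, the same code works no matter when the dump was captured.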
Oct 11 '06 #1
2 Replies


bartonc
Expert 5K+
P: 6,596
Start by using

stack = line.split()
Oct 11 '06 #2

P: 2
Figured it out with a bunch of help

import re

# matches the tcpdump timestamp at the start of a packet,
# e.g. "11:30:07.830643 " (the dot is escaped so it only matches a literal ".")
TIMESTAMP = re.compile(r'[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6} ')

def getPackets(fileobj):
    packets = []

    for line in fileobj:
        # if the line starts with a timestamp, a new packet begins
        if TIMESTAMP.match(line):
            # remove the timestamp and add the line to the end of the list
            packets.append(TIMESTAMP.sub('', line, 1))
        elif packets:
            # otherwise append to the end of the last
            # item in the list
            packets[-1] += line
        # a line before the first timestamp is skipped; this
        # shouldn't occur, unless the input file is bad
    return packets

with open('file1.txt') as f1:
    list1 = getPackets(f1)
with open('file2.txt') as f2:
    list2 = getPackets(f2)

# loop through list1, checking
# if each of its elements is in
# list2

for packet in list1:
    if packet in list2:
        print('Match Successful')
    else:
        print('Match Unsuccessful')
        print(packet)
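A possible refinement for large dumps: `packet in list2` scans the whole list for every packet, whereas a set gives the same order-independent test with cheaper lookups. This is only a sketch under assumptions: `find_unmatched` is a hypothetical helper, the sample lists are made up, and the packets are assumed to be plain strings like the ones getPackets returns. It does not count duplicates, so a packet that appears twice in the first file and once in the second is still reported as matched.

```python
def find_unmatched(packets1, packets2):
    """Return the packets from packets1 that do not occur in packets2."""
    seen = set(packets2)  # order is irrelevant for set membership
    return [p for p in packets1 if p not in seen]

# Made-up sample data: one shared packet in a different position, one unshared
list1 = ["A > B len=35\n\t0x0000: 0100\n", "C > D len=69\n\t0x0000: 0300\n"]
list2 = ["C > D len=69\n\t0x0000: 0300\n", "E > F len=10\n\t0x0000: ffff\n"]

missing = find_unmatched(list1, list2)
for packet in missing:
    print('Match Unsuccessful')
    print(packet)
```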
Oct 12 '06 #3
