File compare the hard way, please help

Hi all, this is my first post, and I’m sorry, I’m a noob.

I’ve been working on this for a couple of days and I can’t seem to get it. I’m sure this is a very simple problem, but it eludes me.

I need to do this in Python on a Linux box. Here is the sequence of events.

Open a tcpdump file named something like “webdump.txt”. Here is a sample of that file:

11:30:07.830643 00:b0:64:19:86:f0 > 01:00:0c:cc:cc:cc snap ui/C len=35
0x0000: 0100 0ccc cccc 00b0 6419 86f0 0022 aaaa ........d...."..
0x0010: 0300 000c 2004 0100 0100 0500 0002 0005 ................
0x0020: 0400 0300 05a5 0004 000a 00b0 6419 86f0 ............d...
0x0030: 0001 42dc 861d 805c 0000 1400 ..B....\....
11:30:07.830722 00:b0:64:19:86:f0 > 01:00:0c:00:00:00 snap ui/C len=69
0x0000: 0100 0c00 0000 00b0 6419 86f0 0050 aaaa ........d....P..
0x0010: 0300 b064 0003 0084 0000 0100 0ccc cccc ...d............
0x0020: 00b0 6419 86f0 0032 aaaa 0300 000c 2004 ..d....2........
0x0030: 0100 0100 0500 0002 0005 0400 0300 05a5 ................
0x0040: 0004 000a 00b0 6419 86f0 0000 0000 0000 ......d.........
0x0050: 0000 0000 0000 0000 0000 30db e516 ..........0...

Break the file up so that each packet is in its own array, list, or container, also removing the line endings (\n and \t). For people who don’t work with tcpdump: each packet starts with a timestamp and, as you can see, a packet can be anywhere from a few lines to many lines.
Next, open another tcpdump file and look for a match for each packet from the first file (now in arrays) in the second file. If there is a match, print "match successful"; if there isn’t, print the packet and "match not found". I would have used filecmp, but the information in file 1 and file 2 may be in a different order.
So far I have opened the file and put each packet into a stack with:

f = open('webdump.txt','rd')
for line in f.read().split('11:'):
    stack = [line]
    # stack.remove('\n')
    print stack
f.close()
This doesn’t take the \n and \t off (which I'm not sure is absolutely important, as long as I can make a match).
It also requires me to change the code every time I run it, unless I only run my dumps during the 11 o'clock hour. It also removes the "11:" from each packet, so the output looks something like this:

['']
['30:07.830643 00:b0:64:19:86:f0 > 01:00:0c:cc:cc:cc snap ui/C len=35\n\t0x0000: 0100 0ccc cccc 00b0 6419 86f0 0022 aaaa ........d...."..\n\t0x0010: 0300 000c 2004 0100 0100 0500 0002 0005 ................\n\t0x0020: 0400 0300 05a5 0004 000a 00b0 6419 86f0 ............d...\n\t0x0030: 0001 42dc 861d 805c 0000 1400 ..B....\\....\n']
['30:07.830722 00:b0:64:19:86:f0 > 01:00:0c:00:00:00 snap ui/C len=69\n\t0x0000: 0100 0c00 0000 00b0 6419 86f0 0050 aaaa ........d....P..\n\t0x0010: 0300 b064 0003 0084 0000 0100 0ccc cccc ...d............\n\t0x0020: 00b0 6419 86f0 0032 aaaa 0300 000c 2004 ..d....2........\n\t0x0030: 0100 0100 0500 0002 0005 0400 0300 05a5 ................\n\t0x0040: 0004 000a 00b0 6419 86f0 0000 0000 0000 ......d.........\n\t0x0050: 0000 0000 0000 0000 0000 30db e516 ..........0...\n']
Oct 11 '06 #1
bartonc
Start by using

stack = line.split()
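
A minimal sketch of what this suggestion buys you, assuming the goal is to get rid of the \n and \t inside each packet: split() with no arguments splits on any run of whitespace, so joining the pieces back together with single spaces gives one normalized line per packet. The name normalize and the sample string are made up for illustration, and the code is written for Python 3.

```python
def normalize(packet):
    # split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and never returns empty strings
    return ' '.join(packet.split())

# made-up fragment in the same shape as the tcpdump output above
sample = 'snap ui/C len=35\n\t0x0000: 0100 0ccc\n'
print(normalize(sample))
```

Two packets that differ only in line breaks or indentation compare equal after this step.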
Oct 11 '06 #2
Figured it out with a bunch of help

import re

# matches the tcpdump timestamp, e.g. "11:30:07.830643 "
TIMESTAMP = re.compile(r'\d{2}:\d{2}:\d{2}\.\d{6} ')

def getPackets(fileobj):
    packets = []

    for line in fileobj:
        m = TIMESTAMP.match(line)
        if m:
            # the line starts a new packet:
            # remove the timestamp and add the rest to the end of the list
            packets.append(line[m.end():])
        elif packets:
            # otherwise append to the end of the last item in the list
            packets[-1] += line
        # lines before the first timestamp are skipped; this
        # shouldn't occur unless the input file is bad
    return packets

with open('file1.txt') as f1:
    list1 = getPackets(f1)
with open('file2.txt') as f2:
    list2 = getPackets(f2)

# loop through list1, checking if each of its elements is in list2
for packet in list1:
    if packet in list2:
        print('Match Successful')
    else:
        print('Match Unsuccessful')
        print(packet)
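
The `packet in list2` test scans list2 once for every packet in list1, which can get slow on large dumps. One possible variation, sketched here on the assumption that packets are plain strings like the ones getPackets returns: normalize the whitespace and count the packets of the second file with collections.Counter, so each lookup is a hash lookup and order does not matter. Note that this is a different technique from the loop above, and it behaves differently with duplicates: each packet in the second file can satisfy only one packet from the first. The names normalize and compare_packets and the sample data are hypothetical, and the code is written for Python 3.

```python
from collections import Counter

def normalize(packet):
    # collapse newlines, tabs and repeated spaces into single spaces
    return ' '.join(packet.split())

def compare_packets(list1, list2):
    """Return (matched, unmatched) packets of list1 with respect to list2."""
    remaining = Counter(normalize(p) for p in list2)
    matched, unmatched = [], []
    for packet in list1:
        key = normalize(packet)
        if remaining[key] > 0:
            remaining[key] -= 1  # use up one copy from the second file
            matched.append(packet)
        else:
            unmatched.append(packet)
    return matched, unmatched

# made-up packets, timestamps already removed
first = ['a len=1\n\t0x0000: 0100\n', 'b len=2\n\t0x0000: 0200\n']
second = ['b len=2 0x0000: 0200', 'c len=3 0x0000: 0300']

matched, unmatched = compare_packets(first, second)
for packet in unmatched:
    print('Match Unsuccessful')
    print(packet)
```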
Oct 12 '06 #3
