
Best way to compare the contents of two directories

I have two directory trees that I want to compare and I'm trying to
figure out what the best way of doing this would be. I am using walk
to get a list of all of the files in each directory.

I am using this code to compare the file lists:

def compare_files(first_list, second_list, first_dir, second_dir):
    missing = in_first_only(first_list, second_list)
    for item in missing:
        index = first_list.index(item)
        print first_list[index] + ' does not exist in ' + second_dir[index]
        first_list.pop(index); first_dir.pop(index)
    return first_list, second_list, first_dir, second_dir

However, before I actually compare the files, I want to compare the
directories and if a directory is missing in either set, I want to
report it:

dir_list_a = ['d:\\results\\foldera\\', 'd:\\results\\folderb\\', 'd:\\results\\folderc\\']
dir_list_b = ['c:\\results\\foldera\\', 'c:\\results\\folderb\\']

output:
'folderc' exists in d:\results but not in c:\results

I am using splitall (from the Python Cookbook) to split the paths into
their parts and appending these to a list, but I can't figure out the
best way to compare the contents of the resulting 2 lists and I think
I am starting to make things *too* complicated:

import os

def splitall(path):
    """
    Source: Python Cookbook
    Credit: Trent Mick

    Split a path into all of its parts.
    """
    allparts = []
    while 1:
        parts = os.path.split(path)
        if parts[0] == path:    # sentinel for absolute paths
            allparts.insert(0, parts[0])
            break
        elif parts[1] == path:  # sentinel for relative paths
            allparts.insert(0, parts[1])
            break
        else:
            path = parts[0]
            allparts.insert(0, parts[1])
    return allparts

After using this, I end up with this:

dir_list_a = [['d:\\', 'results', 'foldera', 'd:\\', 'results',
               'folderb', 'd:\\', 'results', 'folderc']]
dir_list_b = [['d:\\', 'results', 'foldera', 'd:\\', 'results', 'folderb']]
Jul 18 '05 #1
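
One way to avoid splitting paths into parts at all is to record every directory relative to the root being walked, so that trees rooted at d:\results and c:\results produce directly comparable names. The following is only a minimal sketch under that assumption, written for current Python 3 rather than the 2.2 interpreter used in the thread; `relative_dirs` is a hypothetical helper name and does not come from the original post.

```python
import os

def relative_dirs(root):
    """Return the set of sub-directory paths found under root, relative to root."""
    found = set()
    for current, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            full = os.path.join(current, name)
            # Strip the root prefix so two different roots can be compared
            found.add(os.path.relpath(full, root))
    return found
```

Calling `relative_dirs` on each root gives two sets whose differences are the directories present in only one tree, which removes the need to strip the drive and root prefix by hand.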
ro***********@palmsource.com (Robin Siebler) wrote in message news:<95*************************@posting.google.com>...
I have two directory trees that I want to compare and I'm trying to
figure out what the best way of doing this would be. I am using walk
to get a list of all of the files in each directory.


Once you have the two lists, look into difflib. E.g.,

import difflib
for i in difflib.ndiff(list1, list2):
    print i

Dan Gass has recently contributed some code that can produce
a side-by-side difference in HTML format. I believe it is in the 2.4
release, but you can also get it from
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=914575&group_id=5470

import difflib
tbl = difflib.HtmlDiff().make_file(list1, list2)
f = open("diffs.html", "w")
f.write(tbl)
f.close()

--dang
Jul 18 '05 #2
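
A small, self-contained illustration of the difflib suggestion, assuming Python 3 and two made-up file lists (the paths below are invented for the example). Since difflib compares sequences in order, the lists are assumed to be sorted the same way before diffing.

```python
import difflib

# Hypothetical, already-sorted file lists for two directory trees
list1 = ["foldera/a.txt", "foldera/b.txt", "folderb/c.txt"]
list2 = ["foldera/a.txt", "folderb/c.txt", "folderb/d.txt"]

# ndiff prefixes each line: '  ' common, '- ' only in list1, '+ ' only in list2
changes = [line for line in difflib.ndiff(list1, list2)
           if line[:2] in ("- ", "+ ")]
for line in changes:
    print(line)

# HtmlDiff builds a complete side-by-side HTML page as a string
html = difflib.HtmlDiff().make_file(list1, list2, "first tree", "second tree")
```

Keeping only the lines that start with `- ` or `+ ` reduces the ndiff output to the entries that exist in just one of the two lists.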
[Robin Siebler]
However, before I actually compare the files, I want to compare the
directories and if a directory is missing in either set, I want to
report it:


The operative word is "set".

Try using sets.py:
one_only = Set(dirlistone) - Set(dirlisttwo)
two_only = Set(dirlisttwo) - Set(dirlistone)
Raymond Hettinger
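
As a rough sketch of how the set difference could produce the report asked for in the first post: the helper names `leaf_names` and `report_missing` are hypothetical, the code targets Python 3 with the built-in `set`, and it assumes each list holds folders directly under a single root, so only the last path component is compared.

```python
import ntpath  # Windows path rules, so the sketch behaves the same on any OS

def leaf_names(dir_list):
    """Map each directory path to its last component, e.g. 'foldera'."""
    return {ntpath.basename(path.rstrip("\\/")) for path in dir_list}

def report_missing(dirs_a, root_a, dirs_b, root_b):
    """Return one message per folder present under only one of the roots."""
    names_a, names_b = leaf_names(dirs_a), leaf_names(dirs_b)
    messages = []
    for name in sorted(names_a - names_b):
        messages.append("'%s' exists in %s but not in %s" % (name, root_a, root_b))
    for name in sorted(names_b - names_a):
        messages.append("'%s' exists in %s but not in %s" % (name, root_b, root_a))
    return messages

dir_list_a = ['d:\\results\\foldera\\', 'd:\\results\\folderb\\', 'd:\\results\\folderc\\']
dir_list_b = ['c:\\results\\foldera\\', 'c:\\results\\folderb\\']
for line in report_missing(dir_list_a, 'd:\\results', dir_list_b, 'c:\\results'):
    print(line)
# prints: 'folderc' exists in d:\results but not in c:\results
```

For nested trees the same idea applies, provided the compared values are paths relative to each root rather than bare folder names.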
Jul 18 '05 #3
"Raymond Hettinger" <vz******@verizon.net> wrote in message news:<Zxc1d.412$W73.185@trndny03>...
[Robin Siebler]
However, before I actually compare the files, I want to compare the
directories and if a directory is missing in either set, I want to
report it:


The operative word is "set".

Try using sets.py:
one_only = Set(dirlistone) - Set(dirlisttwo)
two_only = Set(dirlisttwo) - Set(dirlistone)
Raymond Hettinger


I get the following error:

NameError: name 'Set' is not defined

I'm using ActivePython 2.2.3. Is this something that has been added
to a later version of Python?
Jul 18 '05 #4
On Mon, 2004-09-13 at 11:37 -0700, Robin Siebler wrote:
"Raymond Hettinger" <vz******@verizon.net> wrote in message news:<Zxc1d.412$W73.185@trndny03>...
[Robin Siebler]
However, before I actually compare the files, I want to compare the
directories and if a directory is missing in either set, I want to
report it:


The operative word is "set".

Try using sets.py:
one_only = Set(dirlistone) - Set(dirlisttwo)
two_only = Set(dirlisttwo) - Set(dirlistone)
Raymond Hettinger


I get the following error:

NameError: name 'Set' is not defined

I'm using ActivePython 2.2.3. Is this something that has been added
to a later version of Python?


I think sets were added in 2.3, but either way you must still 'from sets
import Set' before using them.

Regards,
Cliff

--
Cliff Wells <cl************@comcast.net>

Jul 18 '05 #5
Cliff Wells <clifford.wells <at> comcast.net> writes:

I think sets were added in 2.3, but either way you must still 'from sets
import Set' before using them.


It also might be good to get in the habit of writing this as:

from sets import Set as set

so that when you move to Python 2.4, where set() is a builtin, all you have to
do is remove the import.

Python 2.4a3 (#56, Sep 2 2004, 20:50:21) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> set(abs(2*i - 6) for i in range(10))
set([0, 2, 4, 6, 8, 10, 12])

Steve

Jul 18 '05 #6
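
A variation on the same habit, offered only as a sketch: instead of removing the import later, the code can try the built-in name first and fall back to the sets module when it is missing. The directory names are invented for the example; on any interpreter with a built-in `set` the fallback branch never runs.

```python
try:
    set  # built in from Python 2.4 onwards
except NameError:
    from sets import Set as set  # fallback for interpreters without the builtin

# Hypothetical directory names, for illustration only
dirlistone = ["foldera", "folderb", "folderc"]
dirlisttwo = ["foldera", "folderb"]

one_only = set(dirlistone) - set(dirlisttwo)  # present only in the first list
two_only = set(dirlisttwo) - set(dirlistone)  # present only in the second list
```

With this guard the rest of the script can use `set(...)` unchanged regardless of which interpreter runs it.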
> > I'm using ActivePython 2.2.3. Is this something that has been added
to a later version of Python?


I think sets were added in 2.3, but either way you must still 'from sets
import Set' before using them.


Right.

Also, sets.py is now compatible with Py2.2, so you can take the current module
from CVS and use it directly.
Raymond Hettinger
Jul 18 '05 #7
