diffing and uniqing directories

Over the years I've collected tgz's of my directories. I would like to diff
and uniq them.

Now I guess it would be quite simple to write a script that does a
walk or find through a pair of directory trees, makes a SHA1 of each
file and then sorts out the files whose SHA1s are the same/different.
What is more difficult for me to do is to write a visual/gui tool to
help me do this.
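
For reference, the non-GUI part described above might look roughly like the sketch below. It assumes both archives have already been unpacked into ordinary directories, that files are matched by their path relative to each root, and that there are no broken symlinks. The names sha1_of_file, hash_tree and compare_trees are made up for illustration.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's path (relative to root) to its SHA1 digest."""
    hashes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            hashes[os.path.relpath(full, root)] = sha1_of_file(full)
    return hashes


def compare_trees(left_root, right_root):
    """Classify relative paths as same, different, or present on one side only."""
    left, right = hash_tree(left_root), hash_tree(right_root)
    common = left.keys() & right.keys()
    return {
        "same": sorted(p for p in common if left[p] == right[p]),
        "different": sorted(p for p in common if left[p] != right[p]),
        "only_left": sorted(left.keys() - right.keys()),
        "only_right": sorted(right.keys() - left.keys()),
    }
```

Calling compare_trees on two unpacked directories returns four sorted lists of relative paths. Matching by relative path misses files that were renamed or moved between snapshots; grouping by digest alone would catch those.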

I would guess that someone in the python world must have already done
it [The alternative is to use some of the tools that come with version
control systems like git. But if I knew more about that option I would
not be stuck with tgzs in the first place ;-)]

So if there is such software known please let me know.

PS: Also, with the spam flood that has hit the python list, I don't know
if this mail is being read at all or I've fallen off the list!
Jun 27 '08 #1
On Sat, 26 Apr 2008 20:35:29 -0700, rustom wrote:
> On Apr 27, 12:31 am, castiro...@gmail.com wrote:
>> On Apr 26, 1:14 pm, "Rustom Mody" <rustompm...@gmail.com> wrote:
>> […]
>
> If this is an answer to my question I don't understand it!

castironpi is either a bot or trolling. Just ignore its posts.

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #2
On Apr 27, 2:37 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
> On Sat, 26 Apr 2008 20:35:29 -0700, rustom wrote:
>> On Apr 27, 12:31 am, castiro...@gmail.com wrote:
>>> On Apr 26, 1:14 pm, "Rustom Mody" <rustompm...@gmail.com> wrote:
>>> […]
>> If this is an answer to my question I don't understand it!
>
> castironpi is either a bot or trolling. Just ignore its posts.
>
> Ciao,
>         Marc 'BlackJack' Rintsch

I am a bot or trolling. Bots and bot detectors were the first forms
of internet life, you know.
Jun 27 '08 #3
It just so happens that I have a partially finished GUI file backup app. I have
many backup CDs and I wanted to consolidate them. You know, all image files
in one dir, all install files in another dir, etc. My app scans the input
dir tree and displays all file extensions that it finds. You can then remove
any extensions that you don't want backed up, and you can toggle to exclude
the listed extensions. It also calculates min/max file sizes that you can
adjust.
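
To give an idea of the scanning step, here is a rough sketch. It is not the app's actual code: the names scan_extensions and select_files, and the exact meaning of the exclude toggle and the size limits, are assumptions made for illustration.

```python
import os
from collections import defaultdict


def scan_extensions(root):
    """Return {extension: {"count", "min_size", "max_size"}} for files under root."""
    stats = defaultdict(lambda: {"count": 0, "min_size": None, "max_size": None})
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()  # "" for files without one
            size = os.path.getsize(os.path.join(dirpath, name))
            entry = stats[ext]
            entry["count"] += 1
            if entry["min_size"] is None or size < entry["min_size"]:
                entry["min_size"] = size
            if entry["max_size"] is None or size > entry["max_size"]:
                entry["max_size"] = size
    return dict(stats)


def select_files(root, extensions, exclude=False, min_size=0, max_size=None):
    """Yield paths under root that pass the extension and size filters.

    With exclude=False only the listed extensions are kept; with
    exclude=True the listed extensions are skipped instead.
    """
    wanted = {ext.lower() for ext in extensions}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            listed = os.path.splitext(name)[1].lower() in wanted
            if listed == exclude:
                continue
            size = os.path.getsize(path)
            if size < min_size or (max_size is not None and size > max_size):
                continue
            yield path
```

Extensions are compared in lower case with the leading dot, so ".JPG" and ".jpg" count as the same type.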

Then the next page allows you to adjust the sub-dir depth with a slider,
which displays the total number of files and total amount of memory they
will take, for each sub-dir depth. You can also choose whether to enable versioning,
whether to put all files into one dir or create a dir for each file
type (extension), whether to actually back up the files, and whether to write all
input pathnames and/or output pathnames to a file. Of course, it won't
back up a dir tree as a copy; it can only flatten a dir tree, so if you
back up a development source dir, all files will get put into the same dir
and that wouldn't be good.
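
A rough sketch of the depth totals and the flattening, for illustration only. It is not the app's code: totals_by_depth and plan_flat_copy are hypothetical names, "memory" is read here as file size in bytes, and the numeric suffix used for name clashes is my own assumption rather than the app's versioning scheme. Nothing is copied; the second function only plans destinations.

```python
import os


def totals_by_depth(root):
    """Return {depth: (file_count, total_bytes)} for files at most depth levels below root."""
    per_level = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        count, size = per_level.get(depth, (0, 0))
        count += len(filenames)
        size += sum(os.path.getsize(os.path.join(dirpath, n)) for n in filenames)
        per_level[depth] = (count, size)
    # Turn per-level figures into running totals, one entry per slider position.
    totals, running_count, running_size = {}, 0, 0
    for depth in range(max(per_level, default=-1) + 1):
        count, size = per_level.get(depth, (0, 0))
        running_count += count
        running_size += size
        totals[depth] = (running_count, running_size)
    return totals


def plan_flat_copy(root, dest, by_extension=False):
    """Map each source file to a destination in a flattened layout (nothing is copied)."""
    plan, used = {}, set()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            stem, ext = os.path.splitext(name)
            folder = dest
            if by_extension:
                folder = os.path.join(dest, ext.lstrip(".").lower() or "no_extension")
            target, n = os.path.join(folder, name), 1
            while target in used:  # same file name seen in another sub-dir
                target = os.path.join(folder, f"{stem}_{n}{ext}")
                n += 1
            used.add(target)
            plan[os.path.join(dirpath, name)] = target
    return plan
```

The name clash handling is where flattening a source tree goes wrong: two files called x.txt in different sub-dirs end up side by side as x.txt and x_1.txt, and the original layout is lost.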

I've also used py2exe to make it a drag-n-drop install. What do you think?
<ca********@gmail.com> wrote in message
news:bf**********************************@a22g2000 hsc.googlegroups.com...
> On Apr 27, 2:37 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
>> On Sat, 26 Apr 2008 20:35:29 -0700, rustom wrote:
>>> On Apr 27, 12:31 am, castiro...@gmail.com wrote:
>>>> On Apr 26, 1:14 pm, "Rustom Mody" <rustompm...@gmail.com> wrote:
>>>> […]
>>> If this is an answer to my question I don't understand it!
>>
>> castironpi is either a bot or trolling. Just ignore its posts.
>>
>> Ciao,
>> Marc 'BlackJack' Rintsch
> I am a bot or trolling. Bots and bot detectors were the first forms
> of internet life, you know.

Jun 27 '08 #4
On Apr 27, 11:29 pm, "telus news" <thinkof...@yahoo.ca> wrote:
> It just so happens that I have a partially finished GUI file backup app. I have
> many backup CDs and I wanted to consolidate them. You know, all image files
> in one dir, all install files in another dir, etc. My app scans the input
> dir tree and displays all file extensions that it finds. You can then remove
> any extensions that you don't want backed up, and you can toggle to exclude
> the listed extensions. It also calculates min/max file sizes that you can
> adjust.

I guess what I am looking for is a merge app more than a backup app.

> Then the next page allows you to adjust the sub-dir depth with a slider,
> which displays the total number of files and total amount of memory they
> will take, for each sub-dir depth. You can also choose whether to enable versioning,
> whether to put all files into one dir or create a dir for each file
> type (extension), whether to actually back up the files, and whether to write all
> input pathnames and/or output pathnames to a file. Of course, it won't
> back up a dir tree as a copy; it can only flatten a dir tree,

That won't do for me.

> so if you
> back up a development source dir, all files will get put into the same dir
> and that wouldn't be good.
>
> I've also used py2exe to make it a drag-n-drop install. What do you think?

I'm working (mostly) on Linux.
Jun 27 '08 #5
On Sat, 26 Apr 2008 23:44:17 +0530, Rustom Mody wrote:
> Over the years I've collected tgz's of my directories. I would like to diff
> and uniq them.
>
> Now I guess it would be quite simple to write a script that does a walk
> or find through a pair of directory trees, makes a SHA1 of each file and
> then sorts out the files whose SHA1s are the same/different. What is
> more difficult for me to do is to write a visual/gui tool to help me do
> this.
>
> I would guess that someone in the python world must have already done it
> [The alternative is to use some of the tools that come with version
> control systems like git. But if I knew more about that option I would
> not be stuck with tgzs in the first place ;-)]
>
> So if there is such software known please let me know.
>
> PS: Also, with the spam flood that has hit the python list, I don't know if
> this mail is being read at all or I've fallen off the list!

It doesn't have a GUI, but here's a python program I wrote for dividing
large collections of files up into identical groups:

http://stromberg.dnsalias.org/~strom...e-classes.html
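
For readers who cannot reach the link, the general idea of splitting files into identical groups can be sketched as follows. This is not the linked program and makes no claim about how it works internally; it is a minimal illustration that compares file sizes first and then SHA1 digests, and the names file_digest and identical_groups are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identical_groups(roots):
    """Return sorted lists of paths with matching contents, across all roots."""
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                by_size[os.path.getsize(path)].append(path)
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Only files that share a size with at least one other file are hashed, which saves most of the I/O on large collections. Equal digests are treated as equal contents here; a byte-for-byte check with filecmp.cmp(a, b, shallow=False) could be added for a stricter comparison.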

Jun 27 '08 #6
On May 11, 2:44 pm, Dan Stromberg <dstrombergli...@gmail.com> wrote:
> On Sat, 26 Apr 2008 23:44:17 +0530, Rustom Mody wrote:
>> Over the years I've collected tgz's of my directories. I would like to diff
>> and uniq them.
>> Now I guess it would be quite simple to write a script that does a walk
>> or find through a pair of directory trees, makes a SHA1 of each file and
>> then sorts out the files whose SHA1s are the same/different. What is
>> more difficult for me to do is to write a visual/gui tool to help me do
>> this.
>> I would guess that someone in the python world must have already done it
>> [The alternative is to use some of the tools that come with version
>> control systems like git. But if I knew more about that option I would
>> not be stuck with tgzs in the first place ;-)]
>> So if there is such software known please let me know.
>> PS: Also, with the spam flood that has hit the python list, I don't know if
>> this mail is being read at all or I've fallen off the list!
>
> It doesn't have a GUI, but here's a python program I wrote for dividing
> large collections of files up into identical groups:
>
> http://stromberg.dnsalias.org/~strom...-classes.html

I want to question terminology!

Question 'identical' groups over related!

Jun 27 '08 #7
