Script to automate extraction of file from compressed archives

I am a Perl newbie who is trying to write a script to automate a
task.

I have a large collection of compressed archives (mostly .tar.gz,
.tar.bz2, .tar.Z, .tgz etc.). These are stored in a number of directories
and sub-directories.

I am looking for a script that will recursively extract a single file
from each of these archives e.g. the file INSTALL, for the extracted
file to be moved to a different location and renamed to the name of
the archive itself, but keeping the same directory structure;

e.g.

Suppose I have archive files x.tar.gz, y.tar.gz, and x.tar.Z in
/home/peter/a/

and in /home/peter/a/b/ files ab.tar.gz, b.tar.bz2, c.tgz

I would like the script to recursively extract the INSTALL file from
each of these archives, copy it to /tmp, and rename it to the name of
the archive, so that in

/tmp/a/ there will be files named x.tar.gz, y.tar.gz, and x.tar.Z
(which are just the relevant INSTALL files), and in /tmp/a/b/ files
ab.tar.gz, b.tar.bz2, c.tgz (again, these files are just the INSTALL
files).

I appreciate that tar -zf name.tar.gz -x <file name> extracts just a
file, but it creates directories etc., which means the above is
unworkable.

Would be really grateful for any help you can give. Please bear in
mind that I am not very technically minded.

Thanks,
Peter
pe****************@fastmail.fm

May 7 '06 #1


Peter Thorne wrote:
I am looking for a script that will recursively extract a single file
from each of these archives e.g. the file INSTALL, for the extracted
file to be moved to a different location and renamed to the name of
the archive itself, but keeping the same directory structure;


This should be quite straightforward, especially if you intend to use an
external program like `tar' to unarchive your files.

You should need only one function that accepts as its sole argument a
directory name. This function is recursive, in that it calls itself,
passing along the name of whatever subdirectory it's currently looking
at. This function is just called from one starting point in your
program, and should be passed the initial directory name (perhaps from a
command-line switch).

The code might look like:

#!/usr/bin/perl
use strict;
use warnings;

my $tar_path = '/bin/tar'; # Path to the tar program
my $starting_dir = shift;  # Starting point for extraction
die "Usage: $0 <starting directory>\n" unless defined $starting_dir;

extract_dir($starting_dir);

exit;

sub extract_dir {
    my $current_dir = shift; # Current directory level

    # Get a handle for this directory and read its entries
    opendir my $DIR, $current_dir
        or die "Cannot open directory $current_dir: $!\n";
    my @entries = readdir $DIR;
    closedir $DIR;

    # Go through each entry in this directory
    for my $filename (@entries) {
        # Skip the current and parent directory entries
        next if $filename eq '.' or $filename eq '..';

        my $path = "$current_dir/$filename";

        # Skip symlinks so a cyclical link cannot cause endless recursion
        next if -l $path;

        # Check to see if this is a regular file
        if (-f $path) {
            # Extract the `INSTALL' file (it lands in the working directory)
            system $tar_path, '-xf', $path, 'INSTALL';
        }
        # Check to see if this is a subdirectory
        elsif (-d $path) {
            # Go through this directory
            extract_dir($path);
        }
    }

    return;
}

The code skips the `.' and `..' entries and any symlinks, so the
recursion cannot loop back on itself. You'll still want to add logic to
make sure that a regular file is an archive you actually want to
extract. Given your cited example, you'll probably also want to check
the type of archive in order to pass `tar' the appropriate arguments
(my example only works on a basic uncompressed archive). Note too that
`tar' extracts into the current working directory, so the INSTALL files
still have to be moved and renamed afterwards.
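As a rough sketch of the archive-type check, and of writing each INSTALL file under a mirrored path as the question asks, something like the following might serve as a starting point. The helper names (tar_flag_for, dest_path, extract_member) are made up for illustration, the suffix list covers only the archive types named in the question, and `-O' (write the member to standard output instead of to disk) assumes a tar that supports it, such as GNU tar. It also assumes the member is stored in the archive as exactly `INSTALL'; many archives store it under a top-level directory, in which case that full stored name has to be passed instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: map an archive name to the tar decompression flag.
# Returns '' for a plain .tar and nothing (undef) for unrecognised names.
sub tar_flag_for {
    my ($name) = @_;
    return '-z' if $name =~ /\.(?:tar\.gz|tgz)\z/;
    return '-j' if $name =~ /\.tar\.bz2\z/;
    return '-Z' if $name =~ /\.tar\.Z\z/;
    return ''   if $name =~ /\.tar\z/;
    return;
}

# Hypothetical helper: mirror an archive's path under another root, e.g.
# /home/peter/a/x.tar.gz with roots /home/peter and /tmp gives
# /tmp/a/x.tar.gz.
sub dest_path {
    my ($src_root, $dest_root, $archive) = @_;
    my $relative = File::Spec->abs2rel($archive, $src_root);
    return File::Spec->catfile($dest_root, $relative);
}

# Hypothetical helper: read one member through `tar -x -O` and save it as
# $dest. Returns 1 on success, 0 if the archive type is unknown or tar
# reports a failure (for example, when the member is not in the archive).
sub extract_member {
    my ($archive, $member, $dest) = @_;
    my $flag = tar_flag_for($archive);
    return 0 unless defined $flag;

    my @cmd = ('tar', '-x', '-O', (length($flag) ? $flag : ()),
               '-f', $archive, $member);
    open my $tar, '-|', @cmd or die "Cannot run tar: $!\n";
    binmode $tar;
    my $content = do { local $/; <$tar> };
    $content = '' unless defined $content;

    # close() returns false when tar exits with a non-zero status
    return 0 unless close $tar;

    make_path(dirname($dest));
    open my $out, '>', $dest or die "Cannot write $dest: $!\n";
    binmode $out;
    print {$out} $content;
    close $out or die "Cannot close $dest: $!\n";
    return 1;
}

# Example use inside the directory walk (paths taken from the question):
#   my $dest = dest_path('/home/peter', '/tmp', '/home/peter/a/x.tar.gz');
#   extract_member('/home/peter/a/x.tar.gz', 'INSTALL', $dest);
```

Reading the member from standard output sidesteps the problem of `tar' creating directories, since nothing is unpacked to disk except the one file the script writes itself.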

Hope this gives some direction.

- Michael Wehner
May 21 '06 #2
