Version of TAR in tarfile module? TAR 1.14 or 1.15 port to Windows?

I need to unpack, on a Windows 2000 machine,
some Wikipedia media .tar archives which were
created with TAR 1.14 (with support for long file
names and maybe some other features).
It seems that Python's tarfile module is able to list far
more files inside the archives than WinRAR, 7-Zip, or
Total Commander, but will it unpack all available files
(largest archive size 17 GByte)?

If tarfile is built on TAR 1.14 or TAR 1.15, it will
probably be OK, so my questions are:

What TAR version is built into the tarfile module?

Is there a TAR 1.14 or 1.15 port to Windows
available for download on the Internet (at which URL)?

Claudio

Aug 20 '05 #1
Claudio Grondi wrote:
> What TAR version is built into the tarfile module?
None: the tarfile module is not built on top of
GNU tar. Instead, it is a complete reimplementation.
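
To illustrate that tarfile handles the archive format itself and never calls an external tar binary, here is a minimal sketch. It assumes a current Python 3, not the 2005 release discussed in this thread; `roundtrip_long_name` and the sample file name are made up for illustration. It writes one member whose name exceeds the classic 100-character limit, using the GNU format, and reads it back:

```python
import io
import tarfile


def roundtrip_long_name(name, payload=b"demo"):
    """Write one member to an in-memory GNU-format tar and list it back."""
    buf = io.BytesIO()
    # GNU_FORMAT stores over-long names using GNU tar's long-name extension.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    # Reading happens entirely in Python; no tar executable is involved.
    with tarfile.open(fileobj=buf, mode="r:") as tar:
        return tar.getnames()


# Hypothetical member name, well over 100 characters long.
long_name = "images/" + "x" * 150 + ".png"
names = roundtrip_long_name(long_name)
```

Whether a given tarfile release reads every archive that a given GNU tar release writes is a separate question and has to be tested against the real archive.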
> Is there a TAR 1.14 or 1.15 port to Windows
> available in Internet for download (which URL)?


http://sources.redhat.com/cygwin/

Regards,
Martin
Aug 20 '05 #2
Martin, thank you for your response.

I see that I have to test for myself whether the tarfile
module can do what I need, so I did, and I
have evidence that the Python tarfile module
is not able to see all the files inside the TAR
archives created on Linux with TAR 1.14.
The Python tarfile module stops going through the
TAR archive somewhere near the beginning,
just as the other tools I have used did.
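
A check like the one described above could be sketched as follows. This is an assumption about how such a test might look, not the script actually used; `count_members` is a hypothetical helper written for a current Python 3. It walks the archive sequentially without extracting anything and reports how far it got:

```python
import tarfile


def count_members(path=None, fileobj=None):
    """Walk an archive sequentially; return (members_seen, error_or_None)."""
    seen = 0
    try:
        # "r|*" reads the archive as a forward-only stream, compressed or not.
        with tarfile.open(name=path, fileobj=fileobj, mode="r|*") as tar:
            for _member in tar:
                seen += 1
    except (tarfile.TarError, OSError) as exc:
        # A damaged or unreadable header surfaces here.
        return seen, exc
    return seen, None
```

A count that is lower than the number of entries GNU tar lists for the same archive, even with no error reported, would point to the reader stopping early.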

I do not understand what Cygwin and MinGW
(am I putting MinGW in the right context?) are and do,
but I gave Cygwin a try and installed it, also selecting
the tar package, version 1.15, for installation.

I tried to run tar.exe, but at first it was
not able to do anything for me. I didn't give up,
and after hours of trying to find the reason
I found that
tar.exe seems to have no access to any files
outside the
[%SystemDrive%\Cygwin\usr]
directory on my system drive e:, where Cygwin is
installed.
Is there a way around this? I have
to unpack a 17 GByte file and my system
drive has only 3 GByte of free storage space.
I have in mind that it could have something to do
with the mount command, but this is all I can
remember. I work in a Windows command shell
(DOS box) and mount says:
j: on /cygdrive/j
but I don't know how to write
the entire path
"j:\o\archives\images\dump.tar"
so that the file can be found by tar.exe and
unpacked to "i:\images".
Running
tar.exe --extract --directory=tmp -f j:/o/archives/images/dump.tar
results in:
/usr/bin/tar: j\:/o/archives/images/dump.tar: Cannot open: Input/Output error
and
tar.exe --extract --directory=tmp -f /cygdrive/j/o/archives/images/dump.tar
doesn't work either.
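
The disk-space worry mentioned above (a 17 GByte archive against 3 GByte free on the system drive) can be checked before extracting. The following is only a sketch for a current Python 3; `payload_size` and `fits_on_disk` are hypothetical helper names, and the size estimate counts regular files only:

```python
import shutil
import tarfile


def payload_size(path=None, fileobj=None):
    """Sum the sizes of regular files recorded in the archive headers."""
    with tarfile.open(name=path, fileobj=fileobj, mode="r|*") as tar:
        return sum(member.size for member in tar if member.isfile())


def fits_on_disk(required_bytes, target_dir):
    """Return True if the volume holding target_dir has enough free space."""
    return shutil.disk_usage(target_dir).free >= required_bytes
```

Note that `payload_size` has to read through the whole archive once, so on a file of this size it takes a while by itself.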

Claudio
Aug 20 '05 #3
Claudio Grondi wrote:
> remember. I work in a Windows command shell
> (DOS-box) and mount says:
> j: on /cygdrive/j , but I don't know how to write
> the entire path
> "j:\o\archives\images\dump.tar",
> so that the file can be found by tar.exe and
> unpacked to "i:\images" .
> tar.exe --extract --directory=tmp -f j:/o/archives/images/dump.tar
> results in:
> /usr/bin/tar: j\:/o/archives/images/dump.tar: Cannot open: Input/Output error
> telling
> tar.exe --extract --directory=tmp -f /cygdrive/j/o/archives/images/dump.tar
> doesn't work either.

Try the cygpath command like this:

echo `cygpath c:\\some\\windows\\path`

That should yield

/cygdrive/c/some/windows/path
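
The mapping shown above can also be sketched in Python. This is not how cygpath is implemented (the real tool consults Cygwin's mount table); `to_cygdrive` is a hypothetical helper that only covers the default /cygdrive prefix and drive-letter paths:

```python
import re


def to_cygdrive(windows_path, prefix="/cygdrive"):
    """Map a drive-letter Windows path to a /cygdrive-style POSIX path."""
    match = re.match(r"^([A-Za-z]):[\\/]*(.*)$", windows_path)
    if not match:
        raise ValueError("not a drive-letter path: %r" % windows_path)
    drive, rest = match.groups()
    # Cygwin uses forward slashes and a lower-case drive letter.
    rest = rest.replace("\\", "/")
    base = "%s/%s" % (prefix, drive.lower())
    return "%s/%s" % (base, rest) if rest else base
```

For example, `to_cygdrive(r"j:\o\archives\images\dump.tar")` gives `/cygdrive/j/o/archives/images/dump.tar`, which is the form tar expects when it is run from the Cygwin shell.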

Alternatively, do something like this:

mkdir -p /mnt/j

mount j: /mnt/j

Then /mnt/j should be the root for all files under j:

HTH Diez
Aug 20 '05 #4

Thank you both (Martin and Diez) for your help.

The 17 GByte TAR archive was unpacked
without problems the way you suggested.

Let's summarize:

# In my tests, Python's tarfile module could not extract all
files from newer TAR archives (created with tar 1.14 or later).

# The core of my problem was that I was not aware
how easy it is to install and work with Cygwin, and that to
get Cygwin's tar.exe to work it is necessary to use the
provided Cygwin bash-3.00 shell and NOT the Windows
command shell (DOS box).

The recipe for unpacking Wikipedia media files
provided as TAR archives on Microsoft
Windows is:

Step 1. Download http://sources.redhat.com/cygwin/setup.exe

Step 2a. Run the downloaded setup.exe, which goes online and
lets you choose which packages to install.

Step 2b. In addition to the suggested packages, select
the tar package, version 1.15, for installation.

Step 3. Use the Cygwin icon on the Desktop, or
Start > Programs > Cygwin > Cygwin Bash Shell, to start a Cygwin shell and type:

bash-3.00$ ../bin/tar.exe --extract --directory=/cygdrive/i/wikipedia/en/media -f /cygdrive/j/download.wikimedia.org/archives/images/en/20050530_upload.tar

where i: and j: are the drive letters of appropriate Windows drives.

The media files stored in the TAR archive
j:\download.wikimedia.org\archives\images\en\20050530_upload.tar
will be unpacked to the directory
i:\wikipedia\en\media

Step 4. Wait, wait, wait ... (how long depends mostly on the
speed of your hard drives; on my system with USB drives,
approx. one hour)

Step 5. BE HAPPY :))
and enjoy it, because you have mastered a step towards
your own static HTML version of Wikipedia.
The problems with extracting the content from the MySQL
database dumps will kill you soon. If they don't, especially
for non-English languages (like German, Polish, Russian)
and with math formulas converted to pictures,
all done on a Windows system, __PLEASE__ share your
know-how with me!!!
(The only useful information I found on the Internet about it
was the postings within the thread I started some time
ago myself:
http://www.pythonforum.org/ftopic194...ta_to_HTM.html
)
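
For readers who would rather retry the extraction in Step 3 from Python than from Cygwin, the equivalent call might look like the sketch below. It assumes a current Python 3; `extract_archive` is a hypothetical helper. This thread reports that tarfile failed on these archives in 2005, and whether a newer release handles them has not been tested here:

```python
import os
import tarfile


def extract_archive(target_dir, path=None, fileobj=None):
    """Extract all members into target_dir; return the number of members."""
    os.makedirs(target_dir, exist_ok=True)
    options = {}
    # Newer Python releases offer extraction filters; "data" refuses
    # members that would land outside target_dir. Older ones lack it.
    if hasattr(tarfile, "data_filter"):
        options["filter"] = "data"
    with tarfile.open(name=path, fileobj=fileobj, mode="r:*") as tar:
        tar.extractall(target_dir, **options)
        return len(tar.getnames())
```

With the paths from Step 3 this would be called as `extract_archive(r"i:\wikipedia\en\media", path=r"j:\download.wikimedia.org\archives\images\en\20050530_upload.tar")`, using ordinary Windows paths since no Cygwin layer is involved.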

Claudio
Aug 20 '05 #5
