tarfile...bug?

Hi,

I am trying to use the tarfile module to list the contents of a 'gz' file, but
it seems to hang for large files and CPU usage goes to 100%,
though 'tar -tvf' on the same file lists the contents in a couple of seconds.

Here is a test script that shows the problem; I am using
Python 2.4.3.

------------
import tarfile

bigFilePath = "/tmp/bigFile"
bigFileTGZ = "/tmp/big.tar.gz"

# create a big file (100 chunks of 6 MB each, about 600 MB)
print "Creating big file...", bigFilePath
f = open(bigFilePath, "w")
for i in xrange(100):
    f.write("anurag" * 1024 * 1024)
f.close()

# create a gzip-compressed tar archive from the big file
print "pack to...", bigFileTGZ
tar = tarfile.open(bigFileTGZ, "w:gz")
tar.add(bigFilePath, "bigFile")
tar.close()

# now try to list the contents of the archive
print "unpack...", bigFileTGZ
tar = tarfile.open(bigFileTGZ, "r")
tar.list()  # hangs
------------

rgds
Anurag

Oct 9 '07 #1
Hi,

Has anyone faced this problem? I assume it must be common if it can
be replicated so easily, or is something wrong with my system?

Also, if I use tar.members instead of tar.getmembers() it works,
so what is the difference between tar.members and tar.getmembers()?
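
One way to look at the difference is to compare the two on a small archive. The following is only a sketch under a few assumptions: it uses Python 3 syntax rather than the 2.4.3 from the script above, the file names and the temporary directory are made up for the demo, and `members` is not part of the documented tarfile API. In CPython's implementation, as far as I can tell, `members` is simply the list of entries read so far, while `getmembers()` first scans to the end of the archive.

```python
import io
import os
import tarfile
import tempfile

# Build a small three-member archive (names are hypothetical demo values)
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "demo.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    for name in ("a.txt", "b.txt", "c.txt"):
        payload = name.encode()
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

with tarfile.open(archive_path, "r") as tar:
    # Undocumented attribute: entries that have been read so far
    seen_before = len(tar.members)
    # Documented call: reads through the rest of the archive if needed
    all_members = tar.getmembers()
    seen_after = len(tar.members)

print(seen_before, len(all_members), seen_after)
```

If that reading is right, tar.members would return quickly because it does no scanning, but it may also be incomplete, so it is probably not a safe replacement for getmembers().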

rgds
Anurag

Oct 10 '07 #2
On Oct 9, 10:33 pm, Anurag <anuraguni...@gmail.com> wrote:
> Hi,
>
> Has anyone faced this problem? I assume it must be common if it can
> be replicated so easily, or is something wrong with my system?
>
> Also, if I use tar.members instead of tar.getmembers() it works,
> so what is the difference between tar.members and tar.getmembers()?
>
> rgds
> Anurag

If you are not fully dependent on tar files, have a look at the zipfile
library in Python. Every time I start to use the tarfile library, the
zipfile library turns out to be a better solution.

Cheers,

--Alan

Oct 12 '07 #3
al**********@gmail.com wrote:
> On Oct 9, 10:33 pm, Anurag <anuraguni...@gmail.com> wrote:
>> Has anyone faced this problem? I assume it must be common if it can
>> be replicated so easily, or is something wrong with my system?
>>
>> Also, if I use tar.members instead of tar.getmembers() it works,
>> so what is the difference between tar.members and tar.getmembers()?
>
> If you are not fully dependent on tar files, have a look at the zipfile
> library in Python. Every time I start to use the tarfile library, the
> zipfile library turns out to be a better solution.

And here's why:
The tar-gzip format (sometimes .tar.gz, sometimes .tgz) is defined by
taking a fully expanded archive (a tar archive) and compressing it
_as_a_whole_ with gzip compression. It is not possible to see the
last bytes of the archive without decompressing _all_ of the data before them.
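
A minimal sketch of what that means in practice, assuming Python 3 syntax (the original script targets 2.4.3) and made-up file names: opening the archive in the documented stream mode "r|gz" makes the front-to-back reading explicit, since members can only be discovered in order as the gzip data is decompressed. The helper name list_tgz_streaming is hypothetical.

```python
import io
import os
import tarfile
import tempfile


def list_tgz_streaming(path):
    """Return (name, size) pairs by reading the .tar.gz front to back."""
    entries = []
    # "r|gz" treats the file as a non-seekable gzip stream, so each header
    # is reached only after the data before it has been decompressed.
    with tarfile.open(path, "r|gz") as tar:
        for member in tar:
            entries.append((member.name, member.size))
    return entries


# Build a small demo archive in a temporary directory
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "demo.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    for name, payload in [("first.txt", b"a" * 10), ("last.txt", b"b" * 20)]:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

listing = list_tgz_streaming(archive_path)
print(listing)
```

Reaching last.txt here requires decompressing everything stored ahead of it, which is the cost described above; on a large archive that cost grows with the archive size.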

The zip format compresses the contained files individually and keeps
a separate directory, so it can expand only the file you want, whether
it is at the beginning or the end of the zip file. This is also one
of the reasons the .zip format gets less compression than the .tgz
format: each file in the zip is compressed separately, so redundancy
between files is not compressed out.
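
For comparison, here is a rough sketch of the same idea with the zipfile module, again assuming Python 3 syntax and made-up file names. The listing comes from the zip directory, and only the one requested member is decompressed.

```python
import os
import tempfile
import zipfile

# Build a small demo zip in a temporary directory
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "demo.zip")
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("first.txt", "a" * 1000)
    zf.writestr("last.txt", "b" * 2000)

with zipfile.ZipFile(zip_path) as zf:
    # Names and sizes are taken from the directory; nothing is decompressed
    names = zf.namelist()
    sizes = {info.filename: info.file_size for info in zf.infolist()}
    # read() decompresses just this member, even though it is stored last
    last = zf.read("last.txt")

print(names, sizes, len(last))
```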

-Scott David Daniels
Sc***********@Acm.Org
Oct 13 '07 #4
