tarfile...bug?

Hi,

I am trying to use the tarfile module to list the contents of a 'gz' file, but
it seems to hang for large files, with CPU usage at 100%, even though
'tar -tvf' on the same file lists the contents in a couple of seconds.

Here is a test script that shows the problem; I am using
Python 2.4.3.

------------
import tarfile

bigFilePath = "/tmp/bigFile"
bigFileTGZ = "/tmp/big.tar.gz"

# create a big file
print "Creating big file...",bigFilePath
f = open(bigFilePath,"w")
for i in xrange(100):
    f.write("anurag"*1024*1024)
f.close()

#create a tarfile from big file
print "pack to...",bigFileTGZ
tar = tarfile.open(bigFileTGZ, "w:gz")
tar.add(bigFilePath,"bigFile")
tar.close()

print "unpack...",bigFileTGZ
# now try to list contents of tar
tar = tarfile.open(bigFileTGZ, "r")
tar.list() #hangs
------------

rgds
Anurag

Oct 9 '07 #1
Hi,

Has anyone faced such a problem? I assume it must be common if it can
be replicated so easily, or is something wrong with my system?

Also, if I use tar.members instead of tar.getmembers() it works,
so what is the difference between tar.members and tar.getmembers()?
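
On the difference between the two: `members` is an undocumented attribute, and in current CPython sources it appears to be only the list of headers read so far, while `getmembers()` first reads on to the end of the archive. The sketch below is written for Python 3 (not the 2.4.3 mentioned above), and the archive and file names are made up for illustration.

```python
import io
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    archive = os.path.join(workdir, "demo.tar.gz")  # hypothetical file name

    # Build a small archive with three members.
    with tarfile.open(archive, "w:gz") as tar:
        for name in ("a.txt", "b.txt", "c.txt"):
            payload = name.encode("ascii")
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

    with tarfile.open(archive, "r") as tar:
        # Headers cached so far; opening does not scan the whole archive.
        cached_after_open = len(tar.members)
        # getmembers() has to walk every header up to the end of the archive.
        all_names = [m.name for m in tar.getmembers()]
        cached_after_scan = len(tar.members)

print(cached_after_open, cached_after_scan, all_names)
```

If that reading is right, `tar.members` returns quickly simply because it does no scanning, so it may be incomplete right after `open()`.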

rgds
Anurag

Oct 10 '07 #2
On Oct 9, 10:33 pm, Anurag <anuraguni...@gmail.com> wrote:
> Hi,
>
> Has anyone faced such a problem? I assume it must be common if it can
> be replicated so easily, or is something wrong with my system?
>
> Also, if I use tar.members instead of tar.getmembers() it works,
> so what is the difference between tar.members and tar.getmembers()?
>
> rgds
> Anurag

If you are not fully dependent on tar files, have a look at the zipfile
library in Python. Every time I start to use the tarfile library, the
zipfile library turns out to be a better solution.

Cheers,

--Alan

Oct 12 '07 #3
al**********@gmail.com wrote:
> On Oct 9, 10:33 pm, Anurag <anuraguni...@gmail.com> wrote:
>> Has anyone faced such a problem? I assume it must be common if it can
>> be replicated so easily, or is something wrong with my system?
>>
>> Also, if I use tar.members instead of tar.getmembers() it works,
>> so what is the difference between tar.members and tar.getmembers()?
>
> If you are not fully dependent on tar files, have a look at the zipfile
> library in Python. Every time I start to use the tarfile library, the
> zipfile library turns out to be a better solution.

And here's why:
The tar-gzip format (sometimes .tar.gz, sometimes .tgz) is defined by
taking a fully expanded tar archive and compressing it
_as_a_whole_ with gzip. A tar archive has no central directory: each
member's header sits just before its data. So it is not possible to see
the last entries of the .tgz file without uncompressing _all_ of the file.
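
A rough way to see this for yourself, assuming Python 3: wrap the compressed file in a small byte counter (`CountingReader` is a made-up helper, not part of tarfile) and list the archive in tarfile's sequential "r|gz" mode. The member names and sizes are invented for the demonstration.

```python
import io
import os
import tarfile
import tempfile


class CountingReader:
    """Hypothetical helper: counts bytes read from a binary file."""

    def __init__(self, raw):
        self.raw = raw
        self.count = 0

    def read(self, size=-1):
        data = self.raw.read(size)
        self.count += len(data)
        return data


def add_bytes(tar, name, payload):
    """Add an in-memory payload to an open tar archive."""
    info = tarfile.TarInfo(name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))


with tempfile.TemporaryDirectory() as workdir:
    archive = os.path.join(workdir, "demo.tar.gz")

    # A large incompressible member first, a tiny member last.
    with tarfile.open(archive, "w:gz") as tar:
        add_bytes(tar, "big.bin", os.urandom(200 * 1024))
        add_bytes(tar, "last.txt", b"hello")

    total = os.path.getsize(archive)
    with open(archive, "rb") as raw:
        counter = CountingReader(raw)
        # "r|gz" reads the gzip stream strictly front to back.
        with tarfile.open(fileobj=counter, mode="r|gz") as tar:
            names = [member.name for member in tar]
        consumed = counter.count

print(names, consumed, total)
```

Reaching the header of last.txt means nearly the whole compressed file has been consumed, even though nothing was extracted.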

The zip format compresses the contained files individually, and keeps
a separate directory. So it can expand only the file you want whether
it is at the beginning or the end of the zip file. This is also (one
of) the reason(s) the .zip format gets less compression than the .tgz
format. Each file in the zip is separately compressed, so redundancy
between files is not compressed out.
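
Both points can be checked with a small experiment; this is only a sketch, assuming Python 3, with a made-up `CountingFile` wrapper and invented member names. It stores the same random 4 KB payload fifty times in a zip and in a .tar.gz, then compares how many bytes a zip listing reads and how large each archive ends up.

```python
import io
import os
import tarfile
import tempfile
import zipfile


class CountingFile:
    """Hypothetical helper: counts bytes returned by read()."""

    def __init__(self, raw):
        self.raw = raw
        self.count = 0

    def read(self, size=-1):
        data = self.raw.read(size)
        self.count += len(data)
        return data

    def __getattr__(self, name):
        # Delegate seek(), tell(), seekable() and so on to the real file.
        return getattr(self.raw, name)


payload = os.urandom(4096)  # incompressible on its own, redundant across files
names = ["copy%02d.bin" % i for i in range(50)]

with tempfile.TemporaryDirectory() as workdir:
    zip_path = os.path.join(workdir, "demo.zip")
    tgz_path = os.path.join(workdir, "demo.tar.gz")

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            zf.writestr(name, payload)

    with tarfile.open(tgz_path, "w:gz") as tar:
        for name in names:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

    zip_size = os.path.getsize(zip_path)
    tgz_size = os.path.getsize(tgz_path)

    with open(zip_path, "rb") as raw:
        counter = CountingFile(raw)
        with zipfile.ZipFile(counter) as zf:
            listed = zf.namelist()
            listing_bytes = counter.count   # bytes read just to get the listing
            last = zf.read(names[-1])       # expand only the final member

print(listing_bytes, zip_size, tgz_size)
```

The listing should touch only a small fraction of the zip, while the .tar.gz should come out much smaller because gzip sees the repetition between members.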

-Scott David Daniels
Sc***********@Acm.Org
Oct 13 '07 #4
