Tarfile .bz2

When using python to create a tar.bz2 archive, and then using winrar to
open the archive, it can't tell what the compressed size of each
individual file in the archive is. Is this an issue with winrar or is
there something that needs to be set when making the archive that isn't
there by default.

example archive:

import os
import tarfile

# somepath is the directory to archive; it is assumed to be defined elsewhere.
with tarfile.open("archive.tar.bz2", "w:bz2") as archive:
    for thing in os.listdir(somepath):
        # os.path.join removes the need for somepath to end in "\\"
        nthing = os.path.join(somepath, thing)
        if os.path.isfile(nthing):
            info = archive.gettarinfo(nthing)
            with open(nthing, "rb") as f:  # file() no longer exists in Python 3
                archive.addfile(info, f)
---

Thanks,
Jordan

Dec 11 '06 #1
Jordan wrote:
When using python to create a tar.bz2 archive, and then using winrar to
open the archive, it can't tell what the compressed size of each
individual file in the archive is. Is this an issue with winrar or is
there something that needs to be set when making the archive that isn't
there by default.
I believe it's a property of the file format (tar.bz2) rather than a bug
in either tool. You don't compress individual files; you compress the
entire tar file. So it is not meaningful to talk about the compressed
size of an individual archive member: inside the tar, they are all
stored uncompressed.

Regards,
Martin
Dec 11 '06 #2
Jordan wrote:
When using python to create a tar.bz2 archive, and then using
winrar to open the archive, it can't tell what the compressed
size of each
individual file in the archive is. Is this an issue with
winrar or is there something that needs to be set when making
the archive that isn't there by default.
When compressing a tar archive, all files in the archive are
compressed as a whole, i.e. you can only state a compression
ratio for the whole archive and not for a single file.

Technically, a tar.bz2 is an aggregation of multiple files
into a single tar file, which is then compressed.

This is different to e.g. PKZip in which each file is compressed
individually and the compressed files are then merged into an
archive.

The first method has better compression ratio, since redundancies
among files are compressed, too, whereas the latter is better if
you need random access to the individual files.
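
The difference can be seen from Python itself. The following is only a minimal sketch, assuming two made-up in-memory files (`a.txt` and `b.txt` are hypothetical names): a tar member records just its uncompressed size, while a zip member records both sizes, which is the figure WinRAR has nothing to show for a tar.bz2.

```python
import io
import tarfile
import zipfile

# Two hypothetical members with redundant content (illustration only).
files = {"a.txt": b"hello world\n" * 500, "b.txt": b"hello world\n" * 500}

# tar.bz2: members are stored as-is in one tar stream, and bz2 then
# compresses that whole stream.
tar_buf = io.BytesIO()
with tarfile.open(fileobj=tar_buf, mode="w:bz2") as tar:
    for name, data in files.items():
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

tar_buf.seek(0)
with tarfile.open(fileobj=tar_buf, mode="r:bz2") as tar:
    # A TarInfo only knows the uncompressed size of its member.
    tar_sizes = {m.name: m.size for m in tar.getmembers()}

# zip: every member is compressed on its own, so each entry carries
# both an uncompressed and a compressed size.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
    for name, data in files.items():
        zf.writestr(name, data)

with zipfile.ZipFile(zip_buf) as zf:
    zip_sizes = {i.filename: (i.file_size, i.compress_size) for i in zf.infolist()}

print(tar_sizes)
print(zip_sizes)
```

The tar listing has no per-member compressed size to report at all; only the size of the finished archive reflects the compression.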

Wolfgang Draxinger
--
E-Mail address works, Jabber: he******@jabber.org, ICQ: 134682867

Dec 11 '06 #3
So that would explain why a tar.bz2 archive can't be appended to,
wouldn't it? It would also explain why winrar was so slow to open it (not
something I mentioned before, but definitely noticed). I had wondered
what it was that made bz2 so much better at compression than zip and
rar. Not really on topic anymore, but what's the method for tar.gz? And
even more off the topic, does anyone know a good lossless compression
method for images (mainly .jpg and .png)?
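
On the appending point, here is a rough sketch of what the tarfile module does, assuming small in-memory archives and made-up member names (`first.txt`, `second.txt`, and the helper `make_member` are hypothetical). Append mode combined with bz2 is rejected, so the usual workaround is to copy the members into a new archive and add the extra file there.

```python
import io
import tarfile

def make_member(name, data):
    """Hypothetical helper: build a (TarInfo, file object) pair from bytes."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    return info, io.BytesIO(data)

# Build a small tar.bz2 in memory.
original = io.BytesIO()
with tarfile.open(fileobj=original, mode="w:bz2") as tar:
    tar.addfile(*make_member("first.txt", b"first file\n"))

# Appending to a compressed archive is refused by tarfile.
append_error = None
try:
    tarfile.open(fileobj=original, mode="a:bz2")
except (ValueError, tarfile.TarError) as exc:
    append_error = exc

# Workaround: rewrite the archive, copying old members and adding the new one.
original.seek(0)
rebuilt = io.BytesIO()
with tarfile.open(fileobj=original, mode="r:bz2") as src, \
        tarfile.open(fileobj=rebuilt, mode="w:bz2") as dst:
    for member in src.getmembers():
        dst.addfile(member, src.extractfile(member))
    dst.addfile(*make_member("second.txt", b"second file\n"))

rebuilt.seek(0)
with tarfile.open(fileobj=rebuilt, mode="r:bz2") as tar:
    names = tar.getnames()

print(append_error)
print(names)
```

Rewriting costs a full decompress and recompress, which is the price of compressing the archive as one stream.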

Cheers,
Jordan
Dec 12 '06 #4
Jordan wrote:
Not really on topic anymore but what's the method for tar.gz?
It works like .tar.bz2, except that it uses gzip (www.gzip.org)
as the compression library. The underlying compression algorithm
is LZW.
And
even more off the topic, does anyone know a good lossless compression
method for images (mainly .jpg and .png)?
Well, .jpg files are already compressed in a lossy way (.jpg is
inherently lossy); to compress it further, you need to increase
the loss. PNG is also compressed already, see

http://www.mywebsite.force9.co.uk/png/

PNG compresses its image data with zlib, which implements the same
algorithm that gzip uses. Perhaps you should read the comp.compression
FAQ:

http://www.faqs.org/faqs/compression-faq/
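
To illustrate why data that is already compressed rarely shrinks further, here is a small sketch. It does not use real image files: as an assumption for the example, a zlib-compressed block of generated text stands in for PNG-style data, and the word list is made up.

```python
import bz2
import random
import zlib

# Generate compressible text from a small, made-up vocabulary.
# The fixed seed only keeps the example repeatable.
rng = random.Random(0)
vocab = ["tar", "bz2", "gzip", "png", "jpeg", "zip", "rar", "archive"]
text = " ".join(rng.choice(vocab) for _ in range(20000)).encode("ascii")

# First pass: zlib, the same library PNG relies on.
once = zlib.compress(text, 9)

# Second pass: try to squeeze the already-compressed bytes with bz2.
twice = bz2.compress(once)

print("original :", len(text))
print("zlib     :", len(once))
print("zlib+bz2 :", len(twice))
```

The first pass removes most of the redundancy, so the second pass has very little left to work with.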

Regards,
Martin
Dec 12 '06 #5
Jordan wrote:
So that would explain why a tar.bz2 archive can't be appended to
wouldn't it... And also explain why winrar was so slow to open it (not
something I mentioned before, but definitely noticed). I had wondered
what it was that made bz2 so much better at compression than zip and
rar. Not really on topic anymore but what's the method for tar.gz? And
even more off the topic, does anyone know a good lossless compression
method for images (mainly .jpg and .png)?
You can get the same effect from RAR and other formats (ACE, 7z) by
using the "Solid Archive" or similar option. Ideally, you'd be
compressing lots of similar files for this to be effective. The actual
compression ratios of RAR and bz2 can be pretty similar when done this way.

Dec 12 '06 #6
Martin v. Löwis wrote:
Well, .jpg files are already compressed in a lossy way (.jpg is
inherently lossy); to compress it further, you need to increase
the loss. PNG is also compressed already, see
Not really. StuffIt has a JPEG compressor which takes advantage of the
fact that the JPEG encoding isn't as efficient as it could be. It converts
JPEG images to its own more compact representation, which can then be
converted back to JPEG as needed, without any loss. It is sadly not free.
Dec 12 '06 #7
Martin v. Löwis wrote:
Well, .jpg files are already compressed in a lossy way (.jpg is
inherently lossy); to compress it further, you need to increase
the loss.
or use a better algorithm, such as JPEG 2000 or Microsoft's HD Photo,
which both give better visual quality at lower bit rates (which means
that you can often get by with around half the bits compared to JPEG).

</F>

Dec 12 '06 #8
"Martin v. Löwis" <ma****@v.loewis.de> wrote:
Jordan wrote:
Not really on topic anymore but what's the method for tar.gz?
It works like .tar.bz2, except that it uses gzip (www.gzip.org)
as the compression library. The underlying compression algorithm
is LZW.
No, it uses a compression algorithm based on LZ77 (called DEFLATE).
Therefore gzip was not encumbered by the LZW patent.
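
This can be checked from Python's standard library. The snippet below is just a sketch with made-up sample data: it compresses with the gzip module and then decompresses the result with zlib, which only works because the payload inside the gzip wrapper is a DEFLATE stream.

```python
import gzip
import zlib

# Made-up sample data for the illustration.
data = b"tar.gz and tar.bz2 differ only in the compressor\n" * 100

gz_bytes = gzip.compress(data)

# wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
# wrapped around the DEFLATE stream.
via_zlib = zlib.decompress(gz_bytes, 16 + zlib.MAX_WBITS)

# A gzip file starts with the magic bytes 1f 8b; the third byte is the
# compression method, where 8 means DEFLATE.
magic, method = gz_bytes[:2], gz_bytes[2]

print(via_zlib == data, magic.hex(), method)
```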
--
Piet van Oostrum <pi**@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: pi**@vanoostrum.org
Dec 13 '06 #9
