Predicting space available for a ZIP file

bir
I am creating a zip file in, say, C:\Archive (the location could also be a network drive), and I need to check whether there is enough space for the zipped file to fit in C:\Archive.

Currently, I first create the zip file in the same location, C:\Archive, and then compare the size of the file with the available disk space.

But the problem is that by then I have already written the file to that location (C:\Archive). Is there any alternative?
Feb 26 '07 #1
@bir

In general (exceptions may exist), the size of a ZIP archive is not predictable. So it seems that you have to create your ZIP archive anyway.

My advice:

You should create the ZIP archive in some temporary space that is known to be large enough to hold it. Then determine the file size of the archive via stat() and decide whether it fits into your target directory.
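
The advice above could be sketched roughly as follows. This is written in Python, not in any language named so far in the thread, and the helper name `zip_then_check_fit` and its parameters are made up for illustration. It builds the archive in a temporary directory, reads its size with `os.stat()`, and moves it only if the target volume reports enough free space.

```python
import os
import shutil
import tempfile
import zipfile


def zip_then_check_fit(files, target_dir, archive_name="archive.zip"):
    """Build the archive in temp space, then move it only if it fits.

    Returns the final path, or None when target_dir lacks free space.
    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = os.path.join(tmp_dir, archive_name)
        with zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for path in files:
                zf.write(path, arcname=os.path.basename(path))

        archive_size = os.stat(tmp_path).st_size         # size via stat()
        free_bytes = shutil.disk_usage(target_dir).free  # space on the target

        if archive_size > free_bytes:
            return None  # does not fit; the temp copy is cleaned up
        return shutil.move(tmp_path, os.path.join(target_dir, archive_name))
```

This only helps when the temporary directory is on a different volume than the target, and the free space can still change between the check and the move.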

Greetz, Doc
Feb 26 '07 #2
bir
Doc

That's what I have done: only after creating the zip file do I check the disk space. Aren't there any other solutions available?
Feb 26 '07 #3
@Bir
I guess not! But let's see what we can do ...

The size of the resulting ZIP archive cannot be determined analytically in advance.

This means that the algorithm that builds the ZIP archive has to run in some form in order to obtain the size.

So, if you do not want to write the archive to disk before its size is known, the algorithm could be run in memory, given the constraints you describe.

A possible solution I can think of is as follows (assuming it is performed on a UNIX system):
zip - <file list> | wc -c
This will build the archive (the option - forces the output to be written to STDOUT instead of to a file) and pipe it to the word count program (wc), which counts the piped bytes (option -c).

With this solution, no data is written to disk at all!

This solution also works for huge archives, because of the nature of a pipe: the data is processed in small chunks (typically 4 KB at a time) and does not accumulate in memory.

If you are not running a UNIX system, try to map the above approach to your system capabilities.
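
One way to map the pipe idea onto a system without `zip` and `wc` is a write-only sink that counts bytes and throws them away. The sketch below uses Python's standard `zipfile` module instead of the command line; `ByteCounter` and `streamed_zip_size` are hypothetical names chosen for illustration. Because the sink offers no `tell()` or `seek()`, `zipfile` writes in streaming mode, so nothing is stored on disk or accumulated in memory.

```python
import zipfile


class ByteCounter:
    """Write-only sink that counts bytes and discards them, like `wc -c`."""

    def __init__(self):
        self.count = 0

    def write(self, data):
        self.count += len(data)
        return len(data)

    def flush(self):
        pass  # nothing is buffered, so there is nothing to flush


def streamed_zip_size(paths):
    """Run the ZIP algorithm over `paths` and return the bytes produced."""
    sink = ByteCounter()
    # The sink has no tell()/seek(), so zipfile treats it as non-seekable.
    with zipfile.ZipFile(sink, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            zf.write(path)
    return sink.count
```

In streaming mode `zipfile` appends a small data descriptor after each member, so the count can be slightly larger than the size of the same archive written to a regular, seekable file. As an estimate for a free-space check, that errs on the safe side.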

Greetz, Doc
Feb 26 '07 #4
bir
Hi Doc

I am facing a problem running this script on Windows. The script calculates the compressed file size before the archive is written to disk. I am using the devnull() function to point to a null device, and that is causing a problem: I am not able to calculate the compressed size properly, and some messages are also coming up, as shown below.

We are running ActiveState Perl v5.8.8 for Windows.

Any suggestions please?


My script is as follows:
# Example of how to compute compressed sizes
# $Revision: 1.2 $
use strict;
use Archive::Zip qw(:ERROR_CODES);
use File::Spec;
my $zip = Archive::Zip->new();
my $blackHoleDevice = File::Spec->devnull();

$zip->addFile($_) foreach (<*.pl>);

# Write and throw the data away.
# after members are written, the writeOffset will be set
# to the compressed size.
$zip->writeToFileNamed($blackHoleDevice);

my $totalSize = 0;
my $totalCompressedSize = 0;
foreach my $member ($zip->members())
{
    $totalSize += $member->uncompressedSize;
    $totalCompressedSize += $member->_writeOffset;
    print "Member ", $member->externalFileName,
        " size=", $member->uncompressedSize,
        ", writeOffset=", $member->_writeOffset,
        ", compressed=", $member->compressedSize,
        "\n";
}

print "Total Size=", $totalSize, ", total compressed=", $totalCompressedSize, "\n";

$zip->writeToFileNamed('test.zip');



Output of the script
IO error: seeking to rewrite local header : Invalid seek
 at C:/Perl/site/lib/Archive/Zip/Member.pm line 623
    Archive::Zip::Member::_refreshLocalFileHeader('Archive::Zip::NewFileMember=HASH(0x1bd5ce0)', 'IO::File=GLOB(0x1bd5bb4)') called at C:/Perl/site/lib/Archive/Zip/Member.pm line 909
    Archive::Zip::Member::_writeToFileHandle('Archive::Zip::NewFileMember=HASH(0x1bd5ce0)', 'IO::File=GLOB(0x1bd5bb4)', 1, 0) called at C:/Perl/site/lib/Archive/Zip/Archive.pm line 280
    Archive::Zip::Archive::writeToFileHandle('Archive::Zip::Archive=HASH(0x225550)', 'IO::File=GLOB(0x1bd5bb4)', 1) called at C:/Perl/site/lib/Archive/Zip/Archive.pm line 257
    Archive::Zip::Archive::writeToFileNamed('Archive::Zip::Archive=HASH(0x225550)', 'nul') called at calcSizes.pl line 14
Member calcSizes.pl size=876, writeOffset=417, compressed=417
Member copy.pl size=451, writeOffset=, compressed=451
Member extract.pl size=862, writeOffset=, compressed=862
Member mailZip.pl size=1572, writeOffset=, compressed=1572
Member mfh.pl size=619, writeOffset=, compressed=619
Member readScalar.pl size=752, writeOffset=, compressed=752
Feb 27 '07 #5
@bir

I could not run your code because I do not have Zip.pm installed, so I can only offer guesses.

It appears that the device returned by devnull() may not be seekable (like a pipe). I assume that writeToFileNamed() tries to seek within the given file; the message "seeking to rewrite local header" suggests that it seeks back after writing a member in order to update that member's header.
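
Whether a handle can be repositioned at all can be checked before handing it to an archive writer. A small sketch in Python (not the Perl used in the script above) that contrasts a pipe with a regular file; how the Windows `nul` device reports itself is not shown here and would need to be tested on that system.

```python
import os
import tempfile

# A pipe, like the one in `zip - ... | wc -c`, cannot be repositioned.
read_fd, write_fd = os.pipe()
with os.fdopen(read_fd, "rb") as reader, os.fdopen(write_fd, "wb") as writer:
    pipe_is_seekable = writer.seekable()

# A regular file on disk can.
with tempfile.TemporaryFile() as regular_file:
    file_is_seekable = regular_file.seekable()

print("pipe seekable:", pipe_is_seekable)
print("file seekable:", file_is_seekable)
```

A writer that needs to go back and patch a header it has already written can only do so on the second kind of handle.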

Have you read about the devnull() approach on the internet? Is there somebody out there who can verify that it should work?

Sorry, can't tell you more.

Greetz, Doc
Feb 27 '07 #6
bir
@DOC

I am now able to get the compressed sizes of the zip file members by simply writing the archive to devnull() through a file handle, and the earlier messages no longer appear. But now I am facing another problem: the sum of the compressed sizes of the members (the files to be zipped) is slightly different from the size of the zip file.

And this difference varies according to the volume of files to be zipped.

1. Why is there a size difference, and how can I calculate it?

OR

2. Earlier I was getting the size of the zip file with the -s file test, and now I am not writing it to disk but to a handle that points to the null device. How can I get the size from the handle?

The output now is:
Member Archive::Zip::NewFileMember=HASH(0x1a4c298) size=719, compressed=367
Member Archive::Zip::NewFileMember=HASH(0x1ceafdc) size=2079, compressed=383
Member Archive::Zip::NewFileMember=HASH(0x1ceb1bc) size=719, compressed=367
Total Size=3517, total compressed=1117

Device Null Size :0

Actual zip file Size: 1571

You can see the difference between the actual file size and the total compressed size. Also, I am not able to get the size of the null device.
Feb 28 '07 #7
@Bir

There is a difference because a ZIP file does not consist of just a sequence of compressed files. Some additional administrative data is required as well (table of contents, offsets, sizes, ...). I do not know whether Archive::Zip provides a method to determine the amount of administrative data within a ZIP file.
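
In the output quoted above, the gap is 1571 - 1117 = 454 bytes for three members. For a plain archive, the administrative data can be added up from the fixed record sizes of the ZIP format: 30 bytes plus the file name for each local header, 46 bytes plus the file name for each central directory entry, and 22 bytes for the end record. The sketch below checks that arithmetic with Python's `zipfile` module, not with Archive::Zip; the helper `zip_overhead` and the sample member contents are made up for illustration, and it assumes no Zip64 records, data descriptors, or encryption.

```python
import io
import zipfile

LOCAL_HEADER = 30   # fixed part of each local file header
CENTRAL_ENTRY = 46  # fixed part of each central directory entry
END_RECORD = 22     # end-of-central-directory record


def zip_overhead(infos, archive_comment=b""):
    """Bytes of bookkeeping for the members described by ZipInfo objects."""
    total = END_RECORD + len(archive_comment)
    for info in infos:
        name_len = len(info.filename.encode("utf-8"))
        total += LOCAL_HEADER + name_len + len(info.extra)
        total += CENTRAL_ENTRY + name_len + len(info.extra) + len(info.comment)
    return total


# Build a small archive in memory so the estimate can be compared.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("calcSizes.pl", b"print 1;\n" * 80)
    zf.writestr("copy.pl", b"print 2;\n" * 50)
    infos = zf.infolist()

compressed = sum(info.compress_size for info in infos)
overhead = zip_overhead(infos)
print("compressed:", compressed, "overhead:", overhead)
print("estimate:", compressed + overhead, "actual:", len(buffer.getvalue()))
```

Because each file name is stored twice, the overhead grows with both the number of members and the length of their names. Other ZIP writers may add extra fields that this sum does not account for.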

The null device always has size zero; that's why it is called the null device. Writing to such a device just absorbs the written input and discards it. For this reason, there is no "size" associated with this device.

Greetz, Doc
Feb 28 '07 #8
