GZip Compression :(

Hello there,

I'm having an odd issue with GZIP compression (having followed example code
found on MSDN). Basically, after running through the compression routine I
end up with a byte array several times larger than the source text file,
full of zero data. Below is the code used to do the compression. It's part
of a web service that retrieves a file; there's a compress option applied
prior to base64 encoding the data. In the following code all undeclared
variables you see are properties: Compress represents a compress attribute
specified in the XML request, and FileName is a relative path to the file on
the server inside the webroot.

Response.ContentType = "text/xml"
If Not File.Exists(Server.MapPath(FileName)) Then
    Throw New GetBinaryFileException(FileName, GetBinaryFileException.GetBinaryFileError.FileNotFound)
End If

Dim FileData() As Byte = Nothing
Dim FStream As New FileStream(Server.MapPath(FileName), FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
If Compress Then
    Dim TempData(FStream.Length - 1) As Byte
    FStream.Read(TempData, 0, FStream.Length)
    Dim MStream As New MemoryStream
    Dim Compressor As New GZipStream(MStream, CompressionMode.Compress, True)
    Compressor.Write(TempData, 0, TempData.Length)

    ReDim FileData(MStream.Length - 1)
    Dim BytesRead As Integer = MStream.Read(FileData, 0, MStream.Length)
    MStream.Close()
    MStream.Dispose()
    Compressor.Close()
    Compressor.Dispose()
Else
    ReDim FileData(FStream.Length - 1)
    FStream.Read(FileData, 0, FStream.Length)
End If
FStream.Close()
FStream.Dispose()

Dim Base64 As String = Convert.ToBase64String(FileData)

Dim FileDataNode As XmlNode = XmlExchangeLib.GetOrSetXmlNode("FileData", Root)
XmlExchangeLib.AddAttributeWithValue(FileDataNode, "Compressed", Compress.ToString().ToLower())
FileDataNode.InnerText = Base64
XmlResponse.Save(Response.OutputStream)

Mar 27 '08 #1
First, you need to make sure that you close the zip-stream (compressor)
before looking at the memory-stream - it won't have finished writing yet;
second, you then either need to rewind the memory stream, or just use
ToArray() to get the full contents.
Third - Read (on the file stream) is not strictly guaranteed to get
everything - and even if it did it isn't very efficient. But
File.ReadAllBytes would be a more reliable way of reading the entire file at
once.

You might also be allocating the FileData array one too short - I'm not sure
(VB...)

Marc
Mar 27 '08 #2
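For readers following along, the three fixes listed in the reply above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original poster's final code: it is written in C# rather than VB, the names GzipSketch, Compress, Decompress and CompressFile are made up for the example, and Stream.CopyTo assumes .NET 4 or later.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper class; the names are illustrative only.
static class GzipSketch
{
    // Compresses a byte array with gzip and returns the complete output.
    public static byte[] Compress(byte[] input)
    {
        using (MemoryStream dest = new MemoryStream())
        {
            // leaveOpen: true keeps 'dest' usable after the GZipStream is disposed.
            using (GZipStream zip = new GZipStream(dest, CompressionMode.Compress, true))
            {
                zip.Write(input, 0, input.Length);
            } // Disposing here flushes the last deflate block and the gzip trailer.

            // ToArray ignores the stream position, so no rewind is needed.
            return dest.ToArray();
        }
    }

    // Inverse of Compress, included so the result can be checked by round trip.
    public static byte[] Decompress(byte[] gzipped)
    {
        using (MemoryStream src = new MemoryStream(gzipped))
        using (GZipStream zip = new GZipStream(src, CompressionMode.Decompress))
        using (MemoryStream dest = new MemoryStream())
        {
            zip.CopyTo(dest); // Stream.CopyTo: assumes .NET 4 or later
            return dest.ToArray();
        }
    }

    // Reads the whole file in one call instead of relying on a single Read.
    public static byte[] CompressFile(string path)
    {
        return Compress(File.ReadAllBytes(path));
    }
}
```

In the original routine, a call like GzipSketch.CompressFile(Server.MapPath(FileName)) would stand in for the FileStream.Read, ReDim and MemoryStream.Read sequence, with the result passed to Convert.ToBase64String as before.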
The last point is the one I know for a fact is fine: in VB you declare an
array by its upper bound, so Length - 1 gives exactly Length elements. But
I'll take a look at the rest of the points. Thanks very much for your
thoughts, most helpful...

"Marc Gravell" <ma**********@gmail.com> wrote in message
news:eE**************@TK2MSFTNGP02.phx.gbl...
[snip]
Mar 27 '08 #3
Thanks for the advice, I switched to auto-closing the zip stream and using
ToArray on the memory stream, and it seems to be pulling bytes. Now my only
concern is that I'm getting back a byte array much larger than my original
26-byte text file :(
Mar 27 '08 #4
Just to make sure...

You are talking about "before" the step of going to base64, correct? The
base 64 step will bloat the string by a factor of 1.37 plus header data, if I
recall correctly.

"Carlo Razzeto" wrote:
[snip]
Mar 27 '08 #5
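As a side note on the base64 estimate in the post above: the 1.37 figure is the one usually quoted for MIME-style base64 with line breaks. Convert.ToBase64String, called without formatting options, inserts no line breaks, so the expansion is 4 output characters per 3 input bytes (about 1.33) with no header. A small sketch of the arithmetic, where Base64Size and EncodedLength are names invented for this example:

```csharp
using System;

// Hypothetical helper; the class and method names are illustrative only.
static class Base64Size
{
    // Length of Convert.ToBase64String output for n input bytes when no
    // line breaks are inserted: each group of up to 3 bytes becomes
    // 4 characters, the last group being padded with '='.
    public static int EncodedLength(int n)
    {
        return 4 * ((n + 2) / 3);
    }

    // Measures the real encoder for comparison with the formula.
    public static int MeasuredLength(int n)
    {
        return Convert.ToBase64String(new byte[n]).Length;
    }
}
```

For the sizes mentioned in this thread, 26 raw bytes encode to 36 characters and 132 bytes to 176 characters, so base64 alone cannot explain a several-fold increase.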
Raw byte array size (prior to conversion to base64 string). I read in 26
bytes and typically get back 132 bytes worth of "compressed" data.

"Family Tree Mike" <Fa************@discussions.microsoft.com> wrote in
message news:C6**********************************@microsoft.com...
[snip]
Mar 27 '08 #6
I wouldn't bother compressing 26 bytes... gzip itself has header overhead
etc. This also isn't enough space to actually get many useful compression
opportunities. Finally, it depends on what the data is: if it is fairly
random (a complex image, a security token, etc) then it simply won't
compress.

Marc
Mar 27 '08 #7
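To put rough numbers on the overhead described above: the gzip format (RFC 1952) wraps the deflate data in a header of at least 10 bytes and an 8-byte trailer (CRC-32 plus length), so no gzip stream is shorter than 18 bytes. The sketch below lets you compare compressed sizes for repetitive and random input. GzipOverhead and its methods are hypothetical names for this example, and the exact sizes depend on the runtime's deflate implementation, so none are quoted here.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper; the names are illustrative only.
static class GzipOverhead
{
    // Returns the size in bytes of the gzip-compressed form of 'input'.
    public static int CompressedSize(byte[] input)
    {
        using (MemoryStream dest = new MemoryStream())
        {
            using (GZipStream zip = new GZipStream(dest, CompressionMode.Compress, true))
            {
                zip.Write(input, 0, input.Length);
            } // dispose before measuring so the trailer is included
            return (int)dest.Length;
        }
    }

    // Pseudo-random bytes: a stand-in for data with little redundancy.
    public static byte[] RandomBytes(int count, int seed)
    {
        byte[] buffer = new byte[count];
        new Random(seed).NextBytes(buffer);
        return buffer;
    }

    // A run of one repeated byte: highly redundant data.
    public static byte[] RepeatedBytes(int count, byte value)
    {
        byte[] buffer = new byte[count];
        for (int i = 0; i < count; i++) buffer[i] = value;
        return buffer;
    }
}
```

Comparing CompressedSize(RepeatedBytes(4096, 0x41)) with CompressedSize(RandomBytes(4096, 1)) should show the repetitive buffer shrinking to a small fraction of its size while the random one does not shrink.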
Demo; outputs "125"; compression just isn't going to help you with very
short inputs:

using (MemoryStream dest = new MemoryStream()) {
    using (GZipStream zip = new GZipStream(dest, CompressionMode.Compress, true))
    using (StreamWriter writer = new StreamWriter(zip)) {
        writer.Write("Hi hi hi");
        writer.Close();
        zip.Close();
    }
    Console.WriteLine(dest.Length);
}

Marc
Mar 27 '08 #8
Ah, yeah, I hadn't been considering the compression headers. Thanks for
reminding me of that; it makes sense now. IRL this code isn't going to be
used to compress 25-byte files, more like PDF files from several KB up to a
MB or two, so it should be fine. Thanks,

Carlo
Mar 28 '08 #9
One approach would be to use the first byte to indicate whether compression
is on (and which kind) - i.e. 0x00 = none, 0x01 = gzip, etc. I use this trick
quite happily; pick a cutoff under which you won't even bother trying to
compress... otherwise try compressing it and see if it got shorter (even
some non-trivial data gets longer when "compressed"). Worth consideration
perhaps... And in reverse, check the first byte - if 0, return the rest of
the stream vanilla; if 1, gunzip it; etc...

Marc
Mar 28 '08 #10
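A minimal sketch of the flag-byte idea from the post above, assuming a made-up cutoff of 256 bytes and hypothetical names (FlaggedPayload, Pack, Unpack); it is one way to arrange it, not necessarily how Marc's own code looks. Stream.CopyTo assumes .NET 4 or later.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper; names and the cutoff value are illustrative only.
static class FlaggedPayload
{
    const byte FlagNone = 0x00;
    const byte FlagGzip = 0x01;
    const int MinSizeToTry = 256; // assumed cutoff; tune for real data

    // Prepends a flag byte, compressing only when that makes the data shorter.
    public static byte[] Pack(byte[] input)
    {
        if (input.Length >= MinSizeToTry)
        {
            byte[] gz = Gzip(input);
            if (gz.Length < input.Length) return Prefix(FlagGzip, gz);
        }
        return Prefix(FlagNone, input);
    }

    // Reads the flag byte and returns the original data.
    public static byte[] Unpack(byte[] packed)
    {
        if (packed.Length == 0) throw new InvalidDataException("Missing flag byte.");
        switch (packed[0])
        {
            case FlagNone:
                byte[] raw = new byte[packed.Length - 1];
                Buffer.BlockCopy(packed, 1, raw, 0, raw.Length);
                return raw;
            case FlagGzip:
                using (MemoryStream src = new MemoryStream(packed, 1, packed.Length - 1))
                using (GZipStream zip = new GZipStream(src, CompressionMode.Decompress))
                using (MemoryStream dest = new MemoryStream())
                {
                    zip.CopyTo(dest);
                    return dest.ToArray();
                }
            default:
                throw new InvalidDataException("Unknown compression flag: " + packed[0]);
        }
    }

    static byte[] Prefix(byte flag, byte[] body)
    {
        byte[] result = new byte[body.Length + 1];
        result[0] = flag;
        Buffer.BlockCopy(body, 0, result, 1, body.Length);
        return result;
    }

    static byte[] Gzip(byte[] input)
    {
        using (MemoryStream dest = new MemoryStream())
        {
            using (GZipStream zip = new GZipStream(dest, CompressionMode.Compress, true))
            {
                zip.Write(input, 0, input.Length);
            }
            return dest.ToArray();
        }
    }
}
```

The cost is one extra byte per payload, and the receiver never has to guess whether gzip was applied. In the web service from the first post, the existing Compressed attribute on the FileData node could carry the same information instead of a flag byte.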
