Built-in compression on .NET loses data, maybe?

Thanks to anyone who reads this.

Below is some C# that compresses an array of bytes, then decompresses it
and compares the original data with the result.

Firstly, the length of the decompressed data is shorter than the
original, so some loss of data has occurred. But the content matches up
until the early truncation. So am I flushing correctly? This error
only occurs for particular combinations of bytes in the original
buffer.

Secondly, when I read the decompressed data from the zip-stream, the
first read returns zero bytes. After that I perform a second read and
the data can be read. Why is that?

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using System.Text;

namespace lab02
{
    class Program
    {
        static void Main(string[] args)
        {
            // declaration of local variables
            int i1 = 0;

            // create a buffer of data for compressing
            byte[] bufferData = new byte[10];
            for (i1 = 0; i1 < 10; i1++)
                bufferData[i1] = Convert.ToByte(i1);

            // PART 1 - Compression
            byte[] bufferCompressed = null;
            {
                // compress buffer into a memory stream (ms)
                MemoryStream msCompressed = new MemoryStream();
                DeflateStream zipStream = new DeflateStream(msCompressed,
                    CompressionMode.Compress);
                zipStream.Write(bufferData, 0, bufferData.Length);
                msCompressed.Flush();

                // get the compressed memory stream into a buffer
                bufferCompressed = new byte[msCompressed.Length];
                msCompressed.Position = 0;
                msCompressed.Read(bufferCompressed, 0, bufferCompressed.Length);

                // close zip stream
                zipStream.Close();
            }

            // PART 2 - Decompression
            byte[] bufferDecompressed = null;
            {
                // put the compressed data (bufferCompressed) into a memory stream (msCompressed)
                MemoryStream msCompressed = new MemoryStream();
                msCompressed.Write(bufferCompressed, 0, bufferCompressed.Length);
                msCompressed.Position = 0;

                // decompress buffer
                DeflateStream zipStream = new DeflateStream(msCompressed,
                    CompressionMode.Decompress);
                msCompressed.Flush();

                // read the de-compressed data into a buffer
                MemoryStream msDecompressed = new MemoryStream();
                int iBytesRead = 0;
                byte[] bufferSub = new Byte[1024];
                do
                {
                    // read next bytes (problem with the first read, always read zero first)
                    iBytesRead = zipStream.Read(bufferSub, 0, bufferSub.Length);
                    if ((msDecompressed.Length == 0) && (iBytesRead == 0))
                        iBytesRead = zipStream.Read(bufferSub, 0, bufferSub.Length);

                    // if some data was read...
                    if (iBytesRead > 0)
                    {
                        // add to stream
                        msDecompressed.Write(bufferSub, 0, iBytesRead);
                    }
                } while (iBytesRead == bufferSub.Length);

                // close zip stream
                zipStream.Close();

                // load buffer with unzipped data
                bufferDecompressed = new byte[msDecompressed.Length];
                msDecompressed.Position = 0;
                msDecompressed.Read(bufferDecompressed, 0, bufferDecompressed.Length);
            }

            // PART 3 - Comparison of what was and now is, or is it?
            if (bufferData.Length != bufferDecompressed.Length)
                Trace.TraceInformation("Length mismatch!!!");
            else
            {
                // compare contents
                bool bMatch = true;
                for (i1 = 0; i1 < bufferData.Length; i1++)
                {
                    // if bytes do not match...
                    if (bufferData[i1] != bufferDecompressed[i1])
                    {
                        // update flag
                        bMatch = false;

                        // break out of loop
                        break;
                    }
                }
                if (!bMatch)
                    Trace.TraceInformation("Content does not match!!!");
            }
        }
    }
}

Aug 29 '06 #1
First - you are flushing the wrong stream; it is zipStream that needs
flushing. However, I have seen the compression classes refuse to Flush
completely until Close is called; presumably this is an optimisation to keep
a few bytes for use in the compression algorithm. So my compression code
would be (note the overload to the ctor to leave the stream open):

using (MemoryStream msCompressed = new MemoryStream()) {
    using (DeflateStream zipStream = new DeflateStream(msCompressed,
        CompressionMode.Compress, true)) {
        zipStream.Write(bufferData, 0, bufferData.Length);
        zipStream.Close();
    }
    bufferCompressed = msCompressed.ToArray();
}

I don't know why your code reports zero (I didn't even run it to find out,
I'm afraid) - however, I would do as follows; note the trick is on the while
condition (which captures the count and tests it in one go).

using (MemoryStream msDecompressed = new MemoryStream()) {
    using (MemoryStream msCompressed = new MemoryStream(bufferCompressed))
    using (DeflateStream zipStream = new DeflateStream(msCompressed,
        CompressionMode.Decompress)) {
        int bytesRead;
        const int BUFFER_SIZE = 1024;
        byte[] buffer = new byte[BUFFER_SIZE];
        while ((bytesRead = zipStream.Read(buffer, 0, BUFFER_SIZE)) > 0) {
            msDecompressed.Write(buffer, 0, bytesRead);
        }
    }
    bufferDecompressed = msDecompressed.ToArray();
}

Third - in this scenario, since I used the ctor to not close the stream
(first block of code) I could actually re-use my MemoryStream between the
two blocks simply by rewinding (.Position = 0); but the above works fine,
and illustrates the point. Also note that short or fairly random blocks of
data can get longer during compression. Ironic, but life, especially with
single-pass compression. Double-pass compression can use a "don't bother"
flag.
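
To see the expansion effect for yourself, here is a small self-contained sketch (class and variable names are mine, not from the thread) that compresses the same ten-byte buffer as the original post and prints both lengths; for inputs this small, the deflate framing overhead usually outweighs any savings:

```csharp
using System;
using System.IO;
using System.IO.Compression;

class ExpansionDemo
{
    static void Main()
    {
        // same tiny buffer as in the original post: bytes 0..9
        byte[] data = new byte[10];
        for (int i = 0; i < data.Length; i++)
            data[i] = (byte)i;

        byte[] compressed;
        using (MemoryStream ms = new MemoryStream())
        {
            // leaveOpen: true so the MemoryStream survives the Dispose
            using (DeflateStream zip = new DeflateStream(ms,
                CompressionMode.Compress, true))
            {
                zip.Write(data, 0, data.Length);
            } // disposing the DeflateStream flushes the final deflate block

            compressed = ms.ToArray();
        }

        // for tiny or random inputs, expect compressed >= original
        Console.WriteLine("original: {0} bytes, compressed: {1} bytes",
            data.Length, compressed.Length);
    }
}
```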

Marc
Aug 29 '06 #2
Thanks for the help, Marc. Especially the

MemoryStream.ToArray();

method, which makes the code a lot more compact and readable.

I have discovered from another group that the zip stream must be
flushed AND closed before the compressed data can be read.

zipStream.Flush();
zipStream.Close();
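
For completeness, a minimal round trip along the lines Marc suggested might look like this (the class name and buffer contents are just for illustration): the compressing DeflateStream is closed before the MemoryStream is read, and the leaveOpen ctor argument keeps that MemoryStream usable afterwards.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

class RoundTrip
{
    static void Main()
    {
        byte[] original = new byte[10];
        for (int i = 0; i < original.Length; i++)
            original[i] = (byte)i;

        // compress: dispose the DeflateStream (which flushes and closes it)
        // BEFORE reading the compressed bytes out of the MemoryStream
        byte[] compressed;
        using (MemoryStream msCompressed = new MemoryStream())
        {
            using (DeflateStream zip = new DeflateStream(msCompressed,
                CompressionMode.Compress, true))
            {
                zip.Write(original, 0, original.Length);
            }
            compressed = msCompressed.ToArray();
        }

        // decompress: loop until Read returns 0, capturing the count
        // in the while condition as in Marc's snippet
        byte[] decompressed;
        using (MemoryStream msDecompressed = new MemoryStream())
        using (DeflateStream zip = new DeflateStream(
            new MemoryStream(compressed), CompressionMode.Decompress))
        {
            byte[] buffer = new byte[1024];
            int bytesRead;
            while ((bytesRead = zip.Read(buffer, 0, buffer.Length)) > 0)
                msDecompressed.Write(buffer, 0, bytesRead);
            decompressed = msDecompressed.ToArray();
        }

        // verify the round trip preserved every byte
        Debug.Assert(decompressed.Length == original.Length);
        for (int i = 0; i < original.Length; i++)
            Debug.Assert(decompressed[i] == original[i]);
        Console.WriteLine("round trip OK: {0} bytes", decompressed.Length);
    }
}
```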

Solved!

Terry.

Aug 29 '06 #3
