GZipStream compressed bytes written | Newbie | | Join Date: Sep 2008
Posts: 3
| | |
When using a GZipStream, is there any way to know how many compressed bytes were written?
For example:
// "buffer" is data from a text file
// "offset" is 0
// "count" is 100
stream.Write(buffer, offset, count);
This is going to write the compressed data to the underlying data store. Is there any way to get the length of this data?
Note: I know you can use a MemoryStream to get the size, but this is inefficient.
| | Moderator | | Join Date: Dec 2007 Location: India
Posts: 701
| | | re: GZipStream compressed bytes written
Quote: Originally Posted by mach77
When using a GZipStream, is there any way to know how many compressed bytes were written?
Could you explain more? I am not sure I understood you correctly, but here is what I can explain.
"count" is the number of bytes passed in to be compressed:
    byte[] myByte;
    using (FileStream f1 = new FileStream(@"C:\log.txt", FileMode.Open))
    {
        myByte = new byte[f1.Length];
        f1.Read(myByte, 0, (int)f1.Length);
    }

    using (FileStream f2 = new FileStream(@"C:\log123.txt", FileMode.Create))
    using (GZipStream gz = new GZipStream(f2, CompressionMode.Compress, false))
    {
        gz.Write(myByte, 0, myByte.Length);
    }
To get the length of the data written to the other file, I guess you will have to read it back:
    using (FileStream f1 = new FileStream(@"C:\log123.txt", FileMode.Open))
    {
        myByte = new byte[f1.Length];
        f1.Read(myByte, 0, (int)f1.Length);
        // myByte.Length (the same value as f1.Length) is the compressed size
    }
There is also a Length property on GZipStream, but it is not supported, so I don't know what exactly it is meant for. From the documentation:
"GZipStream.Length Property: This property is not supported and always throws a NotSupportedException."
With a MemoryStream you can get the length like this:
    byte[] more = System.Text.UnicodeEncoding.Unicode.GetBytes("Prr");

    using (MemoryStream ms = new MemoryStream())
    {
        using (GZipStream GZ = new GZipStream(ms, CompressionMode.Compress, false))
        {
            GZ.Write(more, 0, more.Length);

            // Do not read the length at this point:
            //   string sss = ms.Length.ToString();
            //   sss = ms.ToArray().Length.ToString();
            // GZipStream writes additional data when it is disposed,
            // so the value would be incomplete here.
        }

        byte[] bb = ms.ToArray();
        string len = bb.Length.ToString();
        // You will get the full compressed length here.
    }
| | Newbie | | Join Date: Sep 2008
Posts: 3
| | | re: GZipStream compressed bytes written
In my example, "count" was the uncompressed size. So 100 uncompressed bytes pass into the stream, but the stream compresses them before sending them to the underlying data source. I want to know how many bytes are sent to the underlying source.
|  | Moderator | | Join Date: Apr 2007 Location: New England
Posts: 7,161
| | | re: GZipStream compressed bytes written
Look at the underlying stream's size?
    FileStream fs = new FileStream(@"c:\tempzip.zip", FileMode.Create);
    System.IO.Compression.GZipStream gz = new System.IO.Compression.GZipStream(fs, System.IO.Compression.CompressionMode.Compress);
    byte[] fred = Encoding.ASCII.GetBytes("Billy joe has a lot of bottles of rum");
    gz.Write(fred, 0, fred.Length);
    // Bytes that have reached the file so far; GZipStream may still be holding some in its buffer.
    Int64 SizeOfCompressedBytes = fs.Length;
If you are doing multiple writes and want how much came from each write, you could probably implement logic to look at the change in the .Length property.
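A rough sketch of that idea, assuming a MemoryStream as the underlying stream (any stream with a working Length should behave similarly). DeltaDemo and MeasureDeltas are made-up names for illustration. Because GZipStream buffers internally, the growth seen after a given Write may well be zero, and the last entry covers whatever only appears once the GZipStream is disposed.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class DeltaDemo   // hypothetical name, for illustration only
{
    // Writes each chunk through one GZipStream and records how far the
    // underlying stream grew after each Write. The final entry is the
    // growth caused by disposing the GZipStream.
    public static long[] MeasureDeltas(byte[][] chunks, out long totalCompressed)
    {
        long[] deltas = new long[chunks.Length + 1];
        using (MemoryStream ms = new MemoryStream())
        {
            long previous = 0;
            // Third argument (leaveOpen) is true so ms stays usable after gz is disposed.
            using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress, true))
            {
                for (int i = 0; i < chunks.Length; i++)
                {
                    gz.Write(chunks[i], 0, chunks[i].Length);
                    deltas[i] = ms.Length - previous;   // may be 0 while data is buffered
                    previous = ms.Length;
                }
            }
            // Disposing pushed out the remaining compressed data and the trailer.
            deltas[chunks.Length] = ms.Length - previous;
            totalCompressed = ms.Length;
        }
        return deltas;
    }

    static void Main()
    {
        byte[][] chunks = {
            Encoding.ASCII.GetBytes("Billy joe has a lot of bottles of rum"),
            Encoding.ASCII.GetBytes("and a few more bottles of rum besides"),
        };
        long total;
        long[] deltas = MeasureDeltas(chunks, out total);
        for (int i = 0; i < deltas.Length; i++)
            Console.WriteLine("step {0}: +{1} bytes", i, deltas[i]);
        Console.WriteLine("total compressed: {0} bytes", total);
    }
}
```

The per-write numbers only say when bytes reached the underlying stream, not which Write they belong to; the sum of all entries is the figure that can be relied on.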
| | Expert | | Join Date: Sep 2008 Location: USA
Posts: 188
| | | re: GZipStream compressed bytes written
Just to reiterate what dirtBag noted, closing a GZipStream flushes its buffers and writes some additional trailing bytes that the decompressor needs. Therefore, an accurate byte count can only be taken after the GZipStream is closed or disposed.
Also, a warning: if you check the underlying stream's length after a series of individual writes, keep in mind that buffering means not all of the bytes passed in are written out on each iteration unless the buffer is flushed to the stream.
| | Expert | | Join Date: Sep 2008 Location: USA
Posts: 188
| | | re: GZipStream compressed bytes written
Also, if you are worried about reading a large file into a memory stream, an alternative is to count the bytes in chunks, using a buffer size of your liking:
    int _BUFFER_SIZE = 4096;
    byte[] readBuffer = new byte[_BUFFER_SIZE];
    long totalBytes = 0;
    int bytesRead = 0;

    using (FileStream inputStream = new FileStream(@"C:\someFile.zip", FileMode.Open))
    {
        do
        {
            // read up to _BUFFER_SIZE bytes from inputStream into the buffer
            bytesRead = inputStream.Read(readBuffer, 0, _BUFFER_SIZE);
            totalBytes += bytesRead;
        }
        // until no more bytes are read from the input stream
        while (bytesRead > 0);

        Console.WriteLine("{0} : {1} bytes.", inputStream.Name, totalBytes);
    }
| | Newbie | | Join Date: Sep 2008
Posts: 3
| | | re: GZipStream compressed bytes written
The examples given are all pretty straightforward, but what happens when you don't have access to the underlying stream, or the underlying stream throws an exception when you call Length on it?
|  | Moderator | | Join Date: Apr 2007 Location: New England
Posts: 7,161
| | | re: GZipStream compressed bytes written
Well, you always have access to the underlying stream via the .BaseStream property. I am not sure which stream types would throw an exception on the Length property, but I guess it could happen?
What is your underlying stream type?
| | Expert | | Join Date: Sep 2008 Location: USA
Posts: 188
| | | re: GZipStream compressed bytes written
That is exactly what dirtBag pointed out. GZipStream does not support Length or Position while open, so neither will its BaseStream.
As you can see, the implementation of GZip compression does not allow a real-time feedback of incremental bytes written...which is what you want.
Unless you write your own compression algorithm, using the GZipStream class means you will only know how many compressed bytes were written by retrieving the length of the entire compressed stream after it has been fully written and closed.
At least that is my conclusion. If someone else can show otherwise, please correct me.
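One possible workaround when the underlying stream is out of reach, or does not support Length (NetworkStream, for instance, throws NotSupportedException), is to put a small pass-through stream between the GZipStream and the real destination and let it count what goes through. This is only a sketch: CountingStream and CompressAndCount are made-up names, not part of the framework, and the count is still only final once the GZipStream has been disposed.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: forwards every write to an inner stream and counts the bytes.
class CountingStream : Stream
{
    private readonly Stream inner;
    public long BytesWritten { get; private set; }

    public CountingStream(Stream inner) { this.inner = inner; }

    public override void Write(byte[] buffer, int offset, int count)
    {
        inner.Write(buffer, offset, count);
        BytesWritten += count;
    }

    public override void Flush() { inner.Flush(); }

    // Write-only and non-seekable; everything else is unsupported.
    // Disposing this wrapper deliberately leaves the inner stream open.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

static class CountingDemo
{
    // Compresses data into destination and returns the number of
    // compressed bytes that were actually handed to destination.
    public static long CompressAndCount(byte[] data, Stream destination)
    {
        CountingStream counter = new CountingStream(destination);
        using (GZipStream gz = new GZipStream(counter, CompressionMode.Compress))
        {
            gz.Write(data, 0, data.Length);
        }
        // gz is disposed here, so the trailer has been counted as well.
        return counter.BytesWritten;
    }

    static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("Billy joe has a lot of bottles of rum");
        using (MemoryStream destination = new MemoryStream())
        {
            long written = CompressAndCount(data, destination);
            Console.WriteLine("counted: {0}, destination length: {1}", written, destination.Length);
        }
    }
}
```

Reading counter.BytesWritten between writes gives a running total without touching Length on the destination, subject to the same buffering caveat as before.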
|  | Moderator | | Join Date: Apr 2007 Location: New England
Posts: 7,161
| | | re: GZipStream compressed bytes written
I have had zero trouble using the .Length property on the BaseStream.
| | Expert | | Join Date: Sep 2008 Location: USA
Posts: 188
| | | re: GZipStream compressed bytes written
OK, you are correct Plater, the BaseStream.Length is available. My statement above about the BaseStream is incorrect.
However, that is only while the wrapper (GZipStream) is open. When the GZipStream is closed, so is the base stream (unless leaveOpen is passed as true in the constructor).
And because of this, the BaseStream.Length is not absolutely correct until the Compression stream is flushed/closed.
So for mach77, you can calculate bytes written by examining the increase in BaseStream.Length...but the last length given will not equal the final length of the compressed stream.
I have provided two test methods to show this; you will need to provide a text file.
    using System;
    using System.IO;
    using System.IO.Compression;

    namespace bytes {

        class test {

            static void Main() {
                GZipCompressAllAtOnce(@"C:\Temp\someFile.txt");
                GZipCompressIncremental(@"C:\Temp\someFile.txt");
            }

            public static void GZipCompressAllAtOnce(string filename) {
                byte[] fileBuffer;
                int bytesRead;
                MemoryStream ms;
                GZipStream compressedzipStream;
                using (FileStream infile = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read)) {
                    fileBuffer = new byte[infile.Length];
                    bytesRead = infile.Read(fileBuffer, 0, fileBuffer.Length);
                }
                using (ms = new MemoryStream()) {
                    using (compressedzipStream = new GZipStream(ms, CompressionMode.Compress, true)) {
                        compressedzipStream.Write(fileBuffer, 0, bytesRead);
                        Console.WriteLine("Underlying Stream Length: {0}", compressedzipStream.BaseStream.Length);
                    }
                    Console.WriteLine("Original size: {0}, Compressed size: {1}", fileBuffer.Length, ms.Length);
                }
            }

            public static void GZipCompressIncremental(string filename) {
                byte[] fileBuffer;
                int bytesRead;
                int BUFFERSIZE = 512;
                MemoryStream ms;
                using (FileStream infile = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read)) {
                    fileBuffer = new byte[BUFFERSIZE];
                    using (ms = new MemoryStream()) {
                        using (GZipStream compressedzipStream = new GZipStream(ms, CompressionMode.Compress, true)) {
                            do {
                                bytesRead = infile.Read(fileBuffer, 0, BUFFERSIZE);
                                compressedzipStream.Write(fileBuffer, 0, bytesRead);
                                Console.WriteLine("Underlying Stream Length: {0}", compressedzipStream.BaseStream.Length);
                            } while (bytesRead > 0);
                        }
                        Console.WriteLine("Original size: {0}, Compressed size: {1}", infile.Length, ms.Length);
                    }
                }
            }

        }

    }
|  | Moderator | | Join Date: Apr 2007 Location: New England
Posts: 7,161
| | | re: GZipStream compressed bytes written
Well, the OP's question was how many compressed bytes were written, which can be done with the Length property as mentioned. Sort of.
The compressed streams get a preamble and postamble as mentioned.
But I think it works like this:
CompressedStream after 3 writes
[preamble][compressedbytes][postamble]
And *NOT* like this:
CompressedStream after 3 writes
[preamble][compressedbytes][postamble][preamble][compressedbytes][postamble][preamble][compressedbytes][postamble]
I have not confirmed this however.
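One way to check this, sketched under the assumption that GZipStream emits a standard gzip member as described in RFC 1952: the last four bytes of a member (ISIZE) hold the uncompressed length, little-endian, modulo 2^32. If every Write produced its own preamble and postamble, the final trailer would only describe the last write; if ISIZE equals the total of all three writes, that points to a single [preamble][compressedbytes][postamble]. TrailerCheck, CompressInWrites and ReadIsize are made-up names for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class TrailerCheck   // hypothetical name, for illustration only
{
    // Compresses the chunks with a single GZipStream, one Write call per chunk.
    public static byte[] CompressInWrites(byte[][] chunks)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress, true))
            {
                foreach (byte[] chunk in chunks)
                    gz.Write(chunk, 0, chunk.Length);
            }
            // gz has been disposed, so the trailer is in place.
            return ms.ToArray();
        }
    }

    // Reads ISIZE: the last four bytes of the gzip data, little-endian.
    public static uint ReadIsize(byte[] gzip)
    {
        int n = gzip.Length;
        return (uint)(gzip[n - 4] | (gzip[n - 3] << 8) | (gzip[n - 2] << 16) | (gzip[n - 1] << 24));
    }

    static void Main()
    {
        byte[][] chunks = {
            Encoding.ASCII.GetBytes("first write, "),
            Encoding.ASCII.GetBytes("second write, "),
            Encoding.ASCII.GetBytes("third write"),
        };
        int totalLength = 0;
        foreach (byte[] c in chunks) totalLength += c.Length;

        byte[] gzip = CompressInWrites(chunks);
        Console.WriteLine("starts with gzip magic 1F 8B: {0}", gzip[0] == 0x1f && gzip[1] == 0x8b);
        Console.WriteLine("uncompressed total: {0}, ISIZE in final trailer: {1}", totalLength, ReadIsize(gzip));
    }
}
```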