GZipStream compressed bytes written

Newbie
 
Join Date: Sep 2008
Posts: 3
#1: Sep 30 '08
When using a GZipStream, is there any way to know how many compressed bytes were written?

For example:

// "buffer" is data from a text file
// "offset" is 0
// "count" is 100

stream.Write(buffer, offset, count);

This is going to write the compressed data to the underlying data store. Is there any way to get the length of this data?

Note: I know you can use a MemoryStream to get the size, but this is inefficient.

PRR
Moderator
 
Join Date: Dec 2007
Location: India
Posts: 701
#2: Sep 30 '08

re: GZipStream compressed bytes written


Quote:

Originally Posted by mach77

When using a GZipStream, is there any way to know how many compressed bytes were written?
For example:

// "buffer" is data from a text file
// "offset" is 0
// "count" is 100

stream.Write(buffer, offset, count);

This is going to write the compressed data to the underlying data store. Is there any way to get the length of this data?

Note: I know you can use a MemoryStream to get the size, but this is inefficient.

Could you explain more? I'm not sure I understood you correctly, but here's what I can explain.

"count" is the number of bytes from the buffer to be compressed.

    // Requires: using System.IO; using System.IO.Compression;
    byte[] myByte;
    using (FileStream f1 = new FileStream(@"C:\log.txt", FileMode.Open))
    {
        myByte = new byte[f1.Length];
        f1.Read(myByte, 0, (int)f1.Length);
    }

    using (FileStream f2 = new FileStream(@"C:\log123.txt", FileMode.Create))
    using (GZipStream gz = new GZipStream(f2, CompressionMode.Compress, false))
    {
        gz.Write(myByte, 0, myByte.Length);
    }

To get the length of the data written to the other file, I guess you will have to read it back:

    using (FileStream f1 = new FileStream(@"C:\log123.txt", FileMode.Open))
    {
        myByte = new byte[f1.Length];
        f1.Read(myByte, 0, (int)f1.Length);
    }

There's also a Length property on GZipStream, but it is not supported, so it won't help here. From the documentation:
"GZipStream.Length Property: This property is not supported and always throws a NotSupportedException."


    byte[] more = System.Text.UnicodeEncoding.Unicode.GetBytes("Prr");

    using (MemoryStream ms = new MemoryStream())
    {
        using (GZipStream GZ = new GZipStream(ms, CompressionMode.Compress, false))
        {
            GZ.Write(more, 0, more.Length);
            // string sss = ms.Length.ToString();
            // sss = ms.ToArray().Length.ToString();
            // GZipStream writes additional data when it is disposed,
            // so you should not read the length here as in the two lines above.
        }

        // ToArray still works after the MemoryStream has been closed by GZ.
        byte[] bb = ms.ToArray();

        string len = bb.Length.ToString();
        // You will get the length here.
    }

Newbie
 
Join Date: Sep 2008
Posts: 3
#3: Sep 30 '08

re: GZipStream compressed bytes written


In my example, "count" was the uncompressed size. So 100 uncompressed bytes pass into the stream, but the stream compresses them before sending them to the underlying data source. I want to know how many bytes are sent to the underlying source.
Plater
Moderator
 
Join Date: Apr 2007
Location: New England
Posts: 7,161
#4: Sep 30 '08

re: GZipStream compressed bytes written


Look at the underlying stream's size?

    FileStream fs = new FileStream(@"c:\tempzip.zip", FileMode.Create);
    System.IO.Compression.GZipStream gz = new System.IO.Compression.GZipStream(fs, System.IO.Compression.CompressionMode.Compress);
    byte[] fred = Encoding.ASCII.GetBytes("Billy joe has a lot of bottles of rum");
    gz.Write(fred, 0, fred.Length);
    // Bytes that have reached fs so far; gz may still hold buffered data until it is closed.
    Int64 SizeOfCompressedBytes = fs.Length;

If you are doing multiple writes and want how much came from each write, you could probably implement logic to look at the change in the .Length property.
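
A minimal sketch of that "change in Length" idea, assuming the underlying stream is seekable so that Length is available. The class and method names (WriteDeltaSketch, MeasureDeltas) are made up for illustration. Depending on the framework version, GZipStream.Flush may or may not push pending compressed data through, so the per-write numbers are only indicative and some bytes will typically appear only when the GZipStream is disposed.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class WriteDeltaSketch
{
    // Hypothetical helper: returns how much baseStream.Length grew after each Write + Flush.
    // Assumes baseStream supports Length (for example FileStream or MemoryStream).
    public static long[] MeasureDeltas(Stream baseStream, byte[][] chunks)
    {
        long[] deltas = new long[chunks.Length];
        // leaveOpen = true so the caller can still inspect baseStream afterwards.
        using (GZipStream gz = new GZipStream(baseStream, CompressionMode.Compress, true))
        {
            long previous = baseStream.Length;
            for (int i = 0; i < chunks.Length; i++)
            {
                gz.Write(chunks[i], 0, chunks[i].Length);
                gz.Flush(); // may not force data out on every framework version
                long current = baseStream.Length;
                deltas[i] = current - previous;
                previous = current;
            }
        }
        return deltas;
    }

    static void Main()
    {
        byte[][] chunks =
        {
            Encoding.ASCII.GetBytes("Billy joe has a lot of bottles of rum"),
            Encoding.ASCII.GetBytes("and a few more bottles of rum besides")
        };
        using (MemoryStream ms = new MemoryStream())
        {
            long[] deltas = MeasureDeltas(ms, chunks);
            for (int i = 0; i < deltas.Length; i++)
            {
                Console.WriteLine("Write {0}: base stream grew by {1} bytes", i + 1, deltas[i]);
            }
            // The final length includes whatever was written when gz was disposed.
            Console.WriteLine("Final compressed length: {0}", ms.Length);
        }
    }
}
```

The sum of the per-write deltas can be smaller than the final length, because the remaining data is written when the GZipStream is closed.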
Expert
 
Join Date: Sep 2008
Location: USA
Posts: 188
#5: Sep 30 '08

re: GZipStream compressed bytes written


Just to reiterate what dirtBag noted: closing a GZipStream flushes its buffers and writes some additional trailing bytes that the decompressor needs. Therefore, an accurate byte count can only be taken after the GZipStream is closed or disposed.

Also, just a warning: if you check the underlying stream's length after a series of individual writes, keep in mind that, because of buffering, not all of the bytes passed in are necessarily written out on each iteration unless the buffer is flushed to the stream.
Expert
 
Join Date: Sep 2008
Location: USA
Posts: 188
#6: Sep 30 '08

re: GZipStream compressed bytes written


Also, if you are worried about reading a large file into a memory stream, an alternative is to count the bytes in chunks, using a buffer size of your choosing:

    // Requires: using System; using System.IO;
    int _BUFFER_SIZE = 4096;
    byte[] readBuffer = new byte[_BUFFER_SIZE];
    long totalBytes = 0;
    int bytesRead = 0;
    // "using" makes sure the file handle is released when counting is done
    using (FileStream inputStream = new FileStream(@"C:\someFile.zip", FileMode.Open))
    {
        do {
            // read up to _BUFFER_SIZE bytes from inputStream into the buffer
            bytesRead = inputStream.Read(readBuffer, 0, _BUFFER_SIZE);
            totalBytes += bytesRead;
        }
        // until no more bytes are read from the input stream
        while (bytesRead > 0);

        Console.WriteLine("{0} : {1} bytes.", inputStream.Name, totalBytes);
    }

Newbie
 
Join Date: Sep 2008
Posts: 3
#7: Oct 2 '08

re: GZipStream compressed bytes written


The examples given are all pretty straightforward, but what happens when you don't have access to the underlying stream, or the underlying stream throws an exception when you call Length on it?
Plater
Moderator
 
Join Date: Apr 2007
Location: New England
Posts: 7,161
#8: Oct 2 '08

re: GZipStream compressed bytes written


Well, you always have access to the underlying stream via the .BaseStream property. Not sure which stream types would throw an exception on the Length property, but I guess it could happen?

What is your underlying stream type?
Expert
 
Join Date: Sep 2008
Location: USA
Posts: 188
#9: Oct 2 '08

re: GZipStream compressed bytes written


That is exactly what dirtBag pointed out. GZipStream does not support Length or Position while open, so neither will its BaseStream.

As you can see, the implementation of GZip compression does not allow real-time feedback on incremental bytes written, which is what you want.

Unless you write your own compression algorithm, using the GZipStream class means you will only know how many compressed bytes were written by retrieving the length of the entire compressed stream after it has been fully written and closed.

At least that is my conclusion. If someone else can show otherwise, please correct me.
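
For the case raised above, where the underlying stream's Length is unavailable, one possible approach is to put a small pass-through stream between the GZipStream and the real target and count the bytes as they go by. This is only a sketch: CountingStream and BytesWritten are hypothetical names, not part of the .NET class library, and the MemoryStream below merely stands in for whatever the real target is.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: forwards writes to an inner stream and counts them.
// It never asks the inner stream for Length or Position.
// It also does not dispose the inner stream; the caller keeps ownership of it.
sealed class CountingStream : Stream
{
    private readonly Stream inner;
    public long BytesWritten { get; private set; }

    public CountingStream(Stream inner) { this.inner = inner; }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        inner.Write(buffer, offset, count);
        BytesWritten += count; // compressed bytes handed to the target so far
    }

    public override void Flush() { inner.Flush(); }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

static class CountingStreamDemo
{
    static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes(new string('a', 1000));
        using (MemoryStream target = new MemoryStream()) // stand-in for the real target
        {
            CountingStream counter = new CountingStream(target);
            using (GZipStream gz = new GZipStream(counter, CompressionMode.Compress))
            {
                gz.Write(data, 0, data.Length);
                Console.WriteLine("Written so far: {0}", counter.BytesWritten);
            }
            // After gz is disposed, the count includes the trailing bytes.
            Console.WriteLine("Total compressed bytes written: {0}", counter.BytesWritten);
        }
    }
}
```

The running count is still subject to the buffering described above, so it only becomes final once the GZipStream has been closed.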
Plater
Moderator
 
Join Date: Apr 2007
Location: New England
Posts: 7,161
#10: Oct 2 '08

re: GZipStream compressed bytes written


I have had zero trouble using the .Length property on the BaseStream.
Expert
 
Join Date: Sep 2008
Location: USA
Posts: 188
#11: Oct 2 '08

re: GZipStream compressed bytes written


OK, you are correct Plater, the BaseStream.Length is available. My statement above about the BaseStream is incorrect.

However, only while the wrapper (GZipStream) stream is open: when the GZipStream is closed, so is the base stream (unless it was constructed with leaveOpen set to true).

And because of this, the BaseStream.Length is not absolutely correct until the Compression stream is flushed/closed.

So for mach77, you can calculate bytes written by examining the increase in BaseStream.Length...but the last length given will not equal the final length of the compressed stream.

I have provided two test methods to show this; you will need to provide a text file.
    using System;
    using System.IO;
    using System.IO.Compression;

    namespace bytes {

      class test {

        static void Main() {
          GZipCompressAllAtOnce(@"C:\Temp\someFile.txt");
          GZipCompressIncremental(@"C:\Temp\someFile.txt");
        }

        public static void GZipCompressAllAtOnce(string filename) {
          byte[] fileBuffer;
          int bytesRead;
          MemoryStream ms;
          GZipStream compressedzipStream;
          using (FileStream infile = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read)) {
            fileBuffer = new byte[infile.Length];
            bytesRead = infile.Read(fileBuffer, 0, fileBuffer.Length);
          }
          using (ms = new MemoryStream()) {
            using (compressedzipStream = new GZipStream(ms, CompressionMode.Compress, true)) {
              compressedzipStream.Write(fileBuffer, 0, bytesRead);
              Console.WriteLine("Underlying Stream Length: {0}", compressedzipStream.BaseStream.Length);
            }
            Console.WriteLine("Original size: {0}, Compressed size: {1}", fileBuffer.Length, ms.Length);
          }
        }

        public static void GZipCompressIncremental(string filename) {
          byte[] fileBuffer;
          int bytesRead;
          int BUFFERSIZE = 512;
          MemoryStream ms;
          using (FileStream infile = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read)) {
            fileBuffer = new byte[BUFFERSIZE];
            using (ms = new MemoryStream()) {
              using (GZipStream compressedzipStream = new GZipStream(ms, CompressionMode.Compress, true)) {
                do {
                  bytesRead = infile.Read(fileBuffer, 0, BUFFERSIZE);
                  compressedzipStream.Write(fileBuffer, 0, bytesRead);
                  Console.WriteLine("Underlying Stream Length: {0}", compressedzipStream.BaseStream.Length);
                } while (bytesRead > 0);
              }
              Console.WriteLine("Original size: {0}, Compressed size: {1}", infile.Length, ms.Length);
            }
          }
        }

      }
    }

Plater
Moderator
 
Join Date: Apr 2007
Location: New England
Posts: 7,161
#12: Oct 2 '08

re: GZipStream compressed bytes written


Well, the OP's question was how to know how many compressed bytes were written, which can be done with the Length property as mentioned. Sort of.
The compressed streams get a preamble and postamble, as mentioned.
But I think it works like this:

CompressedStream after 3 writes
[preamble][compressedbytes][postamble]

And *NOT* like this:
CompressedStream after 3 writes
[preamble][compressedbytes][postamble][preamble][compressedbytes][postamble][preamble][compressedbytes][postamble]

I have not confirmed this however.
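
One rough way to check this, offered as a sketch rather than a proof: compress the same three chunks once through a single GZipStream (three Write calls) and once through three separate GZipStreams, then compare the sizes. If every write carried its own preamble and postamble, the two totals should be about the same. The names HeaderCheckSketch and CompressWithOneStream are made up for this example; the only outside fact relied on is that a gzip stream begins with the bytes 0x1F 0x8B.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class HeaderCheckSketch
{
    // Hypothetical helper: writes each chunk with its own Write call on ONE GZipStream
    // and returns the complete compressed output.
    public static byte[] CompressWithOneStream(byte[][] chunks)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress, true))
            {
                foreach (byte[] chunk in chunks)
                {
                    gz.Write(chunk, 0, chunk.Length);
                }
            }
            // gz is disposed here, so the output is complete.
            return ms.ToArray();
        }
    }

    static void Main()
    {
        byte[] text = Encoding.ASCII.GetBytes("Billy joe has a lot of bottles of rum");
        byte[][] three = { text, text, text };

        // One stream, three writes.
        byte[] oneStream = CompressWithOneStream(three);

        // Three separate streams, one write each.
        long separateTotal = 0;
        foreach (byte[] chunk in three)
        {
            separateTotal += CompressWithOneStream(new byte[][] { chunk }).Length;
        }

        bool startsWithMagic = oneStream[0] == 0x1F && oneStream[1] == 0x8B;
        Console.WriteLine("Starts with gzip magic bytes: {0}", startsWithMagic);
        Console.WriteLine("One stream, three writes: {0} bytes", oneStream.Length);
        Console.WriteLine("Three separate streams:   {0} bytes", separateTotal);
    }
}
```

A noticeably smaller figure for the single stream is consistent with the first layout above, though part of the difference also comes from the compressor reusing earlier data within one stream.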