string to byte[] back to string + Compression Failed!

I'm writing some code that will convert a regular string to a byte[]
for compression and then be able to convert that compressed string back
into its original form.

Conceptually I have....

For compression:
string -> (Unicode conversion) byte[] -> (compression + Unicode
conversion) string

For decompression:
string -> (Unicode conversion) byte[] -> (decompression + Unicode
conversion) string

The problem is that there's a code chunk that fails. Probably because
of some bad conversion somewhere in my code.

The key line that is constantly failing is:
int size = s.Read(write_data, 0, 8);
With the GZip algorithm, it throws an ArrayIndexOutOfBounds exception.
With Deflate, it gives me a data corruption error. I looked at the byte[]
right after compression and right before decompression and they DO NOT
MATCH! What is the problem in this situation?

I need this process such that I get these 2 functions...

string CompressString(string in, Algorithm.GZip or Algorithm.Deflate);
string DecompressString(string in, Algorithm.GZip or
Algorithm.Deflate);

I've seen similar code but no potential fixes on Google.com
My Code is below....


using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using System.IO.Compression;

namespace Jeremyje.Utility
{
    public class StringTransforms
    {
        public enum StringCompressionAlgorithm
        {
            GZip,
            Deflate
        }

        public static byte[] UnicodeStringToByteArray(string str)
        {
            UnicodeEncoding enc = new UnicodeEncoding();
            return enc.GetBytes(str);
        }

        public static string ByteArrayToUnicodeString(byte[] str_arr)
        {
            UnicodeEncoding enc = new UnicodeEncoding();
            return enc.GetString(str_arr);
        }

        public static bool DecompressString(string in_string, out string out_string)
        {
            return DecompressString(in_string, out out_string, StringCompressionAlgorithm.GZip);
        }

        public static bool DecompressString(string in_string, out string out_string, StringCompressionAlgorithm alg)
        {
            bool status = false;
            out_string = in_string;

            switch (alg)
            {
                case StringCompressionAlgorithm.GZip:
                    {
                        try
                        {
                            out_string = "";
                            int total_length = 0;
                            byte[] write_data = new byte[4096];
                            byte[] bData = UnicodeStringToByteArray(in_string);

                            GZipStream s = new GZipStream(new MemoryStream(bData), CompressionMode.Decompress);

                            while (true)
                            {
                                int size = s.Read(write_data, 0, 8);
                                if (size > 0)
                                {
                                    total_length += size;
                                    out_string += Encoding.Unicode.GetString(write_data, 0, size);
                                }
                                else
                                {
                                    break;
                                }
                            }
                            s.Close();
                            status = true;
                        }
                        catch (Exception e)
                        {
                            Console.WriteLine(e);
                            status = false;
                        }

                        return status;
                    }
                case StringCompressionAlgorithm.Deflate:
                    {
                        try
                        {
                            out_string = "";
                            int total_length = 0;
                            byte[] write_data = new byte[4096];
                            byte[] bData = UnicodeStringToByteArray(in_string);

                            DeflateStream s = new DeflateStream(new MemoryStream(bData), CompressionMode.Decompress);

                            while (true)
                            {
                                int size = s.Read(write_data, 0, 8);
                                if (size > 0)
                                {
                                    total_length += size;
                                    out_string += Encoding.Unicode.GetString(write_data, 0, size);
                                }
                                else
                                {
                                    break;
                                }
                            }
                            s.Close();
                            status = true;
                        }
                        catch (Exception e)
                        {
                            Console.WriteLine(e);
                            status = false;
                        }

                        return status;
                    }

                default:
                    break;
            }

            return status;
        }

        public static bool CompressString(string in_string, out string out_string)
        {
            return CompressString(in_string, out out_string, StringCompressionAlgorithm.GZip);
        }

        public static bool CompressString(string in_string, out string out_string, StringCompressionAlgorithm alg)
        {
            bool status = false;
            out_string = in_string;

            switch (alg)
            {
                case StringCompressionAlgorithm.GZip:
                    {
                        try
                        {
                            MemoryStream ms = new MemoryStream();
                            Stream s = new GZipStream(ms, CompressionMode.Compress);
                            byte[] bData = UnicodeStringToByteArray(in_string);

                            s.Write(bData, 0, bData.Length);
                            s.Close();
                            byte[] compressed_data = (byte[])ms.ToArray();
                            out_string = ByteArrayToUnicodeString(compressed_data);
                            status = true;
                        }
                        catch (Exception e)
                        {
                            Console.WriteLine(e);
                            status = false;
                        }

                        return status;
                    }
                case StringCompressionAlgorithm.Deflate:
                    {
                        try
                        {
                            MemoryStream ms = new MemoryStream();
                            Stream s = new DeflateStream(ms, CompressionMode.Compress);
                            byte[] bData = UnicodeStringToByteArray(in_string);

                            s.Write(bData, 0, bData.Length);
                            s.Close();
                            byte[] compressed_data = (byte[])ms.ToArray();
                            out_string = ByteArrayToUnicodeString(compressed_data);
                            status = true;
                        }
                        catch (Exception e)
                        {
                            Console.WriteLine(e);
                            status = false;
                        }

                        return status;
                    }

                default:
                    break;
            }

            return false;
        }
    }
}

Feb 8 '07 #1
Hi Jeremy,

Your problem is converting the compressed byte[] to string. After the
compression a string can't hold the data the byte[] holds and you lose
lots of data causing an exception when you try to decompress it.
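
The data loss can be reproduced in isolation. The sketch below is illustrative only: the class and method names are made up for this example, and it assumes .NET 2.0 or later, where Encoding.Unicode replaces invalid UTF-16 sequences with U+FFFD instead of passing them through. Compressed output is arbitrary binary, so it will often contain sequences like the two invalid inputs here.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of the original post.
public static class RoundTripDemo
{
    // Returns true when the bytes survive byte[] -> string -> byte[] via UTF-16.
    public static bool SurvivesUnicodeRoundTrip(byte[] data)
    {
        string asText = Encoding.Unicode.GetString(data);
        byte[] back = Encoding.Unicode.GetBytes(asText);
        if (back.Length != data.Length) return false;
        for (int i = 0; i < data.Length; i++)
        {
            if (back[i] != data[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        // 0x00 0xD8 is the code unit U+D800, a high surrogate with no low
        // surrogate after it, which is not valid UTF-16.
        byte[] loneSurrogate = { 0x00, 0xD8, 0x41, 0x00 };
        // An odd number of bytes cannot be a whole number of UTF-16 code units.
        byte[] oddLength = { 0x41, 0x00, 0x42 };
        // Bytes that really are UTF-16 text round-trip without loss.
        byte[] realText = Encoding.Unicode.GetBytes("AB");

        Console.WriteLine(SurvivesUnicodeRoundTrip(loneSurrogate));
        Console.WriteLine(SurvivesUnicodeRoundTrip(oddLength));
        Console.WriteLine(SurvivesUnicodeRoundTrip(realText));
    }
}
```

Only the last input is expected to survive, which is consistent with the byte[] before decompression not matching the byte[] after compression.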

Having Compress return a byte[] and Decompress take a byte[] will solve
your problem. If you need the byte[] to be represented as string you can
use Base64.

[DecompressString]
byte[] bData = Convert.FromBase64String(in_string);

[CompressString]
out_string = Convert.ToBase64String(compressed_data);
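
Put together, the two changes above might look like the following sketch. It is not the original poster's final code: the class and method names are hypothetical, it covers only GZip, and it lets exceptions propagate instead of returning a bool status.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper class illustrating the Base64 approach.
public static class Base64Compression
{
    public static string CompressToBase64(string text)
    {
        byte[] raw = Encoding.Unicode.GetBytes(text);
        using (MemoryStream ms = new MemoryStream())
        {
            // Disposing the GZipStream flushes the remaining compressed data.
            using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress))
            {
                gz.Write(raw, 0, raw.Length);
            }
            // MemoryStream.ToArray is still valid after the stream is closed.
            return Convert.ToBase64String(ms.ToArray());
        }
    }

    public static string DecompressFromBase64(string base64)
    {
        byte[] compressed = Convert.FromBase64String(base64);
        using (GZipStream gz = new GZipStream(new MemoryStream(compressed), CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gz.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
            // Decode once at the end so that no UTF-16 character is split
            // across two reads.
            return Encoding.Unicode.GetString(output.ToArray());
        }
    }

    public static void Main()
    {
        string original = "string to byte[] back to string, with compression";
        string packed = CompressToBase64(original);
        Console.WriteLine(DecompressFromBase64(packed) == original);
    }
}
```

Collecting the decompressed bytes first and decoding them in one call also avoids a second hazard in the original loop, where each chunk returned by Read was decoded on its own.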
--
Happy Coding!
Morten Wennevik [C# MVP]
Feb 8 '07 #2
<je******@gmail.com> wrote:
> I'm writing some code that will convert a regular string to a byte[]
> for compression and then be able to convert that compressed string back
> into original form.

Don't try to encode arbitrary binary data as a string directly using
Encoding.Unicode. As Morten suggested, use Base64 instead.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 8 '07 #3
On Feb 8, 1:55 am, Jon Skeet [C# MVP] <s...@pobox.com> wrote:
> <jerem...@gmail.com> wrote:
>> I'm writing some code that will convert a regular string to a byte[]
>> for compression and then be able to convert that compressed string back
>> into original form.
>
> Don't try to encode arbitrary binary data as a string directly using
> Encoding.Unicode. As Morten suggested, use Base64 instead.

Yeah, that's what I ended up doing, but that's not much of a
compression since Base64 makes the data a lot larger. Is there another
encoding other than Base64 that will yield better results?

Feb 10 '07 #4
<je******@gmail.com> wrote:
>> Don't try to encode arbitrary binary data as a string directly using
>> Encoding.Unicode. As Morten suggested, use Base64 instead.
>
> Yeah, that's what I ended up doing but that's not much of a
> compression since base64 makes the data a lot larger. Is there another
> encoding other than base64 that will yield better results?

Not with the same degree of safety. If you could store the raw
compressed data instead of converting it back to a string, you'd be
okay - but to convert arbitrary binary data into text data which is
"safe" in many situations (i.e. won't be subject to Unicode
normalization, can be expressed in many encodings, etc.) Base64 is a very
good choice.
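
For a sense of how much larger Base64 makes the data: it emits 4 characters for every 3 input bytes, with the last group padded, so the output is about a third larger than the input. The sketch below checks that formula against Convert.ToBase64String; the class and method names are made up for illustration.

```csharp
using System;

// Hypothetical helper for estimating Base64 output size.
public static class Base64Overhead
{
    // 4 output characters per group of 3 input bytes; a partial last
    // group is padded out to a full 4 characters.
    public static int EncodedLength(int byteCount)
    {
        return ((byteCount + 2) / 3) * 4;
    }

    public static void Main()
    {
        int[] sizes = { 1, 3, 100, 3000 };
        foreach (int n in sizes)
        {
            int actual = Convert.ToBase64String(new byte[n]).Length;
            Console.WriteLine("{0} bytes -> {1} chars (formula: {2})", n, actual, EncodedLength(n));
        }
    }
}
```

Note that a .NET string holds UTF-16 characters, so the one-third figure applies when the Base64 text is stored or sent in a single-byte encoding such as ASCII or UTF-8.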

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 10 '07 #5
je******@gmail.com wrote:
> On Feb 8, 1:55 am, Jon Skeet [C# MVP] <s...@pobox.com> wrote:
>> <jerem...@gmail.com> wrote:
>>> I'm writing some code that will convert a regular string to a byte[]
>>> for compression and then be able to convert that compressed string back
>>> into original form.
>> Don't try to encode arbitrary binary data as a string directly using
>> Encoding.Unicode. As Morten suggested, use Base64 instead.
> Yeah, that's what I ended up doing but that's not much of a
> compression since base64 makes the data a lot larger. Is there another
> encoding other than base64 that will yield better results?

Nothing standard.

Using a home-made Base128 together with a single-byte
encoding like ISO-8859-1 will reduce the overhead slightly.
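
One way such a home-made Base128 could work is sketched below. This is an assumption about what is meant, not a standard format: the names are hypothetical, and the output uses all character values 0 to 127, including control characters such as NUL, so it lacks the safety that Base64 offers in text channels.

```csharp
using System;
using System.Text;

// Hypothetical, non-standard encoding: 7 payload bits per character.
public static class Base128Sketch
{
    public static string Encode(byte[] data)
    {
        StringBuilder sb = new StringBuilder((data.Length * 8 + 6) / 7);
        int bits = 0;      // bit accumulator
        int bitCount = 0;  // number of pending bits in the accumulator
        foreach (byte b in data)
        {
            bits = (bits << 8) | b;
            bitCount += 8;
            while (bitCount >= 7)
            {
                bitCount -= 7;
                sb.Append((char)((bits >> bitCount) & 0x7F));
            }
            bits &= (1 << bitCount) - 1; // keep only the leftover bits
        }
        if (bitCount > 0)
        {
            // Left-align the final partial group; the low bits are padding.
            sb.Append((char)((bits << (7 - bitCount)) & 0x7F));
        }
        return sb.ToString();
    }

    public static byte[] Decode(string text)
    {
        // Trailing padding bits never fill a whole byte, so they are dropped.
        byte[] result = new byte[text.Length * 7 / 8];
        int bits = 0, bitCount = 0, index = 0;
        foreach (char c in text)
        {
            bits = (bits << 7) | (c & 0x7F);
            bitCount += 7;
            if (bitCount >= 8)
            {
                bitCount -= 8;
                result[index++] = (byte)((bits >> bitCount) & 0xFF);
            }
            bits &= (1 << bitCount) - 1;
        }
        return result;
    }

    public static void Main()
    {
        byte[] data = new byte[700];
        new Random(1).NextBytes(data);

        string encoded = Encode(data);
        // ISO-8859-1 maps each character below 256 to exactly one byte.
        byte[] stored = Encoding.GetEncoding("ISO-8859-1").GetBytes(encoded);

        Console.WriteLine("Base128: {0} bytes", stored.Length);
        Console.WriteLine("Base64:  {0} bytes", Convert.ToBase64String(data).Length);
    }
}
```

With this scheme the stored size is about 8/7 of the input (roughly 14% overhead) against about 4/3 for Base64 (roughly 33%), but only if the text is stored one byte per character.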

Arne
Feb 10 '07 #6
