Bytes | Software Development & Data Engineering Community

Compress ASCII text as Hex?

Hi -

I was speaking with someone who mentioned that it's possible to encode
an ASCII string as hex(?) in order to fit more data into the same number of
characters. Can anyone enlighten me?

The scenario is - I've got a CSV with a field that has a 16 character
limit. I need to fit potentially 24 ASCII characters into it.

Thanks.
-Ben
--
to reply, remove .s.p.a.m. from email
Nov 16 '05 #1
Ben,

You can't do that unless you limit the range of characters that can be
used in the 24 character string. Without doing that, you have to accept the
full range of characters and you can't just squeeze them in there without
some loss.
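
To put rough numbers on that limit: if each of the 16 stored characters can take one of B values and each of the 24 input characters one of A values, a lossless mapping needs A^24 <= B^16, i.e. A <= B^(2/3). The sketch below only evaluates that bound. The helper name `MaxInputAlphabet` and the two example output alphabets (all 256 byte values, or the 93 printable ASCII characters left after excluding comma and double quote) are illustrative assumptions, not something taken from the question.

```csharp
using System;

public static class FieldCapacity
{
    // Largest input alphabet size A such that A^inputLength <= B^fieldLength,
    // where B is the number of distinct values one stored character may take.
    // Hypothetical helper, for illustration only.
    public static int MaxInputAlphabet(int outputAlphabet, int fieldLength, int inputLength)
    {
        double bound = Math.Pow(outputAlphabet, (double)fieldLength / inputLength);
        // Small epsilon guards against a result like 3.9999999 for an exact 4.
        return (int)Math.Floor(bound + 1e-9);
    }
}
```

Under these assumptions, `FieldCapacity.MaxInputAlphabet(256, 16, 24)` gives 40 and `FieldCapacity.MaxInputAlphabet(93, 16, 24)` gives 20, so the 24 input characters would have to come from a fairly small set.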

Hope this helps.

--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Nov 16 '05 #2
Thanks Nicholas,

The 24 character string is a concatenation of a number (8-10 digits, I
believe) and two other string fields. Would I have more success if I
tried to shrink the number only?
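
As a rough way to gauge what shrinking the number alone could buy, here is a minimal sketch that rewrites a decimal number in base 36 (digits plus uppercase letters). The class name `NumberPacker` and the choice of base 36 are assumptions made for illustration; any larger printable alphabet would follow the same pattern.

```csharp
using System;
using System.Text;

public static class NumberPacker
{
    // Assumed alphabet: 0-9 then A-Z, so every output character is CSV-safe.
    private const string Digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public static string ToBase36(ulong value)
    {
        if (value == 0) return "0";
        var sb = new StringBuilder();
        while (value > 0)
        {
            // Prepend the least significant base-36 digit each time round.
            sb.Insert(0, Digits[(int)(value % 36)]);
            value /= 36;
        }
        return sb.ToString();
    }

    public static ulong FromBase36(string text)
    {
        ulong result = 0;
        foreach (char c in text)
        {
            int digit = Digits.IndexOf(char.ToUpperInvariant(c));
            if (digit < 0) throw new FormatException("Not a base-36 digit: " + c);
            result = checked(result * 36 + (ulong)digit);
        }
        return result;
    }
}
```

A 10-digit number needs at most 7 base-36 characters and an 8-digit number at most 6, so this saves two or three characters on its own, which is not enough to get from 24 down to 16.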

-Ben

--
to reply, remove .s.p.a.m. from email
Nov 16 '05 #3
If you are using only a subset of characters, try fitting two characters into
each character written to the CSV file.
Say, for example, you were only interested in the character codes 0-127:
you could write the string "me", i.e. hex codes 6d and 65, into one character:
[pseudo]
char c = (char)0x6d65; // 'm' (0x6d) in the high byte, 'e' (0x65) in the low byte
[/pseudo]

and write that single char to the text file.
Then, when you read it back, you break it up again.
Hope that helps.
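
A minimal sketch of that pairing idea, assuming the input is 7-bit ASCII with no NUL characters (NUL is used here as padding for odd-length input). The names `PairPacker`, `Pack` and `Unpack` are made up for this example.

```csharp
using System;
using System.Text;

public static class PairPacker
{
    // Packs two ASCII characters into one 16-bit char: first in the high byte,
    // second in the low byte. Odd-length input is padded with NUL.
    public static string Pack(string ascii)
    {
        var sb = new StringBuilder((ascii.Length + 1) / 2);
        for (int i = 0; i < ascii.Length; i += 2)
        {
            char high = ascii[i];
            char low = i + 1 < ascii.Length ? ascii[i + 1] : '\0';
            if (high == '\0' || high > 127 || low > 127)
                throw new ArgumentException("Input must be 7-bit ASCII without NUL.");
            sb.Append((char)((high << 8) | low));
        }
        return sb.ToString();
    }

    // Splits each packed char back into its two halves, dropping NUL padding.
    public static string Unpack(string packed)
    {
        var sb = new StringBuilder(packed.Length * 2);
        foreach (char c in packed)
        {
            sb.Append((char)(c >> 8));
            char low = (char)(c & 0xFF);
            if (low != '\0') sb.Append(low);
        }
        return sb.ToString();
    }
}
```

With this sketch a 24-character string packs into 12 chars. Whether those chars survive the trip through the CSV file depends on the file's encoding and on whatever reads it.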

Nov 16 '05 #4
Note that that will only work if your CSV file is written in a Unicode-
supporting encoding. There's also no absolute guarantee that it won't
end up forming invalid characters, or characters which the reader might
normalize to a different but equivalent form as far as Unicode is
concerned. I doubt that it'll be a problem, but it's worth bearing in
mind.
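
A small sketch of the encoding point, using the packed character U+6D65 from the example above. The helper names are assumptions; the `System.Text.Encoding` calls are the standard .NET ones.

```csharp
using System;
using System.Text;

public static class EncodingCheck
{
    // Number of bytes the string occupies once written as UTF-8.
    public static int Utf8ByteCount(string s) => Encoding.UTF8.GetByteCount(s);

    // What survives if the string is written and read back as plain ASCII.
    public static string RoundTripAscii(string s) =>
        Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(s));
}
```

The single char U+6D65 takes three bytes in UTF-8, so if the 16-character limit is really a byte limit the packing makes things worse, and an ASCII round trip replaces it with "?", losing both original characters.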

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
Yes, you're right.
Encoding could present a problem, but my description was also more than
slightly incorrect.
If we were limited to the characters 0-127 of the ASCII table, then we
would be using only 7 bits to represent a character, so for every 7
characters stored as 8-bit bytes we could squeeze in an extra one (8 characters in 7 bytes).
More trouble than it's worth, I guess.
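
For what it's worth, a sketch of that 7-bit packing, assuming the input is pure 7-bit ASCII and that the original character count is stored or known separately. `SevenBitPacker` and its method names are invented for this example.

```csharp
using System;

public static class SevenBitPacker
{
    // Writes the low 7 bits of each character back to back, most significant
    // bit first, so 8 characters occupy 7 bytes.
    public static byte[] Pack(string ascii)
    {
        var output = new byte[(ascii.Length * 7 + 7) / 8];
        int bitPos = 0;
        foreach (char c in ascii)
        {
            if (c > 127) throw new ArgumentException("Input must be 7-bit ASCII.");
            for (int bit = 6; bit >= 0; bit--, bitPos++)
            {
                if (((c >> bit) & 1) != 0)
                    output[bitPos / 8] |= (byte)(0x80 >> (bitPos % 8));
            }
        }
        return output;
    }

    // Reads charCount 7-bit groups back out of the packed bytes.
    public static string Unpack(byte[] packed, int charCount)
    {
        var chars = new char[charCount];
        int bitPos = 0;
        for (int i = 0; i < charCount; i++)
        {
            int value = 0;
            for (int bit = 0; bit < 7; bit++, bitPos++)
            {
                int b = (packed[bitPos / 8] >> (7 - bitPos % 8)) & 1;
                value = (value << 1) | b;
            }
            chars[i] = (char)value;
        }
        return new string(chars);
    }
}
```

Packed this way, 24 characters come to 21 bytes, which is still over the 16 allowed, and the bytes are arbitrary binary values rather than CSV-safe text.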

regards
Brian.
Nov 16 '05 #6
Certainly when the only necessity is to squeeze 24 characters into 16
:)

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #7
There's a method, but it's a bit snarky....

There's an encoding format called Base64 (related to, but not the same as,
uuencoding). It takes fully binary data (0-255) and converts it to a set of 64
printable characters (digits, uppercase, lowercase, plus two symbols, + and
/). Since email messages were required to be pure printable text (due to
some ancient hardware, which is almost certainly no longer on the 'net),
binary attachments are typically Base64 encoded. It converts 3 binary bytes
into 4 characters, so encoded blocks increase 33% in size.

So, how does this affect you? Well, as long as your "encoded" string meets
the criteria of Base64 encoding, you can "decode" it into a smaller block of
binary data. 4 characters will become 3 bytes, or in your case, 20
characters can become 15 bytes.

using System;

string origString = "123456,abcdef,ghijkl"; // 20-character CSV text

// Replace commas with plus signs so the text is valid Base64
string prepareText = origString.Replace(',', '+');
byte[] compressedText = Convert.FromBase64String(prepareText);
Console.WriteLine("Length of compressed text = {0}", compressedText.Length);
// Save compressedText to your store.
// :
// Later read it back
string alteredText = Convert.ToBase64String(compressedText);
string finalString = alteredText.Replace('+', ',');

Console.WriteLine("Text: {0}, this {1} the same as the original",
    finalString, finalString == origString ? "IS" : "IS NOT");

Running the above, I get:
Length of compressed text = 15
Text: 123456,abcdef,ghijkl, this IS the same as the original
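
The trick only round-trips when the text already looks like unpadded Base64, so a guard along the following lines may be worth having. This is a sketch under assumptions: the class `Base64Trick` and its methods are hypothetical, and a literal '+' in the input is rejected because it would come back as a comma.

```csharp
using System;
using System.Linq;

public static class Base64Trick
{
    // True when the text can be "decoded" and later restored exactly:
    // length is a multiple of 4 and every character is a letter, a digit,
    // '/' or ',' (the comma stands in for '+').
    public static bool CanPack(string text)
    {
        return text.Length % 4 == 0 &&
               text.All(c => (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                             (c >= '0' && c <= '9') || c == '/' || c == ',');
    }

    public static byte[] Pack(string text)
    {
        if (!CanPack(text))
            throw new ArgumentException("Text does not fit the Base64 alphabet and length rules.");
        return Convert.FromBase64String(text.Replace(',', '+'));
    }

    public static string Unpack(byte[] packed)
    {
        return Convert.ToBase64String(packed).Replace('+', ',');
    }
}
```

Spaces, most punctuation and lengths that are not a multiple of 4 all fail the check, which is the practical cost of the approach.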

--
Truth,
James Curran
[erstwhile VC++ MVP]
Home: www.noveltheory.com Work: www.njtheater.com
Blog: www.honestillusion.com Day Job: www.partsearch.com

Nov 16 '05 #8
Yes... it does mean you can only have 63 distinct characters though
(IIRC, '=' is used for end padding, which you also need to work out).

It also doesn't get 24 characters down to 16 :( Possibly combining
that (if it all applies appropriately) with something clever for
the 8 digits (which can be represented as a 4-byte integer) could
get you there.

It all sounds like something which should be redesigned rather than
munged like this though...
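
Purely to show how the arithmetic of such a combination could work out, here is a sketch that stores the number as a 4-byte big-endian integer and "decodes" the remaining 16 characters as Base64 into 12 bytes, for 16 bytes in total. Everything here is an assumption for illustration: the class `CombinedPacker`, the 8 + 16 character split, and the requirement that the 16 characters use only the Base64 alphabet.

```csharp
using System;

public static class CombinedPacker
{
    // number: the numeric field (8 digits always fit in a uint).
    // rest: exactly 16 characters from the Base64 alphabet, no '=' padding.
    public static byte[] Pack(uint number, string rest)
    {
        if (rest.Length != 16)
            throw new ArgumentException("Expected exactly 16 characters.");
        byte[] tail = Convert.FromBase64String(rest); // 16 chars -> 12 bytes
        if (tail.Length != 12)
            throw new ArgumentException("Padding or whitespace is not supported.");

        var packed = new byte[4 + tail.Length];
        packed[0] = (byte)(number >> 24);
        packed[1] = (byte)(number >> 16);
        packed[2] = (byte)(number >> 8);
        packed[3] = (byte)number;
        Array.Copy(tail, 0, packed, 4, tail.Length);
        return packed;
    }

    public static (uint Number, string Rest) Unpack(byte[] packed)
    {
        uint number = ((uint)packed[0] << 24) | ((uint)packed[1] << 16) |
                      ((uint)packed[2] << 8) | packed[3];
        string rest = Convert.ToBase64String(packed, 4, packed.Length - 4);
        return (number, rest);
    }
}
```

The result is 16 bytes, not 16 safe characters: the bytes can include commas, quotes, line breaks or NULs, so writing them into a CSV field would still need an 8-bit-clean encoding and careful quoting. A 10-digit number above 4,294,967,295 would not fit in the 4-byte integer either.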

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #9
