c# - Encoding in 8 bits!!

Hi all,

I have seen a similar post to this in the past but no resolution. I
will explain my problem fully:

I am writing a text editor that will be used in several regions around
the world but my testing will be done in Turkey (I am in GB). The
output of this editor will be used in a DOS environment so any hint of
it outputting Unicode is out of the question.

Before writing the application I requested a file from a Turkish
client to be generated in Notepad with all the letters of the alphabet
and any special Turkish characters. I then generated a screen font for
the DOS environment which mapped the special character values (all
above 128) to their graphical representations. So far, so good.

I then moved on to create the application. Originally for IO I used a
pair of StreamReaders / Writers with encoding set to Encoding.ASCII.
Obviously this scheme only handles values within the 7-bit ASCII
range - no good as 129+ spilled over into 2 bytes.

I next tried Encoding.Default, which behaves very strangely - it saves
single-byte values on my machine and 2-byte values on the Turkish
machine. Still not good enough then.

I am desperate to find a solution to this, I would simply like to
output all characters in the 8-bit character mapping scheme that is
used by NOTEPAD!! Surely this is easy. I know I can get to the ANSI
codepage as follows:

TextInfo ti = CultureInfo.CurrentCulture.TextInfo;
int ansiCodePage = ti.ANSICodePage;

But what now!!!

I would appreciate any advice anybody can give me.
Jul 21 '05 #1
Cor
Hi Duncan,

You can save yourself a lot of work.

.NET programs do not run on DOS.

Cor
Jul 21 '05 #2
Duncan M <du*******@hotmail.com> wrote:
> I have seen a similar post to this in the past but no resolution. I
> will explain my problem fully:
>
> I am writing a text editor that will be used in several regions around
> the world but my testing will be done in Turkey (I am in GB). The
> output of this editor will be used in a DOS environment so any hint of
> it outputting Unicode is out of the question.
>
> Before writing the application I requested a file from a Turkish
> client to be generated in Notepad with all the letters of the alphabet
> and any special Turkish characters. I then generated a screen font for
> the DOS environment which mapped the special character values (all
> above 128) to their graphical representations. So far, so good.
>
> I then moved on to create the application. Originally for IO I used a
> pair of StreamReaders / Writers with encoding set to Encoding.ASCII.
> Obviously this scheme only handles values within the 7-bit ASCII
> range - no good as 129+ spilled over into 2 bytes.

I would expect Unicode 128+ to come out as rubbish using an ASCII
encoding, but still a single byte - probably (unicodeValue & 0x7f).

> I next tried Encoding.Default, which behaves very strangely - it saves
> single-byte values on my machine and 2-byte values on the Turkish
> machine. Still not good enough then.

Encoding.Default uses whatever the system default encoding is - I
suspect the Turkish machine has a different default encoding,
presumably a multibyte one.

> I am desperate to find a solution to this, I would simply like to
> output all characters in the 8-bit character mapping scheme that is
> used by NOTEPAD!!

Used by Notepad on which machine though? It will vary...

> Surely this is easy. I know I can get to the ANSI
> codepage as follows:
>
> TextInfo ti = CultureInfo.CurrentCulture.TextInfo;
> int ansiCodePage = ti.ANSICodePage;
>
> But what now!!!
>
> I would appreciate any advice anybody can give me.


I would suggest finding out *exactly* what encoding you're really after
(not just using Encoding.Default) and specify that for your
StreamWriter.
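
As a minimal sketch of that suggestion: the code below assumes the
encoding wanted is Windows-1254, the ANSI code page a Turkish-locale
Windows install would normally use for Notepad. That is an assumption
and should be checked against the sample file before relying on it.
The class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class SingleByteOutput
{
    // Assumption: 1254 (Windows Turkish) is the code page of the sample file.
    public const int AssumedCodePage = 1254;

    public static Encoding GetTargetEncoding()
    {
        // On .NET Core / .NET 5+ the legacy code pages must be registered
        // before GetEncoding can find them. On .NET Framework they are built
        // in, and CodePagesEncodingProvider comes from the separate
        // System.Text.Encoding.CodePages package.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // ExceptionFallback makes a character that has no slot in the code
        // page throw, instead of silently being written as '?'.
        return Encoding.GetEncoding(
            AssumedCodePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
    }

    public static void Save(string path, string text)
    {
        // The encoding is named explicitly, so the bytes written do not
        // depend on the regional settings of the machine running the editor.
        using (var writer = new StreamWriter(path, false, GetTargetEncoding()))
        {
            writer.Write(text);
        }
    }

    public static string Load(string path)
    {
        using (var reader = new StreamReader(path, GetTargetEncoding()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With this, SingleByteOutput.Save("out.txt", "ğüşıöç") should write six
bytes, one per character, on any machine. The throwing fallback is a
design choice: for a file consumed by a DOS program, failing loudly on
an unmappable character seems safer than writing a '?' in its place.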

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #3
Cor <no*@non.com> wrote:
> You can save yourself a lot of work.
>
> .NET programs do not run on DOS.

The OP never suggested they would be - just that the *output* of the
.NET program would be used in DOS.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #4
Cor
Hi Jon,

Thanks, I tried to correct my message.

Cor
Jul 21 '05 #5
Cor
Hi Duncan,

Jon pointed out that I misunderstood your question, and I think he may be
right.

But to clear things up for you both: ASCII is a 7-bit encoding. That is not
what is used on a DOS computer.

On a DOS computer it is, as far as I know, UTF-8.

http://msdn.microsoft.com/library/de...classtopic.asp

I think that for this you need the right code page for codes above 127
(the ones mostly used in Europe and the US are, as far as I remember, 850
and 437).

I do not know how you can use that, but maybe you can find it in the class
documentation I linked above.

Cor
Jul 21 '05 #6
Cor <no*@non.com> wrote:
> Jon pointed out that I misunderstood your question, and I think he may be
> right.
>
> But to clear things up for you both: ASCII is a 7-bit encoding. That is not
> what is used on a DOS computer.

Well, all the encodings I've seen used in DOS are ASCII-*compatible*,
i.e. they're "extensions" of ASCII. Given the encodings problem, it's
also often safest just to restrict yourself to ASCII if you can :)

> On a DOS computer it is, as far as I know, UTF-8.

Nope, it's an individual code page, usually (as you say) 850 and 437.
Usually single-byte encodings though, as far as I've seen.
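
To illustrate the point about single-byte code pages, here is a small
sketch that encodes one character under a given code page. It assumes
a runtime where the legacy code pages are available (built in on .NET
Framework, registered through CodePagesEncodingProvider on .NET Core
and later). The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class CodePageComparison
{
    // Returns the bytes that the given code page uses for one character.
    public static byte[] Encode(char c, int codePage)
    {
        // Needed on .NET Core / .NET 5+ so that code pages such as 437 and
        // 850 can be looked up; unnecessary on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetBytes(new[] { c });
    }

    // Formats the result, e.g. for printing one line per code page.
    public static string Describe(char c, int codePage)
    {
        byte[] bytes = Encode(c, codePage);
        return string.Format(
            "code page {0}: {1} byte(s), first = 0x{2:X2}",
            codePage, bytes.Length, bytes[0]);
    }
}
```

For 'é' (U+00E9), code pages 437 and 850 both give the single byte
0x82, while Windows-1252 gives 0xE9. Each is one byte, but the values
differ, which is why a font or file built for one code page shows the
wrong glyphs under another.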

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #7
