c# - Encoding in 8 bits!!

Hi all,

I have seen a similar post to this in the past but with no resolution,
so I will explain my problem in full:

I am writing a text editor that will be used in several regions around
the world, but my testing will be done in Turkey (I am in GB). The
output of this editor will be used in a DOS environment, so any hint of
it outputting Unicode is out of the question.

Before writing the application I asked a Turkish client for a file
generated in Notepad containing all the letters of the alphabet
and any special Turkish characters. I then generated a screen font for
the DOS environment which mapped the special characters' values (all
above 128) to their graphical representations. So far so good.

I then moved on to create the application. Originally for IO I used a
pair of StreamReaders / Writers with encoding set to Encoding.ASCII.
Obviously this scheme only handles ASCII values within the 7-bit
range - no good, as 129+ spilled over into 2 bytes.

I next tried Encoding.Default. This behaves very strangely - it saves
single-byte values on my machine and 2-byte values on the Turkish
machine. Still not good enough, then.

I am desperate to find a solution to this. I would simply like to
output all characters in the 8-bit character mapping scheme that is
used by NOTEPAD!! Surely this is easy. I know I can get to the ANSI
code page as follows:

// requires: using System.Globalization;
TextInfo ti = CultureInfo.CurrentCulture.TextInfo;
int ansiCodePage = ti.ANSICodePage;

But what now!!!

I would appreciate any advice anybody can give me.
Jul 21 '05 #1
Cor
Hi Duncan,

You can save yourself a lot of work.

.NET programs do not run on DOS.

Cor
Jul 21 '05 #2
Duncan M <du*******@hotmail.com> wrote:
> I have seen a similar post to this in the past but with no resolution,
> so I will explain my problem in full:
>
> I am writing a text editor that will be used in several regions around
> the world, but my testing will be done in Turkey (I am in GB). The
> output of this editor will be used in a DOS environment, so any hint of
> it outputting Unicode is out of the question.
>
> Before writing the application I asked a Turkish client for a file
> generated in Notepad containing all the letters of the alphabet
> and any special Turkish characters. I then generated a screen font for
> the DOS environment which mapped the special characters' values (all
> above 128) to their graphical representations. So far so good.
>
> I then moved on to create the application. Originally for IO I used a
> pair of StreamReaders / Writers with encoding set to Encoding.ASCII.
> Obviously this scheme only handles ASCII values within the 7-bit
> range - no good, as 129+ spilled over into 2 bytes.

I would expect Unicode 128+ to come out as rubbish using an ASCII
encoding, but still a single byte - probably (unicodeValue & 0x7f).
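
A quick way to check what Encoding.ASCII does with a character above 127 is to look at the bytes directly. The sketch below assumes a current .NET runtime, where the ASCII encoder substitutes '?' (0x3F) for anything it cannot represent; older framework versions may have behaved differently, which is not verified here. Either way the result is one byte per character, not two. The class name is only illustrative.

```csharp
using System;
using System.Text;

class AsciiFallbackDemo
{
    static void Main()
    {
        // U+015F (s with cedilla) is outside the 7-bit ASCII range.
        string text = "a\u015F";
        byte[] bytes = Encoding.ASCII.GetBytes(text);

        // One byte per character: 'a' stays 0x61 and the Turkish
        // letter is replaced by '?' (0x3F) on current .NET runtimes.
        Console.WriteLine(BitConverter.ToString(bytes)); // prints 61-3F
    }
}
```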

> I next tried Encoding.Default. This behaves very strangely - it saves
> single-byte values on my machine and 2-byte values on the Turkish
> machine. Still not good enough, then.

Encoding.Default uses whatever the system default encoding is - I
suspect the Turkish machine has a different default encoding,
presumably a multibyte one.

> I am desperate to find a solution to this. I would simply like to
> output all characters in the 8-bit character mapping scheme that is
> used by NOTEPAD!!

Used by Notepad on which machine though? It will vary...
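
One way to see how the answer varies is to ask .NET for the ANSI code page of each culture involved. This is only a sketch: the two culture names are assumptions matching the machines described in the post, the class name is illustrative, and the values reported can depend on the runtime and operating system.

```csharp
using System;
using System.Globalization;

class AnsiCodePagePerCulture
{
    static void Main()
    {
        // Assumed cultures: the UK development machine and the Turkish test machine.
        foreach (string name in new[] { "en-GB", "tr-TR" })
        {
            TextInfo ti = new CultureInfo(name).TextInfo;
            Console.WriteLine("{0}: ANSI code page {1}", name, ti.ANSICodePage);
        }
    }
}
```

The number reported for a culture can be passed to Encoding.GetEncoding to obtain the matching encoding object.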

> Surely this is easy. I know I can get to the ANSI
> code page as follows:
>
> // requires: using System.Globalization;
> TextInfo ti = CultureInfo.CurrentCulture.TextInfo;
> int ansiCodePage = ti.ANSICodePage;
>
> But what now!!!
>
> I would appreciate any advice anybody can give me.


I would suggest finding out *exactly* what encoding you're really after
(not just using Encoding.Default) and specify that for your
StreamWriter.
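
As a minimal sketch of that suggestion, the code below pins the writer to one named code page instead of relying on Encoding.Default. Code page 1254 (Windows Turkish) is only an assumption about what the client's Notepad produced, and the file name and class name are placeholders. On .NET Core and .NET 5+ the legacy code pages have to be registered first through CodePagesEncodingProvider; on the .NET Framework that line is unnecessary.

```csharp
using System;
using System.IO;
using System.Text;

class FixedCodePageWriter
{
    static void Main()
    {
        // Needed on .NET Core / .NET 5+ only; omit on the .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Assumption: the target files use Windows-1254 (Turkish "ANSI").
        Encoding turkish = Encoding.GetEncoding(1254);

        string path = "output.txt";          // hypothetical file name
        string text = "\u015F\u011F\u0131";  // three Turkish letters

        using (StreamWriter writer = new StreamWriter(path, false, turkish))
        {
            writer.Write(text);
        }

        // With a single-byte code page the file length in bytes should
        // equal the number of characters, whatever machine this runs on.
        byte[] written = File.ReadAllBytes(path);
        Console.WriteLine(written.Length == text.Length);
    }
}
```

Because the code page is named explicitly, the output no longer depends on the default encoding of the machine the editor runs on.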

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #3
Cor <no*@non.com> wrote:
> You can save yourself a lot of work.
>
> .NET programs do not run on DOS.


The OP never suggested they would be - just that the *output* of the
.NET program would be used in DOS.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #4
Cor
Hi Jon,

Thanks, I tried to correct my message.

Cor
Jul 21 '05 #5
Cor
Hi Duncan,

Jon pointed out that I misunderstood your question, and I think he may be
right.

But to add something for both of you: ASCII is a 7-bit encoding. That is not
what is used on a DOS computer.

On a DOS computer it is, as far as I know, UTF8.

http://msdn.microsoft.com/library/de...classtopic.asp

I think that for this you need the right code page for codes above 127
(the ones mostly used in Europe and the US are, as far as I remember, 850 and 437).

I do not know how you can use that, but maybe you can see it in the class
documentation for which I have given a link above.

Cor
Jul 21 '05 #6
Cor <no*@non.com> wrote:
> Jon pointed out that I misunderstood your question, and I think he may be
> right.
>
> But to add something for both of you: ASCII is a 7-bit encoding. That is not
> what is used on a DOS computer.

Well, all the encodings I've seen used in DOS are ASCII-*compatible*,
i.e. they're "extensions" of ASCII. Given the encodings problem, it's
also often safest just to restrict yourself to ASCII if you can :)

> On a DOS computer it is, as far as I know, UTF8.

Nope, it's an individual code page, usually (as you say) 850 and 437.
Usually single byte encodings though, as far as I've seen.
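
To make the single-byte point concrete, the sketch below encodes one Turkish letter with several code pages and prints what each produces. Code pages 437 and 850 are the ones named in this thread; 857 (a DOS Turkish page) and 1254 (Windows Turkish) are additions assumed here for the Turkish case. The class name is illustrative, and the provider registration is only needed on .NET Core and .NET 5+.

```csharp
using System;
using System.Text;

class DosCodePageDemo
{
    static void Main()
    {
        // Needed on .NET Core / .NET 5+ only; omit on the .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string letter = "\u015F"; // s with cedilla

        // 437 and 850 come from the thread; 857 and 1254 are assumptions
        // added for the Turkish case.
        foreach (int codePage in new[] { 437, 850, 857, 1254 })
        {
            Encoding enc = Encoding.GetEncoding(codePage);
            byte[] bytes = enc.GetBytes(letter);
            Console.WriteLine("{0}: single-byte={1}, bytes={2}",
                codePage, enc.IsSingleByte, BitConverter.ToString(bytes));
        }
    }
}
```

Where the byte values differ between pages, the DOS screen font has to be built for whichever page the writer actually uses; pages that lack the letter will emit a substitute character instead.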

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #7
