C# Chr and Asc Function Equivalents - The Undocumented Truth!

There have been many postings about this subject on this newsgroup.
Unfortunately, they're incorrect. You can't just cast a value in C#
and have it work for all ASCII characters. Nor can you use the ASCII
encoding as some have suggested.

The undocumented truth is, Microsoft uses the Western European encoding
in these functions. If you don't believe me, use 137 in your VB Chr
function then compare the C# output if you just cast it or use the
ASCII encoding. You'll see they don't match! You can even do a quick
loop and check all the values from 0 to 255 and you'll see that there
are many that won't match the VB function's output.
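
The comparison described above can be tried on the C# side alone. The following is a minimal sketch, not the VB runtime's actual logic: it decodes one byte value three ways (plain cast, ASCII encoding, code page 1252). The names `DecodeComparison`, `ViaCast`, `ViaAscii`, `ViaCp1252` and `CountCastMismatches` are made up for illustration, and the `RegisterProvider` call assumes a current .NET runtime (.NET Core / .NET 5+), where code page encodings are not registered by default.

```csharp
using System;
using System.Text;

// Hypothetical helper: decodes one byte value three different ways.
public static class DecodeComparison
{
    static DecodeComparison()
    {
        // Assumption: on .NET Core / .NET 5+ code page 1252 is only
        // available after registering the code pages provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Plain cast: the numeric value is used directly as a UTF-16 code unit.
    public static char ViaCast(int value) => (char)value;

    // ASCII encoding: bytes above 127 are not ASCII and decode to '?'.
    public static char ViaAscii(int value) =>
        Encoding.ASCII.GetString(new[] { (byte)value })[0];

    // Code page 1252 (Western European), as used in the post.
    public static char ViaCp1252(int value) =>
        Encoding.GetEncoding(1252).GetString(new[] { (byte)value })[0];

    // Counts the values in 0..255 where the cast and code page 1252 disagree.
    public static int CountCastMismatches()
    {
        int mismatches = 0;
        for (int i = 0; i <= 255; i++)
        {
            if (ViaCast(i) != ViaCp1252(i)) mismatches++;
        }
        return mismatches;
    }
}
```

For 137, `ViaCp1252` yields U+2030 (the per mille sign), `ViaCast` yields the control character U+0089, and `ViaAscii` yields '?'. Any disagreement between the cast and code page 1252 is confined to 128-159, since 1252 matches ASCII below 128 and Latin-1 from 160 upward.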

Now if you're doing simple stuff, that may be OK, but if you're writing
components in one language and expect them to communicate with components
written in the other, you're going to have a real problem. Case in
point: you've got an older VB6 object (or VB.NET object for that
matter) that uses its own encryption algorithm, and it must communicate
with a C# object that must mimic the encryption function. If you don't
use the proper implementation of the Chr and Asc functions in your C#
component, you'll never be able to decipher the encrypted data from the
VB component.

Here are the true implementations of the Asc and Chr functions:

// Requires: using System; using System.Text;

// Converts a byte value (0-255) to its code page 1252 character.
internal static string Chr(int p_intByte)
{
    if( (p_intByte < 0) || (p_intByte > 255) )
    {
        throw new ArgumentOutOfRangeException("p_intByte", p_intByte,
            "Must be between 0 and 255.");
    }
    byte[] bytBuffer = new byte[]{(byte) p_intByte};
    return Encoding.GetEncoding(1252).GetString(bytBuffer);
}

// Converts a single character to its code page 1252 byte value.
internal static int Asc(string p_strChar)
{
    if( p_strChar.Length != 1 )
    {
        throw new ArgumentOutOfRangeException("p_strChar", p_strChar,
            "Must be a single character.");
    }
    char[] chrBuffer = {Convert.ToChar(p_strChar)};
    byte[] bytBuffer = Encoding.GetEncoding(1252).GetBytes(chrBuffer);
    return (int) bytBuffer[0];
}

I hope this answers the question once and for all and puts an end to
the huge amount of misinformation that exists out there on this
subject.

Darrell Sparti, MCSD
Bikers Against Child Abuse National Webmaster
we*******@bacausa.com
www.bacausa.com
Because No Child Should Live In Fear

Nov 17 '05 #1
Darrell Sparti, MCSD <we*******@bacausa.com> wrote:
> There have been many postings about this subject on this newsgroup.
> Unfortunately, they're incorrect. You can't just cast a value in C#
> and have it work for all ASCII characters. Nor can you use the ASCII
> encoding as some have suggested.

You can for *ASCII* characters. Don't forget that ASCII only extends to
126 or 127 (I can never remember whether 127 is considered to be part
of it or not; it's not particularly important though as it's
unprintable).

> The undocumented truth is, Microsoft uses the Western European encoding
> in these functions.

If you mean the VB.NET functions, it's perfectly well documented, and
it's not the Western European encoding - it's whatever the default
encoding is for the thread.

From the docs for Asc:

<quote>
Asc returns the code point, or character code, for the input character.
This can be 0 through 255 for single-byte character set (SBCS) values
and -32768 through 32767 for double-byte character set (DBCS) values.
The returned value depends on the code page for the current thread,
which is contained in the ANSICodePage property of the TextInfo class.
TextInfo.ANSICodePage can be obtained by specifying
System.Globalization.CultureInfo.CurrentCulture.TextInfo.ANSICodePage.
</quote>

And from the docs for Chr:

<quote>
Chr uses the Encoding class in the System.Text namespace to determine
if the current thread is using a single-byte character set (SBCS) or a
double-byte character set (DBCS). It then takes CharCode as a code
point in the appropriate set. The range can be 0 through 255 for SBCS
characters and -32768 through 65535 for DBCS characters. The returned
character depends on the code page for the current thread, which is
contained in the ANSICodePage property of the TextInfo class.
TextInfo.ANSICodePage can be obtained by specifying
System.Globalization.CultureInfo.CurrentCulture.TextInfo.ANSICodePage.
</quote>
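
The documentation quoted above points at `TextInfo.ANSICodePage`. A C# counterpart built on the same property might look like the sketch below. It is an approximation under stated assumptions, not the VB runtime's implementation: it only handles single-byte code pages, the names `AnsiChars`, `CurrentAnsiCodePage`, `Chr` and `Asc` are hypothetical, and `RegisterProvider` is assumed to be needed because the code targets a current .NET runtime.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helpers that take the code page as an explicit argument.
public static class AnsiChars
{
    static AnsiChars()
    {
        // Assumption: needed on .NET Core / .NET 5+ for Windows code pages.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // ANSI code page of the current thread's culture, as named in the docs.
    public static int CurrentAnsiCodePage() =>
        CultureInfo.CurrentCulture.TextInfo.ANSICodePage;

    // Decodes one byte value in the given single-byte code page.
    public static char Chr(int value, int codePage)
    {
        if (value < 0 || value > 255)
        {
            throw new ArgumentOutOfRangeException(nameof(value), value,
                "Must be between 0 and 255.");
        }
        return Encoding.GetEncoding(codePage).GetString(new[] { (byte)value })[0];
    }

    // Encodes one character in the given code page and returns the first byte.
    // (For a double-byte code page this would drop the second byte.)
    public static int Asc(char c, int codePage)
    {
        byte[] bytes = Encoding.GetEncoding(codePage).GetBytes(new[] { c });
        return bytes[0];
    }
}
```

Calling `AnsiChars.Chr(137, AnsiChars.CurrentAnsiCodePage())` follows the thread's culture, while passing a fixed value such as 1252 pins the result regardless of culture. Which of the two matches a given VB component is something to verify against that component's real output.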

> If you don't believe me, use 137 in your VB Chr
> function then compare the C# output if you just cast it or use the
> ASCII encoding. You'll see they don't match! You can even do a quick
> loop and check all the values from 0 to 255 and you'll see that there
> are many that won't match the VB function's output.

And that's what I'd expect, as ASCII doesn't have any values above 127,
and Unicode 128-159 is not the same as most ANSI code pages for the
same range.

> Now if you're doing simple stuff, that may be OK, but if you're writing
> components in one language and expect them to communicate with components
> written in the other, you're going to have a real problem. Case in
> point: you've got an older VB6 object (or VB.NET object for that
> matter) that uses its own encryption algorithm, and it must communicate
> with a C# object that must mimic the encryption function. If you don't
> use the proper implementation of the Chr and Asc functions in your C#
> component, you'll never be able to decipher the encrypted data from the
> VB component.
>
> Here are the true implementations of the Asc and Chr functions:
> <snip>

Those would be fine if the thread's default code page is 1252, but
otherwise they're not correct.

I've had a bit of an experiment, and unfortunately the behaviour varies
depending on whether you're using .NET 1.1 or .NET 2.0, which doesn't
help matters. For instance, try the following program:

Option Strict On

Imports Microsoft.VisualBasic
Imports System
Imports System.Threading
Imports System.Globalization

Public Class Test

    Shared Sub Main()
        Thread.CurrentThread.CurrentCulture = New CultureInfo(7194)
        Dim x As Char = Chr(240)
        Console.WriteLine(AscW(x))
    End Sub
End Class

Using .NET 1.1, this prints 240. Using .NET 2.0 it prints 1088. I've no
idea what it would do on VB6.

Changing the current culture of the thread makes a difference in *some*
situations but not others, which is plain bizarre.
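
The two numbers reported above can be reproduced in C# by decoding byte 240 under two different code pages. This is only a sketch of one plausible reading, not something the thread confirms: it assumes LCID 7194 selects a culture whose ANSI code page is 1251 (Cyrillic), and it uses code pages 1251 and 1252 directly instead of that culture. `Byte240Demo` and `DecodeToCodePoint` are hypothetical names.

```csharp
using System;
using System.Text;

// Hypothetical demo: the same byte decoded under different code pages.
public static class Byte240Demo
{
    // Returns the UTF-16 code unit that a single byte decodes to
    // in the given single-byte code page.
    public static int DecodeToCodePoint(byte value, int codePage)
    {
        // Assumption: needed on .NET Core / .NET 5+; registering the same
        // provider more than once has no further effect.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        string decoded = Encoding.GetEncoding(codePage).GetString(new[] { value });
        return decoded[0];
    }
}
```

`DecodeToCodePoint(240, 1252)` returns 240 (U+00F0) and `DecodeToCodePoint(240, 1251)` returns 1088 (U+0440). Under the assumption above, the .NET 2.0 output is what a Chr that honours the Cyrillic code page would give, and the .NET 1.1 output is what a Western European (or plain cast) interpretation would give.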

Fortunately, C# is considerably more consistent in these matters. If
you need to interoperate with legacy VB code, I'd strongly suggest you
make sure you know *exactly* what that VB code is going to produce in
terms of actual encodings, including what happens in various cultures.
Once you know that, getting the C# side to work should be easy...

> I hope this answers the question once and for all and puts an end to
> the huge amount of misinformation that exists out there on this
> subject.

Personally I think it just added to the misinformation, I'm afraid...

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 17 '05 #2
This is a perfect example of why the real solution to coding components that
will work in cross-language environments is to not use language-specific
libraries that are, after all, mostly there for backward compatibility
anyway. Use DirectCast() in VB in the same way you'd use C# casting and you
should get the same results. Should, at any rate; you'd have to test to be
sure there isn't some kind of tap dance going on under the hood. You never
know with VB.

It's all a matter of perspective. If you're used to VB6 and think that's
the "right" way that everything should work then you'll use VB.NET
constructs that produce those "correct" results and then rail against C# for
its "incorrect" results.

What is "correct" for mixed language projects and components is that which
uses the framework and CLR without embellishment. What is "correct" for
porting legacy code to .NET -- at least as a first step -- might, arguably,
be to use compatibility functions. But to steer the best course in any
situation, you have to step back from a parochial viewpoint and look at the
bigger picture of how your components and apps will interact with the rest
of the managed world.

--Bob

Nov 17 '05 #4
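
The suggestion in the post above, sticking to framework calls rather than language-specific helpers, might look like the following in C#. This is a minimal sketch with hypothetical names (`FrameworkOnly` and its methods); it illustrates the general idea and is not code from the thread. The first pair of methods involves no code page at all, and the second pair makes the caller name the encoding explicitly.

```csharp
using System;
using System.Text;

// Hypothetical helpers that use only framework types and explicit encodings.
public static class FrameworkOnly
{
    // Numeric value <-> char with no code page involved (UTF-16 code unit).
    // Convert.ToChar throws OverflowException outside 0..65535.
    public static char FromCodePoint(int codePoint) => Convert.ToChar(codePoint);

    public static int ToCodePoint(char c) => c;

    // Text <-> bytes through an encoding chosen explicitly by the caller.
    public static byte[] ToBytes(string text, Encoding encoding) =>
        encoding.GetBytes(text);

    public static string FromBytes(byte[] bytes, Encoding encoding) =>
        encoding.GetString(bytes);
}
```

For example, `FrameworkOnly.FromCodePoint(240)` is the same character in every culture, and `FromBytes(ToBytes(s, Encoding.UTF8), Encoding.UTF8)` round-trips a well-formed string on both sides as long as both agree on the encoding. On the VB side the culture-independent counterparts are ChrW and AscW rather than Chr and Asc.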
Bob Grommes <bo*@bobgrommes.com> wrote:
> This is a perfect example of why the real solution to coding components that
> will work in cross-language environments is to not use language-specific
> libraries that are, after all, mostly there for backward compatibility
> anyway. Use DirectCast() in VB in the same way you'd use C# casting and you
> should get the same results. Should, at any rate; you'd have to test to be
> sure there isn't some kind of tap dance going on under the hood. You never
> know with VB.
>
> It's all a matter of perspective. If you're used to VB6 and think that's
> the "right" way that everything should work then you'll use VB.NET
> constructs that produce those "correct" results and then rail against C# for
> its "incorrect" results.
>
> What is "correct" for mixed language projects and components is that which
> uses the framework and CLR without embellishment. What is "correct" for
> porting legacy code to .NET -- at least as a first step -- might, arguably,
> be to use compatibility functions. But to steer the best course in any
> situation, you have to step back from a parochial viewpoint and look at the
> bigger picture of how your components and apps will interact with the rest
> of the managed world.


I think it can only be correct to use the compatibility functions if
you're absolutely sure about what they do. Unfortunately, having
experimented with the 2.0 and 1.1 implementations of Chr and Asc, it's
far from obvious to me exactly what they do when the current thread's
culture changes. Sometimes they seem "sticky" (taking the encoding of
the thread which first calls them) and sometimes they don't - and as
I've said, the results seem to vary depending on the version of the
framework used. (This could be an issue with the beta of 2.0, of
course.)

Hopefully the VB6 semantics are better defined, so that anyone wanting
to interoperate with data produced by VB6 can do so in a precise
manner.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 17 '05 #5
