C# Chr and Asc Function Equivalents - The Undocumented Truth!

There have been many postings about this subject on this newsgroup.
Unfortunately, they're incorrect. You can't just cast a value in C#
and have it work for all ASCII characters. Nor can you use the ASCII
encoding as some have suggested.

The undocumented truth is, Microsoft uses the Western European encoding
in these functions. If you don't believe me, use 137 in your VB Chr
function then compare the C# output if you just cast it or use the
ASCII encoding. You'll see they don't match! You can even do a quick
loop and check all the values from 0 to 255 and you'll see that there
are many that won't match the VB function's output.
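
The comparison described above can be sketched in C# as follows. This is a minimal illustration, not code from the original post: the class name `CodePageComparison` and its methods are hypothetical, and the `Encoding.RegisterProvider` call assumes .NET Core or .NET 5+ (on .NET Framework that line can be dropped, since code page 1252 is available there by default).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CodePageComparison
{
    // Looks up a legacy code page. The provider registration is only
    // needed on .NET Core / .NET 5+ (assumption about the runtime).
    public static Encoding GetCodePage(int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage);
    }

    // Returns every value 0..255 whose decoding under the given code page
    // differs from a plain (char) cast of the same number.
    public static List<int> MismatchesAgainstCast(int codePage)
    {
        Encoding encoding = GetCodePage(codePage);
        var mismatches = new List<int>();
        for (int value = 0; value <= 255; value++)
        {
            string decoded = encoding.GetString(new[] { (byte)value });
            if (decoded.Length != 1 || decoded[0] != (char)value)
            {
                mismatches.Add(value);
            }
        }
        return mismatches;
    }

    public static void Main()
    {
        byte[] single = { 137 };
        Console.WriteLine((int)(char)137);                              // 137  (U+0089, a control character)
        Console.WriteLine((int)Encoding.ASCII.GetString(single)[0]);    // 63   ('?', ASCII has no value 137)
        Console.WriteLine((int)GetCodePage(1252).GetString(single)[0]); // 8240 (U+2030, per mille sign)
        Console.WriteLine(string.Join(",", MismatchesAgainstCast(1252)));
    }
}
```

The three results for 137 differ because a cast treats the number as a Unicode code point, the ASCII encoding has nothing above 127, and code page 1252 assigns its own characters to part of the 128-159 range.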

Now if you're doing simple stuff, that may be OK, but if you're writing
components in one language and expect them to communicate with components
written in the other, you're going to have a real problem. Case in
point: you've got an older VB6 object (or VB.NET object for that
matter) that uses its own encryption algorithm and it must communicate
with a C# object that must mimic the encryption function. If you don't
use the proper implementation of the Chr and Asc functions in your C#
component, you'll never be able to decipher the encrypted data from the
VB component.

Here are the true implementations of the Asc and Chr functions:

// Requires: using System; using System.Text;

// Decodes one byte value (0-255) using code page 1252 (Western European).
internal static string Chr(int p_intByte)
{
    if( (p_intByte < 0) || (p_intByte > 255) )
    {
        throw new ArgumentOutOfRangeException("p_intByte", p_intByte,
            "Must be between 0 and 255.");
    }
    byte[] bytBuffer = new byte[]{(byte) p_intByte};
    return Encoding.GetEncoding(1252).GetString(bytBuffer);
}

// Encodes a single character to its code page 1252 byte value.
internal static int Asc(string p_strChar)
{
    if( p_strChar.Length != 1 )
    {
        throw new ArgumentOutOfRangeException("p_strChar", p_strChar,
            "Must be a single character.");
    }
    char[] chrBuffer = {Convert.ToChar(p_strChar)};
    byte[] bytBuffer = Encoding.GetEncoding(1252).GetBytes(chrBuffer);
    return (int) bytBuffer[0];
}

I hope this answers the question once and for all and puts an end to
the huge amount of misinformation that exists out there on this
subject.

Darrell Sparti, MCSD
Bikers Against Child Abuse National Webmaster
we*******@bacausa.com
www.bacausa.com
Because No Child Should Live In Fear

Nov 17 '05 #1
Darrell Sparti, MCSD <we*******@bacausa.com> wrote:
> There have been many postings about this subject on this newsgroup.
> Unfortunately, they're incorrect. You can't just cast a value in C#
> and have it work for all ASCII characters. Nor can you use the ASCII
> encoding as some have suggested.

You can for *ASCII* characters. Don't forget that ASCII only extends to
126 or 127 (I can never remember whether 127 is considered to be part
of it or not; it's not particularly important though as it's
unprintable).

> The undocumented truth is, Microsoft uses the Western European encoding
> in these functions.

If you mean the VB.NET functions, it's perfectly well documented, and
it's not the Western European encoding - it's whatever the default
encoding is for the thread.

From the docs for Asc:

<quote>
Asc returns the code point, or character code, for the input character.
This can be 0 through 255 for single-byte character set (SBCS) values
and -32768 through 32767 for double-byte character set (DBCS) values.
The returned value depends on the code page for the current thread,
which is contained in the ANSICodePage property of the TextInfo class.
TextInfo.ANSICodePage can be obtained by specifying
System.Globalization.CultureInfo.CurrentCulture.TextInfo.ANSICodePage.
</quote>

And from the docs for Chr:

<quote>
Chr uses the Encoding class in the System.Text namespace to determine
if the current thread is using a single-byte character set (SBCS) or a
double-byte character set (DBCS). It then takes CharCode as a code
point in the appropriate set. The range can be 0 through 255 for SBCS
characters and -32768 through 65535 for DBCS characters. The returned
character depends on the code page for the current thread, which is
contained in the ANSICodePage property of the TextInfo class.
TextInfo.ANSICodePage can be obtained by specifying
System.Globalization.CultureInfo.CurrentCulture.TextInfo.ANSICodePage.
</quote>
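
The property named in both documentation excerpts can also be read from C#. The sketch below is only an illustration: the class `AnsiCodePageProbe` and its method names are made up for this example, and the values printed depend on the machine's culture settings and runtime, so no expected output is given.

```csharp
using System;
using System.Globalization;

public static class AnsiCodePageProbe
{
    // ANSI code page associated with the calling thread's current culture,
    // i.e. the value the quoted documentation says Asc and Chr depend on.
    public static int ForCurrentThread()
    {
        return CultureInfo.CurrentCulture.TextInfo.ANSICodePage;
    }

    // Same lookup for an explicitly named culture.
    public static int ForCulture(string cultureName)
    {
        return CultureInfo.GetCultureInfo(cultureName).TextInfo.ANSICodePage;
    }

    public static void Main()
    {
        Console.WriteLine(ForCurrentThread());
        Console.WriteLine(ForCulture("en-US"));
        Console.WriteLine(ForCulture("ru-RU"));
    }
}
```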
> If you don't believe me, use 137 in your VB Chr
> function then compare the C# output if you just cast it or use the
> ASCII encoding. You'll see they don't match! You can even do a quick
> loop and check all the values from 0 to 255 and you'll see that there
> are many that won't match the VB function's output.

And that's what I'd expect, as ASCII doesn't have any values above 127,
and Unicode 128-159 is not the same as most ANSI code pages for the
same range.

> Now if you're doing simple stuff, that may be OK, but if you're writing
> components in one language and expect them to communicate with components
> written in the other, you're going to have a real problem. Case in
> point: you've got an older VB6 object (or VB.NET object for that
> matter) that uses its own encryption algorithm and it must communicate
> with a C# object that must mimic the encryption function. If you don't
> use the proper implementation of the Chr and Asc functions in your C#
> component, you'll never be able to decipher the encrypted data from the
> VB component.
>
> Here are the true implementations of the Asc and Chr functions:
> <snip>

Those would be fine if the thread's default code page is 1252, but
otherwise it's not correct.
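
One way to act on this objection is to pass the code page in rather than hard-coding 1252. The following is a sketch under stated assumptions, not a drop-in replacement for the VB functions: `VbCompat` and its members are hypothetical names, only single-byte code pages are handled (the DBCS ranges from the documentation are ignored), and `Encoding.RegisterProvider` is assumed to be needed because the code targets .NET Core or .NET 5+.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class VbCompat
{
    // Resolves a legacy code page. The registration line is required on
    // .NET Core / .NET 5+ and can be dropped on .NET Framework.
    private static Encoding Resolve(int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage);
    }

    // Decodes one byte value using the given single-byte code page.
    public static char Chr(int value, int codePage)
    {
        if (value < 0 || value > 255)
        {
            throw new ArgumentOutOfRangeException(nameof(value), value,
                "Must be between 0 and 255.");
        }
        return Resolve(codePage).GetString(new[] { (byte)value })[0];
    }

    // Encodes one character to its byte value in the given code page.
    // Characters the code page cannot represent come back as the
    // encoding's replacement byte, so callers may want to check for that.
    public static int Asc(char character, int codePage)
    {
        return Resolve(codePage).GetBytes(new[] { character })[0];
    }

    // Code page of the calling thread's culture, as in the quoted docs.
    public static int ThreadCodePage() =>
        CultureInfo.CurrentCulture.TextInfo.ANSICodePage;

    public static void Main()
    {
        Console.WriteLine((int)Chr(137, 1252));  // 8240 (U+2030)
        Console.WriteLine(Asc('\u2030', 1252));  // 137
        Console.WriteLine(ThreadCodePage());     // machine-dependent
    }
}
```

Passing the code page explicitly keeps the result independent of whichever thread happens to run the code, which matters when the data was produced on a machine with a different culture.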

I've had a bit of an experiment, and unfortunately the behaviour varies
depending on whether you're using .NET 1.1 or .NET 2.0, which doesn't
help matters. For instance, try the following program:

Option Strict On

Imports Microsoft.VisualBasic
Imports System
Imports System.Threading
Imports System.Globalization

Public Class Test

    Shared Sub Main()
        Thread.CurrentThread.CurrentCulture = New CultureInfo(7194)
        Dim x As Char = Chr(240)
        Console.WriteLine(AscW(x))
    End Sub
End Class

Using .NET 1.1, this prints 240. Using .NET 2.0 it prints 1088. I've no
idea what it would do on VB6.

Changing the current culture of the thread makes a difference in *some*
situations but not others, which is plain bizarre.
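
For comparison, the two numbers reported above are exactly what byte 240 decodes to under code pages 1252 and 1251 respectively. The C# sketch below only demonstrates that arithmetic; it does not reproduce the VB runtime's behaviour, and the link to the experiment is an inference (culture 7194 appears to be a Cyrillic-script culture, for which 1251 would be the ANSI code page). `Byte240Demo` is a hypothetical name, and the provider registration assumes .NET Core or .NET 5+.

```csharp
using System;
using System.Text;

public static class Byte240Demo
{
    // Decodes a single byte under the given code page and returns the
    // resulting UTF-16 code unit as a number (what AscW would report).
    public static int DecodeToCodePoint(byte value, int codePage)
    {
        // Needed on .NET Core / .NET 5+ to make legacy code pages available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(new[] { value })[0];
    }

    public static void Main()
    {
        Console.WriteLine(DecodeToCodePoint(240, 1252)); // 240  (U+00F0)
        Console.WriteLine(DecodeToCodePoint(240, 1251)); // 1088 (U+0440)
    }
}
```

That is consistent with, though it does not prove, one framework version decoding with the Western European code page and the other honouring the thread's culture.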

Fortunately, C# is considerably more consistent in these matters. If
you need to interoperate with legacy VB code, I'd strongly suggest you
make sure you know *exactly* what that VB code is going to produce in
terms of actual encodings, including what happens in various cultures.
Once you know that, getting the C# side to work should be easy...

> I hope this answers the question once and for all and puts an end to
> the huge amount of misinformation that exists out there on this
> subject.

Personally I think it just added to the misinformation, I'm afraid...

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 17 '05 #2
This is a perfect example of why the real solution to coding components that
will work in cross-language environments is to not use language-specific
libraries that are, after all, mostly there for backward compatibility
anyway. Use DirectCast() in VB in the same way you'd use C# casting and you
should get the same results. Should, at any rate; you'd have to test to be
sure there isn't some kind of tap dance going on under the hood. You never
know with VB.

It's all a matter of perspective. If you're used to VB6 and think that's
the "right" way that everything should work then you'll use VB.NET
constructs that produce those "correct" results and then rail against C# for
its "incorrect" results.

What is "correct" for mixed language projects and components is that which
uses the framework and CLR without embellishment. What is "correct" for
porting legacy code to .NET -- at least as a first step -- might, arguably,
be to use compatibility functions. But to steer the best course in any
situation, you have to step back from a parochial viewpoint and look at the
bigger picture of how your components and apps will interact with the rest
of the managed world.

--Bob

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
> Darrell Sparti, MCSD <we*******@bacausa.com> wrote:
>
> I've had a bit of an experiment, and unfortunately the behaviour varies
> depending on whether you're using .NET 1.1 or .NET 2.0, which doesn't
> help matters. For instance, try the following program:
>
> Option Strict On
>
> Imports Microsoft.VisualBasic
> Imports System
> Imports System.Threading
> Imports System.Globalization
>
> Public Class Test
>
>     Shared Sub Main()
>         Thread.CurrentThread.CurrentCulture = New CultureInfo(7194)
>         Dim x As Char = Chr(240)
>         Console.WriteLine(AscW(x))
>     End Sub
> End Class
>
> Using .NET 1.1, this prints 240. Using .NET 2.0 it prints 1088. I've no
> idea what it would do on VB6.
>
> Changing the current culture of the thread makes a difference in *some*
> situations but not others, which is plain bizarre.

Nov 17 '05 #4
Bob Grommes <bo*@bobgrommes.com> wrote:
> This is a perfect example of why the real solution to coding components that
> will work in cross-language environments is to not use language-specific
> libraries that are, after all, mostly there for backward compatibility
> anyway. Use DirectCast() in VB in the same way you'd use C# casting and you
> should get the same results. Should, at any rate; you'd have to test to be
> sure there isn't some kind of tap dance going on under the hood. You never
> know with VB.
>
> It's all a matter of perspective. If you're used to VB6 and think that's
> the "right" way that everything should work then you'll use VB.NET
> constructs that produce those "correct" results and then rail against C# for
> its "incorrect" results.
>
> What is "correct" for mixed language projects and components is that which
> uses the framework and CLR without embellishment. What is "correct" for
> porting legacy code to .NET -- at least as a first step -- might, arguably,
> be to use compatibility functions. But to steer the best course in any
> situation, you have to step back from a parochial viewpoint and look at the
> bigger picture of how your components and apps will interact with the rest
> of the managed world.

I think it can only be correct to use the compatibility functions if
you're absolutely sure about what they do. Unfortunately, having
experimented with the 2.0 and 1.1 implementations of Chr and Asc, it's
far from obvious to me exactly what they do when the current thread's
culture changes. Sometimes they seem "sticky" (taking the encoding of
the thread which first calls them) and sometimes they don't - and as
I've said, the results seem to vary depending on the version of the
framework used. (This could be an issue with the beta of 2.0, of
course.)

Hopefully the VB6 semantics are better defined, so that anyone wanting
to interoperate with data produced by VB6 can do so in a precise
manner.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 17 '05 #5
