Unicode characters and System.Globalization.CultureInfo


I ran into a situation at work involving Unicode character encodings
and .NET cultures that left me a bit confused.

I was trying to instantiate a CultureInfo object from a locale
identifier for the Hmong, a people of southern China. It's usually
represented as hm-HMN. I read the NativeName property to get
the name of the culture and display it in my application. When I do:

CultureInfo ci = new CultureInfo("hm-HMN");

ci.NativeName prints something that looks like this:

H'mong

However, between the letters 'o' and 'n' I see what the Unicode
Consortium calls the replacement character
(http://en.wikipedia.org/wiki/Replacement_character), which is
basically a diamond with a question mark inside it. According to the
Replacement Character section of that Wikipedia article, the character
appears whenever an application cannot decode the original byte stream
correctly; it substitutes U+FFFD (0xFFFD) for whatever it could not
decode.
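
To make that description concrete, here is a minimal sketch of how a decoder ends up emitting U+FFFD. It is written in Python only because it is quick to run, and the byte string is made up: the 0xFF byte is a hypothetical stand-in for "something the decoder cannot interpret", not the bytes Windows or CultureInfo actually produces.

```python
# Hypothetical input: 0xFF is never valid in UTF-8, so the decoder cannot interpret it.
raw = b"H'mo\xffng"

# errors="replace" tells the decoder to emit U+FFFD for each undecodable byte
decoded = raw.decode("utf-8", errors="replace")

# List the code point of every character so the substitution is visible
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(code_points)
```

The fifth entry in the printed list is U+FFFD, sitting between 'o' and 'n', which matches the shape of what I am seeing in NativeName.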

What I would like to know is what exactly is causing this problem?

1) Does the native Windows API, or whatever is called when I
instantiate a new CultureInfo object (I haven't had a chance to look
at it in Reflector yet), encode that character differently, so that
.NET cannot display it because it is trying to decode it using UTF-16
rules?

2) Or is it that the character cannot be displayed because the
default code page is set to 1252?
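
As a rough way to picture the second possibility, the sketch below pushes a string through code page 1252 and through two Unicode encodings. Again this is Python rather than .NET, and the name is an assumption: U+0303 (combining tilde) is only a stand-in for a character that Windows-1252 has no slot for, since I do not know which character the real NativeName contains.

```python
# Hypothetical name: U+0303 (combining tilde) stands in for a character
# that code page 1252 cannot represent.
name = "H'mo\u0303ng"

# Encoding to code page 1252 with errors="replace" turns that character into "?"
as_cp1252 = name.encode("cp1252", errors="replace")

# The same string in two Unicode encodings: nothing is lost, only the bytes differ
as_utf8 = name.encode("utf-8")
as_utf16 = name.encode("utf-16-le")

print(as_cp1252)
print(as_utf8.decode("utf-8") == name)
print(as_utf16.decode("utf-16-le") == name)
```

In this sketch the lossy code page conversion leaves a plain "?" rather than U+FFFD, while both Unicode encodings round-trip the string. I have not confirmed that .NET behaves the same way, but if it does, that difference might help tell the two explanations apart.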

Can anyone offer some insight into how to get the characters to
display correctly, and also clue me in on the differences between
encodings and code pages?

thanks!

Oct 15 '08 #1