Sorry for not making myself clear. I am interested in the impact that
surrogate pairs (char/char pairs) have. These are used to extend Unicode
beyond 16 bits and are, in effect, 32-bit chars.
What I need to know is this: say I have a string of Chinese text with
3 graphical symbols, stored in the string as a char, a surrogate pair
(char and char), and a char. In that situation I am using 4 .NET Chars to
represent 3 graphical characters.
So would Len(myString) return 3 or 4?
Would Left(myString, 2) give me the first 2 graphical characters (3 Chars), or
a string consisting of 1 graphical char plus a Char that is the first half of
the second (surrogate) character?
By the way, I am not a C person :-) I have been using Basic since 1976!
"Mattias Sjögren" wrote:
If a string contains surrogate chars (i.e. Unicode characters that consist
of more than 1 Char), do functions that take an index or a length into
the string, e.g. Mid and Len, work correctly?
What is correct to you? They work the same way the String class works,
i.e. Len (just like String.Length) returns the number of 16-bit Chars
in the string. If you want to treat a surrogate pair as a single
element you should have a look at the StringInfo class in .NET 2.0.
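
To make the difference concrete, here is a minimal sketch for the 3-symbol, 4-Char case asked about. The module name, the variable names and the sample code points (U+4E2D, U+20000, U+6587) are made up for illustration; U+20000 was picked only because it lies outside the 16-bit range and so is stored as a surrogate pair. It assumes .NET 2.0 or later for Char.ConvertFromUtf32 and the StringInfo members used.

```vbnet
Imports System
Imports System.Globalization
Imports Microsoft.VisualBasic

Module SurrogateDemo
    Sub Main()
        ' Three graphical characters stored as four 16-bit Chars:
        ' one Char, one surrogate pair (two Chars), one Char.
        Dim myString As String = ChrW(&H4E2D) & Char.ConvertFromUtf32(&H20000) & ChrW(&H6587)

        ' Len counts 16-bit Chars, exactly like String.Length.
        Console.WriteLine(Len(myString))                            ' 4

        ' Left also counts Chars, so it cuts the surrogate pair in half:
        ' the second Char of the result is a lone high surrogate.
        Dim firstTwoChars As String = Left(myString, 2)
        Console.WriteLine(Char.IsHighSurrogate(firstTwoChars, 1))   ' True

        ' StringInfo works in text elements and keeps the pair together.
        Dim info As New StringInfo(myString)
        Console.WriteLine(info.LengthInTextElements)                ' 3

        ' The first two text elements occupy three Chars.
        Dim firstTwoElements As String = info.SubstringByTextElements(0, 2)
        Console.WriteLine(firstTwoElements.Length)                  ' 3
    End Sub
End Module
```

One caveat: a text element is not only a surrogate pair. StringInfo also groups a base character with any combining marks that follow it, so the counts can differ from a plain code point count when the text contains combining sequences.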
Mattias
--
Mattias Sjögren [C# MVP] mattias @ mvps.org
http://www.msjogren.net/dotnet/ | http://www.dotnetinterop.com
Please reply only to the newsgroup.