Bill,
I would extend the pattern to match the square-bracketed sequences as well,
then modify the MatchEvaluator function to handle whichever escape sequence
was matched (the first or the second)...
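
A rough sketch of that idea, not a definitive implementation. It assumes the
second escape form is a named HTML entity such as &ecirc; (the thread does not
show the raw stored text, so that is a guess), and it also accepts the
hexadecimal form &#x1EE9; in case the numbers turn out to be hex. The names
EntityDecoder, DecodeEntities, DecodeMatch and LookupNamed are made up for
this example, and the named-entity table only covers the two characters in
the sample string.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module EntityDecoder

    ' One pattern with three alternatives:
    '   dec  -> &#7913;    (decimal character reference)
    '   hex  -> &#x1EE9;   (hexadecimal character reference)
    '   name -> &ecirc;    (named entity; an assumption, see above)
    Private Const Pattern As String = "&(?:#(?<dec>[0-9]+)|#[xX](?<hex>[0-9A-Fa-f]+)|(?<name>[A-Za-z]+));"

    Private ReadOnly Parser As New Regex(Pattern, RegexOptions.Compiled)

    ' Hypothetical lookup for named entities; extend it as needed.
    Private Function LookupNamed(ByVal name As String) As String
        Select Case name
            Case "ecirc" : Return ChrW(234)   ' U+00EA
            Case "igrave" : Return ChrW(236)  ' U+00EC
            Case Else : Return Nothing
        End Select
    End Function

    ' Called once per match; decides which escape form was found.
    ' Note: ChrW only accepts code points up to 65535.
    Private Function DecodeMatch(ByVal m As Match) As String
        If m.Groups("dec").Success Then
            Return ChrW(Integer.Parse(m.Groups("dec").Value, CultureInfo.InvariantCulture))
        End If
        If m.Groups("hex").Success Then
            Return ChrW(Integer.Parse(m.Groups("hex").Value, NumberStyles.HexNumber, CultureInfo.InvariantCulture))
        End If
        Dim named As String = LookupNamed(m.Groups("name").Value)
        If named IsNot Nothing Then
            Return named
        End If
        ' Unknown name: leave the text untouched.
        Return m.Value
    End Function

    Public Function DecodeEntities(ByVal input As String) As String
        Return Parser.Replace(input, AddressOf DecodeMatch)
    End Function

    Sub Main()
        Console.WriteLine(DecodeEntities("Nghi&ecirc;n C&#7913;u - Ph&ecirc; B&igrave;nh"))
    End Sub

End Module
```

If the stored text really is plain HTML entity encoding, the framework's own
System.Web.HttpUtility.HtmlDecode may already do all of this, so it is worth
trying that before maintaining a table by hand.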
--
Hope this helps
Jay B. Harlow [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley -
http://www.tsbradley.net
"Bill Nguyen" <bi*****************@jaco.com> wrote in message
news:u%****************@TK2MSFTNGP04.phx.gbl...
| Jay;
|
| If you look at the string again, you'll see that it's not only the 4-digit
| group that needs to be translated but other characters as well (those in
| square brackets below):
|
| Nghi[ê]n Cứu - Ph[ê ]B[ì]nh
|
| I'm using phpWebsite and mySQL database from an ISP (IpowerWeb.com).
| Input text is Unicode when a webpage is created/updated.
| The text string above is what gets stored in the mySQL table instead.
| I guess I have to convert the text back to Unicode to view/edit it, then put
| it back. mySQL probably converts the text to the above format by itself.
|
| Any suggestions on how to accomplish this?
|
| Thanks again
|
| Bill
|
|
| "Jay B. Harlow [MVP - Outlook]" <Ja************@tsbradley.net> wrote in
| message news:OE**************@TK2MSFTNGP04.phx.gbl...
| Bill,
| You could use a RegEx to convert the char escape codes to chars.
| >
| You could implement what Herfried suggested with something like:
| >
| Const input As String = "Nghiên C&#7913;u - Phê Bình"
| >
| Const pattern As String = "\&\#\d{4}\;"
| Static parser As New Regex(pattern, RegexOptions.Compiled)
| Dim output As String = parser.Replace(input, AddressOf MatchEvaluator)
| >
| Private Function MatchEvaluator(ByVal input As Match) As String
| Dim value As String = input.Value.Substring(2, 4)
| Return ChrW(CInt(value))
| End Function
| >
| >
| Does the 7913 represent a 4-digit decimal or hexadecimal number? You may
| need to change the call to CInt accordingly...
| >
| --
| Hope this helps
| Jay B. Harlow [MVP - Outlook]
| .NET Application Architect, Enthusiast, & Evangelist
| T.S. Bradley -
| http://www.tsbradley.net
| >
| >
| "Bill nguyen" <bi*****************@jaco.com> wrote in message
| news:eU**************@TK2MSFTNGP03.phx.gbl...
| | Herfried;
| |
| | I don't know if this will work, but I need help to try it:
| | here's sample of the text string
| |
| | "Nghiên Cứu - Phê Bình"
| |
| | I need to read each byte in the text string, then use chrW to convert it
| | to Unicode.
| |
| | I tried chrW(ascW(textString)) but it only converts the 1st letter.
| |
| | Is there a function to read all bytes in the text string in 1 pass?
| | Thanks
| |
| | Bill
| |
| |
| |
| | "Herfried K. Wagner [MVP]" <hi***************@gmx.at> wrote in message
| | news:ed**************@TK2MSFTNGP03.phx.gbl...
| | "Bill Nguyen" <bi*****************@jaco.com> wrote:
| | >I'm getting data from a mySQL database (default char set = UTF-8).
| | >I need to display data in Unicode but got only mongolian characters like
| | >this: Ph&#7841;m Th&#7883; Ng&#7885;c
| | >>
| | >I changed the textbox font to Arial Unicode MS, but it's still not
| | >working.
| | >>
| | >Do I need to convert the data stored in the mySQL database before
| | >displaying it?
| | >
| | Windows Forms controls cannot directly convert character entities like
| | '&#7841;' to the appropriate character. You may want to replace the
| | string "&#<number>;" with the value of 'ChrW(<number>)' or simply not
| | encode the characters in the database that way.
| | >
| | --
| | M S Herfried K. Wagner
| | M V P <URL:http://dotnet.mvps.org/>
| | V B <URL:http://classicvb.org/petition/>
| |
| |
| >
| >
|
|