textbox foreign character encoding issue

P: 1
I'm getting some Chinese-language data from SQL Server in some encoded format, maybe Unicode? Each character starts with an ampersand '&', then a pound sign '#', then a five-digit integer, then a semicolon. The forum translates it when I post it exactly, so I'm describing it here instead. With some spaces inserted, a single character looks like this (remove the spaces):

& # 35843 ;

Anyway, THIS is the big problem:

1. In an ASP.NET 2.0 web page, if I programmatically move the data into an asp:TextBox or an HTML input text box, I get the data exactly as it looks in SQL Server, in the ampersand, pound, number, semicolon format. But I need this to show the Chinese characters.

2. If I move this data into the InnerHtml property of a div tag, I see Chinese characters as expected, for example:

articleArea.InnerHtml = article.NativeHeadline; //this works great!!!

4. Interestingly, if I cut and paste from the div in step 2 into the text box, the text box shows the Chinese characters just fine. This does not solve the problem, of course; it's just an interesting note.

5. I also tried changing the font to Arial Unicode MS, and that didn't work. I also tried changing the encoding in the page directive to UTF-8: no dice.

6. Another interesting note, a gridview column will display the data just fine too!

7. Is there some conversion I can do server side in C# to get it to work? I know, for example, that if the data is in this format:

char[] chinese = {'\u6B22','\u8FCE','\u4F7F','\u7528','\u0020'};

and I move this to the text box, the data also shows as expected. But how do I get the ampersand, pound, number, semicolon format translated into the above?

Anyone fix this problem already?
Sep 29 '08 #1
3 Replies

Curtis Rutland
Expert 2.5K+
P: 3,256
Just curious, is your database table's column a varchar or an nvarchar?
Sep 29 '08 #2

Expert 5K+
P: 7,872
The &#<number>; format is an HTML numeric character reference: the number is the Unicode code point of the character.
Sep 29 '08 #3

Expert 100+
P: 190
Just to expand a bit on Plater's correct answer: characters outside the basic ASCII range are identified by Unicode code points and stored using an encoding such as UTF-16 or UTF-8, which takes two or three bytes for each of these Chinese characters.
When you want to tell a browser to display such a character, you can use a "numeric character reference", which takes the form "&" + "#" + the decimal Unicode code point of the character + ";"

Someone already translated your Chinese characters into numeric character references and stored them that way in your database.
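
As a rough illustration of how text ends up in that stored form, here is a minimal JavaScript sketch. The function name toNumericReferences is made up for this example, and it only assumes that every non-ASCII character was replaced by "&#" + decimal code point + ";"; whatever tool actually produced your data may have behaved differently.

```javascript
// Hypothetical helper (not from any library): encode each non-ASCII
// character as a decimal numeric character reference.
function toNumericReferences(text) {
  let out = "";
  // for...of walks the string by code point, not by UTF-16 code unit.
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    out += codePoint > 127 ? "&#" + codePoint + ";" : ch;
  }
  return out;
}

// U+8C03 is code point 35843 in decimal.
console.log(toNumericReferences("\u8C03")); // prints &#35843;
```

Plain ASCII text passes through unchanged, so only the Chinese characters are turned into references.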

When you assign the value of the database field to a div's InnerHtml, the content is treated as HTML markup, so the browser resolves each reference to the character it stands for.

When you programmatically fill in the value of a text input, the value is treated as literal text rather than markup, so the reference is shown exactly as stored instead of being resolved.

You could, for example, programmatically load the characters into a hidden input and then copy the value to the text box, as this very simple example shows (remove all the spaces in the numeric references first):

<html>
<body>
<div>& # 35843;</div>
<input type="text" value="& # 35843;" id="tb1"/>
<input type="text" value="" id="tb2"/>
<input type="hidden" value="& # 35843;" id="stor"/>
<script type="text/javascript">
  // Copy the hidden input's value into the second text box.
  document.getElementById("tb2").value = document.getElementById("stor").value;
</script>
</body>
</html>
The div displays the Chinese character.
The first input box displays a literal copy of the numeric reference.
The second input box displays the Chinese character.
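
If you would rather convert the string yourself before putting it in the text box, something along these lines should work. This is only a sketch: decodeNumericReferences is a name invented for the example, and it handles only numeric references (decimal and hexadecimal), not named entities such as &amp;amp;. On the server side, ASP.NET's HttpUtility.HtmlDecode performs the same kind of conversion in C#, which may be the simpler route for the original question.

```javascript
// Hypothetical helper (not from any library): replace decimal (&#35843;)
// and hexadecimal (&#x8C03;) numeric character references with the
// characters they stand for.
function decodeNumericReferences(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave the reference untouched if the number is beyond the Unicode range.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// The four references below are the decimal forms of
// '\u6B22', '\u8FCE', '\u4F7F', '\u7528'.
const decoded = decodeNumericReferences("&#27426;&#36814;&#20351;&#29992;");
console.log(decoded === "\u6B22\u8FCE\u4F7F\u7528"); // prints true
```

In a page, the result could then be assigned directly, for example document.getElementById("tb2").value = decodeNumericReferences(storedValue), where storedValue stands for the string read from the database.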
Sep 29 '08 #4
