Server.HTMLEncode with UTF-8

While working on some multilingual code I found a rather strange thing
happening with Server.HTMLEncode.

While loading different languages I change the CodePage and Charset in
ASP to reflect the language. This all works fine. However, when I tried
to use Charset UTF-8 with CodePage 65001 everywhere, I found that
HTMLEncode always translates all non-ASCII characters to &#xxxx;
sequences.

Example:

Response.Charset = "shift_jis"
Response.Codepage = 932
Response.Write "Some Japanese Text"
Response.Write Server.HTMLEncode("Some Japanese Text")

Both Write actions output a character string in Shift_JIS, no UTF-8,
no &#xxxx sequences. Just fine and as it should be.

But when I do this:

Response.Charset = "utf-8"
Response.Codepage = 65001
Response.Write "Some Japanese Text"
Response.Write Server.HTMLEncode("Some Japanese Text")

The first Write outputs a UTF-8 character string but the second Write
outputs a string encoded into &#xxxx; sequences.

Why is that?

Regards,
Marco
Sep 15 '06 #1

"Marco Miltenburg" <mi**@xs4all.nlwrote in message
news:4h********************************@4ax.com...
While working on some multilingual code I found a rather strange thing
happening with Server.HTMLEncode.

While loading different languages I change the Codepage and Charset in
ASP to reflect the language. This all works fine. However when I tried
to use Charset UTF-8 with Codepage 65001 everywhere I found that
HTMLEncode always translates all UTF-8 characters to &#xxxx.

Example:

Response.Charset = "shift_jis"
Response.Codepage = 932
Response.Write "Some Japanese Text"
Response.Write Server.HTMLEncode("Some Japanese Text")

Both Write actions output a character string in Shift_JIS, no UTF-8,
no &#xxxx sequences. Just fine and as it should be.

But when I do this:

Response.Charset = "utf-8"
Response.Codepage = 65001
Response.Write "Some Japanese Text"
Response.Write Server.HTMLEncode("Some Japanese Text")

The first write outputs an UTF-8 character string but the second Write
outputs a string encoded into &#xxxx sequences.

Why is that ???
Whilst all string handling in script is done in Unicode, the script itself
can't be encoded in Unicode. It is possible to run a script encoded as UTF-8
simply because all keywords, operators, etc. are within the ASCII character
set and are therefore identical when encoded as UTF-8. However, string
literals in the code will be treated as single-byte ANSI characters despite
having been encoded as UTF-8.
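
One way to check whether the string literals are the culprit is to take them out of the picture. The following is only a sketch, not something from the original posts: it repeats the calls from the question but builds the test string from code points with ChrW, so the .asp file itself contains nothing but ASCII. The three code points chosen are an assumption made for illustration.

```vbscript
' Sketch only: same calls as in the question, but the test string is
' built from code points so the source file stays pure ASCII.
Response.Charset = "utf-8"
Response.CodePage = 65001

Dim s
' U+65E5 U+672C U+8A9E (the word for "Japanese language").
' The trailing & makes each hex literal a Long, so values above
' &H7FFF are not read as negative 16-bit integers.
s = ChrW(&H65E5&) & ChrW(&H672C&) & ChrW(&H8A9E&)

Response.Write s
Response.Write "<br>"
Response.Write Server.HTMLEncode(s)
```

If the two lines still differ when the string is built this way, the literals in the file are not the whole story and the HTMLEncode behaviour itself needs a closer look.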

In the real world, where the string being encoded by HTMLEncode has been
retrieved from, say, a database, this problem wouldn't occur. If you need
string literals in multi-language output you will need to store them
somewhere else.
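
As a rough illustration of "somewhere else", the strings could live in a UTF-8 text file that is read through ADODB.Stream. This is a minimal sketch under assumptions: the helper ReadUtf8File and the file name strings_ja.txt are hypothetical and do not come from the thread.

```vbscript
' Hypothetical helper: reads a whole UTF-8 text file into a VBScript string.
Function ReadUtf8File(ByVal virtualPath)
    Dim stm
    Set stm = Server.CreateObject("ADODB.Stream")
    stm.Type = 2                  ' adTypeText
    stm.Charset = "utf-8"         ' decode the file's bytes as UTF-8
    stm.Open
    stm.LoadFromFile Server.MapPath(virtualPath)
    ReadUtf8File = stm.ReadText   ' no argument: read to the end of the stream
    stm.Close
    Set stm = Nothing
End Function

Response.Charset = "utf-8"
Response.CodePage = 65001

Dim greeting
greeting = ReadUtf8File("strings_ja.txt")   ' hypothetical file name
Response.Write Server.HTMLEncode(greeting)
```

The same idea applies to a database or a resource file per language: the text reaches the script as a Unicode string at run time instead of as bytes embedded in the .asp source.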

Anthony.

Sep 15 '06 #2
