ASP converts Unicode Chars to HTML entities?

Hello

I have the following problem with ASP (using Interdev, Win2003 Server): if a
special character is entered in a textbox, ASP or the client browser (IE 6)
seems to convert this character into an HTML entity.
E.g. characters from this site:
http://unicode.e-workers.de/kyrillisch.php

come back as e.g. &#1051 . I'm not sure where exactly this happens. It
doesn't happen on ASP.NET sites, though. The top of those documents looks
like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="de" >
<head>
<meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf8">
<title>Beorda AG - Account Detail</title>
</head>
......
Does anybody know how to avoid this? Basically I'll need a UTF-8 postback, I
guess. Or I convert the entities to Unicode before storing the values in the
database.

thanks for your hints

beat
Sep 5 '05 #1


Beat Richli wrote:

I have the following problem with ASP (using Interdev, Win2003 Server): if a
special character is entered in a textbox, ASP or the client browser (IE 6)
seems to convert this character into an HTML entity.
E.g. characters from this site:
http://unicode.e-workers.de/kyrillisch.php

come back as e.g. &#1051 . I'm not sure where exactly this happens.


Browsers tend to do that when the encoding is not properly declared and
has to be guessed, or even when the encoding is properly declared but
the characters the user enters are not representable in it. See
<http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html>
If, for instance, your HTML document is encoded as ISO-8859-1 and a
user enters the character "Л" in a form, then browsers do indeed submit it
as %26%231051%3B. ASP decodes %26 as the character '&', %23 as the
character '#', leaves the unencoded digits 1051 as they are, and decodes
%3B as the character ';', so you end up with the string
'&#1051;'
in your ASP Request.Form or Request.QueryString.
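
The decoding step can be reproduced outside ASP. The following is only a sketch in Python (not ASP), assuming the submitted value is exactly the string quoted above. It also shows the workaround of turning the character reference back into a character before storing it; note that this cannot tell a fallback reference apart from a user who literally typed an ampersand, a hash and digits.

```python
from html import unescape
from urllib.parse import unquote

# The raw form value a browser submits for the one Cyrillic letter
# when the page encoding cannot represent it.
raw = "%26%231051%3B"

# Percent-decoding, roughly what the server does before the script sees the value.
decoded = unquote(raw)

# Optional repair step: resolve the numeric character reference again.
restored = unescape(decoded)

print(decoded)   # the character reference as plain text
print(restored)  # the single Cyrillic letter
```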

Thus one way to make sure the browser submits a properly encoded
character and not an encoded HTML character reference is to author the
HTML documents in the encoding UTF-8 and declare that properly, e.g. at
least with a
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
in the head of the document or even better by having the HTTP server
configured to send the HTTP response header
Content-Type: text/html; charset=UTF-8
That way the browser will then for instance encode the entered 'Л' as
'%D0%9B'.
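
To see both outcomes side by side, here is a small Python sketch. The helper `form_encode` is hypothetical and only imitates the browser behaviour described in this post: encode the text in the page's charset, fall back to a numeric character reference for anything that does not fit, then percent-encode the bytes.

```python
from urllib.parse import quote

ch = "\u041b"  # Cyrillic capital letter EL, the character used in this thread

def form_encode(text, page_charset):
    """Hypothetical helper imitating a browser's form encoding."""
    # Characters missing from the charset become numeric character references.
    data = text.encode(page_charset, errors="xmlcharrefreplace")
    # Percent-encode every byte that is not an unreserved character.
    return quote(data, safe="")

latin1_result = form_encode(ch, "iso-8859-1")
utf8_result = form_encode(ch, "utf-8")

print(latin1_result)  # percent-encoded character reference
print(utf8_result)    # percent-encoded UTF-8 bytes
```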

ASP pages can also be authored in UTF-8 by saving them in that encoding
and declaring the corresponding code page 65001, e.g.
<%@ Language="VBScript" CodePage="65001" %>
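
Why the code page matters on the receiving side can be modelled in Python as well. This is a sketch, not ASP: it assumes the browser sent the UTF-8 bytes shown above, and it uses Windows-1252 merely as an example of a non-UTF-8 server code page.

```python
from urllib.parse import unquote

sent = "%D0%9B"  # the UTF-8 form of the Cyrillic letter, as submitted

# Decoded with the matching encoding, the two bytes form one character.
as_utf8 = unquote(sent, encoding="utf-8")

# Decoded with a single-byte code page (assumed here: Windows-1252),
# each byte is read as a separate character and the text is garbled.
as_cp1252 = unquote(sent, encoding="cp1252")

print(len(as_utf8), as_utf8)
print(len(as_cp1252), as_cp1252)
```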

--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Sep 7 '05 #2

thanks a lot Martin. i will check the site again using this information.

greets
beat
Sep 7 '05 #3
