  #1  
February 24th, 2006, 11:55 AM
Emmanuel
iso-8859-1 and UTF-8

Hi,

I have an ASP application that takes text from NVARCHAR and NTEXT columns
and writes it to the response.

The problem has to do with encoding. The text appears to be UTF-8 (I'm not
sure about this, but it's definitely not iso-8859-1), but I need it to be
output as iso-8859-1.



Why do I need to do this? Because the ASP application includes files which
have been saved in iso-8859-1 encoding, and unfortunately I cannot just
change their encoding: these files are used everywhere on the website, so
the consequences could be disastrous.

If I set Response.CharSet = "utf-8", the text I'm taking from the database
appears correctly, but then the include files do not :(

The inverse happens if I set Response.CharSet = "iso-8859-1" :(

Seems like I'm in deep trouble. How can I fix this? :(



thanks


  #2  
February 24th, 2006, 10:05 PM
Anthony Jones
Re: iso-8859-1 and UTF-8

I'll assume that the DB fields are NVARCHAR for a reason. Hence ISO-8859-1
doesn't cover what you need because NVARCHAR is able to represent a wider
range of characters than ISO-8859-1.

So you need to send your pages as UTF-8.
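
As a rough illustration of the range difference (in Python rather than ASP,
only because it makes the bytes easy to inspect), the sketch below checks
whether a few made-up sample strings can be encoded as ISO-8859-1. The helper
name fits_latin1 and the samples are assumptions for illustration, not
anything from the application discussed here.

```python
# Illustrative only: the sample strings are assumptions, not data from the thread.
samples = [
    "caf\u00e9",            # 'cafe' with e-acute (French)
    "Stra\u00dfe",          # 'Strasse' with sharp s (German)
    "\u0141\u00f3d\u017a",  # Polish city name, uses L-stroke and z-acute
    "\u20ac",               # euro sign
]

def fits_latin1(text):
    """Return True if every character in text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False

for s in samples:
    # ascii() keeps the output printable on any console encoding.
    print(ascii(s), fits_latin1(s), s.encode("utf-8"))
```

The Western European samples fit in ISO-8859-1, while the Polish name and the
euro sign do not; UTF-8 can encode all four.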

Are the 'Include files' ASP pages that need processing or static content
that simply needs to be sent to the response?

What would you say is the frequency of characters outside the 7-bit ASCII
range in the include files?



  #3  
February 25th, 2006, 05:45 AM
Emmanuel
Re: iso-8859-1 and UTF-8

hi,

thanks for your reply



The fields are NVARCHAR or NTEXT because they store text in different
languages: mainly European, but I wanted to leave it open for future
expansion.

I must admit that encoding is a subject I know very little about. :(
The fact is, however, that this system is an old one: I developed it a long
time ago, and it is working perfectly on the live servers. Now I need to make
an update (minor changes) to it, but all of a sudden my test server is
behaving as I described earlier. :( I'm not going to risk uploading the
changes to the live servers... :(



The include files are static design files (HTML) which contain menus, side
bars, etc.

The problem lies in the non-English characters: French, Spanish and German
have characters which are not found in the ASCII range. ISO-8859-1 seems to
have these, though, since the include files are displayed correctly.

Could this be an issue with SQL Server?

From your answer I understood that the problem most probably is that SQL
Server is storing the text as UTF-8. Maybe the live server has a different
collation than the test server? But before doing the updates I backed up the
live database and restored it on a local test server, so the collation
should have remained the same, shouldn't it?



"Anthony Jones" <Ant@yadayadayada.com> wrote in message
news:ea0OwyYOGHA.3944@tk2msftngp13.phx.gbl...
> I'll assume that the DB fields are NVARCHAR for a reason. Hence ISO-8859-1
> doesn't cover what you need because NVARCHAR is able to represent a wider
> range of characters than ISO-8859-1.
>
> So you need to send your pages as UTF-8.
>
> Are the 'Include files' ASP pages that need processing or static content
> that simply needs to be sent to the response?
>
> What would you say is the frequency of characters outside the 7-bit ASCII
> range in the include files?


  #4  
February 25th, 2006, 11:25 PM
Anthony Jones
Re: iso-8859-1 and UTF-8

> Could this be an issue with SQL Server?
>
> From your answer I understood that the problem most probably is that SQL
> Server is storing the text as UTF-8. Maybe the live server has a different
> collation than the test server? But before doing the updates I backed up
> the live database and restored it on a local test server, so the collation
> should have remained the same, shouldn't it?

This is not a SQL Server problem. NVARCHAR stores text as 2-byte Unicode
characters. This is the same character format used natively by the script
engines used by ASP.

Ultimately, any string you retrieve from a database field of any data type
ends up as a Unicode string.

When you subsequently send that string to the client with Response.Write, the
Response object converts it to the appropriate charset, as specified by the
Session.CodePage property.
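
The conversion step can be pictured with a small Python sketch (Python is
used here only to show the bytes; it is not how ASP is implemented). The
write_response helper and the lookup table are hypothetical names for
illustration. The code page numbers are the Windows identifiers: 65001 is
UTF-8 and 28591 is ISO-8859-1.

```python
# Sketch: one Unicode string serialized under two different code pages.
CODEPAGE_TO_CODEC = {65001: "utf-8", 28591: "iso-8859-1"}  # hypothetical lookup table

def write_response(text, codepage):
    """Encode text to the bytes that would be sent for the given code page."""
    return text.encode(CODEPAGE_TO_CODEC[codepage])

db_value = "caf\u00e9"  # a Unicode string, e.g. as read from an NVARCHAR column

print(db_value.encode("utf-16-le"))     # in-memory form: 2 bytes per character
print(write_response(db_value, 28591))  # e-acute becomes the single byte 0xE9
print(write_response(db_value, 65001))  # e-acute becomes the two bytes 0xC3 0xA9
```

The string is the same in all three cases; only the byte representation sent
over the wire changes with the code page.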

Your problem is that static content in a page must already be in the
appropriate code page since it will be sent to the client as-is.
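
A minimal Python sketch of the mismatch, assuming a made-up menu label: the
static bytes come from a file saved as ISO-8859-1, the dynamic bytes are
encoded as UTF-8, and both land in the same response body. Whichever charset
the browser is told to use, one half is decoded wrongly.

```python
# Illustrative sample text; the label itself is an assumption.
label = "Men\u00fc"  # 'Menu' with u-umlaut

static_include = label.encode("iso-8859-1")  # file bytes, sent as-is
dynamic_text = label.encode("utf-8")         # script output under code page 65001

page = static_include + b" | " + dynamic_text  # one response body, two encodings

as_utf8 = page.decode("utf-8", errors="replace")  # browser told charset=utf-8
as_latin1 = page.decode("iso-8859-1")             # browser told charset=iso-8859-1

print(ascii(as_utf8))    # static half shows U+FFFD, dynamic half is correct
print(ascii(as_latin1))  # static half is correct, dynamic half shows two junk characters
```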

You will need to re-save each page in UTF-8 format.

Place this at the top of each:

<%@ CodePage=65001 %>

and ensure you inform the client that it is receiving UTF-8 with

Response.CharSet = "UTF-8"

Anthony.
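
For the re-saving step, a batch conversion could look roughly like the Python
sketch below. It assumes the files really are ISO-8859-1 today; the function
name resave_as_utf8 and the file name menu.inc are hypothetical. The UTF-8
check is a heuristic to avoid converting a file twice, not a guarantee, so
back up the files first.

```python
import tempfile
from pathlib import Path

def resave_as_utf8(path, source_encoding="iso-8859-1"):
    """Rewrite one file as UTF-8. Returns True if the file was changed."""
    p = Path(path)
    raw = p.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8 (or plain ASCII): leave it alone
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not valid UTF-8 is in source_encoding.
    p.write_bytes(raw.decode(source_encoding).encode("utf-8"))
    return True

# Demo on a throwaway file standing in for a static include.
with tempfile.TemporaryDirectory() as tmp:
    inc = Path(tmp) / "menu.inc"
    inc.write_bytes("Men\u00fc".encode("iso-8859-1"))
    print(resave_as_utf8(inc), inc.read_bytes())  # converted on the first pass
    print(resave_as_utf8(inc), inc.read_bytes())  # skipped on the second pass
```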



 
