Encoding problem......

I'm working on a multilanguage ASP/HTML site using an IIS 6 web server.

It works perfectly with two languages (English and Italian) in this way:
- basically the same ASP code for every language
- language-specific content is stored in text files; every language has its
own content directory.
- to enhance usability and formatting, the language-specific contents are
stored with HTML syntax; basically the code that normally stands between the
<body> and </body> tags.
- the ASP code loads those (HTML) files and arranges them with
dynamically generated HTML code, then outputs everything to the browser.

No problems here. It works.

So I've decided to give it a try with Chinese, and to accomplish the task I
did the following:

- Inserted the following code in the .asp pages

<%@ CodePage=65001 Language="VBScript"%>
<%
Response.CodePage = 65001
Response.CharSet = "utf-8"
%>

- Changed the HTML header of output pages:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

- Saved the ASP pages containing Chinese characters in UTF-8 encoding

Up to here everything works properly and the browser (IE7) shows Chinese
characters correctly.

Now the problem.

As I've said before, the ASP code loads some HTML code from text files on disk
and formats it. Those HTML templates contain the Chinese text
and are stored in UTF-8; nevertheless, when shown in the browser
the characters are scrambled into something else. The Chinese characters
generated by ASP are shown correctly, but the contents of the text files are
not!

Could it be that when the ASP code loads the text file from disk, the
contents get screwed up,
or
that the web server tries to 'translate' the already UTF-8 encoded text?


Dec 5 '06 #1
Any help ??? :-)

Dec 6 '06 #2

"Atlas" <at*******@my-deja.com> wrote in message
news:12*************@news.supernews.com...
> Now the problem.
>
> As I've said before, the ASP code loads some HTML code from text files on disk
> and formats it. Those HTML templates contain the Chinese text
> and are stored in UTF-8; nevertheless, when shown in the browser
> the characters are scrambled into something else. The Chinese characters
> generated by ASP are shown correctly, but the contents of the text files
> are not!

Clearly the focal point of this problem is the code you use to place the
content of these text files into the response. You may be able to elicit
more help if you post the example code you are using to place the content
into the response.

However my guess is you are using Scripting.FileSystemObject to do this.
This object does not support UTF-8.
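
For reference, one commonly used way to read a UTF-8 file from classic ASP without FileSystemObject is the ADODB.Stream object. This is not something proposed in this thread, only a minimal sketch: the function name ReadUtf8File and the path in the usage line are made up for illustration, and it assumes ADO is available on the server.

```vbscript
<%
Const adTypeText = 2   ' ADODB StreamTypeEnum value for text streams

' Hypothetical helper: returns the whole file as a string,
' decoding its bytes as UTF-8.
Function ReadUtf8File(strFileName)
    Dim stm
    Set stm = Server.CreateObject("ADODB.Stream")
    stm.Type = adTypeText
    stm.Charset = "utf-8"          ' how the bytes on disk are decoded
    stm.Open
    stm.LoadFromFile strFileName
    ReadUtf8File = stm.ReadText    ' no argument: read to the end of the stream
    stm.Close
    Set stm = Nothing
End Function
%>
```

It could then be called as, say, Response.Write ReadUtf8File(Server.MapPath("content/page.htm")), with the response code page set to 65001 so the string is sent back out as UTF-8.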

Dec 6 '06 #3
Ditto!

I am using Scripting.FileSystemObject to load the contents from disk!

Are there any solutions?
Dec 6 '06 #4
Anthony, I'm posting the code I use to load the UTF-8 encoded files from disk:

Function loadContent(functUriZ)
    ' A line starting with this token begins a new content block
    myToken = "@#$"

    ' Default to the URL of the page being requested
    If functUriZ = "" Then
        cliUrl = Request.ServerVariables("URL")
    Else
        cliUrl = functUriZ
    End If

    Set objFSO = Server.CreateObject("Scripting.FileSystemObject")

    ' Walk back to the last "/" to isolate the page name
    begz = Len(cliUrl)
    While Mid(cliUrl, begz, 1) <> "/"
        begz = begz - 1
    Wend
    endz = InStr(begz, cliUrl, ".")
    ' Map e.g. /dir/page.asp to content\page.htm
    strFileName = Server.MapPath("content/") & "\" & _
        Mid(cliUrl, begz + 1, endz - begz) & "htm"

    If objFSO.FileExists(strFileName) Then
        ' 1 = ForReading; no format argument, so the file is opened as ASCII
        Set TS = objFSO.OpenTextFile(strFileName, 1, False)
        temp = Empty
        idx = -1
        p = 1
        While Not TS.AtEndOfStream
            temp = TS.ReadLine
            If Mid(temp, 1, Len(myToken)) = myToken Then
                ' Token line: start the next block and skip the token itself
                idx = idx + 1
                dynamicContent(idx) = ""
                p = 4
            Else
                p = 1
            End If
            dynamicContent(idx) = dynamicContent(idx) & Mid(temp, p, Len(temp) - p + 1)
        Wend

        TS.Close
    End If
End Function
Dec 6 '06 #5

Yes use XML and XSL.


Dec 6 '06 #6
Another option would be to save your text files as Unicode (UTF-16).
FileSystemObject can read Unicode files.

However, if you are willing to take on the learning curve, XML/XSL is a more
appropriate solution.
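
A minimal sketch of the Unicode option, assuming the content files have been re-saved as UTF-16 (the encoding Windows tools usually label "Unicode"). ReadUnicodeFile is a hypothetical helper name, not part of the code posted above. The fourth argument of OpenTextFile selects the format: -1 opens the file as Unicode, while 0 (the default) opens it as ASCII.

```vbscript
<%
Const ForReading = 1
Const TristateTrue = -1   ' open the file as Unicode (UTF-16)

' Hypothetical helper: returns the file contents, or "" if the file
' is missing or empty.
Function ReadUnicodeFile(strFileName)
    Dim objFSO, ts
    ReadUnicodeFile = ""
    Set objFSO = Server.CreateObject("Scripting.FileSystemObject")
    If objFSO.FileExists(strFileName) Then
        Set ts = objFSO.OpenTextFile(strFileName, ForReading, False, TristateTrue)
        ' ReadAll raises an error on an empty file, so check first
        If Not ts.AtEndOfStream Then ReadUnicodeFile = ts.ReadAll
        ts.Close
    End If
    Set objFSO = Nothing
End Function
%>
```

The same format argument can be added to an existing OpenTextFile call that reads line by line; only the way the bytes are decoded changes.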
Dec 6 '06 #7
Oooooooooooooohhh yes!!!!!!!!
I changed the file format to Unicode and the OpenTextFile format argument to
-1 (Unicode) and voilà!
Working perfectly!!!!!

Thanks a lot!!!
Dec 6 '06 #8
Sadness after happiness.

Unfortunately I was testing your suggestions successfully on IIS 6, but
the production server is running IIS 5.0.

As a result it returned errors on the Response.CodePage statements
(unsupported).
So I had a go with Session.CodePage, but I'm getting unwanted results.
Should I move to another ISP, or is there something else I can still try?
I'm not sure at this point that IIS 5.0 is capable of handling a mixture of
UTF-8 and Unicode......
Dec 7 '06 #9
The problem is that Response.CodePage is new in IIS 6; it wasn't present in
IIS 5. Session.CodePage will stick for the duration of the session. Hence any
pages that do not assign to Session.CodePage will end up using whatever
codepage was last set.

A kludgy alternative is:

Dim lCodePage : lCodePage = Session.CodePage
Session.CodePage = WhateverYourDesiredCodePageIs

' ...
' ... do all your stuff here
' ...

Session.CodePage = lCodePage

Or make sure that every page throughout the whole application sets the
applicable Session.CodePage.

Personally I would use Response.CharSet = "utf-8" and Session.CodePage = 65001
throughout the whole site. Just be sure to save any static content that
contains characters outside the ASCII range as UTF-8 files and don't use any
such characters in script literals. Your Unicode text file based approach
will still work since you are no doubt using Response.Write to send the
content.
Dec 7 '06 #10
> Personally I would use Response.CharSet = "utf-8" and Session.CodePage = 65001
> throughout the whole site. Just be sure to save any static content that
> contains characters outside the ASCII range as UTF-8 files and don't use
> any such characters in script literals.

I don't know if my approach is overkill, but I've taken a transactional
approach, so every ASP page includes some initializing ASP pages. I could set
the session settings in those includes.

> Your Unicode text file based approach
> will still work since you are no doubt using Response.Write to send the
> content.

Not always; often the ASP code contains HTML code plus some <%=something%>.
Is that equivalent to a Response.Write?
Dec 9 '06 #11

"Atlas" <at*******@my-deja.com> wrote in message
news:u8******************@tornado.fastwebnet.it...
> I don't know if my approach is overkill, but I've taken a transactional
> approach, so every ASP page includes some initializing ASP pages. I could
> set the session settings in those includes.

I'm not sure what 'transactional approach' means in this context; however, if
every page shares a common include, then yes, placing these lines in it:

Session.CodePage = 65001
Response.ContentType = "text/html"
Response.CharSet = "utf-8"

would ensure everything ends up as UTF-8 when sent to the client.

>> Your Unicode text file based approach
>> will still work since you are no doubt using Response.Write to send the
>> content.
>
> Not always; often the ASP code contains HTML code plus some
> <%=something%>.
> Is that equivalent to a Response.Write?

Not quite. The <%=something%> expression itself behaves like Response.Write,
but the chunk of bytes outside the script delimiters is sent as is, much like
a Response.BinaryWrite. Hence HTML code saved in an ASP file
needs to be encoded as per the Response.CharSet value sent to the client.
If the HTML is composed entirely of ASCII characters (0-127), then even a
file saved in an ANSI format will be OK. However, where the HTML code
contains characters outside this range, you will need to save the file in
UTF-8 encoding. The only limitation here is that you then can't use characters
outside the ASCII range in string literals (constants) inside the ASP script
code.
Dec 9 '06 #12
"Anthony Jones" <An*@yadayadayada.com> wrote in message
news:uK**************@TK2MSFTNGP05.phx.gbl...
> Not quite. The <%=something%> expression itself behaves like Response.Write,
> but the chunk of bytes outside the script delimiters is sent as is, much like
> a Response.BinaryWrite. Hence HTML code saved in an ASP file
> needs to be encoded as per the Response.CharSet value sent to the client.
> If the HTML is composed entirely of ASCII characters (0-127), then even a
> file saved in an ANSI format will be OK. However, where the HTML code
> contains characters outside this range, you will need to save the file in
> UTF-8 encoding. The only limitation here is that you then can't use characters
> outside the ASCII range in string literals (constants) inside the ASP script
> code.

Anthony, I tried following your advice but was still getting garbage in
the browser. I thought I had some trouble with the server, so I asked the
ISP to move our site to IIS 6, and they did, only for me to discover that I
was still getting garbage on the new server. Finally I discovered that when
transferring the Unicode/UTF-8 pages by FTP, the client was scrambling the
contents because it was transferring in A (ASCII) mode. Once I switched
transfers to I (binary) mode, it worked perfectly. So it probably would have
worked with your tips on IIS 5 too.

Nevertheless, thanks a lot for helping.
Dec 19 '06 #13
