Hello Everybody!
I am going to develop a multi-language website that will include
Traditional Chinese, Simplified Chinese, Thai, Japanese, etc. (and
English as well). I would like to take advantage of the localization
support that ASP.NET offers. I will probably use resources and
ResourceManager to retrieve localized text, along with XML
serialization of datasets.
There is no need to store data in a SQL Server database, since the
website is essentially a company portfolio (articles, news, products
and services offered) and will be updated very rarely. However,
the client wants to be able to edit, modify, and add new data, so I
will need to create a separate web-based administration tool
for them as well.
Both the front end and the back end (the tool) will probably have to
support multiple languages.
Do you think UTF-8 encoding can handle all the languages above, or
would UTF-16 be better? I am just worried about file sizes and
performance issues when using UTF-16.
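To make the size question concrete, here is a rough sketch in Python (not the site's language, just convenient for counting bytes). The sample strings and the helper name `encoded_sizes` are made up for illustration; the byte counts exclude any signature (BOM).

```python
# Compare the encoded size of sample strings in UTF-8 and UTF-16.
# The sample strings below are hypothetical, chosen only for illustration.

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, not counting a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

samples = {
    "English": "Hello",
    "Chinese": "中文",
    "Markup plus Chinese": '<p class="title">中文</p>',
}

for label, text in samples.items():
    utf8_size, utf16_size = encoded_sizes(text)
    print(f"{label}: UTF-8 = {utf8_size} bytes, UTF-16 = {utf16_size} bytes")
```

Chinese, Japanese, and Thai characters generally take 3 bytes each in UTF-8 versus 2 in UTF-16, while ASCII characters take 1 byte versus 2. Since *.aspx and *.xml files are mostly ASCII markup, I would guess UTF-8 comes out smaller overall for pages like these, but that depends on the ratio of markup to text.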
I would store all of the website's internal files (*.aspx, *.xml,
*.resx) using UTF-8 encoding, with a signature (BOM) whenever
possible.
I would also set up the globalization section of web.config as
follows:
<globalization
    fileEncoding="utf-8"
    requestEncoding="utf-8"
    responseEncoding="utf-8" />
However, I have heard that Windows 98 and Windows ME do not support
Unicode. Should I instead use a different encoding for each country,
such as Big5, GB2312, or Shift-JIS?
I could dynamically set Response.ContentEncoding before showing an
*.aspx page, depending on the country, but will it really work? I mean,
if I use UTF-8 encoding internally for storing all *.aspx pages (as
well as data), will the ASP.NET Framework convert encodings on the fly
before sending them to the client?
It would have to convert UTF-8 to Big5, for example, when showing a
Chinese page. I do not want to create separate websites (copies) for
each country.
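To illustrate what I mean by converting on the fly, here is a small Python sketch. It is not ASP.NET and does not show what the framework actually does; it only shows the kind of transcoding I am asking about. The helper name `transcode` and the sample strings are my own assumptions.

```python
# Re-encode text that is stored as UTF-8 into a legacy, per-country encoding.
# This is an illustration of the idea only, not ASP.NET behaviour.

def transcode(utf8_bytes, target_encoding):
    """Decode UTF-8 bytes and re-encode them.

    Returns None when the text contains characters that the target
    encoding cannot represent.
    """
    text = utf8_bytes.decode("utf-8")
    try:
        return text.encode(target_encoding)
    except UnicodeEncodeError:
        return None

stored_chinese = "中文".encode("utf-8")  # hypothetical page content
stored_thai = "ไทย".encode("utf-8")      # hypothetical page content

print(transcode(stored_chinese, "big5"))  # Chinese text fits in Big5
print(transcode(stored_thai, "big5"))     # Thai text does not, so None
```

One thing this makes clear to me is that such a conversion can only work when every character on the page exists in the target encoding, so a page mixing languages (say, a Thai product name on a Chinese page) could not be sent as Big5 without losing characters.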
One more question: if I receive a text file that uses Big5 encoding,
can I simply open it in Notepad and save it as UTF-8? Will this
take care of the conversion between them?
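As an alternative to Notepad, I imagine an explicit conversion where the source encoding is stated rather than guessed, roughly like this Python sketch. The function name `convert_big5_to_utf8` and the file names are hypothetical, and it assumes the input really is Big5.

```python
import os
import tempfile

def convert_big5_to_utf8(src_path, dst_path):
    """Read a Big5 text file and rewrite it as UTF-8 with a signature (BOM)."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="big5", newline="") as src:
        text = src.read()
    # "utf-8-sig" writes the 3-byte UTF-8 signature before the text.
    with open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        dst.write(text)

# Small demonstration using temporary files with made-up names.
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "article_big5.txt")
    target = os.path.join(folder, "article_utf8.txt")
    with open(source, "wb") as f:
        f.write("中文".encode("big5"))
    convert_big5_to_utf8(source, target)
    with open(target, "rb") as f:
        print(f.read())
```

If the file is not actually Big5, the read step would either fail or produce wrong characters, so I would still want to check the result by eye.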
Any suggestions will be appreciated.
Regards,
<Remi>