utf8-encoding

Hello all,

I am building a CMS that supports two languages: English and Traditional Chinese.
My problem is that all the data is displayed as "?????????", even though
every page's encoding is set to UTF-8.

Do I need to encode the text as UTF-8 before inserting it into the database?
Or do I need to do something when the content is displayed?

Thanks in advance.
Feb 2 '06 #1
You'll need to give more information about your situation, I'm afraid.
Describe the different layers, whether it's a webapp or a Windows Forms
app etc.

You shouldn't need to do any work before putting a string into a
database though, so long as the database (and the column in question)
supports Unicode.

See http://www.pobox.com/~skeet/csharp/d...ngunicode.html for more
information on how to diagnose problems.

Jon

Feb 2 '06 #2
In addition to what Jon listed, information about how you display those
????? values would be useful. All too often I see long discussions about
MS SQL Server not handling Unicode, and then it finally turns out they are
using Enterprise Manager to look at the data, and EM doesn't handle
non-English character sets very well. The data might very well be OK, though.

--
Lasse Vågsæther Karlsen
http://usinglvkblog.blogspot.com/
mailto:la***@vkarlsen.no
PGP KeyID: 0x2A42A1C2
Feb 2 '06 #3
Thanks for all the advice.

My project is an ASP.NET C# project.
I use a simple form to enter the content into the database and a simple
query to read it back, and then "?????" is displayed on my web page.
My page encoding is set to UTF-8 by default.

Is that enough information? Thanks in advance.

Feb 2 '06 #4
That's enough to start with. Now apply the techniques talked about in
the article I linked to before, and find out where the problem is. It
sounds like it shouldn't be too hard to find. Just log the contents of
the string (as Unicode numbers) when you store it in the database and
when you retrieve it. If they're the same, the problem is in the code
communicating between the server and the web browser. If they're
different, the problem is in the code communicating with the database.
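
A minimal sketch of that comparison, assuming a console program. The names ToHex and SimulateLossyLayer are hypothetical, and the lossy layer is only a stand-in (it forces the text through ASCII) for whichever real layer is losing the characters:

```csharp
using System;
using System.Text;

public static class RoundTripCheck
{
    // Formats each UTF-16 code unit of the string as four hex digits.
    public static string ToHex(string value)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in value)
        {
            sb.AppendFormat("{0:x4} ", (int)c);
        }
        return sb.ToString().TrimEnd();
    }

    // Hypothetical stand-in for a broken layer: a single-byte encoding
    // replaces every character it cannot represent with '?'.
    public static string SimulateLossyLayer(string value)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(value);
        return Encoding.ASCII.GetString(bytes);
    }

    public static void Main()
    {
        string stored = "\u4e2d\u6587";               // two Chinese characters
        string retrieved = SimulateLossyLayer(stored);

        Console.WriteLine("stored:    " + ToHex(stored));     // 4e2d 6587
        Console.WriteLine("retrieved: " + ToHex(retrieved));  // 003f 003f
        Console.WriteLine(stored == retrieved
            ? "same: look between the server and the browser"
            : "different: look between the code and the database");
    }
}
```

A run of 003f values in the second dump means real question marks were stored or returned, so the characters were already lost before the page was rendered.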

Jon

Feb 2 '06 #5
Are you using nvarchar in the database? If not, the database cannot handle
these characters.

Feb 2 '06 #6
Yes, you are right.
I am using nvarchar and ntext to store the data.
Thanks.
"Mats-Lennart Hansson" <ap********@hotmail.com> wrote in message news:Ob**************@TK2MSFTNGP10.phx.gbl...
Are you using nvarchar in the database? If not, the database cannot handle
these characters.


Feb 2 '06 #7
Thanks.

Should I use this method to print the Unicode numbers?

static void DumpString (string value)
{
    foreach (char c in value)
    {
        Console.Write ("{0:x4} ", (int)c);
    }
    Console.WriteLine();
}

Feb 3 '06 #8
My project is an ASP.NET C# project.
I use a simple form to enter the content into the database and a simple
query to read it back, and then "?????" is displayed on my web page.
My page encoding is set to UTF-8 by default.

Thanks in advance.

Feb 3 '06 #9
beachboy <st*****@javacatz.com> wrote:
Should I use this method to print the Unicode numbers?

static void DumpString (string value)
{
    foreach (char c in value)
    {
        Console.Write ("{0:x4} ", (int)c);
    }
    Console.WriteLine();
}


Yes.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 3 '06 #10
Sorry, I am a beginner at CMS development...
==========
private void Page_Load(object sender, System.EventArgs e)
{
    // Put user code to initialize the page here
    DumpString("stan");
}

public void DumpString(string src)
{
    StringBuilder sb = new StringBuilder();
    foreach (char c in src)
    {
        Response.Write( "{0:x4}" , (int)c);
        Response.Write("[]");
    }
}

==========
but I get an error on the line "Response.Write( "{0:x4}" , (int)c);"

Thank you!

Feb 3 '06 #11
Well, you can use Response.Write (string.Format("{0:x4}", (int)c));
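
The original line fails because Response.Write has no overload that takes a format string plus arguments. As a sketch of one way to restructure it, the dump can be built with the StringBuilder that the posted code declares but never uses, and then written once. FormatCodeUnits is a hypothetical name, and the console Main only stands in for the page, where the call would be Response.Write(FormatCodeUnits(src)):

```csharp
using System;
using System.Text;

public static class CodeUnitDump
{
    // Returns every UTF-16 code unit as four hex digits followed by "[]",
    // matching the output the posted loop was trying to produce.
    public static string FormatCodeUnits(string src)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in src)
        {
            sb.AppendFormat("{0:x4}[]", (int)c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Prints 0073[]0074[]0061[]006e[]
        Console.WriteLine(FormatCodeUnits("stan"));
    }
}
```

Returning a string rather than writing inside the loop keeps the helper usable from a page, a console app, or a log call.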

However, another alternative is to write the string content to a log
file instead, so that you can still see what appears on the web page,
but separately find out what that consists of.
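
A rough sketch of the log-file approach, assuming the process may write to the temp directory. StringLog, AppendDump, and the file name unicode-dump.log are hypothetical, and a web application would pick its own writable path:

```csharp
using System;
using System.IO;
using System.Text;

public static class StringLog
{
    // Appends one line of the form "label: xxxx xxxx ..." to the log file,
    // where each xxxx is a UTF-16 code unit in hex.
    public static void AppendDump(string path, string label, string value)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(label).Append(": ");
        foreach (char c in value)
        {
            sb.AppendFormat("{0:x4} ", (int)c);
        }
        sb.AppendLine();
        File.AppendAllText(path, sb.ToString());
    }

    public static void Main()
    {
        // Hypothetical log location for this demonstration only.
        string path = Path.Combine(Path.GetTempPath(), "unicode-dump.log");
        AppendDump(path, "before insert", "\u4e2d\u6587");
        Console.WriteLine(File.ReadAllText(path));
    }
}
```

Because the log holds only hex digits and ASCII labels, it reads the same in any editor, so the tool used to view it cannot introduce a second encoding problem.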

I'd look at the database side first though, as you can do that with a
console app very easily.

Jon

Feb 3 '06 #12
