XML is UTF-16 encoding but I want UTF-8 encoding?

Hi there XML Gurus ;)

I am trying to use XML Serialization to create XML from a class. This is the output I get when I create the XML and put it in a string variable.

"<?xml version=\"1.0\" encoding=\"utf-16\"?><OutBS><FirstName>XML</FirstName><LastName>Guru</LastName><MemberName>gu**@xml.com</MemberName></OutBS>"

My question is: why am I getting the encoding as UTF-16? I want UTF-8 encoding. How can this be achieved while getting the output in the xout string variable?

Help is really appreciated.
thanks,
shailendra batham

// Create the xout xml here.
OutBS obs = new OutBS();
obs.FirstName = "XML";
obs.LastName = "Guru";
obs.MemberName = "gu**@xml.com";

StringWriter sw = new StringWriter();
XmlSerializer xs = new XmlSerializer(obs.GetType());
xs.Serialize(sw, obs);
String xout = sw.ToString();

public class OutBS
{
    public String FirstName;
    public String LastName;
    public String MemberName;
}

--------------------------------------------------------------------------------

Nov 12 '05 #1
StringWriter can only produce UTF-16 by definition.
If you want some other encoding use StreamWriter.

------------------------------------------------------------------------------

Nov 12 '05 #2
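The StreamWriter suggestion in #2 could look roughly like the sketch below. It is only one way to do it, under a few assumptions: the class name Utf8StreamWriterExample and the method SerializeToUtf8Bytes are made up for illustration, the e-mail address is a placeholder, and new UTF8Encoding(false) is used so that no byte order mark is written. XmlSerializer takes the encoding name for the XML declaration from the writer it is given, so the declaration should read encoding="utf-8" here.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

public class OutBS
{
    public string FirstName;
    public string LastName;
    public string MemberName;
}

// Hypothetical helper class, not part of the original thread.
public static class Utf8StreamWriterExample
{
    // Serializes through a StreamWriter so the output bytes are UTF-8.
    public static byte[] SerializeToUtf8Bytes(object value)
    {
        XmlSerializer serializer = new XmlSerializer(value.GetType());
        using (MemoryStream stream = new MemoryStream())
        {
            // UTF8Encoding(false) omits the byte order mark;
            // Encoding.UTF8 would write one at the start of the stream.
            using (StreamWriter writer = new StreamWriter(stream, new UTF8Encoding(false)))
            {
                serializer.Serialize(writer, value);
            }
            // ToArray() still works after the stream has been closed.
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        OutBS obs = new OutBS();
        obs.FirstName = "XML";
        obs.LastName = "Guru";
        obs.MemberName = "guru@example.com"; // placeholder address

        byte[] utf8Bytes = SerializeToUtf8Bytes(obs);

        // Decode only for display; the byte[] is what you would write to a file or socket.
        Console.WriteLine(Encoding.UTF8.GetString(utf8Bytes));
    }
}
```

The same idea works with a FileStream in place of the MemoryStream if the XML is going straight to disk.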
"Shailendra Batham" <sg******@sbcglobal.net> wrote in message news:vm******************@newssvr21.news.prodigy.com...
My question is: why am I getting the encoding as UTF-16? I want
UTF-8 encoding. How can this be achieved while getting the output
in the xout string variable?


Shailendra,

In .NET, a String is a sequence of zero or more Chars, and each
Char is a 16-bit UTF-16 code unit. Therefore, a String in memory
is always UTF-16, and can be no other encoding.

If you really care about getting UTF-8 encoding, then instead of
serializing to a StringWriter you should serialize to a
MemoryStream, which wraps a byte[]. You'd have to decode the
byte[] back into a String (UTF-16) to read it naturally, but the
byte[] itself would contain UTF-8 encoded data.

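- - - Utf8Serialize.cs (excerpt)
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
// . . .
// xs and obs are the XmlSerializer and OutBS instance from your post.
MemoryStream memStrm = new MemoryStream();
// Encoding.UTF8 writes a byte order mark; new UTF8Encoding(false) omits it.
XmlTextWriter xmlSink = new XmlTextWriter(memStrm, Encoding.UTF8);
xs.Serialize(xmlSink, obs);
xmlSink.Flush();

// ToArray() returns only the bytes written; GetBuffer() also returns
// the unused tail of the internal buffer.
byte[] utf8EncodedData = memStrm.ToArray();
// . . .
- - -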
Derek Harmon
Nov 12 '05 #3
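If the goal is still to end up with a string in xout whose declaration says utf-8, one commonly used workaround is to subclass StringWriter and override its Encoding property, since XmlSerializer reads the declared encoding from the writer. The sketch below assumes that approach; Utf8StringWriter and Utf8DeclarationExample are made-up names and the e-mail address is a placeholder. Note that the string itself is still UTF-16 in memory, as explained in #3. Only the declaration changes, so the bytes become real UTF-8 only when the string is encoded with Encoding.UTF8 on the way out.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

public class OutBS
{
    public string FirstName;
    public string LastName;
    public string MemberName;
}

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

public static class Utf8DeclarationExample
{
    // Returns the XML as a string whose declaration names utf-8.
    public static string SerializeToString(object value)
    {
        XmlSerializer serializer = new XmlSerializer(value.GetType());
        using (Utf8StringWriter writer = new Utf8StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        OutBS obs = new OutBS();
        obs.FirstName = "XML";
        obs.LastName = "Guru";
        obs.MemberName = "guru@example.com"; // placeholder address

        string xout = SerializeToString(obs);
        Console.WriteLine(xout);

        // Encode explicitly when the text leaves the process, so the
        // bytes match what the declaration claims.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(xout);
        Console.WriteLine(utf8Bytes.Length);
    }
}
```

This keeps the original StringWriter-style code almost unchanged, at the cost of a declaration that is only truthful once the string is written out as UTF-8.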
