EncoderFallbackException when writing characters not available in the specified encoding with XmlWriter to a Stream - feature or bug?

I've stumbled upon unexpected behavior of the .NET 2.0 System.Xml.XmlWriter
class when using it to write data to a binary stream (System.IO.Stream).

If the amount of data is below a certain threshold (which varies depending
on the data being written), characters not available in the encoding
specified in the Encoding property of the XmlWriterSettings instance used to
create the XmlWriter are written to the resulting XML document as character
entities. I can't point to a specific part of the documentation, but I would
call that the expected behavior.

For larger amounts of data, either an implicit or an explicit flush of the
XmlWriter buffer causes a System.Text.EncoderFallbackException with the
message "Unable to translate Unicode character \uXXXX at index XX to
specified code page.", where the character indicated is one not available in
the destination encoding.

The documentation of the Encoding property of XmlWriterSettings says "An
exception is thrown when the Flush method is called if any encoding errors
are encountered.", but the Encoding instance assigned to the
XmlWriterSettings would not cause the exception by itself: its
EncoderFallback property holds a System.Text.InternalEncoderBestFitFallback
instance, which is replaced with System.Xml.CharEntityEncoderFallback in the
created XmlWriter instance.
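
For anyone who wants to check the fallback on the Encoding instance
themselves, here is a minimal sketch. The class name FallbackCheck and the
helper DescribeFallback are made up for illustration, and it assumes
"iso-8859-2" resolves without any extra setup, as it does on the .NET
Framework. It only inspects the public Encoding.EncoderFallback property; it
does not look inside XmlWriter.

```csharp
using System;
using System.Text;

public static class FallbackCheck
{
    // Hypothetical helper: returns the full type name of the encoder
    // fallback attached to the given encoding.
    public static string DescribeFallback(Encoding enc)
    {
        return enc.EncoderFallback.GetType().FullName;
    }

    public static void Main()
    {
        // Assumption: this code page is available without registering
        // an additional encoding provider.
        Encoding enc = Encoding.GetEncoding("iso-8859-2");

        // Fallback on the encoding before it is handed to XmlWriterSettings.
        Console.WriteLine(DescribeFallback(enc));

        // Encoding a character outside the code page directly, to see
        // whether this encoding on its own throws or substitutes.
        byte[] bytes = enc.GetBytes("\u2014");
        Console.WriteLine(bytes.Length);
    }
}
```

If the first line printed is not an exception fallback and the GetBytes call
completes, then the exception in the trace below has to come from the
fallback chain installed by the XmlWriter rather than from the encoding as I
configured it.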

An example stack trace is as follows:

System.Text.EncoderFallbackException: Unable to translate Unicode character \u2014 at index 92 to specified code page.
   at System.Text.EncoderExceptionFallbackBuffer.Fallback(Char charUnknown, Int32 index)
   at System.Xml.CharEntityEncoderFallbackBuffer.Fallback(Char charUnknown, Int32 index)
   at System.Text.EncoderFallbackBuffer.InternalFallback(Char ch, Char*& chars)
   at System.Text.SBCSCodePageEncoding.GetBytes(Char* chars, Int32 charCount, Byte* bytes, Int32 byteCount, EncoderNLS encoder)
   at System.Text.EncoderNLS.Convert(Char* chars, Int32 charCount, Byte* bytes, Int32 byteCount, Boolean flush, Int32& charsUsed, Int32& bytesUsed, Boolean& completed)
   at System.Text.EncoderNLS.Convert(Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Int32 byteCount, Boolean flush, Int32& charsUsed, Int32& bytesUsed, Boolean& completed)
   at System.Xml.XmlEncodedRawTextWriter.EncodeChars(Int32 startOffset, Int32 endOffset, Boolean writeAllToStream)
   at System.Xml.XmlEncodedRawTextWriter.FlushBuffer()
   at System.Xml.XmlEncodedRawTextWriter.WriteElementTextBlock(Char* pSrc, Char* pSrcEnd)
   at System.Xml.XmlEncodedRawTextWriter.WriteString(String text)
   at System.Xml.XmlWellFormedWriter.WriteString(String text)
   at XmlWriterEncoderFallbackExceptionTest.Program.Main(String[] args)

Example code to reproduce the problem:

using System;
using System.Text;
using System.Xml;
using System.IO;

namespace XmlWriterEncoderFallbackExceptionTest
{
    class Program
    {
        static void Main(string[] args)
        {
            //Encoding enc = Encoding.UTF8;
            Encoding enc = Encoding.GetEncoding("iso-8859-2");
            //Encoding enc = Encoding.ASCII;

            string u2014 = "any string with characters not available in the "
                + "specified encoding \u2014 like ISO-8859-2, for example";

            MemoryStream ms = new MemoryStream();

            XmlWriterSettings set = new XmlWriterSettings();
            set.Encoding = enc;

            XmlWriter xwr = XmlWriter.Create(ms, set);

            try
            {
                xwr.WriteStartElement("root-element");

                for (int i = 0; i < 1000; i++)
                {
                    // causes the said exception if (presumably) the
                    // internal buffer is implicitly flushed
                    xwr.WriteString(u2014);
                }

                // causes the said exception after a certain amount of data
                // has been written, if the WriteString call didn't do so
                // earlier
                xwr.Flush();

                xwr.WriteEndElement();

                // implicitly flushes the buffer, thus may also cause the
                // said exception, but not specifically in this example
                // program
                xwr.Close();
            }
            catch (EncoderFallbackException efe)
            {
                Console.WriteLine(efe);
            }

            // print the size of the output and its last 512 bytes (or all
            // of it, if shorter)
            byte[] result = ms.ToArray();
            Console.WriteLine(result.Length);
            Console.WriteLine(
                result.Length > 512 ?
                    enc.GetString(result, result.Length - 512, 512) :
                    enc.GetString(result));
        }
    }
}

Is there a problem in my code or is it an issue in the .NET Framework 2.0
(System.Xml.CharEntityEncoderFallback class specifically, I'd guess) itself?

--
Janusz Nykiel

Mar 1 '07 #1