March 1st, 2007, 07:05 PM
Janusz Nykiel
EncoderFallbackException when writing characters not available in the specified encoding with XmlWriter to a Stream - feature or bug?

I've stumbled upon unexpected behavior of the .NET 2.0 System.Xml.XmlWriter
class when using it to write data to a binary stream (System.IO.Stream).

If the amount of data is below a certain threshold (which varies depending on
the data being written), characters not available in the encoding specified
in the Encoding property of the XmlWriterSettings instance used to create
the XmlWriter are written to the resulting XML document as character
entities. I can't point to a specific part of the documentation, but I would
call this the expected behavior.

For a larger amount of data, either an implicit or an explicit flush of the
XmlWriter buffer causes a System.Text.EncoderFallbackException with the
message "Unable to translate Unicode character \uXXXX at index XX to
specified code page.", where the character indicated is one not available in
the destination encoding.

The documentation of the Encoding property of XmlWriterSettings says "An
exception is thrown when the Flush method is called if any encoding errors
are encountered.", but the Encoding instance assigned to the
XmlWriterSettings would not cause the exception by itself: its
EncoderFallback property holds a System.Text.InternalEncoderBestFitFallback
instance, which is replaced with System.Xml.CharEntityEncoderFallback in the
created XmlWriter instance.
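To make the fallback point concrete, here is a minimal sketch (separate from the test program further down) that prints which fallback the encoding carries and shows what the exception fallback reports for the same character. FallbackProbe and DescribeFailure are names made up for illustration, and the sketch assumes the .NET Framework, where "iso-8859-2" is available without registering an encoding provider.

```csharp
using System;
using System.Text;

public static class FallbackProbe
{
    // Hypothetical helper: returns null when every character of `text` can
    // be encoded, otherwise the message of the EncoderFallbackException
    // raised by a strict copy of the encoding (one that uses
    // EncoderFallback.ExceptionFallback instead of its default fallback).
    public static string DescribeFailure(string encodingName, string text)
    {
        Encoding strict = Encoding.GetEncoding(
            encodingName,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetBytes(text);
            return null;
        }
        catch (EncoderFallbackException e)
        {
            return e.Message;
        }
    }

    public static void Main()
    {
        // Fallback configured on the encoding as returned by GetEncoding.
        Encoding enc = Encoding.GetEncoding("iso-8859-2");
        Console.WriteLine(enc.EncoderFallback.GetType().FullName);

        // The same kind of text, pushed through the exception fallback.
        Console.WriteLine(DescribeFailure("iso-8859-2", "dash \u2014 here"));
    }
}
```

This only exercises System.Text directly; it says nothing about how XmlWriter chains its own CharEntityEncoderFallback in front of the exception fallback, which is the part in question.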

An example stack trace is as follows:

System.Text.EncoderFallbackException: Unable to translate Unicode character \u2014 at index 92 to specified code page.
   at System.Text.EncoderExceptionFallbackBuffer.Fallback(Char charUnknown, Int32 index)
   at System.Xml.CharEntityEncoderFallbackBuffer.Fallback(Char charUnknown, Int32 index)
   at System.Text.EncoderFallbackBuffer.InternalFallback(Char ch, Char*& chars)
   at System.Text.SBCSCodePageEncoding.GetBytes(Char* chars, Int32 charCount, Byte* bytes, Int32 byteCount, EncoderNLS encoder)
   at System.Text.EncoderNLS.Convert(Char* chars, Int32 charCount, Byte* bytes, Int32 byteCount, Boolean flush, Int32& charsUsed, Int32& bytesUsed, Boolean& completed)
   at System.Text.EncoderNLS.Convert(Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Int32 byteCount, Boolean flush, Int32& charsUsed, Int32& bytesUsed, Boolean& completed)
   at System.Xml.XmlEncodedRawTextWriter.EncodeChars(Int32 startOffset, Int32 endOffset, Boolean writeAllToStream)
   at System.Xml.XmlEncodedRawTextWriter.FlushBuffer()
   at System.Xml.XmlEncodedRawTextWriter.WriteElementTextBlock(Char* pSrc, Char* pSrcEnd)
   at System.Xml.XmlEncodedRawTextWriter.WriteString(String text)
   at System.Xml.XmlWellFormedWriter.WriteString(String text)
   at XmlWriterEncoderFallbackExceptionTest.Program.Main(String[] args)

Example code to reproduce the problem is as follows:

using System;
using System.Text;
using System.Xml;
using System.IO;

namespace XmlWriterEncoderFallbackExceptionTest
{
    class Program
    {
        static void Main(string[] args)
        {
            //Encoding enc = Encoding.UTF8;
            Encoding enc = Encoding.GetEncoding("iso-8859-2");
            //Encoding enc = Encoding.ASCII;

            // U+2014 (em dash) has no representation in ISO-8859-2
            string u2014 = "any string with characters not available in the "
                + "specified encoding \u2014 like ISO-8859-2, for example";

            MemoryStream ms = new MemoryStream();

            XmlWriterSettings set = new XmlWriterSettings();
            set.Encoding = enc;

            XmlWriter xwr = XmlWriter.Create(ms, set);

            try
            {
                xwr.WriteStartElement("root-element");

                for (int i = 0; i < 1000; i++)
                {
                    // causes the said exception if (presumably) the internal
                    // buffer is implicitly flushed
                    xwr.WriteString(u2014);
                }

                // causes the said exception after a certain amount of data
                // has been written, if the WriteString call didn't do so
                // earlier
                xwr.Flush();

                xwr.WriteEndElement();

                // implicitly flushes the buffer thus may also cause the said
                // exception, but not specifically in this example program
                xwr.Close();
            }
            catch (EncoderFallbackException efe)
            {
                Console.WriteLine(efe);
            }

            // show how much was written and the tail of the output
            byte[] result = ms.ToArray();
            Console.WriteLine(result.Length);
            Console.WriteLine(
                result.Length > 512 ?
                    enc.GetString(result, result.Length - 512, 512) :
                    enc.GetString(result));
        }
    }
}

Is there a problem in my code or is it an issue in the .NET Framework 2.0
(System.Xml.CharEntityEncoderFallback class specifically, I'd guess) itself?

--
Janusz Nykiel

 
