Invalid character in XML

Hi there,

I have a 600MB XML file that I am trying to pull a small amount of
data from, using an XmlTextReader in C#.

All works well until an exception is thrown at line 4,277,905
because of a character that is illegal for the encoding type:
"There is an invalid character in the given encoding. Line 4277905,
position 26."

Now, this file is obviously fairly large - too large for a text editor
- so I was wondering two things:

1) Is there a way to change the encoding type of the XmlTextReader
object? (I had a quick look but it seems to be read-only.)
2) Is there another way to ignore errors in an element? I know that
the error at the line mentioned above is not within data that I am
trying to extract on this run through, so I can safely ignore it.

TIA
Marc.
Nov 12 '05 #1
Hi Marc,

As far as I know, you can open the XML file as a stream and specify the
encoding when constructing the StreamReader object. The encoding cannot be
changed once the reader has been constructed. You can also specify the
encoding in the XmlTextReader constructor through an XmlParserContext.
If you need to ignore the exception, you can catch it and do nothing,
though the reader will generally not be able to continue past the error.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Nov 12 '05 #2
Marc Jennings wrote:
> Hi there,
>
> I have a 600MB XML file that I am trying to pull a small amount of
> data from, using an XmlTextReader in C#.
>
> All works well until an exception is thrown at line 4,277,905
> because of a character that is illegal for the encoding type:
> "There is an invalid character in the given encoding. Line 4277905,
> position 26."

Not a very useful piece of software if it doesn't actually say what
the character is...

> Now, this file is obviously fairly large - too large for a text editor

No, Emacs can easily handle a file this big (assuming you have a
sensible amount of memory). Otherwise use standard text utilities, e.g.

$ head -n 4277905 myfile.xml | tail -n 1

These are available for Windows systems if you install Cygwin.

Some people dislike using console utilities, but they should be in the
toolbag of any heavy XML user for the occasions when other methods fail.
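
As a rough sketch of that approach (not from the thread itself): the helper names show_line and show_bytes below are made up for illustration, and myfile.xml stands for the large file. show_line prints a single line by number and stops reading there; show_bytes dumps the same line in hex, so the byte the parser rejects can be spotted even though the error message does not name it. The "position 26" in the message is probably a character count, which need not match the byte offset.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not from any tool.

# show_line FILE LINE
# Print only the requested line. sed quits right after printing it,
# so the remainder of a very large file is never read.
show_line() {
    LC_ALL=C sed -n "${2}{p;q;}" "$1"
}

# show_bytes FILE LINE
# Same line as a hex dump with decimal offsets, so bytes that are not
# valid in the declared encoding become visible.
show_bytes() {
    show_line "$1" "$2" | od -A d -t x1
}
```

For the file in question this would be called as `show_bytes myfile.xml 4277905`. LC_ALL=C is set so that sed treats the input as raw bytes and does not stumble over the very character being hunted.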

> [...] 2) Is there another way to ignore errors in an element? I know that
> the error at the line mentioned above is not within data that I am
> trying to extract on this run through, so I can safely ignore it.


Not easily. XML processors usually work on the basis that you process the
whole document. But it's often possible to use a stream utility as a
non-XML filter to extract the well-formed subset which contains the data
you are interested in.
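
One way such a filter might look, as a sketch under stated assumptions: the function name extract_elements and the element name record are hypothetical, and the filter assumes that elements of that name are not nested inside each other and that no line contains the end tag of one element followed by the start tag of the next. It works on raw lines and does not parse XML at all.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative, not from any tool.

# extract_elements FILE NAME
# Print every line from one containing a <NAME ...> start tag through
# the line containing the matching </NAME> end tag. Everything else,
# including the line with the invalid character, is dropped.
extract_elements() {
    LC_ALL=C awk -v start="<$2[ >/]" -v stop="</$2>" '
        $0 ~ start { inside = 1 }   # a start tag opens a region
        inside     { print }        # copy lines while inside a region
        $0 ~ stop  { inside = 0 }   # an end tag closes the region
    ' "$1"
}
```

The output is a sequence of elements without a root, so it still needs wrapping before an XML parser will accept it, for example `{ echo '<subset>'; extract_elements myfile.xml record; echo '</subset>'; } > subset.xml`. If the original file declares an encoding in its XML declaration, that declaration would have to be copied to the top of the new file as well.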

///Peter

Nov 12 '05 #3
