Error when using XMLTextReader to read HTML

I have some simple HTML I'm trying to read with the XmlTextReader. As in the
MSDN examples, I set up a loop to read each XML node:

while (reader.Read())
{
    switch (reader.NodeType)
    {
        case XmlNodeType.Element:
            Console.WriteLine("<{0}>", reader.Name);
            break;
        case XmlNodeType.Text:
            Console.WriteLine(reader.Value);
            break;
        case XmlNodeType.Attribute:
            Console.WriteLine(reader.Value);
            break;
        default:
            Console.WriteLine(reader.NodeType);
            break;
    }
}

The reader moves along fine until it attempts to read the </head> node in
this HTML:
<html>
<head>
<title>Sir</title>
<meta name="Author" content="Bar01" >
<meta name="Description" content="Instructions">
<link href="css/results.css" media="SCREEN" rel="StyleSheet"
type="text/css" />
</head>

The error is:

System.Xml.XmlException: The 'meta' start tag on line '5' does
not match the end tag of 'head'. Line 7, position 4.
at System.Xml.XmlTextReader.ParseTag()
at System.Xml.XmlTextReader.ParseBeginTagExpandCharEntities()
at System.Xml.XmlTextReader.Read()
at PIDProvider.Analyze.PIDrefs() in c:\vdev2\PID\Analysis.cs:line 29

What does that exception mean?

Am I missing something? Am I wrong to assume that I can read the HTML with
the XmlTextReader?

Thanks

Mitch
Nov 12 '05 #1


Mitch wrote:
What does that exception mean?

Am I missing something? Am I wrong to assume that I can read the HTML with
the XmlTextReader?


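Yes, that assumption is wrong. HTML is an SGML application, and you can't
parse HTML with an XML parser unless you author XHTML. XML requires every
start tag to be closed; your <meta> elements are not, so the reader is still
waiting for </meta> when it reaches </head>, which is what the exception
reports.
If you want to read HTML, there is an SGML reader implementation for .NET
around; google for it.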
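
To see the difference in practice, here is a minimal sketch (not taken from the original poster's project). It feeds a head fragment like the one in the question to XmlTextReader twice: once with unclosed <meta> tags, once with them self-closed in the XHTML style. The class name WellFormedCheck and the method IsWellFormedXml are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class WellFormedCheck
{
    // Hypothetical helper: returns true if XmlTextReader can read the
    // markup to the end, false if it throws an XmlException on the way.
    public static bool IsWellFormedXml(string markup)
    {
        try
        {
            using (XmlTextReader reader = new XmlTextReader(new StringReader(markup)))
            {
                while (reader.Read())
                {
                    // Nothing to do; we only care whether parsing succeeds.
                }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    static void Main()
    {
        // HTML style: <meta> has no end tag, which XML does not allow.
        string html =
            "<html><head><title>Sir</title>" +
            "<meta name=\"Author\" content=\"Bar01\">" +
            "</head></html>";

        // XHTML style: the empty element is closed with "/>".
        string xhtml =
            "<html><head><title>Sir</title>" +
            "<meta name=\"Author\" content=\"Bar01\" />" +
            "</head></html>";

        Console.WriteLine("HTML parses as XML:  {0}", IsWellFormedXml(html));
        Console.WriteLine("XHTML parses as XML: {0}", IsWellFormedXml(xhtml));
    }
}
```

The first input should be rejected and the second accepted, since the only difference between them is whether <meta> is closed.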

--

Martin Honnen
http://JavaScript.FAQTs.com/
Nov 12 '05 #2
Thanks Martin. I also found something called HTML Tidy that converts HTML to
well-formed XML. In the end, though, I decided to just use regex to find
the stuff I need because I'm really not interested in the overall node
structure.
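
For anyone wondering what the regex route might look like, here is a rough sketch. It is not the code actually used; the pattern, the class name MetaScan and the method ReadMeta are assumptions for illustration. It only handles <meta> tags written with name before content and double-quoted values, as in the sample HTML above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MetaScan
{
    // Hypothetical pattern: <meta name="..." content="..."> with the
    // attributes in exactly this order, double quotes, optional "/" before ">".
    static readonly Regex MetaTag = new Regex(
        "<meta\\s+name=\"(?<name>[^\"]*)\"\\s+content=\"(?<content>[^\"]*)\"\\s*/?>",
        RegexOptions.IgnoreCase);

    // Collects name/content pairs from every matching <meta> tag.
    public static Dictionary<string, string> ReadMeta(string html)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        foreach (Match m in MetaTag.Matches(html))
        {
            result[m.Groups["name"].Value] = m.Groups["content"].Value;
        }
        return result;
    }

    static void Main()
    {
        string html =
            "<head>\n" +
            "<meta name=\"Author\" content=\"Bar01\" >\n" +
            "<meta name=\"Description\" content=\"Instructions\">\n" +
            "</head>";

        foreach (KeyValuePair<string, string> pair in ReadMeta(html))
        {
            Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
        }
    }
}
```

A pattern like this breaks as soon as the attribute order or quoting changes, so it is only reasonable when the HTML comes from a known, stable source.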

Mitch
Nov 12 '05 #3
