Parsing XHTML

I'm trying to parse an XHTML document like this:

file.html:
<?xml version="1.0" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Some title</title>
</head>
<body>
<p>Some text</p>
</body>
</html>

with the following code:

XmlReaderSettings xs = new XmlReaderSettings();
xs.ProhibitDtd = false;
XmlReader reader = XmlReader.Create("file.html", xs);
reader.MoveToContent();

The last line generates the following exception:
An error has occurred while opening external DTD
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd': Unable to connect
to the remote server

It makes no difference whether I use XmlReader or XmlDocument.Load,
or whether I load the XML from a file or from a string.

The following line:
xs.ProhibitDtd = true;
of course produces an exception:
For security reasons DTD is prohibited in this XML document. To enable
DTD processing set the ProhibitDtd property on XmlReaderSettings to
false and pass the settings into XmlReader.Create method.

Ok, but I need this DTD! :)

How can I do this?

Feb 23 '06 #1


gi***@poczta.onet.pl wrote:
The last line generates the following exception:
An error has occurred while opening external DTD
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd': Unable to connect
to the remote server


It works for me when I am connected to the internet, because the reader
has to download the DTD from www.w3.org. Are you connected, and how?
Is a proxy in use?
Also make sure that no code sets the property
xs.XmlResolver
to null.
--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Feb 23 '06 #2
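
If the machine cannot reach www.w3.org at all (offline, or behind a proxy), one possible workaround is to serve the DTD from local disk through a custom XmlResolver. This was not proposed in the thread; it is only a minimal sketch. The class name LocalDtdResolver and the folder C:\dtd\xhtml1 are hypothetical, and the sketch assumes that xhtml1-strict.dtd and the three entity files it references (xhtml-lat1.ent, xhtml-symbol.ent, xhtml-special.ent) were saved into that folder beforehand.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper (not from the thread): answers requests for the
// W3C XHTML 1.0 DTD files from a local folder instead of the network.
class LocalDtdResolver : XmlUrlResolver
{
    private const string W3cDtdPrefix = "http://www.w3.org/TR/xhtml1/DTD/";
    private readonly string dtdFolder;

    public LocalDtdResolver(string dtdFolder)
    {
        this.dtdFolder = dtdFolder;
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        // The DTD and the .ent files it pulls in all live under the same URL prefix.
        if (absoluteUri.AbsoluteUri.StartsWith(W3cDtdPrefix, StringComparison.OrdinalIgnoreCase))
        {
            string fileName = Path.GetFileName(absoluteUri.LocalPath);
            return File.OpenRead(Path.Combine(dtdFolder, fileName));
        }

        // Anything else is resolved the default way.
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

class Program
{
    static void Main()
    {
        XmlReaderSettings xs = new XmlReaderSettings();

        // .NET Framework 4.0 and later; on .NET 2.0/3.5, as used in the
        // thread, the equivalent setting is xs.ProhibitDtd = false.
        xs.DtdProcessing = DtdProcessing.Parse;

        // Assumed location of the locally saved DTD and entity files.
        xs.XmlResolver = new LocalDtdResolver(@"C:\dtd\xhtml1");

        using (XmlReader reader = XmlReader.Create("file.html", xs))
        {
            reader.MoveToContent();
            Console.WriteLine(reader.Name);
        }
    }
}
```

Deriving from XmlUrlResolver keeps the default behaviour for every other URI, so only requests under the W3C XHTML DTD prefix are redirected to disk.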
