Greetings.
Just wondering whether it is possible to use XmlTextReader to
read an HTML document, e.g.
XmlTextReader tr = new XmlTextReader("http://localhost/test.xml");
where test.xml contains the following:
<table cellspacing="1" cellpadding="1" width="100%">
  <tr valign="top">
    <td class="head" width="20%">test heading1</td>
    <td class="head" width="10%">test heading2</td>
  </tr>
  <tr valign="top">
    <td class="content" width="20%">content1</td>
    <td class="content" width="10%">
      <table cellspacing="0" width="100%">
        <tr>
          <td align="left">test</td>
          <td nowarp align="right">
            <nobr>0.123456</nobr>
          </td>
        </tr>
      </table>
    </td>
  </tr>
</table>
It seems to work for the first few seconds, and then it
crashes my Windows app when the XmlTextReader comes across
certain content during a call to XmlTextReader.Read(). Is
this to do with the well-formedness (is there such a word?) of
this HTML doc? Also, is there a way to detect &nbsp; and convert
it to #1390 (I can't remember if this is right, but I
am trying to say the equivalent special character) on the
fly (i.e. without saving the HTML to disk)?
Any thoughts will be appreciated.
Daniel wrote: Just wondering if it is possible to use XmlTextReader to read off a html doc:
Not really, because HTML is not XML. Some HTML docs happen to be well-formed, so
they can be read by XmlTextReader, but in general a single <br> tag or an &nbsp;
entity, both ubiquitous in HTML, will stop the reading.
e.g. XmlTextReader tr = new XmlTextReader ("http://localhost/test.xml");
where test.xml contains the following:
<table cellspacing="1" cellpadding="1" width="100%"> <tr valign="top"> <td class="head" width="20%">test heading1</td> <td class="head" width="10%">test heading2</td> </tr> <tr valign="top"> <td class="content" width="20%">content1</td> <td class="content" width="10%"> <table cellspacing="0" width="100%"> <tr> <td align="left">test</td> <td nowarp align="right">
Watch out for nowrap: it is a so-called boolean attribute (an attribute written without a value), and XML doesn't support that.
Try SGMLReader instead of XmlTextReader http://www.gotdotnet.com/Community/U...4-C3BD760564BC
--
Oleg Tkachenko http://www.tkachenko.com/blog
Multiconn Technologies, Israel
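
To make the failure concrete, here is a minimal sketch (not part of the original posts) that feeds small fragments to XmlTextReader from a string instead of the URL in the question. The names WellFormednessCheck and TryReadAll and the sample fragments are made up for illustration. It also shows that wrapping Read() in a try/catch for XmlException is one way to keep the application from crashing when the input is not well-formed.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WellFormednessCheck
{
    // Hypothetical helper: reads every node in the markup and returns null on
    // success, or the parser's error message if the markup is not well-formed XML.
    public static string TryReadAll(string markup)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(markup));
        try
        {
            while (reader.Read())
            {
                // A real application would inspect reader.NodeType, reader.Name, etc.
            }
            return null;
        }
        catch (XmlException ex)
        {
            // Thrown by Read() as soon as the input stops being well-formed XML.
            return ex.Message;
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        string[] samples =
        {
            "<td align=\"right\"><nobr>0.123456</nobr></td>", // well-formed
            "<td nowrap align=\"right\">test</td>",           // boolean attribute without a value
            "<td>line one<br>line two</td>"                    // <br> is never closed
        };

        foreach (string sample in samples)
        {
            string error = TryReadAll(sample);
            Console.WriteLine("{0}\n  => {1}", sample, error ?? "read without errors");
        }
    }
}
```

Only the first fragment should read to the end; the other two should come back with an error message, because a valueless attribute and an unclosed element are both well-formedness errors in XML.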
Thanks Oleg,
The URL you provided looks very interesting. Judging by
the replies SgmlReader has received, people are
definitely finding it useful. I will certainly
download it and have a play with it.
However, I do want to learn more about reading HTML using
the XmlTextReader. Do you (or anybody out there) know of a
good URL to get me started?
Cheers.
Daniel wrote: However, I do want to learn more about reading html using the XmlTextReader. Do you (or anybody out there) know of a good url to get me started?
Not really. It is simply not possible to read arbitrary HTML with XmlTextReader
without some sort of preprocessing (i.e. converting the HTML to XML or
XHTML). Tidy is often used for that too. Google for "HTML Tidy".
--
Oleg Tkachenko http://www.tkachenko.com/blog
Multiconn Technologies, Israel
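
As a very small illustration of what "preprocessing" can mean, the sketch below (an addition, not from the original posts) handles just one of the problems raised in the thread: it swaps the HTML-only &nbsp; entity for the numeric character reference &#160;, which XML accepts without a DTD, and does so on an in-memory string so nothing is written to disk. The names NbspPreprocessing, ReplaceNbsp and ReadText and the sample fragment are hypothetical. It does not repair unclosed tags or boolean attributes; for that a real converter such as Tidy or SgmlReader is still needed.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class NbspPreprocessing
{
    // Hypothetical helper: replaces the named entity with its numeric
    // character reference (U+00A0, no-break space).
    public static string ReplaceNbsp(string html)
    {
        return html.Replace("&nbsp;", "&#160;");
    }

    // Concatenates the values of all text nodes found in the markup.
    public static string ReadText(string markup)
    {
        StringBuilder text = new StringBuilder();
        XmlTextReader reader = new XmlTextReader(new StringReader(markup));
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                {
                    text.Append(reader.Value);
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return text.ToString();
    }

    public static void Main()
    {
        // Assumption: the markup is otherwise well-formed.
        string html = "<td>0.123456&nbsp;USD</td>";
        string text = ReadText(ReplaceNbsp(html));

        // Show the no-break space as '_' so it is visible on the console.
        Console.WriteLine(text.Replace('\u00A0', '_'));
    }
}
```

A plain string replacement like this is blunt: it would also rewrite &nbsp; inside comments or CDATA sections, which is one more reason to prefer a dedicated HTML-to-XML converter for anything beyond simple, known input.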