
HOWTO: Read Html File with XML classes?

alejandro lapeyre
#1: Nov 12 '05
How can I load / parse an HTML file with .NET?

Thanks!
Best regards,
Alejandro Lapeyre



Martin Honnen
#2: Nov 12 '05

alejandro lapeyre wrote:
> How can I load / parse an HTML file with .NET?

If it is XHTML then you can parse it with the XML classes
(XmlTextReader, XmlDocument). If it is HTML then .NET 1.0 and 1.1 have
nothing appropriate built-in but there is an SGMLReader class available
here:
<http://www.gotdotnet.com/Community/UserSamples/Details.aspx?SampleGuid=B90FDDCE-E60D-43F8-A5C4-C3BD760564BC>
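
For the XHTML case, a minimal sketch of loading well-formed markup with XmlDocument might look like the following. It is written against current .NET rather than 1.0/1.1, the class and method names (XhtmlDemo, ReadTitle) are made up for illustration, and it assumes the input has no DOCTYPE and uses the standard XHTML namespace.

```csharp
using System;
using System.Xml;

public static class XhtmlDemo
{
    // Hypothetical helper: returns the text of <title>, or null if there is none.
    // Assumes the input is well-formed XHTML in the standard XHTML namespace.
    public static string ReadTitle(string xhtml)
    {
        var doc = new XmlDocument();
        doc.XmlResolver = null; // do not fetch external resources
        doc.LoadXml(xhtml);

        // XHTML elements live in a namespace, so XPath needs a prefix for it.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("x", "http://www.w3.org/1999/xhtml");

        XmlNode title = doc.SelectSingleNode("/x:html/x:head/x:title", ns);
        return title == null ? null : title.InnerText;
    }
}
```

Ordinary HTML with unclosed tags or unquoted attributes would make LoadXml throw an XmlException, which is why a tolerant reader is needed for that case.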


--

Martin Honnen
http://JavaScript.FAQTs.com/

alejandro lapeyre
#3: Nov 12 '05

Thanks, Martin.
That's the answer I was praying not to receive. I was hoping that maybe a
schema or a DTD would do it... sniff.
:-)
OK, I'll keep working.
Happy New Year.


Christoph Schittko [MVP]
#4: Nov 12 '05

Alejandro,

The SgmlReader was written by a member of the same team that worked on
System.Xml in .NET V1.0. It closely follows the XmlReader model and it's
definitely worth checking out. The SgmlReader does produce XHTML from
HTML ... and then you would have a schema. I'm not sure what more you're
looking for, though.
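
To illustrate what "follows the XmlReader model" buys you, here is a rough sketch of a forward-only read loop that collects link targets. It uses only the built-in XmlReader on an XHTML string; the names LinkCollector, CollectLinks and CollectLinksFromXhtml are made up for this example. The assumption is that a reader behaving as described above could be passed to CollectLinks in place of the built-in one.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class LinkCollector
{
    // Pulls nodes one at a time from any XmlReader and records the href
    // attribute of every <a> element that has one.
    public static List<string> CollectLinks(XmlReader reader)
    {
        var links = new List<string>();
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "a")
            {
                string href = reader.GetAttribute("href");
                if (href != null)
                {
                    links.Add(href);
                }
            }
        }
        return links;
    }

    // Convenience wrapper for well-formed XHTML held in a string (no DOCTYPE).
    public static List<string> CollectLinksFromXhtml(string xhtml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xhtml)))
        {
            return CollectLinks(reader);
        }
    }
}
```

Because the loop depends only on the XmlReader base class, the parsing front end can change without touching the code that consumes the nodes.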

HTH,
Christoph Schittko
MVP XML
http://weblogs.asp.net/cschittko
> -----Original Message-----
> From: alejandro lapeyre [mailto:alejandrolapeyre@jotmail.com]
> Posted At: Sunday, January 02, 2005 10:47 AM
> Posted To: microsoft.public.dotnet.xml
> Conversation: HOWTO: Read Html File with XML classes?
> Subject: Re: HOWTO: Read Html File with XML classes?
>
> Thanks Martin
> That's the answer I was praying not to receive. I was hoping that maybe a
> Schema, DTD... snif.
> :-)
> Ok, keep working.
> Happy New Year.


alejandro lapeyre
#5: Nov 12 '05

Thanks for your attention Christoph,

I have a web site and want to do some replacements in the pages to include a
common header and footer, and also the classic "next" and "previous" links.

I have a working program in VB5 and was looking to redo it in .NET.

In my case a simple stream read and some text replacement works fine, but
now I am looking for a more general approach so I can also use it for other
web sites.
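
For reference, the simple text-replacement approach described above could be sketched roughly like this in C#. The names PageDecorator and Wrap are hypothetical, and the code assumes each page has exactly one body element; it is plain pattern matching, not parsing, so it would be fooled by a body tag inside a comment or script.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PageDecorator
{
    // Opening <body ...> tag with any attributes, and the closing </body> tag.
    private static readonly Regex BodyOpen =
        new Regex(@"<body\b[^>]*>", RegexOptions.IgnoreCase);
    private static readonly Regex BodyClose =
        new Regex(@"</body\s*>", RegexOptions.IgnoreCase);

    // Inserts the header right after the first opening body tag and the
    // footer right before the first closing body tag.
    public static string Wrap(string html, string header, string footer)
    {
        // A MatchEvaluator is used so that '$' in the inserted text is
        // never interpreted as a regex substitution.
        string result = BodyOpen.Replace(html, m => m.Value + header, 1);
        result = BodyClose.Replace(result, m => footer + m.Value, 1);
        return result;
    }
}
```

A page without a body element comes back unchanged, which is one reason a parser-based approach generalizes better to other sites.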

The SgmlReader works fine.

Thank you.


Patrick Philippot
#6: Nov 12 '05

alejandro lapeyre wrote:
> How can I load / parse an HTML file with .NET?

Hi,

You should have a look at the HTML Agility Pack:

http://blogs.msdn.com/smourier/archi...6/04/8265.aspx

--
Patrick Philippot - Microsoft MVP
MainSoft Consulting Services
www.mainsoft.fr

