The trouble with my code ?

Any ideas on this? I am trying to loop through an XML document to remove
attributes, but I'm having so much trouble. Any help is appreciated.

//THIS IS THE EXCEPTION (SEE CODE LINE WHERE FAILURE OCCURS)

'//Unexpected XML declaration. The XML declaration must be the first node in
the document, and no white space characters are allowed to appear before it.
Line 13, position 11.

//THE XHTML TEXT WHICH IS BEING LOOKED AT

<table cellspacing="0" rules="all" border="1" id="dgArticles"
style="font-family:Arial;font-size:8pt;width:762px;border-collapse:collapse;">
<tr style="color:White;background-color:Blue;">
<td>&nbsp;</td><td style="width:0.75cm;">ID</td><td
style="width:7cm;">Title</td><td style="width:13cm;">Summary</td><td
style="width:1cm;">Published</td>
</tr><tr valign="Top">
<td><a href='Articles/Art226/Art226.html'
target=_blank>Open</a></td><td>226</td><td>SQL Server 2005
Permissions</td><td>See this article for a handy reference to the complete
list of permissons on SQL Server 2005 </td><td>28/12/2006</td>
</tr><tr valign="Top">
<td><a href='Articles/Art223/Art223.html'
target=_blank>Open</a></td><td>223</td><td>SQL Schemas In SQL
2005</td><td>Want to know a little more about schemas in SQL Server 2005,
take a look at this quick overview. </td><td>25/12/2006</td>
</tr><tr valign="Top">
<td><a href='Articles/Art224/Art224.html'
target=_blank>Open</a></td><td>224</td><td>SQL Server 2005 - Must_Change
option</td><td>When de-checking Enforce Password Policy, SQL Security
responds with an error and refers to Must_Change being in force. This
article shows you how to reverse this. </td><td>27/12/2006</td>
</tr><tr valign="Top">
<td><a href='Articles/Art220/Art220.html'
target=_blank>Open</a></td><td>220</td><td>Installing Adventureworks
Sample</td><td>If you dont install the samples for Adventureworks first
time, getting them on can be a little tricky. This article explains.
</td><td>23/12/2006</td>
</tr>
</table>

'// THE CODE WHICH PROCESSES THE XHTML

Private Sub useXmlDocButton_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles useXmlDocButton.Click

    GC.Collect()

    'Clear message
    Me.messageTextBox.Text = String.Empty

    Dim xmlString As String

    '//Some pre-processing here
    xmlString = Me.sourcetextBox.Text.ToLower

    '//Remove nbsp
    xmlString = Regex.Replace(xmlString, "&nbsp;", "")

    '//Remove any explorer codes
    xmlString = Regex.Replace(xmlString, "&[a-zA-Z0-9]*;", "")

    '//Remove any unquoted attributes which appear at the end of a tag
    xmlString = Regex.Replace(xmlString, " [A-Za-z0-9]*=[A-Za-z0-9_]*>", ">")

    '//Remove any unquoted attributes which appear before the end of a tag
    xmlString = Regex.Replace(xmlString, " [A-Za-z0-9]*=[A-Za-z0-9_]* ", "")

    'Finally prepend the xml declaration needed
    xmlString = "<?xml version='1.0' encoding='utf-8'?>" & xmlString

    Me.sourcetextBox.Text = xmlString

    'Get the xml into a stream
    Dim stream As New System.IO.MemoryStream
    stream.Write((New System.Text.UTF8Encoding).GetBytes(xmlString), 0, _
        xmlString.Length)
    stream.Position = 0

    Dim xDoc As New System.Xml.XmlDocument
    xDoc.Load(stream)
    stream.Position = 0

    Dim xreader As New System.Xml.XmlTextReader(stream)
    Dim xNode As System.Xml.XmlNode
    stream.Position = 0

    While xreader.Read()
        If xreader.NodeType = Xml.XmlNodeType.Element Then
            xNode = xDoc.ReadNode(xreader) '//************* THIS IS WHERE IT FAILS //
            xNode.Attributes.RemoveAll()
        End If
    End While

    Dim sr As New System.IO.StreamReader(stream)
    stream.Position = 0
    targetTextBox.Text = sr.ReadToEnd
    sr.Close()
    sr.Dispose()
    xreader.Close()
    stream.Close()
    stream.Dispose()

End Sub



Jan 6 '07 #1
Try something along these lines instead (VB.NET 2.0):

Dim xmlString As String = Me.sourcetextBox.Text.ToLower
xmlString = Regex.Replace(xmlString, _
    "&nbsp;", "")
xmlString = Regex.Replace(xmlString, _
    "&[a-zA-Z0-9]*;", "")
xmlString = Regex.Replace(xmlString, _
    " [A-Za-z0-9]*=[A-Za-z0-9_]*>", ">")
xmlString = Regex.Replace(xmlString, _
    " [A-Za-z0-9]*=[A-Za-z0-9_]* ", "")

'NOT SURE WHY YOU'D WANT THIS BUT NO HARM IN IT
xmlString = "<?xml version='1.0' encoding='utf-8'?>" & xmlString

Dim tmpDoc As New XmlDocument
tmpDoc.LoadXml(xmlString)
ZapAttributes(tmpDoc.SelectSingleNode("/table"))
Me.targetTextBox.Text = tmpDoc.InnerXml

[...]

Private Sub ZapAttributes(ByVal xNode As XmlNode)
    'Only element nodes carry an attribute collection
    If xNode.Attributes IsNot Nothing Then
        xNode.Attributes.RemoveAll()
    End If
    For Each child As XmlNode In xNode.ChildNodes
        ZapAttributes(child)
    Next
End Sub
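
A quick way to try the routine outside the form, as a minimal sketch: the module name ZapDemo and the sample markup are made up for illustration, and ZapAttributes is repeated so the snippet stands on its own as a console program.

```vbnet
Imports System
Imports System.Xml

Module ZapDemo
    ' Same recursive idea as above: clear this node's attributes, then visit its children.
    Sub ZapAttributes(ByVal xNode As XmlNode)
        ' Text nodes and comments have no attribute collection (Attributes is Nothing).
        If xNode.Attributes IsNot Nothing Then
            xNode.Attributes.RemoveAll()
        End If
        For Each child As XmlNode In xNode.ChildNodes
            ZapAttributes(child)
        Next
    End Sub

    Sub Main()
        Dim doc As New XmlDocument()
        ' Made-up sample input; every attribute value is quoted, so it is well-formed XML.
        doc.LoadXml("<table border='1'><tr valign='top'><td style='width:7cm;'>Title</td></tr></table>")
        ZapAttributes(doc.SelectSingleNode("/table"))
        ' Prints <table><tr><td>Title</td></tr></table>
        Console.WriteLine(doc.InnerXml)
    End Sub
End Module
```

The recursion starts at the /table element rather than at the document node, so an XML declaration, if one was prepended, is left untouched.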

Ralf
--
--
----------------------------------------------------------
* ^~^ ^~^ *
* _ {~ ~} {~ ~} _ *
* /_``>*< >*<''_\ *
* (\--_)++) (++(_--/) *
----------------------------------------------------------
There are no advanced students in Aikido - there are only
competent beginners. There are no advanced techniques -
only the correct application of basic principles.
Jan 6 '07 #2
Thanks for your help, but it doesn't really answer my question about my own
failing code. Where am I going wrong? This is important for me to learn, as I
need to know why it's failing.

Many Thanks
Jan 6 '07 #3

"Just Me" <news.microsoft.com> wrote in message
news:OY**************@TK2MSFTNGP04.phx.gbl...
:
: Thanks for your help, but it doesn't really answer my question about
: my own failing code. Where am I going wrong? This is important for
: me to learn, as I need to know why it's failing.
:
: Many Thanks

<snip>

Well, at first glance it would appear that the problem is here:

=============================
xmlString = "<?xml version='1.0' encoding='utf-8'?>" & xmlString
=============================

This is the XML declaration the exception is referring to. However, if
you remove this line you just end up with a different exception -
"There are multiple root elements" - so in reality, the XML
declaration isn't the actual problem.

What these two exceptions have in common is that they both report
the underlying XML as malformed, and I think that is an important
clue. I'm not an expert with the memory stream object, so I cannot
give you a specific answer as to what is happening, but it appears that
when the XML reader gets to the end of the memory stream, it
loops back on itself. What the XML text reader object therefore ends
up seeing is something like this:

<?xml version='1.0'?>
<table>
<tr>
[...]
</tr>
</table>
<?xml version='1.0'?>
<table>
<tr>
[...]
</tr>
</table>

In the first exception message, it's objecting because it thinks it's
seeing the <?xml...?> declaration embedded in the complete document.
In the second exception, it's objecting to what it thinks is a second
root element.

As I've stated, I'm not familiar with the memory stream object, so I
don't know for a fact that this is what is happening, but it certainly
strikes me as plausible. The argument is reinforced when you consider
that if you copy the XML into a text file and make the following
change, the XmlExceptions go away:

'Dim xreader As New System.Xml.XmlTextReader(stream)
Dim xreader As New System.Xml.XmlTextReader("xhtmldoc.xml")
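
Whatever the reader is doing with the stream, the second pass over it can be avoided. A minimal sketch, assuming the pre-processed text is already well-formed; the module and function names are hypothetical. It writes the whole byte array to the MemoryStream (using the byte count rather than the string's character count, which only agree for plain ASCII), loads the document once, and takes the result from the document itself, since changing nodes in an XmlDocument does not change the stream it was loaded from.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module StreamLoadSketch
    ' Hypothetical helper: load well-formed markup through a MemoryStream,
    ' strip the root element's attributes, and return the resulting markup.
    Function LoadAndStripRoot(ByVal xmlString As String) As String
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(xmlString)
        Dim doc As New XmlDocument()
        Using stream As New MemoryStream()
            ' bytes.Length, not xmlString.Length: non-ASCII characters take more than one byte.
            stream.Write(bytes, 0, bytes.Length)
            stream.Position = 0
            doc.Load(stream)
        End Using
        doc.DocumentElement.Attributes.RemoveAll()
        ' The edit lives in the document, not in the stream, so read it back from the document.
        Return doc.OuterXml
    End Function

    Sub Main()
        Console.WriteLine(LoadAndStripRoot( _
            "<?xml version='1.0' encoding='utf-8'?><table border='1'><tr /></table>"))
    End Sub
End Module
```

Only the root element is stripped here to keep the sketch short; a recursive walk over ChildNodes would cover the nested elements as well.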

Ralf
Jan 7 '07 #4
Ok Ralf

Thanks for your insight into this problem. I find this whole area a little
confusing; there seem to be so many ways of skinning the same cat. You have
the XPath stuff, the XmlDocument itself, the XmlReader, the streams.

Blows my head off sometimes.

I am trying to alter the code you gave me so that I can re-apply specific
class attributes: one to the first row, another to the table's cells, and one
for the table tag itself.

I seem to have almost got it, but not quite.

Thanks anyway for your help.
Jan 7 '07 #5

Just Me wrote :
<backposted/>

If what you want is to extract the contents of the HTML in a structured
way, then I suggest you use a tool to convert HTML to XML first --
there are so many details in dealing with HTML that any ad hoc approach
is sure to leave something out.

It seems HTMLTidy is such a tool (I have never used it, so I can't say
anything about it).

Another approach you may consider is using the WebBrowser control to
"navigate" the document structure. Maybe it's easier than your current
approach:

<aircode>
Private WithEvents WB As WebBrowser
Private mText As String

Sub ExtractText(ByVal Text As String)
    mText = ""
    If WB Is Nothing Then WB = New WebBrowser
    WB.DocumentText = Text
End Sub

Private Sub WB_DocumentCompleted( _
        ByVal sender As System.Object, _
        ByVal E As WebBrowserDocumentCompletedEventArgs _
        ) Handles WB.DocumentCompleted

    Dim S As New System.Text.StringBuilder
    MapHtmlItems(WB.Document.Body.Children, S, 0)
    mText = S.ToString
    Debug.Print(mText)
End Sub

Sub MapHtmlItems(ByVal Items As HtmlElementCollection, _
        ByVal Builder As System.Text.StringBuilder, _
        ByVal Level As Integer)

    For Each E As HtmlElement In Items
        MapHtmlItem(E, Builder, Level)
    Next

End Sub

Sub MapHtmlItem(ByVal Element As HtmlElement, _
        ByVal Builder As System.Text.StringBuilder, _
        ByVal Level As Integer)

    If Element.CanHaveChildren Then
        Dim Tag As String = Element.TagName
        Dim Text As String = Nothing

        If Element.Children.Count = 0 Then
            Text = Element.InnerText
        End If

        Select Case Element.TagName.ToLower
            Case "table", "tr", "td"
                'does nothing
            Case Else
                Tag = Nothing
        End Select

        Dim Tab As String = New String(" "c, Level * 2)
        If Not String.IsNullOrEmpty(Text) Then
            Dim S As String
            If Not String.IsNullOrEmpty(Tag) Then
                S = String.Format("{0}<{1}>{2}</{1}>", Tab, Tag, Text)
            Else
                S = String.Format("{0}{1}", Tab, Text)
            End If
            Builder.AppendLine(S)
        Else
            If Not String.IsNullOrEmpty(Tag) Then
                Builder.AppendLine(String.Format("{0}<{1}>", Tab, Tag))
            End If

            MapHtmlItems(Element.Children, Builder, Level + 1)

            If Not String.IsNullOrEmpty(Tag) Then
                Builder.AppendLine(String.Format("{0}</{1}>", Tab, Tag))
            End If

        End If

    End If

End Sub

</aircode>

The previous code will extract all table structures from the HTML text
you provide. To do this, just pass the text to ExtractText(); the result
will be saved in the mText string once the DocumentCompleted event has
fired. Maybe this can give you new ideas. ;-)

HTH.

Regards,

Branco.
Jan 8 '07 #6
In the end I was able to get just what I needed. Here you go!

Imports System.Xml
Imports System.Text.RegularExpressions

Private idNo As Integer
Private rowCount As Integer

Private Sub processButton_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles processButton.Click

    Dim xmlString As String

    idNo = 0
    'Reset the row counter as well, so a second click still marks the header row
    rowCount = 0

    '//Some pre-processing here
    xmlString = Me.sourceTextbox.Text.ToLower

    '//Remove nbsp
    xmlString = Regex.Replace(xmlString, "&nbsp;", "")

    '//Remove any explorer codes
    xmlString = Regex.Replace(xmlString, "&[a-zA-Z0-9]*;", "")

    '//Remove any unquoted attributes which appear at the end of a tag
    xmlString = Regex.Replace(xmlString, "\sp*[A-Za-z0-9]*=[A-Za-z0-9_]*>", ">")

    '//Remove any unquoted attributes which appear before the end of a tag
    xmlString = Regex.Replace(xmlString, "\sp*[A-Za-z0-9]*=[A-Za-z0-9_]* ", "")

    Dim tmpDoc As New XmlDocument

    Try
        tmpDoc.LoadXml(xmlString)
        ZapAttributes(tmpDoc.SelectSingleNode("/table"), tmpDoc)
        Me.targetTextBox.Text = tmpDoc.InnerXml
    Catch ex As XmlException
        'Input that is still not well-formed XML is silently ignored
    End Try

End Sub

Private Sub ZapAttributes(ByVal xNode As XmlNode, _
        ByVal xd As System.Xml.XmlDocument)

    If Not (xNode.Attributes Is Nothing) Then

        Dim xAttr As System.Xml.XmlAttribute

        xNode.Attributes.RemoveAll()

        Select Case xNode.Name

            Case "table"
                xAttr = xd.CreateAttribute("class")
                xAttr.Value = "ArticleTableTag"
                xNode.Attributes.Append(xAttr)

            Case "tr"
                rowCount += 1

            Case "td"
                If rowCount = 1 Then
                    xAttr = xd.CreateAttribute("class")
                    xAttr.Value = "ArticleTableHeader"
                    xNode.Attributes.Append(xAttr)
                ElseIf rowCount > 1 Then
                    xAttr = xd.CreateAttribute("class")
                    xAttr.Value = "ArticleTableCells"
                    xNode.Attributes.Append(xAttr)
                End If

            Case "a"

        End Select

    End If

    For Each child As XmlNode In xNode.ChildNodes
        ZapAttributes(child, xd)
    Next

End Sub

End Class
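
For comparison, the same three classes could probably also be assigned with XPath instead of a row counter. A minimal sketch, assuming the attributes have already been stripped and the rows sit directly under the table element; SetClass and the module name are hypothetical, and the sample markup is made up.

```vbnet
Imports System
Imports System.Xml

Module ClassAttributeSketch
    ' Hypothetical helper: set class="className" on every element matched by xpath.
    Sub SetClass(ByVal doc As XmlDocument, ByVal xpath As String, ByVal className As String)
        For Each node As XmlNode In doc.SelectNodes(xpath)
            ' The XPath expressions used below only match elements, so the cast is safe here.
            DirectCast(node, XmlElement).SetAttribute("class", className)
        Next
    End Sub

    Sub Main()
        Dim doc As New XmlDocument()
        ' Made-up, already attribute-free sample input.
        doc.LoadXml("<table><tr><td>ID</td></tr><tr><td>226</td></tr></table>")
        SetClass(doc, "/table", "ArticleTableTag")
        ' tr[1] is the first row; position() > 1 selects every later row.
        SetClass(doc, "/table/tr[1]/td", "ArticleTableHeader")
        SetClass(doc, "/table/tr[position() > 1]/td", "ArticleTableCells")
        Console.WriteLine(doc.OuterXml)
    End Sub
End Module
```

This keeps no state between clicks, so there is no counter to reset. The element names are lower case in the XPath because the pre-processing lower-cases the whole input.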
Jan 8 '07 #7
