Relative URL in <A> tag being converted to absolute URL

I'm using the WebBrowser control and Microsoft HTML Object Library
(MSHTML) in a VB .Net program. I'm trying to read <P> tags and their
contents from one HTML document, test1.htm, and insert them into
another, test2.htm. The problem is the relative URL in the <A> tag in
the 3rd paragraph.

test1.htm contains:
<HTML><HEAD></HEAD>
<BODY>
<P>This is the first paragraph</P>
<P>This is the second paragraph</P>
<P>This is the third paragraph with a <A
href="../../dir1/dir2/page2.htm">link</A> included</P>
</BODY>
</HTML>

test2.htm contains:
<HTML><HEAD></HEAD>
<BODY>
<HR>
</BODY>
</HTML>

The <P> tags are read from test1.htm and saved in a listbox using:

mElements = mDoc.getElementsByTagName("P")
For Each mElement In mElements
    lstPTags.Items.Add(mElement.outerHTML)
Next

They are then inserted before the <HR> tag in test2.htm using:

For Each mElement In mDoc.all
    If mElement.tagName = "HR" Then
        For i = 0 To lstPTags.Items.Count - 1
            mElement.insertAdjacentHTML("beforeBegin", lstPTags.Items(i))
        Next
    End If
Next

This works fine, EXCEPT for the 3rd paragraph which contains an <A>
tag link. The link is converted from a relative URL to an absolute
URL in the modified test2.htm :

<P>This is the third paragraph with a <A
href="file:///C:/Documents%20and%20Settings/dir1/dir2/page2.htm">link</A>
included</P>

My question is: how can I copy and insert the <P> tag and embedded <A>
tag exactly as it appears in test1.htm into test2.htm?
Nov 20 '05 #1
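
The absolute URL in the output is what you get when the relative href is resolved against the location of the document that holds it. A minimal sketch of that resolution using System.Uri rather than MSHTML follows; the folder names "user" and "docs" are hypothetical, since the thread never says where test2.htm actually lives. The last two lines show one possible workaround, turning the absolute URL back into a relative one with Uri.MakeRelativeUri:

```vbnet
Imports System

Module UriResolutionSketch
    Sub Main()
        ' Hypothetical location for test2.htm; the real folder is not given in the thread.
        Dim baseUri As New Uri("file:///C:/Documents and Settings/user/docs/test2.htm")

        ' The href exactly as written in test1.htm.
        Dim relativeHref As String = "../../dir1/dir2/page2.htm"

        ' Resolve the relative reference against the document's own URL.
        Dim resolved As New Uri(baseUri, relativeHref)
        Console.WriteLine(resolved.AbsoluteUri)

        ' Going the other way: express the absolute URL relative to the document again.
        Dim backToRelative As Uri = baseUri.MakeRelativeUri(resolved)
        Console.WriteLine(backToRelative.ToString())
    End Sub
End Module
```

This only illustrates the URL arithmetic; it does not change how MSHTML itself serializes the anchor.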
Cor
Hi John,

I am not sure, but are you looking for something like "URLUnencoded" in the
DOM?

If so, there is also a lot about this in the HttpUtility members.

I hope this helps a little.

Cor
Nov 20 '05 #2
"Cor" <no*@non.com> wrote in message news:<eB**************@TK2MSFTNGP12.phx.gbl>...
> Hi John,
>
> I am not sure but are you looking for "URLUnencoded" in the DOM or
> something,


URLUnencoded returns the URL of the document containing the html, i.e.
test1.htm in my case. I need to read the <P> tag elements from one
document and write them without modification to another html document.

Perhaps I'm doing it the wrong way. I've also tried saving the <P>
tag IHTMLElement objects in an array:

Public mElementArray As mshtml.IHTMLElement()
Public iElements As Integer

mElementArray(iElements) = mElement
----- or -----
mElementArray(iElements) = mElement.cloneNode(True)

but both these statements lock up the program.
Nov 20 '05 #3
Cor
Hi John,

I think I misread your question; you want innerHTML.

If I were you I would take a look at that. It sets or retrieves the HTML
between the start and end tags of the object.

Here is some text I copied from MSDN:

The innerHTML property is valid for both block and inline elements. By
definition, elements that do not have both an opening and closing tag cannot
have an innerHTML property.

The innerHTML property takes a string that specifies a valid combination of
text and elements.

When the innerHTML property is set, the given string completely replaces the
existing content of the object. If the string contains HTML tags, the string
is parsed and formatted as it is placed into the document.

This property is accessible at run time, as of Microsoft® Internet Explorer
5. Removing elements at run time, before the closing tag is parsed, could
prevent other areas of the document from rendering.

I don't know if this is exactly it, but I think it is close.

Cor
Nov 20 '05 #4
"Cor" <no*@non.com> wrote in message news:<ug**************@TK2MSFTNGP09.phx.gbl>...
> Hi John,
>
> I think I saw it wrong, you want the innerHTML


Thank you for your help and interest in my problem.

Yes, innerHTML is better than outerHTML for reading the <P> tag
element. However, I can't get the <A> anchor tag to appear correctly
in the output .htm document. I've tried 2 methods:

Private Function createIntroHTML(ByRef HRElement As mshtml.IHTMLElement, _
        ByRef mDoc As mshtml.HTMLDocument) As String

    Dim mNewElement As mshtml.IHTMLElement
    Dim mNewNode As mshtml.IHTMLDOMNode
    Dim strLines() As String = Split(txtIntro.Text, vbNewLine)
    Dim i As Integer
    Dim strHTML As String

    For i = 0 To UBound(strLines) - 1

        'Nearly, but puts anchor tag as text, i.e. <A href=.....>Link</A>

        'mNewElement = mDoc.createElement("P")
        'mNewElement.setAttribute("class", "text4")
        'mNewElement.innerText = strLines(i)
        'mNewNode = mNewElement
        'HRElement.insertAdjacentElement("beforeBegin", mNewNode)

        'Nearly, but converts relative URL in anchor tag to absolute URL

        strHTML = "<P class=""text4"">" + strLines(i) + "</P>"
        HRElement.insertAdjacentHTML("beforeBegin", strHTML)

    Next

End Function

I just need a way to prevent the relative URL being changed to an
absolute URL.
Nov 20 '05 #5
Cor
Hi John,

It goes a little too far for me to figure this out; MSHTML is not the nicest
thing to work with.

But I use it in a way like this (although I have never tried it with a
complete innerHTML).
Charles, who knows the most about this in this newsgroup (though I have not
seen him for a while), told me that IHTMLDocument2 was much faster, and when
I tried it, it was.

Dim iDocument As mshtml.IHTMLDocument2
Dim myText As String
Dim i As Integer
For i = 0 To iDocument.all.length - 1
    Dim tagname As String = iDocument.all.item(i).tagName
    'MSHTML reports tag names in upper case, so compare against "P"
    If (tagname = "P") Then myText = iDocument.all.item(i).innerText.ToString
Next

You can try it; it is not a big deal to try it yourself.

Let me know whether it worked or not; I am also struggling with some things
in MSHTML.

Cor

Nov 20 '05 #6
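
One way to sidestep the conversion, offered only as a sketch, is to skip insertAdjacentHTML for the write and splice the saved paragraph strings into the text of test2.htm directly, so the parser never touches the href. This assumes the target file can be edited as plain text and that the paragraph strings still hold the relative href. InsertBeforeHr is a hypothetical helper name, not part of MSHTML or the original program:

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module RelativeLinkSketch

    ' Hypothetical helper: inserts each fragment, unchanged, before the first <HR> in html.
    Function InsertBeforeHr(ByVal html As String, ByVal fragments() As String) As String
        Dim pos As Integer = html.IndexOf("<HR", StringComparison.OrdinalIgnoreCase)
        If pos < 0 Then Return html ' no <HR> found: leave the document untouched

        Dim sb As New StringBuilder()
        sb.Append(html.Substring(0, pos))
        For Each fragment As String In fragments
            sb.Append(fragment).Append(vbCrLf)
        Next
        sb.Append(html.Substring(pos))
        Return sb.ToString()
    End Function

    Sub Main()
        ' Contents of test2.htm as given in the first post.
        Dim target As String = "<HTML><HEAD></HEAD>" & vbCrLf & "<BODY>" & vbCrLf & _
            "<HR>" & vbCrLf & "</BODY>" & vbCrLf & "</HTML>"

        ' The third paragraph from test1.htm, with its relative href intact.
        Dim paragraphs() As String = { _
            "<P>This is the third paragraph with a <A href=""../../dir1/dir2/page2.htm"">link</A> included</P>"}

        Console.WriteLine(InsertBeforeHr(target, paragraphs))
    End Sub

End Module
```

In the real program the target string would come from reading test2.htm and the result would be written back to disk; both steps are left out here. When reading the anchors through MSHTML, getAttribute("href", 2) is documented to return the attribute value as written in the source, which may help keep the saved strings relative.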
