I Want to Extract All URLs from HTML

Hi,
I have stored the HTML of a web page in a string, and I want to
extract all the URLs from it and store them in an array of strings.
Please help: if somebody has already written a function for this,
please send me the code. I would be thankful.

CrimeMaster

Aug 1 '06 #1
Personally I'd use HtmlAgilityPack to parse the HTML into a DOM, then
query that for <a> elements. But no doubt someone is even now preparing
a five-line regex that will work nearly all the time...
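
For readers who want to try the DOM route, here is a minimal sketch of it. It assumes the HtmlAgilityPack NuGet package is referenced; the class name AnchorLinkDemo and the method name ExtractUrls are made up for this illustration and are not part of the library. It only looks at the href attribute of <a> elements, so URLs in src attributes or scripts are not covered.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET itself

// Hypothetical helper for illustration only.
public static class AnchorLinkDemo
{
    // Returns the href value of every <a> element that has one.
    public static string[] ExtractUrls(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse the string into a DOM

        var urls = new List<string>();

        // XPath: every <a> element carrying an href attribute.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors != null)
        {
            foreach (HtmlNode anchor in anchors)
            {
                urls.Add(anchor.GetAttributeValue("href", string.Empty));
            }
        }

        return urls.ToArray();
    }

    public static void Main()
    {
        string html = "<p><a href=\"http://example.com/a\">A</a> <a href='/b'>B</a></p>";
        foreach (string url in ExtractUrls(html))
        {
            Console.WriteLine(url);
        }
    }
}
```

The result is the array of strings the original question asks for. Relative links such as /b come back as written, so resolving them against the page address is left to the caller.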

--
Larry Lard
la*******@googlemail.com
The address is real, but unread - please reply to the group
For VB and C# questions - tell us which version
Aug 1 '06 #2
Here is a regular expression for you:
new Regex("(?<=href *= *'?\"?)[^'\";> ]*", RegexOptions.IgnoreCase | RegexOptions.Compiled);
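
A rough sketch of how that pattern could be applied to get an array of strings is below. It assumes the line break inside the character class of the posted pattern was originally a space; the class name HrefRegexDemo and the method name ExtractUrls are hypothetical. Because both quote characters in the lookbehind are optional, the pattern also produces an empty match just before each opening quote, so the sketch filters empty values out.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class HrefRegexDemo
{
    // The pattern from the post: text following href= and an optional quote,
    // up to the next quote, semicolon, '>' or space.
    private static readonly Regex HrefValue = new Regex(
        "(?<=href *= *'?\"?)[^'\";> ]*",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string[] ExtractUrls(string html)
    {
        return HrefValue.Matches(html)
            .Cast<Match>()                 // MatchCollection -> IEnumerable<Match>
            .Select(m => m.Value)
            .Where(v => v.Length > 0)      // drop the empty matches described above
            .ToArray();
    }

    public static void Main()
    {
        string html = "<a href=\"http://example.com/a\">A</a> <a href='/b'>B</a>";
        foreach (string url in ExtractUrls(html))
        {
            Console.WriteLine(url);
        }
    }
}
```

Note that the character class stops at a semicolon, so a URL containing one (for example an unescaped query string with ; separators) would be cut short, and any href-like text outside a tag is matched as well. That is the "nearly all the time" trade-off mentioned in the earlier reply.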

Aug 1 '06 #3
