Hi,
I'd like to list all links (URLs; I'm not interested in email addresses) from a
web page (htm, html, asp, php ...).
I'm not sure whether this can be done using PHP, so I'd appreciate some advice or maybe
an example.
Thanks in advance,
Robertico
Robertico wrote:
> I'd like to list all links (URLs, not interested in email addresses) from a web page (htm, html, asp, php ...).
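<?php
$url = 'http://tobyinkster.co.uk';
// Escape the URL before passing it to the shell.
$arg = escapeshellarg($url);
// Length of the page text alone (dump without the link list), plus 12.
$n = 12 + strlen(`lynx -dump -number_links -nolist $arg`);
// Cutting that many characters off the full dump leaves only the link list.
$list = substr(`lynx -dump -number_links $arg`, $n);
print "<pre>$list</pre>";
?>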
Magic.
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
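
For readers without lynx installed, the same listing can probably be done in PHP alone. What follows is only a minimal sketch, not the approach from the post above: it assumes PHP's DOM extension is available, and `extract_links()` is a made-up helper name. It collects the `href` of every `<a>` element and skips `mailto:` links, since email addresses are not wanted.

```php
<?php
// Sketch: list link URLs from an HTML string using PHP's DOM extension.
// extract_links() is a hypothetical helper name, not a built-in function.
function extract_links(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser errors silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        // Skip anchors without a target and email links.
        if ($href === '' || stripos($href, 'mailto:') === 0) {
            continue;
        }
        $links[] = $href;
    }
    // Remove duplicates and reindex from 0.
    return array_values(array_unique($links));
}

// Inline sample page (made up for illustration).
$html = '<html><body><a href="http://example.com/a">A</a> '
      . '<a href="mailto:someone@example.com">Mail</a> '
      . '<a href="/b.php">B</a></body></html>';
print_r(extract_links($html));
```

For a live page, the HTML string could come from `file_get_contents($url)`, provided `allow_url_fopen` is enabled. Relative links such as `/b.php` are returned as written; resolving them against the page URL is left out of this sketch.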
Thanks,
Great solution using Lynx. My first impression is that it works great!
Robertico
I tried to remove the numbers (by removing -number_links), but it doesn't work.
I'd like to store the results in a database.
Can you explain why you add 12 to the string length?
Robertico
Robertico wrote:
> I tried to remove the numbers (removed -number_links) but it doesn't work.
Use a regular expression:
$list = preg_replace('/\[[0-9]+\]/', '', $list);
> Can you explain why adding 12 to the string length ?
Try it without the +12 and see.
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
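
For storing the results in a database, a plain array of URLs is usually easier to handle than the numbered text. The sketch below is one possible way to get there. It assumes each entry in the lynx list sits on its own line in the form `   1. http://example.com/`, which may differ between lynx versions, and `parse_lynx_references()` is a made-up helper name.

```php
<?php
// Sketch: turn the text of a lynx link list into a plain array of URLs.
// Assumes each entry looks like "   1. http://example.com/" on its own line.
// parse_lynx_references() is a hypothetical helper name.
function parse_lynx_references(string $list): array
{
    // Capture whatever follows the "N." numbering on each line.
    preg_match_all('/^[ \t]*\d+\.[ \t]+(\S+)[ \t]*$/m', $list, $matches);

    // Drop email links, since only URLs are wanted.
    $urls = array_filter($matches[1], function (string $u): bool {
        return stripos($u, 'mailto:') !== 0;
    });
    // Remove duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

// Made-up sample in the assumed format.
$sample = "   1. http://example.com/\n"
        . "   2. mailto:someone@example.com\n"
        . "   3. http://example.com/contact\n";
print_r(parse_lynx_references($sample));
```

Each element of the returned array can then be inserted as one row, ideally through a prepared statement.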
> Use a regular expression: $list = preg_replace('/\[[0-9]+\]/', '', $list);
OK, that's correct, but is there a reason why it doesn't work
without -number_links?
I always want to know why. I'd like to understand what I'm doing. :-))
> Try it without the +12 and see.
I already did, but why exactly 12?
I hope I'm not pestering you too much.
Robertico
Robertico wrote:
> OK, that's correct, but is there a reason why it doesn't work without -number_links? I always want to know why. I'd like to understand what I'm doing. :-))
ISTR that I couldn't get the two lynx dumps to "line up" without using
"-number_links" for both.
>> Try it without the +12 and see.
> I already did, but why exactly 12?
if (strlen("References\n\n")==12)
{
echo "That's why!\n";
/* Again, IIRC */
}
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
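
The offset arithmetic can be checked without lynx. The sketch below uses made-up strings as stand-ins for the two dumps; it assumes, as the reply above recalls, that the full dump is the page text followed by a "References" heading, a blank line, and the link list.

```php
<?php
// Stand-ins for the two lynx dumps (made-up text, not real lynx output).
$body  = "Welcome to the [1]example page.\n\n";   // like the -nolist dump
$links = "   1. http://example.com/\n";           // the part we want
$full  = $body . "References\n\n" . $links;       // like the full dump

// Skip the page text plus the 12 characters of "References\n\n".
$n = 12 + strlen($body);
$list = substr($full, $n);

var_dump(strlen("References\n\n")); // int(12)
var_dump($list === $links);         // bool(true)
```

Without the +12, `$list` would still begin with the "References" heading and the blank line after it.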