Need help replacing a specific <a href ... </a>

Hi
I've been coding in PHP for a little while now and was starting to feel
pretty confident, but then realised I need to understand regular expressions
to solve a particular problem I've got ... What a horrible can of worms
regex is!! (to the uninitiated at least).

I realise that if I spend the next few weeks researching regex I'll probably
find the answer but I was wondering if anyone here would kindly help speed
up the process?

Basically I am grabbing the html from another URL, finding a specific <a
href .. </a> block and replacing the contents of that block with another
link. (n.b. the code examples below are pseudo code - I realise I need to
'escape' some characters)

Here's how I'm grabbing the source html :
$html = join ("", file (http://www.sitexyz.com/index.htm));

the link I want to replace looks something like :
<a href="http://www.sitexyz.com/page2"><img height="120"
src="origpic.jpg"></a>
Note - the only thing I know to be constant about the above link is that the
height is always 120 - the link destination and source image are completely
unpredictable.

Here's a variable holding the replacement link to my site using a different
image :
$newlink = "<a href="http://www.mysite.com"><img height="120"
src="mypic.jpg"></a>";

So... how do I use preg_replace() or ereg_replace() to find the <a href ..
</a> block encompassing an image of height 120 and replace it with the
contents of my $newlink variable?

*** To precis my question ... How do I replace an HTML link (where all I
know is the image height) with a link of my own? ***

TIA!

p.s. I promise I'll go and sit on the top of a mountain and not come down
until I've memorised & understood everything about regex ... once I've
sorted this problem! :o)

--
Sorby

Jul 17 '05 #1
Sorby wrote:
> Here's how I'm grabbing the source html :
> $html = join ("", file (http://www.sitexyz.com/index.htm));

You realize the parameter to file() should be between quotes, right?

> the link I want to replace looks something like :
> <a href="http://www.sitexyz.com/page2"><img height="120"
> src="origpic.jpg"></a>
> Note - the only thing I know to be constant about the above link is that the
> height is always 120 - the link destination and source image are completely
> unpredictable.

What about the order they appear in?
What about white space (including newlines)?
What about other attributes?

> Here's a variable holding the replacement link to my site using a different
> image :
> $newlink = "<a href="http://www.mysite.com"><img height="120"
> src="mypic.jpg"></a>";
>
> So... how do I use preg_replace() or ereg_replace() to find the <a href ..
> </a> block encompassing an image of height 120 and replace it with the
> contents of my $newlink variable?

You're better off using an HTML parser than relying on the structure of
the HTML you're fetching.

Anyway, if they don't ever change, here's a starting point:

<?php
error_reporting(E_ALL);
ini_set('display_errors', '1');

$x = '... <a href="http://www.sitexyz.com/page2">';
$x .= '<img height="120" src="origpic.jpg"></a> ...';

$newlink = '<a href="http://www.mysite.com">';
$newlink .='<img height="120" src="mypic.jpg"></a>';

$y = preg_replace('@<a href="[^"]+"><img height="120" src="[^"]+"></a>@',
$newlink, $x);

echo "$x\n$y\n";
?>
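
For the HTML-parser route mentioned above, a minimal sketch might look like the following. It assumes PHP's bundled DOM extension is enabled and that the image is a direct child of the link; replaceLinksByImageHeight() and its parameters are made-up names for illustration, not a built-in.

```php
<?php
// Sketch only: swap every <a> that directly wraps an <img height="..."> for a new link.
// replaceLinksByImageHeight() is a hypothetical helper, not a built-in.
function replaceLinksByImageHeight($html, $height, $newHref, $newSrc)
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // fetched HTML is rarely valid; keep warnings quiet
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // $height is assumed to be a plain number string (no quotes), because it is
    // pasted straight into the XPath expression.
    $links = $xpath->query('//a[img[@height="' . $height . '"]]');

    // Copy to an array first so replacing nodes cannot disturb the iteration.
    foreach (iterator_to_array($links) as $a) {
        $img = $doc->createElement('img');
        $img->setAttribute('height', $height);
        $img->setAttribute('src', $newSrc);

        $newA = $doc->createElement('a');
        $newA->setAttribute('href', $newHref);
        $newA->appendChild($img);

        $a->parentNode->replaceChild($newA, $a);
    }

    // saveHTML() returns a whole document (doctype, <html> and <body> are added).
    return $doc->saveHTML();
}

$page = '<p><a href="/1/files/small/north/page2.htm"><img height="120" hspace="0" '
      . 'alt="original photo" src="http://www.mysite.com/origpic.jpg" /></a></p>';
echo replaceLinksByImageHeight($page, '120', 'http://www.mysite.com', 'mypic.jpg');
```

Because the match is on the parsed tree, attribute order, extra attributes and line breaks inside the tags should not matter here, which is the main reason to prefer it over a fixed pattern.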

> *** To precis my question ... How do I replace an HTML link (where all I
> know is the image height) with a link of my own? ***

HTML links do not always have an image associated with them!
--
USENET would be a better place if everybody read: : mail address :
http://www.catb.org/~esr/faqs/smart-questions.html : is valid for :
http://www.netmeister.org/news/learn2quote2.html : "text/plain" :
http://www.expita.com/nomime.html : to 10K bytes :
Jul 17 '05 #2
"Pedro Graca" <he****@hotpop.com> wrote in message
news:c6************@ID-203069.news.uni-berlin.de...
> Sorby wrote:
>> Here's how I'm grabbing the source html :
>> $html = join ("", file (http://www.sitexyz.com/index.htm));
>
> You realize the parameter to file() should be between quotes, right?

Yes - thanks. I just typed it in that way - but in my source code it has
quotes.

>> the link I want to replace looks something like :
>> <a href="http://www.sitexyz.com/page2"><img height="120"
>> src="origpic.jpg"></a>
>> Note - the only thing I know to be constant about the above link is that the
>> height is always 120 - the link destination and source image are completely
>> unpredictable.
>
> What about the order they appear in?

Here is an exact cut'n'paste of the bit I want to replace.

<a href="/1/files/small/north/page2.htm"><img height="120" hspace="0"
align="left" vspace="0" border="0" width="80" alt="original photo"
src="http://www.mysite.com/origpic.jpg" /></a>

> What about white space (including newlines)?

I don't think this will be a problem. Hope not anyway. All the examples I've
seen are as above and with no line-breaks/carriage-returns.

> What about other attributes?

I've included them in the new example above.

>> Here's a variable holding the replacement link to my site using a different
>> image :
>> $newlink = "<a href="http://www.mysite.com"><img height="120"
>> src="mypic.jpg"></a>";
>>
>> So... how do I use preg_replace() or ereg_replace() to find the <a href ..
>> </a> block encompassing an image of height 120 and replace it with the
>> contents of my $newlink variable?
>
> You're better off using an HTML parser than relying on the structure of
> the HTML you're fetching.

Am I relying on the structure of the HTML I'm fetching? If the structure of
the HTML changes over time but the image height remains 120 then the code
should still work, right?

> Anyway, if they don't ever change, here's a starting point:
>
> <?php
> error_reporting(E_ALL);
> ini_set('display_errors', '1');
>
> $x = '... <a href="http://www.sitexyz.com/page2">';
> $x .= '<img height="120" src="origpic.jpg"></a> ...';
>
> $newlink = '<a href="http://www.mysite.com">';
> $newlink .='<img height="120" src="mypic.jpg"></a>';
>
> $y = preg_replace('@<a href="[^"]+"><img height="120" src="[^"]+"></a>@',
> $newlink, $x);
>
> echo "$x\n$y\n";
> ?>

Thanks for taking the time to post this solution Pedro. I will try it out
now.

>> *** To precis my question ... How do I replace an HTML link (where all I
>> know is the image height) with a link of my own? ***
>
> HTML links do not always have an image associated with them!

Good point but thankfully I am assured that in my case the links I'm looking
for will always have an image.

Thanks again - your help is greatly appreciated.

--
Sorby
Jul 17 '05 #3
"Pedro Graca" <he****@hotpop.com> wrote in message
news:c6************@ID-203069.news.uni-berlin.de...
> Sorby wrote:
>> Here's how I'm grabbing the source html :
>> $html = join ("", file (http://www.sitexyz.com/index.htm));
>
> You realize the parameter to file() should be between quotes, right?
>
>> the link I want to replace looks something like :
>> <a href="http://www.sitexyz.com/page2"><img height="120"
>> src="origpic.jpg"></a>
>> Note - the only thing I know to be constant about the above link is that the
>> height is always 120 - the link destination and source image are completely
>> unpredictable.
>
> What about the order they appear in?
> What about white space (including newlines)?
> What about other attributes?
>
>> Here's a variable holding the replacement link to my site using a different
>> image :
>> $newlink = "<a href="http://www.mysite.com"><img height="120"
>> src="mypic.jpg"></a>";
>>
>> So... how do I use preg_replace() or ereg_replace() to find the <a href ..
>> </a> block encompassing an image of height 120 and replace it with the
>> contents of my $newlink variable?
>
> You're better off using an HTML parser than relying on the structure of
> the HTML you're fetching.
>
> Anyway, if they don't ever change, here's a starting point:
>
> <?php
> error_reporting(E_ALL);
> ini_set('display_errors', '1');
>
> $x = '... <a href="http://www.sitexyz.com/page2">';
> $x .= '<img height="120" src="origpic.jpg"></a> ...';

Sorry - I probably wasn't clear enough - the href link and the image source
name could be anything - I can't predict them - or even part of them.

--
Sorby
Jul 17 '05 #4
Sorby wrote:
> "Pedro Graca" <he****@hotpop.com> wrote in message
> news:c6************@ID-203069.news.uni-berlin.de...
>> $x = '... <a href="http://www.sitexyz.com/page2">';
>> $x .= '<img height="120" src="origpic.jpg"></a> ...';
>
> Sorry - I probably wasn't clear enough - the href link and the image source
> name could be anything - I can't predict them - or even part of them.

It's ok, just try it with different $x's :)
$x = '<a href="one"><img src="one"/></a>';
# $y = preg_replace();

$x = '<a href="two"><img src="two"></a>';
# $y = preg_replace();

$x = '<a href="three"><img src="three"></a>';
# $y = preg_replace();

....
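
The try-it-with-several-inputs idea can be sketched as a loop. This reuses the pattern and replacement from the earlier starting-point script; the three sample strings are hypothetical inputs in which only the href and src values differ (each keeps height="120", which the pattern requires).

```php
<?php
// Pattern and replacement as in the earlier starting-point script.
$pattern = '@<a href="[^"]+"><img height="120" src="[^"]+"></a>@';
$newlink = '<a href="http://www.mysite.com"><img height="120" src="mypic.jpg"></a>';

// Hypothetical sample inputs: only the href and src values change.
$samples = array(
    '<a href="one"><img height="120" src="one"></a>',
    '<a href="two"><img height="120" src="two"></a>',
    '<a href="three"><img height="120" src="three"></a>',
);

$results = array();
foreach ($samples as $x) {
    // [^"]+ matches whatever sits between the quotes, so every sample is replaced.
    $y = preg_replace($pattern, $newlink, $x);
    $results[] = $y;
    echo "$x\n$y\n\n";
}
```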

Jul 17 '05 #5
Sorby wrote:
> Here is an exact cut'n'paste of the bit I want to replace.
>
> <a href="/1/files/small/north/page2.htm"><img height="120" hspace="0"
> align="left" vspace="0" border="0" width="80" alt="original photo"
> src="http://www.mysite.com/origpic.jpg" /></a>


Hmmmm ... and you want to change just the URL and image SRC ?

<a href="=========== CHANGED =========="><img height="120" hspace="0"
align="left" vspace="0" border="0" width="80" alt="original photo"
src="============== CHANGED ==========" /></a>

Copy the stuff that you want to keep to the regexp and for the stuff you
want replaced use
[^"]+
which means: one or more of anything except quotes

so

$regexp = '@' .
'<a href="[^"]+"><img height="120" hspace="0" ' .
## =URL=
'align="left" vspace="0" border="0" width="80" alt="original photo" ' .
'src="[^"]+" /></a>' .
## =SRC=
'@';

$destin = preg_replace($regexp, $newlink, $source);
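
If the other attributes or their order might vary, a looser pattern is one possible variation. The sketch below is only an assumption-laden alternative: it keys on height="120" anywhere inside the <img> tag and assumes no attribute value contains a literal ">". It also uses preg_replace_callback() so that a "$1" or "\1" inside $newlink would not be read as a backreference.

```php
<?php
// Sketch: match <a ...> wrapping an <img ...> that has height="120" anywhere
// among its attributes. Assumes no attribute value contains a literal ">".
$pattern = '@<a\b[^>]*>\s*<img\b[^>]*\sheight="120"[^>]*>\s*</a>@i';

$source  = '<p><a href="/1/files/small/north/page2.htm"><img height="120" hspace="0" '
         . 'align="left" vspace="0" border="0" width="80" alt="original photo" '
         . 'src="http://www.mysite.com/origpic.jpg" /></a></p>';
$newlink = '<a href="http://www.mysite.com"><img height="120" src="mypic.jpg"></a>';

// The callback returns $newlink verbatim for every match.
$destin = preg_replace_callback($pattern, function (array $m) use ($newlink) {
    return $newlink;
}, $source);

echo $destin, "\n";
```

The trade-off is that a looser pattern will also replace any other link on the page that wraps a 120-high image, so it is worth checking the fetched page for such links first.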
Jul 17 '05 #6
