
Preg_match and Preg_replace trouble regular expressions Please HELP!

P: n/a
I need to extract all text between the following strings, but not
include the strings themselves.
"<!-- #BeginEditable "Title name" -->"
"<p align="center">#### </p>"
I am using preg_match(????, $s, $results)
but I have had no success. I tried items like.. "<!-- #BeginEditable
"Title name" -->".*."<p align="center">#### </p>"
But have had no luck.
Please, any assistance is appreciated.
In addition, I need to use preg_replace to replace the following:
<a href="/folder2/folder1/folder/*page.html">
with
<a href="site.com=page.php?option*=paper&itemid=page.html">
Again, any help is appreciated. The regular expressions confuse me, my
brain is about to explode from all the troubleshooting, I guess I'm
just not smart enough to figure it out.
PLEASE HELP!!
thanks

Aug 1 '05 #1
22 Replies


P: n/a
st*****@hotmail.com wrote:
need to extract all text between the following strings, but not
include the strings.
"<!-- #BeginEditable "Title name" -->"
"<p align="center">#### </p>"
I am using preg_match(????, $s, $results)
but I have had no success. I tried items like.. "<!-- #BeginEditable
"Title name" -->".*."<p align="center">#### </p>"

Try something like '/Title name" -->(.*)<p align="center/' for the
regex, and look for the answer in $results[1].
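
For anyone following along, here is a minimal sketch of that suggestion using the full markers from the question. The sample text in $s and the helper name extract_between are made up for illustration. The s modifier (so the dot also matches newlines) and the lazy .*? are additions of this sketch, not part of the pattern suggested above.

```php
<?php
// Hypothetical helper: returns the text between two literal markers, or null.
function extract_between($start, $end, $s)
{
    // preg_quote() escapes regex metacharacters in the markers, including
    // the "~" delimiter passed as its second argument.
    $pattern = '~' . preg_quote($start, '~') . '(.*?)' . preg_quote($end, '~') . '~s';
    if (preg_match($pattern, $s, $results)) {
        return $results[1]; // group 1 holds only the text between the markers
    }
    return null;
}

// Made-up sample input standing in for the real page.
$s = "header\n<!-- #BeginEditable \"Title name\" -->\nThe title\nspans two lines\n"
   . "<p align=\"center\">#### </p>\nfooter";

$text = extract_between('<!-- #BeginEditable "Title name" -->', '<p align="center">#### </p>', $s);
echo $text;
```

Quoting the markers with preg_quote() means the slash in </p> and the # characters need no hand escaping, whichever delimiter is chosen.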

When you attempt to learn regexes, empty your mind of everything you
know from PHP or other computer languages. The parens, dot, star, and
other special characters are used in completely different ways than in
procedural languages.

Here's a quick link to PHP man:
http://us2.php.net/manual/en/function.preg-match.php

Here's a tutorial that others have found useful:
http://weblogtoolscollection.com/regex/regex.php
(myself, I learned regex back in the day of awk, when all we had was
woodburning computers)

<snip>
In addition, I need to use preg_replace to replace the following:
<snip>

Well, you'll probably want something along the lines of

$newString = preg_replace('/OldStuff/', "NewStuff", $oldString);

but check out a tutorial or two to get a handle on all this 'stuff'.
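
As a rough sketch of how that could look for the link rewrite in the question: the sample links below are made up, and the target URL is copied from the question with the "*" left out, since it is not clear what the asterisk stands for there. Treat both as assumptions.

```php
<?php
// Made-up sample input; the paths mirror the ones in the question.
$oldString = '<a href="/folder2/folder1/folder/page.html">Page</a> '
           . '<a href="/folder9/folder8/folder7/page555.html">Other</a>';

// Match an opening anchor whose href is a rooted path and capture the file
// name after the last slash. "~" is the delimiter, so "/" needs no escaping.
$pattern = '~<a href="/(?:[^"/]+/)*([^"/]+)">~';

// $1 in the replacement refers to the captured file name.
$replacement = '<a href="site.com=page.php?option=paper&itemid=$1">';

$newString = preg_replace($pattern, $replacement, $oldString);
echo $newString;
```

preg_replace() rewrites every match in the string, so no loop is needed when the same rule applies to all links on the page.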

Friedl's _Mastering Regular Expressions_ is a good book on the subject.
It is not a light read. But then regular expressions are really a
computer language of their own, embedded into languages like PHP and
Perl, but with their own separate syntax and rules.

The regular expressions confuse me, my
brain is about to explode from all the trouble shooting

Yeah, been there. Wish you luck.

Aug 1 '05 #2

P: n/a
Thank you for the responses, I'm going to try this stuff tonight and I
hope it works.

thanks

Aug 5 '05 #3

P: n/a
Regular Expressions are great. Less than 10 lines of code can be so
powerful when you start to grasp the technology.
============================
<?php
// Read the whole test file into a string.
$content=file_get_contents("pregtest.txt");

echo "\n\nBEFORE:\n------------------\n";
echo $content;

// Collect every opening anchor tag; [^\"]+ keeps each match inside one tag.
preg_match_all("|<a href=\"[^\"]+\">|",$content,$matches);

foreach($matches[0] as $value)
{
// Capture the file name after the last slash of the href.
preg_match("|\".+/([^/]+)\">$|",$value,$page);
// preg_quote() makes characters such as . and * in the tag match literally.
$pattern="(.+<!-- #BeginEditable \"Title name\" -->.+)" . preg_quote($value,"|") .
"(.+<p align=\"center\">#### </p>.+)";
$replacement="<a href=\"site.com=page.php?option*=paper&itemid=" .
$page[1] . "\">";
// The s modifier lets the dot cross line breaks between the two markers.
$content=preg_replace("|$pattern|s","\${1}" . $replacement .
"\${2}",$content);

}

echo "\n\nAFTER:\n------------------\n";
echo $content . "\n\n";
?>

HERE IS THE OUTPUT:
-------------------------------------------------

BEFORE:
------------------
some content that I don't want to edit...
<a href="/folder9/folder8/folder7/-page555.html">
<!-- #BeginEditable "Title name" -->
<a href="/folder2/folder1/folder/*page.html">
<a href="/folder9/folder8/folder7/-page555.html">
<p align="center">#### </p>
More content that I don't want to edit...
<a href="/folder2/folder1/folder/*page.html">
AFTER:
------------------
some content that I don't want to edit...
<a href="/folder9/folder8/folder7/-page555.html">
<!-- #BeginEditable "Title name" -->
<a href="site.com=page.php?option*=paper&itemid=*page.html">
<a href="site.com=page.php?option*=paper&itemid=-page555.html">
<p align="center">#### </p>
More content that I don't want to edit...
<a href="/folder2/folder1/folder/*page.html">

Aug 6 '05 #4

P: n/a
cameron7,

Thank you for your post. But I'm still having trouble. I think you're
complicating what I'm trying to do. When I tried your code, nothing
appeared on the page. I assume there is a syntax error, but I may be
wrong. I've tried debugging, with no luck. You included a loop
through each appearance of the pattern, but in my pages it only
occurs once. Basically I'm removing an old header and footer, and
replacing old links with a new path from the existing page. I
appreciate you helping me.

thanks

Aug 6 '05 #5

P: n/a
Ok, I am getting closer but still no solution. I figured out your code
had extra spaces, most likely a google thing, but I removed them and
now I get the before part but the after section is blank. Any ideas?

thanks

Aug 6 '05 #6

P: n/a
Ok, sorry for all the extra posts, but I wanted to fill you in on where
I am.

When I use....

$fp=fopen("http://www.site.com/pregtest.html","r+");
$content=fread($fp,filesize("http://www.site.com/pregtest.html"));
fclose($fp);

etc. etc.

I get .....

"Before---------------After----------------", but that is it, no other
content.
If I change it to ...
$content=implode('', file("http://www.site.com/pregtest.html"));
etc.etc..

I get .....

"Before---------------------
....ALL NORMAL CONTENT......"

But then nothing under After, nor do I see the "AFTER"

I am so lost I want to bang my head against a wall. I don't know much
programming and this stuff is way over my head.

Thanks, any assistance is appreciated.

Aug 6 '05 #7

P: n/a
<?

error_reporting(E_ALL);

/* INSERT CODE HERE */

?>

Aug 6 '05 #8

P: n/a
On 2005-08-06, cameron7 <ca******@gmail.com> wrote:
<?

error_reporting(E_ALL);


// i find it easier to remember two calls to ini_set
//ini_set('error_reporting', E_ALL);

// make sure they are displayed
ini_set('display_errors', TRUE);

--
Met vriendelijke groeten,
Tim Van Wassenhove <http://timvw.madoka.be>
Aug 6 '05 #9

P: n/a
I added the

// i find it easier to remember two calls to ini_set
//ini_set('error_reporting', E_ALL);
// make sure they are displayed
ini_set('display_errors', TRUE);

line and when I use....

$fp=fopen("http://www.site.com/pregtest.html","r+");
$content=fread($fp,filesize("http://www.site.com/pregtest.html"));
fclose($fp);

etc. etc.

I get .....
"Warning: fopen() expects at least 2 parameters, 1 given in
/Users/imac/Sites/folder/test.php on line 8

Warning: filesize(): Stat failed for http://www.site.com/test.php
(errno=2 - No such file or directory) in
/Users/imac/Sites/folder/test.php on line 9

Warning: fread(): supplied argument is not a valid stream resource in
/Users/imac/Sites/folder/test.php on line 9

Warning: fclose(): supplied argument is not a valid stream resource in
/Users/imac/Sites/folder/test.php on line 10
BEFORE: ------------------ AFTER: ----------------- -"

But, If I change it to ...
$content=implode('', file("http://www.site.com/pregtest.html"));
etc.etc..

I get .....

"Before---------------------
....ALL NORMAL CONTENT......"

Fatal error: Maximum execution time of 30 seconds exceeded in
/Users/imac/Sites/Folder/test.php on line 31

Aug 6 '05 #10

P: n/a
I was reviewing the code again, and correct me if I'm wrong, from what
I understand the current code only replaces.....

<a href="/folder2/folder1/folder/*page.html">

with

<a href="site.com=page.php?option*=paper&itemid=page.html">

if it's between each occurrence of the preg_match.

I was trying to do two totally separate items.

1) find one distinct preg_match on a single page

2) replace links within a page

But as stated, they are totally independent.

If this simplifies matters, I can still use the help.

thanks again.

I've been working on this all morning, but am not having any luck.

thanks

Aug 6 '05 #11

P: n/a
I decided to focus on one item at a time. This is what I have so far,
but I get the following error...

Warning: Unknown modifier '(' in /Users/imac/Sites/folder/test.php on
line 17

when using the following code....

<?php
// i find it easier to remember two calls to ini_set
//ini_set('error_reporting', E_ALL);
// make sure they are displayed
ini_set('display_errors', TRUE);

//replace "news.htm" with your file
$s=implode('', file("http://www.site.com/test.php"));

$pattern="<!-- #BeginEditable \"Title of Firmrecall\" -->(.*?)<p
align=\"center\">#### </p>";
if(preg_match($pattern, $s, $results)){

echo $results[0];

}
?>

Aug 6 '05 #12

P: n/a
this is line 17

$pattern="<!-- #BeginEditable \"Title of Firmrecall\" -->(.*?)<p
align=\"center\">#### </p>";

Aug 6 '05 #13

P: n/a
I'm begging for someone to help me out here. I've been working at this
non stop for the last four days, and I cannot find a solution to my
problem.

This is the pattern I'm using to find everything between...

"<!-- #BeginEditable "Title of name" --> " AND "<p
align="center">#### </p>"
$pattern="/<\!-- #BeginEditable \"Title of Firmrecall\" -->.*<p
align=\"center\">#### </p>/i";

But I get nothing.

PLEASE HELP!!!!

Aug 6 '05 #14

P: n/a
Somebody wrote:
$pattern="/<\!-- #BeginEditable \"Title of Firmrecall\" -->.*<p
align=\"center\">#### </p>/i";


a) Encode/escape the slash in your pattern; e.g., /<\/p>/.

b) Use different delimiters; e.g., `...`.

[Just probing, but why have you set the i modifier?]
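
To spell out both options, here is a small sketch against a made-up one-line sample. The variable names are only for illustration. Incidentally, the earlier "Unknown modifier '('" warning most likely came from the pattern having no delimiters at all: PHP accepts < and > as a delimiter pair, so everything after the first > was read as modifiers.

```php
<?php
// Made-up sample containing the closing marker from the question.
$s = 'before <p align="center">#### </p> after';

// Option a: keep "/" as the delimiter and escape the slash inside "</p>".
$escaped = '/<p align="center">#### <\/p>/';

// Option b: switch to a delimiter that does not occur in the pattern.
// "#" would clash with "####" here, so this sketch uses "~".
$otherDelimiter = '~<p align="center">#### </p>~';

$a = preg_match($escaped, $s);        // 1 when the pattern matches
$b = preg_match($otherDelimiter, $s);
echo "a=$a b=$b\n";
```

Single-quoted PHP strings also avoid having to backslash every double quote inside the pattern, which keeps it easier to read.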

--
Jock
Aug 6 '05 #15

P: n/a
I tried changing it, but still no luck, can you post what you mean?

I tried many variations, but my last was...

$pattern="'/<\!-- #BeginEditable \"Title of Firmrecall\" -->/.*/<p
align=\"center\">#### /<\/p>/'";

I had the i to make the text case insensitive, but it's really not
necessary I guess.

Aug 6 '05 #16

P: n/a
Somebody wrote:
$pattern="'/<\!-- #BeginEditable \"Title of Firmrecall\" -->/.*/<p
align=\"center\">#### /<\/p>/'";


$pattern="/<\!-- #BeginEditable \"Title of Firmrecall\"
-->.*<p align=\"center\">#### <\/p>/";

--
Jock
Aug 7 '05 #17

P: n/a
Still no luck, so I decided to take it one piece at a time and find out
which part is the problem.

I think it might have something to do with "<"

I tried

preg_match("/<h.t/", "<hat", $results);
echo $results[0];

and it did not work.

I changed it to...
preg_match("/h.t/", "hat", $results);
echo $results[0];

and it did work. So I thought ok just escape the "<" character with..
preg_match("/\<h.t/", "<hat", $results);
echo $results[0];

but no luck, any ideas why?

thanks

Aug 7 '05 #18

P: n/a
Somebody wrote:
preg_match("/\<h.t/", "<hat", $results);
echo $results[0];

but no luck, any ideas why?


Sure, you're relying on your browser to tell you the results
and it's hiding them from you. Your browser likely interprets
the string '<hat' as the beginning of a tag. Look at the
source and see what results you actually get.
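
One way to see the match without opening the page source is to escape it before echoing. This is only a sketch of that idea; the variable names are made up.

```php
<?php
preg_match('/<h.t/', '<hat', $results);

// Echoing the raw match sends "<hat" to the browser, which may parse it as
// the start of a tag and show nothing.
$raw = $results[0];

// htmlspecialchars() converts "<" to "&lt;", so the match shows up as text.
$visible = htmlspecialchars($raw);
echo $visible;
```

The regex itself was fine all along; "<" has no special meaning in a pattern and does not need escaping.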

--
Jock
Aug 7 '05 #19

P: n/a
OK, I'm getting closer but still having trouble.

Does anyone know why preg_match stops the match if the string goes
to the next line?

for example....

//replace "news.htm" with your file
$s=implode('',
file("http://www.fda.gov/oc/po/firmrecalls/perrigo07_05.html"));

//echo $s;

preg_match("/\"Title of Firmrecall\".*/",$s,$result);

echo $result[0];

displays....
"Title of Firmrecall" The following is the title and

When it should be
"Title of Firmrecall" The following is the title and
this is the next line of the title.
But the "this is the next line of the title" and everything after it
does not get displayed.

PLEASE HELP!!! I've been working at this non stop but can not get it
working, I'm desperate!!!

Aug 7 '05 #20

P: n/a
You're correct, "<hat" was in the source. Any idea on the next issue?

Aug 7 '05 #21

P: n/a
Somebody wrote:
preg_match("/\"Title of Firmrecall\".*/",$s,$result);

echo $result[0];

displays....
"Title of Firmrecall" The following is the title and

When it should be
"Title of Firmrecall" The following is the title and
this is the next line of the title.

The dot doesn't match newline characters. The s pattern
modifier changes its default meaning so that it matches
newlines as well.

/\"Title of Firmrecall\".*/s
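
A short sketch of the difference, using a made-up two-line string modelled on the output quoted above (the variable names are illustrative):

```php
<?php
// Made-up two-line sample modelled on the output shown in the question.
$s = "\"Title of Firmrecall\" The following is the title and\nthis is the next line of the title.";

preg_match('/"Title of Firmrecall".*/', $s, $withoutS);  // dot stops at "\n"
preg_match('/"Title of Firmrecall".*/s', $s, $withS);    // dot also matches "\n"

echo $withoutS[0], "\n---\n", $withS[0], "\n";
```

With the s modifier a greedy .* runs to the end of the input, so when there is an end marker to stop at, a lazy .*? followed by that marker is usually the safer choice.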

PLEASE HELP!!! I've been working at this non stop but can not get it
working, I'm desperate!!!


It's Sunday, give it a rest for a while, and make the most of
it before it becomes Monday morning.

--
Jock
Aug 7 '05 #22

P: n/a
John,

THANK YOU!!!!!!!!!

I got it working! I'm giving it a rest, THANK YOU THANK YOU!!!!!!
THANK YOU!!

I'm going to relax now and enjoy the rest of my day, maybe go
swimming! PLEASE ENJOY YOUR DAY AS WELL!!!!

THANK YOU!!!!

Aug 7 '05 #23

This discussion thread is closed

Replies have been disabled for this discussion.