selecting specific tag

I am making a script which displays an RSS feed on my website. To do
this I would like to take all the <div class="blabla"><img src="bla.jpg"
alt="bla"> </div> tags and put them in front of my text. So the
following string:

<p>This is text blah blah blah</p>
<div class="blabla"><img src="bla.jpg" alt="bla"> </div>
<p>this is another text</p>

should give as output:

<div class="blabla"><img src="bla.jpg" alt="bla"> </div>
<p>This is text blah blah blah</p>
<p>this is another text</p>
I tried using regular expressions, but I do not really get it done. Here
is the code I use:
while(ereg('/<DIV class=\"blabla\"*>(.*?)<\/DIV>/i',$content, $matches)){
$img=$img.$matches[0];
$content=ereg_replace('/<DIV
class=\"imgbar50\"*>(.*?)<\/DIV>/i',"",$content,1);
}

What am I doing wrong?
Apr 3 '06 #1
16 2112
I recommend using preg_* since it's usually faster than ereg_*.

Try this one:

preg_replace(
'#^(.*)(<div class="foo">.*</div>)#isU',
'$2$1',
$content
);

If you store your content in a database, it would be easier (and better
for performance) to store the image in a separate row.
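
For anyone landing here later: the ereg_* functions take POSIX patterns without delimiters or modifiers, so the /.../i wrapper and the lazy .*? from the original attempt do not mean there what they mean in preg_* (ereg was also deprecated in PHP 5.3 and removed in PHP 7). Here is a minimal sketch of the whole task with preg_*, assuming the divs are not nested and the class attribute is written exactly as in the question. move_divs_to_front is a made-up helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper (not a PHP built-in): moves every
// <div class="$class">...</div> block in front of the remaining markup.
// Assumes the divs are not nested inside each other.
function move_divs_to_front(string $html, string $class): string
{
    // i = case-insensitive, s = let "." match newlines, ".*?" = lazy match
    $pattern = '#<div class="' . preg_quote($class, '#') . '"[^>]*>.*?</div>\s*#is';

    // Collect all matching blocks; if there are none, leave the input alone.
    if (!preg_match_all($pattern, $html, $matches)) {
        return $html;
    }

    // Remove the blocks from their original positions.
    $rest = preg_replace($pattern, '', $html);

    // Put the collected blocks first, one per line, then the remaining text.
    return implode("\n", array_map('trim', $matches[0])) . "\n" . $rest;
}

$content = '<p>This is text blah blah blah</p>' . "\n"
         . '<div class="blabla"><img src="bla.jpg" alt="bla"> </div>' . "\n"
         . '<p>this is another text</p>';

echo move_divs_to_front($content, 'blabla');
```

Unlike the one-shot pattern above, this collects every matching div, not only the first one. It is still a regex over HTML, so it will break on nested divs.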

Apr 3 '06 #2
Actually, it reads in the data from another site (the weblog has to be
hosted by them). Your solution works partially: it does not remove the
<div> from the original message...

Apr 3 '06 #3
I'm sorry, it actually worked! Thank you.

Apr 3 '06 #4
in article e0**********@info.science.uva.nl, Y.G. at
yg_blah_@this_domain.bla wrote on 4/3/06 11:03 AM:
preg_replace(
'#^(.*)(<div class="foo">.*</div>)#isU',
'$2$1',
$content
);

OK, I'm stupid, but how do you find and replace the BODY tag with
preg_replace?

Here's an example:

<BODY TEXT="#99FFFF" BGCOLOR="#000000" LINK="#ff99ff" VLINK="#ff99ff"
ALINK="#d3d3d3">
So I want to find "<BODY " and everything up to the next ">", and replace it
with "<!-- Body tag was here -->" (for example).

I have not figured out this preg syntax yet. :-(

--
Stephen Kay
Karma-Lab sk@karma-lab.NOSPAM.com
^^^^^^^
Apr 3 '06 #5
That would be something like

preg_replace(
'/(<body.*?>)/i',
'<!-- Body tag was here -->',
$content
);

There are some nice tutorials about PCRE, for example:
- http://www.regular-expressions.info/tutorial.html
- http://www.tote-taste.de/X-Project/regex/
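
A note on the pattern, as a small sketch: the letter after the closing delimiter is a modifier, and i makes the match case-insensitive, so <body also finds <BODY. One thing to watch is that "." does not match a line break unless the s modifier is added, so a tag wrapped over two lines is missed by .*? but caught by [^>]*. The variable names below are only for illustration.

```php
<?php
// The BODY tag from the question, split over two lines.
$html = '<BODY TEXT="#99FFFF" BGCOLOR="#000000"' . "\n"
      . 'ALINK="#d3d3d3">' . "\n"
      . '<p>Hello</p>';
$marker = '<!-- Body tag was here -->';

// Without the i modifier, lowercase "body" never matches the uppercase tag.
$caseSensitive = preg_replace('/<body[^>]*>/', $marker, $html);

// With i it matches; [^>]* also crosses the line break inside the tag.
$caseInsensitive = preg_replace('/<body[^>]*>/i', $marker, $html);

// ".*?" without the s modifier cannot cross the newline, so nothing is replaced.
$dotWithoutS = preg_replace('/<body.*?>/i', $marker, $html);

var_dump($caseSensitive === $html);  // unchanged input
var_dump($dotWithoutS === $html);    // unchanged input
echo $caseInsensitive, "\n";         // marker followed by the paragraph
```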

Apr 3 '06 #6
Stephen Kay wrote:
in article e0**********@info.science.uva.nl, Y.G. at
yg_blah_@this_domain.bla wrote on 4/3/06 11:03 AM:

preg_replace(
'#^(.*)(<div class="foo">.*</div>)#isU',
'$2$1',
$content
);


OK, I'm stupid, but how do you find and replace the BODY tag with
preg_replace?

Here's an example:

<BODY TEXT="#99FFFF" BGCOLOR="#000000" LINK="#ff99ff" VLINK="#ff99ff"
ALINK="#d3d3d3">
So I want to find "<BODY " and everything up to the next ">", and replace it
with "<!-- Body tag was here -->" (for example).

I have not figured out this preg syntax yet. :-(


You need to be careful - you can't just replace to the next ">", e.g. (on three
lines for the heck of it)

<body TEXT="#99FFFF"
<!-- BG color was black -->
BGCOLOR="#0000FF">
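
To make the problem concrete, here is a rough sketch of what happens with that input. The first ">" the pattern meets is the one that closes the comment, so the replacement cuts the tag in half. The second pattern is only an assumption about what "more careful" could look like (it treats a whole comment as one unit); it still breaks on a ">" inside a quoted attribute value.

```php
<?php
// The three-line tag from above, with a comment inside it.
$html = '<body TEXT="#99FFFF"' . "\n"
      . '<!-- BG color was black -->' . "\n"
      . 'BGCOLOR="#0000FF">';
$marker = '<!-- Body tag was here -->';

// Naive: lazy match up to the first ">", which ends the comment, not the tag.
$naive = preg_replace('/<body.*?>/is', $marker, $html);

// More tolerant sketch: inside the tag, consume either a complete
// <!-- ... --> comment or any single character that is not ">".
$tolerant = preg_replace('/<body(?:<!--.*?-->|[^>])*>/is', $marker, $html);

echo $naive, "\n";     // marker, then the leftover BGCOLOR attribute
echo $tolerant, "\n";  // marker only
```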
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 3 '06 #7
in article 11**********************@i40g2000cwc.googlegroups. com, milahu at
mi****@googlemail.com wrote on 4/3/06 4:52 PM:
preg_replace(
'/(<body.*?>)/i',
'<!-- Body tag was here -->',
$content
);


Thank you! Works great.

I'll be sure to look into those tutorials - 1 question: what does the /i at
the end of the pattern mean?

--
Stephen Kay
Karma-Lab sk@karma-lab.NOSPAM.com
^^^^^^^
Apr 3 '06 #8
in article D4******************************@comcast.com, Jerry Stuckle at
js*******@attglobal.net wrote on 4/3/06 7:01 PM:
You need to be careful - you can't just replace to the next ">", i.e. (on
three
lines for the heck of it)

<body TEXT="#99FFFF"
<!-- BG color was black -->
BGCOLOR="#0000FF">


OK, I see your point, but did you have a solution to offer? How would you
search for this kind of situation, in order to strip out a body tag?

Luckily, I don't think any of the pages I need to treat this way have that
kind of stuff in them...

--
Stephen Kay
Karma-Lab sk@karma-lab.NOSPAM.com
^^^^^^^
Apr 3 '06 #9
Stephen Kay wrote:
in article D4******************************@comcast.com, Jerry Stuckle at
js*******@attglobal.net wrote on 4/3/06 7:01 PM:

You need to be careful - you can't just replace to the next ">", i.e. (on
three
lines for the heck of it)

<body TEXT="#99FFFF"
<!-- BG color was black -->
BGCOLOR="#0000FF">

OK, I see your point, but did you have a solution to offer? How would you
search for this kind of situation, in order to strip out a body tag?

Luckily, I don't think any of the pages I need to treat this way have that
kind of stuff in them...


First of all, I wouldn't do it the way you are. I'd have each page include
header and footer files as necessary.

But if I *had* to do this (kicking and screaming) I'd go ahead and delete the
tags from the files.

Remember - it isn't just <head>, <body>, etc. - you may have meta tags and all
kinds of other things.

Otherwise you need to parse each page one character at a time, looking for
nesting, quotes and all kinds of other things. And it would have to be done
every time the page is loaded.
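
An HTML parser can do that character-by-character work. Below is a minimal sketch, assuming PHP's bundled DOM extension (DOMDocument) is available; body_inner_html is an invented name. How well it copes with tag soup depends on libxml, so it would need testing against the real pages, and it still runs on every request unless the result is cached.

```php
<?php
// Hypothetical helper: returns the markup found inside <body>...</body>.
function body_inner_html(string $html): string
{
    $doc = new DOMDocument();

    // Old pages are rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize each child of <body>, leaving the <body> tag itself out.
    $inner = '';
    foreach ($body->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}

$page = '<html><head><title>Old page</title></head>'
      . '<body bgcolor="#000000"><p>Hello</p></body></html>';

echo body_inner_html($page), "\n";
```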

Shortcuts seldom are shortcuts in the long run!

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 4 '06 #10
in article Ru******************************@comcast.com, Jerry Stuckle at
js*******@attglobal.net wrote on 4/3/06 11:19 PM:
OK, I see your point, but did you have a solution to offer? How would you
search for this kind of situation, in order to strip out a body tag?

Luckily, I don't think any of the pages I need to treat this way have that
kind of stuff in them...


First of all, I wouldn't do it like you are. I'd have each page and include
header and footer files as necessary.

But if I *had* to do this (kicking and screaming) I'd go ahead and delete the
tags from the files.


I would say that using templates and reading material into content variables
is a pretty standard way of doing things with php. Almost all of the
php-based forums use this method. The advantage is that you can completely
change the layout and appearance of all the pages at once.

If you have a template, you can have a table on the page, with a $navbar
variable, $header, $footer, and $content variables (maybe several of them),
all positioned inside different cells of the table and display all of the
pages of the site in that template. If you suddenly decide you'd like the
navbar on the other side of the page, or the bottom, or to make the page
background a different color or image, you just change to a different master
template.
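
Roughly what I mean, as a stripped-down sketch rather than my actual code. render_page and the table layout are invented for illustration; the four arguments stand in for the $header, $navbar, $content and $footer variables.

```php
<?php
// Hypothetical master template: one table with four slots.
// Changing the layout for every page means editing only this function.
function render_page(string $header, string $navbar, string $content, string $footer): string
{
    return <<<HTML
<table>
  <tr><td colspan="2">$header</td></tr>
  <tr><td>$navbar</td><td>$content</td></tr>
  <tr><td colspan="2">$footer</td></tr>
</table>
HTML;
}

echo render_page('Site header', 'Nav links', '<p>Page content</p>', 'Site footer');
```

Moving the navbar to the other side is then a matter of swapping the two cells in the middle row.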

With the method you are talking about, it seems you would have to change
each page to completely change the layout. Maybe I don't know that much
about it, but I think you are using each page to do the main formatting, and
then just including a header, footer, etc.? The positioning of those
elements is determined by each page, and where the includes are located.
Whereas the positioning of the elements is determined by the template in the
way I'm doing it.

My problem is that I'm trying to adapt an existing straight HTML site to
this other method. Normally, if creating this method from scratch, you
wouldn't put headers, and HTML and BODY tags in the parts that are intended
for content.

Anyway, I now have some code that strips off the header, and removes the
HTML and BODY tags (thanks to all for the suggestions etc.). I realize this
would not be robust for a general solution, but I'm only applying it to one
existing site that has no fancy stuff like nested comments inside the tags
etc.

For anyone who cares (improvements welcome!):

// split the returned source at the end of the /HEAD section
// (limit 2: split only at the first closing head tag)
$source_array = preg_split('/(<\/head>)/i', $raw_content, 2);

// if the split fails to find a head tag (like there already
// is no header), the second piece of the array will not be there
if (isset($source_array[1])) { // if content is NOT empty
    $headers = "\n<!-- Start Stripped Header\n" . $source_array[0]
             . "\nEnd Stripped Header -->\n"; // debug
    echo $headers; // debug

    $raw_content = $source_array[1];
} else {
    echo "<!-- No Header To Strip -->\n"; // debug

    $raw_content = $source_array[0];
}

// strip out HTML and BODY tags
$pattern_array = array( '/(<body.*?>)/i',
                        '/(<\/body>)/i',
                        '/(<html.*?>)/i',
                        '/(<\/html>)/i' );

// for debugging; later replace with ""
$replace_array = array( '<!-- body start tag was here -->',
                        '<!-- body end tag was here -->',
                        '<!-- html start tag was here -->',
                        '<!-- html end tag was here -->' );

// patterns and replacements are applied pairwise, in order
$raw_content = preg_replace($pattern_array, $replace_array, $raw_content);
--
Stephen Kay
Karma-Lab sk@karma-lab.NOSPAM.com
^^^^^^^
Apr 4 '06 #11
Stephen Kay wrote:
in article Ru******************************@comcast.com, Jerry Stuckle at
js*******@attglobal.net wrote on 4/3/06 11:19 PM:


But if I *had* to do this (kicking and screaming) I'd go ahead and delete the
tags from the files.

I would say that using templates and reading material into content variables
is a pretty standard way of doing things with php. Almost all of the
php-based forums use this method. The advantages are that you can completely
change the layout and appearance of all the pages


Yes, it is.
If you have a template, you can have a table on the page, with a $navbar
variable, $header, $footer, and $content variables (maybe several of them),
all positioned inside different cells of the table and display all of the
pages of the site in that template. If you suddenly decide you'd like the
navbar on the other side of the page, or the bottom, or to make the page
background a different color or image, you just change to a different master
template.

That's one way to do it.
With the method you are talking about, it seems you would have to change
each page to completely change the layout. Maybe I don't know that much
about it, but I think you are using each page to do the main formatting, and
then just including a header, footer, etc.? The positioning of those
elements is determined by each page, and where the includes are located.
Whereas the positioning of the elements is determined by the template in the
way I'm doing it.

Nope. The page itself contains only information specific to that page, i.e.

<!DOCTYPE ...>
<html>
<head>
<title>...</title>
<meta...>
<?php include('header file location'); ?>

The header file contains common header elements, the </HEAD> and <BODY> tags
plus the rest of the header.
Next comes the page-specific content

This is followed by an include for the footer - which finishes the document.

So - formatting for every page can be changed by simply changing the
header/footer page.
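
A self-contained sketch of that arrangement, for illustration only. It writes a throwaway header.php, footer.php and page.php into the temp directory and renders the page; all file names and the banner/footer markup are made up.

```php
<?php
// Build a throwaway site in the temp directory (paths are illustrative only).
$dir = sys_get_temp_dir() . '/include_demo_' . uniqid();
mkdir($dir);

// Shared header: closes <head>, opens <body>, prints the common banner.
file_put_contents("$dir/header.php",
    "</head>\n<body>\n<div id=\"banner\">Site banner</div>\n");

// Shared footer: common footer markup, then closes the document.
file_put_contents("$dir/footer.php",
    "<div id=\"footer\">Site footer</div>\n</body>\n</html>\n");

// One page: only its own title and content, plus the two includes.
$page = <<<'PAGE'
<!DOCTYPE html>
<html>
<head>
<title>Page one</title>
<?php include __DIR__ . '/header.php'; ?>
<p>Page-specific content</p>
<?php include __DIR__ . '/footer.php'; ?>
PAGE;
file_put_contents("$dir/page.php", $page);

// Render the page into a string, the way the web server would serve it.
ob_start();
include "$dir/page.php";
$html = ob_get_clean();
echo $html;

// Clean up the temporary files.
array_map('unlink', glob("$dir/*.php"));
rmdir($dir);
```

Editing header.php or footer.php changes every page that includes them, while each page keeps its own title and URL.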

My problem is that I'm trying to adapt an existing straight HTML site to
this other method. Normally, if creating this method from scratch, you
wouldn't put headers, and HTML and BODY tags in the parts that are intended
for content.

See above.
Anyway, I now have some code that strips off the header, and removes the
HTML and BODY tags (thanks to all for the suggestions etc.). I realize this
would not be robust for a general solution, but I'm only applying it to one
existing site that has no fancy stuff like nested comments inside the tags
etc.


OK, but it sounds like a recipe for disaster to me. You don't have any of that
now - but what about the next guy who comes along and has to maintain the site?
Or even you six months from now?

Personally I'd rather do it right than do it over.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 4 '06 #12
in article m6********************@comcast.com, Jerry Stuckle at
js*******@attglobal.net wrote on 4/4/06 4:02 PM:
Nope. The page itself contains only information specific to that page, i.e.

<!DOCTYPE ...>
<html>
<head>
<title>...</title>
<meta...>
<?php include('header file location')

The header file contains common header elements, the </HEAD> and <BODY> tags
plus the rest of the header.
Next comes the page-specific content

This is followed by an include for the footer - which finishes the document.

So - formatting for every page can be changed by simply changing the
header/footer page.

I see. Yes, now that I think about it, you could do it that way too. You
basically are creating a template out of the two pieces of the header and
the footer, and chopping them in half, and the content area (which is on the
main page) ends up in the middle, say inside a table cell.

That would be a good way for a site you are writing from scratch.

But for my task at hand, which is "turn a straight HTML site with 100+ pages
into a dynamically served php site with consistent header, navbar, footer
areas, without rewriting, editing and renaming every page", this way works
better/easier.

(Because, in your case, every page needs to end in .php. In my case, they
can just stay as .html pages. So most of them don't even need to be touched.
They just work in the new format.) And most of the existing search engine
links and links on other sites' pages that point to the site still work.

I accept that there's some risk of things breaking. Someday I'll rewrite the
whole thing from scratch. But that wasn't the goal at this point.

But thanks for the tips!

--
Stephen Kay
Karma-Lab sk@karma-lab.NOSPAM.com
^^^^^^^
Apr 5 '06 #13
Stephen Kay wrote:
in article m6********************@comcast.com, Jerry Stuckle at
js*******@attglobal.net wrote on 4/4/06 4:02 PM:

Nope. The page itself contains only information specific to that page, i.e.

<!DOCTYPE ...>
<html>
<head>
<title>...</title>
<meta...>
<?php include('header file location')

The header file contains common header elements, the </HEAD> and <BODY> tags
plus the rest of the header.
Next comes the page-specific content

This is followed by an include for the footer - which finishes the document.

So - formatting for every page can be changed by simply changing the
header/footer page.
I see. Yes, now that I think about it, you could do it that way too. You
basically are creating a template out of the two pieces of the header and
the footer, and chopping them in half, and the content area (which is on the
main page) ends up in the middle, say inside a table cell.

That would be a good way for a site you are writing from scratch.


Yep, except I don't use tables for content. I use tables for tables :-).

Also, each page has its own static URL; I don't need to use GET or POST
parameters to identify the page.
But for my task at hand, which is "turn a straight HTML site with 100+ pages
into a dynamically served php site with consistent header, navbar, footer
areas, without rewriting, editing and renaming every page", this way works
better/easier.

(Because, in your case, every page needs to end in .php. In my case, they
can just stay as .html pages. So most of them don't even need to be touched.
They just work in the new format.) And most of the existing search engine
links and links on other site's pages that point to the site still work.

No, they don't *have* to end in .php. That was an example. For the ones which
don't need php code, I can use SSI to include the header and footer.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 5 '06 #14
in article dP********************@comcast.com, Jerry Stuckle at
js*******@attglobal.net wrote on 4/4/06 9:36 PM:
I see. Yes, now that I think about it, you could do it that way too. You
basically are creating a template out of the two pieces of the header and
the footer, and chopping them in half, and the content area (which is on the
main page) ends up in the middle, say inside a table cell.

That would be a good way for a site you are writing from scratch.

Yep, except I don't use tables for content. I use tables for tables :-).


Too bad you're missing out on all the other cool things they can do, then.
;-P

It would be impossible to format a typical php-based forum, in any sort of
decent looking way, for example, without tables holding the different
content pieces.

BTW, if you wanted to include a navbar in a thin column down the side of the
page, and the content next to it, how would *you* do it?
No, they don't *have* to end in .php. That was an example. For the ones
which
don't need php code, I can use SSI to include the header and footer.


But that has its own problems - then they can end in .shtml, or you can
start doing all sorts of Apache tricks and chmod +x -ing the files, and it
again turns into a bunch of additional work, on the hundreds of files in a
bunch of nested folders. Much easier to just do what I'm doing. But if I was
going to write it from scratch, then I would choose some other way.
--
Stephen Kay
Karma-Lab sk@karma-lab.NOSPAM.com
^^^^^^^
Apr 5 '06 #15
Stephen Kay wrote:
in article dP********************@comcast.com, Jerry Stuckle at
js*******@attglobal.net wrote on 4/4/06 9:36 PM:

I see. Yes, now that I think about it, you could do it that way too. You
basically are creating a template out of the two pieces of the header and
the footer, and chopping them in half, and the content area (which is on the
main page) ends up in the middle, say inside a table cell.

That would be a good way for a site you are writing from scratch.

Yep, except I don't use tables for content. I use tables for tables :-).

Too bad you're missing out on all the other cool things they can do, then.
;-P


Nope. I can do anything without tables.
It would be impossible to format a typical php-based forum, in any sort of
decent looking way, for example, without tables holding the different
content pieces.

Not at all.
BTW, if you wanted to include a navbar in a thin column down the side of the
page, and the content next to it, how would *you* do it?


CSS.

No, they don't *have* to end in .php. That was an example. For the ones
which
don't need php code, I can use SSI to include the header and footer.

But that has its own problems - then they can end in .shtml, or you can
start doing all sorts of Apache tricks and chmod +x -ing the files, and it
again turns into a bunch of additional work, on the hundreds of files in a
bunch of nested folders. Much easier to just do what I'm doing. But if I was
going to write it from scratch, then I would choose some other way.


Nope. They can end in .html. A minor change to the httpd.conf file fixes that.
And since all .html files would use SSI, there's no extra overhead - they'd
all have to be parsed anyway. And no chmod-ing the files at all.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 5 '06 #16
What tags are you using?

If you're just using:

<?
//Some code here
?>

Then try

<?php
//Some code here
?>

Leveller
Apr 5 '06 #17
