A (hopefully) simple regular expression question...

For anyone who can't be bothered to read my code and examples, scroll
to the bottom, the question's there. Thanks.

I'm using PHP and regular expressions to convert BBCode-style markup to
HTML. My code to convert something like this:

[quote Bob]
Hello there
[/quote]

to something like this:

<fieldset>
<legend>&nbsp;Bob&nbsp;</legend>
<div class="quote">Hello there</div>
</fieldset>

goes something like this:

$ret = preg_replace(
    "/\[quote (.+?)\](.+?)\[\/quote\]/i",
    "\n <fieldset>\n <legend>&nbsp;\\1&nbsp;</legend>\n <div class=\"quote\">\\2</div>\n </fieldset>\n ",
    $ret
);

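
One caveat worth checking with a pattern like this: without the "s" modifier, "." does not match a newline, so a quote whose body spans several lines (as in the [quote Bob] example) would be left untouched. A minimal sketch of the difference, assuming the text still contains raw newlines at this point and using a shortened <q> replacement purely for illustration:

```php
<?php
// Assumption: the quote body still contains raw newlines when the regex runs.
$input = "[quote Bob]\nHello there\n[/quote]";

// Without "s", "." stops at the first newline, so nothing is replaced.
$without = preg_replace("/\[quote (.+?)\](.+?)\[\/quote\]/i", "<q>\\2</q>", $input);

// With "s", "." also matches newlines and the multi-line body is captured.
$with = preg_replace("/\[quote (.+?)\](.+?)\[\/quote\]/is", "<q>\\2</q>", $input);

var_dump($without === $input); // bool(true): the input came back unchanged
echo $with, "\n";              // <q>, the body on its own line, then </q>
```

If newlines are already converted to <br> before this step, the original pattern works as posted and the extra modifier is harmless.
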
Now I have some code that cuts the text at a certain point (in order to
get a general preview of a message) using this function:

function cut_post($string, $length = DEFAULT_CUT_LENGTH) {
    if (strlen($string) > $length) { // if the string IS too long...
        $ret = substr($string, 0, $length); // cut it

        // going from the end to the beginning
        for ($i = strlen($ret) - 1; $i >= 0; $i--) {
            // if the current character is an html open tag (<)
            if (substr($ret, $i, 1) == "<") {
                // if we found anything except "</" (i.e. an actual start tag)
                if (substr($ret, $i + 1, 1) != "/") {
                    // return everything up to that point
                    $ret = substr($ret, 0, $i);
                    // and set our pointer to before the string starts
                    // so we don't carry on looking
                    $i = -1;
                } else {
                    $i = -1;
                }
            }
        }

        if (substr($ret, strlen($ret) - 1, 1) == " ")
            $ret = substr($ret, 0, strlen($ret) - 1);

        $ret .= "...";
    } else { // if it's not too long...
        $ret = $string; // don't cut it. amazing!
    }

    return $ret;
}

I have a problem when it happens to cut halfway through a long quote.
The post cutter does its job and cuts everything from the last HTML tag
it may have been halfway through (the <div> inside the <fieldset>), but
it doesn't account for the open <fieldset>, which never gets closed
because the post was cut there. I end up with HTML similar to:

<fieldset>
<legend>&nbsp;Bob&nbsp;</legend>
<div class="quote">...

and then the page continues, messing up my nice formatting. So my
question is:

Is there a way to use a regular expression to search for an open
<fieldset> and close it?
I'm pretty sure there is, but I'm not so good at those negative
regexps. I get that you'd have to search the output HTML for <fieldset>
followed by any string of characters and NOT </fieldset>, and then, if
that matches, add </fieldset> to the end. That also doesn't close the
<div>, although I'm not sure how it manages not to close that (well,
actually cut it out) itself.

Hmm maybe I don't want to just close stuff at all... it might just be
best to remove the quote altogether or something. Anyway, can anyone
help? It's not a major problem, but it ruins my layout, and I like my
layout.
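
The "remove the quote altogether" idea can be done without a regex. The sketch below is only an illustration: drop_unclosed_fieldset is a hypothetical helper, and it assumes quotes are never nested, so only the last <fieldset> can be left open.

```php
<?php
// Hypothetical helper: if the last <fieldset> has no matching </fieldset>
// after it, cut the text back to just before that <fieldset>.
// Assumption: quotes are not nested.
function drop_unclosed_fieldset($html) {
    $open = strripos($html, "<fieldset");     // last opening tag, any case
    if ($open === false) {
        return $html;                         // no quote at all
    }
    $close = strripos($html, "</fieldset>");  // last closing tag, any case
    if ($close !== false && $close > $open) {
        return $html;                         // the last quote is closed
    }
    return rtrim(substr($html, 0, $open));    // drop the half-finished quote
}

echo drop_unclosed_fieldset("Hi <fieldset>\n<legend>Bob</legend>\n<div class=\"quote\">"), "\n";
```

It would be called on the truncated text inside cut_post, before the "..." is appended.
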

Well thanks to anyone who can help, and sorry I've rambled on. Just
trying to give as much info as I can.

Jan 12 '06 #1
al************@gmail.com wrote:

[snip]
Is there a way to use a regular expression to search for an open
<fieldset> and close it?
I'm pretty sure there is, but I'm not so good at those negative
regexps, I get that you'd have to search the output HTML for <fieldset>
followed by any string of characters and NOT </fieldset> and then if
that matches, then add </fieldset> to the end. Although that also
doesn't close the </div> although I'm not sure how it manages not to
close that (well actually cut it out) itself.

Hmm maybe I don't want to just close stuff at all... it might just be
best to remove the quote altogether or something. Anyway, can anyone
help? It's not a major problem, but it ruins my layout, and I like my
layout.

Well thanks to anyone who can help, and sorry I've rambled on. Just
trying to give as much info as I can.


You can use all the negative look-aheads you want, but if you are just
using it with preg_match in an if statement, you may be able to get away
with something along these lines:

$pattern='`<fieldset>.*</fieldset>`isU';
if(preg_match($pattern,$str)){
    // fieldset is closed
}else{
    // fieldset is *not* closed
}
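
Note that this check reports "closed" as soon as one complete <fieldset>...</fieldset> pair exists, so a post with one finished quote followed by a cut-off one would slip through. A possible variation is to count opening and closing tags and append whatever is missing. This is a rough sketch rather than a tested drop-in; close_open_fieldsets is a hypothetical name.

```php
<?php
// Hypothetical helper: append one </fieldset> for every <fieldset>
// that has no closing tag yet.
function close_open_fieldsets($html) {
    // preg_match_all() returns the number of matches it found
    $opened = preg_match_all('`<fieldset\b[^>]*>`i', $html, $m);
    $closed = preg_match_all('`</fieldset>`i', $html, $m);
    $missing = max(0, $opened - $closed);
    return $html . str_repeat("</fieldset>\n", $missing);
}

echo close_open_fieldsets("<fieldset>\n<legend>Bob</legend>\n");
```
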

--
Justin Koivisto, ZCE - ju****@koivi.com
http://koivi.com
Jan 12 '06 #2
Al
I just tested this and I get "fieldset not closed" for all 3 cases
(fieldset works properly, no fieldset was used at all, and fieldset not
closed).

I was mainly testing to see whether it caught no fieldset used at all
as fieldset not closed.

Jan 12 '06 #3
Al wrote:
I just tested this and i get fieldset not closed for all 3 cases
(fieldset works properly, no fieldset was used at all and fieldset not
closed)


Really?? Here's what I get:

<?php
$str=array();
$str[]=<<<EOS
<fieldset>
<legend>&nbsp;Bob&nbsp;</legend>
<div class="quote">Hello there</div>
</fieldset>

EOS;

$str[]=<<<EOS
<fieldset>
<legend>&nbsp;Bob&nbsp;</legend>
<div class="quote">Hello there</div>

EOS;

$str[]=<<<EOS
<legend>&nbsp;Bob&nbsp;</legend>
<div class="quote">Hello there</div>

EOS;

$pattern='`<fieldset>.*</fieldset>`isU';

foreach($str as $html){
echo $html,"---\n";
if(preg_match($pattern,$html)){
// fieldset is closed
echo 'fieldset closed',"\n\n";
}else{
// fieldset is *not* closed
echo 'fieldset not closed',"\n\n";
}
}
?>
Result:
<fieldset>
<legend>&nbsp;Bob&nbsp;</legend>
<div class="quote">Hello there</div>
</fieldset>
---
fieldset closed

<fieldset>
<legend>&nbsp;Bob&nbsp;</legend>
<div class="quote">Hello there</div>
---
fieldset not closed

<legend>&nbsp;Bob&nbsp;</legend>
<div class="quote">Hello there</div>
---
fieldset not closed
--
Justin Koivisto, ZCE - ju****@koivi.com
http://koivi.com
Jan 12 '06 #4
Al
I do apologise. It seems to work well, and I realised that my mistake
was in copying your code directly. The string in my code is $ret and
yours is $str. So basically I'm an idiot :)

Nonetheless, thanks for the help. And sorry I've been so long getting
back on the matter, I've been away for the weekend.

I'll implement the code as soon as I can and work out how to go about
fixing my output (whether I should cut the fieldset off entirely or
leave it and just add trailing '...'s etc.)

I've thought of a few problems I could encounter. If I just leave the
'...'s and close the fieldset, I run the risk of one particular quote
being just long enough to finish the internal <div> but not the
fieldset, e.g.:

<fieldset>
<legend>Bob</legend>
<div>Just long enough to get the end of this in, but not the rest</div>
....

That'll be alright, but the '...'s won't be enclosed in the div. That's
not a *major* problem, but if I want to style the divs more than now,
the styling won't apply to the '...' (although the same styling applied
to fieldset, fieldset * {} will work).
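
One way to keep the '...' inside whichever element is still open is to track open tags on a stack, then append the '...' and close the tags innermost first. This is a sketch only: close_open_tags is a hypothetical helper, and it assumes well-formed markup with no void or self-closing tags such as <br>.

```php
<?php
// Hypothetical helper: append "..." and then close every element that is
// still open, innermost first.
// Assumption: well-formed markup, no void/self-closing tags (e.g. <br>).
function close_open_tags($html, $ellipsis = "...") {
    preg_match_all('`<(/?)([a-z][a-z0-9]*)\b[^>]*>`i', $html, $tags, PREG_SET_ORDER);
    $stack = array();
    foreach ($tags as $tag) {
        $name = strtolower($tag[2]);
        if ($tag[1] === "/") {
            // closing tag: pop the opener if it is on top of the stack
            if (end($stack) === $name) {
                array_pop($stack);
            }
        } else {
            $stack[] = $name; // opening tag: remember it
        }
    }
    $out = $html . $ellipsis;
    while ($stack) {
        $out .= "</" . array_pop($stack) . ">";
    }
    return $out;
}

echo close_open_tags("<fieldset><legend>Bob</legend><div class=\"quote\">Hello"), "\n";
```

When the <div> has already been closed before the cut, as in the example above, the '...' still ends up outside the div but inside the fieldset. Inside cut_post this would replace the plain $ret .= "..." step.
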

Anyway thanks again for helping me out.

Jan 16 '06 #5
Al wrote:
[snip]


Rather than using a fieldset, why not just use blockquote? Those are
meant for quoting others anyway.

--
Justin Koivisto, ZCE - ju****@koivi.com
http://koivi.com
Jan 16 '06 #6
Al
I know, but I wanted something that had a title. The quotes look a bit
like this:

____
--| Al |---------------------
| ¯¯¯¯ |
| Hello there |
|_____________________________|

Currently a fieldset works quite nicely for this task, although I may
eventually move on to a blockquote with a border and a relatively
positioned (top + some pixels) span before it in HTML.

But thanks for the info, and I know my method wasn't exactly
stylistically sound.

Jan 17 '06 #7
