Regex Nested Backreferences

For my web-based php regex find/replace do-hickey, I need to match
individual back references and wrap a tag around them so they'll be unique
to the rest of the match for individual color markup. Initially this
would seem easy enough, however not all of a potential regex match is
going to be within a back reference. So it's necessary to replace the
back reference, and only the back reference, while preserving the context
of the match. For example, if I were to search the text

fish this fish fish

looking for
.*?(?<=this )(fish).*

I'd match everything, capturing the second instance of fish into the back
reference. I can't simply take the match and run a replace for fish in
order to apply the highlighting, because then I'd end up with 3
highlighted "fish", 2 of which weren't supposed to be. I also couldn't
simply return the back reference with the markup, as that wouldn't return
the non-back referenced stuff.

My initial solution was to run the original find text over the match to
get the back references, using the PREG_OFFSET_CAPTURE flag to have it
return the offset of each back reference. So now I have the location of
the text within the string, and can get its length from the captured text
itself. Replacing from the last back reference to the first, so as not to
disturb the offsets of the ones still to be replaced, it marks up back
references without losing context or data. Perfect.
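
A minimal sketch of that flat approach, assuming PHP 7.1 or later. The name wrap_groups_flat is hypothetical, it uses preg_match rather than preg_match_all to keep things short, and it assumes the groups are not nested and appear in the string in the same order as they are numbered:

```php
<?php
// Hypothetical helper: wrap each (non-nested) capture group in pseudo-markup.
function wrap_groups_flat(string $pattern, string $text): string
{
    // PREG_OFFSET_CAPTURE makes every entry of $m a [captured text, byte offset] pair.
    if (!preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE)) {
        return $text;
    }
    // Walk from the last group to the first, so a replacement never
    // shifts the offset of a group that still has to be replaced.
    for ($n = count($m) - 1; $n > 0; $n--) {
        [$captured, $offset] = $m[$n];
        if ($offset < 0) {
            continue; // this group did not take part in the match
        }
        $text = substr_replace($text, "back{$n}::" . $captured . "::bk", $offset, strlen($captured));
    }
    return $text;
}

echo wrap_groups_flat('/.*?(?<=this )(fish).*/', 'fish this fish fish'), "\n";
// fish this back1::fish::bk fish
```

Only the second "fish" is wrapped, and the text outside the group is left as it was.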

. . . until back references are nested.

In this example:
(.*?(?<=this )(fish).*)

back reference 1 would be fish this fish fish, back reference 2 would be
fish -- here's where the problem surfaces.

If I wrap back reference 2 in the markup, when I apply back reference 1's
markup it's going to apply the end tag in the wrong place since the string
has increased and the original length calculated no longer applies. If I
replace back reference 1 first, same problem. I'm sure there's some
obvious, simple solution I'm overlooking, having exhausted a bunch of
complex attempts to compensate for it. Any fresh perspectives on the best
way to mark up nested groups while preserving the integrity of the return?
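
To make the drift concrete, here is a small illustration, assuming PHP 7.1 or later. It is not the function from this post, just the same two replacements done by hand on the example string; the variable names are made up for the illustration:

```php
<?php
// Illustration only: the offsets preg_match() reports for the nested pattern.
$text = 'fish this fish fish';
preg_match('/(.*?(?<=this )(fish).*)/', $text, $m, PREG_OFFSET_CAPTURE);

[$outer, $outerAt] = $m[1]; // 'fish this fish fish' at offset 0, length 19
[$inner, $innerAt] = $m[2]; // 'fish' at offset 10, length 4

// Wrap the inner group first, as a last-to-first loop would.
$wrapped = substr_replace($text, "back2::{$inner}::bk", $innerAt, strlen($inner));

// The string grew by the two tags, but the length recorded for group 1 did not.
$growth = strlen($wrapped) - strlen($text); // 11 = strlen('back2::') + strlen('::bk')

// Reusing the stale length of 19 ends group 1 eleven bytes too early,
// and the leftover tail of the old string trails behind the end tag.
$broken = substr_replace($wrapped, "back1::{$outer}::bk", $outerAt, strlen($outer));

echo $wrapped, "\n", $broken, "\n";
```

Every tag inserted inside a group adds to that group's real length, so any fix has to either track that growth or avoid editing the string in place.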

Below is the function the matches are being passed through. You'll see I'm
using preg_match_all to get the capture groups as well as the match
location, and then using substr_replace to insert the pseudo-markup.

function hltr($text, $find) {
    // Capture every group together with its byte offset in $text.
    preg_match_all($find, $text, $hlight, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
    if ( isset($_POST['debug']) || isset($_GET['debug']) ) {
        echo "<pre>";
        print_r($hlight);
        echo "</pre>";
    }
    // Only the first match set is used; entry 0 of it is the full match.
    $n = count($hlight[0]) - 1;
    $text = $hlight[0][0][0];
    // Wrap the groups from last to first so earlier offsets stay valid.
    while ( $n > 0 ) {
        $text = substr_replace($text, "back$n::" . $hlight[0][$n][0] . "::bk", $hlight[0][$n][1], strlen($hlight[0][$n][0]));
        $n--;
    }
    return('<strong class="result">' . $text . '</strong>');
}

To see it highlight backreferences correctly:
http://tinyurl.com/aongu
And failing on nested groups
http://tinyurl.com/7jp8c

Thanks . . .

Allen
Feb 6 '06 #1
On Mon, 06 Feb 2006 20:20:58 -0500, bobzimuta <ej******@gmail.com> wrote:
http://roblocher.com/technotes/regexp.aspx


I don't believe you read my message, Bob -- I'm not asking for help with
regex, I know regex. My problem is that I'm trying to take regex and
highlight various aspects of the syntax, in this case the different sub
groups. Had you read the post, you'd have seen that the links to what I'm
working on can do everything and more than what you linked to. Thanks
anyway.

Allen
Feb 7 '06 #3
I skimmed. I saw you wanted to do some highlighting of regex matches.
This guy (Rob Locher) wrote a nice regex highlighter. Thought you could
possibly get something useful out of it (i.e. analyze his algorithm).
You're welcome anyway.

Feb 8 '06 #4
On Tue, 07 Feb 2006 19:50:12 -0500, bobzimuta <ej******@gmail.com> wrote:
I skimmed. I saw you wanted to do some highlighting of regex matches.
This guy (Rob Locher) wrote a nice regex highlighter. Thought you could
possibly get something useful out of it (i.e. analyze his algorithm).
You're welcome anyway.


I'd have appreciated that explanation. At any rate, I'm sorry for my
curt response; I'd spent too many hours with code to be any good with
people. I did put together a solution; the working model is linked
below. I might have to check out his source to see if there's anything I
can glean from it anyway. Thanks.

A.

--
http://ReReplace.com
A Web based regular expressions powered find/replace utility
Feb 8 '06 #5
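
The thread never shows the solution Allen ended up with. One possible approach, sketched here as an assumption and not as what ReReplace does, is to stop editing the string in place: record every start and end tag as an insertion point measured against the original string, sort those points, and rebuild the text in a single pass. The name wrap_groups_nested is hypothetical, PHP 7.1 or later is assumed, and the tie-breaking assumes groups are properly nested (no captures inside lookarounds that overlap other groups):

```php
<?php
// Hypothetical sketch: wrap every capture group, nested or not, in pseudo-markup.
function wrap_groups_nested(string $pattern, string $text): string
{
    if (!preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE)) {
        return $text;
    }

    // Each event is [position in the ORIGINAL string, kind (0 = close, 1 = open),
    // group number, tag text].
    $events = [];
    for ($n = 1; $n < count($m); $n++) {
        [$captured, $offset] = $m[$n];
        if ($offset < 0 || $captured === '') {
            continue; // skip groups that did not participate or captured nothing
        }
        $events[] = [$offset, 1, $n, "back{$n}::"];
        $events[] = [$offset + strlen($captured), 0, $n, "::bk"];
    }

    // Left to right. At the same position: close tags before open tags,
    // inner groups (higher numbers) close first, outer groups open first.
    usort($events, function (array $a, array $b): int {
        if ($a[0] !== $b[0]) {
            return $a[0] <=> $b[0];
        }
        if ($a[1] !== $b[1]) {
            return $a[1] <=> $b[1];
        }
        return ($a[1] === 0) ? ($b[2] <=> $a[2]) : ($a[2] <=> $b[2]);
    });

    // Rebuild the string once, copying original text between insertion points,
    // so no recorded offset is ever invalidated.
    $out = '';
    $pos = 0;
    foreach ($events as $event) {
        $out .= substr($text, $pos, $event[0] - $pos) . $event[3];
        $pos = $event[0];
    }
    return $out . substr($text, $pos);
}

echo wrap_groups_nested('/(.*?(?<=this )(fish).*)/', 'fish this fish fish'), "\n";
// back1::fish this back2::fish::bk fish::bk
```

Because every position refers to the untouched input, the order in which groups are nested no longer matters for the offsets; only the tie-breaking at shared positions does. Unlike the function earlier in the thread, this sketch returns the whole subject string, not just the matched part.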