Regular Expression: Backreference Inside Look-ahead

Clint Pachl
#1: Jul 17 '05
Are backreferences inside look-aheads permitted?

I am trying to remove a name=value pair from a URL's query string
using PHP's preg_replace.

Here's what I am trying to accomplish:

// query string tests
$qs = 'var2=two&var1=one&var3=three'; // start
$qs = 'var1=one&var2=two&var3=three'; // middle
$qs = 'var1=one&var3=three&var2=two'; // end

// re - here's my best guess
preg_replace( '/&(?=(?:\1$))(var2=[^&]*)&?/', '', $qs );

// the output I'd like from preg_replace inputting any of the three $qs
'var1=one&var3=three'

Any help appreciated.

John Dunlop
#2: Jul 17 '05

Clint Pachl wrote:
> Are backreferences inside look-aheads permitted?

Yes.
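
To illustrate the "yes": a back reference can sit inside a look-ahead as long as its group has already captured something by the time the look-ahead is evaluated. A minimal sketch, with a made-up subject string and variable names that are not from the original post:

```php
<?php
// Group 1 captures one word character; the look-ahead then checks
// that the same character follows, without consuming it.
$found = preg_match('/(\w)(?=\1)/', 'hello', $m);

// $m[0] is the matched text, $m[1] is what group 1 captured.
var_dump($found, $m);
```

Here the first 'l' of 'hello' is matched because another 'l' follows it; the second 'l' is only looked at, not consumed.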

> I am trying to remove a name=value pair from a URL's query string
> using PHP's preg_replace.
>
> Here's what I am trying to accomplish:
>
> // query string tests
> $qs = 'var2=two&var1=one&var3=three'; // start
> $qs = 'var1=one&var2=two&var3=three'; // middle
> $qs = 'var1=one&var3=three&var2=two'; // end
>
> // re - here's my best guess
> preg_replace( '/&(?=(?:\1$))(var2=[^&]*)&?/', '', $qs );

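This, I believe, always fails. When the back reference is
reached, its subpattern hasn't yet captured anything, and in
PCRE a back reference to a group that hasn't captured fails
to match. Even if it were treated as an empty string, the
look-ahead would demand the end [1] of the subject string
right after the ampersand, while the rest of the pattern
still has to match '(var2=[^&]*)&?' there.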
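
One way to check that claim is to run the original pattern over the three test strings and compare each result with its input. A small sketch; the $tests, $pattern and $results names are only for illustration. If the reasoning above holds, every string should come back untouched:

```php
<?php
// The three test strings from the question.
$tests = [
    'var2=two&var1=one&var3=three', // start
    'var1=one&var2=two&var3=three', // middle
    'var1=one&var3=three&var2=two', // end
];

// The original attempt: \1 is evaluated before group 1 has captured anything.
$pattern = '/&(?=(?:\1$))(var2=[^&]*)&?/';

$results = [];
foreach ($tests as $qs) {
    $result = preg_replace($pattern, '', $qs);
    $results[] = $result;
    // Print input and output side by side for comparison.
    echo $qs, ' => ', $result, "\n";
}
```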

> // the output I'd like from preg_replace inputting any of the three $qs
> 'var1=one&var3=three'

You don't need lookaheads:

`(&)?var2=[^&]*(?(1)|&)`

First match and capture an optional ampersand, followed by
'var2=' and zero or more characters excluding ampersands.
Then, if the optional ampersand matched, don't match
anything else; if it didn't, match an ampersand.
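
The behaviour described above can be tried against the three query strings from the question. A minimal sketch (the $tests and $results names are assumptions made for the example, not part of the original posts):

```php
<?php
// The three placements of var2 from the question: start, middle, end.
$tests = [
    'var2=two&var1=one&var3=three',
    'var1=one&var2=two&var3=three',
    'var1=one&var3=three&var2=two',
];

// (&)?      optionally capture a leading ampersand as group 1
// var2=...  the pair to remove, up to the next ampersand
// (?(1)|&)  if group 1 matched, match nothing more; otherwise match a trailing &
$pattern = '/(&)?var2=[^&]*(?(1)|&)/';

$results = [];
foreach ($tests as $qs) {
    $results[] = preg_replace($pattern, '', $qs);
}

print_r($results);
```

Each of the three inputs should reduce to 'var1=one&var3=three': at the start the trailing ampersand is removed with the pair, while in the middle and at the end the leading one is.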


[1] 'End' meaning, by default, either the very end or right
before a newline that is the last character.
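
The footnote can be seen in action with a subject that ends in a newline. A short sketch, assuming a made-up subject string; the D modifier and \z are standard PCRE features in PHP, while the variable names are only illustrative:

```php
<?php
$subject = "var2=two\n"; // subject whose last character is a newline

// Default: $ matches at the very end or just before a final newline.
$default = preg_match('/two$/', $subject);

// D modifier (PCRE_DOLLAR_ENDONLY): $ matches only at the very end.
$endOnly = preg_match('/two$/D', $subject);

// \z always means the very end, regardless of modifiers.
$absolute = preg_match('/two\z/', $subject);

var_dump($default, $endOnly, $absolute);
```

Only the first call finds a match; the other two refuse because the newline sits between 'two' and the true end of the string.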

--
Jock
Clint Pachl
#3: Jul 17 '05

John Dunlop <usenet+2004@john.dunlop.name> wrote in message news:<MPG.1bc4f21c4e33aa7a98977a@News.Individual.NET>...
> You don't need lookaheads:
>
> `(&)?var2=[^&]*(?(1)|&)`

This expression works like a charm.

preg_replace( '/(&)?var2=[^&]*(?(1)|&)/', '', $qs )
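
If the parameter name is not always var2, the same expression could be wrapped in a small function. This is only a sketch: remove_query_param is a hypothetical helper name, not something from the thread, and it assumes a PHP version with scalar type declarations.

```php
<?php
// Hypothetical helper (not from the thread): removes one name=value pair
// from a query string using the conditional pattern shown above.
function remove_query_param(string $qs, string $name): string
{
    // preg_quote escapes regex metacharacters in the name; '/' is the delimiter.
    $quoted = preg_quote($name, '/');
    $pattern = '/(&)?' . $quoted . '=[^&]*(?(1)|&)/';

    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '', $qs) ?? $qs;
}

echo remove_query_param('var1=one&var2=two&var3=three', 'var2'), "\n";
```

Two limits of this sketch are worth knowing. A query string consisting only of var2=two is left as it is, because with no leading ampersand the conditional still demands a trailing one. The name is also not anchored to a parameter boundary, so a longer name that ends in var2 would be matched as well.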

> First match and capture an optional ampersand, followed by
> 'var2=' and zero or more characters excluding ampersands.
> Then, if the optional ampersand matched, don't match
> anything else; if it didn't, match an ampersand.
>
> [1] 'End' meaning, by default, either the very end or right
> before a newline that is the last character.

It's nice when the solution is simple and elegant.
Thanks John.