re: Regular Expression: Backreference Inside Look-ahead
John Dunlop <usenet+2004@john.dunlop.name> wrote in message news:<MPG.1bc4f21c4e33aa7a98977a@News.Individual.NET>...
> Clint Pachl wrote:
>
> > Are backreferences inside look-aheads permitted?
>
> Yes.
>
> > I am trying to remove a name=value pair from a URL's query string
> > using PHP's preg_replace.
> >
> > Here's what I am trying to accomplish:
> >
> > // query string tests
> > $qs = 'var2=two&var1=one&var3=three'; // start
> > $qs = 'var1=one&var2=two&var3=three'; // middle
> > $qs = 'var1=one&var3=three&var2=two'; // end
> >
> > // re - here's my best guess
> > preg_replace( '/&(?=(?:\1$))(var2=[^&]*)&?/', '', $qs );
>
> This, I believe, always fails. When the back reference is
> reached, its subpattern hasn't yet captured anything. So
> you're in effect looking ahead for an empty string that is
> at the end [1] of the subject string; but after that you're
> matching '(var2=[^&]*)&?'.
>
> > // the output I'd like from preg_replace inputting any of the three $qs
> > 'var1=one&var3=three'
>
> You don't need lookaheads:
>
> `(&)?var2=[^&]*(?(1)|&)`
This expression works like a charm.
preg_replace( '/(&)?var2=[^&]*(?(1)|&)/', '', $qs )
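For anyone who wants to reproduce this, here is a minimal sketch that runs the expression against the three test strings from the original question. The variable names ($pattern, $tests, $results) are just for illustration.

```php
<?php
// Pattern from the thread: an optional leading '&' is captured in group 1,
// then the var2 pair; the conditional (?(1)|&) consumes a trailing '&'
// only when group 1 did not match.
$pattern = '/(&)?var2=[^&]*(?(1)|&)/';

$tests = [
    'var2=two&var1=one&var3=three', // start
    'var1=one&var2=two&var3=three', // middle
    'var1=one&var3=three&var2=two', // end
];

$results = [];
foreach ($tests as $qs) {
    $results[] = preg_replace($pattern, '', $qs);
}

// Each entry should be 'var1=one&var3=three'.
print_r($results);
```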
>
> First match and capture an optional ampersand, followed by
> 'var2=' and zero or more characters excluding ampersands.
> Then, if the optional ampersand matched, don't match
> anything else; if it didn't, match an ampersand.
>
> [1] 'End' meaning, by default, either the very end or right
> before a newline that is the last character.
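The explanation above can be checked by looking at what the pattern actually consumes in each case. This is only a rough sketch: matched_part() is a hypothetical helper written for this illustration, not anything from PHP itself, and it assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical helper: returns the text the pattern consumes,
// or null when the pattern does not match at all.
function matched_part(string $qs): ?string
{
    $pattern = '/(&)?var2=[^&]*(?(1)|&)/';
    return preg_match($pattern, $qs, $m) === 1 ? $m[0] : null;
}

// No leading '&', so group 1 is unset and the trailing '&' is consumed.
echo matched_part('var2=two&var1=one&var3=three'), "\n"; // var2=two&

// Leading '&' matched, so nothing after the value is consumed.
echo matched_part('var1=one&var2=two&var3=three'), "\n"; // &var2=two
echo matched_part('var1=one&var3=three&var2=two'), "\n"; // &var2=two
```

One case the three test strings don't cover: if 'var2=two' is the entire query string, there is neither a leading nor a trailing ampersand, so the pattern does not match and the string is left unchanged.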
It's nice when the solution is simple and elegant.
Thanks, John.