
A small regex question

Hello,

This code:

<?php
preg_replace_callback("/(a)|(b)/",
create_function('$x',
'foreach ($x as $y => $z)'.
'echo "\\\$y: $z<br>";'),
"b");
?>

Outputs:

\0: b
\1:
\2: b

I expect 'b' to be placed in \1, but for some reason \1 contains a
null value and 'b' is moved to \2. Can someone please explain why it
happens?

Thank you
Jul 17 '05 #1
3 Replies


On 29 Oct 2003 16:00:25 -0800, on*******@hotmail.com (Online) wrote:
> This code:
>
> <?php
> preg_replace_callback("/(a)|(b)/",
> create_function('$x',
> 'foreach ($x as $y => $z)'.
> 'echo "\\\$y: $z<br>";'),
> "b");
> ?>
>
> Outputs:
>
> \0: b
> \1:
> \2: b
>
> I expect 'b' to be placed in \1, but for some reason \1 contains a
> null value and 'b' is moved to \2. Can someone please explain why it
> happens?


You have two capturing subexpressions; 'b' is in the second one.

Did you mean (a|b) instead?
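
A minimal sketch of that suggestion, assuming the goal is simply "capture whichever letter matched". The helper name first_group is hypothetical and only exists for this illustration; preg_match is the standard PCRE function.

```php
<?php
// Hypothetical helper: returns what group 1 captured, or null if the
// pattern did not match at all.
function first_group(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? ($m[1] ?? null) : null;
}

// One capturing group wraps the whole alternation, so either letter
// ends up in group 1.
var_dump(first_group('/(a|b)/', 'a')); // string(1) "a"
var_dump(first_group('/(a|b)/', 'b')); // string(1) "b"
```

With a single pair of parentheses there is only one group to fill, so the question of \1 versus \2 never arises.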

--
Andy Hassall (an**@andyh.co.uk) icq(5747695) (http://www.andyh.co.uk)
Space: disk usage analysis tool (http://www.andyhsoftware.co.uk/space)
Jul 17 '05 #2

> You have two capturing subexpressions; 'b' is in the second one.
>
> Did you mean (a|b) instead?


Yea, I have two capturing brackets, but the pattern is separated by |
into two parts - (a) and (b).

Afaik when (a) fails to match, the regex engine should completely
ignore the rest of the current part and move on to (b) and try to
match it against the text. And if (b) matched, the letter 'b' should
be placed inside \1.

The problem here is that \1 gets a null value while 'b' is pushed to
\2, and I don't understand why it acts this way. In mIRC, which
includes an implementation of PCRE, the letter 'b' is placed in \1 (as
I expect), but in PHP it's moved to \2.

I am aware there are other ways of doing this simple match, like
"(a|b)" or "([ab])", but this problem occurred to me with a more
complicated pattern, and I think "(a)|(b)" is the simplest way of
presenting it.
Jul 17 '05 #3

Online wrote:
> Yea, I have two capturing brackets, but the pattern is separated by |
> into two parts - (a) and (b).
>
> Afaik when (a) fails to match, the regex engine should completely
> ignore the rest of the current part and move on to (b) and try to
> match it against the text. And if (b) matched, the letter 'b' should
> be placed inside \1.

No. With a pattern like "(a)|(b)" matching against a string like "b",
$0 contains the entire pattern match; $1 contains the text matched by
the first capturing subpattern (nothing); and $2 contains the text
matched by the second capturing subpattern ("b").
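
To see those three values directly, here is a rough sketch of the original test rewritten with an anonymous function (create_function was removed in PHP 8, so this assumes PHP 7 or later). The variable name $seen is made up for the example; the empty string in slot 1 matches the empty "\1:" line in the output posted above.

```php
<?php
// Record the array the callback receives instead of echoing it.
$seen = [];
preg_replace_callback(
    '/(a)|(b)/',
    function (array $groups) use (&$seen): string {
        $seen = $groups;   // index 0 = whole match, 1 = (a), 2 = (b)
        return $groups[0]; // put the matched text back unchanged
    },
    'b'
);

foreach ($seen as $index => $text) {
    echo "\\$index: $text\n";
}
// \0: b
// \1:
// \2: b
```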

The description of preg_replace's replacement parameter might be
helpful in understanding what's really happening:

| Replacement may contain references of the form \\n or (since PHP
| 4.0.4) $n, with the latter form being the preferred one. Every such
| reference will be replaced by the text captured by the n'th
| parenthesized pattern. n can be from 0 to 99, and \\0 or $0 refers
| to the text matched by the whole pattern. Opening parentheses are
| counted from left to right (starting from 1) to obtain the number
| of the capturing subpattern.

http://www.php.net/manual/en/function.preg-replace.php
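
As an illustration of that numbering rule (a sketch only; the bracketed replacement text is an arbitrary choice for the demo), the same pattern can be run through preg_replace with both references in the replacement. A group that did not take part in the match is replaced by an empty string.

```php
<?php
// (a) opens first, so it is group 1; (b) opens second, so it is group 2,
// no matter which side of the | actually matched.
$result = preg_replace('/(a)|(b)/', '[1=$1 2=$2]', 'ab');
echo $result, "\n"; // [1=a 2=][1= 2=b]
```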

> I am aware there are other ways of doing this simple match, like
> "(a|b)" or "([ab])", but this problem occurred to me with a more
> complicated pattern and I think "(a)|(b)" is the simplest form of
> presenting it.


Those are different to your first pattern. Which do you really want?
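
If what is wanted is "two separate alternatives, but whichever one matches lands in group 1", one option on newer PCRE builds is the branch reset group (?|...). This is only a sketch and assumes the PCRE library bundled with your PHP supports that syntax, which was not the case when this thread was written; the $captured array is just a name chosen for the example.

```php
<?php
// Inside (?| ... ) every alternative restarts the group numbering,
// so both (a) and (b) count as group 1.
$captured = [];
foreach (['a', 'b'] as $subject) {
    if (preg_match('/(?|(a)|(b))/', $subject, $m) === 1) {
        $captured[$subject] = $m[1];
    }
}

print_r($captured); // both 'a' and 'b' are reported from group 1
```

This keeps the two alternatives independent, which matters once each branch is more complicated than a single letter.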

--
Jock
Jul 17 '05 #4
