Regex help

Hello everyone,

I am puzzled by PHP's handling of regex. Here's the code:

<?php

$str = "aabcc";
$pattern = "/((a+)b?(c+))/";

preg_match_all($pattern, $str, $matches);
// Print the whole $matches array (not just $matches[0]) to get the output shown below.
print_r($matches);

?>

The behaviour I expect from the above would be to match:
a
aa
c
cc
abc
aabc
abcc
aabbcc

The output is ALWAYS the maximum strings:

Array
(
    [0] => Array
        (
            [0] => aabcc
        )

    [1] => Array
        (
            [0] => aabcc
        )

    [2] => Array
        (
            [0] => aa
        )

    [3] => Array
        (
            [0] => cc
        )

)

Any idea why the substrings are not picked up?

Thanks
Patrick
Dec 17 '07 #1
..oO(Patrick Drouin)
> I am puzzled by PHP's handling of regex. Here's the code:
>
> <?php
>
> $str = "aabcc";
> $pattern = "/((a+)b?(c+))/";
>
> preg_match_all($pattern, $str, $matches);
> print_r($matches);
>
> ?>
>
> The behaviour I expect from the above would be to match:
> a
> aa
> c
> cc
> abc
> aabc
> abcc
> aabbcc

Nope. What's returned is the entire matched string and all parenthesized
substrings (if there are any), but not every intermediate match the engine
tries during execution.
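
To illustrate the difference: a regex call reports one result per match position, so collecting every substring that satisfies the pattern needs a loop of your own. A minimal sketch, assuming a hypothetical helper all_matching_substrings() (not a PHP built-in) and an anchored variant of the pattern without capture groups:

```php
<?php
// Hypothetical helper (not part of PHP): tests every contiguous
// substring of $str against an anchored pattern.
function all_matching_substrings(string $str, string $anchoredPattern): array
{
    $found = [];
    $len = strlen($str);
    for ($start = 0; $start < $len; $start++) {
        for ($size = 1; $start + $size <= $len; $size++) {
            $candidate = substr($str, $start, $size);
            // ^...$ forces the whole candidate to match, not just a part of it.
            if (preg_match($anchoredPattern, $candidate) === 1) {
                $found[$candidate] = true; // array keys de-duplicate repeats
            }
        }
    }
    return array_keys($found);
}

$result = all_matching_substrings("aabcc", '/^a+b?c+$/');
print_r($result); // aabc, aabcc, abc, abcc
```

Note that "a", "aa", "c" and "cc" are still absent: the pattern requires at least one a and at least one c in every match.
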

> The output is ALWAYS the maximum strings:

Correct.

> Array
> (
>     [0] => Array
>         (
>             [0] => aabcc
>         )
>
>     [1] => Array
>         (
>             [0] => aabcc
>         )
>
>     [2] => Array
>         (
>             [0] => aa
>         )
>
>     [3] => Array
>         (
>             [0] => cc
>         )
>
> )
> Any idea why the substrings are not picked up?

The above is exactly what you told preg_match_all() to return:

0: the entire matched string
1: the first subpattern: ((a+)b?(c+)) => the entire string again
2: the second subpattern: (a+) => aa
3: the third subpattern: (c+) => cc
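
The numbering above follows the order of the opening parentheses, left to right. A minimal sketch using the code from the original post (the echo labels are only illustrative):

```php
<?php
$str = "aabcc";
$pattern = "/((a+)b?(c+))/";

// preg_match_all() returns the number of full matches found.
$count = preg_match_all($pattern, $str, $matches);

// Default ordering (PREG_PATTERN_ORDER): $matches[n] lists what group n
// captured in each full match; group 0 is the whole match.
echo "full matches: $count\n";
echo "0 (whole match): {$matches[0][0]}\n";
echo "1 ((a+)b?(c+)):  {$matches[1][0]}\n";
echo "2 (a+):          {$matches[2][0]}\n";
echo "3 (c+):          {$matches[3][0]}\n";
```

There is exactly one full match in "aabcc", so each of the four sub-arrays holds a single entry.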

Micha
Dec 17 '07 #2
Hello Michael,

> Nope. What's returned is the entire matched string and all parenthesized
> substrings (if there are any), but not every intermediate match the engine
> tries during execution.
>
> The above is exactly what you told preg_match_all() to return:

Well OK, let me rephrase then: how can I tell PHP to match the
substrings? In my mind, (a+) means a, aa, aaa, ... and not only the
maximum string. I don't see how that behaviour is logical in any way.

Thanks,
Patrick
Dec 17 '07 #3
Patrick Drouin wrote:
> $pattern="/((a+)b?(c+))/";

$pattern="/(a+)b?(c+)/";

HTH

--
Toby A Inkster BSc (Hons) ARCS
Dec 18 '07 #4
On Mon, 17 Dec 2007 17:03:19 +0100, Patrick Drouin <no**@none.com> wrote:
> Hello everyone,
>
> I am puzzled by PHP's handling of regex. Here's the code:
>
> <?php
>
> $str = "aabcc";
> $pattern = "/((a+)b?(c+))/";
>
> preg_match_all($pattern, $str, $matches);
> print_r($matches);
>
> ?>
>
> The behaviour I expect from the above would be to match:
> a
> aa
> c
> cc
> abc
> aabc
> abcc
> aabbcc

Is this one string or a set of strings you're trying to match? Only the
5th, 6th and 7th lines will match this pattern...

> The output is ALWAYS the maximum strings:

What do you mean by 'maximum'? I see nothing weird here...
--
Rik Wasmus
Dec 18 '07 #5
Moi
Hello,

On 18 Dec, 05:34, Toby A Inkster <usenet200...@tobyinkster.co.uk>
wrote:
>> Patrick Drouin wrote:
>> $pattern="/((a+)b?(c+))/";
>
> $pattern="/(a+)b?(c+)/";

If you try this, you will see that it spits out

[0] => aaabccc
[0] => aaa
[0] => ccc

That's not what I'm looking for...

Thanks,
P
Dec 18 '07 #6
Moi
Hello Rik,

On 18 Dec, 06:52, "Rik Wasmus" <luiheidsgoe...@hotmail.com> wrote:
>> <?php
>> $str = "aabcc";
>> $pattern = "/((a+)b?(c+))/";
>> preg_match_all($pattern, $str, $matches);
>> print_r($matches);
>> ?>
>> The behaviour I expect from the above would be to match:
>> a
>> aa
>> c
>> cc
>> abc
>> aabc
>> abcc
>> aabbcc
>> The output is ALWAYS the maximum strings:
>
> What do you mean by 'maximum'? I see nothing weird here...

OK, what I don't understand is why "a" is not captured, as (a+) means
"a" or repeated "a"...

Thanks,
P
Dec 18 '07 #7
On Tue, 18 Dec 2007 15:31:14 +0100, Moi <pa************@gmail.com> wrote:
> Hello Rik,
>
> On 18 Dec, 06:52, "Rik Wasmus" <luiheidsgoe...@hotmail.com> wrote:
>>> <?php
>>> $str = "aabcc";
>>> $pattern = "/((a+)b?(c+))/";
>>> preg_match_all($pattern, $str, $matches);
>>> print_r($matches);
>>> ?>
>>> The behaviour I expect from the above would be to match:
>>> a
>>> aa
>>> c
>>> cc
>>> abc
>>> aabc
>>> abcc
>>> aabbcc
>>> The output is ALWAYS the maximum strings:
>>
>> What do you mean by 'maximum'? I see nothing weird here...
>
> OK, what I don't understand is why "a" is not captured, as (a+) means
> "a" or repeated "a"...

Because /(a+)b?(c+)/ means:
One or more 'a'
Optionally followed by one 'b'
Followed by one or more 'c'

If 'a' is taken as a string on its own, it will not match because there is
no 'c' after it. If it is part of the multi-line string you gave us above,
it still will not match: neither 'a' nor 'aa' is followed by 'c' (or by 'b'
and some 'c's); each is followed by a newline character (\n).
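
This reading can be checked by running the pattern against each expected line as a separate string. A minimal sketch (the variable names are illustrative, not from the thread):

```php
<?php
$pattern = "/(a+)b?(c+)/";
$candidates = ["a", "aa", "c", "cc", "abc", "aabc", "abcc", "aabbcc"];

$matching = [];
foreach ($candidates as $candidate) {
    // preg_match() returns 1 when the pattern is found somewhere in the string.
    if (preg_match($pattern, $candidate) === 1) {
        $matching[] = $candidate;
    }
}

print_r($matching); // abc, aabc, abcc
```

"aabbcc" fails as well, because b? allows at most one 'b' between the a's and the c's.
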
--
Rik Wasmus
Dec 18 '07 #8
Moi
Thanks Rik, I guess that makes sense.
P
Dec 18 '07 #9
..oO(Patrick Drouin)
> Hello Michael,
>
>> Nope. What's returned is the entire matched string and all parenthesized
>> substrings (if there are any), but not every intermediate match the engine
>> tries during execution.
>>
>> The above is exactly what you told preg_match_all() to return:
>
> Well OK, let me rephrase then: how can I tell PHP to match the
> substrings? In my mind, (a+) means a, aa, aaa, ... and not only the
> maximum string. I don't see how that behaviour is logical in any way.

That's how regular expressions work in general. The only thing you can
control in many regex engines is whether a quantifier stops at the shortest
possible match (ungreedy) or continues to the longest possible match
(greedy), which is usually the default.
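
A minimal sketch of that switch in PHP's PCRE functions, using a simplified pattern /a+/ (an assumption for illustration, not the pattern from the original post):

```php
<?php
$str = "aabcc";

preg_match('/a+/', $str, $greedy);   // greedy (default): takes as many a's as possible
preg_match('/a+?/', $str, $lazy);    // a trailing ? makes this quantifier ungreedy
preg_match('/a+/U', $str, $flagged); // the U modifier inverts greediness for the whole pattern

echo $greedy[0], "\n";  // aa
echo $lazy[0], "\n";    // a
echo $flagged[0], "\n"; // a
```

Either way, each call reports a single result per match position, never the full list a, aa, ...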

Micha
Dec 18 '07 #10
Greetings, Moi.
In reply to your message dated Tuesday, December 18, 2007, 17:29:22:

> Hello,
> On 18 Dec, 05:34, Toby A Inkster <usenet200...@tobyinkster.co.uk>
> wrote:
>>> Patrick Drouin wrote:
>>> $pattern="/((a+)b?(c+))/";
>>
>> $pattern="/(a+)b?(c+)/";
>
> If you try this, you will see that it spits out
> [0] => aaabccc
> [0] => aaa
> [0] => ccc

Actually, it is

[0] => aaabccc
[1] => aaa
[2] => ccc

which is HIGHLY different.

> That's not what I'm looking for...

RTFM FTW.
Index [0] always holds the whole matched [sub]string.
Just ignore the [0] entry if you do not want to deal with it.
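
A minimal sketch of dropping that entry, assuming preg_match() (which fills a flat array) rather than preg_match_all(), and the "aaabccc" input quoted above:

```php
<?php
preg_match('/(a+)b?(c+)/', 'aaabccc', $m);
print_r($m); // [0] => aaabccc, [1] => aaa, [2] => ccc

// Drop the whole-match entry and keep only the captured groups.
$groups = array_slice($m, 1);
print_r($groups); // [0] => aaa, [1] => ccc
```
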
--
Sincerely Yours, AnrDaemon <an*******@freemail.ru>

Dec 21 '07 #11
