regular expression - help

can anyone translate this into plain english?

preg_match_all("/(\w+[,. ?])+/U", $text, $words);
Jul 17 '05 #1
kaptain kernel wrote:
can anyone translate this into plain english?

preg_match_all("/(\w+[,. ?])+/U", $text, $words);


http://us2.php.net/manual/en/functio...-match-all.php
-Eric Kincl
Jul 17 '05 #2
kaptain kernel wrote:
can anyone translate this into plain english?

preg_match_all("/(\w+[,. ?])+/U", $text, $words);


/(\w+[,. ?])+/U

\w+ : word character, one or more times

[,. ?] : (looks like) a literal comma, period, space, or question mark

(\w+[,. ?])+ : the above explanations, one or more times

/U : ungreedy (doesn't affect this pattern)
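
To see the first two pieces of that breakdown in isolation, here is a minimal PHP sketch. The sample string and the names $runs and $delims are made up for illustration; the thread never shows the real $text.

```php
<?php
// Hypothetical sample input; the original post does not include one.
$text = "foo_bar42, ok? a.b";

// \w+ : runs of letters, digits, and underscores.
preg_match_all('/\w+/', $text, $runs);
print_r($runs[0]);   // "foo_bar42", "ok", "a", "b"

// [,. ?] : inside a character class, '.' and '?' are literal characters.
preg_match_all('/[,. ?]/', $text, $delims);
print_r($delims[0]); // ",", " ", "?", " ", "."
```

Inside the brackets the period and question mark lose their special meaning, which fits the "looks like" reading of the character class.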

See here for more info:
http://www.comp.leeds.ac.uk/Perl/matching.html
http://www.anaesthetist.com/mnm/perl/regex.htm
http://sitescooper.org/tao_regexps.html

--
Justin Koivisto - sp**@koivi.com
PHP POSTERS: Please use comp.lang.php for PHP related questions,
alt.php* groups are not recommended.

Jul 17 '05 #3
Justin Koivisto wrote:
preg_match_all("/(\w+[,. ?])+/U", $text, $words);


/U : ungreedy (doesn't affect this pattern)


I had thought that initially too.

The pattern has one capturing subpattern, meaning -- with the
default flag for preg_match_all (PREG_PATTERN_ORDER), which
applies in this case as no flag was explicitly specified -- that
the $words array contains two further arrays: one array containing
full pattern matches, and another array containing the capturing
subpattern matches.
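
A rough sketch of that layout, assuming a made-up input string. $byPattern and $bySet are hypothetical names, and PREG_SET_ORDER appears only as a contrast to the default.

```php
<?php
// Hypothetical sample input.
$text = "a b, c.";
$pattern = '/(\w+[,. ?])+/U';

// Default flag (PREG_PATTERN_ORDER): one sub-array per pattern part.
preg_match_all($pattern, $text, $byPattern);
echo count($byPattern), "\n";    // 2: [0] full matches, [1] group 1
echo count($byPattern[0]), "\n"; // 3: one entry per match

// PREG_SET_ORDER: one sub-array per match instead.
preg_match_all($pattern, $text, $bySet, PREG_SET_ORDER);
echo count($bySet), "\n";        // 3: each entry is [full match, group 1]
```

With the default flag the outer array always has two elements for this pattern, however many matches are found.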

But because the U modifier (PCRE_UNGREEDY) is set, both arrays
will have exactly the same contents; that is, they'll both contain
values of one or more word characters followed by a single comma,
period, space, or question mark. That seems redundant to me.
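
A quick way to check this is to run the pattern with and without U on the same input. The sample text below is an assumption, not data from the original poster, and $lazy and $greedy are names invented for the sketch.

```php
<?php
// Hypothetical sample input with two consecutive word+delimiter units.
$text = "hello world, foo. ";

// With U (ungreedy): the outer + stops after a single repetition.
preg_match_all('/(\w+[,. ?])+/U', $text, $lazy);
// $lazy[0] and $lazy[1] are both: "hello ", "world,", "foo."

// Without U (greedy): the outer + swallows consecutive units.
preg_match_all('/(\w+[,. ?])+/', $text, $greedy);
// $greedy[0]: "hello world,", "foo."
// $greedy[1]: "world,", "foo."  (only the last repetition is kept)

var_dump($lazy[0] === $lazy[1]);     // bool(true)
var_dump($greedy[0] === $greedy[1]); // bool(false)
```

Without U the outer + keeps repeating, so the full match spans several words while group 1 keeps only the last one. With U each match is a single word plus its delimiter, which is why the two arrays coincide.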

I wonder what this pattern is supposed to accomplish.

--
Jock
Jul 17 '05 #4
On Mon, 10 Nov 2003 17:16:47 +0000, kaptain kernel <no****@nospam.gov> wrote:
can anyone translate this into plain english?

preg_match_all("/(\w+[,. ?])+/U", $text, $words);


/U isn't a Perl regex modifier; the PCRE manual says it makes quantifiers
non-greedy by default, though.

For the rest, YAPE::Regex::Explain (a useful Perl module) comes up with:
(after appending '?' to all the quantifiers to make them non-greedy)

The regular expression:

(?-imsx:(\w+?[,. ?])+?)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1 (1 or more times
                           (matching the least amount possible)):
----------------------------------------------------------------------
    \w+?                     word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the least amount
                             possible))
----------------------------------------------------------------------
    [,. ?]                   any character of: ',', '.', ' ', '?'
----------------------------------------------------------------------
  )+?                      end of \1 (NOTE: because you're using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

--
Andy Hassall (an**@andyh.co.uk) icq(5747695) (http://www.andyh.co.uk)
Space: disk usage analysis tool (http://www.andyhsoftware.co.uk/space)
Jul 17 '05 #5
