Bytes | Software Development & Data Engineering Community

How can I improve this regex?

I am not sure if this can be solved with regex;
possibly the string needs to be chopped into words
and then stepped through (but I'm not sure how).

Anyway, this is what I have, and it is very close to what
I want:
    preg_match_all("#((?:\b\w{1,20}\b\s+){2})#", $data, $matches);
Here is part of the output from print_r($matches):

[32] => technical support [33] => services attempt [34] => to help
[35] => the user [36] => solve specific [37] => problems with

As you can see, the data is just being divided into two-word chunks.

And I am missing half of the possible phrases, e.g. "support services"
is not reported.

This is not quite what I expected.

What I wanted was a list of all the two-word phrases,
so I should be getting:

[32] => technical support [33] => support services [34] => services attempt
[35] => attempt to [36] => to help [37] => help the [38] => the user

You see the overlap?
This ensures that I do get all the phrases.

Any ideas on how I would need to change my
regex to achieve my desired output?

If that is not possible, how else can I achieve it?
Nov 7 '09 #1
There's nothing you can do with this pattern while using preg_match_all().

After the first match is found, each subsequent search continues from the end of the last match, so the matches never overlap.
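
To illustrate the behaviour described above, here is a minimal sketch rather than a definitive answer. The helper names consumed_pairs() and overlapping_pairs() are hypothetical and not from the thread. The first helper runs the pattern from the question unchanged. The second uses a workaround the thread does not mention: the capture group sits inside a zero-width lookahead, so each match consumes no text and the next search can start at the following word.

```php
<?php
// Hypothetical helper names; neither function appears in the thread.

/** Pattern from the question: each match consumes both words. */
function consumed_pairs(string $data): array {
    preg_match_all('#((?:\b\w{1,20}\b\s+){2})#', $data, $matches);
    // The original pattern also captures the trailing whitespace; trim it.
    return array_map('trim', $matches[1]);
}

/** Lookahead variant: the overall match is empty, only group 1 holds text. */
function overlapping_pairs(string $data): array {
    // \b anchors the attempt at a word boundary; the lookahead then
    // captures "word whitespace word" without consuming any characters.
    preg_match_all('#\b(?=(\w{1,20}\s+\w{1,20})\b)#', $data, $matches);
    return $matches[1];
}

print_r(overlapping_pairs('technical support services attempt to help'));
```

For the sample text "technical support services attempt to help", consumed_pairs() returns two pairs ("technical support", "services attempt"), while overlapping_pairs() returns all five, including "support services". Note that the lookahead variant does not need trailing whitespace after the last word, so it also picks up a final pair that the original pattern would skip.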
Nov 7 '09 #2
So I think the "b" part of the question comes into play.

Any suggestions?
Nov 7 '09 #3
\b = word boundary

otherwise see above quote
Nov 7 '09 #4
I don't see a way to do this using regexp alone. It just searches for patterns; it doesn't do logic.

You could just split the string into individual words and have PHP pair them together, two at a time.
Use a loop that goes through each word in the array and pairs it with the next word in the list, adding each pair to a second array.

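    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
    $pairs = array();
    // Run through the last word so the final pair is included.
    for ($i = 1; $i < count($words); ++$i) {
      $pairs[] = $words[$i - 1] . " " . $words[$i];
    }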
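
The split-and-pair idea extends naturally to phrases of any length. The following is a minimal sketch under stated assumptions: the function name word_phrases() and the $n parameter are hypothetical and not part of the thread, and words are assumed to be separated by whitespace only.

```php
<?php
// Hypothetical helper; the name and the $n parameter are illustrative only.
function word_phrases(string $input, int $n = 2): array {
    if ($n < 1) {
        return [];
    }
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
    $phrases = [];
    // Slide a window of $n words across the list, one word at a time.
    for ($i = 0; $i + $n <= count($words); ++$i) {
        $phrases[] = implode(' ', array_slice($words, $i, $n));
    }
    return $phrases;
}

print_r(word_phrases('technical support services attempt to help'));
```

With the default $n = 2 this yields the overlapping two-word phrases asked for in the question; passing 3 would give three-word phrases instead. Punctuation stays attached to its word here, so the input may need cleaning first.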
Nov 7 '09 #5
