can't find the right PCRE pattern for a preg_match_all

I'm starting with a string such as "Na**3C**6H**5O**7*2H**20".

I'm attempting to match all **\d+. Once I can match every double
asterisk followed by digits, I intend to wrap the digits in "<sub>" tags
for display purposes.

I have been trying to write the correct pattern for 2 weeks now without
any success. I can get preg_replace() to work with several simple
patterns, but when I use preg_match_all() I get either unintended results
or incomplete matches.

Any assistance in matching all "\*{2}\d+" from the above string would
be appreciated.

Jun 5 '06 #1
Carved in mystic runes upon the very living rock, the last words of
greatprovider of comp.lang.php make plain:
> i'm starting with a string such as "Na**3C**6H**5O**7*2H**20"
>
> im attempting to match all **\d+ ...once i can match all the double
> asterix \d i intend to wrap the \d in "<sub>" tags for display
> purposes.
>
> i have been trying to write the correct pattern for 2 weeks now without
> any success...i can get preg_replace() to work, with several simple
> patterns but when i use preg_match_all, i either get unintended results
> or incomplete matches.


The expression you have works fine for me. You'll have to give more
information about what's happening. "It doesn't work" isn't much to go
on.

--
Alan Little
Phorm PHP Form Processor
http://www.phorm.com/
Jun 5 '06 #2
unformatted: Ca(C**2H**3O**2)**2

formatted: Ca(C<sub>2</sub>H**3O<sub>2</sub>)<sub>2</sub>

using: preg_match_all('/\*{2}[\d]+/', $formula, $my_arr)

Whatever combination of patterns I try, this is as close as I can get.

Jun 5 '06 #3
Carved in mystic runes upon the very living rock, the last words of
greatprovider of comp.lang.php make plain:
> unformated: Ca(C**2H**3O**2)**2
>
> formated: Ca(C<sub>2</sub>H**3O<sub>2</sub>)<sub>2</sub>
>
> using: preg_match_all('/\*{2}[\d]+/', $formula, $my_arr)
>
> any combination of patterns i try, this is as close as i can get.


What is? What's as close as you can get? You still haven't said what
you're getting. And what are you trying to get? What you have for
"formatted", there is no way you're going to get that with
preg_match_all, just by itself. preg_replace might do it for you, but you
said that you have that working. So what exactly are you trying to do
with preg_match_all?

http://www.catb.org/~esr/faqs/smart-questions.html

I just ran your example above, and I get:

Array
(
    [0] => Array
        (
            [0] => **2
            [1] => **3
            [2] => **2
            [3] => **2
        )

)
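
To make the point above concrete, here is a minimal sketch of what preg_match_all() does and does not do. It only collects matches into an array; the input string is never modified. The capture group around \d+ is an addition for illustration and is not part of the pattern quoted above.

```php
<?php
$formula = 'Ca(C**2H**3O**2)**2';

// preg_match_all() returns the number of full matches and fills $m.
// $m[0] lists the full matches, $m[1] lists what group 1 captured.
$count = preg_match_all('/\*{2}(\d+)/', $formula, $m);

print_r($m[0]); // the four "**N" matches
print_r($m[1]); // just the digits: 2, 3, 2, 2

// The subject string is untouched; building the output is a separate step.
echo $formula, "\n";
```

Producing the formatted string from this array still needs a replacement step, which is why preg_replace() is the shorter route.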

--
Alan Little
Phorm PHP Form Processor
http://www.phorm.com/
Jun 5 '06 #4

greatprovider wrote:
> i'm starting with a string such as "Na**3C**6H**5O**7*2H**20"
>
> im attempting to match all **\d+ ...once i can match all the double
> asterix \d i intend to wrap the \d in "<sub>" tags for display
> purposes.
>
> i have been trying to write the correct pattern for 2 weeks now without
> any success...i can get preg_replace() to work, with several simple
> patterns but when i use preg_match_all, i either get unintended results
> or incomplete matches.
>
> any assistance in matching all "\*{2}\d+" from the above string would
> be appreciated.


Why do you need preg_match_all for what you're trying to do?
preg_replace() should be adequate. If not that,
preg_replace_callback().

I am guessing here, but perhaps you're forgetting to capture the
digits?

echo preg_replace('/\*{2}(\d+)/', '<sub>\1</sub>',
"Na**3C**6H**5O**7*2H**20");

Appears to produce a correct representation of Sodium Citrate.
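
For reference, a small sketch of both routes mentioned in this reply. The function names subscript_formula() and subscript_formula_cb() are hypothetical; the pattern and replacement string are the ones from the post. The callback variant assumes PHP 5.3 or later for the anonymous function.

```php
<?php
// Plain preg_replace(): \1 refers to the digits captured by (\d+).
function subscript_formula($formula) {
    return preg_replace('/\*{2}(\d+)/', '<sub>\1</sub>', $formula);
}

// preg_replace_callback(): same result, but the replacement is built in
// PHP code, which helps if the digits need further processing.
function subscript_formula_cb($formula) {
    return preg_replace_callback('/\*{2}(\d+)/', function ($m) {
        return '<sub>' . $m[1] . '</sub>';
    }, $formula);
}

echo subscript_formula('Na**3C**6H**5O**7*2H**20'), "\n";
// Na<sub>3</sub>C<sub>6</sub>H<sub>5</sub>O<sub>7</sub>*2H<sub>20</sub>
```

The single asterisk in "*2H" is left alone because the pattern requires exactly two asterisks before the digits.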

Jun 5 '06 #5
The formulas are stored in a database in the format
"Na**3C**6H**5O**7*2H**20".
I query the database and pass the list of unformatted formulas through
this function in a loop.

function format_replace($formula) {
$search = "/(\*{2}[\d]+)/";
$replace = "<sub>\l</sub>";
$formatted = preg_replace( $search, $replace, $formula);
return $formatted;
}
the result i get for this is: Na\lC\lH\lO\l*2H\l (where "\l" is
subscripted).

Alternatively, I wrote this function using preg_match_all(). You may
notice a few other patterns I have commented out; these too failed my
litmus test, so to speak.

function format_formula($formula) {
    //$search = array ( '/\*{2}[^\d+]*?[\d+\-\s\d+]/' ) ----- ((\*{2})[^\d+]*?([\d+])) -----;
    //preg_match_all('/(\*\*((\d+)|(\d+\w)|(\d+\s)))/', $formula, $my_arr); // get the 2-12 subscripted "/(\*{2}([\d]+[\-][\d]+))/"
    preg_match_all('/\*{2}[\d]+/', $formula, $my_arr);
    var_dump($my_arr);
    foreach ($my_arr as $key => $val) {
        if ($val == NULL || $val == '') {
            return;
        } else {
            for ($i = 0; $i < sizeof($my_arr); $i++) {
                for ($j = 0; $j < sizeof($my_arr[$i]); $j++) {
                    $fixed[$i][$j] = trim($my_arr[$i][$j], "**");
                    $fixedstr = "<sub>" . $fixed[$i][$j] . "</sub>";
                    $formatted = str_replace($my_arr[$i][$j], $fixedstr , $formula);

                    return $formatted;
                }
            }
        }
    }
}
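
A note on why the preg_match_all() version above stops early: the return sits inside the innermost loop, so only the first match is processed, and str_replace() then swaps every occurrence of that one match. That is consistent with the "formatted" result posted earlier, where every **2 was converted but **3 was not. The following is a minimal sketch of one way the preg_match_all() route could be completed, assuming a capture group is added around the digits; format_formula_all() is a hypothetical name.

```php
<?php
function format_formula_all($formula) {
    // $m[0] holds the full matches ("**2"), $m[1] the captured digits ("2").
    if (!preg_match_all('/\*{2}(\d+)/', $formula, $m)) {
        return $formula; // nothing to subscript (or a regex error)
    }

    // Build one replacement map covering every match before replacing.
    $map = array();
    foreach ($m[0] as $i => $whole) {
        $map[$whole] = '<sub>' . $m[1][$i] . '</sub>';
    }

    // strtr() with an array tries the longest keys first, so "**20" is
    // not mistaken for "**2" followed by "0".
    return strtr($formula, $map);
}

echo format_formula_all('Ca(C**2H**3O**2)**2'), "\n";
// Ca(C<sub>2</sub>H<sub>3</sub>O<sub>2</sub>)<sub>2</sub>
```

This is more work than a single preg_replace() call and is shown only to explain the behaviour of the loop.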


Jun 6 '06 #6
Rik
greatprovider wrote:
> the formulas are inputted into a database in the format:
> "Na**3C**6H**5O**7*2H**20"
> i query the database and pass the list of unformated formulas through
> this function in a loop.
>
> function format_replace($formula) {
> $search = "/(\*{2}[\d]+)/";
> $replace = "<sub>\l</sub>";
> $formatted = preg_replace( $search, $replace, $formula);
> return $formatted;
> }
> the result i get for this is: Na\lC\lH\lO\l*2H\l (where "\l" is
> subscripted).


Euhm, I think I found the problem after numerous, all working, examples:

Use the NUMBER 1. It's hard to see the difference in certain fonts, but you
wrote 'l' (lowercase L), not '1'.
$search = '/\*{2}([\d]+)/';
$replace = '<sub>$1</sub>';

Using $N instead of \N is preferred.
Use single quotes so PHP won't try to match a '$' to an existing variable.
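
A short sketch of the reference forms preg_replace() accepts in the replacement string, using the pattern from this post. It also shows one reason single quotes are the safer habit: inside double quotes PHP itself treats "\1" as an octal escape, whereas "\l" is not an escape at all and reaches preg_replace() unchanged, which matches the literal \l in the output reported above.

```php
<?php
$formula = 'Na**3C**6H**5O**7*2H**20';
$pattern = '/\*{2}(\d+)/';

$a = preg_replace($pattern, '<sub>$1</sub>', $formula);   // $N form
$b = preg_replace($pattern, '<sub>\1</sub>', $formula);   // \N form
$c = preg_replace($pattern, '<sub>${1}</sub>', $formula); // braced form

var_dump($a === $b && $b === $c); // all three give the same string

// Double-quoted strings are processed by PHP before PCRE sees them:
var_dump("\1" === chr(1));  // "\1" became a single control character
var_dump(strlen("\l"));     // "\l" is still two characters: \ and l
```

The braced form is useful when the reference is immediately followed by a literal digit.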

Grtz,
--
Rik Wasmus
Jun 6 '06 #7
greatprovider wrote:
> the formulas are inputted into a database in the format:
> "Na**3C**6H**5O**7*2H**20"
> i query the database and pass the list of unformated formulas through
> this function in a loop.
>
> function format_replace($formula) {
> $search = "/(\*{2}[\d]+)/";
> $replace = "<sub>\l</sub>";
> $formatted = preg_replace( $search, $replace, $formula);
> return $formatted;
> }
> the result i get for this is: Na\lC\lH\lO\l*2H\l (where "\l" is
> subscripted).


You mistyped the code I provided in a couple of places. There is no need
for brackets around \d as it's already a class. You want the
parentheses around \d+, not the whole expression, since you only want
the number (and not the two asterisks). And the replacement is \1, not
\l, meaning "what's inside the first pair of parentheses".

Just copy-and-paste.
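
To illustrate the point about where the parentheses go, here is a minimal comparison on a short, made-up input ('H**2O' is an assumption chosen for brevity). Both calls use \1; only the position of the capture group differs.

```php
<?php
$input = 'H**2O';

// Group wraps the whole expression: \1 contains the asterisks too.
$whole = preg_replace('/(\*{2}[\d]+)/', '<sub>\1</sub>', $input);

// Group wraps only the digits: the asterisks are matched but dropped.
$digits = preg_replace('/\*{2}(\d+)/', '<sub>\1</sub>', $input);

echo $whole, "\n";  // H<sub>**2</sub>O
echo $digits, "\n"; // H<sub>2</sub>O
```

[\d] and \d match the same characters, so the brackets change nothing except readability.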

Jun 6 '06 #8
hah...that was the trick...thank you all...

One last question: would anyone mind explaining to me how "$1" works?

Jun 6 '06 #9
Rik
greatprovider wrote:
> hah...that was the trick...thank you all...
>
> one last question...now would anyone mind explaining me how "$1"
> works?


In a regular expression, you can "capture" pieces that match a pattern with
(). The number after the $ indicates which piece; captures are numbered in
the order of their opening '(', counting from the left.

For instance:
**567HJK

'/((\*{2})(\d+))/'

$1 will contain the match: '**567';
$2 will contain the match: '**';
$3 will contain the match: '567';

Normally, pieces that have to match a certain regex, but aren't used any
further, don't need (). In some cases, the parentheses are necessary for the
pattern itself. In that case, you can just use the numbered matches you need
and discard the others (with multiple captures in a regex, it is not
necessary to use them all). To keep a complex regex clearer, you can also
make a 'non-capturing' group by adding ?: after the opening parenthesis. For
instance:
(?:\s+(\d+)) will capture the digits in $1, instead of $2, because the first
parenthesis is told not to capture anything.
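
The numbering described above can be checked with preg_match(), whose third argument receives the same groups as array entries (index 0 is the whole match, index N corresponds to $N). This is a small sketch; the subject string 'abc  42' in the second call is made up for illustration.

```php
<?php
// Three nested/adjacent capturing groups, numbered by opening parenthesis.
preg_match('/((\*{2})(\d+))/', '**567HJK', $m);
print_r($m);
// $m[1] is '**567', $m[2] is '**', $m[3] is '567'

// The outer (?: ... ) does not capture, so the digits land in group 1.
preg_match('/(?:\s+(\d+))/', 'abc  42', $n);
print_r($n);
// $n[0] is the whole match including the spaces, $n[1] is '42'
```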

Want to learn more about regexes?
http://www.regular-expressions.info/tutorialcnt.html was a big help for me.

Grtz,
--
Rik Wasmus
Jun 6 '06 #10
thank you for the explanation..."it all makes sense now..." until i
find another expression....heh

thank you again,
GP

Jun 6 '06 #11
