String indexing and preg_match

Quick-Fix: Can preg_match_all return the indexes of where it matched the
string?

More Detail: Read carefully: I'd like to separate a string's parts into two
(2) arrays, which can themselves be subdivided, then modify them and glue
them back together. I will refer to the two arrays as highlight and
remnants. To glue the string back together correctly, I store the string
indexes (positions) of both the highlight and remnants parts of the string
before sorting the text into arrays.

ASCII Art Diagram (Mono-space text only):

        Original
           |
          / \
         /   \
        /     \
       A       B
      / \     / \
     /   \   /   \
    AA   AB BA   BB

An HTML-entities-like example:

Original:

The <html> rain </html> *in* Spain %d %s foo.

Split:

A - Highlight          | B - Remnants
--------------------------------------
The                    | <html>
rain                   | </html>
*in* Spain %d %s foo.  |

Glued:

The &lt;html&gt; rain &lt;/html&gt; *in* Spain %d %s foo.
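
The gluing step described above could look roughly like the sketch below. It assumes each piece is stored together with its starting offset in the original string; the function name glue() and the 'pos'/'text' keys are made up for illustration, and the offsets are counted by hand from the example sentence. The column names follow the table above.

```php
<?php
// Hypothetical helper, not from the original post: merges both arrays,
// orders the pieces by their starting offset, and concatenates them.
function glue(array $highlight, array $remnants)
{
    $pieces = array_merge($highlight, $remnants);
    // Sort by the position each piece had in the original string.
    usort($pieces, function ($a, $b) {
        return $a['pos'] - $b['pos'];
    });
    $out = '';
    foreach ($pieces as $piece) {
        $out .= $piece['text'];
    }
    return $out;
}

// Offsets refer to: The <html> rain </html> *in* Spain %d %s foo.
$highlight = array(
    array('pos' => 0,  'text' => 'The '),
    array('pos' => 10, 'text' => ' rain '),
    array('pos' => 23, 'text' => ' *in* Spain %d %s foo.'),
);
// The tags are encoded before gluing, as in the "Glued" line above.
$remnants = array(
    array('pos' => 4,  'text' => htmlspecialchars('<html>')),
    array('pos' => 16, 'text' => htmlspecialchars('</html>')),
);

echo glue($highlight, $remnants), "\n";
// prints: The &lt;html&gt; rain &lt;/html&gt; *in* Spain %d %s foo.
```

Because the order comes from the stored offsets alone, the two arrays can be modified independently as long as each piece keeps its original 'pos'.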

Separation Code:

$finds = array();
$remnants = array();
foreach ($src_array as $src) {
    // $tmp[1] holds the text captured by the pattern's first group.
    preg_match_all($pattern, $src, $tmp);
    $finds = array_merge($finds, $tmp[1]);
    // preg_split() returns the text between the matches.
    $remnants = array_merge($remnants, preg_split($pattern, $src));
}
return array($finds, $remnants);

I'd like to know: what would be the best way to index text directly in the
separation process, so that duplicates are not mis-indexed?

For example, the following function body is bogus because it can't
distinguish which space character (' ') came first:

$pos = 0;
$l_size = 0;
$list = array();
foreach ($needles as $needle) {
    // Resume the search just past the previously found needle.
    $pos = strpos($haystack, $needle, $pos + $l_size);
    array_push($list, array(
        "pos" => $pos,
        "raw" => $needle,
        "enc" => $needle));
    $l_size = strlen($needle);
}
return $list;
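
One way to sidestep the re-searching problem might be to record the offsets while separating, instead of looking the pieces up again with strpos(). The sketch below is only an illustration under assumptions: the function name separate_with_offsets() is invented, it works on a single string rather than $src_array, and it uses the full match ($m[0]) rather than the first capture group used in the separation code above. It relies on the PREG_OFFSET_CAPTURE flag of preg_match_all(), which reports byte offsets.

```php
<?php
// Hypothetical helper, not from the original post.
// Returns array($finds, $remnants); every entry carries its own offset,
// so identical pieces (e.g. several spaces) stay distinguishable.
function separate_with_offsets($pattern, $src)
{
    $finds = array();
    $remnants = array();
    preg_match_all($pattern, $src, $m, PREG_OFFSET_CAPTURE);

    $cursor = 0; // end of the previous match
    foreach ($m[0] as $match) {
        list($text, $pos) = $match;
        if ($pos > $cursor) {
            // Text between the previous match and this one.
            $remnants[] = array('pos' => $cursor,
                                'raw' => substr($src, $cursor, $pos - $cursor));
        }
        $finds[] = array('pos' => $pos, 'raw' => $text);
        $cursor = $pos + strlen($text);
    }
    if ($cursor < strlen($src)) {
        // Trailing text after the last match.
        $remnants[] = array('pos' => $cursor, 'raw' => substr($src, $cursor));
    }
    return array($finds, $remnants);
}

// Three identical needles, each with a distinct offset.
list($finds, $remnants) = separate_with_offsets('/ /', 'a b a b');
foreach ($finds as $find) {
    echo $find['pos'], "\n"; // prints 1, 3, 5 on separate lines
}
```

Since every entry already knows where it came from, no later lookup has to guess which duplicate was meant.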

Accurate help much appreciated,

Jens.

--
Jabber ID: jt***@jabberafrica.co.za
Location: South Africa
Time Zone UTC +2
Jul 17 '05 #1
1 Reply


"Jens Thiede" <je***********@webgear.co.za> wrote in message
news:cb*********@ctb-nnrp2.saix.net...
> Quick-Fix: Can preg_match_all return the indexes of where it matched the
> string?


Yes. Read the manual carefully.
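
For reference, the manual's answer is the PREG_OFFSET_CAPTURE flag: when it is passed, each match comes back as a two-element array holding the matched text and its byte offset in the subject. A minimal sketch follows; the tag-matching pattern is an assumption chosen for illustration, not the pattern from the question.

```php
<?php
$subject = 'The <html> rain </html> *in* Spain %d %s foo.';

// Illustrative pattern (an assumption): anything that looks like a tag.
preg_match_all('/<[^>]+>/', $subject, $matches, PREG_OFFSET_CAPTURE);

// $matches[0] lists the full matches as array(text, offset) pairs.
foreach ($matches[0] as $match) {
    list($text, $offset) = $match;
    echo $offset, ': ', $text, "\n";
}
// prints:
// 4: <html>
// 16: </html>
```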
Jul 17 '05 #2