Quick-Fix: Can preg_match_all return the indexes of where it matched the
string?
More Detail: Read carefully: I'd like to separate a string's parts into two
(2) arrays, each of which can itself be subdivided, then modify them and
glue them back together. I will refer to the two arrays as highlight and
remnants. To glue the string together correctly, I store the string indexes
(positions) of both the highlight and remnants parts of the string before
sorting the text into arrays.
ASCII Art Diagram (Mono-space text only):

          Original
             |
            / \
           /   \
          /     \
         A       B
        / \     / \
       /   \   /   \
      AA   AB BA   BB

Example with HTML-entity-like markup:
Original:
The <html> rain </html> *in* Spain %d %s foo.
Split:
A - Highlight           B - Remnants
--------------------------------------
The                   | <html>
rain                  | </html>
*in* Spain %d %s foo. |
Glued:
The <html> rain </html> *in* Spain %d %s foo.
Separation Code:
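$finds = array();
$remnants = array();
foreach ($src_array as $src) {
    // $tmp[1] holds the text captured by the first group of $pattern
    preg_match_all($pattern, $src, $tmp);
    $finds = array_merge($finds, $tmp[1]);
    // everything between the matches goes into the remnants
    $remnants = array_merge($remnants, preg_split($pattern, $src));
}
return array($finds, $remnants);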
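
For reference, here is a minimal sketch of how the positions could be recorded during the separation itself, assuming a PHP version that supports the PREG_OFFSET_CAPTURE flag of preg_match_all and the PREG_SPLIT_OFFSET_CAPTURE flag of preg_split. The function name split_with_offsets and the sample pattern are hypothetical and only serve the illustration; the pattern is assumed to contain one capturing group, as in the code above.

```php
<?php
// Hypothetical helper: returns array($finds, $remnants), where every
// entry is array(text, byte offset into $src) instead of a bare string.
function split_with_offsets($pattern, $src) {
    // PREG_OFFSET_CAPTURE turns each match into array(text, offset).
    preg_match_all($pattern, $src, $tmp, PREG_OFFSET_CAPTURE);
    $finds = $tmp[1]; // first capturing group, as in the separation code

    // PREG_SPLIT_OFFSET_CAPTURE does the same for the pieces in between.
    $remnants = preg_split($pattern, $src, -1, PREG_SPLIT_OFFSET_CAPTURE);

    return array($finds, $remnants);
}

// Sample pattern (an assumption): anything that looks like a tag.
list($finds, $remnants) = split_with_offsets(
    '/(<[^>]+>)/',
    'The <html> rain </html> *in* Spain'
);
```

The offsets are byte offsets and are relative to the one source string passed in, so when looping over several source strings they would have to be stored per string.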
I'd like to know: what would be the best way to index text directly in the
separation process, so that duplicate substrings are not mis-indexed?
For example, the following function body is unreliable because it can't
distinguish which space character (' ') came first:
$pos = 0;
$l_size = 0;
$list = array();
foreach ($needles as $needle) {
    $pos = strpos($haystack, $needle, $pos + $l_size);
    array_push($list, array(
        "pos" => $pos,
        "raw" => $needle,
        "enc" => $needle));
    $l_size = strlen($needle);
}
return $list;
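
By contrast, if each part carried the offset at which the regex engine actually found it, no second search with strpos would be needed and duplicates could not be confused. The following is only a sketch under that assumption: the helper name glue_by_offset is hypothetical, the (text, offset) pairs are written out by hand in the shape that PREG_OFFSET_CAPTURE produces, and the closure syntax requires PHP 5.3 or later.

```php
<?php
// Hypothetical helper: $parts is a list of array(text, offset) pairs.
function glue_by_offset(array $parts) {
    // Order the pieces by where they started in the original string.
    usort($parts, function ($a, $b) {
        return $a[1] - $b[1];
    });

    // Concatenate the texts in that order.
    $out = '';
    foreach ($parts as $part) {
        $out .= $part[0];
    }
    return $out;
}

// Hand-written sample data (an assumption for the illustration).
$finds = array(array('<html>', 4), array('</html>', 16));
$remnants = array(
    array('The ', 0),
    array(' rain ', 10),
    array(' *in* Spain', 23),
);

$glued = glue_by_offset(array_merge($finds, $remnants));
```

If the parts are modified before gluing, the stored offsets are used only for ordering, so a change in the length of a part does not affect the result.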
Accurate help much appreciated,
Jens.
--
Jabber ID: jt***@jabberafrica.co.za
Location: South Africa
Time Zone UTC +2