String indexing and preg_match

Quick-Fix: Can preg_match_all return the indexes of where it matched the
string?

More Detail: Read carefully: I'd like to separate a string's parts into two
(2) arrays, each of which can itself be subdivided, then modify them and glue
them back together. I will refer to the two arrays as highlight and
remnants. To glue the string together correctly, I store the string
indexes (positions) of both the highlight and remnants parts of the string
before sorting the text into arrays.

ASCII Art Diagram (Mono-space text only):

        Original
           |
          / \
         /   \
        /     \
       A       B
      / \     / \
     /   \   /   \
    AA   AB BA   BB

An HTML-entities-like example:

Original:

The <html> rain </html> *in* Spain %d %s foo.

Split:

A - Highlight          | B - Remnants
-----------------------+-------------
The                    | <html>
rain                   | </html>
*in* Spain %d %s foo.  |

Glued:

The &lt;html&gt; rain &lt;/html&gt; *in* Spain %d %s foo.

Separation Code:

    // Collect matches ("finds") and the text between them ("remnants")
    // across every source string.
    $finds = array();
    $remnants = array();
    foreach ($src_array as $src) {
        // $tmp[1] holds the text captured by the first subpattern.
        preg_match_all($pattern, $src, $tmp);
        $finds = array_merge($finds, $tmp[1]);
        $remnants = array_merge($remnants, preg_split($pattern, $src));
    }
    return array($finds, $remnants);

I'd like to know: what would be the best way to index text directly in the
separation process, so that duplicates are not mis-indexed?

For example, the following function body is unreliable because it can't
tell which space character (' ') came first:

    $pos = 0;
    $l_size = 0;
    $list = array();
    foreach ($needles as $needle) {
        // Resume the search just past the previous needle.
        // Note: strpos() returns false when the needle is not found.
        $pos = strpos($haystack, $needle, $pos + $l_size);
        array_push($list, array(
            "pos" => $pos,
            "raw" => $needle,
            "enc" => $needle));
        $l_size = strlen($needle);
    }
    return $list;
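
One way to sidestep the duplicate problem might be to let the regex engine report each piece's position during the split, rather than searching again with strpos. The following is only a minimal sketch under assumptions: the helper names split_with_offsets and glue are hypothetical, the tag pattern in the usage note is made up for the example above, and the pattern is assumed to contain exactly one capture group that never matches an empty string. The flags PREG_SPLIT_DELIM_CAPTURE and PREG_SPLIT_OFFSET_CAPTURE are standard preg_split flags.

```php
<?php
// Hypothetical helper: split $src into matches ("finds") and the text
// between them ("remnants"), each keyed by its byte offset in $src.
function split_with_offsets($pattern, $src)
{
    // DELIM_CAPTURE keeps the text of the capture group in the result;
    // OFFSET_CAPTURE turns every piece into array(text, offset).
    $pieces = preg_split(
        $pattern,
        $src,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_OFFSET_CAPTURE
    );

    $finds = array();
    $remnants = array();
    foreach ($pieces as $i => $piece) {
        list($text, $offset) = $piece;
        // With one capture group, pieces alternate:
        // even index = text between matches, odd index = a match.
        if ($i % 2 === 1) {
            $finds[$offset] = $text;
        } else {
            $remnants[$offset] = $text;
        }
    }
    return array($finds, $remnants);
}

// Hypothetical helper: merge both arrays by original offset and join.
function glue($finds, $remnants)
{
    // Array union keeps keys; on a shared offset the find wins, and the
    // remnant it replaces can only be an empty string.
    $all = $finds + $remnants;
    ksort($all);
    return implode('', $all);
}
```

Applied to the example string with the assumed pattern '/(<[^>]+>)/', the finds come back keyed 4 and 16. Encoding them with array_map('htmlspecialchars', $finds) keeps those keys, so glue() can restore the original order even when the same text occurs more than once.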

Accurate help much appreciated,

Jens.

--
Jabber ID: jt***@jabberafrica.co.za
Location: South Africa
Time Zone UTC +2
Jul 17 '05 #1
"Jens Thiede" <je***********@webgear.co.za> wrote in message
news:cb*********@ctb-nnrp2.saix.net...
> Quick-Fix: Can preg_match_all return the indexes of where it matched the
> string?

Yes. Read the manual carefully.
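
The reply does not name the feature, but it presumably refers to the PREG_OFFSET_CAPTURE flag of preg_match_all (an assumption on this page's part). A minimal sketch of how that flag behaves; the pattern and sample string are made up for illustration:

```php
<?php
// Sample input (hypothetical): the word "the" occurs twice.
$haystack = 'the cat and the hat';

// PREG_OFFSET_CAPTURE makes every match an array(text, byte offset).
// PREG_SET_ORDER groups the results per match rather than per subpattern.
preg_match_all(
    '/\bthe\b/',
    $haystack,
    $matches,
    PREG_SET_ORDER | PREG_OFFSET_CAPTURE
);

$positions = array();
foreach ($matches as $match) {
    // $match[0] is the whole match: array(text, offset).
    $positions[] = $match[0][1];
}
// $positions is now array(0, 12)
```

Each occurrence carries its own offset, so repeated text is no longer ambiguous. The offsets are counted in bytes, which matters for multibyte input.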
Jul 17 '05 #2
