need function: split string by list of strings and return delimiters and extra text

I am looking for a function that takes in a string and splits it using a
list of other strings (delimiters) and can return the delimiters as well
as the extra parts of the string. I was trying the split with a regex
built up of the delimiters separated by "|", but it doesn't return the
delimiters which I need.

The goal is an algorithm that can take a string containing HTML code using
<ul>, <ol>, and <li> and turn it into a formatted plain-text form like the
following:

<ol>
<li>foo</li>
<li>bar</li>
<ul>
<li>baz</li>
<li>qux</li>
</ul>
<li>asf</li>
</ol>

into:
1) foo
2) bar
* baz
* qux
3) asf

I have everything but tokenizing the html input. I'm sure there is
probably a function in the PHP 4.3.4 libraries that I am missing, but I
can't seem to find it.

--
Justin L. Kennedy
Georgia Institute of Technology, Atlanta Georgia, 30332
Email: jk***@prism.gatech.edu
Jul 17 '05 #1
Justin L. Kennedy wrote:
> I am looking for a function that takes in a string and splits it using a
> list of other strings (delimiters) and can return the delimiters as well
> as the extra parts of the string. I was trying the split with a regex
> built up of the delimiters separated by "|", but it doesn't return the
> delimiters which I need.
You might try preg_match_all().
> The goal is an algorithm that can take a string containing HTML code using
> <ul>, <ol>, and <li> and turn it into a formatted plain-text form like the
> following:


Note that HTML is really tough to reliably parse with regular
expressions. If your source material is reasonably regular this might be
'good enough' though.
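
A minimal sketch of the preg_match_all() route, assuming the input contains only the three list tags and plain text between them. The sample string and the variable names are made up for illustration. The pattern matches either one of the tags or a run of text up to the next "<", so the tags come back as tokens alongside the text:

```php
<?php
// Hypothetical sample input; only <ol>, <ul>, and <li> tags are expected.
$html = "<ol><li>foo</li><li>bar</li></ol>";

// Alternative 1: an opening or closing ol/ul/li tag.
// Alternative 2: any run of characters that does not contain "<".
preg_match_all('#</?(?:ol|ul|li)>|[^<]+#i', $html, $matches);

// $matches[0] holds the full matches in the order they were found.
$tokens = $matches[0];
print_r($tokens);
```

Anything that fits neither alternative (a stray "<" from some other tag, for example) is silently skipped, which is one reason this is only 'good enough' for regular input.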

-- brion vibber (brion @ pobox.com)
Jul 17 '05 #2
"Justin L. Kennedy" <jk***@prism.gatech.edu> wrote in message
news:ct**********@news-int.gatech.edu...
> I am looking for a function that takes in a string and splits it using a
> list of other strings (delimiters) and can return the delimiters as well
> as the extra parts of the string. I was trying the split with a regex
> built up of the delimiters separated by "|", but it doesn't return the
> delimiters which I need.


preg_split() lets you capture the delimiters, I think. Check the manual.
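
For reference, a short sketch of what that might look like, assuming the delimiter list is the six list tags from the original post (the variable names and the sample input are hypothetical). The PREG_SPLIT_DELIM_CAPTURE flag makes preg_split() return whatever the parenthesized group matched, and PREG_SPLIT_NO_EMPTY drops the empty strings that appear between adjacent tags:

```php
<?php
// Hypothetical delimiter list and input, for illustration only.
$delimiters = array('<ol>', '</ol>', '<ul>', '</ul>', '<li>', '</li>');
$html = "<ul><li>baz</li><li>qux</li></ul>";

// Escape each delimiter so it is matched literally, then join with "|".
$escaped = array();
foreach ($delimiters as $d) {
    $escaped[] = preg_quote($d, '#');
}

// The outer parentheses form the capture group whose matches are
// returned alongside the text when PREG_SPLIT_DELIM_CAPTURE is set.
$pattern = '#(' . implode('|', $escaped) . ')#';

$parts = preg_split($pattern, $html, -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print_r($parts);
```

Without the parentheses around the alternation, the same call returns only the text between the delimiters, which matches the behaviour described in the question.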
Jul 17 '05 #3
Chung Leong <ch***********@hotmail.com> wrote:
> "Justin L. Kennedy" <jk***@prism.gatech.edu> wrote in message
> news:ct**********@news-int.gatech.edu...
>> I am looking for a function that takes in a string and splits it using a
>> list of other strings (delimiters) and can return the delimiters as well
>> as the extra parts of the string. I was trying the split with a regex
>> built up of the delimiters separated by "|", but it doesn't return the
>> delimiters which I need.
> preg_split() lets you capture the delimiters, I think. Check the manual.


Thanks, that was the one I was looking for. I was only looking in the
array and string sections of the manual.
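
For anyone finding this later, here is a rough sketch of how the tokens from preg_split() could feed the numbering step described in the first post. This is not the code from the thread: the function name list_html_to_text() is made up, and it assumes well-formed input that uses only <ol>, <ul>, and <li>. It keeps a stack of open lists so that an ordered list resumes its count after a nested list closes:

```php
<?php
// Hypothetical helper, not from the thread: renders <ol>/<ul>/<li>
// markup as plain text. Assumes well-formed input with only these tags.
function list_html_to_text($html)
{
    $tokens = preg_split('#(</?(?:ol|ul|li)>)#i', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $stack = array();  // one entry per open list: type ('ol'/'ul') and counter
    $out = '';
    foreach ($tokens as $token) {
        $tag = strtolower($token);
        if ($tag === '<ol>' || $tag === '<ul>') {
            // Entering a list: remember its type and start counting at 0.
            $stack[] = array('type' => substr($tag, 1, 2), 'n' => 0);
        } elseif ($tag === '</ol>' || $tag === '</ul>') {
            // Leaving a list: the enclosing list (if any) resumes.
            array_pop($stack);
        } elseif ($tag === '<li>') {
            $top = count($stack) - 1;
            if ($top >= 0 && $stack[$top]['type'] === 'ol') {
                $stack[$top]['n']++;
                $out .= $stack[$top]['n'] . ') ';
            } else {
                $out .= '* ';
            }
        } elseif ($tag === '</li>') {
            $out .= "\n";
        } elseif (trim($token) !== '') {
            // Item text; whitespace-only tokens between tags are ignored.
            $out .= trim($token);
        }
    }
    return $out;
}

// The sample from the first post, written as one string.
$html = "<ol>\n<li>foo</li>\n<li>bar</li>\n<ul>\n<li>baz</li>\n"
      . "<li>qux</li>\n</ul>\n<li>asf</li>\n</ol>";
echo list_html_to_text($html);
```

On the sample above this prints the five lines shown in the first post, with "3) asf" continuing the outer numbering after the nested list.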

--
Justin L. Kennedy
Georgia Institute of Technology, Atlanta Georgia, 30332
Email: jk***@prism.gatech.edu
Jul 17 '05 #4
