split a string of space separated substrings - elegant solution?

Hi,

I'm looking for an elegant solution to the following (quite common)
problem:

Given a string of substrings separated by white space,
split this into tuple/list of elements.
The problem is quoted substrings like

abc "xy z" "1 2 3" "a \" x"

should be split into ('abc','xy z','1 2 3','a " x')

For that, one probably has to protect white space between
quotes, then split by white space, and finally convert the
'protected white space' back to normal white space.
Is there an elegant solution, perhaps without using a lexer
or something similar? With regular expressions alone it seems
clumsy.
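
For comparison, a regex-only attempt might look like the sketch below. It is not the protect/split/restore idea described above but a direct tokenizer, and the names TOKEN and split_regex are made up for illustration. It assumes that only double quotes and backslash escapes matter.

```python
import re

# One alternative per token kind: a double-quoted run (backslash escapes
# allowed inside) or a run of non-whitespace characters.
TOKEN = re.compile(r'"((?:\\.|[^"\\])*)"|(\S+)')

def split_regex(text):
    """Split text on white space, keeping double-quoted parts together."""
    tokens = []
    for match in TOKEN.finditer(text):
        if match.group(1) is not None:
            # Quoted token: drop the backslash in front of escaped characters.
            tokens.append(re.sub(r'\\(.)', r'\1', match.group(1)))
        else:
            tokens.append(match.group(2))
    return tuple(tokens)

print(split_regex(r'abc "xy z" "1 2 3" "a \" x"'))
```

An unterminated quote is not reported by this sketch; the leftover text is simply split on white space, which is one reason a regex alone feels clumsy here.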

Many thanks for a hint,

Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
Jul 31 '07 #1
Helmut Jarausch wrote:
> abc "xy z" "1 2 3" "a \" x"
>
> should be split into ('abc','xy z','1 2 3','a " x')
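import csv

s = 'abc "xy z" "1 2 3" "a \\" x"'
# A space-delimited "CSV" line; the backslash escapes quotes inside fields.
r = csv.reader([s], delimiter=" ", escapechar="\\")
# The reader is already an iterator, so next() returns the first (only) row.
print(next(r))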

w.
Jul 31 '07 #2
On Tue, 2007-07-31 at 22:30 +0200, Helmut Jarausch wrote:
> abc "xy z" "1 2 3" "a \" x"
>
> should be split into ('abc','xy z','1 2 3','a " x')
>>> import shlex
>>> shlex.split('abc "xy z" "1 2 3" "a \\" x"')
['abc', 'xy z', '1 2 3', 'a " x']

I hope that's elegant enough ;)
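
If a tuple is wanted, as in the original question, the call can be wrapped in a small helper. This is only a sketch: split_quoted is a made-up name, and it relies on shlex.split raising ValueError when a quote is left open.

```python
import shlex

def split_quoted(text):
    """Return the shell-style tokens of text as a tuple."""
    return tuple(shlex.split(text))

print(split_quoted(r'abc "xy z" "1 2 3" "a \" x"'))

# An unterminated quote is reported rather than silently mis-split.
try:
    split_quoted('abc "unterminated')
except ValueError as exc:
    print("cannot split:", exc)
```

Note that shlex follows shell rules, so single quotes also group words and runs of white space between items are collapsed.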

--
Carsten Haese
http://informixdb.sourceforge.net
Jul 31 '07 #3
On 7/31/07, Helmut Jarausch <ja******@skynet.be> wrote:
> abc "xy z" "1 2 3" "a \" x"
>
> should be split into ('abc','xy z','1 2 3','a " x')
Using the csv module gets you most of the way there. For instance:
>>> import csv
>>> text = r'abc "xy z"  "1 2 3"  "a \" x"'
>>> reader = csv.reader([text], delimiter=" ", escapechar='\\')
>>> for row in reader:
...     print(row)
...
['abc', 'xy z', '', '1 2 3', '', 'a " x']
>>>
That does leave you with empty elements where you had double spaces
between items, though. You could fix that with something like:
>>> # the first loop exhausted the reader, so create a fresh one
>>> reader = csv.reader([text], delimiter=" ", escapechar='\\')
>>> for row in reader:
...     row = [element for element in row if element != '']
...     print(row)
...
['abc', 'xy z', '1 2 3', 'a " x']
>>>
The csv module can handle lots of delimited data other than quote- and
comma-delimited. See the docs at:
http://docs.python.org/lib/module-csv.html and PEP 305:
http://www.python.org/dev/peps/pep-0305/
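
Put together, the two steps above could be packaged roughly like this. It is a sketch only: split_with_csv is a made-up name, and it assumes the whole input is a single line.

```python
import csv

def split_with_csv(text):
    """Split one line on spaces, honouring double quotes and backslash escapes."""
    reader = csv.reader([text], delimiter=" ", escapechar="\\")
    row = next(reader)
    # Repeated spaces produce empty fields; drop them.
    return [field for field in row if field != ""]

print(split_with_csv(r'abc "xy z"  "1 2 3"  "a \" x"'))
```

One side effect of the filter is that a deliberately empty quoted item ("") is dropped as well, so this only fits data where empty items are not meaningful.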

--
Jerry
Jul 31 '07 #4
On Jul 31, 3:30 pm, Helmut Jarausch <jarau...@skynet.be> wrote:
> abc "xy z" "1 2 3" "a \" x"
>
> should be split into ('abc','xy z','1 2 3','a " x')
Pyparsing has built-in support for special treatment of quoted
strings. Observe:

from pyparsing import *

data = r'abc "xy z" "1 2 3" "a \" x"'

# strip the surrounding quotes from each quoted string as it is parsed
quotedString.setParseAction(removeQuotes)
print(OneOrMore(quotedString |
                Word(printables)).parseString(data))

prints:

['abc', 'xy z', '1 2 3', 'a \\" x']

Or perhaps a bit trickier, do the same while skipping items inside /*
*/ comments:

data = r'abc /* 456 "xy z" */ "1 2 3" "a \" x"'

quotedString.setParseAction(removeQuotes)
# ignore() skips anything matching a C-style /* ... */ comment
print(OneOrMore(quotedString |
                Word(printables))
      .ignore(cStyleComment).parseString(data))

prints:

['abc', '1 2 3', 'a \\" x']
-- Paul

Aug 1 '07 #5
Many thanks to all of you!
It's amazing how many elegant solutions there are in Python.
--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
Aug 1 '07 #6
