split a string of space separated substrings - elegant solution?

Hi,

I'm looking for an elegant solution to the following (quite common)
problem:

Given a string of substrings separated by white space,
split it into a tuple/list of elements.
The problem is quoted substrings, for example

abc "xy z" "1 2 3" "a \" x"

should be split into ('abc','xy z','1 2 3','a " x')

For that, one probably has to protect the white space between
quotes, then split by white space, and finally convert the
'protected white space' back to normal white space.
Is there an elegant solution, perhaps without using a lexer
or something similar? With regular expressions alone it seems
clumsy.
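For illustration, the protect/split/restore idea described above might look roughly like the following sketch. The names `split_quoted`, `QUOTED` and `PROTECTED_SPACE` are hypothetical. The sketch assumes that the sentinel character never occurs in the input and that the only white space inside quotes is the plain space character. An empty quoted item ("") would silently disappear.

```python
import re

# Hypothetical sentinel; assumed never to appear in the input text.
PROTECTED_SPACE = "\x00"

# A double-quoted run that may contain backslash-escaped characters.
QUOTED = re.compile(r'"((?:\\.|[^"\\])*)"')


def _protect(match):
    # Drop the backslash from escapes such as \" and keep the escaped character.
    body = re.sub(r"\\(.)", r"\1", match.group(1))
    # Hide the spaces inside the quotes so that split() does not see them.
    return body.replace(" ", PROTECTED_SPACE)


def split_quoted(text):
    # Step 1: protect quoted spaces; step 2: split; step 3: restore the spaces.
    protected = QUOTED.sub(_protect, text)
    return [part.replace(PROTECTED_SPACE, " ") for part in protected.split()]


print(split_quoted(r'abc "xy z" "1 2 3" "a \" x"'))
```

This stays within regular expressions plus `str.split`, which is arguably the "clumsy" route; it is shown only to make the three steps concrete.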

Many thanks for a hint,

Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
Jul 31 '07 #1
import csv

s = 'abc "xy z" "1 2 3" "a \\" x"'
# Treat the string as a one-line "file" with space as the delimiter
r = csv.reader([s], delimiter=" ", escapechar="\\")
print(next(r))

w.
Jul 31 '07 #2
>>> import shlex
>>> shlex.split(' abc "xy z" "1 2 3" "a \\" x"')
['abc', 'xy z', '1 2 3', 'a " x']

I hope that's elegant enough ;)
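One caveat worth sketching: `shlex.split` follows POSIX shell quoting rules, so an apostrophe also opens a quoted string, and unbalanced quotes raise `ValueError`. A minimal wrapper, assuming you would rather get `None` than an exception (the name `split_args` is hypothetical), might look like this:

```python
import shlex


def split_args(text):
    """Split text shell-style; return None if the quoting is unbalanced."""
    try:
        return shlex.split(text)
    except ValueError:
        # Raised by shlex for an unterminated quote or a dangling escape.
        return None


print(split_args(' abc "xy z" "1 2 3" "a \\" x"'))
print(split_args("it's broken"))  # the apostrophe opens a quote that never closes
```

Whether returning `None` is appropriate depends on the caller; re-raising with a clearer message is an equally reasonable choice.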

--
Carsten Haese
http://informixdb.sourceforge.net
Jul 31 '07 #3
Using the csv module gets you most of the way there. For instance:
>>> import csv
>>> text = r'abc "xy z"  "1 2 3"  "a \" x"'
>>> reader = csv.reader([text], delimiter=" ", escapechar='\\')
>>> for row in reader:
...     print(row)
...
['abc', 'xy z', '', '1 2 3', '', 'a " x']
That does leave you with empty elements where you had double spaces
between items, though. You could fix that with something like:
>>> reader = csv.reader([text], delimiter=" ", escapechar='\\')
>>> for row in reader:
...     row = [element for element in row if element != '']
...     print(row)
...
['abc', 'xy z', '1 2 3', 'a " x']
The csv module can handle many kinds of delimited data other than
quote- and comma-delimited. See the docs at:
http://docs.python.org/lib/module-csv.html and PEP 305:
http://www.python.org/dev/peps/pep-0305/
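The two steps above (parse with csv, then drop the empty strings) can be folded into one small helper. This is only a sketch: the name `split_with_csv` is hypothetical, and filtering out empty strings also discards a deliberately empty quoted item (""), which may or may not be what you want.

```python
import csv


def split_with_csv(text):
    """Split on spaces, honouring double quotes and backslash escapes."""
    reader = csv.reader([text], delimiter=" ", escapechar="\\")
    row = next(reader)
    # Runs of spaces produce empty fields; drop them.
    return [field for field in row if field != ""]


print(split_with_csv(r'abc "xy z"  "1 2 3"  "a \" x"'))
```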

--
Jerry
Jul 31 '07 #4
Pyparsing has built-in support for special treatment of quoted
strings. Observe:

from pyparsing import *

data = r'abc "xy z" "1 2 3" "a \" x"'

quotedString.setParseAction(removeQuotes)
print(OneOrMore(quotedString |
                Word(printables)).parseString(data))

prints:

['abc', 'xy z', '1 2 3', 'a \\" x']

Or perhaps a bit trickier, do the same while skipping items inside /*
*/ comments:

data = r'abc /* 456 "xy z" */ "1 2 3" "a \" x"'

quotedString.setParseAction(removeQuotes)
print(OneOrMore(quotedString |
                Word(printables))
      .ignore(cStyleComment).parseString(data))

prints:

['abc', '1 2 3', 'a \\" x']
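Note that both results above still contain the backslash in front of the embedded quote ('a \\" x'), whereas the original question asked for 'a " x'. One possible post-processing step, sketched here in plain Python with a hypothetical helper `unescape` and operating on an ordinary list of strings rather than on pyparsing's result object, would be:

```python
import re


def unescape(token):
    # Replace each backslash-escaped character with the character itself.
    return re.sub(r"\\(.)", r"\1", token)


# The tokens as printed in the first example above.
tokens = ['abc', 'xy z', '1 2 3', 'a \\" x']
cleaned = [unescape(token) for token in tokens]
print(cleaned)
```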
-- Paul

Aug 1 '07 #5
Many thanks to all of you!
It's amazing how many elegant solutions there are in Python.
--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
Aug 1 '07 #6
