473,804 Members | 2,314 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

newby question: Splitting a string - separator

Hi all,

i am having a textfile which contains a single string with names.
I want to split this string into its records an put them into a list.
In "normal" cases i would do something like:
#!/usr/bin/python
inp = open("file")
data = inp.read()
names = data.split()
inp.close()


The problem is, that the names contain spaces an the records are also
just seprarated by spaces. The only thing i can rely on, ist that the
recordseparator is always more than a single whitespace.

I thought of something like defining the separator for split() by using
a regex for "more than one whitespace". RegEx for whitespace is \s, but
what would i use for "more than one"? \s+?

TIA,
Tom
Dec 8 '05
13 1458
On Fri, 09 Dec 2005 18:02:02 -0800, James Stroud wrote:
Thomas Liesner wrote:
Hi all,

i am having a textfile which contains a single string with names.
I want to split this string into its records an put them into a list.
In "normal" cases i would do something like:

#!/usr/bin/python
inp = open("file")
data = inp.read()
names = data.split()
inp.close( )

The problem is, that the names contain spaces an the records are also
just seprarated by spaces. The only thing i can rely on, ist that the
recordseparator is always more than a single whitespace.

I thought of something like defining the separator for split() by using
a regex for "more than one whitespace". RegEx for whitespace is \s, but
what would i use for "more than one"? \s+?

TIA,
Tom


The one I like best goes like this:

py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> names = [n for n in data.split() if n]
py> names
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

I think it is theoretically faster (and more pythonic) than using regexes.

Yes, but the correct result would be:

['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']

Your code is short, elegant but wrong.

It could also be shorter and more elegant:

# your version
py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> [n for n in data.split() if n]
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

# my version
py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> data.split()
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

The "if n" in the list comp is superfluous, and without that, the whole
list comp is unnecessary.

--
Steven.

Dec 10 '05 #11
Steven D'Aprano wrote:
On Fri, 09 Dec 2005 18:02:02 -0800, James Stroud wrote:

Thomas Liesner wrote:
Hi all,

i am having a textfile which contains a single string with names.
I want to split this string into its records an put them into a list.
In "normal" cases i would do something like:

#!/usr/bin/python
inp = open("file")
data = inp.read()
names = data.split()
inp.close ()
The problem is, that the names contain spaces an the records are also
just seprarated by spaces. The only thing i can rely on, ist that the
recordsepara tor is always more than a single whitespace.

I thought of something like defining the separator for split() by using
a regex for "more than one whitespace". RegEx for whitespace is \s, but
what would i use for "more than one"? \s+?

TIA,
Tom


The one I like best goes like this:

py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> names = [n for n in data.split() if n]
py> names
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

I think it is theoretically faster (and more pythonic) than using regexes.


Yes, but the correct result would be:

['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']

Your code is short, elegant but wrong.

It could also be shorter and more elegant:

# your version
py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> [n for n in data.split() if n]
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

# my version
py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> data.split()
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

The "if n" in the list comp is superfluous, and without that, the whole
list comp is unnecessary.

see my post from 1 hr before this one.
Dec 10 '05 #12
James Stroud <js*****@mbi.uc la.edu> wrote:

The one I like best goes like this:

py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> names = [n for n in data.split() if n]
py> names
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

I think it is theoretically faster (and more pythonic) than using regexes.


But it is slower than this, which produces EXACTLY the same (incorrect)
result:

data = "Guido van Rossum Tim Peters Thomas Liesner"
names = data.split()
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Dec 10 '05 #13
James Stroud wrote:
py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> names = [n for n in data.split() if n]
py> names
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

I think it is theoretically faster (and more pythonic) than using
regexes.


Unfortunately it gives the wrong result.


Just an example. Here is the "correct version":

names = [n for n in data.split(" ") if n]


where "correct" is "still wrong", and "theoretica lly faster" means "slightly
slower" (at least if fix your version, and precompile the pattern).

</F>

Dec 10 '05 #14

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
1673
by: Aaron Walker | last post by:
I have a feeling this going to end up being something so stupid, but right now I'm confused as hell. I'm trying to code a function, that given a string and a delimiter char, returns a vector of the sub-strings. Here's what I have (I've thrown a main() in there for this mail). --- #include <iostream> #include <string>
9
2049
by: robbie.carlton | last post by:
Hello! I've programmed in c a bit, but nothing very complicated. I've just come back to it after a long sojourn in the lands of functional programming and am completely stumped on a very simple function I'm trying to write. I'm writing a function that takes a string, and returns an array of strings which are the result of splitting the input on whitespace and parentheses (but the parentheses should also be included in the array as...
8
2457
by: ronrsr | last post by:
I'm trying to break up the result tuple into keyword phrases. The keyword phrases are separated by a ; -- the split function is not working the way I believe it should be. Can anyone see what I"m doing wrong? bests, -rsr-
12
1928
by: kevineller794 | last post by:
I want to make a split string function, but it's getting complicated. What I want to do is make a function with a String, BeginStr and an EndStr variable, and I want it to return it in a char array. For example: char teststr; strcpy(teststr, "test:blah;");
0
9712
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9594
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10343
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10341
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10089
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
5530
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5673
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
3831
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3001
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.