new string method in 2.5 (partition)

Forgive my excitement, especially if you are already aware of this, but
this seems like the kind of feature that is easily overlooked (yet could
be very useful):
Both 8-bit and Unicode strings have new partition(sep) and
rpartition(sep) methods that simplify a common use case.
The find(S) method is often used to get an index which is then used to
slice the string and obtain the pieces that are before and after the
separator. partition(sep) condenses this pattern into a single method
call that returns a 3-tuple containing the substring before the
separator, the separator itself, and the substring after the separator.
If the separator isn't found, the first element of the tuple is the
entire string and the other two elements are empty. rpartition(sep) also
returns a 3-tuple but starts searching from the end of the string; the
"r" stands for 'reverse'.

Some examples:

>>> ('http://www.python.org').partition('://')
('http', '://', 'www.python.org')
>>> ('file:/usr/share/doc/index.html').partition('://')
('file:/usr/share/doc/index.html', '', '')
>>> (u'Subject: a quick question').partition(':')
(u'Subject', u':', u' a quick question')
>>> 'www.python.org'.rpartition('.')
('www.python', '.', 'org')
>>> 'www.python.org'.rpartition(':')
('', '', 'www.python.org')

(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)
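
For comparison, here is a minimal sketch of the find-and-slice pattern next to the partition call that replaces it. The helper names `split_with_find` and `split_with_partition` are made up for illustration, and the code is written for a current Python 3 interpreter rather than 2.5.

```python
def split_with_find(text, sep):
    """Old pattern: locate the separator with find(), then slice by hand."""
    i = text.find(sep)
    if i < 0:
        # Separator missing: mimic partition() by returning the whole string.
        return text, '', ''
    return text[:i], sep, text[i + len(sep):]


def split_with_partition(text, sep):
    """New pattern: one method call that always returns a 3-tuple."""
    return text.partition(sep)


for url in ('http://www.python.org', 'file:/usr/share/doc/index.html'):
    # Both approaches should agree whether or not the separator is found.
    assert split_with_find(url, '://') == split_with_partition(url, '://')
    print(split_with_partition(url, '://'))
```

The hand-written version needs an explicit branch for the not-found case and has to account for the separator length when slicing; partition handles both.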
Sep 19 '06 #1
sweet thanks for the heads up.

Sep 19 '06 #2
I'm confused.
What's the difference between this and string.split?

Sep 19 '06 #3
ri************@gmail.com wrote:
> What's the difference between this and string.split?

>>> ('http://www.python.org').partition('://')
('http', '://', 'www.python.org')
>>> ('http://www.python.org').split('://')
['http', 'www.python.org']

--
Lawrence - http://www.oluyede.org/blog
"Nothing is more dangerous than an idea
if it's the only one you have" - E. A. Chartier
Sep 19 '06 #4
ri************@gmail.com wrote:
> I'm confused.
> What's the difference between this and string.split?

>>> s = 'hello, world'
>>> s.split(',')
['hello', ' world']
>>> s.partition(',')
('hello', ',', ' world')

split returns a list of the substrings on either side of the specified
argument.

partition returns a tuple of the substring on the left of the argument,
the argument itself, and the substring on the right. rpartition reads
from right to left.
But you raise a good point. Notice this:

>>> s = 'hello, world, how are you'
>>> s.split(',')
['hello', ' world', ' how are you']
>>> s.partition(',')
('hello', ',', ' world, how are you')

split returns all the substrings. partition returns only the substrings
before and after the first occurrence of the argument; rpartition does the
same for the last occurrence.
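
A short sketch putting the variants side by side, including split with a maxsplit argument of 1, which is the closest split-based equivalent. It is written for Python 3; the variable names are only for illustration.

```python
s = 'hello, world, how are you'

all_parts = s.split(',')                 # every piece, separators dropped
first_cut = s.split(',', 1)              # at most one split, still a list
head, sep, tail = s.partition(',')       # cut at the first comma
rhead, rsep, rtail = s.rpartition(',')   # cut at the last comma

print(all_parts)
print(first_cut)
print((head, sep, tail))
print((rhead, rsep, rtail))
```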
Sep 19 '06 #5
John Salerno wrote:
> Forgive my excitement, especially if you are already aware of this, but
> this seems like the kind of feature that is easily overlooked (yet could
> be very useful):
> Both 8-bit and Unicode strings have new partition(sep) and
> rpartition(sep) methods that simplify a common use case.
> The find(S) method is often used to get an index which is then used to
> slice the string and obtain the pieces that are before and after the
> separator.

Err... is it me being dumb, or is it a perfect use case for str.split ?

> partition(sep) condenses this pattern into a single method
> call that returns a 3-tuple containing the substring before the
> separator, the separator itself, and the substring after the separator.
> If the separator isn't found, the first element of the tuple is the
> entire string and the other two elements are empty. rpartition(sep) also
> returns a 3-tuple but starts searching from the end of the string; the
> "r" stands for 'reverse'.
>
> Some examples:
>
> >>> ('http://www.python.org').partition('://')
> ('http', '://', 'www.python.org')
> >>> ('file:/usr/share/doc/index.html').partition('://')
> ('file:/usr/share/doc/index.html', '', '')
> >>> (u'Subject: a quick question').partition(':')
> (u'Subject', u':', u' a quick question')
> >>> 'www.python.org'.rpartition('.')
> ('www.python', '.', 'org')
> >>> 'www.python.org'.rpartition(':')
> ('', '', 'www.python.org')

I must definitely be dumb, but so far I fail to see how it's better
than split and rsplit:

>>> 'http://www.python.org'.split('://')
['http', 'www.python.org']
>>> 'file:/usr/share/doc/index.html'.split('://')
['file:/usr/share/doc/index.html']
>>> u'Subject: a quick question'.split(': ')
[u'Subject', u'a quick question']
>>> u'Subject: a quick question'.rsplit(': ')
[u'Subject', u'a quick question']
>>> 'www.python.org'.rsplit('.', 1)
['www.python', 'org']

There are IMVHO much more exciting new features in 2.5 (enhanced generators,
try/except/finally, ternary operator, with: statement etc...)
Sep 19 '06 #6
>> partition(sep) condenses this pattern into a single method
>> call that returns a 3-tuple containing the substring before
>> the separator, the separator itself, and the substring after
>> the separator. If the separator isn't found, the first
>> element of the tuple is the entire string and the other two
>> elements are empty. rpartition(sep) also returns a 3-tuple
>> but starts searching from the end of the string; the "r"
>> stands for 'reverse'.
>
> I'm confused. What's the difference between this and
> string.split?

(please don't top-post...I've inverted and trimmed for the sake
of readability)

I too am a bit confused, but I can see uses for it, and there
could be a good underlying reason to do as much. Split doesn't
return the separator. Partition is also guaranteed to return a
3-tuple, whereas the length of split's result varies. E.g.

>>> s1 = 'one'
>>> s2 = 'one|two'
>>> len(s1.split('|', 1))
1
>>> len(s2.split('|', 1))
2

which could make a difference when doing tuple-assignment:

>>> v1, v2 = s2.split('|', 1)
>>> # works fine
>>> v1, v2 = s1.split('|', 1)
[traceback]

whereas one could consistently do something like

>>> v1, _, v2 = s1.partition('|')

without fear of a traceback to deal with.
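
The tuple-assignment point can be sketched as a pair of small functions. The names `unpack_with_split` and `unpack_with_partition` are hypothetical, and the code targets Python 3.

```python
def unpack_with_split(text):
    """Two-name unpacking from split(); raises ValueError if '|' is missing."""
    left, right = text.split('|', 1)
    return left, right


def unpack_with_partition(text):
    """Three-name unpacking from partition(); works with or without '|'."""
    left, _, right = text.partition('|')
    return left, right


for sample in ('one|two', 'one'):
    try:
        print('split:', unpack_with_split(sample))
    except ValueError as exc:
        print('split failed for %r: %s' % (sample, exc))
    print('partition:', unpack_with_partition(sample))
```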

Just a few thoughts...

-tkc

Sep 19 '06 #7
> But you raise a good point. Notice this:
>
> >>> s = 'hello, world, how are you'
> >>> s.split(',')
> ['hello', ' world', ' how are you']
> >>> s.partition(',')
> ('hello', ',', ' world, how are you')
>
> split will return all substrings. partition (and rpartition) only return
> the substrings before and after the first occurrence of the argument.

The split()/rsplit() functions do take an optional argument for
the maximum number of splits to make, just FYI...

>>> help("".split)
Help on built-in function split:

split(...)
    S.split([sep [,maxsplit]]) -> list of strings

    Return a list of the words in the string S, using sep as the
    delimiter string. If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator.

(as I use this on a regular basis when mashing up various text
files in a data conversion process)

-tkc


Sep 19 '06 #8
Bruno Desthuilliers wrote:
> Err... is it me being dumb, or is it a perfect use case for str.split ?

Hmm, I suppose you could get nearly the same functionality as using
split(':', 1), but with partition you also get the separator returned as
well.

> There are IMVHO much exciting new features in 2.5 (enhanced generators,
> try/except/finally, ternary operator, with: statement etc...)

I definitely agree, but I figure everyone knows about those already.
There are also the startswith() and endswith() string methods that are
new and seem neat as well.
Sep 19 '06 #9
John Salerno wrote:
> Bruno Desthuilliers wrote:
>> Err... is it me being dumb, or is it a perfect use case for str.split ?
>
> Hmm, I suppose you could get nearly the same functionality as using
> split(':', 1), but with partition you also get the separator returned as
> well.

Well, x.split(":", 1) returns a list of one or two elements, depending on x,
while x.partition(":") always returns a three-tuple.

Thomas

Sep 19 '06 #10
Bruno Desthuilliers wrote:
> I must definitively be dumb, but so far I fail to see how it's better
> than split and rsplit:

I fail to see it too. What's the point of returning the separator since
the caller passes it anyway* ?

George

* unless the separator can be a regex, but I don't think so.

Sep 19 '06 #11
John Salerno wrote:
> Bruno Desthuilliers wrote:
>> Err... is it me being dumb, or is it a perfect use case for str.split ?
>
> Hmm, I suppose you could get nearly the same functionality as using
> split(':', 1), but with partition you also get the separator returned as
> well.
>
>> There are IMVHO much exciting new features in 2.5 (enhanced
>> generators, try/except/finally, ternary operator, with: statement etc...)
>
> I definitely agree, but I figure everyone knows about those already.
> There are also the startswith() and endswith() string methods that are
> new and seem neat as well.

FYI- .startswith() and .endswith() string methods aren't new in 2.5.
They have been around since at least 2.3.

Larry Bates
Sep 19 '06 #12
Larry Bates wrote:
> John Salerno wrote:
>> Bruno Desthuilliers wrote:
>>> Err... is it me being dumb, or is it a perfect use case for str.split ?
>> Hmm, I suppose you could get nearly the same functionality as using
>> split(':', 1), but with partition you also get the separator returned as
>> well.
>>> There are IMVHO much exciting new features in 2.5 (enhanced
>>> generators, try/except/finally, ternary operator, with: statement etc...)
>> I definitely agree, but I figure everyone knows about those already.
>> There are also the startswith() and endswith() string methods that are
>> new and seem neat as well.
>
> FYI- .startswith() and .endswith() string methods aren't new in 2.5.
> They have been around since at least 2.3.
>
> Larry Bates

Oops, just a slight change in their functionality:

The startswith() and endswith() methods of string types now accept
tuples of strings to check for.

def is_image_file(filename):
    return filename.endswith(('.gif', '.jpg', '.tiff'))

(Implemented by Georg Brandl following a suggestion by Tom Lynn.)
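
The same tuple form works for startswith(). A small sketch for Python 3 follows; the `SCHEMES` tuple and the `has_known_scheme` helper are invented for the example. Note that the argument has to be a tuple: passing a list raises TypeError.

```python
SCHEMES = ('http://', 'https://', 'ftp://')  # hypothetical set of prefixes


def has_known_scheme(url):
    """Return True if url begins with any prefix in SCHEMES."""
    return url.startswith(SCHEMES)


print(has_known_scheme('http://www.python.org'))
print(has_known_scheme('file:/usr/share/doc/index.html'))

try:
    # A list is not accepted here, only a str or a tuple of str.
    'photo.gif'.endswith(['.gif', '.jpg'])
except TypeError as exc:
    print('list rejected:', exc)
```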
Sep 19 '06 #13
On Tue, Sep 19, 2006 at 07:23:50PM +0000, John Salerno wrote:
> Bruno Desthuilliers wrote:
>> Err... is it me being dumb, or is it a perfect use case for str.split ?
>
> Hmm, I suppose you could get nearly the same functionality as using
> split(':', 1), but with partition you also get the separator returned as
> well.
>
>> There are IMVHO much exciting new features in 2.5 (enhanced generators,
>> try/except/finally, ternary operator, with: statement etc...)
>
> I definitely agree, but I figure everyone knows about those already.
> There are also the startswith() and endswith() string methods that are
> new and seem neat as well.

Partition is much, much nicer than index() or find() for many
(but not all) applications.

diff for cgi.py parsing "var=X"
-    i = p.find('=')
-    if i >= 0:
-        name = p[:i]
-        value = p[i+1:]
+    (name, sep_found, value) = p.partition('=')

Notice that preserving the separator makes for a nice boolean
to test if the partition was successful. Partition raises an
error if you pass an empty separator.

partition also has the very desirable feature of returning the original
string when the separator isn't found.

For example:

script = 'foo.cgi?a=7'
script, sep, params = script.partition('?')

"script" will be "foo.cgi" even if there are no params. With
find or index you have to slice the string by hand, and with split
you would do something like

try:
    script, params = script.split('?')
except ValueError:
    pass

or

parts = script.split('?', 1)
script = parts[0]
params = ''.join(parts[1:])

Grep your source for index, find, and split and try rewriting
the code with partition. Not every instance will turn out cleaner
but many will.
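
As a rough illustration of the pattern above, here is a sketch that parses a script name and its parameters using only partition and split. The functions `parse_pair` and `parse_script` are hypothetical helpers, not the actual cgi.py code, and the example assumes Python 3.

```python
def parse_pair(pair):
    """Split 'name=value' into (name, value); value is None without '='."""
    name, sep_found, value = pair.partition('=')
    # sep_found is '=' when the separator was present and '' otherwise,
    # so it doubles as a truth test.
    return name, (value if sep_found else None)


def parse_script(url):
    """Split 'script?a=1&b=2' into the script name and a dict of parameters."""
    script, _, query = url.partition('?')
    params = dict(parse_pair(p) for p in query.split('&') if p)
    return script, params


print(parse_script('foo.cgi?a=7&debug'))
print(parse_script('foo.cgi'))
```

Neither function needs a try/except or an index check for the missing-separator case.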

Long-live-partition-ly,

-Jack
Sep 19 '06 #14
"George Sakkis" <ge***********@gmail.com> wrote:
> Bruno Desthuilliers wrote:
>> I must definitively be dumb, but so far I fail to see how it's better
>> than split and rsplit:
>
> I fail to see it too. What's the point of returning the separator since
> the caller passes it anyway* ?

The separator is only returned if it was found; otherwise you get back an
empty string. Concatenating the elements of the tuple that is returned
always gives you the original string.

It is quite similar to using split(sep,1), but reduces the amount of
special case handling for cases where the separator isn't found.
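
The concatenation property is easy to check directly. A minimal sketch for Python 3, where `roundtrip_ok` is a made-up helper name:

```python
def roundtrip_ok(text, sep):
    """Check that joining the partition()/rpartition() pieces rebuilds text."""
    return (''.join(text.partition(sep)) == text
            and ''.join(text.rpartition(sep)) == text)


samples = [('www.python.org', '.'), ('www.python.org', ':'), ('', 'x')]
for text, sep in samples:
    print(repr(text), repr(sep), roundtrip_ok(text, sep))
```

The same check fails for split, since the separator is dropped from its result.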

Sep 19 '06 #15

"Bruno Desthuilliers" <bd*****************@free.quelquepart.fr> wrote in
message news:45***********************@news.free.fr...
> Err... is it me being dumb, or is it a perfect use case for str.split ?

s.partition() was invented and its design settled on as a result of looking
at some awkward constructions in the standard library and other actual use
cases. Sometimes it replaces s.find or s.index instead of s.split. In
some cases, it is meant to be used within a loop. I was not involved and
so would refer you to the pydev discussions.
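
The post does not say which loops were meant, so the following is only one plausible illustration: a generator that peels fields off a line one at a time with partition. The name `iter_fields` and the ';' delimiter are assumptions for the example, written for Python 3.

```python
def iter_fields(line, sep=';'):
    """Yield the sep-delimited fields of line one at a time using partition()."""
    rest = line
    while True:
        field, found, rest = rest.partition(sep)
        yield field
        if not found:
            # No separator left: field was the final piece.
            break


print(list(iter_fields('a;b;;c')))
```

Because the middle element is empty exactly when the separator ran out, it serves as the loop's termination test.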

tjr

Sep 19 '06 #16
s = ("There should be one -- and preferably only one -- obvious way to "
     "do it").partition('only one')
print(s[0] + 'more than one' + s[2])

;)

Regards,
Jordan

Sep 20 '06 #17
Terry Reedy wrote:
> "Bruno Desthuilliers" <bd*****************@free.quelquepart.fr> wrote in
> message news:45***********************@news.free.fr...
>> Err... is it me being dumb, or is it a perfect use case for str.split ?
>
> s.partition() was invented and its design settled on as a result of looking
> at some awkward constructions in the standard library and other actual use
> cases. Sometimes it replaces s.find or s.index instead of s.split. In
> some cases, it is meant to be used within a loop. I was not involved and
> so would refer you to the pydev discussions.

While there is the functional aspect of the new partition method, I was
wondering about the following /technical/ aspect:

Because the result of partition is a non mutable tuple type containing
three substrings of the original string, is it perhaps also the case
that partition works without allocating extra memory for 3 new string
objects and copying the substrings into them?
I can imagine that the tuple type returned by partition is actually
a special object that contains a few internal pointers into the
original string to point at the locations of each substring.
Although a quick type check of the result object revealed that
it was just a regular tuple type, so I don't think the above is true...

--Irmen
Sep 20 '06 #18
Irmen de Jong wrote:
> Terry Reedy wrote:
>> "Bruno Desthuilliers" <bd*****************@free.quelquepart.fr> wrote in
>> message news:45***********************@news.free.fr...
>>> Err... is it me being dumb, or is it a perfect use case for str.split ?
>>
>> s.partition() was invented and its design settled on as a result of looking
>> at some awkward constructions in the standard library and other actual use
>> cases. Sometimes it replaces s.find or s.index instead of s.split. In
>> some cases, it is meant to be used within a loop. I was not involved and
>> so would refer you to the pydev discussions.
>
> While there is the functional aspect of the new partition method, I was
> wondering about the following /technical/ aspect:
>
> Because the result of partition is a non mutable tuple type containing
> three substrings of the original string, is it perhaps also the case
> that partition works without allocating extra memory for 3 new string
> objects and copying the substrings into them?
> I can imagine that the tuple type returned by partition is actually
> a special object that contains a few internal pointers into the
> original string to point at the locations of each substring.
> Although a quick type check of the result object revealed that
> it was just a regular tuple type, so I don't think the above is true...

It's not.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Sep 20 '06 #19
John Salerno wrote:
> Bruno Desthuilliers wrote:
>> Err... is it me being dumb, or is it a perfect use case for str.split ?
>
> Hmm, I suppose you could get nearly the same functionality as using
> split(':', 1), but with partition you also get the separator returned as
> well.

Well, you already know it since you use it to either split() or
partition the string !-)

Not to say these two new methods are necessarily useless - sometimes a
small improvement to an API greatly simplifies a lot of common use cases.

>> There are IMVHO much exciting new features in 2.5 (enhanced
>> generators, try/except/finally, ternary operator, with: statement etc...)
>
> I definitely agree, but I figure everyone knows about those already.
> There are also the startswith() and endswith() string methods that are
> new

Err... 'new' ???

> and seem neat as well.
Sep 20 '06 #20
At Wednesday 20/9/2006 15:11, Irmen de Jong wrote:
> Because the result of partition is a non mutable tuple type containing
> three substrings of the original string, is it perhaps also the case
> that partition works without allocating extra memory for 3 new string
> objects and copying the substrings into them?
> I can imagine that the tuple type returned by partition is actually
> a special object that contains a few internal pointers into the
> original string to point at the locations of each substring.
> Although a quick type check of the result object revealed that
> it was just a regular tuple type, so I don't think the above is true...

Nope, a python string has both a length *and* a null terminator (for
ease of interfacing C routines, I guess) so you can't just share a substring.

Gabriel Genellina
Softlab SRL


Sep 20 '06 #21
Gabriel Genellina wrote:
> Nope, a python string has both a length *and* a null terminator (for
> ease of interfacing C routines, I guess) so you can't just share a
> substring.

Of course, that makes perfect sense. Should have thought a little
bit further myself .... :)

--Irmen
Sep 20 '06 #22
Irmen de Jong wrote:
> Because the result of partition is a non mutable tuple type containing
> three substrings of the original string, is it perhaps also the case
> that partition works without allocating extra memory for 3 new string
> objects and copying the substrings into them?

nope. the core string type doesn't support sharing, and given the
typical use cases for partition, I doubt it would be more efficient
than actually creating the new strings.

(note that partition reuses the original string and the separator,
where possible)

(and yes, you're not the first one who thought of this. check the
python-dev archives from May this year for more background).
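
A quick way to poke at the reuse remark is sketched below for Python 3. Only the value-level behaviour is documented; whether the very same object comes back is a CPython implementation detail, so the identity check is printed for curiosity and should not be relied on.

```python
s = 'file:/usr/share/doc/index.html'
head, sep, tail = s.partition('://')

# Value-level behaviour: with no match, the first item equals the input
# and the other two items are empty strings.
print(head == s, sep == '', tail == '')

# Identity is an implementation detail; an interpreter may or may not hand
# back the very same object, so this line is informational only.
print('same object as input:', head is s)
```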

</F>

Sep 21 '06 #23
In message <ma**************************************@python.org>, Gabriel
Genellina wrote:
> ... a python string has both a length *and* a null terminator (for
> ease of interfacing C routines ...

How does that work for strings with embedded nulls? Or are the C routines
simply fooled into seeing a truncated part of the string?
Sep 22 '06 #24
Lawrence D'Oliveiro <ld*@geek-central.gen.new_zealand> wrote:
> In message <ma**************************************@python.org>, Gabriel
> Genellina wrote:
>> ... a python string has both a length *and* a null terminator (for
>> ease of interfacing C routines ...
>
> How does that work for strings with embedded nulls? Or are the C routines
> simply fooled into seeing a truncated part of the string?

If passed to a C library function it would mean that the C code would
generally only use up to the first embedded null. However the Python
standard library will usually check for nulls first so it can throw an
error:

>>> with open('test.txt', 'r') as f:
...     print f.read()
...
Hello world
>>> with open('test.txt\x00junk', 'r') as f:
...     print f.read()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: file() argument 1 must be (encoded string without NULL bytes), not str

What actually happens is that Python argument parsing code will reject
values with embedded nulls if asked to convert a parameter to a C string
('s', 'z', 'es', or 'et' formats), but will allow them if converting to a C
string and a length ('s#', 'z#', 'es#', or 'et#').
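
The same check can be tried on a current interpreter. This is a sketch for Python 3, where the rejection is expected to surface as ValueError rather than the TypeError shown in the 2.x session; `try_open` is a hypothetical helper and both exception types are caught to stay on the safe side.

```python
def try_open(path):
    """Try to open path; report the exception type if the name is rejected."""
    try:
        with open(path, 'r') as f:
            return f.read()
    except (ValueError, TypeError) as exc:
        # The file name never reaches the C library: Python refuses to
        # convert a string with an embedded null to a C string.
        return 'rejected: %s' % type(exc).__name__
    except OSError as exc:
        # Any ordinary I/O failure (e.g. file not found) ends up here.
        return 'os error: %s' % type(exc).__name__


print(try_open('test.txt\x00junk'))
```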
Sep 22 '06 #25
At Friday 22/9/2006 04:53, Lawrence D'Oliveiro wrote:
>> ... a python string has both a length *and* a null terminator (for
>> ease of interfacing C routines ...
>
> How does that work for strings with embedded nulls? Or are the C routines
> simply fooled into seeing a truncated part of the string?

This is for simple char* strings, ASCIIZ. If your C code can accept
embedded nulls, it has surely made other provisions - like receiving the
buffer length as a parameter. If not, it will see only a truncated string.

Gabriel Genellina
Softlab SRL


Sep 22 '06 #26
