Why " ".some_string is often used ?


Hi all,

This is not the first time I see this way of coding in Python and
I wonder why this is coded this way:

Howto on PyXML
(http://pyxml.sourceforge.net/topics/howto/node14.html)
shows it on this function, but I saw that in many other pieces of code:

def normalize_whitespace(text):
    "Remove redundant whitespace from a string"
    return ' '.join(text.split())

Is there a reason to do this instead of just returning join(text.split())?
Why concatenate " " to the string instead of just returning the string?

Thanks in advance for your explanations.

Regards,

--
Stephane Ninin

Jul 18 '05 #1
"Stéphane Ninin" wrote:
Is there a reason to do this instead of just returning join(text.split())?
Why concatenate " " to the string instead of just returning the string?


Because they're not the same thing unless you've already done

from string import join

first. join is not a builtin function.
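A minimal sketch of that point, assuming a current Python 3 interpreter rather than the Python 2 of this thread (as far as I know the `join` function was dropped from the `string` module in Python 3, so only the method form is left there). The variable names are just for illustration:

```python
import builtins

# There is no bare join() among the builtins ...
has_builtin_join = hasattr(builtins, "join")

# ... but every string object carries a join method.
has_method_join = hasattr(str, "join")

# ' ' here is an ordinary string acting as the separator;
# the dot is attribute access, not concatenation.
result = " ".join(["spam", "eggs"])
```

So `' '.join(...)` reads as "look up `join` on the string `' '` and call it".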

--
__ Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
/ \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ I get my kicks above the wasteline, sunshine
-- The American, _Chess_
Jul 18 '05 #2
Also sprach Erik Max Francis :
Is there a reason to do this instead of just returning join(text.split())?
Why concatenate " " to the string instead of just returning the string?


Because they're not the same thing unless you've already done

from string import join

first. join is not a builtin function.


Ok. Thanks.
I just realized that "." also has nothing to do with concatenation here.
Jul 18 '05 #3
On 2004-01-07, Erik Max Francis <ma*@alcyone.com> wrote:
Because they're not the same thing unless you've already done

from string import join

first. join is not a builtin function.


You know, given the volumes of text Pythonistas write about Python not
falling to the Perlish trap of magic linenoise this certainly smacks of it,
don'tcha think? Wonder how this idiom slipped in. To think all this time I
have been doing:

import string
string.join()

--
Steve C. Lamb | I'm your priest, I'm your shrink, I'm your
PGP Key: 8B6E99C5 | main connection to the switchboard of souls.
-------------------------------+---------------------------------------------
Jul 18 '05 #4
"Stéphane Ninin" <st************@yahoo.fr> wrote in message
news:Xn**********************************@213.228. 0.4...

Hi all,

This is not the first time I see this way of coding in Python and
I wonder why this is coded this way:

Howto on PyXML
(http://pyxml.sourceforge.net/topics/howto/node14.html)
shows it on this function, but I saw that in many other pieces of code:

def normalize_whitespace(text):
    "Remove redundant whitespace from a string"
    return ' '.join(text.split())

Is there a reason to do this instead of just returning join(text.split())?
Why concatenate " " to the string instead of just returning the string?

This particular idiom replaces sequences of multiple whitespace
characters with a single blank.
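A small sketch of what the two halves of the idiom do, written for Python 3; the sample text is made up for illustration:

```python
text = "one   two\t\tthree\n four"

# With no argument, split() cuts on any run of whitespace
# (spaces, tabs, newlines) and drops empty pieces.
words = text.split()

# Joining the pieces with one blank gives the normalized string.
single_spaced = " ".join(words)

# By contrast, an explicit " " separator keeps an empty string
# for every repeated blank.
naive = "a  b".split(" ")
```

The whitespace collapsing really happens in `split()`; `' '.join` only puts a single blank back between the words.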

And I agree, it's not entirely obvious why it's a string
method rather than a list method, since it operates on
a list, not on a string. The only explanation that makes
sense is that, as a list method, it would fail if the list
contained something other than a string. That's still
not very friendly, though.
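To illustrate the failure mode mentioned here, a short Python 3 sketch (the sample list is an assumption of mine): joining raises `TypeError` as soon as an element is not a string, and one common workaround is to convert the elements explicitly first.

```python
items = ["a", 1, None]

# join refuses non-string elements instead of converting them silently.
try:
    ", ".join(items)
    error = None
except TypeError as exc:
    error = exc

# Converting each element explicitly makes the intent visible.
converted = ", ".join(str(item) for item in items)
```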

John Roth
Jul 18 '05 #5
John Roth wrote:
And I agree, it's not entirely obvious why it's a string
method rather than a list method, since it operates on
a list, not on a string. The only explanation that makes
sense is that, as a list method, it would fail if the list
contained something other than a string. That's still
not very friendly, though.


On the contrary, I think that's the best reason. Lists have nothing to
do with strings, and so very string-specific methods (discounting
system-wide things such as str or repr) being included in lists is not
the right approach. Furthermore, the methods associated with a list
tend to become the "pattern" that sequence types must fulfill, and it
sets a terribly bad precedent to attach whatever domain-specific
application that's needed into a sequence type just because it's easiest
on the eyes at the moment.

The .join method is inherently string specific, and belongs on strings,
not lists. There's no doubting that seeing S.join(...) for the first
time is a bit of a surprise, but once you understand the reasoning
behind it, it makes perfect sense and makes it clear just how much it
deserves to stay that way.

And above all, of course, if you think it personally looks ugly, you can

from string import join

or write your own join function that operates over sequences and does
whatever else you might wish. That's what the flexibility is there for.
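One possible version of such a home-made function, as a sketch for Python 3 (where, to my knowledge, `from string import join` no longer works). The argument order, sequence first and separator second with a single blank as default, is meant to mirror the old `string.join`:

```python
def join(words, sep=" "):
    """Join an iterable of strings: sequence first, separator second."""
    return sep.join(words)

joined = join(["the", "quick", "brown", "fox"])
dashed = join(("a", "b", "c"), "-")
```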

--
__ Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
/ \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ Life is not a spectacle or a feast; it is a predicament.
-- George Santayana
Jul 18 '05 #6
John Roth wrote:

And I agree, it's not entirely obvious why it's a string
method rather than a list method, since it operates on
a list, not on a string. The only explanation that makes
sense is that, as a list method, it would fail if the list
contained something other than a string. That's still
not very friendly, though.


One could about as easily argue (and I believe several have done
this quite well in the past, better than I anyway) that you are
actually operating on the *string*, not the list. You are in
effect asking the string to act as a joiner for the elements in the
list, not asking the list to join itself using the specified
string.

At least, if you look at it that way, it might be easier to swallow.

-Peter
Jul 18 '05 #7
Peter Hansen <pe***@engcorp.com> writes:
John Roth wrote:

And I agree, it's not entirely obvious why it's a string
method rather than a list method, since it operates on
a list, not on a string. The only explanation that makes
sense is that, as a list method, it would fail if the list
contained something other than a string. That's still
not very friendly, though.


One could about as easily argue (and I believe several have done
this quite well in the past, better than I anyway) that you are
actually operating on the *string*, not the list. You are in
effect asking the string to act as a joiner for the elements in the
list, not asking the list to join itself using the specified
string.

At least, if you look at it that way, it might be easier to swallow.


Can't we have both? This is called a reversing method (Beck, Smalltalk
Best Practice Patterns) because it allows you to send several messages
to the same object instead of switching between different instances,
allowing the code to be more regular.

class MyList(list):
    def join(self, aString):
        return aString.join(self)

Like this:

lst = MyList(['one', 'two', 'three'])
print lst
print lst.join('\n')

I'd also like a reversing method for len

class MyList(list):
    def len(self):
        return len(self)

Often when I program against an instance I intuitively start each line
of code by writing the variable name and then a dot and then the
operation. The lack of a reversing method for len and join means that
my concentration is broken a tiny fraction of a second when I have to
remember to use another object or the global scope to find the
operation that I am after. Not a showstopper by any definition, but
irritating nonetheless.
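For anyone who wants to try the two reversing methods together, here is a sketch that should run on Python 3 (the code in this post is Python 2 and uses the print statement). Note that the list has to be created as a `MyList`; a plain list literal has neither method.

```python
class MyList(list):
    def join(self, sep):
        # Delegate to the string: the "reversed" form of sep.join(self).
        return sep.join(self)

    def len(self):
        # Inside the method body, len resolves to the builtin, because
        # class attributes are not part of a function's enclosing scope.
        return len(self)

lst = MyList(["one", "two", "three"])
text = lst.join("\n")
size = lst.len()
```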

--

Syver Enstad
Jul 18 '05 #8
On Thu, 08 Jan 2004 03:50:04 -0800, rumours say that Erik Max Francis
<ma*@alcyone.com> might have written:

[' '.join discussion]
And above all, of course, if you think it personally looks ugly, you can

from string import join

or write your own join function that operates over sequences and does
whatever else you might wish. That's what the flexibility is there for.


I believe str.join(string, sequence) works best for the functional types
(no need to rely on the string module).
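A sketch of what that looks like in practice, written for Python 3; the `rows` data is invented for the example:

```python
from functools import partial

# Calling the method through the type: separator first, then the sequence.
direct = str.join(" ", ["x", "y"])

# Fixing the separator gives a one-argument function for map().
rows = [["a", "b"], ["c", "d", "e"]]
lines = list(map(partial(str.join, ","), rows))

# The bound method ",".join does the same job without partial.
lines_bound = list(map(",".join, rows))
```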
--
TZOTZIOY, I speak England very best,
Ils sont fous ces Redmontains! --Harddix
Jul 18 '05 #9
On 08 Jan 2004 16:34:39 +0100, rumours say that Syver Enstad
<sy*************@online.no> might have written:
I'd also like a reversing method for len

class MyList(list):
    def len(self):
        return len(self)


You can always use the __len__ attribute in this specific case.

And now for the hack value:

class MyList(list):
    import new as _new, __builtin__
    def __getattr__(self, attr):
        try:
            return self._new.instancemethod( \
                getattr(self.__builtin__, attr), \
                self, \
                None)
        except AttributeError:
            raise AttributeError, \
                "there is no '%s' builtin" % attr

allowing:

>>> a=MyList()
>>> a.append(12)
>>> a.append(24)
>>> a.len()
2
>>> a.min()
12
>>> a.max()
24

It works for all builtins that can take a list as a first argument.
Of course it should not be taken seriously :)
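The `new` and `__builtin__` modules used above are Python 2 only. A rough Python 3 counterpart of the same hack, as a sketch and not something to use seriously either, could rely on the `builtins` module and `functools.partial` (my substitution for `new.instancemethod`):

```python
import builtins
from functools import partial

class MyList(list):
    def __getattr__(self, attr):
        # Only called after normal attribute lookup has failed.
        func = getattr(builtins, attr, None)
        if not callable(func):
            raise AttributeError("there is no %r builtin" % attr)
        # Pre-bind the list as the builtin's first argument.
        return partial(func, self)

a = MyList([12, 24])
results = (a.len(), a.min(), a.max(), a.sum())
```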
--
TZOTZIOY, I speak England very best,
Ils sont fous ces Redmontains! --Harddix
Jul 18 '05 #10

"John Roth" <ne********@jhrothjr.com> wrote in message
news:vv************@news.supernews.com...
And I agree, it's not entirely obvious why it's a string
method rather than a list method,

Because, as I and others have posted several times in previous threads, and
explicated with several examples, <str,unicode>.join is NOT, NOT, NOT a
list method, any more than it is a tuple, dict, array, generator-iterator,
or any other iterable method. Taking 'string' generically (as either type
str or unicode), .join joins a sequence (iterable) of strings with a
string.

since it operates on a list, not on a string.

Huh? It operates on a sequence of strings. It has nothing to do with
lists in particular. The builtinness and mutability of lists is irrelevant
to this generic read-only operation.

>>> help(str.join) # or ''.join or unicode.join or u''.join

join(...)
    S.join(sequence) -> string

    Return a string which is the concatenation of the strings in the
    sequence. The separator between elements is S.

Notice the absence of 'list'. Please do not confuse newbies with
misinformation.
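A few quick cases that show join accepting iterables other than lists, as a Python 3 sketch (the sample values are mine):

```python
from_tuple = "-".join(("a", "b", "c"))

# Iterating a dict yields its keys (in insertion order on Python 3.7+).
from_dict = "-".join({"x": 1, "y": 2})

# A generator expression is consumed lazily, item by item.
from_generator = "-".join(str(n) for n in range(3))

# A string is itself an iterable of one-character strings.
from_string = "-".join("abc")
```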
The only explanation that makes
sense is that, as a list method, it would fail if the list
contained something other than a string.


This is true for any iterable that contains or yields something other than
a string.
Again, this function/method has nothing in particular to do with lists
other than the fact that lists are one of several types of iterables. That
is why join cannot be a list method and certainly not just a list method.

If 'iterable' were a concrete type/class/interface that all iterables had
to inherit from in order to be recognized as an iterable, rather than an
abstract protocol to be implemented, then one might suggest that join be an
iterable method. But the statement above, with 'list' replaced by
'iterable', would still be true. Given that only a small finite subset of
the unbounded set of iterable functions could be designated as basic by
being made a method, one could easily argue that such designation should be
restricted to functions potentially applicable to any iterable. Count,
filter, map, reduce, iterate (apply for-loop body), and others in itertools
would be such candidates.

If there were an iterable-of-basestrings object subbing the hypothetical
iterable object, then join might be an appropriate method for that. But that
is not the Python universe we have. Nor do I necessarily wish it. The
beauty of the abstract iterable/iterator interfaces, to me, is that they
are so simple, clean, and generically useful, without having to privilege
anyone's idea of which sequence functions are 'basic'.

Terry J. Reedy
Jul 18 '05 #11
In article <Wu********************@comcast.com>, Terry Reedy wrote:

"John Roth" <ne********@jhrothjr.com> wrote in message
news:vv************@news.supernews.com...
And I agree, it's not entirely obvious why it's a string
method rather than a list method,


Because, as I and others have posted several times in previous threads, and
explicated with several examples, <str,unicode>.join is NOT, NOT, NOT a
list method, any more than it is a tuple, dict, array, generator-iterator,
or any other iterable method. Taking 'string' generically (as either type
str or unicode), .join joins a sequence (iterable) of strings with a
string.


It's not a list method because it's not a list method or any other kind of
iterable method? That seems like circular reasoning.

Consider the following two pieces of data:

1. 'the,quick,brown,fox'
2. ['the', 'quick', 'brown', 'fox']

They are both lists of words. Perhaps the first is not a Python-list of
words, but it's a list of words nonetheless. #1 can be converted into #2 by
calling ".split(',')" on it. Doesn't it seem natural that #2 be converted to
#1 by calling ".join(',')"? It works this way in JavaScript and Ruby, at
least.
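The round trip between the two forms, as it is actually spelled in Python, looks like this (a Python 3 sketch using the data above):

```python
text = "the,quick,brown,fox"

# #1 -> #2: the method lives on the string being split.
words = text.split(",")

# #2 -> #1: the method lives on the separator, not on the list.
rejoined = ",".join(words)
```

So the separator is the argument in one direction and the receiver in the other, which is the asymmetry being discussed.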

The argument is more of a technical issue. There are only two kinds of
strings. There are many kinds of "iterables". So, it's easier to define
"join" on the string, and force implementers of custom string types to
implement "join" as well (since this is more rare) than to define "join" on
an iterable and force implementers of the many kinds of iterables to define
"join" as well. Conceptually, I'm not sure that the case is so strong that
"join" is a string method.

In reality, "join" isn't really a string method any more than it's an
iterable method. It's a string-iterable<string> method; it operates on the
relationship between a string and an iterable of strings. If we had a
class that could represent that relationship, "join" would be a method of
that class, ie.:

seq = ['the', 'quick', 'brown', 'fox']
sep = ','

ssi = StringStringIterable(sep, seq)
result = ssi.join()

But this would be somewhat pointless because:

1. This would be a pain to type.
2. The class probably wouldn't pull its weight.
3. The elegance Python has with text processing is lost.
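For the curious, a minimal sketch of what such a class might look like. `StringStringIterable` is purely hypothetical (it is the made-up name from this post, not anything in Python), and the sketch mostly confirms point 2: the class only wraps a single call.

```python
class StringStringIterable:
    """Hypothetical pairing of a separator with an iterable of strings."""

    def __init__(self, sep, seq):
        self.sep = sep
        self.seq = seq

    def join(self):
        # All of the real work is still done by str.join.
        return self.sep.join(self.seq)

seq = ["the", "quick", "brown", "fox"]
sep = ","

ssi = StringStringIterable(sep, seq)
result = ssi.join()
```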

Another solution might be to use a mixin class that provides StringIterable
methods, and have the built-in list include this mixin. Then, you could
always mix it into your own iterable classes if you wanted "join" to be
available. But then, you've still got issues trying to integrate it with
tuples and generators.
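A sketch of the mixin idea for Python 3. All names here (`StringIterableMixin`, `WordList`, `Words`) are hypothetical, and since the built-in list cannot be modified in place, the sketch has to subclass it:

```python
class StringIterableMixin:
    """Hypothetical mixin: adds join() to any iterable of strings."""

    def join(self, sep):
        return sep.join(self)

class WordList(StringIterableMixin, list):
    pass

class Words(StringIterableMixin):
    """A custom iterable that opts in to the mixin."""

    def __init__(self, text):
        self._text = text

    def __iter__(self):
        return iter(self._text.split())

joined_list = WordList(["a", "b"]).join(",")
joined_custom = Words("the quick fox").join("-")

# Built-in tuples (and generators) are untouched by the mixin.
tuple_has_join = hasattr(("a", "b"), "join")
```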

Sometimes, object-orientedness gets in the way, and I think this is one of
those cases. "str.join" is probably the winner here, but since it's really
just a string method being used "out of context", the delimiter is the first
argument, and this doesn't read well to me. I think that "string.join" makes
more sense; it says "join this sequence using this delimiter" instead of
str.join's "join using this delimiter this sequence".
since it operates on a list, not on a string.


Huh? It operates on a sequence of strings. It has nothing to do with
lists in particular. The builtinness and mutability of lists is irrelevant
to this generic read-only operation.


Only because it is defined as such. Ruby and JavaScript define the "join"
method on built-in arrays. Newcomers to Python who have programmed in those
languages will naturally associate "join" with lists, even though
technically, in the Python world, it's really something associated with
the relationship between a string and an iterable of strings. Which is an
awful lot of semantics to digest when you just want to stick some commas
between words in a list.

--
..:[ dave benjamin (ramenboy) -:- www.ramenfest.com -:- www.3dex.com ]:.
: d r i n k i n g l i f e o u t o f t h e c o n t a i n e r :
Jul 18 '05 #12
In article <sl******************@lackingtalent.com>,
Dave Benjamin <ra***@lackingtalent.com> wrote:

The argument is more of a technical issue. There are only two kinds of
strings. There are many kinds of "iterables". So, it's easier to define
"join" on the string, and force implementers of custom string types to
implement "join" as well (since this is more rare) than to define "join" on
an iterable and force implementers of the many kinds of iterables to define
"join" as well. Conceptually, I'm not sure that the case is so strong that
"join" is a string method.

[ ... ]

Sometimes, object-orientedness gets in the way, and I think this is one of
those cases. "str.join" is probably the winner here, but since it's really
just a string method being used "out of context", the delimiter is the first
argument, and this doesn't read well to me. I think that "string.join" makes
more sense; it says "join this sequence using this delimiter" instead of
str.join's "join using this delimiter this sequence".


Why not something really simple which does something like this?

def myjoin(seq,sep):
    def _addsep(l, r, s=sep): return l+s+r
    return reduce(_addsep, seq)

>>> myjoin(['a','b','c'], ",")
'a,b,c'
>>> myjoin(['a','b','c'], "")
'abc'
>>> myjoin([1,2,3,4], 0)
10
>>> myjoin("abcd", ',')
'a,b,c,d'

It might not be the fastest, but it is straightforward and generic,
and could be optimized in C, if desired.
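On Python 3, where `reduce` has moved to `functools`, the same idea can be sketched as follows. Two caveats that I believe hold: an empty sequence raises `TypeError` because no initial value is given, and each `+` builds a new string, so this is likely slower than `str.join` on long inputs.

```python
from functools import reduce

def myjoin(seq, sep):
    # Fold the sequence pairwise: ((a + sep + b) + sep + c) ...
    return reduce(lambda left, right: left + sep + right, seq)

strings = myjoin(["a", "b", "c"], ",")
numbers = myjoin([1, 2, 3, 4], 0)
chars = myjoin("abcd", ",")

try:
    myjoin([], ",")
    empty_error = None
except TypeError as exc:
    empty_error = exc
```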

Gary Duzan
BBN Technologies
A Verizon Company
Jul 18 '05 #13
