Feature Proposal: Sequence .join method

Hi all!

I could not find out whether this has been proposed before (there are
too many discussions on join as a sequence method with different
semantics). So, I propose a generalized .join method on all sequences
with these semantics:

def join(self, seq):
    T = type(self)
    result = T()
    if len(seq):
        result = T(seq[0])
        for item in seq[1:]:
            result = result + self + T(item)
    return result

This would allow code like the following:

[0].join([[5], [42, 5], [1, 2, 3], [23]])

resulting in:
[5, 0, 42, 5, 0, 1, 2, 3, 0, 23]

You might have noticed that this actually contains two proposals, so if
you don't like str.join applying str() to each item in the sequence,
replace the line
    result = result + self + T(item)
with
    result = result + self + item
My version has been turned down in the past as far as I have read, yet I
find the first version more useful in the new context... you can pass a
sequence of lists or tuples or really any sequence to the method and it
does what you think (at least what I think :).

Any comments welcome,
David.
Sep 30 '05 #1
David Murmann wrote:
> replace the line
>     result = result + self + T(item)
> with
>     result = result + self + item

and of course the line
    result = T(seq[0])
with
    result = seq[0]
Sep 30 '05 #2
David Murmann wrote:
> Hi all!
>
> I could not find out whether this has been proposed before (there are
> too many discussions on join as a sequence method with different
> semantics). So, I propose a generalized .join method on all sequences
> with these semantics:
>
> def join(self, seq):
>     T = type(self)
>     result = T()
>     if len(seq):
>         result = T(seq[0])
>         for item in seq[1:]:
>             result = result + self + T(item)
>     return result
>
> This would allow code like the following:
>
> [0].join([[5], [42, 5], [1, 2, 3], [23]])


I don't like the idea of having to put this on all sequences. If you
want this, I'd instead propose it as a function (perhaps builtin,
perhaps in some other module).

Also, this particular implementation is a bad idea. The repeated += to
result is likely to result in O(N**2) behavior.

STeVe
Sep 30 '05 #3
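
To illustrate the point about repeated concatenation: each `result + self + T(item)` copies everything accumulated so far, so the total work grows roughly quadratically with the output length. Below is a minimal sketch of a standalone function that avoids this for lists by extending one result in place, assuming Python 3. The name `join_lists` is hypothetical and does not come from the thread.

```python
def join_lists(sep, seqs):
    """Join the sequences in seqs into one list, placing sep between them.

    Hypothetical helper: list.extend appends in amortized linear time,
    so the accumulated result is never copied in full on each step.
    """
    result = []
    first = True
    for item in seqs:
        if not first:
            result.extend(sep)   # separator goes only between items
        result.extend(item)
        first = False
    return result


print(join_lists([0], [[5], [42, 5], [1, 2, 3], [23]]))
```

Because it only iterates over `seqs`, this sketch also accepts generators and other iterables that have no `len()`.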
Steven Bethard wrote:
> David Murmann wrote:
> > Hi all!
> >
> > I could not find out whether this has been proposed before (there are
> > too many discussions on join as a sequence method with different
> > semantics). So, I propose a generalized .join method on all sequences
> > with these semantics:
> >
> > def join(self, seq):
> >     T = type(self)
> >     result = T()
> >     if len(seq):
> >         result = T(seq[0])
> >         for item in seq[1:]:
> >             result = result + self + T(item)
> >     return result
> >
> > This would allow code like the following:
> >
> > [0].join([[5], [42, 5], [1, 2, 3], [23]])
>
> I don't like the idea of having to put this on all sequences. If you
> want this, I'd instead propose it as a function (perhaps builtin,
> perhaps in some other module).
>
> Also, this particular implementation is a bad idea. The repeated += to
> result is likely to result in O(N**2) behavior.
>
> STeVe


Hi and thanks for the fast reply,

I just figured out that the following implementation is probably much
faster, and short enough to be used in place for all of my use cases:

def join(sep, seq):
    return reduce(lambda x, y: x + sep + y, seq, type(sep)())

so, I'm withdrawing my proposal, and instead propose to keep reduce and
lambda in py3k ;).

thanks again,
David.
Sep 30 '05 #4
> def join(sep, seq):
>     return reduce(lambda x, y: x + sep + y, seq, type(sep)())


Damn, I wanted too much. Proper implementation:

def join(sep, seq):
    if len(seq):
        return reduce(lambda x, y: x + sep + y, seq)
    return type(sep)()

but still short enough

see you,
David.
Sep 30 '05 #5
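
For readers trying the two `reduce` versions today, here is a small sketch assuming Python 3, where `reduce` has to be imported from `functools`. It shows why the first attempt needed correcting: starting from an empty initial value puts a separator in front of the first item. The name `join_with_initial` is hypothetical and only used to keep the two versions apart.

```python
from functools import reduce  # reduce is no longer a builtin in Python 3


def join_with_initial(sep, seq):
    # First attempt from the thread: the empty initial value is also
    # combined with sep, so the result starts with a separator.
    return reduce(lambda x, y: x + sep + y, seq, type(sep)())


def join(sep, seq):
    # Corrected version from the thread: fall back to an empty
    # container only when there is nothing to reduce.
    if len(seq):
        return reduce(lambda x, y: x + sep + y, seq)
    return type(sep)()


print(join_with_initial([0], [[1], [2]]))  # [0, 1, 0, 2]
print(join([0], [[1], [2]]))               # [1, 0, 2]
```

Both versions still build a new object on every `+`, so they share the copying cost of the original proposal.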
David Murmann wrote:
> I could not find out whether this has been proposed before (there are
> too many discussions on join as a sequence method with different
> semantics). So, I propose a generalized .join method on all sequences


so all you have to do now is to find the sequence base class, and
you're done...

</F>

Sep 30 '05 #6
On Thu, 29 Sep 2005 20:37:31 -0600
Steven Bethard wrote:
> I don't like the idea of having to put this on all sequences. If you
> want this, I'd instead propose it as a function (perhaps builtin,
> perhaps in some other module).


itertools module seems the right place for it.

itertools.chain(*a)

is the same as the proposed

[].join(a)

--
jk
Sep 30 '05 #7
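
A short sketch of the `itertools.chain` equivalence, assuming Python 3. One difference from the proposed `[].join(a)` is worth noting: `chain` returns a lazy iterator rather than a list, so it has to be wrapped in `list()` to compare results. The variable names are only for illustration.

```python
from itertools import chain

a = [[5], [42, 5], [1, 2, 3], [23]]

# chain(*a) walks each inner sequence in turn, with nothing in between,
# which matches joining with an empty separator.
flat = list(chain(*a))
print(flat)

# chain.from_iterable does the same without unpacking a into arguments.
flat_lazy = list(chain.from_iterable(a))
print(flat_lazy)
```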

"David Murmann" <da***********@rwth-aachen.de> wrote in message
news:3q************@news.dfncis.de...
> > def join(sep, seq):
> >     return reduce(lambda x, y: x + sep + y, seq, type(sep)())
>
> Damn, I wanted too much. Proper implementation:
>
> def join(sep, seq):
>     if len(seq):
>         return reduce(lambda x, y: x + sep + y, seq)
>     return type(sep)()
>
> but still short enough


For general use, this is both too general and not general enough.

If len(seq) exists then seq is probably reiterable, in which case it may be
possible to determine the output length and preallocate to make the process
O(n) instead of O(n**2). I believe str.join does this. A user written
join for lists could also. A tuple function could make a list first and
then tuple(it) at the end.

If seq is a general (non-empty) iterable, len(seq) may raise an exception
even though the reduce would work fine.

Terry J. Reedy

Sep 30 '05 #8
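
The preallocation idea can be sketched as follows, assuming Python 3 and assuming every item supports `len()`. The names `join_prealloc` and `join_tuples` are hypothetical. This is only an illustration of the approach described above, not a claim about how `str.join` is implemented.

```python
def join_prealloc(sep, seqs):
    """Join sized sequences into one list, sizing the result up front."""
    seqs = list(seqs)            # make the input reiterable
    if not seqs:
        return []
    sep = list(sep)
    # Output length: all items plus one separator per gap.
    total = sum(len(s) for s in seqs) + len(sep) * (len(seqs) - 1)
    result = [None] * total
    pos = 0
    for i, s in enumerate(seqs):
        if i:
            result[pos:pos + len(sep)] = sep   # same-size slice assignment
            pos += len(sep)
        result[pos:pos + len(s)] = s
        pos += len(s)
    return result


def join_tuples(sep, seqs):
    # Build a list first, then convert once at the end.
    return tuple(join_prealloc(sep, seqs))


print(join_prealloc([0], [[5], [42, 5], [1, 2, 3], [23]]))
print(join_tuples((0,), [(1,), (2, 3)]))
```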
Terry Reedy wrote:
> "David Murmann" <da***********@rwth-aachen.de> wrote in message
> news:3q************@news.dfncis.de...
> > > def join(sep, seq):
> > >     return reduce(lambda x, y: x + sep + y, seq, type(sep)())
> >
> > Damn, I wanted too much. Proper implementation:
> >
> > def join(sep, seq):
> >     if len(seq):
> >         return reduce(lambda x, y: x + sep + y, seq)
> >     return type(sep)()
> >
> > but still short enough
>
> For general use, this is both too general and not general enough.
>
> If len(seq) exists then seq is probably reiterable, in which case it may be
> possible to determine the output length and preallocate to make the process
> O(n) instead of O(n**2). I believe str.join does this. A user written
> join for lists could also. A tuple function could make a list first and
> then tuple(it) at the end.
>
> If seq is a general (non-empty) iterable, len(seq) may raise an exception
> even though the reduce would work fine.
>
> Terry J. Reedy

For the general iterable case, you could have something like this:

>>> def interleave(sep, iterable):
...     it = iter(iterable)
...     next = it.next()
...     try:
...         while 1:
...             item = next
...             next = it.next()
...             yield item
...             yield sep
...     except StopIteration:
...         yield item
...
>>> list(interleave(100,range(10)))
[0, 100, 1, 100, 2, 100, 3, 100, 4, 100, 5, 100, 6, 100, 7, 100, 8, 100, 9]


but I can't think of a use for it ;-)

Michael

Sep 30 '05 #9
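
The generator above is written for Python 2, where iterators have a `.next()` method. Below is a sketch of the same idea for Python 3, assuming only the builtin `next()`. It also handles an empty iterable explicitly, since a `StopIteration` that escapes a generator body is turned into a `RuntimeError` in Python 3.7 and later.

```python
def interleave(sep, iterable):
    """Yield the items of iterable with sep between consecutive items."""
    it = iter(iterable)
    try:
        item = next(it)
    except StopIteration:
        return               # empty input: yield nothing
    yield item
    for item in it:
        yield sep            # separator only before later items
        yield item


print(list(interleave(100, range(10))))
```

Yielding the separator before each later item, rather than after each item, avoids having to look one element ahead.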
Michael Spencer wrote:
> Terry Reedy wrote:
> > "David Murmann" <da***********@rwth-aachen.de> wrote in message
> > news:3q************@news.dfncis.de...
> > > > def join(sep, seq):
> > > >     return reduce(lambda x, y: x + sep + y, seq, type(sep)())
> > >
> > > Damn, I wanted too much. Proper implementation:
> > >
> > > def join(sep, seq):
> > >     if len(seq):
> > >         return reduce(lambda x, y: x + sep + y, seq)
> > >     return type(sep)()
> > >
> > > but still short enough
> >
> > For general use, this is both too general and not general enough.
> >
> > If len(seq) exists then seq is probably reiterable, in which case it
> > may be possible to determine the output length and preallocate to make
> > the process O(n) instead of O(n**2). I believe str.join does this. A
> > user written join for lists could also. A tuple function could make a
> > list first and then tuple(it) at the end.
> >
> > If seq is a general (non-empty) iterable, len(seq) may raise an
> > exception even though the reduce would work fine.
> >
> > Terry J. Reedy
>
> For the general iterable case, you could have something like this:
>
> >>> def interleave(sep, iterable):
> ...     it = iter(iterable)
> ...     next = it.next()
> ...     try:
> ...         while 1:
> ...             item = next
> ...             next = it.next()
> ...             yield item
> ...             yield sep
> ...     except StopIteration:
> ...         yield item
> ...
> >>> list(interleave(100,range(10)))
> [0, 100, 1, 100, 2, 100, 3, 100, 4, 100, 5, 100, 6, 100, 7, 100, 8,
> 100, 9]


Well, as en**********@ospaz.ru pointed out, there is already
itertools.chain which almost does this. In my opinion it could be useful
to add an optional keyword argument to it (like "connector" or "link"),
which is iterated between the other arguments.

> but I can't think of a use for it ;-)

Of course, I have a use case, but I don't know whether this is useful
enough to be added to the standard library. (Yet this would be a much
smaller change than changing all sequences ;)

thanks for all replies,
David.
Sep 30 '05 #10
Hi again,

I wrote a small patch that changes itertools.chain to take a "link"
keyword argument. If given, it is iterated between the normal arguments;
otherwise the behavior is unchanged.

I'd like to hear your opinion on both the functionality and the actual
implementation (as this is one of the first things I ever wrote in C).

till then,
David.

Index: python/dist/src/Modules/itertoolsmodule.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/itertoolsmodule.c,v
retrieving revision 1.41
diff -c -r1.41 itertoolsmodule.c
*** python/dist/src/Modules/itertoolsmodule.c 26 Aug 2005 06:42:30 -0000 1.41
--- python/dist/src/Modules/itertoolsmodule.c 30 Sep 2005 22:28:38 -0000
***************
*** 1561,1587 ****
      int tuplesize = PySequence_Length(args);
      int i;
      PyObject *ittuple;

!     if (!_PyArg_NoKeywords("chain()", kwds))
!         return NULL;

      /* obtain iterators */
      assert(PyTuple_Check(args));
      ittuple = PyTuple_New(tuplesize);
      if(ittuple == NULL)
          return NULL;
!     for (i=0; i < tuplesize; ++i) {
!         PyObject *item = PyTuple_GET_ITEM(args, i);
!         PyObject *it = PyObject_GetIter(item);
!         if (it == NULL) {
!             if (PyErr_ExceptionMatches(PyExc_TypeError))
!                 PyErr_Format(PyExc_TypeError,
!                     "chain argument #%d must support iteration",
!                     i+1);
!             Py_DECREF(ittuple);
!             return NULL;
          }
-         PyTuple_SET_ITEM(ittuple, i, it);
      }

      /* create chainobject structure */
--- 1561,1621 ----
      int tuplesize = PySequence_Length(args);
      int i;
      PyObject *ittuple;
+     PyObject *link = NULL;

!     if (kwds != NULL && PyDict_Check(kwds)) {
!         link = PyDict_GetItemString(kwds, "link");
!         if (link != NULL)
!             /* create more space for the link iterators */
!             tuplesize = tuplesize*2-1;
!     }

      /* obtain iterators */
      assert(PyTuple_Check(args));
      ittuple = PyTuple_New(tuplesize);
      if(ittuple == NULL)
          return NULL;
!     if (link == NULL) {
!         /* no keyword argument provided */
!         for (i=0; i < tuplesize; ++i) {
!             PyObject *item = PyTuple_GET_ITEM(args, i);
!             PyObject *it = PyObject_GetIter(item);
!             if (it == NULL) {
!                 if (PyErr_ExceptionMatches(PyExc_TypeError))
!                     PyErr_Format(PyExc_TypeError,
!                         "chain argument #%d must support iteration",
!                         i+1);
!                 Py_DECREF(ittuple);
!                 return NULL;
!             }
!             PyTuple_SET_ITEM(ittuple, i, it);
!         }
!     }
!     else {
!         for (i=0; i < tuplesize; ++i) {
!             PyObject *it = NULL;
!             if (i%2 == 0) {
!                 PyObject *item = PyTuple_GET_ITEM(args, i/2);
!                 it = PyObject_GetIter(item);
!             }
!             else {
!                 it = PyObject_GetIter(link);
!             }
!             if (it == NULL) {
!                 if (PyErr_ExceptionMatches(PyExc_TypeError)) {
!                     if (i%2 == 0)
!                         PyErr_Format(PyExc_TypeError,
!                             "chain argument #%d must support iteration",
!                             i/2+1);
!                     else
!                         PyErr_Format(PyExc_TypeError,
!                             "chain keyword argument link must support iteration");
!                 }
!                 Py_DECREF(ittuple);
!                 return NULL;
!             }
!             PyTuple_SET_ITEM(ittuple, i, it);
          }
      }

      /* create chainobject structure */

Sep 30 '05 #11
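
For anyone who wants to try the proposed behavior without patching the C module, here is a rough pure-Python sketch, assuming Python 3. The name `chain_with_link` is hypothetical; `itertools.chain` itself has no `link` argument. As in the patch, a fresh iterator over `link` is requested for every gap, so `link` should be reiterable (a list or tuple, not a one-shot generator). Unlike the patch, the iterators here are obtained lazily rather than at construction time.

```python
from itertools import chain


def chain_with_link(*iterables, link=None):
    """Chain iterables, iterating over link between consecutive ones."""
    if link is None:
        return chain(*iterables)     # unchanged behavior without link

    def parts():
        first = True
        for iterable in iterables:
            if not first:
                yield link           # iterated afresh in each gap
            yield iterable
            first = False

    return chain.from_iterable(parts())


print(list(chain_with_link([5], [42, 5], [1, 2, 3], [23], link=[0])))
```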
