
# Best way to create a copy of a list

Hi all

Assume a 2-dimensional list called 'table' - conceptually think of it
as rows and columns.

Assume I want to create a temporary copy of a row called 'row',
allowing me to modify the contents of 'row' without modifying the
contents of 'table'.

I used to fall into the newbie trap of 'row = table[23]', but I have
learned my lesson by now - changing 'row' also changes 'table'.
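
For anyone who has not hit this trap yet, here is a minimal sketch of it. The contents of 'table' are invented for illustration; only the names 'table' and 'row' come from the post.

```python
# Hypothetical 30x3 table; the values are made up for the demonstration.
table = [[r * 10 + c for c in range(3)] for r in range(30)]

row = table[23]             # binds a second name to the SAME list object
row[0] = -1                 # mutate through the new name

assert row is table[23]     # both names refer to one object
assert table[23][0] == -1   # so the change is visible in 'table' too
```

Plain assignment never copies anything in Python; it only attaches another name to an existing object.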

I have found two ways of doing it that seem to work.

1 - row = table[23][:]

2 - row = []
row[:] = table[23]

Are these effectively identical, or is there a subtle distinction which
I should be aware of?

I did some timing tests, and 2 is quite a bit faster if 'row'
pre-exists and I just measure the second statement.

TIA

Frank Millman

Apr 4 '06 #1

Frank Millman wrote:
Hi all

Assume a 2-dimensional list called 'table' - conceptually think of it
as rows and columns.

Assume I want to create a temporary copy of a row called 'row',
allowing me to modify the contents of 'row' without modifying the
contents of 'table'.

I used to fall into the newbie trap of 'row = table[23]', but I have
learned my lesson by now - changing 'row' also changes 'table'.

I have found two ways of doing it that seem to work.

1 - row = table[23][:]

2 - row = []
row[:] = table[23]

Are these effectively identical, or is there a subtle distinction which
I should be aware of?

I did some timing tests, and 2 is quite a bit faster if 'row'
pre-exists and I just measure the second statement.

you could use list()

row = list(table[23])

The effect is the same, but it's nicer to read.
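
A rough sketch of where the three spellings agree and where they differ, using a made-up 'table' (the names 'alias', 'copy1' etc. are illustrative, not from the thread). All three produce an independent list with the same items. The one behavioural difference I am aware of is that `row[:] = ...` fills an existing list object in place, so any other name bound to that object sees the new contents, while `row = ...` simply rebinds the name.

```python
# Hypothetical table; values are invented for the demonstration.
table = [[r * 10 + c for c in range(3)] for r in range(30)]

copy1 = table[23][:]        # slice: builds a new list
copy2 = list(table[23])     # constructor: builds a new list
copy3 = []
copy3[:] = table[23]        # slice assignment: fills an existing list in place

for c in (copy1, copy2, copy3):
    assert c == table[23] and c is not table[23]

# The distinction: slice assignment mutates the object, plain assignment rebinds.
row = [0, 0, 0]
alias = row                 # a second name for the same list
row[:] = table[23]
assert alias == table[23]   # alias saw the in-place update

row = table[5][:]           # rebinding 'row' creates a separate object
assert alias == table[23]   # alias is untouched this time
```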

Apr 4 '06 #2
Frank Millman wrote:
I have found two ways of doing it that seem to work.

1 - row = table[23][:]

2 - row = []
row[:] = table[23]

Are these effectively identical, or is there a subtle distinction which
I should be aware of?

I did some timing tests, and 2 is quite a bit faster if 'row'
pre-exists and I just measure the second statement.

quite a bit ? maybe if you're using very short rows, and all rows
have the same length, but hardly in the general case:

python -mtimeit -s "data=[range(100)]*100; row = []" "row[:] = data[23]"
100000 loops, best of 3: 5.35 usec per loop

python -mtimeit -s "data=[range(100)]*100" "row = data[23][:]"
100000 loops, best of 3: 4.81 usec per loop

(for constant-length rows, the "row[:]=" form saves one memory
allocation, since the target list can be reused as is. for longer rows,
other things seem to dominate)
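
The same comparison can be run from a script with the timeit module. This is only a sketch: the helper name `best_usec` and the loop counts are my own choices, and on Python 3 the setup needs `list(range(100))` because `range()` no longer returns a list. Absolute numbers will differ by machine and interpreter version, so no expected timings are shown.

```python
import timeit

# Python 3 setup; the commands in this thread were run on Python 2,
# where range(100) was already a list.
SETUP = "data = [list(range(100))] * 100; row = []"

STATEMENTS = [
    "row[:] = data[23]",
    "row = data[23][:]",
    "row = list(data[23])",
]

def best_usec(stmt, number=20_000, repeat=3):
    """Return the best per-loop time in microseconds over `repeat` runs."""
    times = timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat)
    return min(times) / number * 1e6

results = {stmt: best_usec(stmt) for stmt in STATEMENTS}
for stmt, usec in results.items():
    print(f"{stmt:22} {usec:.3f} usec per loop")
```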

</F>

Apr 4 '06 #3

Fredrik Lundh wrote:
Frank Millman wrote:
I have found two ways of doing it that seem to work.

1 - row = table[23][:]

2 - row = []
row[:] = table[23]

Are these effectively identical, or is there a subtle distinction which
I should be aware of?

I did some timing tests, and 2 is quite a bit faster if 'row'
pre-exists and I just measure the second statement.

quite a bit ? maybe if you're using very short rows, and all rows
have the same length, but hardly in the general case:

python -mtimeit -s "data=[range(100)]*100; row = []" "row[:] = data[23]"
100000 loops, best of 3: 5.35 usec per loop

python -mtimeit -s "data=[range(100)]*100" "row = data[23][:]"
100000 loops, best of 3: 4.81 usec per loop

(for constant-length rows, the "row[:]=" form saves one memory
allocation, since the target list can be reused as is. for longer rows,
other things seem to dominate)

</F>

Interesting. My results are opposite.

python -mtimeit -s "data=[range(100)]*100; row = []" "row[:] = data[23]"
100000 loops, best of 3: 2.57 usec per loop

python -mtimeit -s "data=[range(100)]*100" "row = data[23][:]"
100000 loops, best of 3: 2.89 usec per loop

For good measure, I tried Rune's suggestion -

python -mtimeit -s "data=[range(100)]*100" "row = list(data[23])"
100000 loops, best of 3: 3.69 usec per loop

For practical purposes these differences are immaterial - I do not
anticipate huge quantities of data.

If they are all equivalent from a functional point of view, I lean
towards the second version. I agree with Rune that the third one is
nicer to read, but somehow the [:] syntax makes it a bit more obvious
what is going on.

Thanks

Frank

Apr 4 '06 #4
"Frank Millman" <fr***@chagford.com> writes:
Interesting. My results are opposite.

I got the same here (CPython 2.4.1):

godoy@jupiter ~ % python -mtimeit -s "data=[range(100)]*100; row = []" "row[:] = data[23]"
1000000 loops, best of 3: 1.15 usec per loop
godoy@jupiter ~ % python -mtimeit -s "data=[range(100)]*100" "row = data[23][:]"
100000 loops, best of 3: 1.42 usec per loop
godoy@jupiter ~ % python -mtimeit -s "data=[range(100)]*100" "row = list(data[23])"
100000 loops, best of 3: 1.93 usec per loop

If they are all equivalent from a functional point of view, I lean
towards the second version. I agree with Rune that the third one is
nicer to read, but somehow the [:] syntax makes it a bit more obvious
what is going on.

I prefer the third option for readability. It makes it clear that I'll get a
*new* list with the 23rd row of data. Just think: how would you get the 1st
column of the 23rd row?

>>> a = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
>>> a
[[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
>>> a[1]
[2, 3]
>>> a[1][1]
3
>>> a[1][:]
[2, 3]

Someone might think that the "[:]" means "all columns", which would make
the syntax look equivalent to plain "data[23]".
--
Jorge Godoy <go***@ieee.org>

"Quidquid latine dictum sit, altum sonatur."
- Qualquer coisa dita em latim soa profundo.
- Anything said in Latin sounds smart.
Apr 4 '06 #5
Frank Millman <fr***@chagford.com> wrote:
...
If they are all equivalent from a functional point of view, I lean
towards the second version. I agree with Rune that the third one is
nicer to read, but somehow the [:] syntax makes it a bit more obvious
what is going on.

I vastly prefer to call list(xxx) in order to obtain a new list with the
same items as xxx -- couldn't be more obvious than that.

You can't claim it's obvious that xxx[:] *copies* data -- because in
Numeric, for example, it doesn't, it returns an array that *shares* data
with xxx. So, the [:] notation sometimes copies and sometimes does not,
while list(...) always copies -- if I want to ensure that a copy does
happen, then list(...) is the more obvious and readable choice.
Alex
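
To illustrate the point about [:] not always copying without needing Numeric installed, here is a sketch that swaps in the built-in memoryview type as a stand-in. That substitution is mine, not Alex's example; a memoryview is simply another object whose full slice shares data with the original instead of copying it.

```python
buf = bytearray(b"abc")
view = memoryview(buf)

shared = view[:]                  # slicing a memoryview yields another view
shared[0] = ord("x")              # write through the slice
assert buf == bytearray(b"xbc")   # the underlying buffer changed: no copy was made

copied = list(view)               # list() builds a new list of the items
copied[1] = 0                     # modify the copy only
assert buf == bytearray(b"xbc")   # the buffer is unaffected
```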
Apr 4 '06 #6
al*****@yahoo.com (Alex Martelli) writes:
I vastly prefer to call list(xxx) in order to obtain a new list with the
same items as xxx -- couldn't be more obvious than that.

You can't claim it's obvious that xxx[:] *copies* data

Heh, it wasn't obvious that list(xxx) copies data either (I thought of
it as being like a typecast), but I just checked, and it does copy.
I'll have to remember to do it like that. I do like it better than
xxx[:] which is what I'd been using because I remember seeing that the
copy module does it that way.
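
A quick way to check this for yourself, as a sketch with a made-up nested list. It also shows that list(xxx), xxx[:] and copy.copy(xxx) are all shallow: the outer list is new, but the items inside are the same objects. For a row of numbers or strings that makes no difference; for nested mutable items, copy.deepcopy is the stdlib tool.

```python
import copy

xxx = [[1, 2], [3, 4]]      # hypothetical data with mutable items

a = list(xxx)
b = xxx[:]
c = copy.copy(xxx)

for new in (a, b, c):
    assert new == xxx and new is not xxx   # a new outer list each time
    assert new[0] is xxx[0]                # but the inner lists are shared

d = copy.deepcopy(xxx)
assert d == xxx and d[0] is not xxx[0]     # deepcopy duplicates the items too
```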
Apr 4 '06 #7

Alex Martelli wrote:
Frank Millman <fr***@chagford.com> wrote:
...
If they are all equivalent from a functional point of view, I lean
towards the second version. I agree with Rune that the third one is
nicer to read, but somehow the [:] syntax makes it a bit more obvious
what is going on.

I vastly prefer to call list(xxx) in order to obtain a new list with the
same items as xxx -- couldn't be more obvious than that.

You can't claim it's obvious that xxx[:] *copies* data -- because in
Numeric, for example, it doesn't, it returns an array that *shares* data
with xxx. So, the [:] notation sometimes copies and sometimes does not,
while list(...) always copies -- if I want to ensure that a copy does
happen, then list(...) is the more obvious and readable choice.
Alex

Thanks very much for the detailed explanation.

Frank

Apr 5 '06 #8
