473,382 Members | 1,348 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,382 software developers and data experts.

Re: Light slices + COW

That xpairs() generator is nice, but it's not the best possible code
(but you may disagree with me, and you may think that code better than
the successive D code that uses two slices). Inside it I can't use a
list slicing because it copies subparts of the list, probably becoming
too much slow (for example if len(seq) == 1000).
What do you mean by best possible? Most efficient? Most readable? And
why don't you use islice?

eg:

def xpairs(seq):
len_seq = len(seq)
for i, e1 in enumerate(seq):
for e2 in islice(seq, i+1, None):
yield e1, e2

Here's a version which makes more use of itertools. It should be more
efficient, but it's ugly :-) (this is my first time using itertools).

def xpairs(seq):
def _subfunc():
for i in xrange(len(seq)):
e1 = seq[i]
yield izip(repeat(e1), islice(seq, i+1, None))
return chain(*_subfunc())
That D code is as fast or faster than the code you can back-translate
from Python, this is possible because in D arrays slices are very
light, they are a struct of <length, pointer>
D compiles to efficient machine code so Python is at a disadvantage
even if you use the same syntax (see my first example). You can make
the Python version faster, but beware of premature optimization.
I think Python 3.0 too may enjoy a similar strategy of light slices +
COW for strings, lists and arrays (tuples and strings don't need the
COW).
What I'dlike to see is a rope[1] module for Python. I'ts in C++'s STL
library[2], but I haven't found a Python version yet.

[1] http://en.wikipedia.org/wiki/Rope_(computer_science)
[2] http://www.sgi.com/tech/stl/Rope.html

With a Python rope library you could do things like this:

a = '<some extremely long string>'
b = rope(a) # Contains a reference to a
c = b[0:100000] # Get a rope object
d = b[100000:200000] # Get another rope object
e = b + b # Get another rope object
print e # Get the string representation of all the re-assembled sub-sections

# And so on. In the above code there was only 1 copy of the huge
string in memory. The rope objects only contain a tree of
sub-operations (slices, concatenations, references to original
sequences, etc).

This shouldn't be too hard to implement. Does anyone know of an
already-existing 'rope' module?

David.
Jun 27 '08 #1
3 1153
David:
What do you mean by best possible? Most efficient? Most readable?
What's a good wine? It's not easy to define what's "good/best". In
such context it's a complex balance of correct, short, fast and
readable (and more, because you need to define a context. This context
refers to Psyco too).

And why don't you use islice?
You are right, but the purpose of light slices that I have suggested
is to avoid using islice so often. And currently Psyco doesn't digest
itertools well.

D compiles to efficient machine code so Python is at a disadvantage
even if you use the same syntax (see my first example). You can make
the Python version faster, but beware of premature optimization.
This time I don't agree with this "premature optimization" thing. My
original Python version is just 5 lines long, it's readable enough,
and it's a part of a large library of mine of similar functions, they
must be fast because I use them all the time as building blocks in
programs.

What I'dlike to see is a rope[1] module for Python.
People have already suggested it, and there's even an implementation
to replace Python strings. It was refused because... I don't know why,
maybe its implementation was too much more complex than the current
one, and it wasn't faster in all situations (and I think Guido wants
data structures that try to replace the basic built-in ones to be
faster in all situations and user-transparent too).

Bye,
bearophile
Jun 27 '08 #2
D compiles to efficient machine code so Python is at a disadvantage
even if you use the same syntax (see my first example). You can make
the Python version faster, but beware of premature optimization.

This time I don't agree with this "premature optimization" thing. My
original Python version is just 5 lines long, it's readable enough,
and it's a part of a large library of mine of similar functions, they
must be fast because I use them all the time as building blocks in
programs.
Have you looked into the 'numpy' libraries? Those have highly
optimized array/numeric processing functions. Also the have a concept
of getting 'views' when you slice, rather than a full copy.

I also suggest benchmarking your apps which use your libs and see
where the bottlenecks are. Then you can move those to external
compiled modules (c/pyrex/cython/shedskin/etc). You can keep your
slower Python version around as a reference implementation.
What I'dlike to see is a rope[1] module for Python.

People have already suggested it, and there's even an implementation
to replace Python strings. It was refused because... I don't know why,
maybe its implementation was too much more complex than the current
one, and it wasn't faster in all situations (and I think Guido wants
data structures that try to replace the basic built-in ones to be
faster in all situations and user-transparent too).
I'd be happy if there was a separate 'rope' class that you could wrap
arbitrary long sequences in when you need to (the same way that STL
has it separate to the string class, even though the string class has
a lot of other optimizations (internal ref counts, etc, to avoid
unnecessary copies)). Do you know one?

David.
Jun 27 '08 #3
On May 4, 7:49*am, David <wizza...@gmail.comwrote:
*D compiles to efficient machine code so Python is at a disadvantage
*even if you use the same syntax (see my first example). You can make
*the Python version faster, but beware of premature optimization.
*This time I don't agree with this "premature optimization" thing. My
*original Python version is just 5 lines long, it's readable enough,
*and it's a part of a large library of mine of similar functions, they
*must be fast because I use them all the time as building blocks in
*programs.

Have you looked into the 'numpy' libraries? Those have highly
optimized array/numeric processing functions. Also the have a concept
of getting 'views' when you slice, rather than a full copy.

I also suggest benchmarking your apps which use your libs and see
where the bottlenecks are. Then you can move those to external
compiled modules (c/pyrex/cython/shedskin/etc). You can keep your
slower Python version around as a reference implementation.
*What I'dlike to see is a rope[1] module for Python.
*People have already suggested it, and there's even an implementation
*to replace Python strings. It was refused because... I don't know why,
*maybe its implementation was too much more complex than the current
*one, and it wasn't faster in all situations (and I think Guido wants
*data structures that try to replace the basic built-in ones to be
*faster in all situations and user-transparent too).

I'd be happy if there was a separate 'rope' class that you could wrap
arbitrary long sequences in when you need to (the same way that STL
has it separate to the string class, even though the string class has
a lot of other optimizations (internal ref counts, etc, to avoid
unnecessary copies)). Do you know one?

David.
Persistence on the rope class might be trivial; that is, a constant
number of disk modifications be made per state modification.
Jun 27 '08 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: thecrow | last post by:
I am trying to figure out how to use hash slices in PHP. (Assign certain values of an associative array into variables, not dependent on loops or internal order of the associative array). In...
17
by: Noam Raphael | last post by:
Hello, Many times I find myself asking for a slice of a specific length, and writing something like l. This happens both in interactive use and when writing Python programs, where I have to write...
19
by: David Abrahams | last post by:
Can anyone explain the logic behind the behavior of list slicing with negative strides? For example: >>> print range(10) I found this result very surprising, and would just like to see the...
29
by: George Sakkis | last post by:
Why does slicing a tuple returns a new tuple instead of a view of the existing one, given that tuples are immutable ? I ended up writing a custom ImmutableSequence class that does this, but I...
1
by: Antoon Pardon | last post by:
I'm for the moment writing two classes. A table, which is like a list, but can start at any integer. A tree which is like a dictionary, but will iterate over the keys in sorted order. The...
1
by: meridian | last post by:
If, like me, you're always forgetting which way around your list/seq slices need to go then worry no more. Just put my handy "slice lookupper" (TM) ) on a (large!) PostIt beside your screen and,...
3
slightlybefuddled
by: slightlybefuddled | last post by:
(Exporting ImageReady slices as CSS rather than tables) apparently means it'll work just fine in Firefox, but do wacky stuff in IE? Can anyone help me figure out why on earth the slices are not...
31
by: Antoon Pardon | last post by:
The following is part of the explanation on slices in the tutorial: The best way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the...
5
by: NuberSteve | last post by:
I'm very new to using CSS and also the concept of slices for mouse-overs, and have made my first attempt at using ImageReady to generate slices of a world map. I basically wanted a map that would...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.