Python is slow?

I have recently been playing with a kd-tree for solving the "post
office problem" in a 12-dimensional space. This is pure cpu bound
number crunching, a task for which I suspected Python to be
inefficient.

My prototype in Python 2.5 using NumPy required 0.41 seconds to
construct the tree from 50,000 samples. Unfortunately, searching it
felt a bit slow, finding the 11 nearest-neighbours of 1,000 points
took 29.6 seconds (and there were still 49,000 to go). Naturally, I
blamed this on Python. It would be 100 times faster if I used C++,
right?
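
For reference, the search task described above (the k nearest neighbours of each query point, in 12 dimensions) can be written as a brute-force NumPy baseline. This is only a sketch: the random data and the names `knn_brute_force`, `samples` and `queries` are assumptions for illustration, not the kd-tree code being timed here. It is mainly useful for checking a faster implementation against.

```python
import numpy as np

def knn_brute_force(samples, queries, k):
    """Return (indices, distances) of the k nearest samples for each query.

    samples: (n, d) array, queries: (m, d) array.
    Exhaustive search, no tree involved.
    """
    # Squared Euclidean distances via |q - s|^2 = |q|^2 - 2 q.s + |s|^2
    d2 = (
        (queries ** 2).sum(axis=1)[:, None]
        - 2.0 * queries @ samples.T
        + (samples ** 2).sum(axis=1)[None, :]
    )
    np.maximum(d2, 0.0, out=d2)  # clip tiny negatives caused by rounding
    # argpartition picks the k smallest per row without a full sort
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    rows = np.arange(queries.shape[0])[:, None]
    order = np.argsort(d2[rows, idx], axis=1)  # sort only those k candidates
    idx = idx[rows, order]
    return idx, np.sqrt(d2[rows, idx])

rng = np.random.default_rng(0)              # synthetic data (assumption)
samples = rng.standard_normal((5000, 12))   # smaller than the 50,000 above
queries = rng.standard_normal((100, 12))
idx, dist = knn_brute_force(samples, queries, k=11)
```

Each row of `idx` holds the indices of the 11 nearest samples, nearest first.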

After getting a working Python prototype, I resorted to rewriting the
program in C++. The Python prototype took an hour to write, debug and
verify. The same thing in C++ took me almost a day to complete, even
with a working prototype as a model. To my surprise, the resulting beast
of C++ required 64.3 seconds to construct the same kd-tree. Searching
the tree was not faster either: 1,000 points required 38.8 seconds. I
wasted a day, only to find my Python prototype to be the faster one.

We may conclude that I'm bad at programming C++, but I suspect that is
not the case here. Although micro-benchmarks may indicate that Python is
100-200 times slower than C++, they may not be applicable to the real
world. Python can be very efficient. And when combined with libraries
like NumPy, beating its performance with hand-crafted C++ is
difficult. At least, my 10 years of experience programming scientific
software in various languages was not sufficient to beat my own Python
prototype with C++.

That is not to say I have never seen C++ run a lot faster than Python.
But it tends to be very short pieces of CPU bound code, no more than a
function or two. But as the problem grows in complexity, C++
accumulates too much of its own bloat.
Sep 23 '08 #1
On Tue, 23 Sep 2008 06:23:12 -0700 (PDT), sturlamolden
<st**********@yahoo.no> wrote:
> I have recently been playing with a kd-tree for solving the "post
> office problem" in a 12-dimensional space. This is pure cpu bound
> number crunching, a task for which I suspected Python to be
> inefficient.

Well, Python is not a number crunching language, however much we would
like it to be (we would? :-). No scripting language is.
Development time is shorter, I agree, but when you have, for example, a
problem which takes 8,31 minutes to run in optimized Fortran
code (like the one I had the other day), then that hardly matters.

> My prototype in Python 2.5 using NumPy required 0.41 seconds to
> construct the tree from 50,000 samples. Unfortunately, searching it
> felt a bit slow, finding the 11 nearest-neighbours of 1,000 points
> took 29.6 seconds (and there were still 49,000 to go). Naturally, I
> blamed this on Python. It would be 100 times faster if I used C++,
> right?

Not necessarily.
Before resorting to rewriting the program, try Psyco. It speeds
things up sometimes.
Also (I'm not that familiar with Python yet, so I don't know how to
do it in Python), try finding the bottlenecks of your calculation. Is
most of the processing time spent in the loops, or in disk
access, or ...?

> After having a working Python prototype, I resorted to rewrite the
> program in C++. The Python prototype took an hour to make, debug and
> verify. The same thing in C++ took me almost a day to complete, even
> with a working prototype as model. To my surprise, the resulting beast
> of C++ required 64.3 seconds to construct the same kd-tree. Searching
> the tree was not faster either, 1,000 points required 38.8 seconds. I
> wasted a day, only to find my Python prototype being the faster.
>
> We may conclude that I'm bad at programming C++, but I suspect that is
> not the case here. Albeit micro-benchmarks may indicate that Python is
> 100-200 times slower than C++, they may not be applicable to the real
> world. Python can be very efficient. And when combined with libraries
> like NumPy, beating it's performance with hand-crafted C++ is
> difficult. At least, my 10 years experience programming scientific
> software in various languages was not sufficient to beat my own Python
> prototype with C++.
>
> That is not to say I have never seen C++ run a lot faster than Python.
> But it tends to be very short pieces of CPU bound code, no more than a
> function or two. But as the problem grows in complexity, C++
> accumulates too much of its own bloat.

Well, personally, I try to combine Fortran (being a Fortran programmer
by trade) with Python (in the last few years), as I find Fortran to
be, by two grades, more comfortable for solving scientific problems
than C (or Python for that matter, although it has its merits),
starting from its capabilities for "normal" array handling, to
optimisation and easy readability, to whatnot.
Best regards
Bob
Sep 23 '08 #2
On 2008-09-23, sturlamolden <st**********@yahoo.no> wrote:

> [...]
> After having a working Python prototype, I resorted to rewrite the
> program in C++. The Python prototype took an hour to make, debug and
> verify. The same thing in C++ took me almost a day to complete, even
> with a working prototype as model. To my surprise, the resulting beast
> of C++ required 64.3 seconds to construct the same kd-tree. Searching
> the tree was not faster either, 1,000 points required 38.8 seconds. I
> wasted a day, only to find my Python prototype being the faster.
>
> We may conclude that I'm bad at programming C++,

AFAICT, _everybody_ is bad at programming C++.

One begins to suspect it's not the fault of the programmers.

--
Grant Edwards (grante at visi.com)
Yow! Finally, Zippy drives his 1958 RAMBLER METROPOLITAN into the
faculty dining room.
Sep 23 '08 #3
On Sep 23, 9:57 am, Grant Edwards <gra...@visi.com> wrote:
> On 2008-09-23, sturlamolden <sturlamol...@yahoo.no> wrote:
>
>> [...]
>> After having a working Python prototype, I resorted to rewrite the
>> program in C++. The Python prototype took an hour to make, debug and
>> verify. The same thing in C++ took me almost a day to complete, even
>> with a working prototype as model. To my surprise, the resulting beast
>> of C++ required 64.3 seconds to construct the same kd-tree. Searching
>> the tree was not faster either, 1,000 points required 38.8 seconds. I
>> wasted a day, only to find my Python prototype being the faster.
>> We may conclude that I'm bad at programming C++,
>
> AFAICT, _everybody_ is bad at programming C++.

+1 QOTW
Sep 23 '08 #4
>> We may conclude that I'm bad at programming C++,

Grant> AFAICT, _everybody_ is bad at programming C++.

Grant> One begins to suspect it's not the fault of the programmers.

+1 QOTW...

Skip
Sep 23 '08 #5
sturlamolden:

CPython is generally slow (you can see this from the huge number of
solutions invented to solve the speed problem, like Cython, NumPy,
Psyco, ShedSkin, Weave, Inline, SIP, Boost Python, SWIG, etc.), but
for most of the purposes Python is used for, it's not a significant
problem. I know that sounds like a tautology :-)

Well-written C++ code is generally faster or much faster than similar
Python code, but programming in Python is often simpler, and it
generally requires less time. So it may happen that a Python program
that takes 1 hour to write and 1 hour to run lets you find the
solution in less time than a C++ program that runs in 5-10 minutes
but takes you 3-4 hours to write :-)
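
The trade-off above is simply total time to solution = writing time + running time. A tiny worked version, using only the illustrative figures from the paragraph above (the function name is made up for the example):

```python
# Time to solution in minutes: time to write the program plus time to run it.
def time_to_solution(write_min, run_min):
    return write_min + run_min

python_total = time_to_solution(write_min=60, run_min=60)
cpp_best = time_to_solution(write_min=3 * 60, run_min=5)
cpp_worst = time_to_solution(write_min=4 * 60, run_min=10)

print(python_total, cpp_best, cpp_worst)  # 120 185 250
```

With these numbers the Python route wins for a single run. The balance shifts if the program has to be run many times, since the writing cost is paid only once.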

Note that C++ is just one option, but there are other languages
around, like CLisp or D (or even a modern JavaVM), that are often an
acceptable compromise between Python and C/C++.

So you could show us a reduced/minimal working version of your Python
code, so that I or someone else may find ways to speed it up, translate
it to C or C++ or CLisp or D, etc.

Note that I have written a kd-tree in both Python (with Psyco
compulsively) and D, this is the Psyco version:
http://code.activestate.com/recipes/572156/

Bye,
bearophile
Sep 23 '08 #6
On Sep 23, 3:44 pm, Robert Singer <rsinger@____.com> wrote:
> Well, python is not a number crunching language. However much we would
> like it to be (we would ? :-).
> No scripting language is.

Not even Matlab, R, IDL, Octave, SciLab, S-PLUS or Mathematica?

> Before resorting to rewriting the problem try psyco. It speeds up
> things sometimes.

I did, Psyco did not help.

> Also, (I'm not that familiar with python yet, so I don't know how to
> do it in python), try finding the bottlenecks of your calculation.

I did use a profiler, there is no particular single bottle-neck.

> Well, personally, I try to combine fortran (being a fortran programmer
> by trade) with python

Good compilers are too expensive, and gfortran is not good enough yet.

Sep 23 '08 #7
On Sep 23, 10:57 am, Grant Edwards <gra...@visi.com> wrote:
> AFAICT, _everybody_ is bad at programming C++.

Thankfully, at least Numpy developers are not bad at C programming.
Sep 23 '08 #8
sturlamolden:
>F# and OCaml look promising though.<
I bet on the future of D and Haskell (and maybe Fortress) instead :-)
We'll see.

>Sure I could show you the code, Python and C++, if I had a place to post it.<
I think the Python version suffices. If it's not too private, you
may post a single minimal/reduced runnable Python module here; it
will be deleted after some time (if you want, you can also use a private
paste):
http://codepad.org/

>It looks very different form yours though.<
Is this a good or bad thing? ;-)

Bye,
bearophile
Sep 23 '08 #9
On Sep 23, 8:52 pm, bearophileH...@lycos.com wrote:
> I think the Python version suffices. If it's not too much private you
> may post the single minimal/reduced runnable Python module here, it
> will be deleted in some time (if you want you can also use a private
> paste): http://codepad.org/

http://codepad.org/rh8GzzJT

Available 24 hours from now.

> Is this a good or bad thing? ;-)

It's just fascinating how different people working on similar problems
come up with different looking code.



Sep 23 '08 #10
be************@lycos.com wrote:
> sturlamolden:
>> Sure I could show you the code, Python and C++, if I had a place to post it.
>
> I think the Python version suffices. If it's not too much private you
> may post the single minimal/reduced runnable Python module here, it
> will be deleted in some time (if you want you can also use a private
> paste):
> http://codepad.org/

You could also drop it on the scipy.org wiki in the Cookbook category.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Sep 23 '08 #11
On Sep 23, 8:31 am, bearophileH...@lycos.com wrote:

Guys, this looks like a great data structure/algo for something I am
working on.

But... where do I find some definitions of the original BK-tree idea?
I looked through Amazon
and only a few books mention something like BK-Tree and these are
mostly conference minutes books, at ungodly prices.

I also did a quick Google on it and there isn't that much about the
subject.

http://blog.notdot.net/archives/30-D...-BK-Trees.html

is the one I mostly saw referred.

So... 2 questions:

1. More bk-tree references? I can follow the code, but some
understanding of the background would be nice.

2. What, if any, is a good book for understanding the basics of
fuzzy/string matching? Proximity/affinity problems? Or, more generally, a
good book on advanced algorithms?

No, I don't wanna read Knuth's just yet, something more modern/easy to
follow maybe? Something like 'Programming Collective Intelligence',
ISBN 0596529325, would be very nice, though it is perhaps a bit too
specific in its applications. Books using Java or C are fine. Lisp,
hmmm, well... I have trouble reading its notation, sorry.

Cheers

JLuc
Sep 23 '08 #12
On Sep 23, 9:17 pm, Robert Kern <robert.k...@gmail.com> wrote:
> You could also drop it on the scipy.org wiki in the Cookbook category.

Yes, if I could figure out how to use it...

Sep 23 '08 #13
J Peyret wrote:
> On Sep 23, 8:31 am, bearophileH...@lycos.com wrote:
>
> Guys, this looks like a great data structure/algo for something I am
> working on.
>
> But... where do I find some definitions of the original BK-tree idea?

Uh, actually we're talking about kd-trees, not BK-trees. kd-trees are for
searching through point sets in a k-dimensional space.
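
To make the distinction concrete, here is a minimal kd-tree sketch: the build step splits the points at the median along one axis per level, and the query descends to the near side first and only crosses a splitting plane when it is closer than the current k-th best distance. The names `build` and `query`, the nested-tuple layout and `leaf_size` are assumptions made for this example; it is not the Cookbook code discussed in this thread.

```python
import heapq
import numpy as np

def build(points, indices=None, depth=0, leaf_size=8):
    """Recursively build a kd-tree as nested tuples."""
    if indices is None:
        indices = np.arange(len(points))
    if len(indices) <= leaf_size:
        return ("leaf", indices)
    axis = depth % points.shape[1]            # cycle through the k axes
    order = indices[np.argsort(points[indices, axis])]
    mid = len(order) // 2
    split = points[order[mid], axis]          # median coordinate on this axis
    left = build(points, order[:mid], depth + 1, leaf_size)
    right = build(points, order[mid:], depth + 1, leaf_size)
    return ("node", axis, split, left, right)

def query(tree, points, q, k):
    """Return the indices of the k points nearest to q, nearest first."""
    heap = []  # max-heap on squared distance, stored as (-d2, index)

    def visit(node):
        if node[0] == "leaf":
            for i in node[1]:
                d2 = float(((points[i] - q) ** 2).sum())
                if len(heap) < k:
                    heapq.heappush(heap, (-d2, int(i)))
                elif d2 < -heap[0][0]:
                    heapq.heapreplace(heap, (-d2, int(i)))
            return
        _, axis, split, left, right = node
        near, far = (left, right) if q[axis] < split else (right, left)
        visit(near)
        # Cross the splitting plane only if it is closer than the k-th best
        gap2 = (q[axis] - split) ** 2
        if len(heap) < k or gap2 < -heap[0][0]:
            visit(far)

    visit(tree)
    return [i for _, i in sorted(heap, reverse=True)]

rng = np.random.default_rng(1)                # synthetic data (assumption)
pts = rng.standard_normal((2000, 12))
tree = build(pts)
nearest = query(tree, pts, pts[0], k=11)      # pts[0] is its own nearest point
```

A BK-tree, by contrast, indexes items by a discrete metric such as edit distance, which is why it comes up for fuzzy string matching rather than for points in k-dimensional space.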

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Sep 23 '08 #14
On Tue, 23 Sep 2008 11:07:22 -0700 (PDT), sturlamolden
<st**********@yahoo.no> wrote:
> On Sep 23, 3:44 pm, Robert Singer <rsinger@____.com> wrote:
>> Well, python is not a number crunching language. However much we would
>> like it to be (we would ? :-).
>> No scripting language is.
>
> Not even Matlab, R, IDL, Octave, SciLab, S-PLUS or Mathematica?

No. And just to avoid any pointless discussions which might arise,
I meant to say that *in general* compiled languages are faster. We can
always have discussions about whether or not some newer scripting
languages, like some from the above list, come close, but that usually
is just wasted time.

Specifically, I cannot say anything about R, IDL or S-PLUS, since I have
never used those (I had not even heard of IDL till now). Octave and
Mathematica have been with me for such a short time (we had a few
licences for Wolfram's child for one year, but not in my part of the
company, so ...) that I would rather not give my opinion about those.
I've used Matlab and Scilab for a longer time (still do actually -
Matlab for measurement data acquisition, and Scilab ... well, it just
sits on the disk somewhere actually), and although Matlab is quite
fast when disk I/O is involved, it still falls far short.

>> Also, (I'm not that familiar with python yet, so I don't know how to
>> do it in python), try finding the bottlenecks of your calculation.
>
> I did use a profiler, there is no particular single bottle-neck.

Are you talking about your C or your Python version of the program?

There is always a bottleneck - that's just the part which works most
slowly. Try to find the part which takes the longest to execute, try
to put it differently. If it cannot be done, go to the next slowest
part.
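
On the Python side, that workflow can be sketched with the standard library's `cProfile` and `pstats` modules. The functions `slow_part`, `fast_part` and `work` below are made-up stand-ins for a real calculation; only the profiling calls are the point.

```python
import cProfile
import io
import pstats

def slow_part(n):
    # Deliberately heavy: a pure-Python loop
    total = 0
    for i in range(n):
        total += i * i
    return total

def fast_part(n):
    # Closed-form sum, essentially free
    return n * (n - 1) // 2

def work():
    return slow_part(200_000) + fast_part(200_000)

profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Sort by cumulative time so the most expensive call paths come first
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

In the report, `slow_part` should account for nearly all of the time spent in `work`, which is the kind of hint to look for before rewriting anything.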

> Good compilers are too expensive, and gfortran is not good enough yet.

?
Gfortran is one of the better compilers on the market. There was, just
the other day, a nice discussion on comp.lang.fortran about how
marvellous it is that a group of enthusiasts managed to do in their
time what commercial giants still haven't.
May I ask what your main objections to it are?

Best regards
Bob
Sep 23 '08 #15
On 23 Sep., 21:23, J Peyret <jpey...@gmail.com> wrote:
> On Sep 23, 8:31 am, bearophileH...@lycos.com wrote:
>
> Guys, this looks like a great data structure/algo for something I am
> working on.
>
> But... where do I find some definitions of the original BK-tree idea?

*geometric data structures*. Just google for it.

Sep 23 '08 #16
sturlamolden wrote:
> On Sep 23, 9:17 pm, Robert Kern <robert.k...@gmail.com> wrote:
>> You could also drop it on the scipy.org wiki in the Cookbook category.
>
> Yes, if I could figure out how to use it...

What's confusing? You do have to create a profile:

http://www.scipy.org/UserPreferences

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Sep 23 '08 #17
On Sep 23, 10:05 pm, Robert Singer <rsinger@____.com> wrote:
> May I ask what are your main objections to it ?

1. gfortran is not Absoft.

2. If I program the same in C99 and Fortran 95, and compile with gcc
and gfortran, the C99 code runs a lot faster (I've only tested with
wavelet transforms).

3. gfortran is not Absoft.


Sep 23 '08 #18
On Sep 23, 10:16 pm, Robert Kern <robert.k...@gmail.com> wrote:
> What's confusing? You do have to create a profile:

How do I add a new page to the wiki? I'm only able to edit the front
page of the cookbook. But it doesn't help to add a link there if I have
no page to link to. (I may be incredibly stupid though.)
Sep 23 '08 #19
sturlamolden wrote:
> On Sep 23, 10:16 pm, Robert Kern <robert.k...@gmail.com> wrote:
>> What's confusing? You do have to create a profile:
>
> How do I add a new page to the wiki? I'm only able to edit the front
> page of the cookbook. But it doesn't help to add link there if I have
> no page to link. (I may be incredibly stupid though.)

You just navigate to the URL you want:

http://www.scipy.org/Cookbook/KDTree

This will show you a page saying:

"""
This page does not exist yet. You can create a new empty page, or use one of the
page templates. Before creating the page, please check if a similar page already
exists.

Create new empty page
"""

The latter is a link that you can click.

Looking at the other Cookbook page sources, I notice that you should add the
following to the bottom of your page to categorize it appropriately.

"""
----
. CategoryCookbook
"""

Then put the link on the main Cookbook page (or do that first, then navigate
through the link to create the page).

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Sep 23 '08 #20
On Tue, 23 Sep 2008 13:34:10 -0700 (PDT), sturlamolden
<st**********@yahoo.no> wrote:
> 1. gfortran is not Absoft.

I find this comment absurd. What did you mean by it ?
Yes, gfortran is not Absoft, just as red is not blue (?!).

I also don't understand whether you're looking for a free or a
commercial compiler. I got the impression from your previous post that
money was an object.

> 2. If I program the same in C99 and Fortran 95, and compile with gcc
> and gfortran, the C99 code runs a lot faster (I've only tested with
> wavelet transforms).

Hmm. Unfortunately, I have no experience whatsoever in that field.
Completely different area of study here, so without seeing at least
some code, I cannot comment on that part.

And 'a lot faster' is a very relative term.

> 3. gfortran is not Absoft.

True.

Best regards
Bob
Sep 23 '08 #21
On Sep 23, 8:52 pm, bearophileH...@lycos.com wrote:
> Is this a good or bad thing? ;-)

It seems we have been implementing different algorithms. kd-trees are
not BK-trees.

http://www.scipy.org/Cookbook/KDTree
Sep 23 '08 #22
Robert Kern wrote:
> J Peyret wrote:
>> On Sep 23, 8:31 am, bearophileH...@lycos.com wrote:
>>
>> Guys, this looks like a great data structure/algo for something I am
>> working on.
>>
>> But... where do I find some definitions of the original BK-tree idea?
>
> Uh, actually we're talking about kd-trees, not BK-trees. kd-trees are
> for searching through point sets in a k-dimensional space.

My apologies. I did not actually follow bearophile's link, and thought he was
talking about kd-trees like Sturla was.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Sep 24 '08 #23
sturlamolden:
> It seems we have been implementing different algorithms. kd-trees are
> not BK-trees.
> http://www.scipy.org/Cookbook/KDTree

Sorry for my silly mistake :-)

Note: in your code I don't know if the collections.deque data
structure may help (it's faster than list for appending), but I
presume that's not a bottleneck.
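
For what it's worth, `list.append` is already amortized constant time, so the gain from `collections.deque` shows up mainly when items are removed from the front (`popleft` versus `list.pop(0)`). A small sketch of a breadth-first tree walk, where the toy nested-tuple tree and the function name are assumptions for the example and not taken from the posted code:

```python
from collections import deque

# Hypothetical toy tree: (value, left, right), with None for a missing child
tree = (1, (2, (4, None, None), (5, None, None)), (3, None, (6, None, None)))

def breadth_first(root):
    """Visit nodes level by level using a deque as a FIFO queue."""
    out = []
    queue = deque([root])
    while queue:
        value, left, right = queue.popleft()  # O(1); list.pop(0) would be O(n)
        out.append(value)
        for child in (left, right):
            if child is not None:
                queue.append(child)
    return out

print(breadth_first(tree))  # [1, 2, 3, 4, 5, 6]
```

If the traversal only ever pushes and pops at the same end, as a depth-first stack does, a plain list is usually just as good.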

Bye,
bearophile
Sep 24 '08 #24
On Tue, 23 Sep 2008 15:23:12 +0200, sturlamolden <st**********@yahoo.no>
wrote:
[...]

Would it be possible to post this text to some "persistent" web page with
(links to) the code you wrote in both languages? This would be a very
interesting resource for people experiencing some resistance when they
suggest Python as a possible language because of the 'Python is slow'
myth...
--
python -c "print ''.join([chr(154 - ord(c)) for c in
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])"
Sep 24 '08 #25

For those who are interested:

I've updated the cookbook tutorial on the kd-tree:

http://scipy.org/Cookbook/KDTree

It now also includes parallel search for multicore CPUs
(multiprocessing standard module). Even if you are not genuinely
interested in kd-trees, it shows how to do parallel processing in
Python despite the GIL.
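
The general idea can be sketched as follows: split the query points into chunks and hand each chunk to a separate worker process, each with its own interpreter and its own GIL. This is not the Cookbook code; `knn_chunk` and `parallel_knn` are hypothetical names, and a brute-force search stands in for the kd-tree query to keep the example short.

```python
import multiprocessing as mp
import numpy as np

def knn_chunk(args):
    """Worker: brute-force k nearest neighbours for one chunk of queries."""
    samples, queries, k = args
    # (m, n) matrix of squared distances for this chunk
    d2 = ((queries[:, None, :] - samples[None, :, :]) ** 2).sum(axis=2)
    return np.argsort(d2, axis=1)[:, :k]

def parallel_knn(samples, queries, k, processes=2):
    """Split the queries into chunks and search them in separate processes."""
    chunks = np.array_split(queries, processes)
    jobs = [(samples, chunk, k) for chunk in chunks]
    with mp.Pool(processes) as pool:
        parts = pool.map(knn_chunk, jobs)   # results come back in chunk order
    return np.vstack(parts)
```

Call it as `parallel_knn(samples, queries, k=11)` from under an `if __name__ == "__main__":` guard, which is needed on platforms where child processes are spawned rather than forked. Note that in this sketch the `samples` array is pickled and sent to every worker, so the approach pays off only when the search itself dominates that copying cost.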


Sep 24 '08 #26

I have updated the cookbook entry from yesterday to also include
parallel processing for large data sets. Even if you're not interested
in kd-trees, it is a good example of what the new multiprocessing
standard module can do. There are still people being scared by the
GIL, thinking it prevents Python from utilizing multicore processors.

http://www.scipy.org/Cookbook/KDTree
Sep 24 '08 #27
On Wed, Sep 24, 2008 at 3:07 AM, sturlamolden <st**********@yahoo.no> wrote:
> On Sep 23, 3:44 pm, Robert Singer <rsinger@____.com> wrote:
>> Well, python is not a number crunching language. However much we would
>> like it to be (we would ? :-).
>> No scripting language is.
>
> Not even Matlab, R, IDL, Octave, SciLab, S-PLUS or Mathematica?

I am fairly experienced in Matlab (I have been using it extensively for
5 years in an academic context), and now with NumPy, and generally
they are comparable speed-wise. Matlab has some niceties which make
it faster in some simple cases (JIT for loops, faster function calls,
and copy-on-write semantics that sometimes help), but NumPy (at its core
at least) is much more powerful IMHO. Also, Matlab is horrible when you
want to interface some C to it (the C API is basically broken; in
particular, there is no way to guarantee you won't leak memory when you
Ctrl+C custom C extensions, because the C API has no facility to
deal with signals). I totally gave up Matlab for NumPy 2 years ago,
and never regretted it.

I think speed is not the issue when comparing Matlab, R and co.
Availability of functionality matters much more. R is quite hard to
beat if you need to do advanced statistics, especially since it is the
tool of choice for most academic statisticians. I hope NumPy/SciPy
will get there sometime, but it is honestly still quite far off in that
domain.

cheers,

David
Sep 26 '08 #28
In article <Y7******************************@posted.visi>,
Grant Edwards <gr****@visi.com> wrote:
>
> AFAICT, _everybody_ is bad at programming C++.
>
> One begins to suspect it's not the fault of the programmers.

http://www.netfunny.com/rhf/jokes/98...troustrup.html
--
Aahz (aa**@pythoncraft.com) <* http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours." --Richard Bach
Sep 26 '08 #29
> Matlab's strongest side is data visualization though. Although we have
> matplotlib, mayavi and possibility of interfacing with gnuplot, it's
> not anywhere near the capabilities of Matlab.

What particular Matlab visualization features are you referring to? I
can't think of anything that would justify using the "not anywhere
near" term.

Cheers,
Victor.
Sep 26 '08 #30
In message
<02**********************************@x35g2000hsb.googlegroups.com>,
sturlamolden wrote:
> ... and possibility of interfacing with gnuplot ...

Gnuplot is non-Free software.

Sep 29 '08 #31
On Sep 29, 3:05 am, Lawrence D'Oliveiro
<l...@geek-central.gen.new_zealand> wrote:
> In message
> <02918eb6-c2fb-4908-923f-d878a1956...@x35g2000hsb.googlegroups.com>,
> sturlamolden wrote:
>> ... and possibility of interfacing with gnuplot ...
>
> Gnuplot is non-Free software.

Yes, it is.

Victor.
Sep 29 '08 #32
Lawrence D'Oliveiro:
>Gnuplot is non-Free software.
Fly Away:
>Yes, it is.

From:
http://www.gnuplot.info/faq/faq.txt

1.7 Does gnuplot have anything to do with the FSF and the GNU project?
[...]
Gnuplot is freeware in the sense that you don't have to pay for it.
However it is not freeware in the sense that you would be allowed to
distribute a modified version of your gnuplot freely. [...]

Bye,
bearophile
Sep 29 '08 #33
be************@lycos.com wrote:
> Lawrence D'Oliveiro:
>> Gnuplot is non-Free software.
>
> Fly Away:
>> Yes, it is.
>
> From:
> http://www.gnuplot.info/faq/faq.txt
>
> 1.7 Does gnuplot have anything to do with the FSF and the GNU project?
> [...]
> Gnuplot is freeware in the sense that you don't have to pay for it.
> However it is not freeware in the sense that you would be allowed to
> distribute a modified version of your gnuplot freely. [...]

Yes, I did read this prior to posting.

Victor.
Sep 29 '08 #34
r0g
be************@lycos.com wrote:
> Lawrence D'Oliveiro:
>> Gnuplot is non-Free software.
>
> Fly Away:
>> Yes, it is.
>
> From:
> http://www.gnuplot.info/faq/faq.txt
>
> 1.7 Does gnuplot have anything to do with the FSF and the GNU project?
> [...]
> Gnuplot is freeware in the sense that you don't have to pay for it.
> However it is not freeware in the sense that you would be allowed to
> distribute a modified version of your gnuplot freely. [...]
>
> Bye,
> bearophile

Well, ish. You can only distribute modifications to gnuplot itself as
patches, but you can distribute it freely and they publish the source
so, while it's not GPL free it's tending towards it.

Roger.
Sep 29 '08 #35
In message <gb**********@aioe.org>, r0g wrote:
> You can only distribute modifications to gnuplot itself as
> patches, but you can distribute it freely ...

This must be some new definition of "freely" of which I'm unaware.
Sep 30 '08 #36
r0g
Lawrence D'Oliveiro wrote:
> In message <gb**********@aioe.org>, r0g wrote:
>> You can only distribute modifications to gnuplot itself as
>> patches, but you can distribute it freely ...
>
> This must be some new definition of "freely" of which I'm unaware.

As in beer.
Sep 30 '08 #37
In message <gb**********@aioe.org>, r0g wrote:
> Lawrence D'Oliveiro wrote:
>> In message <gb**********@aioe.org>, r0g wrote:
>>> You can only distribute modifications to gnuplot itself as
>>> patches, but you can distribute it freely ...
>>
>> This must be some new definition of "freely" of which I'm unaware.
>
> As in beer.

You get free beer?
Sep 30 '08 #38
On Tue, 30 Sep 2008 14:50:26 +1300, Lawrence D'Oliveiro wrote:
> In message <gb**********@aioe.org>, r0g wrote:
>> You can only distribute modifications to gnuplot itself as patches, but
>> you can distribute it freely ...
>
> This must be some new definition of "freely" of which I'm unaware.

You're free to distribute the official release of gnuplot.

You're free to distribute patches to gnuplot.

You're even free to provide people with a script or program to apply
those patches to gnuplot.
Where's the non-free bit?
Personally, I don't get the whole "only distribute patches" requirement.
It's a bit like saying "You're free to distribute this software, but only
as a tarball". It seems silly to me. But I don't see it as non-free,
except in the sense that "only licences approved by the FSF are free".

--
Steven
Sep 30 '08 #39
Steven D'Aprano <st****@REMOVE.THIS.cybersource.com.au> writes:
> On Tue, 30 Sep 2008 14:50:26 +1300, Lawrence D'Oliveiro wrote:
>> In message <gb**********@aioe.org>, r0g wrote:
>>> You can only distribute modifications to gnuplot itself as
>>> patches, but you can distribute it freely ...
> […]
> Where's the non-free bit?

You're not free to modify gnuplot and redistribute the result.

That you're free to distribute patches is nice, but it's not enough to
make the work free. The freedom to help people by giving them an
*already-modified* gnuplot is restricted by the copyright holder.

It's an artificial restriction on redistribution of derived works,
making them second-class for the purpose of getting them into people's
hands.

> Personally, I don't get the whole "only distribute patches"
> requirement. It's a bit like saying "You're free to distribute this
> software, but only as a tarball". It seems silly to me.

That, too, would be a non-free requirement.

> But I don't see it as non-free, except in the sense that "only
> licences approved by the FSF are free".

I try to judge freedom of a software work by the freedoms granted to
all recipients of the work, not by the approval of some organisation.

--
\ “When I turned two I was really anxious, because I'd doubled my |
`\ age in a year. I thought, if this keeps up, by the time I'm six |
_o__) I'll be ninety.” —Steven Wright |
Ben Finney
Sep 30 '08 #40
On Tue, 30 Sep 2008 19:04:41 +1000, Ben Finney wrote:
> Steven D'Aprano <st****@REMOVE.THIS.cybersource.com.au> writes:
>> On Tue, 30 Sep 2008 14:50:26 +1300, Lawrence D'Oliveiro wrote:
>>> In message <gb**********@aioe.org>, r0g wrote:
>>>> You can only distribute modifications to gnuplot itself as patches,
>>>> but you can distribute it freely ...
>> […]
>> Where's the non-free bit?
>
> You're not free to modify gnuplot and redistribute the result.
>
> That you're free to distribute patches is nice, but it's not enough to
> make the work free. The freedom to help people by giving them an
> *already-modified* gnuplot is restricted by the copyright holder.
>
> It's an artificial restriction on redistribution of derived works,
> making them second-class for the prupose of getting them into people's
> hands.

Yes it is. It seems a strange, unnecessary restriction. But is it
sufficient to make it non-free? I don't think so.

In case you are thinking that gnuplot allows people to *only* distribute
the diffs, not the original source to apply the diffs onto, that is not
the case. I quote from gnuplot help copyright

"Permission to distribute the released version of the source code along
with corresponding source modifications in the form of a patch file is
granted with same provisions 2 through 4 for binary distributions."

Those provisions aren't terribly onerous, although #3 may be considered a
privacy issue:

2. add special version identification to distinguish your version
in addition to the base release version number,
3. provide your name and address as the primary contact for the
support of your modified version, and
4. retain our contact information in regard to use of the base
software.

>> Personally, I don't get the whole "only distribute patches"
>> requirement. It's a bit like saying "You're free to distribute this
>> software, but only as a tarball". It seems silly to me.
>
> That, too, would be a non-free requirement.
>
>> But I don't see it as non-free, except in the sense that "only licences
>> approved by the FSF are free".
>
> I try to judge freedom of a software work by the freedoms granted to all
> recipients of the work, not by the approval of some organisation.

Yes, but you accept some restrictions as legitimate. For example, you
accept the restriction that the GPL makes that says you may not
redistribute a modified work without making the source code available.
That's a restriction, but it's not enough to disqualify it from being a
free software licence. In fact, that restriction is *necessary* to make
it a free software licence in the sense we're talking about. So "free"
does not mean "no restrictions", it merely means "none of some sorts of
restrictions, but other restrictions are okay". Likewise the restriction
that GPL software must be distributed with a copy of the appropriate
licence.

It is useful to compare the "diffs only" licence to two different GPL-
related scenarios. Scenario one is clearly against the spirit of the GPL,
and possibly (hopefully!) the letter as well. Scenario two is not.

(1) I distribute the modified source code encrypted and charge $1,000,000
for a NON-TRANSFERABLE licence to the encryption key. If you don't have
the encryption key, that's your bad luck.

(2) I distribute the modified source code archived in a tar file, and
refuse to offer it in any other format. If you don't have an untar
application, that's your bad luck.

It's my contention that the restriction of supplying diffs is closer to
Scenario 2 than to Scenario 1. The modified source is supplied, but it is
split into two pieces: the official source, plus a set of diffs.
Reversing that to get the modified source is not much more difficult than
untarring a tarball.


--
Steven
Sep 30 '08 #41
Steven D'Aprano <st****@REMOVE.THIS.cybersource.com.auwrites:
On Tue, 30 Sep 2008 19:04:41 +1000, Ben Finney wrote:
You're not free to modify gnuplot and redistribute the result.

That you're free to distribute patches is nice, but it's not
enough to make the work free. The freedom to help people by giving
them an *already-modified* gnuplot is restricted by the copyright
holder.

It's an artificial restriction on redistribution of derived works,
making them second-class for the purpose of getting them into
people's hands.

Yes it is. It seems a strange, unnecessary restriction. But is it
sufficient to make it non-free? I don't think so.
I do, because a natural, beneficial act (modify the work and
redistribute it) that has no technical reason to restrict, is
artificially restricted.
In case you are thinking that gnuplot allows people to *only*
distribute the diffs, not the original source to apply the diffs
onto, that is not the case. I quote from gnuplot's "help copyright":

"Permission to distribute the released version of the source code
along with corresponding source modifications in the form of a patch
file is granted with same provisions 2 through 4 for binary
distributions."
That's what I refer to when I say that it artificially makes derived
works second-class for the purpose of doing the beneficial act of
distributing them: the redistributor is artificially restricted from
making the work as useful as the original they received.

They have only the options to redistribute a work that is more
cumbersome for the recipient of that work, or not to redistribute at
all. That's not free redistribution.
I try to judge freedom of a software work by the freedoms granted
to all recipients of the work, not by the approval of some
organisation.

Yes, but you accept some restrictions as legitimate. For example, you
accept the restriction that the GPL makes that says you may not
redistribute a modified work without making the source code available.
Yes, which is why I was careful to say "the freedoms granted to all
recipients of the work".

The power to restrict a recipient of one's work (by choosing not to
grant them the freedoms you yourself had when you received the work)
reduces the freedoms available to all recipients of the work, even
though one party's power may be increased.

This is where the useful "your freedom to swing your fist ends at the
tip of the other man's nose" applies: As soon as the act you wish to
perform is restricting the freedom of another, you're not
contemplating an act of freedom, but an act of power over another.
Freedoms should be protected, but only within the limits imposed by
the freedoms of others.
That's a restriction, but it's not enough to disqualify it from
being a free software licence.
Specifically because it upholds the freedom of the recipient of a
derived work from having power exerted over them.
In fact, that restriction is *necessary* to make it a free software
licence in the sense we're talking about.
Not really; it's necessary to make it a copyleft license, which is a
way of preserving freedom as the work gets passed along.

Works can still be free software without being copyleft-licensed,
though. A license allowing free redistribution and requiring only
attribution be preserved is less restrictive than a copyleft; yet,
because it allows any free act (even as it also allows acts of power
over others), the work is free software.
So "free" does not mean "no restrictions", it merely means "none of
some sorts of restrictions, but other restrictions are okay".
Likewise the restriction that GPL software must be distributed with
a copy of the appropriate licence.
That's right, and I've explained above what restrictions I consider
justified, and why, and how to tell the difference.

--
\ “Reichel's Law: A body on vacation tends to remain on vacation |
`\ unless acted upon by an outside force.” —Carol Reichel |
_o__) |
Ben Finney
Sep 30 '08 #42
On 30 Sep, 14:19, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:
This is where the useful "your freedom to swing your fist ends at the
tip of the other man's nose" applies: As soon as the act you wish to
perform is restricting the freedom of another, you're not
contemplating an act of freedom, but an act of power over another.
Freedoms should be protected, but only within the limits imposed by
the freedoms of others.
This is a very good explanation of what copyleft is all about. I
suppose one could regard copyleft as a means to preserve the "maximal
common freedom" in a system - if anyone else were to acquire more
power or privilege to do something, that would diminish the freedoms
of others.

Paul
Sep 30 '08 #43
On Tue, 30 Sep 2008 22:19:57 +1000, Ben Finney wrote:
Steven D'Aprano <st****@REMOVE.THIS.cybersource.com.au> writes:
On Tue, 30 Sep 2008 19:04:41 +1000, Ben Finney wrote:
You're not free to modify gnuplot and redistribute the result.

That you're free to distribute patches is nice, but it's not enough
to make the work free. The freedom to help people by giving them an
*already-modified* gnuplot is restricted by the copyright holder.

It's an artificial restriction on redistribution of derived works,
making them second-class for the purpose of getting them into
people's hands.

Yes it is. It seems a strange, unnecessary restriction. But is it
sufficient to make it non-free? I don't think so.

I do, because a natural, beneficial act (modify the work and
redistribute it) that has no technical reason to restrict, is
artificially restricted.
We agree that the restriction is artificial, and I think irrational
(although I'd be interested in hearing the gnuplot developers' reasoning
before making a final judgment).

But I just don't see the requirement that modified software be
distributed in form X (original source + diffs) versus form Y (modified
source in a tar ball) or form Z (an rpm) to be that big a deal. Not
enough to make it "non-free software".

I simply don't think that having to run some variation on

patch -i patchfile.patch

is a requirement so onerous that it makes the gnuplot licence non-free.
Perhaps I'm just more tolerant of eccentricities than you :)
--
Steven
Sep 30 '08 #44
On Sep 30, 9:43 am, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au>
wrote:
On Tue, 30 Sep 2008 22:19:57 +1000, Ben Finney wrote:
Steven D'Aprano <ste...@REMOVE.THIS.cybersource.com.au> writes:
On Tue, 30 Sep 2008 19:04:41 +1000, Ben Finney wrote:
You're not free to modify gnuplot and redistribute the result.
That you're free to distribute patches is nice, but it's not enough
to make the work free. The freedom to help people by giving them an
*already-modified* gnuplot is restricted by the copyright holder.
It's an artificial restriction on redistribution of derived works,
making them second-class for the purpose of getting them into
people's hands.
Yes it is. It seems a strange, unnecessary restriction. But is it
sufficient to make it non-free? I don't think so.
I do, because a natural, beneficial act (modify the work and
redistribute it) that has no technical reason to restrict, is
artificially restricted.

We agree that the restriction is artificial, and I think irrational
(although I'd be interested in hearing the gnuplot developers' reasoning
before making a final judgment).

But I just don't see the requirement that modified software be
distributed in form X (original source + diffs) versus form Y (modified
source in a tar ball) or form Z (an rpm) to be that big a deal. Not
enough to make it "non-free software".

I simply don't think that having to run some variation on

patch -i patchfile.patch

is a requirement so onerous that it makes the gnuplot licence non-free.
Perhaps I'm just more tolerant of eccentricities than you :)
What you're missing is that for Free Software (TM) zealots it's a
matter of philosophical principle, totally unrelated to how easy it is to
overcome the restriction. There is not a "practicality beats purity"
clause in the FSF Bible.

George
Sep 30 '08 #45
On Tuesday 30 September 2008 16:04:35 George Sakkis wrote:
What you're missing is that for Free Software (TM) zealots it's a
matter of philosophical principle, totally unrelated to how easy it is to
overcome the restriction. There is not a "practicality beats purity"
clause in the FSF Bible.
The gnuplot license is a free software license according to the FSF, so
what is the problem here, after all?
George
--
José Abílio
Sep 30 '08 #46
Steven D'Aprano wrote:
On Tue, 30 Sep 2008 22:19:57 +1000, Ben Finney wrote:
I do, because a natural, beneficial act (modify the work and
redistribute it) that has no technical reason to restrict, is
artificially restricted.

We agree that the restriction is artificial, and I think irrational
(although I'd be interested in hearing the gnuplot developers' reasoning
before making a final judgment).
I believe it is a matter of preserving clarity of authorship, just as is
the quoting mechanism we take for granted in posts like this. If I
removed the quote marks above and silently edited what Ben and you
wrote, I might upset someone and certainly could confuse readers.

tjr
Sep 30 '08 #47
Steven D'Aprano <st***@REMOVE-THIS-cybersource.com.au> writes:
I simply don't think that having to run some variation on

patch -i patchfile.patch

is a requirement so onerous that it makes the gnuplot licence
non-free. Perhaps I'm just more tolerant of eccentricities than you
:)
The distinction here is that this command must be run by *every*
recipient of a modified work. A work where one must do that is more
onerous for *each* recipient than one where it's already been patched
for the recipient.

Thus there is value, and no loss of freedom, in you as a redistributor
doing that work *once* and then redistributing the work intact to any
recipient. Your freedom to do this useful, harmless action is
restricted artificially by copyright, and is not granted by the
license.

So, recipients of the 'gnuplot' code are artificially restricted from
performing an action useful to society that does no harm.

--
\ “The Bermuda Triangle got tired of warm weather. It moved to |
`\ Alaska. Now Santa Claus is missing.” —Steven Wright |
_o__) |
Ben Finney
Sep 30 '08 #48
Terry Reedy <tj*****@udel.edu> writes:
Steven D'Aprano wrote:
We agree that the restriction is artificial, and I think
irrational (although I'd be interested in hearing the gnuplot
developers' reasoning before making a final judgment).

I believe it is a matter of preserving clarity of authorship, just
as is the quoting mechanism we take for granted in posts like this.
If I removed the quote marks above and silently edited what Ben and
you wrote, I might upset someone and certainly could confuse
readers.
That, if it were to be prosecuted under law, would be a matter already
covered by laws other than copyright: fraud, libel, etc.

Note that I consider a work free even if it fails to grant “the right
to distribute misrepresentations of the author's words”, because that
act is an exercise of undue power over another person, and so falls
outside the limit imposed by the freedoms of others.

--
\ “What is it that makes a complete stranger dive into an icy |
`\ river to save a solid gold baby? Maybe we'll never know.” —Jack |
_o__) Handey |
Ben Finney
Sep 30 '08 #49
On Wed, 01 Oct 2008 09:06:08 +1000, Ben Finney wrote:
Terry Reedy <tj*****@udel.edu> writes:
Steven D'Aprano wrote:
We agree that the restriction is artificial, and I think irrational
(although I'd be interested in hearing the gnuplot developers'
reasoning before making a final judgment).

I believe it is a matter of preserving clarity of authorship, just as
is the quoting mechanism we take for granted in posts like this. If I
removed the quote marks above and silently edited what Ben and you
wrote, I might upset someone and certainly could confuse readers.

That, if it were to be prosecuted under law, would be a matter already
covered by laws other than copyright: fraud, libel, etc.

Note that I consider a work free even if it fails to grant “the right to
distribute misrepresentations of the author's words”, because that act
is an exercise of undue power over another person, and so falls outside
the limit imposed by the freedoms of others.

But distributing modified source code *does* misrepresent the author's
words, because you confuse authorship. Given only the modified version of
the source code, how is the recipient supposed to identify which parts of
the source code were written by the original authors and which parts
were written by you?

If that is why the gnuplot people do not allow you to distribute such
modified documents, then the only "freedom" they fail to grant is exactly
the one you don't consider necessary for a free licence: "the right to
distribute misrepresentations of the author's words".
--
Steven
Sep 30 '08 #50
