Python's biggest compromises

I have been reading a book about the evolution of the Basic
programming language. The author states that Basic - particularly
Microsoft's version is full of compromises which crept in along the
language's 30+ year evolution.

What do you think python's largest compromises are?

The three that come to my mind are significant whitespace, dynamic
typing, and that it is interpreted - not compiled. These three put
python under fire and cause some large projects to move off python or
relegate it to prototyping.

Whitespace is an esthetic preference that makes Andrew Hunt and David
Thomas (of Pragmatic Programmer fame) prefer Ruby. Personally, I love
it - but I can see why some people might not like it (30 years of
braces).

Dynamic typing causes the most fuss. I like Guido's answer to the
question -
"Doesn't dynamic typing cause more errors to creep into the code because you catch them later than compile time?". "No, we use Unit Testing in Zope".


That said, obviously Basic compromised by using things such as "Option
Explicit", thereby allowing both dynamic and more static style
variables. Yahoo groups moved from python to C due to dynamic typing.
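
To make the unit-testing answer concrete, here is a minimal sketch of an error that dynamic typing defers to run time, together with tests that catch it. The function `total_price` and the test class are hypothetical names invented for this illustration; nothing here is taken from Zope.

```python
import unittest


def total_price(quantity, unit_price):
    """Hypothetical helper: multiply a quantity by a unit price."""
    return quantity * unit_price


class TotalPriceTest(unittest.TestCase):
    def test_numbers(self):
        self.assertEqual(total_price(3, 2.5), 7.5)

    def test_string_quantity_goes_unnoticed(self):
        # A statically typed compiler would reject this call.
        # Python repeats the string instead, so only a test notices.
        self.assertEqual(total_price("3", 2), "33")

    def test_string_times_float_raises(self):
        # Here the mistake does surface, but only when the line runs.
        with self.assertRaises(TypeError):
            total_price("3", 2.5)


def run_tests():
    """Run the three tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TotalPriceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The second test is the interesting one: the call is almost certainly a bug, yet it raises nothing, which is the kind of case critics of dynamic typing have in mind.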

Non-compiled - obviously there are times when performance matters more
than other things. Google, I believe, uses (or used) python to prototype
and then turns to C++ for heavy lifting.

What about immutable strings? I'm not sure I understand Guido's
preference for them.
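
For readers unsure what immutability means in practice, the following is a small sketch. It shows one commonly cited consequence (a string's hash never changes, so it is safe as a dictionary key); it does not claim to be the reason Guido gave. The function name is made up for the example.

```python
def demo_immutable_strings():
    s = "python"
    try:
        s[0] = "P"          # item assignment is not supported on str
        mutated = True
    except TypeError:
        mutated = False

    # "Changing" a string really builds a new object and binds a name to it.
    t = "P" + s[1:]

    # Because a str can never change, its hash is stable, so it can be
    # used as a dictionary key without the key silently going stale.
    table = {s: 1, t: 2}
    return mutated, s, t, table[s], table[t]
```

The original string is untouched after the "modification"; only the new name `t` refers to the capitalised copy.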

Anthony
http://xminc.com/anthony
Jul 18 '05
Andrew Dalke fed this fish to the penguins on Saturday 02 August 2003
11:09 pm:
There's also the TrueBASIC folks. I believe they started in the
mid-80s and argue their BASIC is essentially the same.
Unless I'm mistaken, the folks behind TrueBASIC /were/ Kemeny and
Kurtz (spellings)... IE, the creators of the original BASIC. <G>

Interesting test. Nasty idea: get that same person to judge if
Lisp and Scheme are closely related then post the results on c.l.lisp.
Heh... Toss in MPI (the only expansion I've seen for that is "my
personal insanity" -- it's similar except for using {, :, and ,! Been
way too long since I've done any MUCKing but... it was easier to learn
than MUF, which I never could get into even though I used to program HP
calculators).

-- ================================================== ============ <
wl*****@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG <
wu******@dm.net | Bestiaria Support Staff <
================================================== ============ <
Bestiaria Home Page: http://www.beastie.dm.net/ <
Home Page: http://www.dm.net/~wulfraed/ <


Jul 18 '05 #51
Dennis Lee Bieber:
Unless I'm mistaken, the folks behind TrueBASIC /were/ Kemeny and
Kurtz (spellings)... IE, the creators of the original BASIC. <G>


Yup. From truebasic.com
] John G. Kemeny and Thomas E. Kurtz invented BASIC in 1964
] for use at Dartmouth College. They made it freely available to
] everyone who wanted to learn how to program computers. It
] soon became a world standard.
]
] In 1983 they created True BASIC to incorporate and showcase
] all the advanced developments they had added to their language,
] and offered it as a commercial product.

But that doesn't implement the original BASIC language. OTOH,
it does say it can convert older BASIC to TrueBASIC, so there
is still backwards compatibility. That's gotta warm someone's heart
knowing code written back in the 1960s on a teletype machine
will still run today.. and even on a handheld.

Andrew
da***@dalkescientific.com
Jul 18 '05 #52
Andy C wrote:
OK, then unless I'm missing something, tabs vs. spaces shouldn't matter
for
you. The editor should be able to handle tabs in a satisfactory manner as
well.


There seems to be an unspoken assumption here that an editor is the only
program that will need to deal with Python code I receive. It isn't --
mail and news readers, for example, are often involved -- and those often
can't deal satisfactorily with tabs. Thus, it should be the job of the
editor of the person who PRODUCES code (if said code is to be sent outside
at all) to store spaces, not tabs.
Alex

Jul 18 '05 #53
ha******@yahoo.com.au (Hannu Kankaanpää) writes:
Michael Hudson <mw*@python.net> wrote in message news:<7h*************@pc150.maths.bris.ac.uk>...
<button nature="hot">
Reference counting *is* a form of garbage collection.
</button>
You apparently have such a loose definition for garbage
collection, that even C programs have "a form of garbage
collection" on modern OSes: All garbage is reclaimed by
the OS when the program exits. It's just a very lazy collector.

I don't consider something a garbage collector unless it
collects all garbage (ref.counting doesn't) and is a bit more
agile than the one provided by OS.


Well, OK, but people do tend to allow a bit of 'conservativeness' in
their collectors, don't they? Boehm's GC-for-C is usually considered
a 'real GC' and that doesn't necessarily collect everything (AIUI, I'm
not an expert in this field).

CPython's 'ref counting + gimmicks' certainly *would* seem to qualify
as a GC, by your definitions.
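
The "ref counting + gimmicks" combination can be observed directly. The sketch below assumes CPython (other implementations may free objects at different times) and uses a hypothetical `Node` class: an unreferenced object disappears at once, while a reference cycle survives until the cycle collector in the `gc` module runs.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.other = None


def demo():
    gc.disable()  # keep the cycle collector from running on its own
    try:
        # Case 1: no cycle. The refcount hits zero at `del`.
        a = Node("a")
        ref_a = weakref.ref(a)
        del a
        freed_by_refcount = ref_a() is None

        # Case 2: two objects that reference each other.
        x, y = Node("x"), Node("y")
        x.other, y.other = y, x
        ref_x = weakref.ref(x)
        del x, y
        alive_after_del = ref_x() is not None  # refcounts never reach zero

        gc.collect()  # explicit pass of the cycle collector
        freed_by_cycle_gc = ref_x() is None
    finally:
        gc.enable()
    return freed_by_refcount, alive_after_del, freed_by_cycle_gc
```

On CPython all three values come back true, which is the point being argued: reference counting alone leaves cycles behind, and the extra collector picks them up.
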
Saying "Ref. counting sucks, let's use GC instead" is a statement near
as dammit to meaningless.


You, I and everyone knows what I was talking about, so it could
hardly be regarded as "meaningless".


Well, OK, but: even if we don't allow refcounting as 'GC' I would
(really!) like to know which form of garbage collection you would use
instead.
Given the desires above, I really cannot think of a clearly better GC
strategy for Python than the one currently employed. AFAICS, the
current scheme's biggest drawback is its memory overhead, followed by
the cache-trashing tendencies of decrefs.


It's not "the one currently employed". It's the *two* currently
employed and that causes grief as I described in my previous post.


You mean two, as in the ones used by Jython and CPython? If have to
admit, I
And AFAIK, Ruby does use GC (mark-and-sweep, if you wish) and
seems to be working.
With finalizers? Just curious.
However, this is rather iffy knowledge. I'm actually longing for
real GC because I've seen it work well in Java and C#, and I know
that it's being used successfully in many other languages.
What would you use instead?


A trick question?


Not at all!

I think it would be practically very difficult to move to a radically
different memory management methodology without breaking the vast
majority of C extensions out there, but I would like to see some
more concrete speculation about alternatives.

Cheers,
mwh

--
Considering that this thread is completely on-topic in the way only
c.l.py threads can be, I think I can say that you should replace
"Oblivion" with "Gravity", and increase your Radiohead quotient.
-- Ben Wolfson, comp.lang.python
Jul 18 '05 #54
Lulu of the Lotus-Eaters <me***@gnosis.cx> writes:
"Daniel Dittmar" <da************@sap.com> wrote previously:
|But a lot of Python code depends on reference counting or more exactly it
|depends on the timely call of the destructor. So even if a much better GC is
|added to Python, reference counting would perhaps be kept for backwards
|compatibility (see Python's biggest compromises)
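
A sketch of the dependency described above, assuming CPython for the first function. `write_implicit` counts on the file object being destroyed, and so closed, as soon as its reference count drops to zero; `write_explicit` does not care how or when garbage is collected. Both function names are invented for the illustration.

```python
import os
import tempfile


def write_implicit(path, text):
    # Relies on reference counting: the temporary file object is
    # destroyed (flushed and closed) when this statement finishes.
    # Implementations without refcounting make no such promise.
    open(path, "w").write(text)


def write_explicit(path, text):
    # Closes the file at the end of the block on any implementation.
    with open(path, "w") as f:
        f.write(text)


def demo():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        write_implicit(a, "hello")
        write_explicit(b, "hello")
        with open(a) as fa, open(b) as fb:
            return fa.read(), fb.read()
```

Code written in the first style is what would break, or at least become unreliable, if timely destruction went away.
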

Did this thread get caught in a time warp, with posts from two years ago
getting posted again? Exactly this all happened years ago.


Welcome to USENET!

Cheers,
mwh

--
Not only does the English Language borrow words from other
languages, it sometimes chases them down dark alleys, hits
them over the head, and goes through their pockets. -- Eddy Peters
Jul 18 '05 #55
Mel Wilson:
(Idea for obfuscated Python: a program that mixes spaces
and tabs in the indenting so as to perform two or more
distinct useful functions depending on the space-to-tab
rate. Bonus points for a program that undoes itself at
different settings.)


That's just not possible because Python only has one setting
for a tab - 8 spaces. At most there would appear to be a
difference if your editor was configured to display tabs as,
say, 4 spaces. But Python's understanding of it would not
change.
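
The fixed 8-column rule can be checked with a short sketch. `str.expandtabs(8)` shows what a tab is padded out to. The second half assumes Python 3, which goes a step further than the behaviour described in this thread and rejects indentation whose meaning would depend on the tab width. The function name is hypothetical.

```python
def demo_tabs():
    # A leading tab is padded out to the next multiple of the tab size.
    expanded = "\tx = 1".expandtabs(8)

    # One body line indented with a tab, the next with eight spaces.
    # They line up at a tab width of 8 but not at any other width.
    source = "if True:\n\tx = 1\n        y = 2\n"
    try:
        compile(source, "<demo>", "exec")
        outcome = "accepted"
    except TabError:
        outcome = "TabError"
    return expanded, outcome
```

Either way the editor's display setting never enters into it: the interpreter applies its own rule, so the trick Mel Wilson proposes has nothing to vary.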

Andrew
da***@dalkescientific.com
Jul 18 '05 #56
Anthony_Barker wrote:
...
What about immutable strings? I'm not sure I understand Guido's
preference for them.

...
Strings are also immutable in Java. Maybe Guido likes Java? ;-)
--dang
Jul 18 '05 #57
In article <bg**********@gateway.northgrum.com>,
Dang Griffith <dm*************@tasc.com> wrote:
Anthony_Barker wrote:

What about immutable strings? I'm not sure I understand Guido's
preference for them.


Strings are also immutable in Java. Maybe Guido likes Java? ;-)


Python predates Java.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

This is Python. We don't care much about theory, except where it intersects
with useful practice. --Aahz
Jul 18 '05 #58
an************@hotmail.com (Anthony_Barker) wrote in message news:<89*************************@posting.google.c om>...
I have been reading a book about the evolution of the Basic
programming language. The author states that Basic - particularly
Microsoft's version is full of compromises which crept in along the
language's 30+ year evolution.

What do you think python's largest compromises are?


Its non-existent SMP scalability.
If you try to sell clients somewhat bigger server apps, they don't
want to hear that having these run on a SMP system might (and will)
actually _hurt_ performance without special administrative
interference (processor binding), which isn't even possible on some
older operating systems.
We seem to be more or less at the end concerning ramping up single
processor speeds, and SMP like hardware becomes more and more
ubiquitous, e.g. Intel's Hyperthreading, IBM's SMP-on-a-chip, Sun's
"Throughput Computing".
As far as I know, there seems to be no interest to get rid of the GIL.
To do this might be a big technical problem, but I fear if it stays
the way it is, the actual trend in the hardware industry might work
against python (in server apps) big time.
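
Anyone who wants to check this claim on their own hardware can start from a sketch like the one below. It times the same pure-Python, CPU-bound loop run serially and then spread across threads. The function names and the workload are invented for the illustration, and no particular result is promised.

```python
import threading
import time


def count_down(n):
    # Pure-Python, CPU-bound loop: the thread running it needs the GIL.
    while n > 0:
        n -= 1


def time_serial(n, workers):
    """Run the loop `workers` times, one after another."""
    start = time.perf_counter()
    for _ in range(workers):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, workers):
    """Run the loop in `workers` threads at the same time."""
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(workers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

Calling `time_serial(5_000_000, 2)` and `time_threaded(5_000_000, 2)` on a multi-CPU box gives two figures to compare. On an interpreter with a GIL the threaded run is usually no faster for this kind of loop, though the numbers vary with machine and interpreter version.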
Jul 18 '05 #59
In article <ad**************************@posting.google.com >,
enoch <en***@gmx.net> wrote:
an************@hotmail.com (Anthony_Barker) wrote in message news:<89*************************@posting.google.c om>...

What do you think python's largest compromises are?


Its non-existent SMP scalability.


Would you care to back up your claim with some actual evidence?

(Yes, there are issues with Python on SMP machines, but to call Python's
built-in threading "non-existent SMP scalability" is either a lie or
revelatory of near-complete ignorance. That doesn't even count the
various IPC mechanisms.)
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

This is Python. We don't care much about theory, except where it intersects
with useful practice. --Aahz
Jul 18 '05 #60
In article <bg*********@panix3.panix.com>, Aahz <aa**@pythoncraft.com>
writes
.....

(Yes, there are issues with Python on SMP machines, but to call Python's
built-in threading "non-existent SMP scalability" is either a lie or
revelatory of near-complete ignorance. That doesn't even count the
various IPC mechanisms.)

I'm not an expert, but the various grid computation schemes seem to
prefer either Java or C/C++. I suspect that those schemes aren't really
using threads in the main; after all, they seem to be running between
machines in different parts of the world even. I suspect Python would be
in better shape if we could migrate threads or tasklets from one
processor to another.

I believe pyro can almost do that, but I haven't tried it.
--
Robin Becker
Jul 18 '05 #61
aa**@pythoncraft.com (Aahz) writes:
(Yes, there are issues with Python on SMP machines, but to call
Python's built-in threading "non-existent SMP scalability" is either
a lie or revelatory of near-complete ignorance. That doesn't even
count the various IPC mechanisms.)


It's an interesting subject though. How does python threading on SMP
machines compare with f.ex. Java and C++. I know that at least the
MSVC compiler has a GIL like problem with heap access (new, malloc,
delete, free), which is guarded with a global lock.

Would migrating the global data for a thread to some sort of thread
local storage help Python SMP performance? If Java has better
threading performance than Python how have they solved the interpreter
state problem. Java is interpreted isn't it?
--

Vennlig hilsen

Syver Enstad
Jul 18 '05 #62
aa**@pythoncraft.com (Aahz) wrote in message news:<bg*********@panix3.panix.com>...
In article <ad**************************@posting.google.com >,
enoch <en***@gmx.net> wrote:
an************@hotmail.com (Anthony_Barker) wrote in message news:<89*************************@posting.google.c om>...

What do you think python's largest compromises are?
Its non-existent SMP scalability.


Would you care to back up your claim with some actual evidence?

(Yes, there are issues with Python on SMP machines, but to call Python's
built-in threading "non-existent SMP scalability" is either a lie or
revelatory of near-complete ignorance.


Ok, I confess, the term you cited might be a little bit exaggerated. But
there's no need to get personal. I'm surely not a liar (w.r.t. this
thread, everything else is not a matter of public concern ;) ). The
ignorance part, well, we can talk about that ...
That doesn't even count the various IPC mechanisms.)
Correct me if I'm wrong, but I don't think any form of IPC is a
measurement of scalability of something like the python interpreter.

Here are some sources which show that I'm not alone with my assessment
that python has deficiencies w.r.t. SMP systems:

http://www.python.org/pycon/papers/deferex/
"""
It is optimal, however, to avoid requiring threads for any part of a
framework. Threading has a significant cost, especially in Python. The
global interpreter lock destroys any performance benefit that
threading may yield on SMP systems, [...]
"""
http://groups.google.com/groups?hl=e...nix2.panix.com
(note the author of that post)
"""My project will be running on an SMP box and requires scalability.
However, my test shows that Python threading has very poor performance
in terms of scaling. In fact it doesn't scale at all.


That's true for pure Python code.
"""

I'm aware that you know quite well about these facts, so I'll leave it
at that. But let me just add one more link which maybe you don't know:

http://www.zope.org/Members/glpb/solaris/multiproc

"""
Well, in worst case, it can actually give you performance UNDER 1X.
The latency switching the GIL between CPUs comes right off your
ability to do work in a quanta. If you have a 1 gigahertz machine
capable of doing 12,000 pystones of work, and it takes 50 milliseconds
to switch the GIL (I don't know how long it takes, this is an example)
you would lose 5% of your peak performance for *EACH* GIL switch.
Setting sys.setcheckinterval(240) will still yield the GIL 50 times a
second. If the GIL actually migrates only 10% of the time it's
released, that would be 50 * .1 * 5% = 25% performance loss. The cost
to switch the GIL is going to vary, but will probably range between .1
and .9 time quantas (scheduler time intervals) and a typical time
quanta is 5 to 10ms.
[...]
However, I have directly observed a 30% penalty under MP constraints
when the sys.setcheckinterval value was too low (and there was too
much GIL thrashing).
"""

So, although python is capable of taking advantage of SMP systems
under certain circumstances (I/O bound systems etc. etc.), there are
real world situations where python's performance is _hurt_ by running
on a SMP system.
Btw. I think even IPC might not help you there, because the different
processes might bounce between CPUs, so only processor binding might
help.
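
The back-of-the-envelope figure in the quoted Zope note can be reproduced in a few lines. The inputs (a 50 ms switch cost, 50 yields per second, 10% of yields migrating to another CPU) are the note's own illustrative numbers, not measurements, and the function name is made up here.

```python
def gil_migration_loss(switch_cost_s, yields_per_s, migrate_fraction):
    """Fraction of each second lost to moving the GIL between CPUs."""
    migrations_per_s = yields_per_s * migrate_fraction
    return migrations_per_s * switch_cost_s


# 50 yields/s * 10% migrating = 5 migrations/s, each costing 0.05 s.
loss = gil_migration_loss(
    switch_cost_s=0.050, yields_per_s=50, migrate_fraction=0.10
)
```

With those inputs `loss` works out to a quarter of each second, matching the 25% in the note. Plugging in the 5 to 10 ms time quanta mentioned afterwards gives a much smaller figure, which is why the note treats 50 ms as an example only.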

I did quite a bit of googling on this problem - several times -
because I'm selling zope solutions. Sometimes, the client wants to run
the solution on an existing SMP system, and worse, the system has to
fulfill some performance requirements. Then I have the problem of
explaining to him that his admins need to undertake some special tasks
in order for zope to be able to exploit the multiple procs in his
system.


Aahz, I've been lurking in this newsgroup for approx. 3 years, so I know who
you are. You have participated in nearly every discussion about threads,
I know your slides, and there's no doubt that you have forgotten more
about this subject than I'll ever know.
Jul 18 '05 #63
In article <uz***********@online.no>,
Syver Enstad <sy*************@online.no> wrote:
aa**@pythoncraft.com (Aahz) writes:

(Yes, there are issues with Python on SMP machines, but to call
Python's built-in threading "non-existent SMP scalability" is either
a lie or revelatory of near-complete ignorance. That doesn't even
count the various IPC mechanisms.)
It's an interesting subject though. How does python threading on SMP
machines compare with f.ex. Java and C++. I know that at least the
MSVC compiler has a GIL like problem with heap access (new, malloc,
delete, free), which is guarded with a global lock.


Sure, but that's not where a C++ application usually spends its time.
Would migrating the global data for a thread to some sort of thread
local storage help Python SMP performance? If Java has better
threading performance than Python how have they solved the interpreter
state problem. Java is interpreted isn't it?


Well, that's a good question. *Does* Java have better threading
performance than Python? If it does, to what extent is that performance
bought at the cost of complexity for the programmer?

Keep in mind that the GIL exists not because of issues with thread-local
storage but because every Python object is global and can have bindings
to it in any -- or every -- thread. Python uses objects *everywhere*;
the GC uses Python objects, stack frames are Python objects, modules are
Python objects. To create "thread-local" storage as you suggest would
require a wholesale revision of Python's object model that would make it
something other than what Python is today.
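
The distinction drawn here can be illustrated with the standard library as it exists in current Python. In the sketch below, `threading.local` gives each thread its own attribute namespace, yet the list `shared` is an ordinary object that every thread holding a reference can reach and mutate. The names are invented for the example.

```python
import threading


def demo():
    shared = []                     # one object, reachable from every thread
    per_thread = threading.local()  # attributes are private to each thread
    seen = {}

    def worker(tag):
        shared.append(tag)          # mutates the single shared list
        per_thread.tag = tag        # visible only inside this thread
        seen[tag] = per_thread.tag

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The main thread never set per_thread.tag, so it has no such attribute.
    main_has_tag = hasattr(per_thread, "tag")
    return sorted(shared), seen, main_has_tag
```

Thread-local names are therefore cheap to provide; what the paragraph above calls a wholesale revision would be making the objects themselves private to a thread.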

Based on recent discussions about restricted execution, I suspect that
security would be much more likely to drive such changes; if that
happens, perhaps revisiting the way GIL works might happen with it.
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

This is Python. We don't care much about theory, except where it intersects
with useful practice. --Aahz
Jul 18 '05 #64
aa**@pythoncraft.com (Aahz) wrote in message news:<bg**********@panix3.panix.com>...
<snip>
Since, as you say, you've done some research, that's why I flamed you.
There's just no call for making such an overstated claim -- it is *NOT*
"a little bit exaggerated".
Well, I based this phrase on the fact that while under some
circumstances (e.g. your web spider) python does scale somewhat, under
others (e.g. zope) it may perform even worse on a SMP system. If you
sum these two facts up ...

<snip IPC>
Here are some sources which show that I'm not alone with my assessment
that python has deficiencies w.r.t. SMP systems:
That I won't argue. But Python's approach also has some benefits even
on SMP systems. And if you choose a multi-process approach, the same
advantages that accrue to Python's approach on a single-CPU box apply
just as much to an SMP system.
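
A minimal sketch of the multi-process approach, using only the standard library. Each job runs in its own interpreter with its own GIL, and the result comes back over a pipe, which stands in for the IPC mechanism. The child program and the function name are hypothetical; a real server would use something more structured.

```python
import subprocess
import sys

# Tiny child program: sum the integers below the number given as argv[1].
CHILD_CODE = "import sys; n = int(sys.argv[1]); print(sum(range(n)))"


def parallel_sums(sizes):
    # Start one interpreter per job; the OS may schedule them on
    # different CPUs.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for n in sizes
    ]
    # Collect each result over its pipe, in the order the jobs were started.
    results = []
    for p in procs:
        out, _ = p.communicate()
        results.append(int(out))
    return results
```

The cost is process start-up and the need to serialise everything that crosses the pipe, so this pays off for coarse-grained work rather than many small calls.
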


Yes, and these advantages also include a simpler threading model, as
far as I understand it, on every system. It's a compromise, that's why
I posted in this thread.
http://www.python.org/pycon/papers/deferex/
"""
It is optimal, however, to avoid requiring threads for any part of a
framework. Threading has a significant cost, especially in Python. The
global interpreter lock destroys any performance benefit that
threading may yield on SMP systems, [...]
"""
Just because it's a published PyCon paper doesn't mean that it's correct.
The multi-threaded spider that I use as my example is a toy version of a
spider that was used on an SMP box. (That's why I became a threading
expert in the first place -- Tim Peters probably remembers me pestering
him with questions four years ago. ;-) I guarantee you that SMP made
that spider much faster.


But how big is the significance of software which has the same
characteristics as your web spider example versus application servers?
So, although python is capable of taking advantage of SMP systems
under certain circumstances (I/O bound systems etc. etc.), there are
real world situations where python's performance is _hurt_ by running
on a SMP system.


Absolutely. But that's true of any system with threading that isn't
designed and tuned for the needs of a specific application. Python
trades performance in some situations for a clean and simple model of
threading.


Again, the compromise we were talking about. I'm not in a position to
weigh the pros and cons of it against each other, but I think I can
point out some cons of the current approach. I'm not doing that to
spread FUD, but to give an outsider's perspective on what I think might
hurt python in the future, and I want python to thrive because I like
using it a lot.
Btw. I think even IPC might not help you there, because the different
processes might bounce between CPUs, so only processor binding might
help.


My understanding is that most OSes are designed to avoid this; I'd be
interested in seeing some information if I'm wrong. In any event, I do
know that IPC speeds things up in real-world applications on SMP boxes.


For example, there are always lots of discussions about CPU affinity
on linux-kernel, and it seems to be a hard problem. Hyperthreading and
other non-symmetric architectures make this problem even harder.
Add to that the problem of the GIL getting shuffled around and you
have a system where you'll have trouble predicting the performance
characteristics. Admins don't like that. Though, it's not like there
are no problems without the GIL, it just adds to the complication.
I did quite a bit of googling on this problem - several times -
because I'm selling zope solutions. Sometimes, the client wants to run
the solution on an existing SMP system, and worse, the system has to
fulfill some performance requirements. Then I have the problem of
explaining to him that his admins need to undertake some special tasks
in order for zope to be able to exploit the multiple procs in his
system.


Even if Zope is the 800-pound gorilla of the Python world, Python isn't
going to change just for Zope. If you want to talk about ways of
improving Zope's performance on SMP boxes, I'll be glad to contribute
what I can. But spreading false information isn't the way to get me
interested.


I wasn't even aware that zope is the "800-pound gorilla" of the python
world. I used it just as an example for a typical larger server app,
because, well, I know it.
Incidentally, the PyCon paper above, which you seem to dismiss as
false, is also from a guy who is working on a larger server app.
Maybe there's a pattern?
Keep in mind that one reason IPC has gained popularity is because it
scales more than threading does, in the end. Blade servers are cheaper
than big SMP boxes, and IPC works across multiple computers.
Allow me some comment of the nature of this discussion (python and SMP
in general, not just this thread). I've seen it before and the
ingredients are:

- a major open source project
- developers who love this project
- some "outsider" who points out some perceived deficiency of said
project
- said developers pointing out (rightly or wrongly) reasons why this
deficiency doesn't matter, or that there are other (better) ways for
the "outsider" to achieve what he wants

In most cases this discussion then develops into a big fat flamewar
;).

Two examples are linux and its threading capabilities, and mysql and
ACID compliance.
A nice quote from the linux discussion btw. was from Alan Cox:

"A Computer is a state machine. Threads are for people who can't
program state machines."

But today, linux' thread support is orders of magnitude better than it was.

You wrote in another message in this thread:
Well, that's a good question. *Does* Java have better threading
performance than Python? If it does, to what extent is that performance
bought at the cost of complexity for the programmer?


While I can't comment on the second question, here's an article which
sheds some light on the SMP scalability of an older java JDK, the meat
is on the third page:
http://www.javaworld.com/javaworld/j...readscale.html

Seems that java does indeed have better threading performance than
python.
Jul 18 '05 #65
In article <3f***********************@news.xs4all.nl>, Irmen de Jong
<irmen@-NOSPAM-REMOVETHIS-xs4all.nl> writes
Robin Becker wrote:
......
I believe pyro can almost do that, but I haven't tried it.


Could you please elaborate on this a bit?
What exactly did you have in mind when talking about
"migrating threads or tasklets" ?


Well I had in mind the grid concept, which I believe implies the
distribution of code to multiple nodes and then the ability to execute
on them (I suppose that includes re-sending data to already distributed
instances).

I imagine that a proper grid would allow reloading of modules as the
overall application requires, but that would be relatively trivial if we
could capture 'execution state'.

Moving a running thread to another process would be fairly hard I
imagine, but I guess that's what we want for load balancing etc.

Does this involve transporting code across nodes,
or only the 'execution' (and data)?

Pyro supports transporting code, but with a few important limitations,
such as "once loaded, not reloaded".

--Irmen de Jong


--
Robin Becker
Jul 18 '05 #66
