
Project organization and import

I'm using Python for what is becoming a sizeable project and I'm
already running into problems organizing code and importing packages.
I feel like the Python package system, in particular the isomorphism
between filesystem and namespace, doesn't seem very well suited for
big projects. However, I might not really understand the Pythonic way.
I'm not sure if I have a specific question here, just a general plea
for advice.

1) Namespace. Python wants my namespace hierarchy to match my
filesystem hierarchy. I find that a well organized filesystem
hierarchy for a nontrivial project will be totally unwieldy as a
namespace. I'm either forced to use long namespace prefixes, or I'm
forced to use "from foo import *" and __all__, which has its own set
of problems.

1a) Module/class collision. I like to use the primary class in a file
as the name of the file. However this can lead to namespace collisions
between the module name and the class name. Also it means that I'm
going to be stuck with the odious and wasteful syntax foo.foo
everywhere, or forced to use "from foo import *".

1b) The Pythonic way seems to be to put more stuff in one file, but I
believe this is categorically the wrong thing to do in large projects.
The moment you have more than one developer along with a revision
control system, you're going to want files to contain the smallest
practical functional blocks. I feel pretty confident saying that "put
more stuff in one file" is the wrong answer, even if it is the
Pythonic answer.

2) Importing and reloading. I want to be able to reload changes
without exiting the interpreter. This pretty much excludes "from foo
import *", unless you resort to this sort of hack:

http://www.python.org/search/hyperma...1993/0448.html

Has anyone found a systematic way to solve the problem of reloading in
an interactive interpreter when using "from foo import *"?
I appreciate any advice I can get from the community.

Martin

Mar 5 '07 #1
49 Replies


"Martin Unsal" <ma*********@gmail.comwrites:
1) Namespace. Python wants my namespace hierarchy to match my filesystem
hierarchy. I find that a well organized filesystem hierarchy for a
nontrivial project will be totally unwieldy as a namespace. I'm either
forced to use long namespace prefixes, or I'm forced to use "from foo import
*" and __all__, which has its own set of problems.
I find it nice. You get an idea of where something is just from the import,
and you don't have to search for it everywhere. Isn't, e.g., Java like that?
(It's been so long since I last worried about Java that I don't remember if
this is mandatory or just a convention...)

You might get bitten by that when moving files from one OS to another,
especially if one of them disregards case and the other is strict about
it.
1a) Module/class collision. I like to use the primary class in a file as the
name of the file. However this can lead to namespace collisions between the
module name and the class name. Also it means that I'm going to be stuck
with the odious and wasteful syntax foo.foo everywhere, or forced to use
"from foo import *".
Your classes should be CamelCased and start with an uppercase letter. So
you'd have foo.Foo, with "foo" being the package and "Foo" the class inside it.
1b) The Pythonic way seems to be to put more stuff in one file, but I
believe this is categorically the wrong thing to do in large projects. The
moment you have more than one developer along with a revision control
system, you're going to want files to contain the smallest practical
functional blocks. I feel pretty confident saying that "put more stuff in
one file" is the wrong answer, even if it is the Pythonic answer.
Why? RCS systems can merge changes. An RCS system is not a substitute for
design or programmer communication. You'll only have a problem if two people
change the same line of code, and if they are doing that (and worse: doing it
often) then you have a bigger problem than just the contents of the file.

Unit tests help ensure that one change doesn't break the project as a
whole, and for a big project you're surely going to have a lot of those tests.

If one change breaks another, then there is a disagreement on the application
design and more communication is needed between developers or a better
documentation of the API they're implementing / using.
2) Importing and reloading. I want to be able to reload changes without
exiting the interpreter. This pretty much excludes "from foo import *",
unless you resort to this sort of hack:

http://www.python.org/search/hyperma...1993/0448.html

Has anyone found a systematic way to solve the problem of reloading in an
interactive interpreter when using "from foo import *"?
I don't reload... When my investigative tests get bigger I write a script
and run it with the interpreter. It is easy since my text editor can call
Python on a buffer (I use Emacs).
I appreciate any advice I can get from the community.
This is just how I deal with it... My biggest "project" has several modules
now, each with its own namespace and package. The API is thoroughly
documented and took the most work to get done.

Using setuptools, entrypoints, etc. helps a lot as well.
The thing is that for big projects your design is the most important part.
Get it right and you won't have problems with namespaces and filenames. If
you don't dedicate enough time to this task you'll find yourself in trouble
really soon.

--
Jorge Godoy <jg****@gmail.com>
Mar 5 '07 #2

On Mar 5, 01:21, "Martin Unsal" <martinun...@gmail.com> wrote:
I'm using Python for what is becoming a sizeable project and I'm
already running into problems organizing code and importing packages.
I feel like the Python package system, in particular the isomorphism
between filesystem and namespace,
It's not necessarily a 1:1 mapping. Remember that you can put code in
the __init__.py of a package, and that this code can import sub-
packages/modules namespaces, making the package internal organisation
transparent to user code (I've quite often started with a simple
module, later turning it into a package as the source code grew
too big).
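
A minimal sketch of that idea, with hypothetical file and class names
(implicit relative imports, in the Python 2 style used elsewhere in this
thread):

# widgets/__init__.py -- re-export the public names so user code never
# sees the internal file layout (all names here are illustrative)
from scrollbar import ScrollBar
from form import Form

__all__ = ['ScrollBar', 'Form']

Client code can then write "import widgets" and call widgets.ScrollBar(),
no matter how the package is carved up into files internally.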
doesn't seem very well suited for
big projects. However, I might not really understand the Pythonic way.
cf above.
I'm not sure if I have a specific question here, just a general plea
for advice.

1) Namespace. Python wants my namespace hierarchy to match my
filesystem hierarchy. I find that a well organized filesystem
hierarchy for a nontrivial project will be totally unwieldy as a
namespace. I'm either forced to use long namespace prefixes, or I'm
forced to use "from foo import *" and __all__, which has its own set
of problems.
cf above. Also remember that you can "import as", ie:

import some_package.some_subpackage.some_module as some_module
1a) Module/class collision. I like to use the primary class in a file
as the name of the file.
Bad form IMHO. Packages and module names should be all_lower,
classnames CamelCased.
1b) The Pythonic way seems to be to put more stuff in one file,
Pythonic way is to group together highly related stuff. Not to "put
more stuff".
but I
believe this is categorically the wrong thing to do in large projects.
Oh yes ? Why ?
The moment you have more than one developer along with a revision
control system,
You *always* have a revision control system, don't you? And having more
than one developer on a project - be it big or small - is quite common.
you're going to want files to contain the smallest
practical functional blocks. I feel pretty confident saying that "put
more stuff in one file" is the wrong answer, even if it is the
Pythonic answer.
Is this actually based on working experience? It seems that there are
enough non-trivial Python projects around to prove that it works just
fine.
Mar 5 '07 #3

On Mar 5, 12:45 am, "bruno.desthuilli...@gmail.com"
<bruno.desthuilli...@gmail.com> wrote:
Remember that you can put code in
the __init__.py of a package, and that this code can import sub-
packages/modules namespaces, making the package internal organisation
transparent to user code
Sure, but that doesn't solve the problem.

Say you have a package "widgets" with classes ScrollBar, Form, etc.
You want the end user to "import widgets" and then invoke
"widgets.ScrollBar()". As far as I know there are only two ways to do
this, both seriously flawed: 1) Put all your code in one module
widgets.py, 2) use "from scrollbar import *" in widgets/__init__.py,
which is semi-deprecated and breaks reload().
Also remember that you can "import as", ie:

import some_package.some_subpackage.some_module as some_module
Sure, but that doesn't eliminate the unfortunate interaction between
Python class organization and filesystem hierarchy. For example, say
you want to organize the widgets package as follows:

widgets/scrollbar/*.py
widgets/form/*.py
widgets/common/util.py

Other than messing around with PYTHONPATH, which is horrible, I don't
see how to import util.py from the widget code.
Bad form IMHO. Packages and module names should be all_lower,
classnames CamelCased.
You're still stuck doing foo.Foo() everywhere in your client code,
which is ugly and wastes space, or using "from foo import *" which is
broken.
but I
believe this is categorically the wrong thing to do in large projects.

Oh yes ? Why ?
For myriad reasons, just one of them being the one I stated -- smaller
files with one functional unit each are more amenable to source code
management with multiple developers.

We could discuss this till we're blue in the face but it's beside the
point. For any given project, architecture, and workflow, the
developers are going to have a preference for how to organize the code
structurally into files, directories, packages, etc. The language
itself should not place constraints on them. The mere fact that it is
supposedly "Pythonic" to put more functionality in one file indicates
to me that the Python package system is obstructing some of its users
who have perfectly good reasons to organize their code differently.
you're going to want files to contain the smallest
practical functional blocks. I feel pretty confident saying that "put
more stuff in one file" is the wrong answer, even if it is the
Pythonic answer.

Is this actually based on working experience? It seems that there are
enough non-trivial Python projects around to prove that it works just
fine.
Yes. I've worked extensively on several projects in several languages
with multi-million lines of code and they invariably have coding
styles that recommend one functional unit (such as a class), or at
most a few closely related functional units per file.

In Python, most of the large projects I've looked at use "from foo
import *" liberally.

I guess my question boils down to this. Is "from foo import *" really
deprecated or not? If everyone has to use "from foo import *" despite
the problems it causes, how do they work around those problems (such
as reloading)?

Martin

Mar 5 '07 #4

Jorge, thanks for your response. I replied earlier but I think my
response got lost. I'm trying again.

On Mar 4, 5:20 pm, Jorge Godoy <jgo...@gmail.com> wrote:
Why? RCS systems can merge changes. An RCS system is not a substitute for
design or programmer communication.
Text merges are an error-prone process. They can't be eliminated but
they are best avoided when possible.

When refactoring, it's much better to move small files around than to
move chunks of code between large files. In the former case your SCM
system can track integration history, which is a big win.
Unit tests help ensure that one change doesn't break the project as a
whole, and for a big project you're surely going to have a lot of those tests.
But unit tests are never an excuse for an error-prone workflow. "Oh,
don't worry, we'll catch that with unit tests" is never something you
want to say or hear.
I don't reload... When my investigative tests get bigger I write a script
and run it with the interpreter. It is easy since my text editor can call
Python on a buffer (I use Emacs).
That's interesting, is this workflow pretty universal in the Python
world?

I guess that seems unfortunate to me, one of the big wins for
interpreted languages is to make the development cycle as short and
interactive as possible. As I see it, the Python way should be to
reload a file and reinvoke the class directly, not to restart the
interpreter, load an entire package and then run a test script to set
up your test conditions again.

Martin

Mar 5 '07 #5

On 5 Mar 2007 08:32:34 -0800, Martin Unsal <ma*********@gmail.com> wrote:
Jorge, thanks for your response. I replied earlier but I think my
response got lost. I'm trying again.

On Mar 4, 5:20 pm, Jorge Godoy <jgo...@gmail.com> wrote:
Why? RCS systems can merge changes. An RCS system is not a substitute for
design or programmer communication.

Text merges are an error-prone process. They can't be eliminated but
they are best avoided when possible.

When refactoring, it's much better to move small files around than to
move chunks of code between large files. In the former case your SCM
system can track integration history, which is a big win.
Unit tests help ensure that one change doesn't break the project as a
whole, and for a big project you're surely going to have a lot of those tests.

But unit tests are never an excuse for an error-prone workflow. "Oh,
don't worry, we'll catch that with unit tests" is never something you
want to say or hear.
That's actually the exact benefit of unit testing, but I don't feel
that you've actually made a case that this workflow is error prone.
You often have multiple developers working on the same parts of the
same module?
I don't reload... When my investigative tests get bigger I write a script
and run it with the interpreter. It is easy since my text editor can call
Python on a buffer (I use Emacs).

That's interesting, is this workflow pretty universal in the Python
world?

I guess that seems unfortunate to me, one of the big wins for
interpreted languages is to make the development cycle as short and
interactive as possible. As I see it, the Python way should be to
reload a file and reinvoke the class directly, not to restart the
interpreter, load an entire package and then run a test script to set
up your test conditions again.
If you don't do this, you aren't really testing your changes, you're
testing your reload() machinery. You seem to have a lot of views about
what the "Python way" should be and those are at odds with the actual
way people work with Python. I'm not (necessarily) saying you're
wrong, but you seem to be coming at this from a confrontational
standpoint.

Your claim, for example, that the language shouldn't place constraints
on how you manage your modules is questionable. I think it's more
likely that you've developed a workflow based around the constraints
(and abilities) of other languages and you're now expecting Python to
conform to that instead of its own.

I've copied some of your responses from your earlier post below:
> Yes. I've worked extensively on several projects in several languages
with multi-million lines of code and they invariably have coding
styles that recommend one functional unit (such as a class), or at
most a few closely related functional units per file.
I wonder if you've ever asked yourself why this is the case. I know
from my own experience why it's done in traditional C++/C environments
- it's because compiling is slow and breaking things into as many
files (with as few interdependencies) as possible speeds up the
compilation process. Absent this need (which doesn't exist in Python),
what benefit is there to separating out related functionality into
multiple files? Don't split them up just because you've done so in the
past - know why you did it in the past and if those conditions still
apply. Don't split them up until it makes sense for *this* project,
not the one you did last year or 10 years ago.
> I guess my question boils down to this. Is "from foo import *" really
deprecated or not? If everyone has to use "from foo import *" despite
the problems it causes, how do they work around those problems (such
as reloading)?
from foo import * is a bad idea at the top level because it pollutes
your local namespace. In a package __init__, which exists expressly
for the purpose of exposing its interior namespaces as a single flat
one, it makes perfect sense. In some cases you don't want to export
everything, which is when __all__ starts to make sense. Clients of a
package (or a module) shouldn't use from foo import * without a good
reason. Nobody I know uses reload() for anything more than trivial "as
you work" testing in the interpreter. It's not reliable or recommended
for anything other than that. It's not hard to restart a shell,
especially if you use ipython (which can save and re-create a session)
or a script that's set up to create your testing environment. This is
still a much faster way than compiling any but the most trivial of
C/C++ modules. In fact, on my system startup time for the interpreter
is roughly the same as the "startup time" of my compiler (that is to
say, the amount of time it takes deciding what it's going to compile,
without actually compiling anything).
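
To make the __all__ point above concrete, here is a tiny hypothetical
module (all names are illustrative):

# scrollbar.py
__all__ = ['ScrollBar']      # the only name "from scrollbar import *" exports

class ScrollBar(object):
    pass

class _Track(object):        # helper: the leading underscore plus its
    pass                     # absence from __all__ keep it out of star imports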
> You're still stuck doing foo.Foo() everywhere in your client code,
which is ugly and wastes space, or using "from foo import *" which is
broken.
If you don't like working with explicit namespaces, you've probably
chosen the wrong language. If you have a specific name (or a few
names) which you use all the time from a module, then you can import
just those names into your local namespace to save on typing. You can
also alias deeply nested names to something more shallow.
> For myriad reasons, just one of them being the one I stated -- smaller
files with one functional unit each are more amenable to source code
management with multiple developers.
I propose that the technique most amenable to source code management
is for a single file (or RCS level module, if you have a locking RCS)
to have everything that it makes sense to edit or change for a
specific feature. This is an impossible goal in practice (because you
will inevitably and necessarily have intermodule dependencies) but
your developers don't write code based around individual files. They
base it around the systems and the interfaces that compose your
project. It makes no more sense to arbitrarily break them into
multiple files than it does to arbitrarily leave them all in a single
file.

In summary: I think you've bound yourself to a style of source
management that made sense in the past without reanalyzing it to see
if it makes sense now. Trust your judgment and that of your developers
when it comes to modularization. When they end up needing to merge all
the time because they're conflicting with someone else's work, they'll
break things up into modules.

You're also placing far too much emphasis on reload. Focus yourself on
unit tests and environment scripts instead. These are more reliable
and easier to validate than reload() in a shell.
Mar 5 '07 #6

On Mar 5, 9:15 am, "Chris Mellon" <arka...@gmail.com> wrote:
That's actually the exact benefit of unit testing, but I don't feel
that you've actually made a case that this workflow is error prone.
You often have multiple developers working on the same parts of the
same module?
Protecting your head is the exact benefit of bike helmets; that
doesn't mean you should bike more recklessly just because you're
wearing a helmet. :)

Doing text merges is more error prone than not doing them. :)

There are myriad other benefits of breaking up large files into
functional units. Integration history, refactoring, reuse, as I
mentioned. Better clarity of design. Easier communication and
coordination within a team. What's the down side? What's the advantage
of big files with many functional units?
If you don't do this, you aren't really testing your changes, you're
testing your reload() machinery.
Only because reload() is hard in Python! ;)
You seem to have a lot of views about
what the "Python way" should be and those are at odds with the actual
way people work with Python. I'm not (necessarily) saying you're
wrong, but you seem to be coming at this from a confrontational
standpoint.
When I refer to "Pythonic" all I'm talking about is what I've read
here and observed in other people's code. I'm here looking for more
information about how other people work, to see if there are good
solutions to the problems I see.

However when I talk about what I think is "wrong" with the Pythonic
way, obviously that's just my opinion formed by my own experience.
Your claim, for example, that the language shouldn't place constraints
on how you manage your modules is questionable. I think it's more
likely that you've developed a workflow based around the constraints
(and abilities) of other languages and you're now expecting Python to
conform to that instead of its own.
I don't think so; I'm observing things that are common to several
projects in several languages.
I wonder if you've ever asked yourself why this is the case. I know
from my own experience why it's done in traditional C++/C environments
- it's because compiling is slow and breaking things into as many
files (with as few interdependencies) as possible speeds up the
compilation process.
I don't think that's actually true. Fewer, bigger compilation units
actually compile faster in C, at least in my experience.
Absent this need (which doesn't exist in Python),
Python still takes time to load & "precompile". That time is becoming
significant for me even in a modest sized project; I imagine it would
be pretty awful in a multimillion line project.

No matter how fast it is, I'd rather reload one module than exit my
interpreter and reload the entire world.

This is not a problem for Python as scripting language. This is a real
problem for Python as world class application development language.
In a package __init__, which exists expressly
for the purpose of exposing its interior namespaces as a single flat
one, it makes perfect sense.
OK! That's good info, thanks.
Nobody I know uses reload() for anything more than trivial "as
you work" testing in the interpreter. It's not reliable or recommended
for anything other than that.
That too... although I think that's unfortunate. If reload() were
reliable, would you use it? Do you think it's inherently unreliable,
that is, it couldn't be fixed without fundamentally breaking the
Python language core?
This is
still a much faster way than compiling any but the most trivial of
C/C++ modules.
I'm with you there! I love Python and I'd never go back to C/C++. That
doesn't change my opinion that Python's import mechanism is an
impediment to developing large projects in the language.
If you don't like working with explicit namespaces, you've probably
chosen the wrong language.
I never said that. I like foo.Bar(), I just don't like typing
foo.Foo() and bar.Bar(), which is a waste of space; syntax without
semantics.
I propose that the technique most amenable to source code management
is for a single file (or RCS level module, if you have a locking RCS)
to have everything that it makes sense to edit or change for a
specific feature.
Oh, I agree completely. I think we're using the exact same criterion.
A class is a self-contained feature with a well defined interface,
just what you'd want to put in its own file. (Obviously there are
trivial classes which don't implement features, and they don't need
their own files.)
You're also placing far too much emphasis on reload. Focus yourself on
unit tests and environment scripts instead. These are more reliable
and easier to validate than reload() in a shell.
I think this is the crux of my frustration. I think reload() is
unreliable and hard to validate because Python's package management is
broken. I appreciate your suggestion of alternatives and I think I
need to come to terms with the fact that reload() is just broken. That
doesn't mean it has to be that way or that Python is blameless in this
problem.

Martin

Mar 5 '07 #7

On 5 Mar 2007 10:31:33 -0800, Martin Unsal <ma*********@gmail.com> wrote:
On Mar 5, 9:15 am, "Chris Mellon" <arka...@gmail.com> wrote:
That's actually the exact benefit of unit testing, but I don't feel
that you've actually made a case that this workflow is error prone.
You often have multiple developers working on the same parts of the
same module?

Protecting your head is the exact benefit of bike helmets; that
doesn't mean you should bike more recklessly just because you're
wearing a helmet. :)

Doing text merges is more error prone than not doing them. :)

There are myriad other benefits of breaking up large files into
functional units. Integration history, refactoring, reuse, as I
mentioned. Better clarity of design. Easier communication and
coordination within a team. What's the down side? What's the advantage
of big files with many functional units?

I never advocated big files with many functional units - just files
that are "just big enough". You'll know you've broken them down small
enough when you stop having to do text merges every time you commit.
If you don't do this, you aren't really testing your changes, you're
testing your reload() machinery.

Only because reload() is hard in Python! ;)
You seem to have a lot of views about
what the "Python way" should be and those are at odds with the actual
way people work with Python. I'm not (necessarily) saying you're
wrong, but you seem to be coming at this from a confrontational
standpoint.

When I refer to "Pythonic" all I'm talking about is what I've read
here and observed in other people's code. I'm here looking for more
information about how other people work, to see if there are good
solutions to the problems I see.

However when I talk about what I think is "wrong" with the Pythonic
way, obviously that's just my opinion formed by my own experience.
Your claim, for example, that the language shouldn't place constraints
on how you manage your modules is questionable. I think it's more
likely that you've developed a workflow based around the constraints
(and abilities) of other languages and you're now expecting Python to
conform to that instead of its own.

I don't think so; I'm observing things that are common to several
projects in several languages.
... languages with similar runtime semantics and perhaps common
ancestry? All languages place limitations on how you handle modules,
either because they have infrastructure you need to use or because
they lack it and you're left on your own.
I wonder if you've ever asked yourself why this is the case. I know
from my own experience why it's done in traditional C++/C environments
- it's because compiling is slow and breaking things into as many
files (with as few interdependencies) as possible speeds up the
compilation process.

I don't think that's actually true. Fewer, bigger compilation units
actually compile faster in C, at least in my experience.
If you're doing whole project compilation. When you're working,
though, you want to be able to do incremental compilation (all modern
compilers I know of support this) so you just recompile the files
you've changed (and dependencies) and relink. Support for this is why
we have stuff like precompiled headers, shadow headers like Qt uses,
and why C++ project management advocates single class-per-file
structures. Fewer dependencies between compilation units means a
faster rebuild-test turnaround.
Absent this need (which doesn't exist in Python),

Python still takes time to load & "precompile". That time is becoming
significant for me even in a modest sized project; I imagine it would
be pretty awful in a multimillion line project.

No matter how fast it is, I'd rather reload one module than exit my
interpreter and reload the entire world.
Sure, but whats your goal here? If you're just testing something as
you work, then this works fine. If you're testing large changes, that
affect many modules, then you *need* to reload your world, because you
want to make sure that what you're testing is clean. I think this
might be related to your desire to have everything in lots of little
files. The more modules you load, the harder it is to track your
dependencies and make sure that the reload is correct.
This is not a problem for Python as scripting language. This is a real
problem for Python as world class application development language.
Considering that no other "world class application development
language" supports reload even as well as Python does, I'm not sure I
can agree here. A perfect reload might be a nice thing to have, but
lack of it hardly tosses Python (or any language) out of the running.
In a package __init__, which exists expressly
for the purpose of exposing its interior namespaces as a single flat
one, it makes perfect sense.

OK! That's good info, thanks.
Nobody I know uses reload() for anything more than trivial "as
you work" testing in the interpreter. It's not reliable or recommended
for anything other than that.

That too... although I think that's unfortunate. If reload() were
reliable, would you use it? Do you think it's inherently unreliable,
that is, it couldn't be fixed without fundamentally breaking the
Python language core?
The semantics of exactly what reload should do are tricky. Pythons
reload works in a sensible but limited way. More complicated reloads
are generally considered more trouble than they are worth. I've wanted
different things from reload() at different times, so I'm not even
sure what I would consider it being "reliable".

Here's a trivial example - if you rename a class in a module and then
reload it, what should happen to instances of the class you renamed?
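
To make the rename question concrete, here is roughly how it plays out
at the interpreter (a sketch; foo.py is a placeholder module defining
class Foo, later edited to rename it Bar):

import foo
obj = foo.Foo()            # hold an instance of the original class

# ... foo.py is edited: class Foo is renamed to class Bar ...
reload(foo)

isinstance(obj, foo.Bar)   # False: obj still points at the old class object
obj.__class__              # the pre-reload foo.Foo
hasattr(foo, 'Foo')        # True! reload() re-executes the module in the
                           # existing dict without purging stale names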
This is
still a much faster way than compiling any but the most trivial of
C/C++ modules.

I'm with you there! I love Python and I'd never go back to C/C++. That
doesn't change my opinion that Python's import mechanism is an
impediment to developing large projects in the language.
If you don't like working with explicit namespaces, you've probably
chosen the wrong language.

I never said that. I like foo.Bar(), I just don't like typing
foo.Foo() and bar.Bar(), which is a waste of space; syntax without
semantics.
There's nothing that prevents there being a bar.Foo, the namespace
makes it clear where you're getting the object. This is again a
consequence of treating modules like classes. Some modules only expose
a single class (StringIO/cStringIO in the standard library is a good
example), but it's more common for them to expose a single set of
"functionality".

That said, nothing prevents you from using "from foo import Foo" if
Foo is all you need (or need most - you can combine this with import
foo).
I propose that the technique most amenable to source code management
is for a single file (or RCS level module, if you have a locking RCS)
to have everything that it makes sense to edit or change for a
specific feature.

Oh, I agree completely. I think we're using the exact same criterion.
A class is a self-contained feature with a well defined interface,
just what you'd want to put in its own file. (Obviously there are
trivial classes which don't implement features, and they don't need
their own files.)
Sure, if all your classes are that. But very few classes exist in
isolation - there's external and internal dependencies, and some
classes are tightly bound. There's no reason for these tightly bound
classes to be in external files (or an external namespace), because
when you work on one you'll need to work on them all.
You're also placing far too much emphasis on reload. Focus yourself on
unit tests and environment scripts instead. These are more reliable
and easier to validate than reload() in a shell.

I think this is the crux of my frustration. I think reload() is
unreliable and hard to validate because Python's package management is
broken. I appreciate your suggestion of alternatives and I think I
need to come to terms with the fact that reload() is just broken. That
doesn't mean it has to be that way or that Python is blameless in this
problem.
I wonder what environments you worked in before that actually had a
reliable and gotcha free version of reload? I actually don't know of
any - Smalltalk is closest. It's not really "broken" when you
understand what it does. There's just an expectation that it does
something else, and when it doesn't meet that expectation it's assumed
to be broken. Now, that's a fair definition of "broken", but replacing
running instances in a live image is a very hard problem to solve
generally. Limiting reload() to straightforward, reliable behavior is
a reasonable design decision.
Mar 5 '07 #8

Martin Unsal wrote:
On Mar 5, 12:45 am, "bruno.desthuilli...@gmail.com"
<bruno.desthuilli...@gmail.comwrote:
>>Remember that you can put code in
the __init__.py of a package, and that this code can import sub-
packages/modules namespaces, making the package internal organisation
transparent to user code


Sure, but that doesn't solve the problem.

Say you have a package "widgets" with classes ScrollBar, Form, etc.
You want the end user to "import widgets" and then invoke
"widgets.ScrollBar()". As far as I know there are only two ways to do
this, both seriously flawed: 1) Put all your code in one module
widgets.py, 2) use "from scrollbar import *" in widgets/__init__.py,
which is semi-deprecated
"deprecated" ? Didn't see any mention of this so far. But it's bad form,
since it makes hard to know where some symbol comes from.

# widgets/__init__.py
from scrollbar import Scrollbar, SomeOtherStuff, some_function, SOME_CONST
and breaks reload().
>>Also remember that you can "import as", ie:

import some_package.some_subpackage.some_module as some_module


Sure, but that doesn't eliminate the unfortunate interaction between
Python class organization and filesystem hierarchy.
*class* organization ? It's not Java here. Nothing forces you to use
classes.
For example, say
you want to organize the widgets package as follows:

widgets/scrollbar/*.py
widgets/form/*.py
widgets/common/util.py

Other than messing around with PYTHONPATH, which is horrible, I don't
see how to import util.py from the widget code.
Some of us still manage to do so without messing with PYTHONPATH.
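
A sketch of one way to do it (hypothetical names; this assumes each
directory has an __init__.py and that the directory *containing*
widgets/ is already on sys.path):

# widgets/scrollbar/scrollbar.py
from widgets.common import util   # absolute import through the package

class ScrollBar(object):
    def resize(self, width):
        # clamp() is a hypothetical helper in widgets/common/util.py
        return util.clamp(width, 0, 100)

The import resolves through the top-level widgets package itself, so no
extra path manipulation is needed.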
>>Bad form IMHO. Packages and module names should be all_lower,
classnames CamelCased.


You're still stuck doing foo.Foo() everywhere in your client code,
from foo import Foo

But:
which is ugly
It's not ugly, it's informative. At least you know where Foo comes from.
and wastes space,
My. Three letters and a dot...
or using "from foo import *" which is
broken.
cf above.
>>>but I
believe this is categorically the wrong thing to do in large projects.

Oh yes ? Why ?


For myriad reasons, just one of them being the one I stated -- smaller
files with one functional unit each
Oh. So you're proposing that each and every single function goes in a
separate file?
are more amenable to source code
management with multiple developers.
This is not my experience.
We could discuss this till we're blue in the face but it's beside the
point. For any given project, architecture, and workflow, the
developers are going to have a preference for how to organize the code
structurally into files, directories, packages, etc. The language
itself should not place constraints on them. The mere fact that it is
supposedly "Pythonic" to put more functionality in one file indicates
to me that the Python package system is obstructing some of its users
who have perfectly good reasons to organize their code differently.
It has never been an issue for me so far.
>>>you're going to want files to contain the smallest
practical functional blocks. I feel pretty confident saying that "put
more stuff in one file" is the wrong answer, even if it is the
Pythonic answer.

Is this actually based on working experience? It seems that there are
enough non-trivial Python projects around to prove that it works just
fine.


Yes. I've worked extensively on several projects in several languages
with multi-million lines of code
I meant, based on working experience *with Python*? I've still not seen
a multi-million-line project in Python - unless of course you include
all the stdlib and the interpreter itself, and even then I doubt we get
that far.
and they invariably have coding
styles that recommend one functional unit (such as a class), or at
most a few closely related functional units per file.
Which is what I see in most Python packages I've seen so far. But we may
not have the same definition of "a few" and "closely related"?
In Python, most of the large projects I've looked at use "from foo
import *" liberally.
I've seen few projects using this. And I wouldn't like having to
maintain such a project.
I guess my question boils down to this. Is "from foo import *" really
deprecated or not?
This syntax is only supposed to be a handy shortcut for quick testing
and exploration in an interactive session. Using it in production code
is considered bad form.
If everyone has to use "from foo import *"
I never did in 7 years.
despite
the problems it causes, how do they work around those problems (such
as reloading)?
Do you often have a need for "reloading" in production code ???

Martin, I'm not saying Python is perfect, but it really feels like
you're worrying about things that are not problems.
Mar 5 '07 #9

Martin Unsal wrote:
(snip)
When refactoring, it's much better to move small files around than to
move chunks of code between large files.
Indeed. But having hundreds or thousands of files, each with at most a
dozen lines of effective code, is certainly not ideal. Remember that
Python lets you say much more in a few lines than some mainstream
languages I won't name here.
>>I don't reload... When my investigative tests get bigger I write a script
and run it with the interpreter. It is easy since my text editor can call
Python on a buffer (I use Emacs).

That's interesting, is this workflow pretty universal in the Python
world?
I don't know, but that's also mostly how I do work.
I guess that seems unfortunate to me,
So I guess you don't understand what Jorge is talking about.
one of the big wins for
interpreted languages is to make the development cycle as short and
interactive as possible.
It's pretty short and interactive. Emacs' Python mode lets you fire up a
subinterpreter and eval either your whole buffer or a class or def block
or even a single expression - and play with the result in the
subinterpreter.
As I see it, the Python way should be to
reload a file and reinvoke the class directly, not to restart the
interpreter, load an entire package and then run a test script to set
up your test conditions again.
^Cc^C! to start a new interpreter
^Cc^Cc to eval the whole module

Since the module takes care of "loading the entire package", you don't
have to worry about this. And since, once the script is eval'd, you still
have your (interactive) interpreter open, with all state set, you can
then explore at will. Try it for yourself. It's by far faster and easier
than trying to manually keep track of the interpreter state.
Mar 5 '07 #10

Martin Unsal wrote:
On Mar 5, 9:15 am, "Chris Mellon" <arka...@gmail.com> wrote:
(snip)
There are myriad other benefits of breaking up large files into
functional units. Integration history, refactoring, reuse, as I
mentioned. Better clarity of design. Easier communication and
coordination within a team. What's the down side? What's the advantage
of big files with many functional units?
What is a "big file" ?

(snip)
However when I talk about what I think is "wrong" with the Pythonic
way, obviously that's just my opinion formed by my own experience.
Your own experience *with Python* ? or any close-enough language ? Or
your experience with C++ ?

(snip)
Python still takes time to load & "precompile".
compile. To byte-code, FWIW. Not "load & precompile". And - apart from
the top-level script - only modified modules get recompiled.
That time is becoming
significant for me even in a modest sized project;
On what hardware are you working??? I have my interpreter up and
running in a couple of milliseconds, and my box is a poor Athlon XP 1200/256.
I imagine it would
be pretty awful in a multimillion line project.
I'm still waiting to see a multimillion-line project in Python !-)

If you find yourself in this situation, then there's certainly
something totally wrong in the way you (and/or your team) design and code.

But anyway - remember that only the modified modules get recompiled.
No matter how fast it is, I'd rather reload one module than exit my
interpreter and reload the entire world.
Did you actually *try* it?
This is not a problem for Python as scripting language. This is a real
problem for Python as world class application development language.
Sorry to have to say so, but this is total bullshit IMHO - which is
based on working experience.
>>Nobody I know uses reload() for anything more than trivial "as
you work" testing in the interpreter. It's not reliable or recommended
for anything other than that.


That too... although I think that's unfortunate. If reload() were
reliable, would you use it?
I wouldn't. It's easier to rerun a simple test script and keep the
interpreter open with full state - then you're sure you have the
correct desired state.
>>This is
still a much faster way than compiling any but the most trivial of
C/C++ modules.

I'm with you there! I love Python and I'd never go back to C/C++. That
doesn't change my opinion that Python's import mechanism is an
impediment to developing large projects in the language.
What about basing your opinion on facts ? What about going with the
language instead of fighting against it ?
>>If you don't like working with explicit namespaces, you've probably
chosen the wrong language.


I never said that. I like foo.Bar(), I just don't like typing
foo.Foo() and bar.Bar(), which is a waste of space; syntax without
semantics.
May I say that the problem here comes from your insistence on putting
each class in a single module?
>>I propose that the technique most amenable to source code management
is for a single file (or RCS level module, if you have a locking RCS)
to have everything that it makes sense to edit or change for a
specific feature.


Oh, I agree completely. I think we're using the exact same criterion.
I really doubt you do. What Chris is talking about is grouping together
what usually needs to change together.
A class is a self-contained feature with a well defined interface,
So is a function. Should we put every single function in a separate module
then?
just what you'd want to put in its own file. (Obviously there are
trivial classes which don't implement features, and they don't need
their own files.)

>>You're also placing far too much emphasis on reload. Focus yourself on
unit tests and environment scripts instead. These are more reliable
and easier to validate than reload() in a shell.


I think this is the crux of my frustration. I think reload() is
unreliable and hard to validate because Python's package management is
broken.
I think the "crux of your frustation" comes from your a priori. Fighting
against a language can only bring you into frustration. If the language
don't fit your brain - which is perfectly legitimate - then use another
one - but don't blame the language for it.
Mar 5 '07 #11

Bruno Desthuilliers <bd*****************@free.quelquepart.fr> wrote:
> I don't reload... When my investigative tests get bigger I write a script
and run it with the interpreter. It is easy since my text editor can call
Python on a buffer (I use Emacs).
That's interesting, is this workflow pretty universal in the Python
world?

I don't know, but that's also mostly how I do work.
My favorite way of working: add a test (or a limited set of tests) for
the new or changed feature, run it, check that it fails, change the
code, rerun the test, check that the test now runs, rerun all tests to
see that nothing broke, add and run more tests to make sure the new code
is excellently covered, rinse, repeat. Occasionally, to ensure the code
stays clean, stop to refactor, rerunning tests as I go.
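
A minimal sketch of the shape of that cycle (the feature and all names
here are hypothetical):

import unittest

class TestSlugify(unittest.TestCase):
    # written first, and watched failing before the code exists
    def test_spaces_become_dashes(self):
        self.assertEqual(slugify('hello world'), 'hello-world')

def slugify(text):
    # then just enough code to make the test pass
    return '-'.join(text.split())

if __name__ == '__main__':
    unittest.main()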

I'm also keen on bigger tests (integration tests, as well as system
tests for regressions, acceptance, etc), but of course I don't run those
anywhere as frequently (they're not part of my daily workflow, iterated
multiple times per day -- more like a "nightly run" kind of thing, or
for special occasions such as just before committing into HEAD... I'm
somewhat of a stickler about HEAD *always* passing *all* tests...).

Not exactly TDD, please note -- I tend to start the cycle with a few
tests (not strictly just one), implement some large chunk of the
new/changed stuff, and add "coverage" and "boundary cases" tests towards
the end of the cycle (more often than not I don't need further changes
to satisfy the coverage and boundary-case tests, because of the "large
chunk" thing). So, a TDD purist would blast me for heresy.

Nevertheless, having tried everything from pure TDD to papertrail-heavy
waterfall (including the "toss the bits over the wall to QA", shudder!)
to typical Chaos Driven Development, in over a quarter century of
experience, this almost-TDD is what works best for me -- in Python, C,
and C++, at least (it's been a long time, if ever, since I did enough
production Java, Haskell, Ruby, SML, assembly, Perl, bash, Fortran,
Cobol, Objective C, Tcl, awk, Scheme, PL/I, Rexx, Forth, Pascal,
Modula-2, or Basic, to be sure that the same approach would work well in
each of these cases, though I have no reason to think otherwise).
Alex
Mar 6 '07 #12

On Mar 5, 11:06 am, "Chris Mellon" <arka...@gmail.com> wrote:
I never advocated big files with many functional units - just files
that are "just big enough".
Then we're in total agreement. I'm not sure why you thought my
opinions were the result of baggage from other languages when you
don't seem to actually disagree with me.
Fewer dependencies between compilation units means a
faster rebuild-test turnaround.
I know all about incremental builds and I just don't think people use
small compilation units in C++ to make their builds faster. It has
certainly never been the reason why I subdivided a source file.
Sure, but whats your goal here? If you're just testing something as
you work, then this works fine. If you're testing large changes, that
affect many modules, then you *need* to reload your world, because you
want to make sure that what you're testing is clean.
I don't think reload works for anything but trivial scripts. The
moment you use "from foo import bar" reload is broken.
The semantics of exactly what reload should do are tricky. Pythons
reload works in a sensible but limited way.
I agree that there is some subtlety there, and I appreciate your
example. However the fact that Python's module system essentially
forces you to use "from foo import *" and that reload is almost
entirely incompatible with "from foo import *", I would say that
reload is essentially useless.
That said, nothing prevents you from using "from foo import Foo" if
Foo is all you need (or need most - you can combine this with import
foo).
Well "from foo import Foo" is just a special case of "from foo import
*". :) It still breaks reload. It still means you're restarting your
interpreter even to do the most trivial development cycle.
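
For the record, the breakage and its manual workaround look like this at
the interpreter (a sketch; foo and Foo are placeholder names):

from foo import Foo   # binds the *current* class object to a local name

import foo
reload(foo)           # foo.Foo is now the freshly executed class...
Foo is foo.Foo        # ...False: the local name still holds the old object

from foo import Foo   # re-import to rebind the local name
Foo is foo.Foo        # True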
I wonder what environments you worked in before that actually had a
reliable and gotcha free version of reload?
I'm perfectly well aware that I'm not going to be able to reload a
widget in the middle of a running GUI app, for example. I'm not
looking for gotcha free, I'll settle for minimally useful.

Here's an analogy. In C, you can do an incremental build and run your
modified application without having to first reboot your computer. In
Python, where reload() is essentially the incremental build process,
and the interpreter is essentially a virtual machine, you guys are
saying that my best option is to just "reboot" the virtual machine to
make sure I have a "clean slate". It may be the path of least
resistance, but to say that it is necessary or inevitable is 1960s
mainframe thinking.

Martin

Mar 6 '07 #13

On Mar 5, 3:11 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.fr> wrote:
Your own experience *with Python* ?
No, my experience with Visual Basic. ;)

Of course my experience with Python!

Sorry, I can continue writing snarky replies to your snarky comments
but that won't get us anywhere productive. Instead I think the
following really gets to the crux of the issue.
May I say that the problem here comes from your insistence on putting
each class in a single module?
No, it doesn't.

It really doesn't matter how many classes you have in a module; either
you use "from foo import bar", or you are stuck with a file structure
that is isomorphic to your design namespace.

The former breaks reload; the latter breaks large projects.

Martin

Mar 6 '07 #14

On Mar 5, 10:06 pm, a...@mac.com (Alex Martelli) wrote:
My favorite way of working: add a test (or a limited set of tests) for
the new or changed feature, run it, check that it fails, change the
code, rerun the test, check that the test now runs, rerun all tests to
see that nothing broke, add and run more tests to make sure the new code
is excellently covered, rinse, repeat. Occasionally, to ensure the code
stays clean, stop to refactor, rerunning tests as I go.
From the way you describe your workflow, it sounds like you spend very
little time working interactively in the interpreter. Is that the case
or have I misunderstood?

Martin

Mar 6 '07 #15

On Tue, 06 Mar 2007 04:57:18 -0300, Martin Unsal <ma*********@gmail.com>
wrote:
On Mar 5, 10:06 pm, a...@mac.com (Alex Martelli) wrote:
>My favorite way of working: add a test (or a limited set of tests) for
the new or changed feature, run it, check that it fails, change the
code, rerun the test, check that the test now runs, rerun all tests to
[...]
From the way you describe your workflow, it sounds like you spend very
little time working interactively in the interpreter. Is that the case
or have I misunderstood?
FWIW, I only work interactively with the interpreter to test some
constructs, or use timeit, or check code posted here... Never to develop
production code. That's why I don't care at all about reload(), for example.

--
Gabriel Genellina

Mar 6 '07 #16

In article <11**********************@v33g2000cwv.googlegroups.com>,
"Martin Unsal" <ma*********@gmail.com> wrote:
That too... although I think that's unfortunate. If reload() were
reliable, would you use it? Do you think it's inherently unreliable,
that is, it couldn't be fixed without fundamentally breaking the
Python language core?
I wrote a module that wraps __import__ and tracks the dependencies of
imports. It then allows you to unload any modules whose source has
changed. That seemed to work out nicely for multi-module projects.
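
A rough sketch of this kind of hook - not Dave's actual code, just a
reconstruction of the approach he describes, using the Python 2 era
__builtin__ API:

import __builtin__, os, sys

_orig_import = __builtin__.__import__
_mtimes = {}   # module name -> source mtime when first imported
_users = {}    # module name -> set of module names that import it

def _tracking_import(name, globals=None, locals=None, fromlist=None, level=-1):
    module = _orig_import(name, globals, locals, fromlist, level)
    importer = globals.get('__name__') if globals else None
    src = getattr(module, '__file__', '')
    if src.endswith('.pyc') or src.endswith('.pyo'):
        src = src[:-1]
    if src.endswith('.py') and os.path.exists(src):
        _mtimes.setdefault(module.__name__, os.path.getmtime(src))
        if importer:
            _users.setdefault(module.__name__, set()).add(importer)
    return module

__builtin__.__import__ = _tracking_import

def unload_changed():
    # Drop modules whose source changed - plus everything that imported
    # them - from sys.modules, so the next import re-executes fresh code.
    dead = set()
    for name, stamp in _mtimes.items():
        src = getattr(sys.modules.get(name), '__file__', '')
        if src.endswith('.pyc') or src.endswith('.pyo'):
            src = src[:-1]
        if src.endswith('.py') and os.path.exists(src) and \
                os.path.getmtime(src) > stamp:
            dead.add(name)
    grew = True
    while grew:   # propagate staleness to whatever imported a dead module
        grew = False
        for name, users in _users.items():
            if name in dead and not users <= dead:
                dead |= users
                grew = True
    for name in dead:
        _mtimes.pop(name, None)
        sys.modules.pop(name, None)

After editing source files, calling unload_changed() and re-importing
gives fresh code without restarting the interpreter.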

However, one problem I ran into was that dynamic libraries don't get
reloaded, so if you are doing hybrid C++/Python development then this
doesn't help - you still have to restart the whole python process to
pick up changes in your C++ code.

I also didn't do a ton of testing. It worked for a few small projects
I was working on, but I stopped using it once I ran into the dynamic
library thing, and at this point I'm used to just restarting python
each time. I'm sure there are some odd things that some python modules
could do that would interfere with the automatic reloading code I
wrote.

If you're interested in the code, drop me an email.

Dave
Mar 6 '07 #17

Martin Unsal wrote:
On Mar 5, 3:11 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.fr> wrote:
>>Your own experience *with Python* ?


No, my experience with Visual Basic. ;)

Of course my experience with Python!
Sorry but this was really not obvious.
Sorry, I can continue writing snarky replies to your snarky comments
but that won't get us anywhere productive.
You're right - sorry.
Instead I think the
following really gets to the crux of the issue.
>>May I say that the problem here comes from your insistance on putting
each class in a single module ?

No, it doesn't.

It really doesn't matter how many classes you have in a module; either
you use "from foo import bar", or you are stuck with a file structure
that is isomorphic to your design namespace.

The former breaks reload;
<imho>
Which is not a problem. reload() is of very limited use for any
non-trivial stuff.
</imho>

Mar 6 '07 #18

al***@mac.com (Alex Martelli) writes:
Bruno Desthuilliers <bd*****************@free.quelquepart.fr> wrote:
>>I don't reload... When my investigative tests get bigger I write a script
and run it with the interpreter. It is easy since my text editor can call
Python on a buffer (I use Emacs).

That's interesting, is this workflow pretty universal in the Python
world?

I don't know, but that's also mostly how I do work.

My favorite way of working: add a test (or a limited set of tests) for
the new or changed feature, run it, check that it fails, change the
code, rerun the test, check that the test now runs, rerun all tests to
see that nothing broke, add and run more tests to make sure the new code
is excellently covered, rinse, repeat. Occasionally, to ensure the code
stays clean, stop to refactor, rerunning tests as I go.
I believe this is a distinct case. When we write tests we're concerned with
the system itself. When using the interactive interpreter we're concerned
with how best to use the language. There might be some feature of the system
related to that investigation, but there might not be. For example: "what are
the methods provided by this object?" or "which approach is faster for this
loop?"

I won't write a test case to test loop speed. But I'd poke at it with the
interpreter, and if the environment gets a bit too big to set up then I'd go
to the text editor, as I said.
--
Jorge Godoy <jg****@gmail.com>
Mar 6 '07 #19

"Martin Unsal" <ma*********@gmail.comwrites:
On Mar 5, 11:06 am, "Chris Mellon" <arka...@gmail.comwrote:
> I never advocated big files with many functional units - just files
that are "just big enough".

Then we're in total agreement. I'm not sure why you thought my
opinions were the result of baggage from other languages when you
don't seem to actually disagree with me.
I believe the reason was that you were advocating one class per file. "big
enough" might be more classes. Or fewer... :-)
I agree that there is some subtlety there, and I appreciate your
example. However the fact that Python's module system essentially
forces you to use "from foo import *" and that reload is almost
entirely incompatible with "from foo import *", I would say that
reload is essentially useless.
They don't force you to do that... There are many modules that do, but they
are generally gluing your Python code to code written in some other language
(usually C). This is common for GUI development, for example.

In fact, it is rare for me -- in mathematics, statistics, database work, web
development, testing -- to use this construction. There are no modules that
demand it.

And you can also write:

from foo import Bar, Baz

or even

from foo import Bar as B1, Baz as B2 # OUCH! ;-)
Well "from foo import Foo" is just a special case of "from foo import
*". :) It still breaks reload. It still means you're restarting your
interpreter even to do the most trivial development cycle.
That's what you get when you're working with instances of Foo... I believe
that for classmethods this would work right. So, again, it depends on your
code, how it is structured (and how it can be structured), etc.
Here's an analogy. In C, you can do an incremental build and run your
modified application without having to first reboot your computer. In
Python, where reload() is essentially the incremental build process,
and the interpreter is essentially a virtual machine, you guys are
saying that my best option is to just "reboot" the virtual machine to
make sure I have a "clean slate". It may be the path of least
resistance, but to say that it is necessary or inevitable is 1960s
mainframe thinking.
How can you reload C code that would affect already running code --
i.e. existing data, pointers, etc. -- without reloading the full program?
Even changing and reloading a dynamic library wouldn't do that to already
existing code, so you'd have to "reboot" your application as well.

--
Jorge Godoy <jg****@gmail.com>
Mar 6 '07 #20

P: n/a
On 5 Mar 2007 23:35:00 -0800, Martin Unsal <ma*********@gmail.com> wrote:
On Mar 5, 11:06 am, "Chris Mellon" <arka...@gmail.com> wrote:
I never advocated big files with many functional units - just files
that are "just big enough".

Then we're in total agreement. I'm not sure why you thought my
opinions were the result of baggage from other languages when you
don't seem to actually disagree with me.
Because you're advocating single class per file. A scan through the
standard library may be instructive, where there are some modules that
expose a single class (StringIO, pprint) and others that expose many,
and some that expose none at all. "smallest unit that it makes sense
to work on" and "single class" are totally different things. In any
case, as I hinted at, I prefer an organic, developer driven approach
to deciding these things, not handed down from above style guidelines.
You know your modules are broken up enough when you no longer have
conflicts.
Fewer dependencies between compilation units means a
faster rebuild-test turnaround.

I know all about incremental builds and I just don't think people use
small compilation units in C++ to make their builds faster. It
has certainly never been the reason why I subdivided a source file.
Faster compile/debug/edit cycle is the main justification I've heard
for single class per file. The others are variations of your RCS
argument, which I don't think is justifiable for the above reasons. It
smells of the kind of "my developers are stupid" short-sighted
management that kills projects.
Sure, but whats your goal here? If you're just testing something as
you work, then this works fine. If you're testing large changes, that
affect many modules, then you *need* to reload your world, because you
want to make sure that what you're testing is clean.

I don't think reload works for anything but trivial scripts. The
moment you use "from foo import bar" reload is broken.
The semantics of exactly what reload should do are tricky. Python's
reload works in a sensible but limited way.

I agree that there is some subtlety there, and I appreciate your
example. However, given that Python's module system essentially
forces you to use "from foo import *" and that reload is almost
entirely incompatible with "from foo import *", I would say that
reload is essentially useless.
I'm still not sure why you believe this, since several counterexamples
were given. As an intellectual exercise, though, let's assume that
reload is totally broken and you just can't use it. Pretend it will
reformat your machine if you ever call it. Can you really think of no
other reason to use Python? You still haven't given any justification
for why a magic reload is essential to Python development when a) all
existing python development works fine without it and b) all existing
development in every other language works fine without it.
That said, nothing prevents you from using "from foo import Foo" if
Foo is all you need (or need most - you can combine this with import
foo).

Well "from foo import Foo" is just a special case of "from foo import
*". :) It still breaks reload. It still means you're restarting your
interpreter even to do the most trivial development cycle.
You're totally fixated on reload. I don't understand this. I'm totally
positive that your traditional development experience has not been in
an environment where you could effortlessly slot in new code to a
running image. Why do you demand it from Python?

Also, the difference between "from foo import Bar" and "from foo
import *" is that the former is limited in scope (you're adding a
limited set of explicit names to your namespace) and is futureproof
(additional names exported from foo won't clash with vars in the
importing module with unknown effects). The reason why one is common
and accepted and the other is frowned upon has nothing to do with
reload().
I wonder what environments you worked in before that actually had a
reliable and gotcha free version of reload?

I'm perfectly well aware that I'm not going to be able to reload a
widget in the middle of a running GUI app, for example. I'm not
looking for gotcha free, I'll settle for minimally useful.
Then reload() as is is what you want.
Here's an analogy. In C, you can do an incremental build and run your
modified application without having to first reboot your computer. In
Python, where reload() is essentially the incremental build process,
and the interpreter is essentially a virtual machine, you guys are
saying that my best option is to just "reboot" the virtual machine to
make sure I have a "clean slate". It may be the path of least
resistance, but to say that it is necessary or inevitable is 1960s
mainframe thinking.
But you do need to restart the application image. The python
interpreter is not an emulator. You're drawing incompatible analogies
and making unjustified assumptions based on them. reload() is not an
incremental build process, and starting a new Python instance is not
rebooting your machine. This is just not a justifiable comparison.
Mar 6 '07 #21

P: n/a
Martin Unsal <ma*********@gmail.com> wrote:
On Mar 5, 10:06 pm, a...@mac.com (Alex Martelli) wrote:
My favorite way of working: add a test (or a limited set of tests) for
the new or changed feature, run it, check that it fails, change the
code, rerun the test, check that the test now runs, rerun all tests to
see that nothing broke, add and run more tests to make sure the new code
is excellently covered, rinse, repeat. Occasionally, to ensure the code
stays clean, stop to refactor, rerunning tests as I go.

From the way you describe your workflow, it sounds like you spend very
little time working interactively in the interpreter. Is that the case
or have I misunderstood?
I often do have an interpreter open in its own window, to help me find
out something or other, but you're correct that it isn't where I "work";
I want all tests to be automated and repeatable, after all, so they're
better written as their own scripts and run in the test-framework. I
used to use a lot of doctests (often produced by copy and paste from an
interactive interpreter session), but these days I lean more and more
towards unittest and derivatives thereof.

Sometimes, when I don't immediately understand why a test is failing
(or, at times, why it's unexpectedly succeeding _before_ I have
implemented the feature it's supposed to test!-), I stick a
pdb.set_trace() call at the right spot to "look around" (and find out
how to fix the test and/or the code) -- I used to use "print" a lot for
such exploration, but the interactive interpreter started by pdb is
often handier (I can look at as many pieces of data as I need to find
out about the problem). I still prefer to run the test[s] within the
test framework, getting interactive only at the point where I want to
be, rather than running the tests from within pdb to "set breakpoints"
manually -- not a big deal either way, I guess.
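For illustration, here's a minimal sketch of that pattern -- the function
under test is just a stand-in:

import pdb
import unittest

def compute():
    # stand-in for the real code under test
    return 6 * 7

class FeatureTest(unittest.TestCase):
    def test_result(self):
        result = compute()
        pdb.set_trace()  # pause here to "look around" at result and locals
        self.assertEqual(result, 42)

if __name__ == '__main__':
    unittest.main()

Running the file stops at the set_trace() call with the test's locals in
scope, and continuing resumes the test run.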
Alex
Mar 6 '07 #22

P: n/a
Jorge Godoy <jg****@gmail.com> wrote:
...
My favorite way of working: add a test (or a limited set of tests) for
the new or changed feature, run it, check that it fails, change the
code, rerun the test, check that the test now runs, rerun all tests to
see that nothing broke, add and run more tests to make sure the new code
is excellently covered, rinse, repeat. Occasionally, to ensure the code
stays clean, stop to refactor, rerunning tests as I go.

I believe this is a distinct case. When we write tests we're worried with the
system itself.
Not sure I get what you mean; when I write tests, just as when I write
production code, I'm focused (not worried:-) about the application
functionality I'm supposed to deliver. The language mostly "gets out of
my way" -- that's why I like Python, after all:-).

When using the interactive interpreter we're worried with how
to best use the language. There might be some feature of the system related
to that investigation, but there might be not. For example: "what are the
methods provided by this object?" or "which approach is faster for this loop?"
I do generally keep an interactive interpreter running in its own
window, and help and dir are probably the functions I call most often
there. If I need to microbenchmark for speed, I use timeit (which I
find far handier to use from the commandline). I wouldn't frame this as
"worried with how to best use the language" though; it's more akin to a
handy reference manual (I also keep a copy of the Nutshell handy for
exactly the same reason -- some things are best looked up on paper).
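For instance, a typical command-line microbenchmark (the statements compared
here are just examples):

$ python -m timeit -s "data = range(1000)" "[x*x for x in data]"
$ python -m timeit -s "data = range(1000)" "map(lambda x: x*x, data)"

timeit picks a sensible number of loops by itself and reports the best of
three runs.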

I won't write a test case to test loop speed. But I'd poke with the
interpreter, and if the environment gets a bit big to set up then I'd go to the
text editor, as I said.
I don't really see "getting a bit big to setup" as the motivation for
writing automated, repeatable tests (including load-tests, if speed is
such a hot topic in your case); rather, the key issue is, will you ever
want to run this again? For example, say you want to check the relative
speeds of approaches A and B -- if you do that in a way that's not
automated and repeatable (i.e., not by writing scripts), then you'll
have to repeat those manual operations exactly every time you refactor
your code, upgrade Python or your OS or some library, switch to another
system (HW or SW), etc, etc. Even if it's only three or four steps, who
needs the aggravation? Almost anything worth doing (in the realm of
testing, measuring and variously characterizing software, at least) is
worth automating, to avoid any need for repeated manual labor; that's
how you get real productivity, by doing ever less work yourself and
pushing ever more work down to your computer.
Alex
Mar 6 '07 #23

P: n/a

Bruno Desthuilliers wrote:
<imho>
Which is not a problem. reload() is of very limited use for any
non-trivial stuff.
</imho>
Now that I've heard this from 5 different people it might be sinking
in. :) :) I really do appreciate all of you taking the time to explain
this to me.

When I started using Python a few years ago I was very excited about
the fact that it was an interpreted language and offered a more
interactive workflow than the old compile-link-test workflow. As my
project has grown to be pretty sizeable by Python standards, I tried
to continue taking advantage of the tight, reload-based, interpreted-
language workflow and it's become really cumbersome, which is
disappointing. However y'all are right, giving up on reload() doesn't
mean Python is inadequate for large projects, just that it doesn't
live up entirely to what I perceived as its initial promise. Once I
adjust my mindset and workflow for a life without reload(), I'll
probably be better off.

I'd like to point out something though. More than one of the people
who responded have implied that I am bringing my prior-language
mindset to Python, even suggesting that my brain isn't built for
Python. ;) In fact I think it's the other way around. I am struggling
to take full advantage of the fact that Python is an interpreted
language, to use Python in the most "Pythonic" way. You guys are
telling me that's broken and I should go back to a workflow that is
identical in spirit, and not necessarily any faster than I would use
with a compiled language. While that might be the right answer in
practice, I don't feel like it's a particularly "good" answer, and it
confirms my initial impression that Python package management is
broken.

I think you should be asking yourselves, "Did we all abandon reload()
because it is actually an inferior workflow, or just because it's
totally broken in Python?"

I have one question left but I'll ask that in a separate post.

Martin

Mar 6 '07 #24

P: n/a
On Mar 5, 2:18 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.fr> wrote:
Martin Unsal wrote:
For example, say
you want to organize the widgets package as follows:
widgets/scrollbar/*.py
widgets/form/*.py
widgets/common/util.py
Other than messing around with PYTHONPATH, which is horrible, I don't
see how to import util.py from the widget code.

Some of us still manage to do so without messing with PYTHONPATH.
How exactly do you manage it?

The only way I can see to do it is to have widgets/__init__.py look
something like this:

from common import util
from scrollbar import Scrollbar
from form import Form

Then Scrollbar.py doesn't have to worry about importing util, it just
assumes that util is already present in its namespace.

BUT ... this means that Scrollbar.py can only be loaded in the
interpreter as part of package "widgets". You can't run an interpreter
and type "import widgets.scrollbar.Scrollbar" and start going to town,
because Scrollbar doesn't import its own dependencies.

So what I want to clarify here: Do Python programmers try to design
packages so that each file in the package can be individually loaded
into the interpreter and will automatically import its own
dependencies? Or do you design packages so they can only be used by
importing from the top level and running the top level __init__.py?

I hope that made sense. :)

Martin

Mar 6 '07 #25

P: n/a
I'd like to point out something though. More than one of the people
who responded have implied that I am bringing my prior-language
mindset to Python, even suggesting that my brain isn't built for
Python. ;) In fact I think it's the other way around. I am struggling
to take full advantage of the fact that Python is an interpreted
language, to use Python in the most "Pythonic" way. You guys are
telling me that's broken and I should go back to a workflow that is
identical in spirit, and not necessarily any faster than I would use
with a compiled language. While that might be the right answer in
practice, I don't feel like it's a particularly "good" answer, and it
confirms my initial impression that Python package management is
broken.

I think you should be asking yourselves, "Did we all abandon reload()
because it is actually an inferior workflow, or just because it's
totally broken in Python?"
Sorry, but I fail to see the point of your argument.

Reloading a module means that you obviously have some editor open in which you
code your module, and an interactive interpreter running where you somehow
have to make the

reload(module)

line (re-)appear, and then most probably (unless the pure reloading itself
triggers some testing code) some other line that e.g. instantiates a class
defined in "module"

Now how exactly does that differ from having a test.py file containing

import module
<do-something>

and a commandline sitting there with a

python test.py

waiting to be executed, easily brought back by a single key-stroke.

Especially if <do-something> becomes more than a few easy lines brought back
by the command-line history.

I've been writing Python for a few years now, for programs up to a few
thousand lines, and _never_ felt the slightest need to reload anything. And as
there have been quite a few discussions like this in the past few years,
IMHO reload is a wart and should be removed.

Diez
Mar 6 '07 #26

P: n/a
On 6 Mar 2007 08:42:00 -0800, Martin Unsal <ma*********@gmail.com> wrote:
On Mar 5, 2:18 pm, Bruno Desthuilliers
<bdesth.quelquech...@free.quelquepart.fr> wrote:
Martin Unsal wrote:
For example, say
you want to organize the widgets package as follows:
widgets/scrollbar/*.py
widgets/form/*.py
widgets/common/util.py
Other than messing around with PYTHONPATH, which is horrible, I don't
see how to import util.py from the widget code.
Some of us still manage to do so without messing with PYTHONPATH.

How exactly do you manage it?

The only way I can see to do it is to have widgets/__init__.py look
something like this:

from common import util
from scrollbar import Scrollbar
from form import Form

Then Scrollbar.py doesn't have to worry about importing util, it just
assumes that util is already present in its namespace.

BUT ... this means that Scrollbar.py can only be loaded in the
interpreter as part of package "widgets". You can't run an interpreter
and type "import widgets.scrollbar.Scrollbar" and start going to town,
because Scrollbar doesn't import its own dependencies.

So what I want to clarify here: Do Python programmers try to design
packages so that each file in the package can be individually loaded
into the interpreter and will automatically import its own
dependencies? Or do you design packages so they can only be used by
importing from the top level and running the top level __init__.py?

I hope that made sense. :)
Scrollbar *can't* assume that util will be present in its namespace,
because it won't be unless it imports it. Scrollbar needs to import
its own dependencies. But why do you think that's a problem?
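For instance, Scrollbar.py can pull in the dependency itself with an
absolute import -- a sketch, assuming the widgets package from your example
is on the PYTHONPATH (util.clamp is an invented helper):

# widgets/scrollbar/Scrollbar.py
from widgets.common import util

class Scrollbar(object):
    def resize(self, size):
        return util.clamp(size, 0, 100)  # hypothetical helper in common/util.py

Imported as widgets.scrollbar.Scrollbar, the module brings in everything it
needs on its own.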
Martin

Mar 6 '07 #27

P: n/a
On Mar 6, 6:07 am, "Chris Mellon" <arka...@gmail.com> wrote:
Because you're advocating single class per file.
What I actually said was "Smallest practical functional block." I
never said one class per file, in fact I generally have more than one
class per file. Nonetheless I frequently have a class which has the
same name as the file it's contained in, which is where I start having
trouble.
What you said was: "A scan through the
standard library may be instructive, where there are some modules that
expose a single class (StringIO, pprint) and others that expose many,
and some that expose none at all."
AHA! Here we see the insidious Python package system at work! ;)

I said "file" and you assume that I am talking about the exposed
namespace. Files should not have to be isomorphic with namespace! A
package that exposes many classes may still use one class per file if
it wants to.
In any
case, as I hinted at, I prefer an organic, developer driven approach
to deciding these things, not handed down from above style guidelines.
PRECISELY. And in the case of Python, package structure is dictated,
not by a style guideline, but by the design flaws of Python's package
system.

Martin

Mar 6 '07 #28

P: n/a
On 6 Mar 2007 09:09:13 -0800, Martin Unsal <ma*********@gmail.com> wrote:
On Mar 6, 6:07 am, "Chris Mellon" <arka...@gmail.com> wrote:
Because you're advocating single class per file.

What I actually said was "Smallest practical functional block." I
never said one class per file, in fact I generally have more than one
class per file. Nonetheless I frequently have a class which has the
same name as the file it's contained in, which is where I start having
trouble.
You do? Or do you only have trouble because you don't like using "from
foo import Foo" because you need to do more work to reload such an
import?
>
What you said was A scan through the
standard library may be instructive, where there are some modules that
expose a single class (StringIO, pprint) and others that expose many,
and some that expose none at all.

AHA! Here we see the insidious Python package system at work! ;)

I said "file" and you assume that I am talking about the exposed
namespace. Files should not have to be isomorphic with namespace! A
package that exposes many classes may still use one class per file if
it wants to.
What makes you think that the exposed namespace has to be isomorphic
with the filesystem? Further, why do you think doing so is bad? People
do it because it's convenient and simple, not because its necessary.
Why don't you like filesystems?
In any
case, as I hinted at, I prefer an organic, developer driven approach
to deciding these things, not handed down from above style guidelines.

PRECISELY. And in the case of Python, package structure is dictated,
not by a style guideline, but by the design flaws of Python's package
system.
What design flaws are those? Is it because you're trying to have
packages as part of your project without installing them on your
PYTHONPATH somewhere?

If you want to break a module internally into multiple files, then
make it a package. To an importer, they're almost indistinguishable.
If you want to break a module into multiple packages and then stick
the files that make up the package in bizarre spots all over the
filesystem, can you give a reason why?
Mar 6 '07 #29

P: n/a
On Mar 6, 8:56 am, "Chris Mellon" <arka...@gmail.com> wrote:
Scrollbar *can't* assume that util will be present in its namespace,
because it won't be unless it imports it. Scrollbar needs to import
its own dependencies. But why do you think thats a problem?
OK, maybe I'm totally missing something here, but you can't do
"import ../util/common" in Python can you?

Look at the directory structure in my original post. How does
Scrollbar.py import its dependencies from common.py, without relying
on PYTHONPATH?

Martin

Mar 6 '07 #30

P: n/a
On 6 Mar 2007 09:24:32 -0800, Martin Unsal <ma*********@gmail.com> wrote:
On Mar 6, 8:56 am, "Chris Mellon" <arka...@gmail.com> wrote:
Scrollbar *can't* assume that util will be present in its namespace,
because it won't be unless it imports it. Scrollbar needs to import
its own dependencies. But why do you think thats a problem?

OK, maybe I'm totally missing something here, but you can't do
"import ../util/common" in Python can you?

Look at the directory structure in my original post. How does
Scrollbar.py import its dependencies from common.py, without relying
on PYTHONPATH?
It assumes that util.common is a module that's on the PYTHONPATH.

The common way to ensure that this is the case is either to handle
util as a separate project, and install it into the system
site-packages just as you would any third party package, or to have it
(and all your other application packages and modules) off a single
root, which is where your application "base" scripts live.
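For the separate-project route, a minimal distutils script is enough (the
metadata here is purely illustrative):

# setup.py
from distutils.core import setup

setup(
    name='util',
    version='0.1',
    packages=['util'],  # installs the util package into site-packages
)

Then python setup.py install makes util importable from anywhere, with no
PYTHONPATH fiddling.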

This, and other intra-package import issues are affected by the
relative/absolute import changes that were begun in Python 2.5, you
can read about them here: http://www.python.org/dev/peps/pep-0328/

Note that using relative imports to import a package that "happens" to
share a common higher-level directory would be frowned upon. The
"blessed" mechanism would still be to use an absolute import, and to
install the other package on the PYTHONPATH in one of any number of
ways.
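For the legitimate case -- a genuine subpackage -- PEP 328's explicit
relative imports look like this (a sketch, Python 2.5+ only, reusing the
names from the earlier example):

# widgets/scrollbar/Scrollbar.py
from ..common import util  # up to the widgets package, then into common

Note this only works when the module is imported as part of the widgets
package; run directly as a script, the relative import fails.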
Mar 6 '07 #31

P: n/a
Diez B. Roggisch wrote:
>>I'd like to point out something though. More than one of the people
who responded have implied that I am bringing my prior-language
mindset to Python, even suggesting that my brain isn't built for
Python. ;) In fact I think it's the other way around. I am struggling
to take full advantage of the fact that Python is an interpreted
language, to use Python in the most "Pythonic" way. You guys are
telling me that's broken and I should go back to a workflow that is
identical in spirit, and not necessarily any faster than I would use
with a compiled language. While that might be the right answer in
practice, I don't feel like it's a particularly "good" answer, and it
confirms my initial impression that Python package management is
broken.

I think you should be asking yourselves, "Did we all abandon reload()
because it is actually an inferior workflow, or just because it's
totally broken in Python?"


Sorry, but I fail to see the point of your argumentation.

Reloading a module means that you obviously have some editor open in which you
code your module, and an interactive interpreter running where you somehow
have to make the

reload(module)

line (re-)appear, and then most probably (unless the pure reloading itself
triggers some testing code) some other line that e.g. instantiates a class
defined in "module"

Now how exactly does that differ from having a test.py file containing

import module
<do-something>

and a commandline sitting there with a

python test.py
Actually, make it
python -i test.py

Then you have test.py executed, and your interactive interpreter up and
ready in the desired state.
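A concrete sketch, using only the stdlib so it actually runs:

$ cat test.py
import pprint
data = {'widgets': ['scrollbar', 'form'], 'depth': 2}

$ python -i test.py
>>> pprint.pprint(data)
{'depth': 2, 'widgets': ['scrollbar', 'form']}

The script runs to completion, then the prompt comes up with data (and
anything else test.py defined) already bound.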
Mar 6 '07 #32

P: n/a
On Mar 6, 9:19 am, "Chris Mellon" <arka...@gmail.com> wrote:
You do? Or do you only have trouble because you don't like using "from
foo import Foo" because you need to do more work to reload such an
import?
More work, like rewriting __import__ and reload??? :)

There's a point where you should blame the language, not the
programmer. Are you saying I'm lazy just because I don't want to mess
with __import__?
What makes you think that the exposed namespace has to be isomorphic
with the filesystem?
I don't; you do!

I was clearly talking about files and you assumed I was talking about
namespace. That's Pythonic thinking... and I don't mean that in a good
way!
If you want to break a module into multiple packages and then stick
the files that make up the package in bizarre spots all over the
filesystem, can you give a reason why?
Because I have written a project with 50,000 lines of Python and I'm
trying to organize it in such a way that it'll scale up cleanly by
another order of magnitude. Because I've worked on projects with
millions of lines of code and I know about how such things are
organized. It's funny, I'm a newbie to Python but it seems like I'm
one of the only people here thinking about it as a large scale
development language rather than a scripting language.

Martin

Mar 6 '07 #33

P: n/a
On Mar 6, 9:46 am, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
The only usage I've ever made of "reload()" has been during
interactive debugging: Modify the module, then reload it at the
interactive prompt so I could create an instance of the modified code,
and manually manipulate it.
That's exactly what I want to do. That's exactly what I'm having
trouble with.

Martin

Mar 6 '07 #34

P: n/a
On 6 Mar 2007 09:49:55 -0800, Martin Unsal <ma*********@gmail.com> wrote:
On Mar 6, 9:19 am, "Chris Mellon" <arka...@gmail.com> wrote:
You do? Or do you only have trouble because you don't like using "from
foo import Foo" because you need to do more work to reload such an
import?

More work, like rewriting __import__ and reload??? :)

There's a point where you should blame the language, not the
programmer. Are you saying I'm lazy just because I don't want to mess
with __import__?
You have to reload the importing module as well as the module that
changed. That doesn't require rewriting the import infrastructure.
It's only an issue because you're changing things at one level but
you're trying to use them at a level removed from that. I never work
that way, because I only have any need or desire to reload when I'm
working interactively and I when I'm doing that I work directly with
the modules I'm changing. The interfaces are what my unit tests are
for. If you're doing stuff complicated and intricate enough in the
interpreter that you need reload() to do very much more than it's
doing, then you're working poorly - that sort of operation should be
in a file you can run and test automatically.
>
What makes you think that the exposed namespace has to be isomorphic
with the filesystem?

I don't; you do!

I was clearly talking about files and you assumed I was talking about
namespace. That's Pythonic thinking... and I don't mean that in a good
way!
All the files on the PYTHONPATH will map into the namespace. However,
you can have items in the namespace that do not map to files. The main
reasons to do so are related to deployment, not development, though, so
I wonder why you want to.
If you want to break a module into multiple packages and then stick
the files that make up the package in bizarre spots all over the
filesystem, can you give a reason why?

Because I have written a project with 50,000 lines of Python and I'm
trying to organize it in such a way that it'll scale up cleanly by
another order of magnitude. Because I've worked on projects with
millions of lines of code and I know about how such things are
organized. It's funny, I'm a newbie to Python but it seems like I'm
one of the only people here thinking about it as a large scale
development language rather than a scripting language.
That's not answering the question. Presumably you have some sort of
organization for your code in mind. What about that organization
doesn't work for Python? If you want multiple files to map to a single
module, make them a package.
Mar 6 '07 #35

P: n/a
On Mar 6, 9:34 am, "Chris Mellon" <arka...@gmail.com> wrote:
It assumes that util.common is a module that's on the PYTHONPATH.
Now we're getting somewhere. :)
The common way to ensure that this is the case is either to handle
util as a separate project, and install it into the system
site-packages just as you would any third party package,
This breaks if you ever need to test more than one branch of the same
code base. I use a release branch and a development branch. Only the
release branch goes into site-packages, but obviously I do most of my
work in the development branch.
or to have it
(and all your other application packages and modules) off a single
root, which is where your application "base" scripts live.
This has SERIOUS scaling problems.
This, and other intra-package import issues are affected by the
relative/absolute import changes that were begun in Python 2.5, you
can read about them here: http://www.python.org/dev/peps/pep-0328/
Awesome! Thanks. I'll take a look.
Note that using relative imports to import a package that "happens" to
share a common higher-level directory would be frowned upon.
What if it shares a common higher level directory by design? :)

Relative imports aren't ideal, but I think in some cases it's better
than relying on PYTHONPATH which is global state (an environment
variable no less).

Martin

Mar 6 '07 #36

P: n/a
Martin Unsal <ma*********@gmail.com> wrote:
We could discuss this till we're blue in the face but it's beside the
point. For any given project, architecture, and workflow, the
developers are going to have a preference for how to organize the
code structurally into files, directories, packages, etc. The
language itself should not place constraints on them.
I agree.

For example, say you want to organize the widgets package as follows:
widgets/scrollbar/*.py
widgets/form/*.py
widgets/common/util.py
One possibility is to have one module for each namespace that you want,
and compose each module out of multiple files by using execfile().
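A rough sketch of that idea -- the file layout is invented, and execfile()
is Python 2 only:

# widgets.py -- one module/namespace composed from several source files
import os

_src = os.path.join(os.path.dirname(__file__), 'widgets_src')
for _part in ('util.py', 'scrollbar.py', 'form.py'):
    execfile(os.path.join(_src, _part))  # each file runs in this module's globals

Importers just see a single widgets module, however many files it was
assembled from.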

-M-

Mar 6 '07 #37

P: n/a
On 6 Mar 2007 10:30:03 -0800, Martin Unsal <ma*********@gmail.com> wrote:
On Mar 6, 9:34 am, "Chris Mellon" <arka...@gmail.com> wrote:
It assumes that util.common is a module that's on the PYTHONPATH.

Now we're getting somewhere. :)
The common way to ensure that this is the case is either to handle
util as a separate project, and install it into the system
site-packages just as you would any third party package,

This breaks if you ever need to test more than one branch of the same
code base. I use a release branch and a development branch. Only the
release branch goes into site-packages, but obviously I do most of my
work in the development branch.
There's a number of solutions. They do involve manipulation of
PYTHONPATH or creation of infrastructure, though. I find that I
generally work "against" only one version of a package at a time, so
it's not any trouble for me to create a local directory that has all
the versions I'm working against. Testing infrastructure manipulates
PYTHONPATH to ensure it's testing the version it's supposed to.
or to have it
(and all your other application packages and modules) off a single
root which is where your your application "base" scripts live.

This has SERIOUS scaling problems.
If you have lots of modules used by lots of "things" it can be. Not
necessarily though, it depends on how you package and deploy them.
It's often the best solution to the above issue when it comes to
testing, though.
This, and other intra-package import issues are affected by the
relative/absolute import changes that were begun in Python 2.5, you
can read about them here: http://www.python.org/dev/peps/pep-0328/

Awesome! Thanks. I'll take a look.
Note that using relative imports to import a package that "happens" to
share a common higher-level directory would be frowned upon.

What if it shares a common higher level directory by design? :)
Then it's a subpackage of a parent package. That's different than just
walking up to wherever you did your RCS checkout.
Relative imports aren't ideal, but I think in some cases it's better
than relying on PYTHONPATH which is global state (an environment
variable no less).
Environment and manipulation of it is the job of the top level
script/application/whatever. Modules/packages/whatever should rely on
PYTHONPATH being sane.
Martin

Mar 6 '07 #38

P: n/a
On Mar 6, 10:13 am, "Chris Mellon" <arka...@gmail.com> wrote:
You have to reload the importing module as well as the module that
changed. That doesn't require rewriting the import infrastructure.
As far as I can tell, the moment you use "from foo_module import bar",
you've broken reload(). Reloading higher level packages doesn't help.
The only practical solution I can see is to rewrite __import__ and
reload.
That's not answering the question. Presumably you have some sort of
organization for your code in mind.
I already gave a simple example. I thought you were asking why I would
want to organize code that way, and the only short answer is
experience. I'd prefer not to try to formulate a long answer because
it would be time consuming and somewhat off topic, but we can go there
if necessary.

Martin

Mar 6 '07 #39

P: n/a
On 6 Mar 2007 10:58:14 -0800, Martin Unsal <ma*********@gmail.com> wrote:
On Mar 6, 10:13 am, "Chris Mellon" <arka...@gmail.com> wrote:
You have to reload the importing module as well as the module that
changed. That doesn't require rewriting the import infrastructure.

As far as I can tell, the moment you use "from foo_module import bar",
you've broken reload(). Reloading higher level packages doesn't help.
The only practical solution I can see is to rewrite __import__ and
reload.
Example:

a.py
AExport = object()

b.py
from a import AExport

class Object(object): pass

BExport = Object()
BExport.a = AExport

interpreter session:
>>> import b
>>> b.AExport
<object object at 0x009804A8>
>>> b.BExport.a
<object object at 0x009804A8>
>>> import a
>>> a.AExport
<object object at 0x009804A8>
>>> "changed a.py such that AExport = list()"
'changed a.py such that AExport = list()'
>>> reload(b)
<module 'b' from 'b.pyc'>
>>> b.AExport
<object object at 0x009804A8>
>>> "note no change"
'note no change'
>>> reload(a)
<module 'a' from 'a.py'>
>>> b.AExport
<object object at 0x009804A8>
>>> "note still no change"
'note still no change'
>>> reload(b)
<module 'b' from 'b.pyc'>
>>> b.AExport
[]
>>> "now its changed"
'now its changed'
>>> b.BExport.a
[]
>>>
Mar 6 '07 #40

P: n/a
In article <11**********************@h3g2000cwc.googlegroups.com>,
"Martin Unsal" <ma*********@gmail.comwrote:
I'm using Python for what is becoming a sizeable project and I'm
already running into problems organizing code and importing packages.
I feel like the Python package system, in particular the isomorphism
between filesystem and namespace, doesn't seem very well suited for
big projects. However, I might not really understand the Pythonic way.
I'm not sure if I have a specific question here, just a general plea
for advice.

1) Namespace. Python wants my namespace hierarchy to match my
filesystem hierarchy. I find that a well organized filesystem
hierarchy for a nontrivial project will be totally unwieldy as a
namespace. I'm either forced to use long namespace prefixes, or I'm
forced to use "from foo import *" and __all__, which has its own set
of problems.
1a) Module/class collision. I like to use the primary class in a file
as the name of the file. However this can lead to namespace collisions
between the module name and the class name. Also it means that I'm
going to be stuck with the odious and wasteful syntax foo.foo
everywhere, or forced to use "from foo import *".
The issue of module names vs contained class names is one thing I find a
bit frustrating about Python. Fortunately it is fairly easy to work
around.

My own solution has been to import up just one level. So for example:
pkg/subpkg/foo.py defines class foo and associated stuff
pkg/subpkg/bar.py defines class bar
pkg/subpkg/__init__.py contains:

from foo import *
from bar import *

To use this I then do:
import pkg.subpkg
myfoo = pkg.subpkg.foo(...)

But that's the only "from x import" that I do. I never raise stuff from
a sub-package to a higher level.

Once you do this (or in some other way eliminate the foo.foo problem), I
think you will find that python namespaces work very well for large
projects.

Overall I personally like having the namespace follow the file structure
(given that one has to use source files in the first place; my Smalltalk
roots are showing). Java reportedly does much the same thing and it is
very helpful for finding code.

I'm sure it's partly what you're used to that counts. C++ experts
probably enjoy the freedom of C++ namespaces, but to me it's just a pain
that they are totally independent of file structure.
1b) The Pythonic way seems to be to put more stuff in one file, but I
believe this is categorically the wrong thing to do in large projects.
The moment you have more than one developer along with a revision
control system, you're going to want files to contain the smallest
practical functional blocks. I feel pretty confident saying that "put
more stuff in one file" is the wrong answer, even if it is the
Pythonic answer.
I don't personally find that Python encourages lots of code per file. I
think this perception only stems from (1a) and once you solve that
you'll find it's fine to divide your code into small files.
2) Importing and reloading. I want to be able to reload changes
without exiting the interpreter. This pretty much excludes "from foo
import *", unless you resort to this sort of hack:

http://www.python.org/search/hyperma...1993/0448.html

Has anyone found a systematic way to solve the problem of reloading in
an interactive interpreter when using "from foo import *"?
I totally agree here. This is a real weakness of Python and makes it
feel much more static than it ought to be. I know of no solution other
than restarting. That tends to be fast, but it can be a pain to get back
to where you were.

Smalltalk solved this problem long ago in a way that makes for very
dynamic development and debugging. Unfortunately few languages have
followed suit. The Smalltalk development environment is the one feature
I really miss in all other languages I've used (I certainly don't miss
its quirky syntax for control flow :)).

-- Russell
Mar 6 '07 #41

P: n/a
In article <11**********************@8g2000cwh.googlegroups.com>,
"Martin Unsal" <ma*********@gmail.comwrote:
On Mar 6, 9:34 am, "Chris Mellon" <arka...@gmail.comwrote:
It assumes that util.common is a module thats on the PYTHONPATH.

Now we're getting somewhere. :)
The common way to ensure that this is the case is either to handle
util as a separate project, and install it into the system
site-packages just as you would any third party package,

This breaks if you ever need to test more than one branch of the same
code base. I use a release branch and a development branch. Only the
release branch goes into site-packages, but obviously I do most of my
work in the development branch.
This is an interesting point that we are just facing. If you have a big
package for all your stuff and you want to separately version components
of it, you do run into problems. The solution we are adopting is to
write a custom import hook, but a simpler solution is to make sure each
separately versioned component is a top-level package (in which case you
can manipulate PYTHONPATH to temporarily "install" a test version).
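For instance, before anything imports the component (the checkout path is
made up):

import sys

# put the development checkout ahead of the installed copy
sys.path.insert(0, '/home/me/branches/devel')

import util  # now resolves to the development version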

-- Russell
Mar 6 '07 #42

P: n/a
"Martin Unsal" <ma*********@gmail.comwrites:
I think you should be asking yourselves, "Did we all abandon reload()
because it is actually an inferior workflow, or just because it's
totally broken in Python?"
I never "abandoned reload()", because it never even occurred to me to
use the interpreter for developing the code that I know is going to
end up in a file anyway. That's what my text editor is for.

--
\ "I have a microwave fireplace in my house. The other night I |
`\ laid down in front of the fire for the evening in two minutes." |
_o__) -- Steven Wright |
Ben Finney

Mar 6 '07 #43

P: n/a
"Martin Unsal" <ma*********@gmail.comwrites:
On Mar 6, 9:19 am, "Chris Mellon" <arka...@gmail.comwrote:
>You do? Or do you only have trouble because you don't like using "from
foo import Foo" because you need to do more work to reload such an
import?

More work, like rewriting __import__ and reload??? :)

There's a point where you should blame the language, not the
programmer. Are you saying I'm lazy just because I don't want to mess
with __import__?
I *never* messed with __import__. And one of my systems has more than 15
packages, with an average of 7 more subpackages plus __init__.py...

Why do you need messing with __import__?
I was clearly talking about files and you assumed I was talking about
namespace. That's Pythonic thinking... and I don't mean that in a good
way!
Hmmm... Why not? How are you going to track down where something is, in
which file? I can make successive imports and I can subclass things, so I
might be importing a subclass of a subclass of the class that provides the
method that I want to change. Having a direct correlation helps me a lot with
big projects. For small ones I don't care since they are very simple and a
grep usually takes me directly to where I want (just to avoid tools that map
classes to files that are specific to one IDE or editor).
Because I have written a project with 50,000 lines of Python and I'm trying
to organize it in such a way that it'll scale up cleanly by another order of
magnitude. Because I've worked on projects with millions of lines of code
and I know about how such things are organized. It's funny, I'm a newbie to
Python but it seems like I'm one of the only people here thinking about it
as a large scale development language rather than a scripting language.
I don't see a problem scaling my biggest project with, now, 65K lines of code.
What are the problems you're seeing for yours? In fact, the Python part of
this code is the easiest to deal with. And there's ctypes involved here,
which messes things up a bit since I need to keep C + Python in sync.

And if I ever imagined I'd write that many LOC and would reach millions of
LOC of *Python* code, then it would certainly make me feel comfortable knowing
that this approach *does* scale. At least for me and for the ones that work with
me and use the system... Implementing new features is fast and extremely
modular. There are modules specific to one client, modules specific to
another, modules shared between all clients, etc. It isn't a monolithic,
all-or-nothing design. And even so, it works.

There are customizations on some features that only exists at one client's
branch, there are customizations that might be selected "on the fly" by
choosing something on a preferences screen, etc.

It is a "normal" (but rather complex) application on any aspect that we see
around. And it scales. I don't fear changing code. I don't fear adding new
features. It "simply works".
--
Jorge Godoy <jg****@gmail.com>
Mar 6 '07 #44

P: n/a
al***@mac.com (Alex Martelli) writes:
Not sure I get what you mean; when I write tests, just as when I write
production code, I'm focused (not worried:-) about the application
semantics... ;-) Thanks for the correction.
functionality I'm supposed to deliver. The language mostly "gets out of
my way" -- that's why I like Python, after all:-).
That's the same reason why I like it. I believe it is not a coincidence that
we both like writing Python code.

But there are cases where investigating is more necessary than testing. This
is where I see the need for the interactive session. For the program's features I
also write tests.
I do generally keep an interactive interpreter running in its own
window, and help and dir are probably the functions I call most often
there. If I need to microbenchmark for speed, I use timeit (which I
find far handier to use from the commandline). I wouldn't frame this as
"worried with how to best use the language" though; it's more akin to a
handy reference manual (I also keep a copy of the Nutshell handy for
exactly the same reason -- some things are best looked up on paper).
That's the same use -- and the same most used functions -- that I have here.
I believe that I wasn't clear in my previous post, and this is why you saw a
different meaning in it.
I don't really see "getting a bit big to setup" as the motivation for
writing automated, repeatable tests (including load-tests, if speed is
such a hot topic in your case); rather, the key issue is, will you ever
It's not for writing tests. It's for investigating things. If I have to open
database connections, make several queries to get to a point where I have the
object that I want to "dir()", it is easier for me to put that all in a file.
It isn't a test.

want to run this again? For example, say you want to check the relative
speeds of approaches A and B -- if you do that in a way that's not
automated and repeatable (i.e., not by writing scripts), then you'll
have to repeat those manual operations exactly every time you refactor
your code, upgrade Python or your OS or some library, switch to another
system (HW or SW), etc, etc. Even if it's only three or four steps, who
needs the aggravation? Almost anything worth doing (in the realm of
testing, measuring and variously characterizing software, at least) is
worth automating, to avoid any need for repeated manual labor; that's
how you get real productivity, by doing ever less work yourself and
pushing ever more work down to your computer.
I won't write a script for two commands and rerun them often. But I
would for a few more -- let's say starting from 5 commands I might start
thinking about having this somewhere where I can at least cut'n'paste into the
interactive interpreter (even with readline's help).
--
Jorge Godoy <jg****@gmail.com>
Mar 6 '07 #45

P: n/a
On Mar 6, 12:49 pm, "Martin Unsal" <martinun...@gmail.com> wrote:
On Mar 6, 9:19 am, "Chris Mellon" <arka...@gmail.com> wrote:
You do? Or do you only have trouble because you don't like using "from
foo import Foo" because you need to do more work to reload such an
import?

More work, like rewriting __import__ and reload??? :)

There's a point where you should blame the language, not the
programmer. Are you saying I'm lazy just because I don't want to mess
with __import__?
What makes you think that the exposed namespace has to be isomorphic
with the filesystem?

I don't; you do!

I was clearly talking about files and you assumed I was talking about
namespace. That's Pythonic thinking... and I don't mean that in a good
way!
If you want to break a module into multiple packages and then stick
the files that make up the package in bizarre spots all over the
filesystem, can you give a reason why?

Because I have written a project with 50,000 lines of Python and I'm
trying to organize it in such a way that it'll scale up cleanly by
another order of magnitude. Because I've worked on projects with
millions of lines of code and I know about how such things are
organized. It's funny, I'm a newbie to Python but it seems like I'm
one of the only people here thinking about it as a large scale
development language rather than a scripting language.

Martin

I'm still not clear on what your problem is or why you don't like
"from foo import bar". FWIW our current project is about 330,000
lines of Python code. I do a ton of work in the interpreter--I'll
often edit code and then send a few lines over to the interpreter to
be executed. For simple changes, reload() works fine; for more
complex cases we have a reset() function to clear out most of the
namespace and re-initialize. I don't really see how reload could be
expected to guess, in general, what we'd want reloaded and what we'd
want kept, so I have a hard time thinking of it as a language problem.
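Such a reset() might look roughly like this -- entirely a guess at the idea,
not the poster's actual code, and the startup script name is invented:

def reset(namespace, keep=('reset', '__builtins__', '__name__', '__doc__')):
    # throw away interactively created names...
    for name in list(namespace):
        if name not in keep:
            del namespace[name]
    # ...then rerun the hypothetical startup/import script
    execfile('startup.py', namespace)

Called from the interpreter as reset(globals()).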

Mar 6 '07 #46

P: n/a
On Mar 6, 4:58 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:
"Martin Unsal" <martinun...@gmail.comwrites:
I think you should be asking yourselves, "Did we all abandon reload()
because it is actually an inferior workflow, or just because it's
totally broken in Python?"

I never "abandoned reload()", because it never even occurred to me to
use the interpreter for developing the code that I know is going to
end up in a file anyway. That's what my text editor is for.
It's most useful for debugging for me; I'll instantiate the objects of
a known bad test case, poke around, maybe put some more debugging code
into one of my classes and re-instantiate only those objects (but keep
the rest of the test objects as-is).

Even there I find that I'd rather use a scratch file in an editor to
set up the test cases and send a specified region to the interpreter
for the most part, only actually typing in the interpreter when I'm
poking at an object. I'll often wind up wanting to pull part of the
test case out either to go into the production code or to set up a
permanent unit test.

Once I figure out what's going on, the production code definitely gets
edited in the text editor.

Even though I use the interactive interpreter every day, though, I
haven't noticed reload being a major issue.

Mar 6 '07 #47

P: n/a
Martin Unsal wrote:
I'm using Python for what is becoming a sizeable project and I'm
already running into problems organizing code and importing packages.
I feel like the Python package system, in particular the isomorphism
between filesystem and namespace, doesn't seem very well suited for
big projects.
I've never worked on what you would call a "big project", but I *am*
kind of a neat-freak/control-freak about file organization of code, so I
have tinkered with the structure of source trees in Python quite a bit.

If you want to explode a module into a lot of smaller files, you create
a package. I find that this usually works best like this (this is what
the filesystem looks like):

package_name/
package_pre.py - contains globals for the package
component_a.py - a useful-sized collection of functionality
component_b.py - another
component_c.py - another
package_post.py - stuff that relies on the prior stuff
__init__.py - or you can put the "post" stuff here

Then __init__.py contains something like:

from package_pre import *
from component_a import *
from component_b import *
from component_c import *
from package_post import *

or you can explicitly load what you need:

from package_pre import *
from component_a import A, A1, A2
from component_a import A3 as A5
from component_b import B, B1
from component_c import C, C2, C5
from package_post import *

if you want to keep the namespace cleaner.

Also, instead of just dropping things into the module's global
namespace, use a named namespace, such as a class, or use the
"package_pre" in the example above. That helps to keep things separable.

IOW, you can use __init__.py to set up the package's namespace any way
you want, breaking the actual code up into just about as many files as
you like (I also don't like reading long source files -- I find it
easier to browse directories than source files, even with outlining
extensions. It's rare for me to have more than 2-3 classes per file).

Of course, if you *really* want your namespace to be *completely*
different from the filesystem, then there's no actual reason that all of
these files have to be in the same directory. You can use Python's
relative import (standard in Python 2.5+, available using __future__ in
2.4, IIRC) to make this easier. There was an obnoxious hack used in Zope
which used code to extract the "package_path" and then prepend that to
get absolute import locations which was necessary in earlier versions --
but I can't recommend that, just use the newer version of Python.

So, you could do evil things like this in __init__.py:

from .other_package.fiddly_bit import dunsel

(i.e. grab a module from a neighboring package)

Of course, I really can't recommend that either. Python will happily do
it, but it's a great way to shoot yourself in the foot in terms of
keeping your code organized!

The only exception to that is that I often have a "util" or "utility"
package which has a collection of little extras I find useful throughout
my project.

As for relying heavily on reload(), it isn't that great of a feature for
debugging large projects. Any code of sufficient size to make reload()
problematic, though, needs formal unit testing, anyway. The cheapest and
easiest unit test method is doctests (IMHO), so you ought to give those
a try -- I think you'll like the easy relationship those have to working
in the interactive interpreter: just walk your objects through their
paces in the interpreter, then cut-and-paste.
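For example, a doctest pasted straight from an interpreter session -- the
function is a stand-in:

def clamp(value, lo, hi):
    """Clamp value into the range [lo, hi].

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(99, 0, 10)
    10
    """
    return max(lo, min(hi, value))

if __name__ == '__main__':
    import doctest
    doctest.testmod()

Running the file checks every interpreter snippet in the docstring.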

What reload() and the interactive interpreter is good for is
experimentation, not development.

If you need huge amounts of code to be loaded to be able to do any
useful experiments with the modules you are writing, then your code is
too tightly coupled to begin with. Try to solve that by using something
like "mock objects" to replace the full blown implementations of objects
you need for testing. I've never formally used any of the "mock"
packages, but I have done a number of tests using objects which are
dumbed-down versions of objects which are really supposed to be provided
from another module -- but I wanted to test the two separately (which is
essentially creating my own mock objects from scratch).

HTH,
Terry

--
Terry Hancock (ha*****@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

Mar 7 '07 #48

P: n/a
On Mar 5, 1:21 am, "Martin Unsal" <martinun...@gmail.com> wrote:
2) Importing and reloading. I want to be able to reload changes
without exiting the interpreter.
What about this?

$ cat reload_obj.py
"""
Reload a function or a class from the filesystem.

For instance, suppose you have a module

$ cat mymodule.py
def f():
    print 'version 1 of function f'

Suppose you are testing the function from the interactive interpreter:

>>> from mymodule import f
>>> f()
version 1 of function f

Then suppose you edit mymodule.py:

$ cat mymodule.py
def f():
    print 'version 2 of function f'

You can see the changes in the interactive interpreter simply by doing

>>> f = reload_obj(f)
>>> f()
version 2 of function f
"""

import inspect

def reload_obj(obj):
    assert inspect.isfunction(obj) or inspect.isclass(obj)
    mod = __import__(obj.__module__)
    reload(mod)
    return getattr(mod, obj.__name__)

Pretty simple, isn't it?

The issue is that if you have other objects depending on the previous version
of the function/class, they will keep depending on the previous version, not
on the reloaded one, but you cannot expect miracles from reload! ;)

You can also look at Michael Hudson's recipe

http://aspn.activestate.com/ASPN/Coo.../Recipe/160164

for a clever approach to automatic reloading.

Michele Simionato

Mar 7 '07 #49

P: n/a
package_name/
package_pre.py - contains globals for the package
component_a.py - a useful-sized collection of functionality
component_b.py - another
component_c.py - another
package_post.py - stuff that relies on the prior stuff
__init__.py - or you can put the "post" stuff here

Then __init__.py contains something like:

from package_pre import *
from component_a import *
from component_b import *
from component_c import *
from package_post import *

Anansi Spaceworks http://www.AnansiSpaceworks.com
Thank you! That is by far the clearest explanation of that I have ever
seen. I saved it and sent it on to a friend who is learning Python.

Mar 7 '07 #50
