Project organization and import

I'm using Python for what is becoming a sizeable project and I'm
already running into problems organizing code and importing packages.
I feel like the Python package system, in particular the isomorphism
between filesystem and namespace, doesn't seem very well suited for
big projects. However, I might not really understand the Pythonic way.
I'm not sure if I have a specific question here, just a general plea
for advice.

1) Namespace. Python wants my namespace hierarchy to match my
filesystem hierarchy. I find that a well organized filesystem
hierarchy for a nontrivial project will be totally unwieldy as a
namespace. I'm either forced to use long namespace prefixes, or I'm
forced to use "from foo import *" and __all__, which has its own set
of problems.

1a) Module/class collision. I like to use the primary class in a file
as the name of the file. However this can lead to namespace collisions
between the module name and the class name. Also it means that I'm
going to be stuck with the odious and wasteful syntax foo.foo
everywhere, or forced to use "from foo import *".

1b) The Pythonic way seems to be to put more stuff in one file, but I
believe this is categorically the wrong thing to do in large projects.
The moment you have more than one developer along with a revision
control system, you're going to want files to contain the smallest
practical functional blocks. I feel pretty confident saying that "put
more stuff in one file" is the wrong answer, even if it is the
Pythonic answer.

2) Importing and reloading. I want to be able to reload changes
without exiting the interpreter. This pretty much excludes "from foo
import *", unless you resort to this sort of hack:

http://www.python.org/search/hyperma...1993/0448.html

Has anyone found a systematic way to solve the problem of reloading in
an interactive interpreter when using "from foo import *"?
I appreciate any advice I can get from the community.

Martin

Mar 5 '07
In article <11**********************@h3g2000cwc.googlegroups.com>,
"Martin Unsal" <ma*********@gmail.com> wrote:
> I'm using Python for what is becoming a sizeable project and I'm
> already running into problems organizing code and importing packages.
> I feel like the Python package system, in particular the isomorphism
> between filesystem and namespace, doesn't seem very well suited for
> big projects. However, I might not really understand the Pythonic way.
> I'm not sure if I have a specific question here, just a general plea
> for advice.
>
> 1) Namespace. Python wants my namespace hierarchy to match my
> filesystem hierarchy. I find that a well organized filesystem
> hierarchy for a nontrivial project will be totally unwieldy as a
> namespace. I'm either forced to use long namespace prefixes, or I'm
> forced to use "from foo import *" and __all__, which has its own set
> of problems.
>
> 1a) Module/class collision. I like to use the primary class in a file
> as the name of the file. However this can lead to namespace collisions
> between the module name and the class name. Also it means that I'm
> going to be stuck with the odious and wasteful syntax foo.foo
> everywhere, or forced to use "from foo import *".

The issue of module names vs contained class names is one thing I find a
bit frustrating about python. Fortunately it is fairly easy to work
around.

My own solution has been to import up just one level. So for example:

    pkg/subpkg/foo.py defines class foo and associated stuff
    pkg/subpkg/bar.py defines class bar
    pkg/subpkg/__init__.py contains:

        from foo import *
        from bar import *

To use this I then do:

    import pkg.subpkg
    myfoo = pkg.subpkg.foo(...)

But that's the only "from x import" that I do. I never raise stuff from
a sub-package to a higher level.

Once you do this (or in some other way eliminate the foo.foo problem), I
think you will find that python namespaces work very well for large
projects.
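
A minimal sketch of the one-level re-export described above, assuming Python 3 (where the package's __init__.py needs the explicit relative form "from .foo import *" rather than the Python 2 spelling used in the post). It builds a throwaway copy of the pkg/subpkg layout in a temporary directory so it can be run as-is; the constructor argument is an illustrative assumption.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package tree that mirrors the layout in the post.
root = Path(tempfile.mkdtemp())
subpkg = root / "pkg" / "subpkg"
subpkg.mkdir(parents=True)

(root / "pkg" / "__init__.py").write_text("")
(subpkg / "foo.py").write_text(
    "class foo:\n"
    "    def __init__(self, value):\n"
    "        self.value = value\n"
)
(subpkg / "bar.py").write_text(
    "class bar:\n"
    "    pass\n"
)
# Raise the public names exactly one level, into the subpackage namespace.
(subpkg / "__init__.py").write_text(
    "from .foo import *\n"
    "from .bar import *\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import pkg.subpkg

myfoo = pkg.subpkg.foo(42)      # no pkg.subpkg.foo.foo needed
print(type(myfoo).__module__)   # the class is still defined in pkg.subpkg.foo
```

One side effect worth knowing: after the star import, the name pkg.subpkg.foo refers to the class, so the module of the same name is no longer reachable through that attribute (it is still available in sys.modules).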

Overall I personally like having the namespace follow the file structure
(given that one has to use source files in the first place; my smalltalk
roots are showing). Java reportedly does much the same thing and it is
very helpful for finding code.

I'm sure it's partly what you're used to that counts. C++ experts
probably enjoy the freedom of C++ namespaces, but to me it's just a pain
that they are totally independent of file structure.

> 1b) The Pythonic way seems to be to put more stuff in one file, but I
> believe this is categorically the wrong thing to do in large projects.
> The moment you have more than one developer along with a revision
> control system, you're going to want files to contain the smallest
> practical functional blocks. I feel pretty confident saying that "put
> more stuff in one file" is the wrong answer, even if it is the
> Pythonic answer.

I don't personally find that python encourages lots of code per file. I
think this perception only stems from (1a) and once you solve that
you'll find it's fine to divide your code into small files.

> 2) Importing and reloading. I want to be able to reload changes
> without exiting the interpreter. This pretty much excludes "from foo
> import *", unless you resort to this sort of hack:
>
> http://www.python.org/search/hyperma...1993/0448.html
>
> Has anyone found a systematic way to solve the problem of reloading in
> an interactive interpreter when using "from foo import *"?

I totally agree here. This is a real weakness to python and makes it
feel much more static than it ought to be. I know of no solution other
than restarting. That tends to be fast, but it can be a pain to get back
to where you were.

Smalltalk solved this problem long ago in a way that makes for very
dynamic development and debugging. Unfortunately few languages have
followed suit. The Smalltalk development environment is the one feature
I really miss in all other languages I've used (I certainly don't miss
its quirky syntax for control flow :)).

-- Russell
Mar 6 '07 #41
In article <11**********************@8g2000cwh.googlegroups.com>,
"Martin Unsal" <ma*********@gmail.com> wrote:
> On Mar 6, 9:34 am, "Chris Mellon" <arka...@gmail.com> wrote:
>> It assumes that util.common is a module that's on the PYTHONPATH.
>
> Now we're getting somewhere. :)
>
>> The common way to ensure that this is the case is either to handle
>> util as a separate project, and install it into the system
>> site-packages just as you would any third party package,
>
> This breaks if you ever need to test more than one branch of the same
> code base. I use a release branch and a development branch. Only the
> release branch goes into site-packages, but obviously I do most of my
> work in the development branch.

This is an interesting point that we are just facing. If you have a big
package for all your stuff and you want to separately version components
of it, you do run into problems. The solution we are adopting is to
write a custom import hook, but a simpler solution is to make sure each
separately versioned component is a top-level package (in which case you
can manipulate PYTHONPATH to temporarily "install" a test version).
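
The simpler of the two options can be sketched as follows. This is only an illustration under stated assumptions: two temporary directories stand in for the installed release and the development checkout, each holding a top-level package named util (a hypothetical stand-in for the separately versioned component), and sys.path is edited in-process to imitate what setting PYTHONPATH does at interpreter startup.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_checkout(label):
    """Create a fake checkout holding a top-level package named 'util'."""
    checkout = Path(tempfile.mkdtemp(prefix=label + "_"))
    package = checkout / "util"
    package.mkdir()
    (package / "__init__.py").write_text("BRANCH = %r\n" % label)
    return checkout


release = make_checkout("release")          # stands in for site-packages
development = make_checkout("development")  # the working copy under test

# Earlier entries on sys.path win. PYTHONPATH entries are placed ahead of
# site-packages at startup, which is what makes the temporary "install"
# work; inserting at position 0 imitates that inside a running process.
sys.path.append(str(release))
sys.path.insert(0, str(development))
importlib.invalidate_caches()

import util

print(util.BRANCH)   # development
```

Because the lookup stops at the first match, this only works cleanly when the component is a top-level package; a subpackage buried inside one big installed package cannot be overridden this way.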

-- Russell
Mar 6 '07 #42
"Martin Unsal" <ma*********@gmail.com> writes:
> I think you should be asking yourselves, "Did we all abandon reload()
> because it is actually an inferior workflow, or just because it's
> totally broken in Python?"

I never "abandoned reload()", because it never even occurred to me to
use the interpreter for developing the code that I know is going to
end up in a file anyway. That's what my text editor is for.

--
\ "I have a microwave fireplace in my house. The other night I |
`\ laid down in front of the fire for the evening in two minutes." |
_o__) -- Steven Wright |
Ben Finney

Mar 6 '07 #43
"Martin Unsal" <ma*********@gmail.com> writes:
> On Mar 6, 9:19 am, "Chris Mellon" <arka...@gmail.com> wrote:
>> You do? Or do you only have trouble because you don't like using "from
>> foo import Foo" because you need to do more work to reload such an
>> import?
>
> More work, like rewriting __import__ and reload??? :)
>
> There's a point where you should blame the language, not the
> programmer. Are you saying I'm lazy just because I don't want to mess
> with __import__?

I *never* messed with __import__. And one of my systems has more than 15
packages, with an average of 7 more subpackages plus __init__.py...

Why do you need to mess with __import__?

> I was clearly talking about files and you assumed I was talking about
> namespace. That's Pythonic thinking... and I don't mean that in a good
> way!

Hmmm... Why not? How are you going to track down where something is, in
which file? I can make successive imports and I can subclass things, so I
might be importing a subclass of a subclass of the class that provides the
method that I want to change. Having a direct correlation helps me a lot with
big projects. For small ones I don't care since they are very simple and a
grep usually takes me directly to where I want (just to avoid tools that map
classes to files that are specific to one IDE or editor).

> Because I have written a project with 50,000 lines of Python and I'm trying
> to organize it in such a way that it'll scale up cleanly by another order of
> magnitude. Because I've worked on projects with millions of lines of code
> and I know about how such things are organized. It's funny, I'm a newbie to
> Python but it seems like I'm one of the only people here thinking about it
> as a large scale development language rather than a scripting language.

I don't see a problem scaling my biggest project with, now, 65K lines of code.
What are the problems you're seeing for yours? In fact, the Python part of
this code is the easiest to deal with. And there's ctypes involved here,
which messes things up a bit since I need to keep C + Python in sync.

And if I once imagined I'd write that many LOC and would reach the millions of
LOC of *Python* code then it would certainly make me feel comfortable knowing
that this approach *does* scale. At least to me and to the ones that work with
me and use the system... Implementing new features is fast and extremely
modular. There are modules specific to one client, modules specific to
another, modules shared between all clients, etc. It isn't a monolithic take
all or nothing. And even like that it works.

There are customizations on some features that only exist at one client's
branch, there are customizations that might be selected "on the fly" by
choosing something on a preferences screen, etc.

It is a "normal" (but rather complex) application on any aspect that we see
around. And it scales. I don't fear changing code. I don't fear adding new
features. It "simply works".
--
Jorge Godoy <jg****@gmail.com>
Mar 6 '07 #44
al***@mac.com (Alex Martelli) writes:
> Not sure I get what you mean; when I write tests, just as when I write
> production code, I'm focused (not worried:-) about the application

semantics... ;-) Thanks for the correction.

> functionality I'm supposed to deliver. The language mostly "gets out of
> my way" -- that's why I like Python, after all:-).

That's the same reason why I like it. I believe it is not a coincidence that
we both like writing Python code.

But there are cases where investigating is more necessary than testing. This
is where I see the need of the interactive session. For program's features I
also write tests.

> I do generally keep an interactive interpreter running in its own
> window, and help and dir are probably the functions I call most often
> there. If I need to microbenchmark for speed, I use timeit (which I
> find far handier to use from the commandline). I wouldn't frame this as
> "worried with how to best use the language" though; it's more akin to a
> handy reference manual (I also keep a copy of the Nutshell handy for
> exactly the same reason -- some things are best looked up on paper).

That's the same use -- and the same most used functions -- that I have here.
I believe that I wasn't clear on my previous post, and this is why you saw a
different meaning to it.

> I don't really see "getting a bit big to setup" as the motivation for
> writing automated, repeatable tests (including load-tests, if speed is
> such a hot topic in your case); rather, the key issue is, will you ever

It's not for writing tests. It's for investigating things. If I have to open
database connections, make several queries to get to a point where I have the
object that I want to "dir()", it is easier to me to put that all in a file.
It isn't a test.

> want to run this again? For example, say you want to check the relative
> speeds of approaches A and B -- if you do that in a way that's not
> automated and repeatable (i.e., not by writing scripts), then you'll
> have to repeat those manual operations exactly every time you refactor
> your code, upgrade Python or your OS or some library, switch to another
> system (HW or SW), etc, etc. Even if it's only three or four steps, who
> needs the aggravation? Almost anything worth doing (in the realm of
> testing, measuring and variously characterizing software, at least) is
> worth automating, to avoid any need for repeated manual labor; that's
> how you get real productivity, by doing ever less work yourself and
> pushing ever more work down to your computer.

I won't write a script to write two commands and rerun them often. But I
would for some more -- let's say starting from 5 commands I might start
thinking about having this somewhere where I can at least cut'n'paste to the
interactive interpreter (even with readline's help).
--
Jorge Godoy <jg****@gmail.com>
Mar 6 '07 #45
On Mar 6, 12:49 pm, "Martin Unsal" <martinun...@gmail.com> wrote:
> On Mar 6, 9:19 am, "Chris Mellon" <arka...@gmail.com> wrote:
>> You do? Or do you only have trouble because you don't like using "from
>> foo import Foo" because you need to do more work to reload such an
>> import?
>
> More work, like rewriting __import__ and reload??? :)
>
> There's a point where you should blame the language, not the
> programmer. Are you saying I'm lazy just because I don't want to mess
> with __import__?
>
>> What makes you think that the exposed namespace has to be isomorphic
>> with the filesystem?
>
> I don't; you do!
>
> I was clearly talking about files and you assumed I was talking about
> namespace. That's Pythonic thinking... and I don't mean that in a good
> way!
>
>> If you want to break a module into multiple packages and then stick
>> the files that make up the package in bizarre spots all over the
>> filesystem, can you give a reason why?
>
> Because I have written a project with 50,000 lines of Python and I'm
> trying to organize it in such a way that it'll scale up cleanly by
> another order of magnitude. Because I've worked on projects with
> millions of lines of code and I know about how such things are
> organized. It's funny, I'm a newbie to Python but it seems like I'm
> one of the only people here thinking about it as a large scale
> development language rather than a scripting language.
>
> Martin

I'm still not clear on what your problem is or why you don't like
"from foo import bar". FWIW our current project is about 330,000
lines of Python code. I do a ton of work in the interpreter--I'll
often edit code and then send a few lines over to the interpreter to
be executed. For simple changes, reload() works fine; for more
complex cases we have a reset() function to clear out most of the
namespace and re-initialize. I don't really see how reload could be
expected to guess, in general, what we'd want reloaded and what we'd
want kept, so I have a hard time thinking of it as a language problem.
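
The post does not show its reset() function, so the following is only one plausible shape for such a helper, written for Python 3 and labeled hypothetical throughout: it forgets every loaded module under a given package name and imports the package afresh. The package name proj and its settings submodule are invented for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # no .pyc files, so edits are always re-read


def reset(prefix):
    """Hypothetical helper: drop all loaded modules under `prefix`,
    then import the package again so its code is re-executed."""
    stale = [name for name in sys.modules
             if name == prefix or name.startswith(prefix + ".")]
    for name in stale:
        del sys.modules[name]
    importlib.invalidate_caches()
    return importlib.import_module(prefix)


# Throwaway package "proj" with one submodule, purely for demonstration.
root = Path(tempfile.mkdtemp())
(root / "proj").mkdir()
(root / "proj" / "__init__.py").write_text("from . import settings\n")
(root / "proj" / "settings.py").write_text("LEVEL = 1\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import proj
before = proj.settings.LEVEL

# Simulate an edit made in the text editor, then reset instead of restarting.
(root / "proj" / "settings.py").write_text("LEVEL = 22\n")
proj = reset("proj")
after = proj.settings.LEVEL

print(before, after)   # 1 22
```

As the post says, no helper can guess what should be kept: objects created before the reset still point at the old classes and functions, so anything worth keeping has to be rebuilt explicitly.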

Mar 6 '07 #46
On Mar 6, 4:58 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:
> "Martin Unsal" <martinun...@gmail.com> writes:
>> I think you should be asking yourselves, "Did we all abandon reload()
>> because it is actually an inferior workflow, or just because it's
>> totally broken in Python?"
>
> I never "abandoned reload()", because it never even occurred to me to
> use the interpreter for developing the code that I know is going to
> end up in a file anyway. That's what my text editor is for.

It's most useful for debugging for me; I'll instantiate the objects of
a known bad test case, poke around, maybe put some more debugging code
into one of my classes and re-instantiate only those objects (but keep
the rest of the test objects as-is).

Even there I find that I'd rather use a scratch file in an editor to
set up the test cases and send a specified region to the interpreter
for the most part, only actually typing in the interpreter when I'm
poking at an object. I'll often wind up wanting to pull part of the
test case out either to go into the production code or to set up a
permanent unit test.

Once I figure out what's going on, the production code definitely gets
edited in the text editor.

Even though I use the interactive interpreter every day, though, I
haven't noticed reload being a major issue.

Mar 6 '07 #47
Martin Unsal wrote:
> I'm using Python for what is becoming a sizeable project and I'm
> already running into problems organizing code and importing packages.
> I feel like the Python package system, in particular the isomorphism
> between filesystem and namespace, doesn't seem very well suited for
> big projects.

I've never worked on what you would call a "big project", but I *am*
kind of a neat-freak/control-freak about file organization of code, so I
have tinkered with the structure of source trees in Python quite a bit.

If you want to explode a module into a lot of smaller files, you create
a package. I find that this usually works best like this (this is what
the filesystem looks like):

    package_name/
        package_pre.py   - contains globals for the package
        component_a.py   - a useful-sized collection of functionality
        component_b.py   - another
        component_c.py   - another
        package_post.py  - stuff that relies on the prior stuff
        __init__.py      - or you can put the "post" stuff here

Then __init__.py contains something like:

    from package_pre import *
    from component_a import *
    from component_b import *
    from component_c import *
    from package_post import *

or you can explicitly load what you need:

    from package_pre import *
    from component_a import A, A1, A2
    from component_a import A3 as A5
    from component_b import B, B1
    from component_c import C, C2, C5
    from package_post import *

if you want to keep the namespace cleaner.
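
To make the difference between the two styles concrete, here is a small runnable sketch, assuming Python 3 (hence the leading dots in the imports). It is not the poster's code: the component contents and the names scratch and helper are invented. One component limits its star-export with __all__, the other is imported name by name, and the script then lists what ends up in the package namespace.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
package = root / "package_name"
package.mkdir()

# component_a exposes A and A1 through __all__; scratch is public by name
# but deliberately left out, so "import *" will skip it.
(package / "component_a.py").write_text(
    "__all__ = ['A', 'A1']\n"
    "class A: pass\n"
    "class A1(A): pass\n"
    "scratch = []\n"
)
# component_b has no __all__, so the package picks names explicitly.
(package / "component_b.py").write_text(
    "class B: pass\n"
    "def helper(): return 'internal'\n"
)
(package / "__init__.py").write_text(
    "from .component_a import *\n"
    "from .component_b import B\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import package_name

# Importing a submodule also binds it as an attribute of the package,
# so the component modules show up next to the re-exported classes.
public = sorted(n for n in vars(package_name) if not n.startswith("_"))
print(public)   # ['A', 'A1', 'B', 'component_a', 'component_b']
```

Neither scratch nor helper leaks into package_name, which is the "cleaner namespace" effect; they remain reachable through package_name.component_a and package_name.component_b when needed.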

Also, instead of just dropping things into the module's global
namespace, use a named namespace, such as a class, or use the
"package_pre" in the example above. That helps to keep things separable.

IOW, you can use __init__.py to set up the package's namespace any way
you want, breaking the actual code up into just about as many files as
you like (I also don't like reading long source files -- I find it
easier to browse directories than source files, even with outlining
extensions. It's rare for me to have more than 2-3 classes per file).

Of course, if you *really* want your namespace to be *completely*
different from the filesystem, then there's no actual reason that all of
these files have to be in the same directory. You can use Python's
explicit relative imports (added in Python 2.5 by PEP 328; Python 2.4 does
not have them) to make this easier. There was an obnoxious hack used in Zope
which used code to extract the "package_path" and then prepend that to
get absolute import locations which was necessary in earlier versions --
but I can't recommend that, just use the newer version of Python.

So, you could do evil things like this in __init__.py:

    from .other_package.fiddly_bit import dunsel

(i.e. grab a module from a neighboring package)

Of course, I really can't recommend that either. Python will happily do
it, but it's a great way to shoot yourself in the foot in terms of
keeping your code organized!

The only exception to that is that I often have a "util" or "utility"
package which has a collection of little extras I find useful throughout
my project.

As for relying heavily on reload(), it isn't that great of a feature for
debugging large projects. Any code of sufficient size to make reload()
problematic, though, needs formal unit testing, anyway. The cheapest and
easiest unit test method is doctests (IMHO), so you ought to give those
a try -- I think you'll like the easy relationship those have to working
in the interactive interpreter: just walk your objects through their
paces in the interpreter, then cut-and-paste.
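
As a rough illustration of that cut-and-paste workflow (the function slugify is an invented example, not something from this thread), the docstring below contains an interpreter session pasted in unchanged, and doctest replays it and compares the output. In an ordinary module a bare doctest.testmod() call would be enough; the finder/runner pair is used here only so the snippet behaves the same however it is executed.

```python
import doctest


def slugify(title):
    """Turn a title into a lowercase, hyphen-separated slug.

    The examples below are an interpreter session pasted in as-is.

    >>> slugify("Project organization and import")
    'project-organization-and-import'
    >>> slugify("  extra   spaces  ")
    'extra-spaces'
    """
    return "-".join(title.lower().split())


# Collect the examples from the docstring and replay them, comparing the
# actual output with the pasted output.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner()
for test in finder.find(slugify, "slugify", globs={"slugify": slugify}):
    runner.run(test)

results = runner.summarize()
print(results.failed, results.attempted)
```

If an edit later changes the behaviour, the pasted session no longer matches and the run reports a failure, which is what turns an exploratory session into a regression test.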

What reload() and the interactive interpreter are good for is
experimentation, not development.

If you need huge amounts of code to be loaded to be able to do any
useful experiments with the modules you are writing, then your code is
too tightly coupled to begin with. Try to solve that by using something
like "mock objects" to replace the full blown implementations of objects
you need for testing. I've never formally used any of the "mock"
packages, but I have done a number of tests using objects which are
dumbed-down versions of objects which are really supposed to be provided
from another module -- but I wanted to test the two separately (which is
essentially creating my own mock objects from scratch).
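
A hand-rolled stand-in of the kind described above might look like the following sketch. Everything in it is hypothetical (FakeDatabase, fetch_orders, order_total and the sample rows are invented for illustration); the point is only that the code under test depends on a small interface, so a dumbed-down object can replace the real one.

```python
class FakeDatabase:
    """Dumbed-down stand-in for a real database layer (hypothetical)."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = []   # record calls so a test can inspect them

    def fetch_orders(self, customer_id):
        self.queries.append(customer_id)
        return [row for row in self.rows if row["customer"] == customer_id]


def order_total(database, customer_id):
    """Code under test: it only needs something with fetch_orders()."""
    return sum(row["amount"] for row in database.fetch_orders(customer_id))


fake = FakeDatabase([
    {"customer": 1, "amount": 10},
    {"customer": 2, "amount": 5},
    {"customer": 1, "amount": 7},
])
total = order_total(fake, 1)

print(total)          # 17
print(fake.queries)   # [1]
```

Because order_total receives its collaborator as an argument, it can be exercised in the interpreter without loading the rest of the project, which is the loose coupling the post is arguing for.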

HTH,
Terry

--
Terry Hancock (ha*****@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

Mar 7 '07 #48
On Mar 5, 1:21 am, "Martin Unsal" <martinun...@gmail.com> wrote:
> 2) Importing and reloading. I want to be able to reload changes
> without exiting the interpreter.

What about this?

$ cat reload_obj.py
"""
Reload a function or a class from the filesystem.

For instance, suppose you have a module

$ cat mymodule.py
def f():
    print 'version 1 of function f'

Suppose you are testing the function from the interactive interpreter:

>>> from mymodule import f
>>> f()
version 1 of function f

Then suppose you edit mymodule.py:

$ cat mymodule.py
def f():
    print 'version 2 of function f'

You can see the changes in the interactive interpreter simply by doing

>>> f = reload_obj(f)
>>> f()
version 2 of function f
"""

import inspect

def reload_obj(obj):
    assert inspect.isfunction(obj) or inspect.isclass(obj)
    mod = __import__(obj.__module__)
    reload(mod)
    return getattr(mod, obj.__name__)

Pretty simple, isn't it?

The issue is that if you have other objects depending on the previous
version of the function/class, they will keep depending on the previous
version, not on the reloaded version, but you cannot expect miracles
from reload! ;)
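
For readers on Python 3, where reload() lives in importlib, the same idea can be sketched as below. This is an adaptation, not the poster's code: it looks the module up in sys.modules (so it also works for modules inside packages, where __import__ would return the top-level package), and the demo has f return a string instead of printing so the result can be checked. The temporary mymodule.py follows the docstring above.

```python
import importlib
import inspect
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True   # avoid stale .pyc files when edits are rapid


def reload_obj(obj):
    """Reload obj's module and return the fresh function or class."""
    if not (inspect.isfunction(obj) or inspect.isclass(obj)):
        raise TypeError("expected a function or a class")
    # sys.modules yields the defining module even for dotted names.
    module = importlib.reload(sys.modules[obj.__module__])
    return getattr(module, obj.__name__)


# Demo with a throwaway module written to a temporary directory.
workdir = Path(tempfile.mkdtemp())
source = workdir / "mymodule.py"
source.write_text("def f():\n    return 'version 1 of function f'\n")
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

from mymodule import f
old_f = f   # something else still holding the old object

# Simulate editing the file, then pick up the change without restarting.
source.write_text("def f():\n    return 'version 2 of function f, edited'\n")
f = reload_obj(f)

print(f())       # version 2 of function f, edited
print(old_f())   # version 1 of function f
```

The last line shows the caveat from the post in action: old_f was bound before the reload and keeps running the old code.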

You can also look at Michael Hudson's recipe

http://aspn.activestate.com/ASPN/Coo.../Recipe/160164

for a clever approach to automatic reloading.

Michele Simionato

Mar 7 '07 #49
> package_name/
>     package_pre.py   - contains globals for the package
>     component_a.py   - a useful-sized collection of functionality
>     component_b.py   - another
>     component_c.py   - another
>     package_post.py  - stuff that relies on the prior stuff
>     __init__.py      - or you can put the "post" stuff here
>
> Then __init__.py contains something like:
>
> from package_pre import *
> from component_a import *
> from component_b import *
> from component_c import *
> from package_post import *
>
> Anansi Spaceworks http://www.AnansiSpaceworks.com

Thank you! That is by far the clearest I have ever seen that
explained.
I saved it and sent it on to a friend who is learning Python.

Mar 7 '07 #50
