Organizing a Python project

Hello all,

I'm starting work on what is going to become a fairly substantial
Python project, and I'm trying to find the best way to organize
everything. The project will consist of:

- A few applications
- Several small scripts and utilities
- Unit tests and small interactive test programs
- A number of custom libraries and modules that may be shared and
referenced among all of the above

I have the following general structure in mind:

myproject/
    app1/
        main.py
        file1.py
        file2.py
        tests/
            test_abc.py
            test_xyz.py
    app2/
        ...
    scripts/
        script1.py
        script2.py
    shared/
        mylib1/
            file1.py
            file2.py
            tests/
                test_foo.py
                test_bar.py
        mylib2/
            ...

The files that you might want to execute directly are:
- Any of the "main.py" files under app*/
- Any of the files under shared/
- Any of the files under app*/tests or shared/mylib*/tests

So, my questions:
First of all, does this look like a reasonable overall structure, or
are there better alternatives?

Second (and the thing I'm primarily interested in), what is the best
way to deal with importing the shared modules in the applications,
scripts, test programs, and possibly other shared modules? I think the
most obvious solution is to add /path/to/myproject to PYTHONPATH.
However, this seems like an annoying little dependency that you are
likely to forget whenever you move your workspace to a new path, open
up a branch of the project in a different directory, or download and
work on the project using a different computer.

Is there a way to set this up that is a bit more self contained? For
example, at first I was somewhat hopeful that Python could ascend
parent directories until it reached a directory that did not include
an __init__.py file, and it could use this as a root for referring to
packages and modules from any file contained within. (e.g. in the
example project above, any file could refer to myproject.shared.mylib1
so long as 'myproject' and all subdirectories contained an
__init__.py, and the parent of 'myproject' didn't contain such a
file). Evidently this is not the case, but it seems like it could be a
useful feature in these situations.

Anyway, I'm sure this is not an unusual situation, so I'm curious to
hear how other people have handled it.

(Note - don't remove 'spam' from my e-mail address when replying. The
address is correct as listed)

Thanks!
Kevin
Jun 27 '08 #1
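
The "ascend parent directories until there is no __init__.py" idea from the post above can be approximated by hand. What follows is only a minimal sketch, assuming the layout described in the post (every directory from 'myproject' downwards has an __init__.py and its parent does not); the name find_package_root is hypothetical, not something Python provides.

```python
import os
import sys


def find_package_root(start):
    """Return the first ancestor directory of start that has no __init__.py."""
    current = os.path.abspath(start)
    if os.path.isfile(current):
        current = os.path.dirname(current)
    # Climb while the current directory still looks like a package.
    while os.path.isfile(os.path.join(current, "__init__.py")):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def add_package_root(start, path=None):
    """Insert the package root at the front of the import path (once)."""
    path = sys.path if path is None else path
    root = find_package_root(start)
    if root not in path:
        path.insert(0, root)
    return root
```

A file such as app1/main.py could call add_package_root(__file__) before importing myproject.shared.mylib1. The catch is that the helper has to be reachable before the path is fixed, so it would need to be copied into (or sit next to) every file that is run directly.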
On 2008-05-19, iv**********@gmail.com <iv**********@gmail.com> wrote:
> So, my questions:
> First of all, does this look like a reasonable overall structure, or
> are there better alternatives?

You could make a 'bin' directory next to 'myproject' containing executable
programs, each of which would usually do something like

    #!/usr/bin/env python
    from myproject.app1 import main
    main.run()

to make a clearer separation between code that can be executed and code that
is imported by an application.

Also, why do you make a distinction between shared and non-shared code?
You could simply eliminate the 'shared' directory and put its contents directly
under myproject.

> Second (and the thing I'm primarily interested in), what is the best
> way to deal with importing the shared modules in the applications,
> scripts, test programs, and possibly other shared modules? I think the
> most obvious solution is to add /path/to/myproject to PYTHONPATH.
> However, this seems like an annoying little dependency that you are
> likely to forget whenever you move your workspace to a new path, open
> up a branch of the project in a different directory, or download and
> work on the project using a different computer.

What I am missing here is how you plan to do the development.
If you want to do branch-based development, you may want to have a look at
Combinator (at divmod.org). It handles branch management, adds executable
programs from bin to your path (in your current branch), and extends PYTHONPATH
(with your current branch).

Even if you have just 1 branch (namely 'trunk') it may be useful.

> Is there a way to set this up that is a bit more self contained? For
> example, at first I was somewhat hopeful that Python could ascend
> parent directories until it reached a directory that did not include
> an __init__.py file, and it could use this as a root for referring to
> packages and modules from any file contained within. (e.g. in the
> example project above, any file could refer to myproject.shared.mylib1
> so long as 'myproject' and all subdirectories contained an
> __init__.py, and the parent of 'myproject' didn't contain such a
> file). Evidently this is not the case, but it seems like it could be a
> useful feature in these situations.

Work is being done on relative imports. Not sure of its state.

> Anyway, I'm sure this is not an unusual situation, so I'm curious to
> hear how other people have handled it.

Most people probably run scripts from the root, i.e. the directory where
'myproject' is a sub-directory. Since Python automatically adds '.' to its
path, it will work.

Sincerely,
Albert
Jun 27 '08 #2
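
One caveat on the reply above: when Python runs a script file, the first entry it normally puts on sys.path is the directory containing that script, not the current directory, so a launcher living in bin/ does not see 'myproject' automatically. A possible workaround is for the launcher to compute the project root from its own location. This is only a sketch, assuming the suggested layout with bin/ sitting next to myproject/; the helper names are hypothetical.

```python
import os
import sys


def project_root(script_path):
    """Return the directory that holds both bin/ and myproject/."""
    bin_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.dirname(bin_dir)


def add_project_root(script_path, path=None):
    """Put the project root at the front of the import path (once)."""
    path = sys.path if path is None else path
    root = project_root(script_path)
    if root not in path:
        path.insert(0, root)
    return root
```

A launcher such as bin/app1 would call add_project_root(__file__) first and only then do `from myproject.app1 import main`. Since the path is derived from the launcher's own location, moving the checkout or opening a second branch elsewhere needs no PYTHONPATH change.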
A.T.Hofkamp wrote:
> Also, why do you make a distinction between shared and non-shared code?
> You could simply eliminate 'shared' directory, and put its contents
> directly under myproject.

I would go further and make them individual projects, with their own version
control, code repository and then install them as eggs using setuptools.

This has been working fine for me in some projects and has the advantage of
being reusable in different big projects.

Also, by using setuptools on each big project I don't have to worry about
dependencies, because it downloads and installs everything for me when I
install the main project.

> Is there a way to set this up that is a bit more self contained? For
> example, at first I was somewhat hopeful that Python could ascend
> parent directories until it reached a directory that did not include
> an __init__.py file, and it could use this as a root for referring to
> packages and modules from any file contained within. (e.g. in the
> example project above, any file could refer to myproject.shared.mylib1
> so long as 'myproject' and all subdirectories contained an
> __init__.py, and the parent of 'myproject' didn't contain such a
> file). Evidently this is not the case, but it seems like it could be a
> useful feature in these situations.

Eggs would solve that as well. They would behave like any other
installed "library" on your system.

--
Jorge Godoy <jg****@gmail.com>

Jun 27 '08 #3

<iv**********@gmail.com> wrote in message
news:96**********************************@e39g2000hsf.googlegroups.com...
| Hello all,
|
| I'm starting work on what is going to become a fairly substantial
| Python project, and I'm trying to find the best way to organize
| everything. The project will consist of:
|
| - A few applications
| - Several small scripts and utilities
| - Unit tests and small interactive test programs
| - A number of custom libraries and modules that may be shared and
| referenced among all of the above
|
| I have the following general structure in mind:
|
| myproject/
| app1/
| main.py

If you put myproject in Pythonxy/Lib/site-packages, there is no need to
fiddle with PYTHONPATH or sys.path. In 3.0a5 I tried a relative import and
got a message that relative imports only work within packages, not modules.
I presume that means package.__init__.py. Maybe I just miswrote the
import, but I decided to stick with what dependably works whether from
within or without the package:

    from package.subpackage import module  # or
    from package.subpackage.module import object

I agree with the comment about removing the 'shared' package layer.
Two packages deep is enough typing unless the deeper hierarchy is needed
(like possibly the 'tests' subsubpackages, if they make running the tests
easier).

tjr

Jun 27 '08 #4
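
On the relative-import message mentioned above: as far as I can tell, a relative import works when the module is loaded as part of its package, and the error shows up when the same file is run directly as a script, because it then has no parent package. The sketch below builds a throwaway package to show the working case; demo_pkg and both helper functions are hypothetical names used only for the demonstration.

```python
import importlib
import os
import sys

# Hypothetical package name, used only for this demonstration.
PACKAGE = "demo_pkg"


def build_demo_package(root):
    """Create demo_pkg/ under root with one module that imports its sibling."""
    pkg_dir = os.path.join(root, PACKAGE)
    os.makedirs(pkg_dir)
    files = {
        "__init__.py": "",
        "helper.py": "def greet():\n    return 'hello from helper'\n",
        # Relative import: valid only when user.py is loaded as demo_pkg.user.
        "user.py": "from . import helper\n\n\ndef run():\n    return helper.greet()\n",
    }
    for name, source in files.items():
        with open(os.path.join(pkg_dir, name), "w") as handle:
            handle.write(source)
    return pkg_dir


def import_from_root(root, dotted_name):
    """Import dotted_name with root temporarily at the front of sys.path."""
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(dotted_name)
    finally:
        sys.path.remove(root)
```

Importing demo_pkg.user through import_from_root succeeds, whereas running user.py directly would fail at the `from . import helper` line. The absolute form recommended in the post works in both situations, provided the package's parent directory is on sys.path.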
On Wed, 21 May 2008 07:44:50 -0300, Casey McGinty <ca***********@gmail.com> wrote:

> Just my own opinion on these things:
> 1. Script code should be as basic as possible, ideally a module import line
> and function or method call. This is so you don't have to worry about script
> errors and/or increased startup time because a *.pyc file cannot be stored in
> /usr/bin.

Scripts are not compiled by default, only imported modules (anyway you could compile scripts by hand). Keeping the main script short might reduce the startup time, yes.

> 2. In the top of your package directory it is typical to have a module name
> '_package.py'. This is ideally where the main command line entry point for
> the package code should be placed.

Why the underscore? And I usually don't put executable scripts inside a package - I consider them just libraries, to be imported by other parts of the application.
It's easier to test too when your tests *and* the application code are external to the package itself.

> 3. In the _package.py file you should add a "class Package" that holds most
> or all of the application startup/execution code not designated to other
> modules. Then run the application with the call to "Package()", as in
>
> if __name__ == '__main__':
>     Package()

In Python that doesn't *have* to be a class, and in fact, most of the time I use a function instead. Something like this:

    def main(argv):
        # do things
        return 0  # exit status; returning None also counts as success below

    if __name__ == '__main__':
        import sys
        sys.exit(int(main(sys.argv) or 0))

> Some other questions I have are:
> A. What should go in the package __init__.py file? For example, a doc
> describing the program usage seems helpful, but maybe it should have info
> about your modules only? Assuming the __init__.py code gets executed when
> you import the module, you could place part or all of the application code
> here as well. I'm guessing this is not a good idea, but not really
> convinced.

Yes, the __init__.py is executed when you import the package. And no, I don't think it's a good idea to put all the application code there. As I said above, I consider packages *libraries*, the application code should *import* and use them, but not reside *inside* a package.
And there is the "import lock" issue too - I'm unsure of this but I think a lock is held until the import operation finishes, and that would happen only after __init__.py is fully executed.

> B. How should you import your _package.py module into your /usr/bin script.
> Is there a way to use an '__all__' to simplify this? Again this goes back to
> question A, should there be any code added to __init__.py?

I don't get the question... you import it as any other module (but I would not use a _package.py file anyway)

> C. If you have a _package.py file as the application entry, is it worth it
> to place most of the application code in a class, described in part 3?

As I said, I'd use a function, at least at the top level. Of course it can create many other objects.

> D. When I import a package._package module, I get a lot of junk in my
> namespace. I thought an '__all__' define in the module would prevent this,
> but it does not seem to work.

`import package._package` should only add "package" to the current namespace. If you're using `from package import *` - well, just don't do that :)
Otherwise I don't understand what you mean.

--
Gabriel Genellina

Jun 27 '08 #5
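
To illustrate the answer to question D: `__all__` only limits what `from module import *` copies, while a plain `import module` binds nothing but the module name. The following is a small sketch, assuming a made-up module called demo_all that is built in memory; none of these names come from the thread.

```python
import sys
import types

# Hypothetical module source, used only for this demonstration.
SOURCE = '''
__all__ = ["public"]


def public():
    return "public"


def helper():
    return "helper"
'''


def register_module(name, source):
    """Build a module object from source text and register it for import."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def names_bound_by(statement):
    """Run an import statement in an empty namespace and list what it bound."""
    namespace = {}
    exec(statement, namespace)
    return sorted(name for name in namespace if name != "__builtins__")
```

After register_module("demo_all", SOURCE), the star import binds only `public`, and `import demo_all` binds only `demo_all`. The helper function is still reachable as demo_all.helper, so `__all__` hides nothing from attribute access; it only trims the star import.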
On Sun, 25 May 2008 19:46:50 -0300, Casey McGinty <ca***********@gmail.com> wrote:

> On Sat, May 24, 2008 at 2:11 PM, Gabriel Genellina <ga*******@yahoo.com.ar>
> wrote:
>
>>> 2. In the top of your package directory it is typical to have a module
>>> name '_package.py'. This is ideally where the main command line entry
>>> point for the package code should be placed.
>>
>> Why the underscore? And I usually don't put executable scripts inside a
>> package - I consider them just libraries, to be imported by other parts of
>> the application.
>> It's easier to test too when your tests *and* the application code are
>> external to the package itself.
>
> Ok, guess using the '_package' name is not very common. I saw it used by the
> dbus module. My whole concern here is that using a script will fragment your
> application code across the file system. I would prefer to have it all in a
> single spot. So I still like the idea of keeping most of application code
> (argument parsing, help output, initialization) inside of the package. The
> '_' should indicate that any other modules using your package should import
> that module.

Not only is using _package not very common - it goes against the general rule (this one very well established) that names with a single leading _underscore are private (implementation details). So I would *not* expect to import _package as a rule.
I don't think splitting the application in two or more parts ("fragmentation" as you call it) is a bad thing by itself; the package will likely appear under some-python-directory/site-packages/your_package_name (where anyone would search for it) and the executable scripts in /usr/bin (where anyone would likely search for them too).

--
Gabriel Genellina

Jun 27 '08 #6
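
The two positions in this exchange may be less far apart than they look: argument parsing and help output can live in an importable module, leaving the installed script as a short wrapper. Here is a minimal sketch, assuming a hypothetical app1 command with a single made-up --name option. Unlike the main(sys.argv) example earlier in the thread, this main takes the argument list without the program name.

```python
import argparse


def build_parser():
    """Create the command-line parser for the hypothetical app1 program."""
    parser = argparse.ArgumentParser(
        prog="app1", description="Hypothetical app1 command line.")
    parser.add_argument("--name", default="world", help="who to greet")
    return parser


def greeting(name):
    """Application logic kept separate from parsing so it is easy to test."""
    return "hello, %s" % name


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and return an exit status."""
    args = build_parser().parse_args(argv)
    print(greeting(args.name))
    return 0
```

The script in /usr/bin would then need only an import of main followed by sys.exit(main()), and tests can call main(["--name", "Kevin"]) directly without starting a new process.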
