Multiple modules with database access + general app design?

Hey people

I'm an experienced PHP programmer who's been writing Python for a couple of
weeks now. I'm writing quite a large application which I've decided to
break down into lots of modules (a replacement for PHP's include()
statement).

My problem is, in PHP if you open a database connection it's always in
scope for the duration of the script. Even if you use an abstraction layer
($db = DB::connect(...)) you can `global $db` and bring it into scope,
but in Python I'm having trouble keeping the database in scope. At the
moment I'm having to "push" the database into the module, but I'd prefer
the module to bring the database connection in ("pull") from its parent.

Eg:
import modules
modules.foo.c = db.cursor()
modules.foo.Bar()

Can anyone recommend any "cleaner" solutions to all of this? As far as I
can see, Python doesn't have much support for breaking down large
programs into organisable files that reference each other.

Another problem is I keep having to import modules all over the place. A
real example is, I have a module "webhosting", a module "users", and a
module "common". These are all submodules of the module "modules" (bad
naming I know). The database connection is instantiated on the "db"
variable of my main module, which is "yellowfish" (a global module), so
I get the situation where:

(yellowfish.py)
import modules
modules.webhosting.c = db.cursor()
modules.webhosting.Something()

webhosting needs methods in common and users:

from modules import common, users

However users also needs common:

from modules import common

And they all need access to the database

(users and common)
from yellowfish import db
c = db.cursor()

Can anyone give me advice on making this all a bit more transparent? I
guess I really would like a method to bring all these files into the same
scope to make everything seem to be all one application, even though
everything is broken up into different files.

One added complication in this particular application:

I used modules because I'm calling arbitrary methods defined in some XML
format. Obviously I wanted to keep security in mind, so my application
goes something like this:

import modules
module, method, args = getXmlAction()
m = getattr(modules, module)
m.c = db.cursor()
f = getattr(m, method)
f(args)

In PHP this method is excellent, because I can include all the files I
need, each containing a class, and I can use variable variables:

<?php
$class = new $module; // can't remember if this works, there are
// alternatives though
$class->$method($args);
?>

And $class->$method() just does "global $db; $db->query(...);".

Any advice would be greatly appreciated!

Cheers

-Robin Haswell
Jan 19 '06 #1
"Robin Haswell" <ro*@digital-crocus.com> wrote in message
news:pa****************************@digital-crocus.com...
> Hey people
>
> I'm an experienced PHP programmer who's been writing Python for a couple of
> weeks now. I'm writing quite a large application which I've decided to
> break down into lots of modules (a replacement for PHP's include()
> statement).
>
> My problem is, in PHP if you open a database connection it's always in
> scope for the duration of the script. Even if you use an abstraction layer
> ($db = DB::connect(...)) you can `global $db` and bring it into scope,
> but in Python I'm having trouble keeping the database in scope. At the
> moment I'm having to "push" the database into the module, but I'd prefer
> the module to bring the database connection in ("pull") from its parent.
>
> Eg:
> import modules
> modules.foo.c = db.cursor()
> modules.foo.Bar()
>
> Can anyone recommend any "cleaner" solutions to all of this?


Um, I think your Python solution *is* moving in a cleaner direction than
simple sharing of a global $db variable. Why make the Bar class have to
know where to get a db cursor from? What do you do if your program extends
to having multiple Bar() objects working with different cursors into the db?

The unnatural part of this (and hopefully, the part that you feel is
"unclean") is that you're trading one global for another. By just setting
modules.foo.c to the db cursor, you force all Bar() instances to use that
same cursor.

Instead, make the database cursor part of Bar's constructor. Now you can
externally create multiple db cursors, a Bar for each, and they all merrily
do their own separate, isolated processing, in blissful ignorance of each
other's db cursors (vs. colliding on the shared $db variable).
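
A minimal sketch of that constructor approach, assuming sqlite3 as a stand-in for MySQLdb so it runs without a server. The count_users method and the users table are hypothetical names, not part of the original application.

```python
import sqlite3  # stand-in for MySQLdb so the sketch runs anywhere


class Bar:
    """Hypothetical worker class that receives its cursor instead of finding it."""

    def __init__(self, cursor):
        self.cursor = cursor

    def count_users(self):
        self.cursor.execute("SELECT COUNT(*) FROM users")
        return self.cursor.fetchone()[0]


def make_demo_connection():
    # In-memory database with one hypothetical table for the demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    conn.commit()
    return conn


conn = make_demo_connection()
first = Bar(conn.cursor())
second = Bar(conn.cursor())  # a separate cursor, independent of the first
print(first.count_users(), second.count_users())
```

Nothing in Bar knows where the cursor came from, so the caller decides whether instances share a cursor or each get their own.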

-- Paul
Jan 19 '06 #2
On Thu, 19 Jan 2006 12:23:12 +0000, Paul McGuire wrote:
> "Robin Haswell" <ro*@digital-crocus.com> wrote in message
> news:pa****************************@digital-crocus.com...
>> Hey people
>>
>> I'm an experienced PHP programmer who's been writing Python for a couple of
>> weeks now. I'm writing quite a large application which I've decided to
>> break down into lots of modules (a replacement for PHP's include()
>> statement).
>>
>> My problem is, in PHP if you open a database connection it's always in
>> scope for the duration of the script. Even if you use an abstraction layer
>> ($db = DB::connect(...)) you can `global $db` and bring it into scope,
>> but in Python I'm having trouble keeping the database in scope. At the
>> moment I'm having to "push" the database into the module, but I'd prefer
>> the module to bring the database connection in ("pull") from its parent.
>>
>> Eg:
>> import modules
>> modules.foo.c = db.cursor()
>> modules.foo.Bar()
>>
>> Can anyone recommend any "cleaner" solutions to all of this?
>
> Um, I think your Python solution *is* moving in a cleaner direction than
> simple sharing of a global $db variable. Why make the Bar class have to
> know where to get a db cursor from? What do you do if your program extends
> to having multiple Bar() objects working with different cursors into the db?
>
> The unnatural part of this (and hopefully, the part that you feel is
> "unclean") is that you're trading one global for another. By just setting
> modules.foo.c to the db cursor, you force all Bar() instances to use that
> same cursor.
>
> Instead, make the database cursor part of Bar's constructor. Now you can
> externally create multiple db cursors, a Bar for each, and they all merrily
> do their own separate, isolated processing, in blissful ignorance of each
> other's db cursors (vs. colliding on the shared $db variable).


Hm, if truth be told, I'm not totally interested in keeping a separate
cursor for every class instance. This application runs in a very simple
threaded socket server - every time a new thread is created, we create a
new db.cursor (m = getattr(modules, module) followed by m.c = db.cursor()
is the first part of the thread), and when the thread finishes all its
actions (of which there are many, but all sequential), the thread exits. I
don't see any situations where lots of methods will tread on another
method's cursor. My main focus really is minimising the number of
connections. Using MySQLdb, I'm not sure if every MySQLdb.connect or
db.cursor is a separate connection, but I get the feeling that a lot of
cursors = a lot of connections. I'd much prefer each method call within a
thread to reuse that thread's connection, as creating a connection incurs
significant overhead on the MySQL server and DNS server.

-Rob

> -- Paul


Jan 19 '06 #3
Robin Haswell wrote:
> cursor for every class instance. This application runs in a very simple
> threaded socket server - every time a new thread is created, we create a
> new db.cursor (m = getattr(modules, module)\n m.c = db.cursor() is the
> first part of the thread), and when the thread finishes all its actions
> (of which there are many, but all sequential), the thread exits. I don't

If you use a threading server, you can't put the connection object into
the module. Modules and hence module variables are shared across
threads. You could use thread local storage, but I think it's better to
pass the connection explicitly as a parameter.

> separate connection, but I get the feeling that a lot of cursors = a lot
> of connections. I'd much prefer each method call with a thread to reuse
> that thread's connection, as creating a connection incurs significant
> overhead on the MySQL server and DNS server.

You can create several cursor objects from one connection. There should
be no problems if you finish processing of one cursor before you open
the next one. In earlier (current?) versions of MySQL, only one result
set could be opened at a time, so using cursors in parallel presented some
problems to the driver implementor.
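
As a rough illustration of both points (passing the connection as a parameter, and finishing one cursor before opening the next), here is a sketch that uses sqlite3 as a stand-in for MySQLdb. The function name run_sequentially is made up for the example.

```python
import sqlite3  # stand-in for MySQLdb; both follow the DB-API 2.0 interface


def run_sequentially(conn):
    """Use two cursors on one connection, finishing the first before the second."""
    first = conn.cursor()
    first.execute("SELECT 1")
    a = first.fetchall()   # consume the whole result set
    first.close()          # done with this cursor before opening the next

    second = conn.cursor()
    second.execute("SELECT 2")
    b = second.fetchall()
    second.close()
    return a, b


conn = sqlite3.connect(":memory:")   # one connection for the whole unit of work
results = run_sequentially(conn)     # the connection is passed in explicitly
conn.close()
```

Both cursors share the single connection, so no extra connection is opened for the second query.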

Daniel
Jan 19 '06 #4
On Thu, 19 Jan 2006 14:37:34 +0100, Daniel Dittmar wrote:
> Robin Haswell wrote:
>> cursor for every class instance. This application runs in a very simple
>> threaded socket server - every time a new thread is created, we create a
>> new db.cursor (m = getattr(modules, module)\n m.c = db.cursor() is the
>> first part of the thread), and when the thread finishes all its actions
>> (of which there are many, but all sequential), the thread exits. I don't
>
> If you use a threading server, you can't put the connection object into
> the module. Modules and hence module variables are shared across
> threads. You could use thread local storage, but I think it's better to
> pass the connection explicitly as a parameter.

Would you say it would be better if in every thread I did:

m = getattr(modules, module)
m.db = db

...

def Foo():
    c = db.cursor()

?

>> separate connection, but I get the feeling that a lot of cursors = a lot
>> of connections. I'd much prefer each method call with a thread to reuse
>> that thread's connection, as creating a connection incurs significant
>> overhead on the MySQL server and DNS server.
>
> You can create several cursor objects from one connection. There should
> be no problems if you finish processing of one cursor before you open
> the next one. In earlier (current?) versions of MySQL, only one result
> set could be opened at a time, so using cursors in parallel presented some
> problems to the driver implementor.
>
> Daniel


Jan 19 '06 #5

Robin Haswell wrote:
> Hey people
>
> I'm an experienced PHP programmer who's been writing Python for a couple of
> weeks now. I'm writing quite a large application which I've decided to
> break down into lots of modules (a replacement for PHP's include()
> statement).
>
> My problem is, in PHP if you open a database connection it's always in
> scope for the duration of the script. Even if you use an abstraction layer
> ($db = DB::connect(...)) you can `global $db` and bring it into scope,
> but in Python I'm having trouble keeping the database in scope. At the
> moment I'm having to "push" the database into the module, but I'd prefer
> the module to bring the database connection in ("pull") from its parent.


This is what I do.

Create a separate module to contain your global variables - mine is
called 'common'.

In common, create a class, with attributes, but with no methods. Each
attribute becomes a global variable. My class is called 'c'.

At the top of every other module, put 'from common import c'.

Within each module, you can now refer to any global variable as
c.whatever.

You can create class attributes on the fly. You can therefore have
something like -

c.db = MySQLdb.connect(...)

All modules will be able to access c.db

As Daniel has indicated, it may not be safe to share one connection
across multiple threads, unless you can guarantee that one thread
completes its processing before another one attempts to access the
database. You can use threading locks to assist with this.
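
A single-file sketch of this pattern, with the lock included. It assumes sqlite3 in place of MySQLdb, and the names db_lock, record and the log table are invented for the example; in a real layout the class would live in common.py and the other modules would do 'from common import c'.

```python
import sqlite3
import threading


class c:
    """Plays the role of the class in 'common': attributes only, no methods."""


# Done once at startup. check_same_thread=False lets sqlite3 share the
# connection across threads, which is only safe here because every access
# is serialized by c.db_lock.
c.db = sqlite3.connect(":memory:", check_same_thread=False)
c.db_lock = threading.Lock()
c.db.execute("CREATE TABLE log (msg TEXT)")


def record(msg):
    # Hold the lock for the whole unit of work so threads cannot interleave.
    with c.db_lock:
        cur = c.db.cursor()
        cur.execute("INSERT INTO log VALUES (?)", (msg,))
        c.db.commit()
        cur.close()


threads = [threading.Thread(target=record, args=("thread %d" % i,))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with c.db_lock:
    row_count = c.db.execute("SELECT COUNT(*) FROM log").fetchone()[0]
```

The lock makes the shared connection behave as if only one thread used it at a time, at the cost of threads waiting on each other for every database operation.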

HTH

Frank Millman

Jan 19 '06 #6
Robin Haswell wrote:
> On Thu, 19 Jan 2006 14:37:34 +0100, Daniel Dittmar wrote:
>> If you use a threading server, you can't put the connection object into
>> the module. Modules and hence module variables are shared across
>> threads. You could use thread local storage, but I think it's better to
>> pass the connection explicitly as a parameter.
>
> Would you say it would be better if in every thread I did:
>
> m = getattr(modules, module)
> m.db = db
>
> ...
>
> def Foo():
>     c = db.cursor()


I was thinking (example from original post):

import modules
modules.foo.Bar(db.cursor())

# file modules.foo
def Bar(cursor):
    cursor.execute(...)

The same is true for other objects like the HTTP request: always pass
them as parameters because module variables are shared between threads.

If you have an HTTP request object, then you could attach the database
connection to that object, that way you have to pass only one object.

Or you create a new class that encompasses everything useful for this
request: the HTTP request, the database connection, possibly an object
containing authorization info etc.
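
Such a class could look roughly like this. It is only a sketch: RequestContext, show_user and the dict-based request and auth objects are hypothetical, and sqlite3 again stands in for the real database driver.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class RequestContext:
    """Hypothetical bundle of everything one request needs."""
    request: dict                              # stand-in for an HTTP request object
    db: sqlite3.Connection                     # stand-in for a MySQLdb connection
    auth: dict = field(default_factory=dict)   # authorization info


def show_user(ctx):
    # The function receives one object and pulls what it needs from it.
    cur = ctx.db.cursor()
    cur.execute("SELECT ?", (ctx.request["user"],))
    return cur.fetchone()[0], ctx.auth.get("role", "guest")


ctx = RequestContext(request={"user": "robin"},
                     db=sqlite3.connect(":memory:"),
                     auth={"role": "admin"})
result = show_user(ctx)
```

Each thread builds its own context, so nothing request-specific ever lands in a module variable.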

I assume that in PHP, global still means 'local to this request', as PHP
probably runs in threads under Windows IIS (and Apache 2.0?). In Python,
you have to be more explicit about the scope.

Daniel
Jan 19 '06 #7
On Thu, 19 Jan 2006 15:43:58 +0100, Daniel Dittmar wrote:
> Robin Haswell wrote:
>> On Thu, 19 Jan 2006 14:37:34 +0100, Daniel Dittmar wrote:
>>> If you use a threading server, you can't put the connection object into
>>> the module. Modules and hence module variables are shared across
>>> threads. You could use thread local storage, but I think it's better to
>>> pass the connection explicitly as a parameter.
>>
>> Would you say it would be better if in every thread I did:
>>
>> m = getattr(modules, module)
>> m.db = db
>>
>> ...
>>
>> def Foo():
>>     c = db.cursor()
>
> I was thinking (example from original post):
>
> import modules
> modules.foo.Bar(db.cursor())
>
> # file modules.foo
> def Bar(cursor):
>     cursor.execute(...)


Ah I see.. sounds interesting. Is it possible to make any module variable
local to a thread, if set within the current thread? Your method, although
good, would mean revising all my functions in order to make it work?

Thanks

Jan 19 '06 #8
On Thu, 19 Jan 2006 06:38:39 -0800, Frank Millman wrote:
> Robin Haswell wrote:
>> Hey people
>>
>> I'm an experienced PHP programmer who's been writing Python for a couple of
>> weeks now. I'm writing quite a large application which I've decided to
>> break down into lots of modules (a replacement for PHP's include()
>> statement).
>>
>> My problem is, in PHP if you open a database connection it's always in
>> scope for the duration of the script. Even if you use an abstraction layer
>> ($db = DB::connect(...)) you can `global $db` and bring it into scope,
>> but in Python I'm having trouble keeping the database in scope. At the
>> moment I'm having to "push" the database into the module, but I'd prefer
>> the module to bring the database connection in ("pull") from its parent.
>
> This is what I do.
>
> Create a separate module to contain your global variables - mine is
> called 'common'.
>
> In common, create a class, with attributes, but with no methods. Each
> attribute becomes a global variable. My class is called 'c'.
>
> At the top of every other module, put 'from common import c'.
>
> Within each module, you can now refer to any global variable as
> c.whatever.
>
> You can create class attributes on the fly. You can therefore have
> something like -
>
> c.db = MySQLdb.connect(...)
>
> All modules will be able to access c.db
>
> As Daniel has indicated, it may not be safe to share one connection
> across multiple threads, unless you can guarantee that one thread
> completes its processing before another one attempts to access the
> database. You can use threading locks to assist with this.
>
> HTH
>
> Frank Millman

Thanks, that sounds like an excellent idea. While I don't think it applies
to the database (threading seems to be becoming a bit of an issue at the
moment), I know I can use that in other areas :-)

Cheers

-Rob

Jan 19 '06 #9
Robin Haswell wrote:
> Can anyone give me advice on making this all a bit more transparent? I
> guess I really would like a method to bring all these files into the same
> scope to make everything seem to be all one application, even though
> everything is broken up into different files.


This is very much a deliberate design decision in Python.
I haven't used PHP, but in e.g. C, the #include directive
means that you pollute your namespace with all sorts of
strange names from all the third party libraries you are
using, and this doesn't scale well. As your application
grows, you'll get mysterious bugs due to strange name clashes,
and removing some module you no longer need means that your app
won't build, since the include file you no longer include in
turn included another file that you should have included but
didn't, etc. In Python, explicit is better than implicit (type
"import this" at the Python prompt) and while this causes some
extra typing it helps with code maintenance. You can always
see where a name in your current namespace comes from (unless
you use "from xxx import *"). No magic!

Concerning your database operations, it seems they are distributed
over a lot of different modules, and that might also cause problems,
whatever programming language we use. In typical database
applications, you need to keep track of transactions properly.

For each opened connection, you can perform a number of transactions
after each other. A transaction starts with the first database
operation after a connect, commit or rollback. A cursor should only
live within a transaction. In other words, you should close all
cursors before you perform a commit or rollback.

I find it very difficult to manage transactions properly if the
commits are spread out in the code. Usually I want one module to
contain some kind of transaction management logic, where I determine
the transaction boundaries. This logic will hand out cursor objects
to various pieces of code, and determine when to close the cursors
and commit the transaction.
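
One way to sketch that kind of transaction management logic is a context manager that owns the cursor and the commit. This is an illustration under assumptions, not anyone's actual module: the names transaction and add_user are invented, and sqlite3 stands in for the real driver.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Hand out one cursor, then commit on success or roll back on error."""
    cur = conn.cursor()
    try:
        yield cur
    except Exception:
        cur.close()        # close the cursor before ending the transaction
        conn.rollback()
        raise
    else:
        cur.close()
        conn.commit()


def add_user(cur, name):
    # Worker code only uses the cursor; it never commits.
    cur.execute("INSERT INTO users VALUES (?)", (name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.commit()

with transaction(conn) as cur:
    add_user(cur, "alice")

try:
    with transaction(conn) as cur:
        add_user(cur, "bob")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

names = [row[0] for row in conn.execute("SELECT name FROM users")]
```

The second block fails partway through, so its insert is rolled back and only the first transaction's row remains.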

I haven't really written multithreaded applications, so I don't
have any experience of the problems that might cause. I know that
it's a fairly common pattern to have all database transactions in
one thread though, and to use Queue.Queue instances to pass data
to and from the thread that handles DB.
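
A small sketch of that pattern, written for Python 3 where the Queue module is named queue. The request tuple format, the query helper and the hits table are assumptions made for the example, and sqlite3 stands in for the real database.

```python
import queue
import sqlite3
import threading

requests = queue.Queue()
_STOP = object()  # sentinel telling the DB thread to exit


def db_worker():
    # The connection is created and used only inside this thread.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (page TEXT)")
    while True:
        item = requests.get()
        if item is _STOP:
            break
        sql, params, reply = item
        cur = conn.cursor()
        cur.execute(sql, params)
        rows = cur.fetchall()
        cur.close()
        conn.commit()
        reply.put(rows)   # send the result back to the caller
    conn.close()


def query(sql, params=()):
    """Called from any thread: send a request and wait for the answer."""
    reply = queue.Queue(maxsize=1)
    requests.put((sql, params, reply))
    return reply.get()


worker = threading.Thread(target=db_worker)
worker.start()
query("INSERT INTO hits VALUES (?)", ("/index",))
rows = query("SELECT page FROM hits")
requests.put(_STOP)
worker.join()
```

The sketch leaves out error handling: if a statement raises inside the worker, the caller would wait forever, so a fuller version would send the exception back through the reply queue.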

Anyway, you can only have one transaction going on at a time for
a connection, so if you share connections between threads (or use
a separate DB thread and queues) a rollback or commit in one thread
will affect the other threads as well...

Each DB-API 2.0 compliant library should be able to declare how it
can be used in a threaded application. See the DB-API 2.0 spec:
http://python.org/peps/pep-0249.html Look for "threadsafety".
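
Checking that attribute takes a couple of lines. The sketch below reads it from sqlite3 because that ships with Python; any DB-API 2.0 module should expose the same name. The describe_threadsafety helper is made up, and the descriptions paraphrase the four levels in the spec.

```python
import sqlite3  # any DB-API 2.0 module should expose the same attribute

# Paraphrased meanings of the levels defined in the DB-API 2.0 spec.
LEVELS = {
    0: "threads may not share the module",
    1: "threads may share the module, but not connections",
    2: "threads may share the module and connections",
    3: "threads may share the module, connections and cursors",
}


def describe_threadsafety(dbapi_module):
    level = dbapi_module.threadsafety
    return level, LEVELS[level]


level, meaning = describe_threadsafety(sqlite3)
print("threadsafety = %d: %s" % (level, meaning))
```

The value depends on the module and how it was built, so it is worth printing it for the driver you actually deploy.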
Jan 19 '06 #10
Robin Haswell wrote:
> Ah I see.. sounds interesting. Is it possible to make any module variable
> local to a thread, if set within the current thread?


Not directly. The following class tries to simulate it (only in Python 2.4):

import threading

class ThreadLocalObject(threading.local):
    def setObject(self, object):
        setattr(self, 'object', object)

    def clearObject(self):
        setattr(self, 'object', None)

    def __getattr__(self, name):
        object = threading.local.__getattribute__(self, 'object')
        return getattr(object, name)

You use it as:

in some module x:

db = ThreadLocalObject()

in some module that creates the database connection:

import x

def createConnection():
    localdb = ...connect(...)
    x.db.setObject(localdb)

in some module that uses the database connection:

import x

def bar():
    cursor = x.db.cursor()

The trick is:
- every attribute of a threading.local is thread local (see doc of
module threading)
- when accessing an attribute of object x.db, the method __getattr__
will first retrieve the thread local database connection and then access
the specific attribute of the database connection. Thus it looks as if
x.db is itself a database connection object.

That way, only the setting of the db variable would have to be changed.

I'm not exactly recommending this, as it seems very error prone to me.
It's easy to overwrite the variable holding the cursors with an actual
cursor object.
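
To see the per-thread behaviour in action, here is a condensed Python 3 variant of the same proxy idea run from two threads. FakeConnection is a hypothetical stand-in for a real connection, and the method names differ slightly from the class above.

```python
import threading


class ThreadLocalObject(threading.local):
    """Forward attribute access to an object stored per thread."""

    def set_object(self, obj):
        self.obj = obj

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for forwarded names.
        target = threading.local.__getattribute__(self, "obj")
        return getattr(target, name)


class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self, owner):
        self.owner = owner

    def cursor(self):
        return "cursor for " + self.owner


db = ThreadLocalObject()   # one module-level name shared by all threads
seen = {}


def worker(name):
    db.set_object(FakeConnection(name))   # visible only in this thread
    seen[name] = db.cursor()


threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread gets back a cursor from its own connection even though both go through the same db name. A thread that never called set_object gets an AttributeError instead.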

Daniel
Jan 19 '06 #11

Daniel Dittmar wrote:
> Robin Haswell wrote:
>> Ah I see.. sounds interesting. Is it possible to make any module variable
>> local to a thread, if set within the current thread?
>
> Not directly. The following class tries to simulate it (only in Python 2.4):
>
> import threading
>
> class ThreadLocalObject(threading.local):


Daniel, perhaps you can help me here.

I have subclassed threading.Thread, and I store a number of attributes
within the subclass that are local to the thread. It seems to work
fine, but according to what you say (and according to the Python docs,
otherwise why would there be a 'Local' class) there must be some reason
why it is not a good idea. Please can you explain the problem with this
approach.

Briefly, this is what I am doing.

import select
import socket
import threading

class Link(threading.Thread):  # each link runs in its own thread
    """Run a loop listening for messages from client."""

    def __init__(self, args):
        threading.Thread.__init__(self)
        print('link connected %s' % self.getName())
        self.ctrl, self.conn = args
        self._db = {}  # to store db connections for this client connection
        # [create various other local attributes]

    def run(self):
        readable = [self.conn.fileno()]
        error = []
        self.sendData = []  # 'stack' of replies to be sent

        self.running = True
        while self.running:
            if self.sendData:
                writable = [self.conn.fileno()]
            else:
                writable = []
            r, w, e = select.select(readable, writable, error, 0.1)  # 0.1 timeout
            # [continue to handle connection]

class Controller(object):
    """Run a main loop listening for client connections."""

    def __init__(self):
        self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.s.bind((HOST, PORT))  # HOST and PORT are defined elsewhere
        self.s.listen(5)
        self.running = True

    def mainloop(self):
        while self.running:
            try:
                conn, addr = self.s.accept()
                Link(args=(self, conn)).start()  # create thread to handle connection
            except KeyboardInterrupt:
                self.shutdown()

Controller().mainloop()

TIA

Frank Millman

Jan 20 '06 #12
Frank Millman wrote:
> I have subclassed threading.Thread, and I store a number of attributes
> within the subclass that are local to the thread. It seems to work
> fine, but according to what you say (and according to the Python docs,
> otherwise why would there be a 'Local' class) there must be some reason
> why it is not a good idea. Please can you explain the problem with this
> approach.


Your design is just fine. If you follow the thread upwards, you'll
notice that I encouraged the OP to pass everything by parameter.

Using thread local storage in this case was meant to be a kludge so that
not every def and every call has to be changed. There are other cases
when you don't control how threads are created (say, a plugin for a web
framework) where thread local storage is useful.

threading.local is new in Python 2.4, so it doesn't seem to be that
essential to Python thread programming.
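
For the case where you don't create the threads yourself, a sketch along these lines shows plain threading.local at work. A thread pool plays the part of the framework here; get_connection and handle are invented names, and sqlite3 stands in for the real driver.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()


def get_connection():
    """Return this thread's connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(":memory:")
        _local.conn = conn
    return conn


def handle(job):
    # Runs on a pool thread that this code did not create.
    conn = get_connection()
    value = conn.execute("SELECT ? * 2", (job,)).fetchone()[0]
    return threading.get_ident(), id(conn), value


with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(handle, range(6)))

# Map each thread id to the set of connection ids it used.
per_thread = {}
for thread_id, conn_id, _ in results:
    per_thread.setdefault(thread_id, set()).add(conn_id)
```

Every pool thread ends up reusing a single connection of its own, without any function having to take the connection as a parameter.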

Daniel
Jan 20 '06 #13

Daniel Dittmar wrote:
> Frank Millman wrote:
>> I have subclassed threading.Thread, and I store a number of attributes
>> within the subclass that are local to the thread. It seems to work
>> fine, but according to what you say (and according to the Python docs,
>> otherwise why would there be a 'Local' class) there must be some reason
>> why it is not a good idea. Please can you explain the problem with this
>> approach.
>
> Your design is just fine. If you follow the thread upwards, you'll
> notice that I encouraged the OP to pass everything by parameter.


Many thanks, Daniel

Frank

Jan 20 '06 #14
