CGIs and file exclusion

Hi all,

While doing a quite big "set of programs" for a university subject I've
found myself in the middle of a problem someone surely has had before, so
I'm looking for some help.

At one point, I call a Python CGI that pickle.load's a file, adds or deletes
a record and dumps the result back into the file.
I'm worried that the cgi could be called simultaneously from two or more
different computers, thus most probably corrupting the files. I don't think
I can use a mutex as it's two different instances of the program and not
different threads, so I was thinking about locking the files between
programs, but I really don't know how to do that. It's not a problem if
there's no portable way of doing this, it's only going to be run on a linux
based computer.
Another solution I would accept is for the second CGI to detect that
another instance is running and display a message saying to try again later.
Yes, not quite professional, but it'd do the job, after all this is just a
little detail I want to settle for a quite picky professor and not a "real
life" thing.
I think that's all the background you need. If someone can answer with
references on what I should look for or, even better, example code, that would
be simply great.
Many thanks in advance.
DH
Jul 18 '05
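
For the Linux-only case described in the question, one option is an advisory lock taken with fcntl.flock around the whole load/modify/dump cycle. The following is a minimal sketch in Python 3, not code from the thread: the names locked and update_records, the ".lock" and ".tmp" suffixes, and the assumption that the pickle holds a dict are all made up for illustration.

```python
import fcntl
import os
import pickle
from contextlib import contextmanager


@contextmanager
def locked(lock_path, blocking=True):
    """Hold an exclusive advisory lock on lock_path while the block runs."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        # With LOCK_NB this raises BlockingIOError if another holder exists.
        fcntl.flock(fd, flags)
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def update_records(data_path, key, value=None, delete=False, blocking=True):
    """Load the pickled dict, add or delete one record, and write it back."""
    with locked(data_path + ".lock", blocking):
        try:
            with open(data_path, "rb") as f:
                records = pickle.load(f)
        except FileNotFoundError:
            records = {}  # first run: start with an empty dict (assumption)
        if delete:
            records.pop(key, None)
        else:
            records[key] = value
        # Write to a temporary file, then rename it over the original so a
        # crash in mid-dump cannot leave a half-written pickle behind.
        tmp_path = data_path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(records, f)
        os.replace(tmp_path, data_path)
        return records
```

With blocking=True a second CGI simply waits its turn. With blocking=False it gets a BlockingIOError straight away, which the script could catch to print the "try again later" page mentioned in the question. The lock is advisory, so it only protects against other programs that take the same lock.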
fu******@gmail.com (Michael Foord) wrote in message news:<6f*************************@posting.google.com>...
A simple solution that doesn't scale well is to create a file when the
access starts. You can check if the file exists and pause until the
other process deletes it - with a timeout in case the file gets left
there due to an error.

Obviously not an industrial strength solution, but it does work...

import time
import os

def sleep(thelockfile, sleepcycle=0.01, MAXCOUNT=200):
    """Sleep until the lockfile has been removed or a certain number
    of cycles have gone.
    Defaults to a max 2 second delay.
    """
    counter = 0
    while os.path.exists(thelockfile):
        time.sleep(sleepcycle)
        counter += 1
        if counter > MAXCOUNT: break

def createlock(thelockfile):
    """Creates a lockfile from the path supplied."""
    open(thelockfile, 'w').close()

def releaselock(thelockfile):
    """Deletes the lockfile."""
    if os.path.isfile(thelockfile):
        os.remove(thelockfile)

The sleep function waits until the specified file disappears - or it
times out.
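
One weakness of checking os.path.exists and then creating the file is that the two steps are separate: two processes can both see "no lock file" and both go on to create it. A possible variant, sketched here in Python 3 under the assumption of a local Linux filesystem, folds the check and the creation into a single os.open call with O_CREAT | O_EXCL, which fails if the file already exists. The names acquire_lock and release_lock are hypothetical and not from the post above.

```python
import os
import time


def acquire_lock(lockfile, sleepcycle=0.01, maxcount=200):
    """Try to create lockfile atomically.

    Returns True once the file has been created by this process, or False
    if it still exists after maxcount attempts (about 2 seconds by default).
    """
    for _ in range(maxcount):
        try:
            # O_EXCL makes the call fail if the file is already there, so
            # "does it exist?" and "create it" happen as one operation.
            fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(sleepcycle)
            continue
        os.close(fd)
        return True
    return False


def release_lock(lockfile):
    """Delete the lockfile, ignoring the case where it is already gone."""
    try:
        os.remove(lockfile)
    except FileNotFoundError:
        pass
```

A caller would run its critical section only when acquire_lock returns True, and call release_lock in a finally clause so that an exception does not leave the file behind.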


I tried essentially the same solution in my experiments, but I was
unhappy with it: it seems to work 99% of the time, but occasionally you
get strange things (for instance, once I got "File not found" when
trying to remove the lockfile; evidently it had already been removed by
another process; other times I got different strange errors). The issue
is that it is very difficult to reproduce the problems, hence to fix
them. Maybe Diez B. Roggisch is right and a real database server is the
simplest solution. However, my first attempt with ZEO didn't work either:

$ cat zeoclient.py
import ZODB, ZEO
from ZEO.ClientStorage import ClientStorage

def openzeo(host, port):
    db = ZODB.DB(ClientStorage((host, port)))
    conn = db.open()
    return db, conn, conn.root()

def store():
    # I have a ZEO instance running on port 9999
    print "Opening the db ..."
    db, conn, root = openzeo("localhost", 9999)
    print "Storing something ..."
    root["somekey"] = "somedata"
    get_transaction().commit()
    print "Closing the db ..."
    conn.close(); db.close()

if __name__ == "__main__":
    store()

$ cat Makefile
default:
	python zeoclient.py&
	python zeoclient.py

$ make
python zeoclient.py&
python zeoclient.py
Opening the db ...
Opening the db ...

Storing something ...
Storing something ...
Closing the db ...
Traceback (most recent call last):
  File "zeoclient.py", line 20, in ?
    store()
  File "zeoclient.py", line 15, in store
    get_transaction().commit()
  File "/usr/share/partecs/zope/lib/python/ZODB/Transaction.py", line 247, in commit
~/pt/python/zopexplore $
~/pt/python/zopexplore $ vote(self)
  File "/usr/share/partecs/zope/lib/python/ZODB/Connection.py", line 699, in tpc_vote
    s = vote(transaction)
  File "/opt/zope/lib/python/ZEO/ClientStorage.py", line 841, in tpc_vote
    return self._check_serials()
  File "/opt/zope/lib/python/ZEO/ClientStorage.py", line 825, in _check_serials
    raise s
ZODB.POSException.ConflictError: database conflict error (oid 0000000000000000, serial was 035900d31b7fedaa, now 035900d2f6cd8799)

(it works with a single process instead).

Maybe I misunderstood how ZEO is intended to be used; as usual, it is
difficult to find the relevant documentation :-( Maybe I should ask
on another list ...
Michele Simionato
Jul 18 '05 #11
[Michele Simionato]
...
Maybe Diez B. Roggisch is right and a real database server is the simplest
solution. However, my first attempt with ZEO didn't work either:
That's OK, nobody's first attempt with any database server works <0.6 wink>.
$ cat zeoclient.py
import ZODB, ZEO
from ZEO.ClientStorage import ClientStorage

def openzeo(host, port):
    db = ZODB.DB(ClientStorage((host, port)))
    conn = db.open()
    return db, conn, conn.root()

def store():
    # I have a ZEO instance running on port 9999
    print "Opening the db ..."
    db, conn, root = openzeo("localhost", 9999)
    print "Storing something ..."
    root["somekey"] = "somedata"
It's important to note that store() always changes at least the root object.
    get_transaction().commit()
    print "Closing the db ..."
    conn.close(); db.close()

if __name__ == "__main__":
    store()

$ cat Makefile
default:
	python zeoclient.py&
	python zeoclient.py

$ make
python zeoclient.py&
python zeoclient.py
Opening the db ...
Opening the db ...

Storing something ...
Storing something ...
Closing the db ...
Traceback (most recent call last):
  File "zeoclient.py", line 20, in ?
    store()
  File "zeoclient.py", line 15, in store
    get_transaction().commit()
  File "/usr/share/partecs/zope/lib/python/ZODB/Transaction.py", line 247, in commit
~/pt/python/zopexplore $
~/pt/python/zopexplore $ vote(self)
  File "/usr/share/partecs/zope/lib/python/ZODB/Connection.py", line 699, in tpc_vote
    s = vote(transaction)
  File "/opt/zope/lib/python/ZEO/ClientStorage.py", line 841, in tpc_vote
    return self._check_serials()
  File "/opt/zope/lib/python/ZEO/ClientStorage.py", line 825, in _check_serials
    raise s
ZODB.POSException.ConflictError: database conflict error (oid 0000000000000000, serial was 035900d31b7fedaa, now 035900d2f6cd8799)

(it works with a single process instead).
Yes, that's predictable too <wink>.
Maybe I misunderstood how ZEO is intended to be used; as usual, it is
difficult to find the relevant documentation :-( Maybe I should ask
on another list ...


zo******@zope.org is the best place for ZODB/ZEO questions independent
of Zope use. Note that you must subscribe to a zope.org list in order
to post to it (that's a Draconian but very effective anti-spam
policy).

In the case above, ZEO isn't actually relevant. You'd see the same
thing if you had a single process with two threads, each using a
"direct" ZODB connection to the same database.

ZODB doesn't do object-level locking. It relies on "optimistic
concurrency control" (a googlable phrase) instead, which is especially
appropriate for high-read low-write applications like most Zope
deployments.

In effect, that means it won't stop you from trying to do something
insane, but does stop you from *completing* it. What you got above is
a "write conflict error", and is normal behavior. What happens:

- Process A loads revision n of some particular object O.
- Process B loads the same revision n of O.
- Process A modifies O, creating revision n+1.
- Process A commits its change to O. Revision n+1 is then current.
- Process B modifies O, creating revision n+2.
- Process B *tries* to commit its change to O.

The implementation of commit() investigates, and effectively says
"Hmm. Process B started with revision n of O, but revision n+1 is
currently committed. That means B didn't *start* with the currently
committed revision of O, so B has no idea what might have happened in
revision n+1 -- B may be trying to commit an insane change as a
result. Can't let that happen, so I'll raise ConflictError". That
line of argument makes a lot more sense if more than one object is
involved, but maybe it's enough to hint at the possible problems.
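
The sequence above can be replayed with a toy model. This is only a sketch of the general idea of optimistic concurrency control, written in Python 3: ToyStore, its methods and this ConflictError class are made up for illustration and are not ZODB's API or implementation.

```python
class ConflictError(Exception):
    """Raised when a commit is based on a stale revision (toy stand-in)."""


class ToyStore:
    """Holds one object O together with its current revision number."""

    def __init__(self, value):
        self.value = value
        self.revision = 0

    def load(self):
        # A reader gets the value plus the revision it was read at.
        return self.value, self.revision

    def commit(self, new_value, based_on):
        # Nothing is locked at load time; staleness is only checked here.
        if based_on != self.revision:
            raise ConflictError(
                "started from revision %d, but revision %d is current"
                % (based_on, self.revision)
            )
        self.value = new_value
        self.revision += 1


store = ToyStore({"somekey": None})
a_value, a_rev = store.load()                # process A loads revision n
b_value, b_rev = store.load()                # process B loads the same revision
store.commit({"somekey": "from A"}, a_rev)   # A commits; n+1 is now current
try:
    store.commit({"somekey": "from B"}, b_rev)   # B tries to commit
    outcome = "committed"
except ConflictError:
    outcome = "conflict"                     # B started from a stale revision
```

Neither load blocks the other, which is why this style suits read-heavy workloads; the cost only shows up when two writers touch the same object.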

Anyway, since your store() method always picks on the root object,
you're going to get ConflictErrors frequently. It's bad application
design for a ZODB/ZEO app to have a "hot spot" like that.

In real life, all ZEO apps, and all multithreaded ZODB apps, always do
their work inside try/except structures. When a conflict error
occurs, the except clause catches it, and generally tries the
transaction again. In your code above, that isn't going to work well,
because there's a single object that's modified by every transaction
-- it will be rare for a commit() attempt not to give up with a
conflict error.

Perhaps paradoxically, it can be easier to get a real ZEO app working
well than one's first overly simple attempts -- ZODB effectively
*wants* you to scribble all over the database.
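
The try/except-and-retry structure described above might look roughly like this. It is a generic Python 3 sketch, not ZODB code: ConflictError here is a local stand-in for ZODB.POSException.ConflictError (the exception in the traceback), and run_with_retries, attempts and base_delay are hypothetical names. A real client would most likely also need to abort the failed transaction before trying again.

```python
import random
import time


class ConflictError(Exception):
    """Stand-in for the database's conflict exception (assumption)."""


def run_with_retries(transaction_fn, attempts=5, base_delay=0.05):
    """Call transaction_fn until it succeeds or attempts are used up.

    transaction_fn must redo the whole load/modify/commit cycle on every
    call, so that each retry starts from the currently committed state.
    """
    for attempt in range(attempts):
        try:
            return transaction_fn()
        except ConflictError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller report the failure
            # Wait a random, growing interval so that two colliding
            # processes are unlikely to collide again in lockstep.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

As the post points out, retrying only helps when conflicts are occasional; if every transaction modifies the same object, most retries will conflict as well.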
Jul 18 '05 #12
Mike Meyer <mw*@mired.org> wrote in message news:<x7************@guru.mired.org>...
fu******@gmail.com (Michael Foord) writes:

[snip..]

A simple solution that doesn't scale well is to create a file when the
access starts. You can check if the file exists and pause until the
other process deletes it - with a timeout in case the file gets left
there due to an error.

Obviously not an industrial strength solution, but it does work...


To strengthen the solution, write the process id of the script
(available via os.getpid()) to the file. If the file doesn't vanish before
your timeout, you can check to see if the process is still around, and
kill it.

<mike
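
A rough Python 3 sketch of the pid idea follows; write_pid_lock and lock_owner_alive are hypothetical names. It uses os.kill(pid, 0), which on Unix only checks whether the process exists and delivers no signal. It stops short of killing anything: deciding whether to remove a stale lock file or kill a stuck owner is left to the caller.

```python
import os


def write_pid_lock(lockfile):
    """Atomically create lockfile and record this process's id in it."""
    fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))


def lock_owner_alive(lockfile):
    """Return True if the pid stored in lockfile names a running process."""
    try:
        with open(lockfile) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        # No lock file, or no readable pid in it: report no live owner.
        # (Assumption: an empty file is treated like a stale one.)
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only
    except ProcessLookupError:
        return False     # no such process, so the lock is stale
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True
```

One caveat: pids get reused, so a live process with the recorded pid is not guaranteed to be the script that created the lock.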


Thanks - a good suggestion.

Regards,

Fuzzy
http://www.voidspace.org.uk/atlantib...thonutils.html
Jul 18 '05 #13
Tim Peters <ti********@gmail.com> wrote in message news:<ma**************************************@python.org>...
What you got above is
a "write conflict error", and is normal behavior. What happens:

- Process A loads revision n of some particular object O.
- Process B loads the same revision n of O.
- Process A modifies O, creating revision n+1.
- Process A commits its change to O. Revision n+1 is then current.
- Process B modifies O, creating revision n+2.
- Process B *tries* to commit its change to O.

The implementation of commit() investigates, and effectively says
"Hmm. Process B started with revision n of O, but revision n+1 is
currently committed. That means B didn't *start* with the currently
committed revision of O, so B has no idea what might have happened in
revision n+1 -- B may be trying to commit an insane change as a
result. Can't let that happen, so I'll raise ConflictError". That
line of argument makes a lot more sense if more than one object is
involved, but maybe it's enough to hint at the possible problems.

Anyway, since your store() method always picks on the root object,
you're going to get ConflictErrors frequently. It's bad application
design for a ZODB/ZEO app to have a "hot spot" like that.

In real life, all ZEO apps, and all multithreaded ZODB apps, always do
their work inside try/except structures. When a conflict error
occurs, the except clause catches it, and generally tries the
transaction again. In your code above, that isn't going to work well,
because there's a single object that's modified by every transaction
-- it will be rare for a commit() attempt not to give up with a
conflict error.

Perhaps paradoxically, it can be easier to get a real ZEO app working
well than one's first overly simple attempts -- ZODB effectively
*wants* you to scribble all over the database.


OK, I understand what you are saying, but I do not understand how I would
solve the problem. This is interesting to me since it has to do with a real
application I am working on. Maybe I should describe the setup.

We have an application where the users can interact with the system via
a Web interface (developed in Zope/Plone by other people) and via email. I am
doing the email part. We want the email part to be independent from the
Zope part, since it must act also as a safety belt (i.e. even if the Zope
server is down for any reason the email part must continue to work).

Moreover, people with slow connections may prefer the email interface
over the Zope/Plone interface, which is pretty heavyweight. So, it must be
there. We expect few emails to come in (<100 per hour), so I just
modified /etc/aliases and each mail is piped to a simple Python script which
parses it and stores the relevant information (who sent the email, the date,
the content, etc.).

Input coming via email or via the web interface should go into
the same database. Since we are using Zope anyway and there is not
much writing to do, we thought of using the ZODB, and specifically ZEO, to
keep it independent from the main Zope instance. We could use another
database if needed, but we would prefer to avoid additional dependencies
and installation issues.

The problem is that occasionally two emails (or an email and a web submission)
can arrive at the same time. At the moment I just catch the error and send
back an email saying "Sorry, there was an internal error. Please retry later".
This is rare but it happened during the testing phase. I would rather avoid
that. I thought about catching the exception and waiting a bit before retrying,
but I am not completely happy with that; I also tried a hand-coded solution
involving a lock file, but it was not 100% reliable. So I ask here whether the ZODB
has some smart way to solve that, or whether it is possible to change the design in
such a way as to avoid those concurrency issues as much as possible.

Another concern of mine is security. What happens if a malicious user
sends 10000 emails at the same time? Does the mail server (it can be postfix or
exim4) spawn tons of processes until we run out of memory and the server
crashes? How would I avoid that? I can think of various hackish solutions, but I
would like something reliable.

Any hints? Suggestions?

Thanks,
Michele Simionato
Jul 18 '05 #14
