Hi,
I've got a small problem with my Python script. It is a CGI script, which is
called regularly (e.g. every 5 minutes) and returns an XML data structure.
This script calls a very slow function, with a duration of 10-40 seconds. To
avoid delays, I inserted a cache for the data. So, if the script is called,
it returns the last calculated data structure, and then the function is called
again and the new data is stored in the cache. (There is no problem with using
older but faster data.)
My problem is that the client (a Java program, browser, or command line)
waits until the whole script has ended, and so the cache is worthless. How
can I tell the client/browser/... that after the last print line there is no
more data and it can proceed? Or how can I tell the Python script that
everything after the return of the data (the retrieval of the new data and
its storage in a file) can be done in another thread or in the background?
Greetings
Ralph
Ralph Sluiters wrote in message ...
> [original question quoted above, snipped]
Wouldn't a better approach be to decouple the cache mechanism from the cgi
script? Have a long-running Python process act as a memoizing cache and
delegate requests to the slow function. The cgi scripts then connect to
this cache process (via your favorite IPC mechanism). If the cache process
has a record of the call/request, it returns the previous value immediately,
and updates its cache in the meantime. If it doesn't have a record, then it
blocks the cgi script until it gets a result.
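Ignoring the IPC layer, the core of such a cache process might look like this minimal sketch (Python 3; `MemoizingCache` and the serve-stale-then-refresh policy are my names for the behaviour described above, not code from this thread):

```python
import threading

class MemoizingCache:
    """Serve the last known value immediately; refresh it in the background.

    A request for an unknown key blocks until the slow function returns;
    a request for a known key gets the cached (possibly stale) value at
    once while a background thread recomputes it.
    """

    def __init__(self, slow_func):
        self.slow_func = slow_func
        self.values = {}
        self.refreshing = {}          # key -> in-flight refresh thread
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            if key in self.values:
                stale = self.values[key]
                # Kick off at most one background refresh per key.
                if key not in self.refreshing:
                    t = threading.Thread(target=self._refresh, args=(key,))
                    self.refreshing[key] = t
                    t.start()
                return stale
        # Unknown key: compute synchronously (the caller must wait once).
        value = self.slow_func(key)
        with self.lock:
            self.values[key] = value
        return value

    def _refresh(self, key):
        value = self.slow_func(key)   # the slow call, off the request path
        with self.lock:
            self.values[key] = value
            del self.refreshing[key]

    def wait_for_refreshes(self):
        """Block until pending refreshes finish (handy for shutdown)."""
        with self.lock:
            threads = list(self.refreshing.values())
        for t in threads:
            t.join()
```

A real version would sit in the long-running process and talk to the CGI scripts over your IPC mechanism of choice; the sketch only shows the caching policy.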
How can threading help you if the cgi-process dies after each request unless
you store the value somewhere else? And if you store the value somewhere,
why not have another process manage that storage? If it's possible to
output a complete page before the cgi script terminates (I don't know if the
server blocks until the script terminates), then you could do the cache
updating afterwards. In this case I guess you could use a pickled
dictionary or something as your cache, and you don't need a separate
process. But even here you wouldn't necessarily use threads.
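The pickled-dictionary variant could be as simple as this sketch (the cache file location is an assumption, and a real CGI setup would also need file locking against concurrent requests):

```python
import os
import pickle
import tempfile

# Assumed location for the cache file; a real script would pick a fixed path.
CACHE_FILE = os.path.join(tempfile.gettempdir(), "folder_cache.pickle")

def load_cache():
    """Return the cached dict, or an empty one if no cache exists yet."""
    try:
        with open(CACHE_FILE, "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError):
        return {}

def save_cache(cache):
    """Write to a temp file and rename, so readers never see a half-write."""
    tmp = CACHE_FILE + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(cache, f)
    os.replace(tmp, CACHE_FILE)
```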
Threads are up there with regexps: powerful, but avoid as much as possible.
--
Francis Avila
> Wouldn't a better approach be to decouple the cache mechanism from the
> cgi script? [snip]
The caching cannot be decoupled, because the CGI script gets a folder ID and
only gets data from this "folder". So if I decouple the processes, I don't
know which folders to cache, and I cannot cache all folders, because the
routine is too slow. So I must get the current folder from the CGI script,
cache that one as long as the user is in this folder and pulls data every 2
minutes, and cache another folder if the user changes his folder.
> How can threading help you if the cgi-process dies after each request
> unless you store the value somewhere else? [snip]
The data is too large to store in memory, and with that method, as you
said, threading wouldn't help, but I store the data on disk.
My code:

import string

# Read the cached result from file
try:
    oldfile = open(filename, "r")
    oldresult = string.joinfields(oldfile.readlines(), '\r\n')
    oldfile.close()
except IOError:
    # No cache yet: start the slow routine
    oldresult = get_data(ID)  # Get XML data

# Print header, so that it is returned via HTTP
print string.joinfields(header, '\r\n')
print oldresult

# ***
# Start the slow routine
result = get_data(ID)  # Get XML data

# Save to file
newfile = open(filename, "w")
newfile.writelines(result)
newfile.close()
# END
At the position *** the rest of the script must be uncoupled, so that the
client can proceed with the current data, while the new data for the next
request is generated and stored in a file.
Ralph
Ralph Sluiters fed this fish to the penguins on Tuesday 06 January 2004
02:07 am:
> The caching can not be decoupled, because the cgi-script gets an folder
> ID gets only data from this "folder". [snip]
I've been having some difficulty following this thread but...
Isn't this what Cookies are for? Obtaining some sort of user ID/state
that can be passed into the processing to allow for continuing from a
previous connection?
HTTP is normally stateless. The client requests a page, the page
contents are obtained (either a static page, or some CGI-style
computation generates the immediate page data), the page is returned,
and the connection ends. If the page needs to be updated, that is a
completely separate transaction.
Cookies are used to link these separate transactions into one "whole";
the first time the client requests the page, a cookie is generated. On
subsequent requests (updates) the (now) existing cookie is sent back to
the server to identify the user and allow for selecting the proper
continuation state.

> At the position *** the rest of the script must be uncoupled ... [snip]
I've not coded CGI stuff (don't have access to a server that permits
user CGI) but my rough view of this task would be:
CGI ******
    if no cookie:
        generate a cookie for this user
    pass the (received or generated) cookie to the background process
    wait for return data from the background process (if a new cookie, this
        will take time to compute; otherwise the background process should
        already have computed it)
    return web page with cookie and data

Background ********
    loop:
        scan "cache" list for expired cookies (unused threads)
            terminate the related processing thread (the thread should
                clean up any disk files it used)
            delete the cookie from the "cache" list
        get request (and cookie) from CGI
        if the cookie is not in the "cache" list:
            create a new processing thread
        use the cookie to identify the (existing) processing thread and
            read the next data batch from it (a Queue.Queue perhaps, one
            queue per cookie)
        return data (the processing thread continues to compute the next
            update)
    endloop
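The per-cookie queue from the outline could be sketched like this (Python 3's `queue` module is the modern spelling of `Queue.Queue`; the batch producer here is a stand-in for the real computation):

```python
import queue
import threading

workers = {}   # cookie -> (thread, queue)

def produce_batches(folder, out_q):
    """Stand-in processing thread: compute successive data batches."""
    for n in range(1, 4):               # bounded so the sketch terminates
        out_q.put("batch %d for %s" % (n, folder))

def next_batch(cookie, folder):
    """Return the next batch for this cookie, starting a worker if needed."""
    if cookie not in workers:
        q = queue.Queue()
        t = threading.Thread(target=produce_batches, args=(folder, q))
        t.start()
        workers[cookie] = (t, q)
    return workers[cookie][1].get()     # blocks until a batch is ready
```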
You probably want to include, in "Background" a bit of logic to track
"last request time" and terminate processing threads if no client has
asked for an update in some period of time. The Cookies should also
have expiration times associated so that reconnecting after a period of
time will force a new cookie.
As for the folder? If the user physically navigates to other folders,
that can be passed to the background process and used to update the
threads (or create a new thread, if you assume the cookie identifies a
folder).
Caching would be semi-automatic here. The processing threads could be
folder specific, and when the thread is terminated (on lack of update
requests... let's see, you expect 2-minute update period, allow for a
slow net, say you terminate a process after 5 minutes of disuse...) you
can clean up the disk space (folder) that process was using. The cookie
expiration time would be updated on each update.
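The "last request time" bookkeeping might look like this sketch (the 5-minute timeout is the figure suggested above; the names are assumptions):

```python
import time

DISUSE_TIMEOUT = 5 * 60   # seconds; "terminate after 5 minutes of disuse"

last_request = {}         # cookie -> time of the last request

def touch(cookie, now=None):
    """Record that this cookie just made a request."""
    last_request[cookie] = time.time() if now is None else now

def expired_cookies(now=None):
    """Cookies whose threads (and disk space) can be cleaned up."""
    now = time.time() if now is None else now
    return [c for c, t in last_request.items() if now - t > DISUSE_TIMEOUT]
```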
The master web page should have whatever HTML tags force a timed
reload to do a new request every 2 minutes.
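One way to force the 2-minute reload from the CGI side is a `Refresh` response header (non-standard but widely honored by browsers; a `<meta http-equiv="refresh" content="120">` tag in the page itself works too):

```python
# Header lines a CGI script could emit so the browser re-requests the
# page every 120 seconds. The empty string ends the HTTP header block.
header = [
    "Content-Type: text/xml",
    "Refresh: 120",
    "",
]
print("\r\n".join(header))
```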
--
wl*****@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG
wu******@dm.net | Bestiaria Support Staff
Bestiaria Home Page: http://www.beastie.dm.net/
Home Page: http://www.dm.net/~wulfraed/
You did everything but answer my question. I know what cookies are, but
I don't need cookies here. And you said in your answer "start background
process"; that was my question: how can I start a background process?
But I've solved it now,
Ralph
Simply put the last part in an extra file 'cachedata.py', then use

import os
os.spawnlp(os.P_NOWAIT, 'python', 'python', 'cachedata.py')

to start it as a child process WITHOUT waiting for it to finish.
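A hypothetical `cachedata.py` for that child process would then just redo the slow part; `get_data`, `ID`, and `filename` are stand-ins matching the earlier code, not the real routine:

```python
# cachedata.py -- child process that refreshes the cache file in the
# background after the CGI request has already returned its data.
import sys

def get_data(folder_id):
    # Placeholder for the real 10-40 second routine.
    return "<data id='%s'/>" % folder_id

def refresh(folder_id, filename):
    result = get_data(folder_id)      # the slow call
    with open(filename, "w") as f:    # overwrite the cache for next time
        f.write(result)

if __name__ == "__main__" and len(sys.argv) > 2:
    refresh(sys.argv[1], sys.argv[2])
```

Note that the spawned child inherits the parent's stdout; if the web server waits for that descriptor to close, the child may need to close or redirect it before the client is actually released.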
Ralph