design choice: multi-threaded / asynchronous wxpython client?

bullockbefriending bard

I am a complete ignoramus and newbie when it comes to designing and
coding networked clients (or servers for that matter). I have a copy
of Goerzen (Foundations of Python Network Programming) and once
pointed in the best direction should be able to follow my nose and get
things sorted... but I am not quite sure which is the best path to
take and would be grateful for advice from networking gurus.

I am writing a program to display horse racing tote odds in a desktop
client program. I have access to an HTTP (open one of several URLs,
and I get back an XML doc with some data... not XML-RPC.) source of
XML data which I am able to parse and munge with no difficulty at all.
I have written and successfully tested a simple command line program
which allows me to repeatedly poll the server and parse the XML. Easy
enough, but the real world production complications are:

1) The data for the race about to start updates every (say) 15
seconds, and the data for earlier and later races updates only every
(say) 5 minutes. There is no point for me to be hammering the server
with requests every 15 seconds for data for races after the upcoming
race... I should query for this perhaps every 150s to be safe. But for
the upcoming race, I must not miss any updates and should query every
~7s to be safe. So... in the middle of a race meeting the situation
might be:
race 1 (race done with, no-longer querying), race 2 (race done with,
no longer querying) race 3 (about to start, data on server for this
race updating every 15s, my client querying every 7s), races 4-8 (data
on server for these races updating every 5 mins, my client querying
every 2.5 mins)

2) After a race has started and betting is cut off and there are
consequently no more tote updates for that race (it is possible to
determine when this occurs precisely because of an attribute in the
XML data), I need to stop querying (say) race 3 every 7s and remove
race 4 from the 150s query group and begin querying its data every 7s.

3) I need to dump this data (for all races, not just current about to
start race) to text files, store it as BLOBs in a DB *and* update real
time display in a wxpython windowed client.

My initial thought was to have two threads for the different update
polling cycles. In addition I would probably need another thread to
handle UI stuff, and perhaps another for dealing with file/DB data
write out. But, I wonder if using Twisted is a better idea? I will
still need to handle some threading myself, but (I think) only for
keeping wxpython happy by doing all this other stuff off the main
thread + perhaps also persisting received data in yet another thread.

I have zero experience with these kinds of design choices and would be
very happy if those with experience could point out the pros and cons
of each (synchronous/multithreaded, or Twisted) for dealing with the
two differing sample rates problem outlined above.

Many TIA!

Jun 27 '08 #1

Subscribe Reply

2162

Larry Bates

bullockbefriend ing bard wrote:

I am a complete ignoramus and newbie when it comes to designing and
coding networked clients (or servers for that matter). I have a copy
of Goerzen (Foundations of Python Network Programming) and once
pointed in the best direction should be able to follow my nose and get
things sorted... but I am not quite sure which is the best path to
take and would be grateful for advice from networking gurus.

I am writing a program to display horse racing tote odds in a desktop
client program. I have access to an HTTP (open one of several URLs,
and I get back an XML doc with some data... not XML-RPC.) source of
XML data which I am able to parse and munge with no difficulty at all.
I have written and successfully tested a simple command line program
which allows me to repeatedly poll the server and parse the XML. Easy
enough, but the real world production complications are:

1) The data for the race about to start updates every (say) 15
seconds, and the data for earlier and later races updates only every
(say) 5 minutes. There is no point for me to be hammering the server
with requests every 15 seconds for data for races after the upcoming
race... I should query for this perhaps every 150s to be safe. But for
the upcoming race, I must not miss any updates and should query every
~7s to be safe. So... in the middle of a race meeting the situation
might be:
race 1 (race done with, no-longer querying), race 2 (race done with,
no longer querying) race 3 (about to start, data on server for this
race updating every 15s, my client querying every 7s), races 4-8 (data
on server for these races updating every 5 mins, my client querying
every 2.5 mins)

2) After a race has started and betting is cut off and there are
consequently no more tote updates for that race (it is possible to
determine when this occurs precisely because of an attribute in the
XML data), I need to stop querying (say) race 3 every 7s and remove
race 4 from the 150s query group and begin querying its data every 7s.

3) I need to dump this data (for all races, not just current about to
start race) to text files, store it as BLOBs in a DB *and* update real
time display in a wxpython windowed client.

My initial thought was to have two threads for the different update
polling cycles. In addition I would probably need another thread to
handle UI stuff, and perhaps another for dealing with file/DB data
write out. But, I wonder if using Twisted is a better idea? I will
still need to handle some threading myself, but (I think) only for
keeping wxpython happy by doing all this other stuff off the main
thread + perhaps also persisting received data in yet another thread.

I have zero experience with these kinds of design choices and would be
very happy if those with experience could point out the pros and cons
of each (synchronous/multithreaded, or Twisted) for dealing with the
two differing sample rates problem outlined above.

Many TIA!

IMHO using twisted will give you the best performance and framework. Since it
uses callbacks for every request, your machine could handle a LOT of different
external queries and keep everything updated in WX. Might be a little tricky to
get working with WX, but I recall Googling for something like this not long ago
and there appeared to be sufficient information on how to get working.

http://twistedmatrix.com/projects/co...g-reactor.html

Twisted even automatically uses threads to keep SQL database storage routines
from blocking (see Chapter 4 of Twisted Network Programming Essentials)

This is an ambitious project, good luck.

-Larry

Jun 27 '08 #2

Eric Wertman

HI, that does look like a lot of fun... You might consider breaking
that into 2 separate programs. Write one that's threaded to keep a db
updated properly, and write a completely separate one to handle
displaying data from your db. This would allow you to later change or
add a web interface without having to muck with the code that handles
data.

Jun 27 '08 #3

David

1) The data for the race about to start updates every (say) 15
seconds, and the data for earlier and later races updates only every
(say) 5 minutes. There is no point for me to be hammering the server
with requests every 15 seconds for data for races after the upcoming

Try using an HTTP HEAD instruction instead to check if the data has
changed since last time.

Jun 27 '08 #4

bullockbefriending bard

On Apr 27, 10:10*pm, David <wizza...@gmail .comwrote:

*1) The data for the race about to start updates every (say) 15
*seconds, and the data for earlier and later races updates only every
*(say) 5 minutes. There is *no point for me to be hammering the server
*with requests every 15 seconds for data for races after the upcoming

Try using an HTTP HEAD instruction instead to check if the data has
changed since last time.

Thanks for the suggestion... am I going about this the right way here?

import urllib2
request = urllib2.Request ("http://get-rich.quick.com" )
request.get_met hod = lambda: "HEAD"
http_file = urllib2.urlopen (request)

print http_file.heade rs

->>>
Age: 0
Date: Sun, 27 Apr 2008 16:07:11 GMT
Content-Length: 521
Content-Type: text/xml; charset=utf-8
Expires: Sun, 27 Apr 2008 16:07:41 GMT
Cache-Control: public, max-age=30, must-revalidate
Connection: close
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 1.1.4322
Via: 1.1 jcbw-nc3 (NetCache NetApp/5.5R4D6)

Date is the time of the server response and not last data update. Data
is definitely time of server response to my request and bears no
relation to when the live XML data was updated. I know this for a fact
because right now there is no active race meeting and any data still
available is static and many hours old. I would not feel confident
rejecting incoming data as duplicate based only on same content length
criterion. Am I missing something here?

Actually there doesn't seem to be too much difficulty performance-wise
in fetching and parsing (minidom) the XML data and checking the
internal (it's an attribute) update time stamp in the parsed doc. If
timings got really tight, presumably I could more quickly check each
doc's time stamp with SAX (time stamp comes early in data as one might
reasonably expect) before deciding whether to go the whole hog with
minidom if the time stamp has in fact changed since I last polled the
server.

But if there is something I don't get about HTTP HEAD approach, please
let me know as a simple check like this would obviously be a good
thing for me.

Jun 27 '08 #5

Jorge Godoy

bullockbefriend ing bard wrote:

3) I need to dump this data (for all races, not just current about to
start race) to text files, store it as BLOBs in a DB *and* update real
time display in a wxpython windowed client.

Why in a BLOB? Why not into specific data types and normalized tables? You
can also save the BLOB for backup or auditing, but this won't allow you to
use your DB to the best of its capabilities... It will just act as a data
container, the same as a network share (which would not penalize you too
much to have connections open/closed).

Jun 27 '08 #6

Jarkko Torppa

On 2008-04-27, David <wi******@gmail .comwrote:

>>
1) The data for the race about to start updates every (say) 15
seconds, and the data for earlier and later races updates only every
(say) 5 minutes. There is no point for me to be hammering the server
with requests every 15 seconds for data for races after the upcoming

Try using an HTTP HEAD instruction instead to check if the data has
changed since last time.

Get If-Modified-Since is still better
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html 14.25)

--
Jarkko Torppa

Jun 27 '08 #7

=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=

I think twisted is overkill for this problem. Threading, elementtree
and urllib should more than suffice. One thread polling the server for
each race with the desired polling interval. Each time some data is
treated, that thread sends a signal containing information about what
changed. The gui listens to the signal and will, if needed, update
itself with the new information. The database handler also listens to
the signal and updates the db.

2008/4/27, bullockbefriend ing bard <ki*******@gmai l.com>:

I am a complete ignoramus and newbie when it comes to designing and
coding networked clients (or servers for that matter). I have a copy
of Goerzen (Foundations of Python Network Programming) and once
pointed in the best direction should be able to follow my nose and get
things sorted... but I am not quite sure which is the best path to
take and would be grateful for advice from networking gurus.

I am writing a program to display horse racing tote odds in a desktop
client program. I have access to an HTTP (open one of several URLs,
and I get back an XML doc with some data... not XML-RPC.) source of
XML data which I am able to parse and munge with no difficulty at all.
I have written and successfully tested a simple command line program
which allows me to repeatedly poll the server and parse the XML. Easy
enough, but the real world production complications are:

1) The data for the race about to start updates every (say) 15
seconds, and the data for earlier and later races updates only every
(say) 5 minutes. There is no point for me to be hammering the server
with requests every 15 seconds for data for races after the upcoming
race... I should query for this perhaps every 150s to be safe. But for
the upcoming race, I must not miss any updates and should query every
~7s to be safe. So... in the middle of a race meeting the situation
might be:
race 1 (race done with, no-longer querying), race 2 (race done with,
no longer querying) race 3 (about to start, data on server for this
race updating every 15s, my client querying every 7s), races 4-8 (data
on server for these races updating every 5 mins, my client querying
every 2.5 mins)

2) After a race has started and betting is cut off and there are
consequently no more tote updates for that race (it is possible to
determine when this occurs precisely because of an attribute in the
XML data), I need to stop querying (say) race 3 every 7s and remove
race 4 from the 150s query group and begin querying its data every 7s.

3) I need to dump this data (for all races, not just current about to
start race) to text files, store it as BLOBs in a DB *and* update real
time display in a wxpython windowed client.

My initial thought was to have two threads for the different update
polling cycles. In addition I would probably need another thread to
handle UI stuff, and perhaps another for dealing with file/DB data
write out. But, I wonder if using Twisted is a better idea? I will
still need to handle some threading myself, but (I think) only for
keeping wxpython happy by doing all this other stuff off the main
thread + perhaps also persisting received data in yet another thread.

I have zero experience with these kinds of design choices and would be
very happy if those with experience could point out the pros and cons
of each (synchronous/multithreaded, or Twisted) for dealing with the
two differing sample rates problem outlined above.

Many TIA!

--
http://mail.python.org/mailman/listinfo/python-list

--
mvh Björn

Jun 27 '08 #8

bullockbefriending bard

On Apr 27, 11:27*pm, "BJörn Lindqvist" <bjou...@gmail. comwrote:

I think twisted is overkill for this problem. Threading, elementtree
and urllib should more than suffice. One thread polling the server for
each race with the desired polling interval. Each time some data is
treated, that thread sends a signal containing information about what
changed. The gui listens to the signal and will, if needed, update
itself with the new information. The database handler also listens to
the signal and updates the db.

So, if i understand you correctly:

Assuming 8 races and we are just about to start the race 1, we would
have 8 polling threads with the race 1 thread polling at faster rate
than the other ones. after race 1 betting closed, could dispense with
that thread, change race 2 thread to poll faster, and so on...? I had
been rather stupidly thinking of just two polling threads, one for the
current race and one for races not yet run... but starting out with a
thread for each extant race seems simpler given there then is no need
to handle the mechanics of shifting the polling of races from the
omnibus slow thread to the current race fast thread.

Having got my minidom parser working nicely, I'm inclined to stick
with it for now while I get other parts of the problem licked into
shape. However, I do take your point that it's probably overkill for
this simple kind of structured, mostly numerical data and will try to
find time to experiment with the elementtree approach later. No harm
at all in shaving the odd second off document parse times.

Jun 27 '08 #9

David

Date is the time of the server response and not last data update. Data

is definitely time of server response to my request and bears no
relation to when the live XML data was updated. I know this for a fact
because right now there is no active race meeting and any data still
available is static and many hours old. I would not feel confident
rejecting incoming data as duplicate based only on same content length
criterion. Am I missing something here?

It looks like the data is dynamically generated on the server, so the
web server doesn't know if/when the data changed. You will usually see
this for static content (images, html files, etc). You could go by the
Cache-Control line and only fetch data every 30 seconds, but it's
possible for you to miss some updates this way.

Another thing you could try (if necessary, this is a bit of an
overkill) - download the first part of the XML (GET request with a
range header), and check the timestamp you mentinoed. If that changed
then re-request the doc (a download resume is risky, the XML might
change between your 2 requests).

David.

Jun 27 '08 #10

Similar topics

2406

multi-instance and classic singleton design patterns

by: Neil Zanella | last post by:

Hello, I would be very interested in knowing how the following C++ multi-instance singleton (AKA Borg) design pattern based code snippet can be neatly coded in Python. While there may be somewhat unusual places where multi-instance singleton is more useful than plain singleton, it seems to me that the former leads to less coding, so unless...

Python

1954

Multi-paradigm design with C++

by: Florian Weimer | last post by:

I'd like to get up to speed with multi-paradigm design with C++. Can you recommend any books on this topic? I'd prefer something which is not tied to a particular compiler, treats ISO C++, and is not focused on GUI development and its specific problems. I've got a basic understanding of most C++ features (although I haven't used the...

C / C++

1359

design quandry ..

by: forums_mp | last post by:

I've come full circle on a design here. Consider the case where I've got two modes of operation, uni-cast or multi-cast. In either mode you can transmit(send) and/or receive. The distinction between the two modes amounts to a set of vendor APIs. When transmitting, to estabilish a connection in uni-cast mode I'll do: ...

C / C++

992

More design questions here..

by: Ivan Demkovitch | last post by:

Hi! This is a long one :-) Please correct me if I *did* something wrong and help me with my questions (at the end) I'm learning C# and .NET but good at SQL. I'm working on portal and writing Survey module. This module is a web control.

ASP.NET

1781

object oriented design

by: scottrm | last post by:

I am fairly new to oo design and I am looking at developing an object oriented asp.net application which will be built on top of a relational database. I have read quite a bit of the theory but find it hard to put it into practice. In particular I am confused in terms of interacting with the database. It seems to me classes map quite closely...

ASP.NET

1862

Store multi-choice groupbox selection in a field

by: dbuchanan | last post by:

I am designing a table to contain a field to stores a value that must indicate none, one, or many choices. The form will have a group box with checkboxes (multi-choice) You see the records in this table contain partially pre-configured records. The records serve as "master" data that will later be inserted into a data table. The master data...

.NET Framework

2689

Which database design tool

by: none | last post by:

Hi, I am looking for a tool to document and maintain a large data model (500 tables). Oracle Designer used to be the tool of choice for this sort of thing. But given the investment of effort we will be making, and the future of Designer, it is not the tool of choice for new applications. Can you recommend an alternative tool? Our...

Oracle Database

5052

design a forum that posts multiple choice questions

by: raiya | last post by:

hi, I'm a teacher and new ms access user. I'm intending to design an ms access db to post multiple choice questions each with 4 choices. I created 2 tables one for the questions and the other for the answers. The first has the fields: question_no - number question - text answer - memo --- gives the full solution The second has the fields:...

Microsoft Access / VBA

4830

Design difference Firefox<->IE

by: roN | last post by:

Hi, I'm creating a Website with divs and i do have some troubles, to make it looking the same way in Firefox and IE (tested with IE7). I checked it with the e3c validator and it says: " This Page Is Valid XHTML 1.0 Transitional!" but it still wouldn't look the same. It is on http://www.dvdnowkiosks.com/new/theproduct.php scroll down and...

HTML / CSS

1813

Not urgent - just learning - database design tips

by: thecheyenne | last post by:

Hi and good evening / good morning (depending on your location on this planet) Despite my limited - o.k, non-existing - knowledge of vba, I'd like to design a database to help with the admin of school outings. I pretty much know my way around MS access, but may well need vba for this thing here, so if you have any hints and tips for me, I'd be...

Microsoft Access / VBA

7451

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...

Windows Server

7720

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...

C / C++

7475

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...

Windows Server

5372

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...

Microsoft Access / VBA

5089

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...

C# / C Sharp

3483

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

1944

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

C# / C Sharp

1061

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP

766

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

General