I am attempting to build a server application that will execute
disparate modules provided by any number of sources (internal or
customer provided). These 'plug-ins' are simply assemblies with
classes that adhere to a well-known interface. The server's main job is
to load the 'plug-in', execute code, and make it 'agile' from a remote
management console. One of the critical features of the server
application is the ability to isolate the 'plug-ins' from each other in
a way that a fatal failure (i.e., an unhandled exception) in one module does
not result in the process space dying and subsequently terminating the
other 'plug-ins'.
My first crack at the server involved creating separate Application
Domain instances for each of the 'plug-ins'. This seemed logical at
the time because I was operating under the impression that I could
instantly get the isolation I needed (v1.1 worked this way). You also
get the benefit of being able to unload AppDomains, which would allow
me to dynamically upgrade the code running on one 'plug-in' without
affecting the operation of another. These separate AppDomains can be
running on single or multiple threads, which I found to be a good
option. This approach is perfectly suitable, except for the changes in
.NET 2.0 in regard to threading and AppDomain unhandled exception
'ignoring'. I actually think this change is for the better even though
it doesn't suit me on this particular project.
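
For anyone unfamiliar with the approach, here is a minimal sketch of the
per-plug-in AppDomain idea, assuming the .NET Framework (AppDomain.CreateDomain
is not supported on .NET Core and later). SamplePlugIn and the domain name
"PlugIn-1" are hypothetical stand-ins, not the real interface or naming used
in the server.

```csharp
using System;

// Hypothetical plug-in type. Deriving from MarshalByRefObject means the host
// receives a proxy, so the object itself lives in the child AppDomain.
public class SamplePlugIn : MarshalByRefObject
{
    public string Run()
    {
        return "ran in " + AppDomain.CurrentDomain.FriendlyName;
    }
}

public static class Program
{
    public static void Main()
    {
        // One AppDomain per plug-in (the name is an assumption for illustration).
        AppDomain domain = AppDomain.CreateDomain("PlugIn-1");
        try
        {
            // Create the plug-in inside the child domain and get a proxy to it.
            SamplePlugIn plugIn = (SamplePlugIn)domain.CreateInstanceAndUnwrap(
                typeof(SamplePlugIn).Assembly.FullName,
                typeof(SamplePlugIn).FullName);

            Console.WriteLine(plugIn.Run());
        }
        finally
        {
            // Unloading the domain is what allows the plug-in's code to be
            // replaced without restarting the process.
            AppDomain.Unload(domain);
        }
    }
}
```

Note that this only shows loading and unloading; it does nothing about an
unhandled exception on a thread running inside the child domain, which is
the problem described here.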
See these links for details on the AppDomain unhandled exception
discussion in .NET 2.0:
http://www.julmar.com/blog/mark/Perm...2f3cf1f1b.aspx
http://msdn2.microsoft.com/en-US/lib...exception.aspx
Implementing the <legacyUnhandledExceptionPolicy> 'workaround' is not
an option, as there is no guarantee that this would be supported in the
future, and I just don't find it a sound solution. Regardless, I've
been unable to get the workaround to work (Configuration Exceptions) so
it really doesn't appear to be a valid option.
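
For reference, the documented form of that setting is a single element
under <runtime> in the host executable's application configuration file.
The fragment below is only a sketch of that placement; whether a
misplaced element explains the Configuration Exceptions mentioned above
is an assumption, not something confirmed here.

```xml
<configuration>
  <runtime>
    <!-- Reverts to the .NET 1.x behavior for unhandled exceptions
         on non-main threads. -->
    <legacyUnhandledExceptionPolicy enabled="1" />
  </runtime>
</configuration>
```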
So, the question becomes: What is a good architecture for a server
such as this built in managed code?
I can only think of a few options:
Option 1: Implement tight and ubiquitous exception handling around the
known interface calls, outside of the 'plug-in' code: capture, report
and manage unhandled exceptions, and terminate misbehaving modules.
This doesn't at all guarantee that something
wouldn't slip through the cracks and result in process termination.
Option 2: Running 'plug-ins' in isolated processes. I don't feel this
is a valid option because of the resources required by each of the
'plug-ins' (I'm expecting possibly hundreds of 'plug-ins' per machine).
These resources include socket connections and the high frequency data
being delivered to them. It would be much more performant if the
'plug-ins' could share this data in the memory of a process, not
distributed via socket or IPC. This is a last resort, and even then,
the requirement of isolation would possibly be removed before using it.
Option 3: Enforce a certification process? I don't want to have to
review every line of code that could possibly be executed in the
server... not feasible and not 100% reliable.
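
To make Option 1 concrete, here is a rough sketch of a wrapper that sits
between the server and each plug-in. IPlugIn, GuardedPlugIn and
FailingPlugIn are hypothetical names invented for this illustration; the
real interface is not shown in this post. The wrapper catches anything
thrown by the interface call, reports it, and marks the plug-in as
faulted so the server stops calling it.

```csharp
using System;

// Hypothetical plug-in contract standing in for the real well-known interface.
public interface IPlugIn
{
    void Execute();
}

// Wraps every call into a plug-in so that an exception thrown by the
// interface call does not propagate into the host.
public sealed class GuardedPlugIn
{
    private readonly IPlugIn _inner;
    private readonly string _name;
    private bool _faulted;

    public GuardedPlugIn(string name, IPlugIn inner)
    {
        _name = name;
        _inner = inner;
    }

    public bool IsFaulted
    {
        get { return _faulted; }
    }

    // Returns true if the call completed, false if the plug-in threw
    // or was already marked as faulted.
    public bool TryExecute()
    {
        if (_faulted)
        {
            return false;
        }

        try
        {
            _inner.Execute();
            return true;
        }
        catch (Exception ex)
        {
            // Report the failure and stop calling this plug-in.
            _faulted = true;
            Console.Error.WriteLine("Plug-in '{0}' failed: {1}", _name, ex.Message);
            return false;
        }
    }
}

// A deliberately broken plug-in used only to exercise the wrapper.
public sealed class FailingPlugIn : IPlugIn
{
    public void Execute()
    {
        throw new InvalidOperationException("boom");
    }
}

public static class Program
{
    public static void Main()
    {
        GuardedPlugIn guarded = new GuardedPlugIn("sample", new FailingPlugIn());
        Console.WriteLine(guarded.TryExecute()); // prints "False"
        Console.WriteLine(guarded.IsFaulted);    // prints "True"
    }
}
```

This only protects calls made through the wrapper. An exception on a
thread that the plug-in starts itself never passes through TryExecute,
and a StackOverflowException cannot be caught at all, which is exactly
the "slip through the cracks" concern with Option 1.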
If anyone has any more options, I'm all ears.
Thanks for any comments-