Custom Thread Pool (by Mr. Jon Skeet) Enhancement

Hello,
I'm currently in the final stages of developing a highly scalable
server for my employer. Essentially it receives short socket-based
connections with an ASCII message, parses that message, does some
processing and then sends out a string reply on the same connection.

I'm using the asynchronous I/O completion port based socket methods in
.NET 2.0 to handle the comms side of things (giving extremely good
performance) and the venerable Mr. Skeet's custom threadpool to execute
the processing portion of the system. My problem is that not all of the
messages are created equal. Some simply do a very quick query from a DB
(or even a cache), whereas others do a credit card authorisation or
some longer SQL work.

Because of this I've tried to separate my messages into two classes,
short running and long running, running them on two separate
threadpools. It's become very hard to manage how many threads I allocate
to each. As all messages involve some I/O blocking, I want to avoid
starving the pools as much as possible to keep latency low.

Based on this article:
http://blogs.msdn.com/cbrumme/archiv.../21/77595.aspx (Threading
and Synchronization, near the top - he explains the concept much better
than I ever could), I've been thinking about expanding Jon's ThreadPool
to automatically determine its own maximum number of threads to keep
CPU usage high (by increasing the number of threads) whilst not choking
the machine to death with too many context switches. This is apparently
a big simplification of what MS SQL Server does internally, and I'd
like to replace my two pools with just one of this nature.

Essentially I'd just like anyone's comments (especially yours Jon!) as
to whether this is worthwhile, or how to go about it - I've got some
ideas of my own:
1. On completion of each work item, check the CPU usage and decide whether
to create a new thread, keep the thread count the same, or suspend the
thread?
2. Add a timer (100ms?) that checks the CPU usage periodically and sets the
max and min thread values? How does CPU usage work with multiple CPUs?
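
A minimal sketch of idea 2, written in Python purely for illustration (the pool under discussion is C#). It assumes CPU usage arrives as one 0-100 reading per core and averages those into a single machine-wide figure, which is one possible answer to the multi-CPU question. The names `Limits`, `average_cpu` and `next_thread_target`, and the 70/90 thresholds, are hypothetical and not part of any existing thread pool:

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """Hard bounds the controller may never cross (assumed values)."""
    min_threads: int = 2
    max_threads: int = 64


def average_cpu(per_core_percent):
    """Collapse per-core readings (0-100 each) into one machine-wide figure."""
    return sum(per_core_percent) / len(per_core_percent)


def next_thread_target(current, cpu_percent, queued, limits,
                       low_water=70.0, high_water=90.0):
    """Return the thread count to aim for after one sampling tick."""
    if cpu_percent < low_water and queued > 0:
        # CPU has headroom while work waits: threads are probably blocked on I/O.
        target = current + 1
    elif cpu_percent > high_water:
        # CPU is saturated: extra threads would mostly add context switches.
        target = current - 1
    else:
        target = current
    # Clamp so the pool never leaves its configured range.
    return max(limits.min_threads, min(limits.max_threads, target))
```

A timer callback would call `next_thread_target` on every tick and apply the result as the new thread ceiling. Moving by one thread per tick should damp oscillation, at the cost of reacting slowly to sudden changes in load.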

It's getting to be a bit of a pain, as I currently have to tune the system
to the machine it runs on and the type of load it encounters!

Thanks everyone!

Nov 17 '05 #1
Kieran Benton <ki**********@fastmail.fm> wrote:
> I'm currently in the final stages of developing a highly scalable
> server for my employer. Essentially it receives short socket-based
> connections with an ASCII message, parses that message, does some
> processing and then sends out a string reply on the same connection.
>
> I'm using the asynchronous I/O completion port based socket methods in
> .NET 2.0 to handle the comms side of things (giving extremely good
> performance) and the venerable Mr. Skeet's custom threadpool to execute
> the processing portion of the system. My problem is that not all of the
> messages are created equal. Some simply do a very quick query from a DB
> (or even a cache), whereas others do a credit card authorisation or
> some longer SQL work.
>
> Because of this I've tried to separate my messages into two classes,
> short running and long running, running them on two separate
> threadpools. It's become very hard to manage how many threads I allocate
> to each. As all messages involve some I/O blocking, I want to avoid
> starving the pools as much as possible to keep latency low.
Right. Could you not use asynchronous IO again, and keep the thread pool
for actual processing, so that any thread which is in the pool is
either running or available, not blocking in a useless way? It's no
doubt relatively tricky to write the code that way, but it sounds like
the way to get the maximum performance out.
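
To illustrate the suggestion, here is a small sketch in Python (not C#, and not code from the server in question). `asyncio.sleep` stands in for an asynchronous DB or socket call, and `handle`, `handle_all` and `timed_run` are hypothetical names. While one message waits on I/O, the same thread carries on with the others:

```python
import asyncio
import time


async def handle(message, io_seconds=0.05):
    """Process one message without blocking the thread during I/O."""
    # Stand-in for an asynchronous DB or socket call (assumption).
    await asyncio.sleep(io_seconds)
    return message.upper()


async def handle_all(messages):
    """Start every handler, then collect replies in submission order."""
    return await asyncio.gather(*(handle(m) for m in messages))


def timed_run(messages):
    """Return (replies, elapsed seconds) for one batch of messages."""
    start = time.perf_counter()
    replies = asyncio.run(handle_all(messages))
    return replies, time.perf_counter() - start
```

Because the waits overlap, a batch of 20 messages should finish in roughly the time of one wait rather than 20, without any extra threads.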
> Based on this article:
> http://blogs.msdn.com/cbrumme/archiv.../21/77595.aspx (Threading
> and Synchronization, near the top - he explains the concept much better
> than I ever could), I've been thinking about expanding Jon's ThreadPool
> to automatically determine its own maximum number of threads to keep
> CPU usage high (by increasing the number of threads) whilst not choking
> the machine to death with too many context switches. This is apparently
> a big simplification of what MS SQL Server does internally, and I'd
> like to replace my two pools with just one of this nature.
Have you tried just using a relatively large pool, and measuring how
much time is actually spent context switching? Do you definitely have a
problem?
> Essentially I'd just like anyone's comments (especially yours Jon!) as
> to whether this is worthwhile, or how to go about it - I've got some
> ideas of my own:
> 1. On completion of each work item, check the CPU usage and decide whether
> to create a new thread, keep the thread count the same, or suspend the
> thread?
> 2. Add a timer (100ms?) that checks the CPU usage periodically and sets the
> max and min thread values? How does CPU usage work with multiple CPUs?
>
> It's getting to be a bit of a pain, as I currently have to tune the system
> to the machine it runs on and the type of load it encounters!


You certainly could take either of those approaches, but I don't know
whether they'd really do what you want them to. I'd definitely try
profiling with a few different sizes of threadpool first.
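
A rough sketch of that kind of profiling run, in Python for illustration only. It measures wall-clock time per pool size, not context switches directly, and `blocking_job`, `time_pool` and `sweep` are hypothetical names. The job here only sleeps, so it models the I/O-bound messages; a CPU-bound job would need a different stand-in:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_job(seconds=0.01):
    """Stand-in for a message whose handler mostly waits on I/O."""
    time.sleep(seconds)
    return seconds


def time_pool(pool_size, job_count, job=blocking_job):
    """Run job_count jobs on pool_size threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(job) for _ in range(job_count)]
        for future in futures:
            future.result()  # re-raises any exception from the worker
    return time.perf_counter() - start


def sweep(pool_sizes, job_count):
    """Map each candidate pool size to the wall-clock time it needed."""
    return {size: time_pool(size, job_count) for size in pool_sizes}
```

Calling `sweep([1, 4, 16, 64], job_count=64)` and comparing the timings against a recorded production mix would show whether the larger sizes actually cost anything before any adaptive logic is written.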

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Nov 17 '05 #2
Hi Jon,
I'm certain we do have a problem. Sometimes we are processing
exclusively short-lived, high CPU/low blocking messages, with which a
large threadpool is less efficient; at other times low CPU/long
blocking messages, where a larger threadpool is more performant. Or a
mixture of the two. We've done plenty of tests on production servers
with built-in performance metrics and performance counters collecting
our data.

As to using async IO, I assume you mean the new stuff for DB access? We
are using this where possible, although as you say it is a royal pain
in the arse for some situations! Unfortunately one of our longest
running (and most common) messages involves using a COM object whose
internals I have no control over, so I'm kind of SOA with that. I agree
though that async DB access is the way to go wherever it is humanly
possible.

I'm interested in your thoughts on when in your TP to make a decision
over whether to create a new thread and whether to destroy (or maybe
even just suspend?) a thread. Another niggle in my mind is whether to
work out some kind of metric in terms of remaining memory and CPU, or
whether to just monitor the CPU. Hopefully I'll get started on this
soon and be able to post some benchmark results.

--
Kieran Benton

Nov 17 '05 #3
Kieran Benton <ki**********@fastmail.fm> wrote:
> I'm certain we do have a problem. Sometimes we are processing
> exclusively short-lived, high CPU/low blocking messages, with which a
> large threadpool is less efficient; at other times low CPU/long
> blocking messages, where a larger threadpool is more performant. Or a
> mixture of the two. We've done plenty of tests on production servers
> with built-in performance metrics and performance counters collecting
> our data.
Right. Oh dear :(
> As to using async IO, I assume you mean the new stuff for DB access? We
> are using this where possible, although as you say it is a royal pain
> in the arse for some situations! Unfortunately one of our longest
> running (and most common) messages involves using a COM object whose
> internals I have no control over, so I'm kind of SOA with that. I agree
> though that async DB access is the way to go wherever it is humanly
> possible.
Yes - although it does tend to be a pig to code.
> I'm interested in your thoughts on when in your TP to make a decision
> over whether to create a new thread and whether to destroy (or maybe
> even just suspend?) a thread. Another niggle in my mind is whether to
> work out some kind of metric in terms of remaining memory and CPU, or
> whether to just monitor the CPU. Hopefully I'll get started on this
> soon and be able to post some benchmark results.


Hmm. Yes, it sounds like a CPU monitor *might* work. If I were you, I'd
try adapting my threadpool to have some kind of interface which it asks
about whether or not to kill a thread and whether or not to create one
- then one could have different policies for different situations.
Easier said than done, of course...
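
One way such a policy interface might look, sketched in Python for illustration rather than as the actual custom threadpool API. Every name here (`PoolSnapshot`, `ThreadPolicy`, `FixedSizePolicy`, `CpuAwarePolicy`) and the 90% threshold are assumptions made for the example:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSnapshot:
    """State the pool hands to the policy each time it asks a question."""
    threads: int        # worker threads currently alive
    idle_threads: int   # workers waiting for a work item
    queued_items: int   # work items not yet picked up
    cpu_percent: float  # machine-wide CPU usage, 0-100


class ThreadPolicy(ABC):
    """Questions the pool asks; each implementation is one tuning strategy."""

    @abstractmethod
    def should_create_thread(self, snap: PoolSnapshot) -> bool: ...

    @abstractmethod
    def should_kill_thread(self, snap: PoolSnapshot) -> bool: ...


class FixedSizePolicy(ThreadPolicy):
    """Reproduces a hand-tuned pool: always steer towards `size` threads."""

    def __init__(self, size):
        self.size = size

    def should_create_thread(self, snap):
        return snap.threads < self.size

    def should_kill_thread(self, snap):
        return snap.threads > self.size


class CpuAwarePolicy(ThreadPolicy):
    """Grows while work waits and CPU has headroom; shrinks when idle."""

    def __init__(self, min_threads, max_threads, busy_cpu=90.0):
        self.min_threads = min_threads
        self.max_threads = max_threads
        self.busy_cpu = busy_cpu

    def should_create_thread(self, snap):
        return (snap.queued_items > 0
                and snap.idle_threads == 0
                and snap.threads < self.max_threads
                and snap.cpu_percent < self.busy_cpu)

    def should_kill_thread(self, snap):
        surplus = snap.idle_threads > 0 and snap.queued_items == 0
        return surplus and snap.threads > self.min_threads
```

The pool would consult the policy when a work item is queued or completed. Swapping `FixedSizePolicy` for `CpuAwarePolicy` then changes the tuning behaviour without touching the pool itself.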

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Nov 17 '05 #4
> > I'm certain we do have a problem, sometimes we are processing
> > exclusively short-lived, high CPU/low blocking messages, with which a
> > large threadpool is less efficient; at other times low CPU/long
> > blocking messages, where a larger threadpool is more performant. Or a
> > mixture of the two. We've done plenty of tests on production servers
> > with built-in performance metrics and performance counters collecting
> > our data.
> Right. Oh dear :(


Yup, it's a tricky one, isn't it? :) It's not actually too bad balancing it
by hand - I would just like a more flexible solution that responds to
the type of load it's under.
> > As to using async IO, I assume you mean the new stuff for DB access? We
> > are using this where possible, although as you say it is a royal pain
> > in the arse for some situations! Unfortunately one of our longest
> > running (and most common) messages involves using a COM object whose
> > internals I have no control over, so I'm kind of SOA with that. I agree
> > though that async DB access is the way to go wherever it is humanly
> > possible.
> Yes - although it does tend to be a pig to code.


Absolutely - the semantics of having to break quite a simple processing
block up into async methods... Well - yuck! It would be really quite
nice if you could have an async method run in place, yield the thread to
the pool for some other processing and then ask for one back again when
it needs it. Not going to happen of course - mainly as that is a gross
simplification! :)
> > I'm interested in your thoughts on when in your TP to make a decision
> > over whether to create a new thread and whether to destroy (or maybe
> > even just suspend?) a thread. Another niggle in my mind is whether to
> > work out some kind of metric in terms of remaining memory and CPU, or
> > whether to just monitor the CPU. Hopefully I'll get started on this
> > soon and be able to post some benchmark results.

> Hmm. Yes, it sounds like a CPU monitor *might* work. If I were you, I'd
> try adapting my threadpool to have some kind of interface which it asks
> about whether or not to kill a thread and whether or not to create one
> - then one could have different policies for different situations.
> Easier said than done, of course...


Cheers for your advice Jon, your input is much appreciated. I think I
will go down that route!

---
Kieran Benton
(www.kieranbenton.com)

Nov 17 '05 #5
Kieran Benton <ki**********@fastmail.fm> wrote:
> > > I'm certain we do have a problem. Sometimes we are processing
> > > exclusively short-lived, high CPU/low blocking messages, with which a
> > > large threadpool is less efficient; at other times low CPU/long
> > > blocking messages, where a larger threadpool is more performant. Or a
> > > mixture of the two. We've done plenty of tests on production servers
> > > with built-in performance metrics and performance counters collecting
> > > our data.

> > Right. Oh dear :(

> Yup, it's a tricky one, isn't it? :) It's not actually too bad balancing it
> by hand - I would just like a more flexible solution that responds to
> the type of load it's under.


Yes. That sort of heuristic approach is usually a pain to implement, of
course.
> > > As to using async IO, I assume you mean the new stuff for DB access? We
> > > are using this where possible, although as you say it is a royal pain
> > > in the arse for some situations! Unfortunately one of our longest
> > > running (and most common) messages involves using a COM object whose
> > > internals I have no control over, so I'm kind of SOA with that. I agree
> > > though that async DB access is the way to go wherever it is humanly
> > > possible.

> > Yes - although it does tend to be a pig to code.

> Absolutely - the semantics of having to break quite a simple processing
> block up into async methods... Well - yuck! It would be really quite
> nice if you could have an async method run in place, yield the thread to
> the pool for some other processing and then ask for one back again when
> it needs it. Not going to happen of course - mainly as that is a gross
> simplification! :)


It would be interesting to try to develop a language extension which
allowed that. In some ways it would be akin to the "yield" in C# 2.0
for implementing iterators...
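
The iterator analogy can be sketched with generators, here in Python for illustration only. Each handler is written as one straight-line block, and every `yield` marks a point where real code would wait on I/O and hand its thread back. `handler` and `run_all` are hypothetical names, and the round-robin loop is a simplification: a real scheduler would resume a task when its I/O completes, not on the next turn:

```python
from collections import deque


def handler(name, steps, log):
    """A message handler written as one straight-line block."""
    for step in range(steps):
        log.append(f"{name}:{step}")
        yield  # hand the thread back to the scheduler at a blocking point


def run_all(tasks):
    """Round-robin scheduler: resume each task until its next yield."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, so drop it
        ready.append(task)  # still has work left, queue it again
```

The handlers interleave on a single thread, yet each one still reads top to bottom instead of being split into callbacks.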
> > > I'm interested in your thoughts on when in your TP to make a decision
> > > over whether to create a new thread and whether to destroy (or maybe
> > > even just suspend?) a thread. Another niggle in my mind is whether to
> > > work out some kind of metric in terms of remaining memory and CPU, or
> > > whether to just monitor the CPU. Hopefully I'll get started on this
> > > soon and be able to post some benchmark results.

> > Hmm. Yes, it sounds like a CPU monitor *might* work. If I were you, I'd
> > try adapting my threadpool to have some kind of interface which it asks
> > about whether or not to kill a thread and whether or not to create one
> > - then one could have different policies for different situations.
> > Easier said than done, of course...

> Cheers for your advice Jon, your input is much appreciated. I think I
> will go down that route!


Best of luck - and if the IP doesn't prevent you from showing the code
afterwards, I'd be interested in bringing this flexibility into the
custom threadpool code.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Nov 17 '05 #6
