FrankEsser wrote:
Hello!
I am not an expert on C++ programming and therefore I have a question:
We use a kind of communication server that was written in C++
especially for our company. It just takes incoming socket requests,
connects, evaluates incoming data packages and gives them to certain
plug-ins addressed in the frame of the data packages. A separate thread
is started to handle each data package. The data packages can
arrive faster than the processing in the plug-ins so there is a kind of
simple thread pool holding the data packages until they are processed.
The system was quite stable for many years and now we suddenly have a
strange effect: Some data packages are sent by network clients but they
are delivered to the corresponding plug-ins with a delay of 2 to 15
minutes!
The system performance shows that there is no overload (about 20%
processor usage in task manager).
It seems that the effect occurs in the thread pool.
I'm not sure what you mean. The thread pool is threads waiting for work.
Is the work queued in FIFO order?
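
To make the FIFO question concrete, here is a minimal sketch of a strictly
first-in, first-out work queue. FifoQueue and its member names are made up
for illustration and are not taken from your server. If the pool hands out
work in some other order (say, a stack), old packages can sit at the bottom
for a long time while newer ones keep getting served.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical FIFO work queue: producers push at the back,
// worker threads pop from the front, so the oldest item always goes first.
template <typename T>
class FifoQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(item));
        }
        ready_.notify_one();  // wake one waiting worker
    }

    // Blocks until an item is available or the queue is closed.
    // Returns false only when the queue is closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    // After close(), workers drain what is left and then see pop() fail.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<T> items_;
    bool closed_ = false;
};
```

A worker would loop on pop() and hand each package to its plug-in until
pop() returns false.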
The only difference from all the other systems we are running is that this
particular system is a server with real dual-processor hardware.
My question: are there any known issues like this on dual-processor
systems?
If not, does anybody have any idea of what could cause our problems?
Any answer is welcome!!!
How many threads can you have running concurrently? Usually it's the size
of your thread pool. If you have too many threads, starvation can occur,
i.e. not all threads will make forward progress in a timely manner. This
is a problem with scalability of the system scheduler w.r.t. the number of
threads it can run concurrently. Try having fewer threads in the thread
pool or change the scheduling policy to SCHED_RR or SCHED_FIFO if they're
supported on your platform. The other thing is that adaptive mutexes with
SCHED_OTHER may have scalability problems even if the thread pool size
isn't large enough to cause scalability problems otherwise. If you go to
one of the other scheduling policies, that may help.
SCHED_RR and SCHED_FIFO have more overhead and will reduce overall throughput.
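
If your platform has POSIX threads, requesting one of those policies looks
roughly like the sketch below. The function name configure_round_robin is
made up, and picking the lowest SCHED_RR priority is just an assumption for
the example. This only prepares the thread attributes; it does not apply to
a non-POSIX threading API.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: fills *attr so that a thread created with it asks for
// SCHED_RR at the lowest priority that policy allows.
// Returns 0 on success, otherwise a nonzero error code.
int configure_round_robin(pthread_attr_t* attr) {
    int rc = pthread_attr_init(attr);
    if (rc != 0) {
        return rc;
    }

    // Without this, the new thread inherits the creating thread's policy
    // and the policy/priority set below are ignored.
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0) {
        rc = pthread_attr_setschedpolicy(attr, SCHED_RR);
    }
    if (rc == 0) {
        int prio = sched_get_priority_min(SCHED_RR);
        if (prio == -1) {
            rc = -1;  // policy not supported here
        } else {
            sched_param param{};
            param.sched_priority = prio;
            rc = pthread_attr_setschedparam(attr, &param);
        }
    }

    if (rc != 0) {
        pthread_attr_destroy(attr);
    }
    return rc;
}
```

You would pass the attribute object to pthread_create() for each pool thread.
On many systems creating a real-time thread needs extra privileges, so check
the return value of pthread_create() and fall back to the default policy if
it fails.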
If you have to go to fewer threads and your thread per connection is
i/o bound, you might want to go to a non-blocking i/o model where the
session is kept as a user-defined, lighter-weight task and you write
your own scheduling mechanism (simple FIFO queue) to schedule i/o and work
on the sessions. Some of the web servers and other server apps use that
strategy.
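
A rough sketch of that idea, assuming a POSIX system with poll(): each
connection is a plain Session struct instead of a thread, and one loop
services whichever sessions are readable in FIFO order. Session,
set_nonblocking and run_sessions are names invented for this example, and
a real server would parse the data and call the plug-ins where this one
just appends bytes.

```cpp
#include <cerrno>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

// Hypothetical lightweight session: one per connection, no thread of its own.
struct Session {
    int fd;                // descriptor already in non-blocking mode
    std::string received;  // bytes read so far
    bool done;             // true once end-of-file or an error is seen
};

// Puts a descriptor into non-blocking mode; returns false on failure.
bool set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Single-threaded scheduler: runs until every session is done.
void run_sessions(std::vector<Session>& sessions) {
    std::size_t open = 0;
    for (const Session& s : sessions) {
        if (!s.done) ++open;
    }

    std::deque<std::size_t> ready;  // FIFO of session indexes with pending i/o
    while (open > 0) {
        std::vector<pollfd> fds(sessions.size());
        for (std::size_t i = 0; i < sessions.size(); ++i) {
            fds[i].fd = sessions[i].done ? -1 : sessions[i].fd;  // poll skips negative fds
            fds[i].events = POLLIN;
            fds[i].revents = 0;
        }
        if (poll(fds.data(), static_cast<nfds_t>(fds.size()), -1) < 0) {
            if (errno == EINTR) continue;
            return;  // unexpected poll failure: give up
        }
        for (std::size_t i = 0; i < fds.size(); ++i) {
            if (fds[i].revents & (POLLIN | POLLHUP | POLLERR | POLLNVAL)) {
                ready.push_back(i);
            }
        }

        // Give each ready session one bounded slice of work, oldest first.
        while (!ready.empty()) {
            Session& s = sessions[ready.front()];
            ready.pop_front();
            char buf[256];
            ssize_t n = read(s.fd, buf, sizeof buf);
            if (n > 0) {
                s.received.append(buf, static_cast<std::size_t>(n));
            } else if (n == 0 || (errno != EAGAIN && errno != EWOULDBLOCK &&
                                  errno != EINTR)) {
                s.done = true;  // end-of-file or hard error
                --open;
            }
        }
    }
}
```

Because each session gets at most one read per pass, a busy connection
cannot hold up the others for long, which is the point of doing your own
scheduling.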
--
Joe Seigh
When you get lemons, you make lemonade.
When you get hardware, you make software.