Bytes | Developer Community


Socket application getting unstable with many connected users


I'm experiencing a nasty problem in a server-client application. I don't have the perfect solution yet, but I have some ideas, and I'm looking for opinions to help find the right one.

The application is a card/board gaming platform with many other functions for chat, management and such.
The client is healthy and works as a user interface to communicate with the server and other clients. A typical client, nothing special.

As you can see on the chart, the server is divided into parts. The listeners are infinite loops awaiting connection requests; you pick the channel you want, connect, and log in.
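To make the listener idea concrete, here is a minimal sketch of such an accept loop. The original project is written in VB.NET, so this Python version is only an illustrative analogue; the function and parameter names (`run_listener`, `on_connect`) are my own, not from the project.

```python
# Minimal sketch of a channel listener: an infinite accept loop that
# hands each new connection to the channel. Names are illustrative.
import socket
import threading

def run_listener(server_sock, on_connect, stop_event):
    """Accept connections until stop_event is set."""
    server_sock.settimeout(0.5)      # wake periodically to check stop_event
    while not stop_event.is_set():
        try:
            conn, addr = server_sock.accept()
        except socket.timeout:
            continue
        on_connect(conn, addr)       # e.g. start the login handshake
```

In the real server each listener would run on its own thread so that accepting never blocks the channel's other work.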

Channels have a maximum limit of 100 clients; a channel is where all the chatting and gaming happens. Clients can open new rooms and get to the gaming part. They can chat publicly with everyone in the channel, and they can perform management functions (kick, ban, etc.) if they are authorized. Each channel is an independent executable, so channels can't reach each other directly; however, for some particular occasions they can reach each other indirectly via sockets.

Each channel has a "Client" object to represent a user and a "ClientCollection" to hold all of them. Since each client object works under its own thread, I use SyncLock on the collection to avoid inconsistency.
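Sketched in Python, the ClientCollection idea looks roughly like this: every access to the shared client list goes through a single lock, which is what VB.NET's SyncLock gives you. The class and method names here are illustrative, not taken from the project.

```python
# Python analogue of a SyncLock-protected ClientCollection.
import threading

class ClientCollection:
    def __init__(self):
        self._lock = threading.Lock()
        self._clients = {}               # client id -> client object

    def add(self, client_id, client):
        with self._lock:                 # equivalent of SyncLock on the collection
            self._clients[client_id] = client

    def remove(self, client_id):
        with self._lock:
            self._clients.pop(client_id, None)

    def snapshot(self):
        """Copy the client list under the lock, so a broadcast can
        iterate over the copy without holding the lock while sending."""
        with self._lock:
            return list(self._clients.values())
```

One possible explanation for the symptoms described later: if the broadcast loop holds the collection's lock while writing to 80 sockets, a single slow socket stalls every thread that touches the collection. The snapshot pattern above is one way to keep lock hold times short; whether that matches the actual code is only a guess.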

The main problem is this: when the channel reaches a certain number of online users (the limit is 100, but it starts at around 75-80 clients), things get interestingly unstable. For instance, a single sentence typed into the chatbox can take a minute to reach the other 80 people instead of arriving instantly. I monitor the server's CPU usage; it does increase, but nothing unusual. Normally a single channel application uses between 0% and 4%; with 75-80 users it goes up to 5%-10%, which looks acceptable. It feels like a temporary lag: whatever is causing it can freeze everything for a couple of minutes (it even disconnects users), then things go back to normal. The channel somehow gets frozen and out of breath, coughs, catches its breath, and goes on running. Sometimes we are even forced to restart individual channels.

When we have fewer users online, everything runs smoothly.

At this point, since a channel becomes unstable only when it is crowded, I tried to observe the differences between a crowded environment and an underpopulated one. I logged incoming and outgoing data to identify and eliminate the most repetitive messages, hoping to reduce traffic and gain some performance.

My theory is that this instability occurs more often when an action affects the whole channel, just like chatting. Whenever you type something with 80 clients inside, it has to be sent to the 79 other clients, and each such task is done by a single thread, not a thread per client. This is my main suspect: it's a lot of distribution work for the server when 10-15 users are chatting publicly at the same time. Currently I distribute all these messages with a single thread per task (new user connected, someone typed something, someone left the channel). Maybe another approach like TaskFactory could solve this issue. I wonder if it makes sense to gather the required data and hand the distribution work to TaskFactory. Would it solve my problem, what do you think?
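The fan-out idea the post is asking about can be sketched like this, with Python's ThreadPoolExecutor playing the role TaskFactory would in .NET. The key property is that a send blocking on one slow client no longer delays the other 79, because each send is an independent task. `Broadcaster`, `send_fn`, and `clients` are illustrative names, not from the project.

```python
# Sketch of pooled fan-out: one task per recipient instead of one
# sequential loop, analogous to TaskFactory.StartNew per client.
from concurrent.futures import ThreadPoolExecutor

class Broadcaster:
    def __init__(self, max_workers=8):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def broadcast(self, clients, message, send_fn):
        """Queue one send task per client; return the futures so the
        caller can observe failures without blocking the chat loop."""
        return [self._pool.submit(send_fn, c, message) for c in clients]
```

Whether this actually fixes the lag depends on where the time is going: if sends block on a few slow sockets, pooled fan-out (or a per-client outbound queue) should help; if the bottleneck is lock contention on the shared collection, it won't.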


We addressed a similar issue when we first started on this project: we were advised to set the socket's NoDelay property to true. Maybe this problem can also be solved by a single smart trick like that.
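For reference, the NoDelay trick corresponds to the TCP_NODELAY socket option, which disables Nagle's algorithm so small chat packets go out immediately instead of being coalesced. Here is the Python equivalent of setting .NET's Socket.NoDelay; the helper name is my own.

```python
# Python equivalent of .NET's Socket.NoDelay = True:
# TCP_NODELAY disables Nagle's algorithm for this socket.
import socket

def make_nodelay_socket():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s
```

Note that NoDelay helps with per-message latency on an otherwise healthy connection; it is unlikely on its own to explain minute-long freezes and disconnects.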
Feb 16 '12 #1
Hmmm... interesting.
Mar 1 '12 #2

