You may have noticed that after the threaded echo server was discussed in The Text Book, the author suggests that, as an exercise, the reader try writing a multi-threaded chat server. It was nice to see that he left some fun projects for his readers. I thought that sounded like an excellent idea and did just that. It took a little thought up front to decide how to go about it, but once I had the basic design figured out, it wasn't too bad and was actually a very fun program to write.
The chat server developed here is a multi-party chat server, sometimes called a chat room, similar to IRC (Internet Relay Chat). It uses straight TCP sockets with its own small application-layer protocol of a few commands. As with IRC, all the commands begin with a slash (/) character. For example, the command /nick Bob sets the user's nickname to Bob.
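Telling a slash command apart from ordinary chat text takes only a few lines of Python. The following is just an illustrative sketch; the function name and behavior are my own assumptions, not the server's actual code:

```python
def parse_command(line):
    """Split a raw line into (command, argument), or (None, line) for chat text."""
    if line.startswith("/"):
        # "/nick Bob" -> command "nick", argument "Bob"
        parts = line[1:].split(None, 1)
        command = parts[0].lower()
        argument = parts[1] if len(parts) > 1 else ""
        return command, argument
    # Anything not starting with "/" is an ordinary chat message.
    return None, line
```

The server's per-client code can then dispatch on the returned command name, falling through to the broadcast path when the command is None.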
There are actually three programming strategies for developing server programs. In addition to purely threaded and purely asynchronous, a hybrid solution is also possible for many problems. For this problem, the hybrid solution may actually be the preferred one. However, at this point, my chat server is based purely on threads. The design considerations for each alternative follow.
The asynchronous approach works well when every action the server takes is initiated by receiving a message from one of the clients being served. This is exactly the situation with a chat server.
The main advantage of the asynchronous approach is that since there is only one thread, the server by definition only does one thing at a time. So all of the issues associated with synchronizing access to data shared between multiple threads disappear. A disadvantage of this approach is that managing the state and data of each client becomes a juggling exercise. Asynchronous programs can also be fairly tricky to get right, and for some problems they cannot be used at all. The main challenge that asynchronous servers face is that no blocking system call can be made. If, in serving one client, the server does not return immediately from a blocking system call, such as one to access a database, acquire a lock, or even load a file, then the whole program is held up and the other clients are not served.
The Text Book shows a chat server that uses Twisted in chapter 22, and I found one on the Internet that uses the select.select() system call. [PILLAI07] (See chatSelectServer.py) Both of these servers keep a list of client sockets, and when a message is received, they loop through all the client sockets, sending the message to each. This is probably the simplest way to develop a chat server and should have fair performance. These solutions, however, do not keep a record of any old messages and thus cannot show a new user any of the discussion from before they joined the session.
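As a rough sketch of that pattern (my own names and structure, not the code from The Text Book or [PILLAI07]), a select.select() chat server looks something like this:

```python
import select
import socket

def broadcast(sender, clients, data):
    # Relay one client's message to every other connected client.
    for sock in clients:
        if sock is not sender:
            sock.sendall(data)

def serve(host="", port=12345):
    # One thread, one loop: select() reports which sockets have activity.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(5)
    clients = []                      # the list of client sockets
    while True:
        readable, _, _ = select.select([server] + clients, [], [])
        for sock in readable:
            if sock is server:        # activity on the listener = new connection
                conn, _ = server.accept()
                clients.append(conn)
            else:
                data = sock.recv(4096)
                if not data:          # an empty read means the client hung up
                    clients.remove(sock)
                    sock.close()
                else:
                    broadcast(sock, clients, data)
```

Note that nothing here remembers old messages; a new client only sees traffic from after it connected, which is exactly the limitation mentioned above.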
It should be noted that this is a fairly typical distinction between multi-threaded and asynchronous servers. Multi-threaded servers tend to be more modular and tend to do a better job of keeping track of the clients and associated data. Thus, it can be easier to implement advanced features with a multi-threaded design.
As the name implies, a multi-threaded server is one that uses a separate thread of execution for each client.
The primary advantage of using multiple threads is that the code for each thread need only worry about one client. If a blocking system call is needed to serve the client, no problem is created because it only affects one client.  The code for threaded servers tends to be fairly easy to follow.
The most obvious complexity of the multi-threaded approach comes from dealing with data that is shared between multiple threads. This data must be protected with locks and associated synchronization algorithms. These algorithms can be more complex than one would hope, but there are known solutions to almost any synchronization problem. The most difficult chore for the programmer is simply to determine which of the generic problems with known solutions matches the task at hand. In general, once the synchronization issues are resolved, multi-threaded techniques have better potential than asynchronous ones to scale to larger, more complex problems.
Another challenge of multi-threaded network programming stems from the fact that, as was previously mentioned, the default behavior of the socket.recv() function is to block (not return) until either data or a disconnect is received on the socket. For applications such as web servers, where the clients do not interact with each other, this is not a problem at all. When a connection is received, a child thread is started; it waits on the socket.recv() call until data is received, then collects the response needed, perhaps via a database query, and returns the results to the client. However, in a multi-party chat server, the thread must not only watch the socket for new data, but must also watch for new messages from other users that need to be sent to the client. Thus, the socket.recv() calls cannot block indefinitely.
In a purely thread-based solution, the only way to watch a socket and other data at more or less the same time is to set a timeout on the socket, such that after a short time with no data received, a socket.timeout exception is raised. The main disadvantage of this approach is that users are delayed, waiting for the timeout, before being sent a new message. If the timeout value is kept fairly small, this delay will not generally be noticed by users. Still, the performance is perhaps not as good as we would like, because of the repeated checking for messages after each timeout.
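A per-client thread body using this timeout technique might look something like the sketch below. The inbox queue and the handler callback are illustrative assumptions, not my server's exact code:

```python
import queue
import socket

def client_loop(conn, inbox, outgoing_handler, timeout=0.2):
    # Per-client thread body: block on recv() for at most `timeout` seconds,
    # then check this thread's inbox for messages from other users.
    conn.settimeout(timeout)
    while True:
        try:
            data = conn.recv(4096)
            if not data:
                break                      # empty read: the client disconnected
            outgoing_handler(data)         # hand the received message to the server
        except socket.timeout:
            pass                           # nothing arrived this time; fall through
        # Forward any queued messages from other users to this client.
        while True:
            try:
                conn.sendall(inbox.get_nowait())
            except queue.Empty:
                break
```

The cost described above is visible here: even with no traffic, the loop wakes every 0.2 seconds to poll the inbox, and an outbound message can sit queued for up to one timeout interval.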
Socket timeouts can also be used in client applications. In this case, the application must be ready both to receive data from the server and to send data that the local user entered. Since the client only manages one socket, using timeouts to facilitate this multi-tasking is a good solution.
Since the objective of this activity was to develop a multi-threaded chat server, the timeout approach was adopted for the initial implementation of the server. I decided that a fairly clean design would be to hold a global message queue that all threads can write to and read from, per the synchronization mechanisms of the readers and writers problem. So, grabbing an operating systems textbook [nutt] that I knew had a solution to readers and writers, I had my synchronization solution. When I first opened the book and saw all of the locks (actually, it uses semaphores, but in this case those semaphores can be swapped out for Python's simple Lock object), I was a little overwhelmed for a minute and wondered if I really needed so many locks. Then I remembered that readers and writers has been worked on by lots of really smart people and has proven itself to be safe and free of deadlocks.
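Packaged up with Python Lock objects standing in for the textbook's semaphores, the classic readers-preference solution looks roughly like this (a sketch of the textbook algorithm, not my server's exact code):

```python
import threading

class ReadWriteLock:
    # Classic first readers-writers solution: any number of readers may
    # share the message queue, but a writer needs exclusive access.
    def __init__(self):
        self._reader_count = 0
        self._count_lock = threading.Lock()   # protects _reader_count
        self._resource = threading.Lock()     # held by a writer, or by the reader group

    def acquire_read(self):
        with self._count_lock:
            self._reader_count += 1
            if self._reader_count == 1:
                self._resource.acquire()      # first reader locks writers out

    def release_read(self):
        with self._count_lock:
            self._reader_count -= 1
            if self._reader_count == 0:
                self._resource.release()      # last reader lets writers back in

    def acquire_write(self):
        self._resource.acquire()

    def release_write(self):
        self._resource.release()
```

Each client thread would call acquire_read()/release_read() around scans of the global message queue, and acquire_write()/release_write() around appends to it.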
Another approach is to combine asynchronous and threaded technologies. This approach may be slightly more complex, but in the case of a multi-party chat server it may offer the best combination of performance and flexibility for adding additional features.
In the chat server, activity is initiated by receiving a message from one of the clients. Once a message has been received, we can go about updating any data and sending the message to each client socket. The asynchronous select.select() function can do a nice job of monitoring all of the sockets for incoming data. Meanwhile, a thread for each client still makes sense, so that each client can have its own data and fairly simple code that only worries about one client and is not restricted from using blocking system calls, such as those that read from or write to the message queue.
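The asynchronous front end of such a hybrid might be sketched as follows. Here the handoff callback is an assumption; in a full server it would put the data on the owning client thread's queue so that thread can do any blocking work:

```python
import select
import socket

def select_loop(server_sock, handoff):
    # Single asynchronous thread: select() watches every socket, and each
    # received message is handed off (e.g. via a per-client queue) to a
    # client thread, which is then free to make blocking calls.
    clients = []
    while True:
        readable, _, _ = select.select([server_sock] + clients, [], [])
        for sock in readable:
            if sock is server_sock:          # new client connecting
                conn, _ = server_sock.accept()
                clients.append(conn)
            else:
                data = sock.recv(4096)
                if not data:                 # client disconnected
                    clients.remove(sock)
                    sock.close()
                else:
                    handoff(sock, data)      # never blocks in this thread
```

Because the select thread itself never blocks on anything but select(), no single slow client can hold up the others.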
The author gives a suggestion for developing the multi-threaded chat server.
... You’ll want to make sure that the client threads do the transmission, not the main thread.
One way to accomplish that is to have a semaphore and a queue for each client thread. You could maintain a list of these objects, adding and removing entries from the list as threads come and go. When data is to be sent, it’s added to the appropriate queues and the semaphore is signaled.
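The suggested per-client semaphore and queue could be bundled together roughly like this (the class and method names are my own; the book gives no code for this):

```python
import queue
import threading

class ClientChannel:
    # One of these per client thread: a queue holds pending messages and
    # a semaphore signals the thread that something is waiting.
    def __init__(self):
        self.messages = queue.Queue()
        self.ready = threading.Semaphore(0)

    def post(self, msg):
        # Called by whichever thread received the message.
        self.messages.put(msg)
        self.ready.release()          # signal: wake the client thread

    def wait_for_message(self):
        # Called by the client thread; blocks until post() signals.
        self.ready.acquire()
        return self.messages.get()
```

The server would keep a list of ClientChannel objects, adding and removing entries as threads come and go, and each client thread would loop on wait_for_message() and transmit the result on its own socket.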
This suggestion is right on target, with an excellent plan of attack. Unfortunately, he never mentions using a combined asynchronous/threaded approach. When I first read his suggestion, it did not make sense to me at all, because I was thinking of a purely threaded solution. Thus, his suggestion was not followed in my first implementation. But a combined solution is sure to come in a future revision of the server.
See the Demonstration Videos of the server program.
The next section (Programming Assignment 5 – Add BRB feature to Multi-threaded Chat Server) describes the chat server programming project.
We are fairly tolerant of small delays caused by waiting on a blocking system call, or a socket timeout, when the delay only affects one client. (We can always blame delays on the slow network, right?) But any delay that affects all of the clients is unacceptable, because when the server has a large number of client connections, those short delays can add up to long delays.
The Queue.Queue class from the standard library could be used to keep the client threads on task and remove the need to signal them with a semaphore. The asynchronous thread could use the queue to alert a waiting thread of a message received on its socket. Perhaps another thread could be used just to tell the child threads that they should check the message board for messages. This thread could be signaled with a threading.Condition (monitor) whenever a writer exits the message queue. Because operations on the message board queue can block, only the child threads should ever act as readers or writers on it.
Although the purely threaded solution with timeouts may use a few more CPU cycles by checking for messages when none are there, it still seems to be a reasonable solution. On a heavily loaded server, it might read multiple messages from the queue at a time, whereas the hybrid solution would tend to alert each child after each message; thus the hybrid solution might lose some of its performance advantage when the server is most loaded. Hmm, this will merit some testing.