When writing a scalable web service, you need to be aware of two properties that affect how the WCF runtime dispatches requests to your service: InstanceContextMode and ConcurrencyMode. In a nutshell, InstanceContextMode controls when a new instance of your service type is created, and ConcurrencyMode controls how many requests a single instance can service simultaneously. The default InstanceContextMode is InstanceContextMode.PerSession, and the default ConcurrencyMode is ConcurrencyMode.Single.
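Both properties are set through the ServiceBehavior attribute on the service class. As a minimal sketch, the class below spells out the defaults explicitly; the ICounterService contract and CounterService class are hypothetical names invented for illustration.

```csharp
using System.ServiceModel;

// Hypothetical contract, used only for illustration.
[ServiceContract]
public interface ICounterService
{
    [OperationContract]
    int Increment();
}

// These are the default values written out explicitly: one instance per
// session, and at most one call at a time into each instance.
[ServiceBehavior(
    InstanceContextMode = InstanceContextMode.PerSession,
    ConcurrencyMode = ConcurrencyMode.Single)]
public class CounterService : ICounterService
{
    private int count;

    public int Increment()
    {
        // No lock here: ConcurrencyMode.Single means WCF will not dispatch
        // a second call into this instance while one is still running.
        return ++count;
    }
}
```

Omitting both properties from the attribute should give the same behavior, since these are the defaults described above.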
Others have covered these two knobs in detail, and you can check them out for more background. Here I’m simply going to explain the effect these settings can have on your threading behavior.
If you have set ConcurrencyMode == ConcurrencyMode.Single, then you don’t have to worry about your service instances being free-threaded (unless you are running concurrent code within your methods). The only time multiple calls are allowed at once is when there are multiple instances. For InstanceContextMode.Single, you will get one method call at a time since you only have a single instance. For InstanceContextMode.PerCall or PerSession, ServiceModel will spin up extra threads, up to a throttle, in order to handle extra requests. There’s one possibly unexpected twist here: when using a session-ful binding, there will only be one outstanding instance call per channel, even with InstanceContextMode.PerCall. This is because WCF strictly maintains the in-order delivery guarantees of the channel with ConcurrencyMode.Single. So when using a session-ful binding (such as the default NetTcpBinding or NetNamedPipeBinding) together with ConcurrencyMode.Single, InstanceContextMode.PerCall and InstanceContextMode.PerSession will behave exactly the same way from a server-side threading/throttling perspective with regard to a single channel.
ConcurrencyMode == ConcurrencyMode.Reentrant is similar, except that a call made from within your service can trigger another call into the same instance (let’s say you call into a second service that calls back into you before it returns).
When you have ConcurrencyMode == ConcurrencyMode.Multiple, threading comes heavily into play. WCF will call into your instance on multiple threads at once unless you are using InstanceContextMode.PerCall, where each call gets its own instance. Throttles again come into play (that’s a topic for another post).
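To make the consequence concrete, here is a minimal sketch of a single shared instance running with ConcurrencyMode.Multiple. The IHitCounter contract and HitCounter class are hypothetical names invented for illustration; the point is only that any state on the instance has to be protected by your own code.

```csharp
using System.ServiceModel;
using System.Threading;

// Hypothetical contract, used only for illustration.
[ServiceContract]
public interface IHitCounter
{
    [OperationContract]
    int RecordHit();
}

// One shared instance, with WCF free to call into it from many threads at once.
[ServiceBehavior(
    InstanceContextMode = InstanceContextMode.Single,
    ConcurrencyMode = ConcurrencyMode.Multiple)]
public class HitCounter : IHitCounter
{
    private int hits;

    public int RecordHit()
    {
        // A plain ++hits could lose updates when two calls overlap, so the
        // increment is done atomically. A lock would work as well.
        return Interlocked.Increment(ref hits);
    }
}
```

With ConcurrencyMode.Single on the same class, the atomic increment would be unnecessary, because calls into the instance would be serialized.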
To summarize, here are some basic scenarios for what happens when 100 clients simultaneously hit a service method:
Scenario 1: InstanceContextMode.Single+ConcurrencyMode.Single
Result: 100 sequential invocations of the service method, one call at a time (not necessarily always on the same thread).
Scenario 2: InstanceContextMode.Single+ConcurrencyMode.Multiple
Result: N concurrent invocations of the service method on N threads, where N is determined by the service throttle.
Scenario 3: InstanceContextMode.PerCall+Any ConcurrencyMode
Result: N concurrent invocations of the method on N service instances, where N is determined by the service throttle.
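The N in Scenarios 2 and 3 comes from the service throttle, which can be set in code through ServiceThrottlingBehavior. The following is a rough sketch for a self-hosted PerCall service; the IEchoService and EchoService names, the net.tcp address, and the specific limits are all assumptions chosen for illustration, not recommended values.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Description;

// Hypothetical contract and service, used only for illustration.
[ServiceContract]
public interface IEchoService
{
    [OperationContract]
    string Echo(string text);
}

[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
public class EchoService : IEchoService
{
    public string Echo(string text)
    {
        return text;
    }
}

public static class Program
{
    public static void Main()
    {
        // The address is an arbitrary example.
        var host = new ServiceHost(typeof(EchoService),
            new Uri("net.tcp://localhost:9000/echo"));
        host.AddServiceEndpoint(typeof(IEchoService), new NetTcpBinding(), "");

        // Reuse the throttling behavior if one is already configured,
        // otherwise add a new one.
        var throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
        if (throttle == null)
        {
            throttle = new ServiceThrottlingBehavior();
            host.Description.Behaviors.Add(throttle);
        }

        // Illustrative limits only: at most 16 calls and 16 instances at
        // once, with up to 100 sessions (clients) connected.
        throttle.MaxConcurrentCalls = 16;
        throttle.MaxConcurrentInstances = 16;
        throttle.MaxConcurrentSessions = 100;

        host.Open();
        Console.WriteLine("Listening; press Enter to stop.");
        Console.ReadLine();
        host.Close();
    }
}
```

Under these assumed limits, the 100 clients from Scenario 3 could all connect, but only 16 invocations would run at any one time while the rest wait in the queue.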