Wenlong has an excellent writeup of how to write a WCF client with an eye towards performance.
Behind the protected BindingElement ctor
On our BindingElement class, there is a protected ctor that takes another BindingElement as its parameter. This constructor exists in order to facilitate a composable implementation of BindingElement.Clone. When writing a custom binding element, first implement a protected copy constructor as follows (note that for sealed classes this ctor should be private):
protected MyBindingElement(MyBindingElement elementToBeCloned)
    : base(elementToBeCloned)
{
    // copy all fields from elementToBeCloned.XXX to this.XXX
}
Then you should implement your Clone() method as follows:
public override BindingElement Clone()
{
    return new MyBindingElement(this);
}
Any BindingElement in your inheritance chain (assuming it has followed this pattern) will then copy over the relevant values in its copy constructor, so that you can be assured of a full Clone of your custom binding element.
Auto-open and multi-thread usage of client channels
Buddhike hit a hiccup the other day with a multi-threaded client that bears explanation.
The Channel layer always requires an explicit Open() before it can be used. This enforces our CommunicationObject state machine. As a usability feature, our ServiceChannel proxy code supports “auto-open”. That is, you can call a proxy method without explicitly calling Open and the runtime will call Open() on your behalf. This is transparent in the case where you are using a proxy synchronously from a single thread. However, if you are using a proxy asynchronously (or from multiple threads), you may hit the case where the Open() is associated with the first request while subsequent requests are also pending.
Since the state machine requires that Open() complete before Send/Receive are valid operations, none of the requests can proceed until Open completes. In the shipping code, this synchronization is actually around the entire ServiceChannel call, which is why Buddhike was seeing an excessive delay. We’ll investigate for the next version whether there’s a way to unblock earlier on the client while still providing all of our existing behavioral guarantees. In the interim, I recommend two things when using a client asynchronously and/or from multiple threads concurrently:
- Open your client explicitly prior to usage. You can do this synchronously or asynchronously depending on your application (see the sketch after this list).
- Prefer calling your client asynchronously over spinning up multiple threads for synchronous calls if you want better scalability/thread usage.
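For illustration, here is a minimal sketch of the explicit-Open pattern. The ICalculator contract, its Begin/End methods, and the "calculatorEndpoint" config name are all hypothetical; substitute your own contract and endpoint:

using System;
using System.ServiceModel;

[ServiceContract]
interface ICalculator
{
    [OperationContract(AsyncPattern = true)]
    IAsyncResult BeginAdd(int x, int y, AsyncCallback callback, object state);
    int EndAdd(IAsyncResult result);
}

static class Client
{
    static void Main()
    {
        ChannelFactory<ICalculator> factory =
            new ChannelFactory<ICalculator>("calculatorEndpoint");
        ICalculator client = factory.CreateChannel();

        // Open explicitly so concurrent calls aren't all serialized behind
        // the implicit auto-open triggered by the first call.
        ((ICommunicationObject)client).Open();

        // Multiple asynchronous calls can now proceed without waiting on Open.
        IAsyncResult ar1 = client.BeginAdd(1, 2, null, null);
        IAsyncResult ar2 = client.BeginAdd(3, 4, null, null);
        Console.WriteLine(client.EndAdd(ar1) + ", " + client.EndAdd(ar2));

        ((ICommunicationObject)client).Close();
        factory.Close();
    }
}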
Signalling "End Of Session"
When authoring a session-ful channel, it’s important to signal “end of session” correctly so that the runtime (or any other user of the channel) knows when to stop reading messages and when to start shutting down its side of the conversation (with CloseOutputSession and/or channel.Close). A null Message/RequestContext signals end-of-session to the caller. In particular, depending on your channel shape, you should do the following (a sketch follows the list):
- IInputSessionChannel/IDuplexSessionChannel: Return null from channel.Receive(). Correspondingly, return true from TryReceive with the “message” out-param set to null. And of course, cover your bases by having BeginTryReceive complete synchronously with a signal to return true + message = null from EndTryReceive.
- IRequestSessionChannel: Return null from channel.ReceiveRequest(). Correspondingly, return true from TryReceiveRequest with the “context” out-param set to null. Lastly, have BeginTryReceiveRequest complete synchronously with a signal to return true + context = null from EndTryReceiveRequest.
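Here is a minimal sketch of what this can look like in a custom IDuplexSessionChannel. The messageQueue member is hypothetical: a thread-safe queue that the channel’s receive pump fills with incoming messages, enqueueing null exactly once when the remote side closes its output session:

public bool TryReceive(TimeSpan timeout, out Message message)
{
    Message incoming;
    if (this.messageQueue.TryDequeue(timeout, out incoming))
    {
        // incoming == null means end-of-session: return true with a null
        // message so the caller knows to stop reading and begin shutting
        // down its side of the conversation.
        message = incoming;
        return true;
    }

    // The timeout expired before anything arrived; this is not an
    // end-of-session signal.
    message = null;
    return false;
}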
Throttling in WCF
When your server is hosted out on the “big bad internet”, you need a way to make sure that you don’t get flooded with client requests. In WCF, our services support throttling as a way of mitigating potential DoS (denial of service) attacks. These throttles can also help you smooth load on your server and help enforce resource allocations. There are three service-level throttles, controlled by ServiceThrottlingBehavior; these are in addition to any transport-specific throttles imposed by your binding. To fully understand the impact of these throttles you should also understand the threading/instancing characteristics of your service. (A sketch of setting these throttles in code follows the list.)
- MaxConcurrentCalls bounds the total number of simultaneous calls that we will process (default == 16). This is the only normalized throttle we have across all of the outstanding reads that the ServiceModel Dispatcher will perform on any channels it accepts. Each call corresponds to a Message received from the top of the server-side channel stack. If you set this high, you are saying that you have the resources to handle that many calls simultaneously. In practice, how many calls come in also depends on your ConcurrencyMode and InstanceContextMode.
- MaxConcurrentSessions bounds the total number of sessionful channels that we will accept (default == 10). When we hit this throttle then new channels will not be accepted/opened. Note that this throttle is effectively disabled for non-sessionful channels (such as default BasicHttpBinding).
With TCP and Pipes, we don’t ack the preamble until channel.Open() time. So if you see clients timing out waiting for a “preamble response”, it’s possible that the target server has reached this throttle. By default your clients will wait a full minute (our default SendTimeout) and then time out against a busy server. Your stack will look something like:
TestFailed System.TimeoutException: The open operation did not complete within the allotted timeout of 00:01:00. The time allotted to this operation may have been a portion of a longer timeout.
[…]
at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.SendPreamble(IConnection connection, ArraySegment`1 preamble, TimeoutHelper& timeoutHelper)

If instead you are timing out under channel.Send (rather than channel.Open), then it’s possible that you are hitting the MaxConcurrentCalls throttle (which kicks in per-message, not per-channel).
- MaxConcurrentInstances bounds the total number of instances created. This throttle provides added protection in the case that you have an instance lifetime that is not tied to a call or a session (in which case it would already be bounded by the other two throttles). Orcas durable services are one such scenario.
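As promised above, here is a minimal sketch of adjusting all three throttles programmatically. The service type name and the values are hypothetical; pick values based on your own capacity planning:

ServiceHost host = new ServiceHost(typeof(CalculatorService));

// Reuse an existing throttling behavior if one was configured, else add one.
ServiceThrottlingBehavior throttle =
    host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
if (throttle == null)
{
    throttle = new ServiceThrottlingBehavior();
    host.Description.Behaviors.Add(throttle);
}

throttle.MaxConcurrentCalls = 32;     // simultaneous calls being processed
throttle.MaxConcurrentSessions = 20;  // simultaneous sessionful channels
throttle.MaxConcurrentInstances = 52; // simultaneous service instances

host.Open();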
Net-net: if you are testing your services under load, and your clients start timing out, take a look at your throttling and instancing values. On the flip side, do not just blindly set these to int.MaxValue without fully understanding the potential DoS consequences.
InstanceContextMode, ConcurrencyMode, and Server-side Threading
When trying to write a scalable web service, you need to be aware of a few properties that affect how the WCF runtime will dispatch requests to your service: InstanceContextMode and ConcurrencyMode. In a nutshell, InstanceContextMode controls when a new instance of your service type is created, and ConcurrencyMode controls how many requests can be serviced simultaneously. The default InstanceContextMode is InstanceContextMode.PerSession, and the default ConcurrencyMode is ConcurrencyMode.Single.
Others have covered these two knobs in detail; you can check them out for more background. Here I’m simply going to explain the effect these settings can have on your threading behavior.
If you have set ConcurrencyMode == ConcurrencyMode.Single, then you don’t have to worry about your service instances being free-threaded (unless you are doing concurrent work within your methods). The only time multiple calls are allowed is when there are multiple instances. For InstanceContextMode.Single, you will get one method call at a time since you only have a single instance. For InstanceContextMode.PerCall or PerSession, ServiceModel will spin up extra threads (up to a throttle) in order to handle extra requests.

There’s one possibly unexpected twist here: when using a session-ful binding, there will only be one outstanding instance call per channel, even with InstanceContextMode.PerCall. This is because WCF strictly maintains the in-order delivery guarantees of the channel with ConcurrencyMode.Single. So when using a session-ful binding (i.e. the default NetTcpBinding or NetNamedPipeBinding) + ConcurrencyMode.Single, InstanceContextMode.PerCall and InstanceContextMode.PerSession behave in exactly the same way from a server-side threading/throttling perspective with regard to a single channel.
ConcurrencyMode == ConcurrencyMode.Reentrant is similar, except that you can trigger another call to your instance from within your service (say, you call into a second service that calls back into you before it returns).
When you have ConcurrencyMode == ConcurrencyMode.Multiple, threading comes heavily into play. WCF will call into your instance on multiple threads unless you are using InstanceContextMode.PerCall. Throttles again will come into play (that’s a topic for another post).
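To make the knobs concrete, here is a minimal sketch of a service configured for concurrent calls on a single instance; the service and contract names are hypothetical:

[ServiceBehavior(
    InstanceContextMode = InstanceContextMode.Single,
    ConcurrencyMode = ConcurrencyMode.Multiple)]
public class CalculatorService : ICalculator
{
    // ConcurrencyMode.Multiple means WCF may invoke this method on many
    // threads at once against this single instance, so shared state must
    // be protected by the service itself.
    readonly object syncRoot = new object();
    int callCount;

    public int Add(int x, int y)
    {
        lock (this.syncRoot)
        {
            this.callCount++;
        }
        return x + y;
    }
}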
To summarize, here are some basic scenarios for what happens when 100 clients simultaneously hit a service method:
Scenario 1: InstanceContextMode.Single+ConcurrencyMode.Single
Result: 100 sequential invocations of the service method on one thread
Scenario 2: InstanceContextMode.Single+ConcurrencyMode.Multiple
Result: N concurrent invocations of the service method on N threads, where N is determined by the service throttle.
Scenario 3: InstanceContextMode.PerCall+Any ConcurrencyMode
Result: N concurrent invocations of the method on N service instances, where N is determined by the service throttle
More details on MaxConnections
Here are some more “under the hood” details in response to the following question:
How is one supposed to interpret the NetTcpBinding.MaxConnections property on the client? My assumption has been that setting this property at the client only allows this number of concurrent connections, and further connection attempts will be queued until an existing connection is released.
MaxConnections for TCP is not a hard and fast limit, but rather a knob on the connections that we will cache in our connection pool. That is, if you set MaxConnections=2, you can still open 4 client channels on the same factory simultaneously. However, when you close all of these channels, we will only keep two of those connections around (subject to IdleTimeout of course) for future channel usage. This helps performance in cases where you are creating and disposing client channels. This knob also applies to the equivalent usage on the server side (that is, when a server-side channel is closed, if we have fewer than MaxConnections in our server-side pool we will initiate I/O to look for another new client channel).
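For reference, here is a minimal sketch of setting this knob; the value is illustrative only:

NetTcpBinding binding = new NetTcpBinding();

// Caps the connections cached in the pool, not the number of channels
// you may open simultaneously.
binding.MaxConnections = 2;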
The reason that we don’t have a hard and fast limit on your connection usage is that you can already control the connection usage through your usage of the WCF objects. That is, if you don’t want to use more than two connections, don’t create more than two client channels :) Any additional knobs at the lower layer would only impede debuggability and predictability.
Note that MaxConnections applies across channels. When sending messages over a single channel, you can only send out one message at a time. That is, your second Send() call will not be initiated until your first Send() completes. In this manner, our TCP binding can guarantee in-order delivery. Also, practically speaking there would be a significant amount of complexity (and overall a negative performance hit) if we allowed interleaving of data from multiple messages, as each “chunk” would need to be annotated with a scatter/gather message marker.
Lastly, all of the above comments apply to TransferMode.Buffered (the default). When using Streaming mode, we “check out” a connection for each in-progress send (and not per-channel). So all the above statements will apply to simultaneous sends rather than simultaneous channels. Streaming TCP is a datagram (not a session-ful) channel, and so simultaneous sends are supported since each send will use a separate TCP connection. This is more similar to HTTP’s usage of TCP connections (where each in-flight request-response pair is using a separate TCP connection).
Windows Core Networking Blog
For those interested in all the low-level intricacies of networking, check out the Windows Core Networking Blog. For the non-bit-heads out there, you will also find a bunch of Vista configuration tidbits useful to any home network user.
Client (TCP and Named Pipe) Connection Pooling
Using the TCP and Named Pipe bindings gives you a very clean mapping between IDuplexSessionChannel and the underlying network resource (socket or pipe). Namely, you can effectively treat a channel as 1-1 with a socket (I will use socket as shorthand for the generic “network resource” for the remainder of this post :)).
That being said, the lifetime of the underlying socket is not necessarily 1-1 with the lifetime of the channel. Due to our connection pooling feature in WCF, a connection can be reused over the lifetime of multiple channels. We perform connection pooling for both buffered and streaming channels. Our connection pool is configurable through TcpConnectionPoolSettings/NamedPipeConnectionPoolSettings. These settings include a GroupName that we use for isolation, an upper bound on our cache size (MaxOutboundConnectionsPerEndpoint), and timeout values for reliability and NLB support.
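Here is a minimal sketch of tuning these settings on the TCP transport; the group name and values are hypothetical:

// Start from NetTcpBinding and drop down to the transport binding element.
CustomBinding binding = new CustomBinding(new NetTcpBinding());
TcpTransportBindingElement transport =
    binding.Elements.Find<TcpTransportBindingElement>();

transport.ConnectionPoolSettings.GroupName = "MyAppPool"; // pool isolation
transport.ConnectionPoolSettings.MaxOutboundConnectionsPerEndpoint = 20;
transport.ConnectionPoolSettings.IdleTimeout = TimeSpan.FromMinutes(2);
transport.ConnectionPoolSettings.LeaseTimeout = TimeSpan.FromMinutes(5);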
The way connection pooling works on the client is as follows:
- When you open a channel we will first look for a connection in our pool. This lookup is performed based on IP+port for sockets and based on endpoint Uri name for Pipes.
- If we find an available connection in our pool then we will attempt our open handshake using .Net Framing. If this succeeds then we will associate the connection with the new channel and return from Open. If it fails then we’ll discard the connection. If we have not yet exceeded the binding’s OpenTimeout then we will repeat the “look in pool” process.
- If no [valid] connections are found in our pool then we will establish a new connection (again, using up to the time remaining in OpenTimeout).
- When you close a channel, after we perform our close handshake we will consider returning the connection to our pool. If we already have reached MaxOutboundConnectionsPerEndpoint, or the connection’s lifetime has exceeded LeaseTimeout then we will close the connection instead. The connection that is returned to the pool is the “raw” connection (the one that was initially accepted, prior to any security upgrades). In this way we can provide a transparent pool without leaking any security or other information.
I was going to cover the server-side usage of connection pooling in this same post, but the process of accepting and reusing connections on the server is worthy of its own topic next time :)
Boilerplate code for custom Encoder Binding Elements
Part of the process involved in writing a custom MessageEncoder is hooking your encoder up to a binding stack. This is accomplished using a subclass of MessageEncodingBindingElement. The way our build process works is that we take the first BindingElement within a Binding and call [Can]BuildChannel[Factory|Listener]() on it. By default, these 4 calls simply delegate to the BindingElement “below” them in the stack.
This default behavior is not exactly what you want in your MessageEncodingBindingElement. While most binding elements add an item to the channel stack, message encoders are an “adjunct” binding element. That is, they provide parameters to other binding elements (in this case the transport). The way that parameters are provided to other parts of the build process is through BindingParameters. For example, a MessageEncodingBindingElement can add itself to the BindingParameters collection:
context.BindingParameters.Add(this);
And later the TransportBindingElement can fish out the encoder when it is building the relevant factory/listener:
Collection<MessageEncodingBindingElement> messageEncoderBindingElements =
    context.BindingParameters.FindAll<MessageEncodingBindingElement>();
Unfortunately, the base MessageEncodingBindingElement does not take care of adding itself to the BindingParameters collection in its default [Can]Build methods. Therefore, every subclass needs to add the following four boilerplate overrides. Furthermore, it would be a breaking change to fix this in V2, so it’s unclear if we will be able to fix this in the next release :(
public override IChannelFactory<TChannel> BuildChannelFactory<TChannel>(BindingContext context)
{
    if (context == null)
    {
        throw new ArgumentNullException("context");
    }

    context.BindingParameters.Add(this);
    return context.BuildInnerChannelFactory<TChannel>();
}

public override IChannelListener<TChannel> BuildChannelListener<TChannel>(BindingContext context)
{
    if (context == null)
    {
        throw new ArgumentNullException("context");
    }

    context.BindingParameters.Add(this);
    return context.BuildInnerChannelListener<TChannel>();
}

public override bool CanBuildChannelFactory<TChannel>(BindingContext context)
{
    if (context == null)
    {
        throw new ArgumentNullException("context");
    }

    context.BindingParameters.Add(this);
    return context.CanBuildInnerChannelFactory<TChannel>();
}

public override bool CanBuildChannelListener<TChannel>(BindingContext context)
{
    if (context == null)
    {
        throw new ArgumentNullException("context");
    }

    context.BindingParameters.Add(this);
    return context.CanBuildInnerChannelListener<TChannel>();
}