Category Archives: Work

Asynchronicity, OneWay, and WCF

I’ve recently encountered some confusion around the behavior of one-way operations in WCF that I’m going to try to clear up. In particular, developers are often under the impression that a one-way operation == a non-blocking call. However, this is not necessarily the case. A one-way operation means that we will call the underlying channel in a "one-way manner". For IOutputChannel/IDuplexChannel, this maps to channel.Send(). For IRequestChannel, this maps to channel.SendRequest() followed by a check for a null response.
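To make that concrete, here is a minimal sketch of what a one-way operation looks like at the contract level (the interface and operation names are just illustrative):

using System.ServiceModel;

[ServiceContract]
public interface INotificationService   // illustrative name
{
    // IsOneWay = true means the proxy returns once the channel's Send()
    // (or SendRequest(), over HTTP) completes -- not when the service has
    // processed the message, and not necessarily without blocking.
    [OperationContract(IsOneWay = true)]
    void Notify(string payload);
}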

Now, sometimes the underlying channel can complete immediately (UDP will drop the packet if the network is saturated, TCP may copy bytes into a kernel buffer if there’s room, etc). However, depending on the amount of data transmitted and the number of simultaneous calls to a proxy, you will often see blocking behavior. HTTP, TCP, and Pipes all have throttling built into their network protocols.

If this isn’t desirable, there are a few alternatives depending on your application design. First, if you want a truly non-blocking call, you should call channel.BeginSend/client.BeginXXXX (i.e. generate an async proxy). With an asynchronous proxy we will always be non-blocking from a thread perspective (which is always my recommendation for middle-tier solutions, though there’s some quota coordination necessary to avoid flooding your outgoing sockets).
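As a rough sketch, assuming a proxy generated with the /async option and reusing the illustrative NotificationClient/Notify names from above:

// NotificationClient is a hypothetical generated proxy with async methods.
NotificationClient client = new NotificationClient();

// BeginNotify hands the message to the channel stack and returns immediately;
// no thread blocks while the send is in flight.
client.BeginNotify("hello", asyncResult =>
{
    ((NotificationClient)asyncResult.AsyncState).EndNotify(asyncResult);
}, client);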

For one-way operations, when your async callback is signaled it means that the channel stack has "successfully put the message on the wire". Exactly when that happens depends on your channel stack:

  • TCP signals when the socket.[Begin]Send() of the serialized message has completed (either because it’s been buffered by the kernel or put onto the NIC)
  • Pipes are a similar process (NT Named Pipes work similarly to TCP under the covers but without the packet loss)
  • MSMQ signals when the Message has been transferred successfully to the queue manager
  • HTTP signals when we’ve received an empty response. The only alternative would be to remove all guarantees (and have to propagate any exception arbitrarily through some other call or thread). Trust me, this is better
  • UDP will complete when the UDP socket send completes (which is effectively instantaneous)

For two-way operations, when your async callback is signaled it means that the channel stack has "successfully put the message on the wire and then received a correlated response".

Asynchronous operations can be tricky, and can often get you into flooding trouble when used incorrectly. So be careful, use quotas to manage your flow control, and always remember that the internet is not a big truck; it is a series of tubes and sometimes they get clogged. And you shouldn’t try to fight the clogs by pouring more data down the tubes 🙂

WF/WCF July MSDN Webcasts

Starting last month, MSDN put together a regular series of webcasts on WCF and WF (focusing on .Net 3.5).  Here are the talks being broadcast in July:

July 7th, 10:00AM (PST)
Transactional Windows Communication Foundation Services with Juval Lowy

July 9th, 10:00AM (PST)
Using Windows Workflow Foundation to Build Services with Jon Flanders

July 11th, 10:00AM (PST)
WCF Extensibility Deep Dive with Jesus Rodriguez

July 18th, 10:00AM (PST)
SharePoint Server and WCF with Joe Klug

WsDualHttp and Faults

The other day a customer was sending unauthenticated messages to a service and the requests were timing out over WsDualHttp. When using WsHttpBinding or NetTcpBinding, the customer received an authentication MessageFault.  Why was there no Fault returned over WsDualHttpBinding?

The reason has to do with securing composite-duplex channels. WsDualHttp uses two HTTP connections to provide its duplex nature, and messages are correlated between those two connections based on WS-Addressing (ReplyTo/MessageId). Because of this, the server’s behavior is entirely dependent on data received from the client. If the client is malicious, it can cause the server to do a couple of things:

  • Initiate arbitrary outbound connections
  • Cause a "bouncing DoS" attack. For example, consider a server A that can send messages to server B, which is behind a firewall. Suppose that client C can send messages to A, but cannot send messages directly to server B (due to the firewall). Now suppose that the client sends a badly secured message to server A, with a ReplyTo equal to server B. If we sent back a fault for unsecured messages over WsDualHttp, this would result in C being able to DoS server B with bounce-assistance from server A.

In addition, since these are two one-way channels, the HTTP response (202 Accepted) has already returned prior to the Security channel (or any higher-layer in the channel/dispatcher stack) being called. So we cannot simply piggy-back the fault over the HTTP back channel.
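For reference, here is roughly what the two-connection setup looks like from the client side (the addresses, contract, and callback type are all illustrative); the ClientBaseAddress is exactly the kind of client-supplied data the server has to take on faith:

using System;
using System.ServiceModel;

WSDualHttpBinding binding = new WSDualHttpBinding();
// The server sends callbacks (and would send faults) to this address, which
// it learns purely from the WS-Addressing headers on the incoming request.
binding.ClientBaseAddress = new Uri("http://client.example.com:8000/callback");

DuplexChannelFactory<IOrderService> factory = new DuplexChannelFactory<IOrderService>(
    new InstanceContext(new OrderCallbackHandler()),            // hypothetical callback implementation
    binding,
    new EndpointAddress("http://server.example.com/orders"));   // illustrative service address
IOrderService proxy = factory.CreateChannel();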

When to use Async Service Operations

I was recently asked about the motivation for choosing asynchronous service operations in WCF (i.e. [OperationContract(AsyncPattern = true)]).

If you have an operation that is blocking (accessing SQL, Channels, etc.), then you should use AsyncPattern=true. That way you free up the thread that WCF is using to call your operation. The general idea is that if you are making a blocking call, then you should use its async version, and it will transparently play well with the WCF runtime.

Put another way: if you are calling a method that returns an IAsyncResult (i.e. you’re accessing SQL, or using sockets, files, or channels), then you can wrap that IAsyncResult and return it from your BeginXXX call (or return the raw IAsyncResult directly, depending on your scenario).
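Here is a sketch of the shape this takes on the service contract (the names are illustrative; the Begin implementation would typically return or wrap the IAsyncResult from something natively async, such as SqlCommand.BeginExecuteReader):

using System;
using System.ServiceModel;

[ServiceContract]
public interface IOrderLookup   // illustrative contract
{
    // WCF pairs BeginGetOrderCount/EndGetOrderCount into one logical
    // operation ("GetOrderCount") and frees its dispatcher thread between
    // Begin returning and the callback firing.
    [OperationContract(AsyncPattern = true)]
    IAsyncResult BeginGetOrderCount(AsyncCallback callback, object state);

    int EndGetOrderCount(IAsyncResult result);
}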

If you aren’t doing something that’s "natively async", then you shouldn’t be using AsyncPattern=true. That is, you shouldn’t create a thread just for the sake of performing "background work" as part of an asynchronous operation. It is legitimate to spawn a thread when your operation happens to kick off work that is outside of its completion scope, but in that case you should just have a synchronous method, not an async one.

Notes on lifetimes of Channels and their Factories

One question I get from custom channel authors has to do with the lifetimes of the various components involved, especially since, as per best practice, their heavyweight resources are attached to ChannelFactories and ChannelListeners and simply referenced from the Channel. Nicholas covered the basics in this post, which I’ll summarize here:

  • ChannelFactory created Channels cannot outlive their creator
  • channelFactory.Close/Abort will Close/Abort all Channels created by that factory
  • ChannelListener created Channels can outlive their creator
    • channelListener.Close/Abort simply disables the ability to accept new channels; existing channels can continue to be serviced

This makes life on the ChannelFactory side pretty straightforward. On the ChannelListener side, there are a few subtleties.

First, a Channel is owned by the listener from the time of creation until the successful completion of channel.Open. So if you are writing a custom listener that has offered up channels, you should clean them up if and only if they haven’t been opened.

Second, in order to perform eager disposal, you will need to track active ownership of your heavyweight resources. If there are opened channels, then you need to make sure that the shared resources that they leverage have their ownership transferred from the listener to your active channel(s). This can be accomplished through ref-counting, active transfers, or other mechanisms.
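One way to track that ownership is simple ref-counting; here is a minimal sketch (the resource type and the hook points are hypothetical):

using System.Threading;

class SharedResource   // e.g. a connection pool shared by the listener and its channels
{
    int refCount = 1;   // the listener holds the initial reference

    public void AddRef()
    {
        Interlocked.Increment(ref refCount);
    }

    public void Release()
    {
        if (Interlocked.Decrement(ref refCount) == 0)
        {
            // The last owner (listener or channel) is gone: tear down the
            // heavyweight resource here.
        }
    }
}

// channel opened  -> sharedResource.AddRef()
// channel closed  -> sharedResource.Release()
// listener closed -> sharedResource.Release()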

Pumping in a Layered IInputChannel

When writing a layered channel, sometimes you want to decouple processing without changing the channel shape. One example would be for layered demux purposes. When layering your IInputChannel on top of a "lower" IInputChannel, you have a few options for lifetime management and pumping that are worth noting (a pump sketch follows the list):

1. When your innerInputChannel.Receive call returns null, that means that no more messages will arrive on that particular channel. You should Close that channel and then Accept a new one.
2. If the innerListener.AcceptChannel call returns null, then your inner listener is completely done providing messages, and you should follow suit once you are through with any messages that you may have buffered.
3. If someone calls Close/Abort on your IInputChannel, you have the option of holding onto the inner channel within your listener (and providing a new facade on top when a new channel is accepted). Alternatively, you can close the inner channel and create a new one along with your new channel. The choice is entirely at your discretion (and at the mercy of your perf tuning results :))
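Here is a hedged sketch of such a pump; error handling and shutdown are elided, and the Process method stands in for whatever your layered channel does with each message:

using System;
using System.ServiceModel.Channels;

static class InputChannelPump   // hypothetical helper
{
    static void PumpMessages(IChannelListener<IInputChannel> innerListener, TimeSpan timeout)
    {
        IInputChannel innerChannel;
        while ((innerChannel = innerListener.AcceptChannel(timeout)) != null)   // null => (2) the listener is done
        {
            innerChannel.Open();
            Message message;
            while ((message = innerChannel.Receive(timeout)) != null)           // null => (1) this channel is done
            {
                Process(message);   // hypothetical hand-off to the layered channel's consumers
            }
            innerChannel.Close();   // close the drained channel, then accept the next one
        }
    }

    static void Process(Message message) { /* layered-channel specific */ }
}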

TcpTransport’s Buffer Management in WCF

I’ve gotten some questions recently about how our net.tcp transport functions at the memory allocation level. While at its core this is simply an “implementation detail”, this information can be very useful for tuning high performance routers and servers depending on your network topology.

When writing a service on top of net.tcp, there are a few layers where buffer allocations and copies can occur:

  • Buffers passed into socket.Receive()
  • Buffers created by the Message class (to support Message copying, etc.)
  • Buffers created by the Serialization stack (to convert Messages to/from parameters)

In addition, this behavior differs based on your TransferMode (Buffered or Streamed). In general you’ll find that Buffered mode will provide you the highest performance for “small messages” (i.e. < 4MB or so), while Streamed will provide better performance for larger messages (i.e. > 4MB). Usually this switch is combined with a tweak of your MaxBufferPoolSize in order to avoid memory thrashing.
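For example, the switch might look like this on the binding (the values are placeholders, not recommendations):

using System.ServiceModel;

NetTcpBinding binding = new NetTcpBinding();
binding.TransferMode = TransferMode.Streamed;        // switch from the Buffered default for large messages
binding.MaxReceivedMessageSize = 64 * 1024 * 1024;   // allow the larger payloads through
binding.MaxBufferPoolSize = 512 * 1024;              // tune the BufferManager pool to match your message sizes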

TransferMode.Buffered

In Buffered mode, the transport will reuse a fixed-size buffer on its calls to socket.Receive(). The ConnectionBufferSize setting on TcpTransportBindingElement controls the size of this buffer.
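If you need to get at that knob, it is exposed on the transport binding element; a sketch (the 64K value is just illustrative):

using System.ServiceModel.Channels;

TcpTransportBindingElement tcpTransport = new TcpTransportBindingElement();
tcpTransport.ConnectionBufferSize = 64 * 1024;   // size of the fixed buffer handed to socket.Receive()

CustomBinding binding = new CustomBinding(
    new BinaryMessageEncodingBindingElement(),
    tcpTransport);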

The data read from the wire is incrementally parsed at the .Net Message Framing level, and is copied into a new buffer (allocated by the BufferManager) that is sized to the length of the message. This step always involves a buffer copy, but rarely involves a new allocation for small messages (since the BufferManager will cache and recycle previously used buffers).

The message-sized buffer is then passed to the MessageEncoder for Message construction. Again, minimal allocations should occur here since we will pool XmlReader/Writer instances where possible. At this point you have a Message that is backed by a fixed-size buffer, and we will share that backing buffer for all copies of the Message created through message.CreateBufferedCopy().
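For example (incomingMessage here is whatever buffered Message you are holding; the maxBufferSize value is illustrative):

using System.ServiceModel.Channels;

MessageBuffer messageBuffer = incomingMessage.CreateBufferedCopy(int.MaxValue);
Message firstCopy = messageBuffer.CreateMessage();    // shares the original backing buffer
Message secondCopy = messageBuffer.CreateMessage();   // same shared buffer again, no new backing allocation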

If your contract is Message/Message, then no more deserialization will occur. If you are using Stream parameters, then the allocations are determined by your code (since you are creating the byte[]s to pass to stream.Read), and given the nature of the Stream APIs you will incur a buffer copy as well (since we have to fill the caller’s buffer with data stored in the callee).

If your contract uses CLR object parameters, then the Serializer will process the Message and generate CLR objects as per your Data Contracts. This step often entails a number of allocations (as you would expect, since you are generating a completely new set of constructs).

TransferMode.Streamed

There are three main differences that occur when you are using TransferMode.Streamed (a contract sketch follows the list):

1. The transport will use new variable-sized byte[]s in its calls to socket.Receive. These byte[]s are still allocated from the configured BufferManager. The transport will then offer up a scatter/gather-based stream to the MessageEncoder, so there are no extra buffer allocations or copies in this case.
2. The Message created from the Encoder will be a “Streamed” message. This means that if you simply Read() from the Message’s Body in a single shot, then we don’t allocate any extra memory. However, if you call message.CreateBufferedCopy(), we will incur some large allocations since we need to fully buffer at that point. Note that some binding elements (such as ReliableSession and certain Security configurations) will call CreateBufferedCopy() under the hood and trigger this situation.
3. Since the backing data may not all be in memory, invoking the Serializer will incur a larger cost, since it will be pulling all the data into memory at once (in order to generate the appropriate CLR objects, even if that is simply a byte[]).
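A contract shape that plays well with Streamed mode looks something like this (the names are illustrative):

using System.IO;
using System.ServiceModel;

[ServiceContract]
public interface IFileTransfer
{
    // A single Stream body lets the data flow through without being fully
    // buffered; read it in one pass to keep allocations down.
    [OperationContract]
    Stream DownloadFile(string path);
}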


As you can see, there is a tension between performance and usability. Hopefully this data will help you make the appropriate tradeoffs.

Controlling HTTP Cookies on a WCF Client

A few customers have tried to control their HTTP Cookie headers from a smart client (using a previously cached cookie, for example). The common experience has been “I enabled AllowCookies=true, and added the cookie using HttpRequestMessageProperty.Headers.Add. But somehow my cookie didn’t get included in my request.”

This is because “allowCookies” is a somewhat unfortunate name. When setting “allowCookies=true” on your HTTP-based binding (WsHttpBinding, BasicHttpBinding, etc.), what you are indicating is that you want WCF to manage the cookie headers for you. That means that WCF will create a CookieContainer, associate it with any IChannelFactory that is created from the binding, and then use that CookieContainer for all outgoing HttpWebRequests. The end result of this setup is that you have zero control over your cookies, since the CookieContainer is taking over all management (including echoing server cookies if and only if a cookie was sent in an earlier response to that factory).

To cut a long explanation short, if you want to manipulate cookies on the client, then you (somewhat unintuitively) need to ensure that allowCookies=false. In this manner, you are taking full responsibility for all cookies sent on that channel (in all cases, including capturing response cookies and echoing appropriately).
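A sketch of what that looks like against a generated proxy (client and the cookie value are illustrative, and allowCookies is false on the binding):

using System.ServiceModel;
using System.ServiceModel.Channels;

using (new OperationContextScope(client.InnerChannel))
{
    HttpRequestMessageProperty httpRequest = new HttpRequestMessageProperty();
    httpRequest.Headers["Cookie"] = "SessionId=abc123";   // your previously cached cookie

    OperationContext.Current.OutgoingMessageProperties[HttpRequestMessageProperty.Name] = httpRequest;

    client.DoWork();   // this request carries the cookie; later calls need their own scope
}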

Performance Characteristics of WCF Encoders

As part of the Framework, we ship 3 MessageEncoders (accessible through the relevant subclass of MessageEncodingBindingElement; a CustomBinding sketch follows the list):

1. Text – The “classic” web services encoder. Uses a text-based (UTF-8 by default) XML encoding. This is the default encoder used by BasicHttpBinding and WsHttpBinding.
2. MTOM – An interoperable format (though less broadly supported than Text) that allows for a more optimized transmission of binary blobs, as they don’t get base64 encoded.
3. Binary – A WCF-specific format that avoids base64 encoding your binary blobs, and also uses a dictionary-based algorithm to avoid data duplication. Binary supports “Session Encoders” that get smarter about data usage over the course of the session (through pattern recognition). This is the default encoder used by NetTcpBinding and NetNamedPipeBinding.
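Here is a sketch of picking one explicitly through a CustomBinding (the HTTP transport is just an illustrative pairing; Binary over HTTP is WCF-to-WCF only):

using System.ServiceModel.Channels;

CustomBinding binding = new CustomBinding(
    new BinaryMessageEncodingBindingElement(),   // or TextMessageEncodingBindingElement / MtomMessageEncodingBindingElement
    new HttpTransportBindingElement());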

I often get asked “which encoder is the fastest?” (and then “by how much?” :)). As always, the first principle of performance is to measure and tune your exact scenarios to determine if this is a bottleneck for you. That being said, here are some notes on the performance characteristics of our built-in Message Encoders.

Broadly speaking, encoders can impact your performance along two axes: the size of the encoded messages, and the CPU load required to generate/consume those encoded messages.

In general, Binary has the fastest encoding/decoding speed since it has less work to do, usually because there is less data to read and write thanks to its dictionary-based optimizations. The speedup is greater over TCP/NamedPipes since the encoder can recognize patterns (and negotiate optimizations) over the course of the session. If both participants are using WCF, then Binary is a natural choice for production. (Note that during development, Text may be useful for debugging purposes.)

Both Binary and MTOM yield much faster processing of binary data (by avoiding the base64 process as well as the associated size bloat). Binary achieves this with inline binary blobs. The MTOM format achieves this through an inline stub that references the binary blob stored outside of the Infoset (as a MIME part). In both cases the user model is abstracted from this detail, and the blobs will “appear” inline when accessed through the encoder.

If you do not have any binary data involved, MTOM will actually be slower than Text since it has the extra overhead of packaging and processing the Message within a MIME document. However, if there is enough binary data in the document then the savings from avoiding base64 encoding can make up for this added overhead.

We spent a lot of engineering effort tuning the performance of our UTF-8 Text encoder, so you will see better performance with UTF-8 than with the Unicode variations. And as to whether you should use Text or MTOM for interoperable endpoints, the guidance above should help with gut feel, but please measure your scenarios!