
Tracing System.Net to debug HTTP Clients


If you are writing software that leverages the System.Net.WebRequest class, you’re probably familiar with tools like Fiddler or Wireshark. You can use these tools to see the actual HTTP requests and responses going back and forth between your client and the server. A nice alternative to these tools, which I only recently discovered, is the System.Net trace source. The System.Net source emits logging messages from the HttpWebRequest and HttpWebResponse classes that give a very similar experience to using Fiddler.

Here’s an example App.config file that configures the System.Net listener to output both to the console and a log file:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>

  <system.diagnostics>

    <trace autoflush="true" />

    <sources>
      <source name="System.Net" maxdatasize="1024">
        <listeners>
          <add name="MyTraceFile"/>
          <add name="MyConsole"/>
        </listeners>
      </source>
    </sources>

    <sharedListeners>
      <add
        name="MyTraceFile"
        type="System.Diagnostics.TextWriterTraceListener"
        initializeData="System.Net.trace.log"
      />
      <add name="MyConsole" type="System.Diagnostics.ConsoleTraceListener" />
    </sharedListeners>

    <switches>
      <add name="System.Net" value="Information" />
      <!-- <add name="System.Net" value="Verbose" />-->
    </switches>

  </system.diagnostics>

</configuration>

Here I’ve set up two listeners: ‘MyTraceFile’, which outputs the trace information to a log file, and ‘MyConsole’, which outputs to the console.

My favourite test tool is TestDriven.NET which allows you to run arbitrary methods and sends the output to the Visual Studio output console. Being able to run a test method (I’ve got F8 mapped to run tests, so it’s a single keystroke) and see the System.Net trace output immediately in Visual Studio is very cool.

Here’s some code that makes a GET request to www.google.com:

var request = WebRequest.CreateDefault(new Uri("http://www.google.com/"));
request.Method = "GET";

var response = (HttpWebResponse)request.GetResponse();

using (var responseStream = response.GetResponseStream())
{
    if (responseStream == null)
    {
        Console.Out.WriteLine("response stream is null");
        return;
    }

    using (var reader = new StreamReader(responseStream))
    {
        // do something with the response body
        var responseBody = reader.ReadToEnd();
    }
}

When I run this code, I get the following trace output …

System.Net Information: 0 : [5752] Current OS installation type is 'Client'.
System.Net Information: 0 : [5752] RAS supported: True
System.Net Error: 0 : [5752] Can't retrieve proxy settings for Uri 'http://www.google.com/'. Error code: 12180.
System.Net Information: 0 : [5752] Associating HttpWebRequest#49685557 with ServicePoint#53977989
System.Net Information: 0 : [5752] Associating Connection#56846532 with HttpWebRequest#49685557
System.Net Information: 0 : [5752] Connection#56846532 - Created connection from 192.168.1.146:53202 to 173.194.67.99:80.
System.Net Information: 0 : [5752] Associating HttpWebRequest#49685557 with ConnectStream#19026863
System.Net Information: 0 : [5752] HttpWebRequest#49685557 - Request: GET / HTTP/1.1

System.Net Information: 0 : [5752] ConnectStream#19026863 - Sending headers
{
Host: www.google.com
Connection: Keep-Alive
}.
System.Net Information: 0 : [5752] Connection#56846532 - Received status line: Version=1.1, StatusCode=302, StatusDescription=Found.
System.Net Information: 0 : [5752] Connection#56846532 - Received headers
{
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.co.uk/
Set-Cookie: expires=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=www.google.com,path=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=www.google.com,domain=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=www.google.com,expires=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=.www.google.com,path=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=.www.google.com,domain=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=.www.google.com,expires=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=google.com,path=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=google.com,domain=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=google.com,expires=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=.google.com,path=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=.google.com,domain=; expires=Mon, 01-Jan-1990 00:00:00 GMT; path=/; domain=.google...}.
System.Net Information: 0 : [5752] ConnectStream#10789400::ConnectStream(Buffered 221 bytes.)
System.Net Information: 0 : [5752] Associating HttpWebRequest#49685557 with ConnectStream#10789400
System.Net Information: 0 : [5752] Associating HttpWebRequest#49685557 with HttpWebResponse#11016073
System.Net Warning: 0 : [5752] HttpWebRequest#49685557::() - Error code 302 was received from server response.
System.Net Warning: 0 : [5752] HttpWebRequest#49685557::() - Resubmitting request.
System.Net Information: 0 : [5752] Associating HttpWebRequest#49685557 with ServicePoint#23936385
System.Net Information: 0 : [5752] Associating Connection#22196665 with HttpWebRequest#49685557
System.Net Information: 0 : [5752] Connection#22196665 - Created connection from 192.168.1.146:53203 to 173.194.67.94:80.
System.Net Information: 0 : [5752] Associating HttpWebRequest#49685557 with ConnectStream#57250404
System.Net Information: 0 : [5752] HttpWebRequest#49685557 - Request: GET / HTTP/1.1

System.Net Information: 0 : [5752] ConnectStream#57250404 - Sending headers
{
Host: www.google.co.uk
Connection: Keep-Alive
}.
System.Net Information: 0 : [5752] Connection#22196665 - Received status line: Version=1.1, StatusCode=200, StatusDescription=OK.
System.Net Information: 0 : [5752] Connection#22196665 - Received headers
{
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Date: Thu, 12 Jul 2012 13:39:26 GMT
Expires: -1
Set-Cookie: NID=61=CTUlcAyhXQp63NVCOkXYWVgi2nMQiOUpyG-x1yRlw-Unhq3OyQ5zXCIxIJ9ctSN_qg6Lni90142sYKQzDZ7oZXBZxnWQbzhcjqVcKQEgCfBgMAjxhDgVLOfgXBR6IzTm; expires=Fri, 11-Jan-2013 13:39:26 GMT; path=/; domain=.google.co.uk; HttpOnly,PREF=ID=b7c02536ab59a395:FF=0:TM=1342100366:LM=1342100366:S=gqGT-3tWl96NIpdz; expires=Sat, 12-Jul-2014 13:39:26 GMT; path=/; domain=.google.co.uk,NID=61=CTUlcAyhXQp63NVCOkXYWVgi2nMQiOUpyG-x1yRlw-Unhq3OyQ5zXCIxIJ9ctSN_qg6Lni90142sYKQzDZ7oZXBZxnWQbzhcjqVcKQEgCfBgMAjxhDgVLOfgXBR6IzTm; expires=Fri, 11-Jan-2013 13:39:26 GMT; path=/; domain=.google.co.uk; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked
}.
System.Net Information: 0 : [5752] ConnectStream#42047594::ConnectStream(Buffered -1 bytes.)
System.Net Information: 0 : [5752] Associating HttpWebRequest#49685557 with ConnectStream#42047594
System.Net Information: 0 : [5752] Associating HttpWebRequest#49685557 with HttpWebResponse#47902635
System.Net Information: 0 : [5752] ContentLength=-1

You can see that Google users in the UK get a 302 redirect to www.google.co.uk. This is a special Google that refuses to say anything bad about Her Majesty and stops working to drink tea at exactly 11 am.

With the reporting level set at ‘Information’, you can see all the HTTP header information and some of the work that the underlying sockets are doing. Setting the level to ‘Verbose’ will give you the HTTP bodies as well.

For more information see the MSDN documentation here.

Happy HTTPing!


Reading UTF-8 Characters From An Infinite Byte Stream


I’ve been playing with the twitter streaming API today. In very simple terms, you make an HTTP request and then sit on the response stream reading objects off it. The stream is a stream of UTF-8 characters and each object is a JSON encoded data structure terminated by \r\n. Simple, I thought: I’ll just create a StreamReader and set up a while loop on its Read method. Here’s my first attempt …

using (var reader = new StreamReader(stream, Encoding.UTF8))
{
    var messageBuilder = new StringBuilder();
    var nextChar = 'x';
    while (reader.Peek() >= 0)
    {
        nextChar = (char)reader.Read();
        messageBuilder.Append(nextChar);

        if (nextChar == '\r')
        {
            ProcessBuffer(messageBuilder.ToString());
            messageBuilder.Clear();
        }
    }
}

Unfortunately it didn’t work. The StreamReader maintains a small internal buffer so I wouldn’t see the \r\n combination that marked the end of a new tweet until the next tweet came along and flushed the buffer.

OK, so let’s just read each byte from the stream and convert them one-by-one into UTF-8 characters. This works fine when your tweets are all in English, but UTF-8 can have multi-byte characters; any Japanese tweets I tried to read failed.
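
Here’s roughly what that byte-by-byte version looked like (a reconstruction for illustration, not the original code); casting each byte straight to a char only works for single-byte ASCII characters:

int byteAsInt = 0;
var messageBuilder = new StringBuilder();

while ((byteAsInt = stream.ReadByte()) != -1)
{
    // WRONG for multi-byte UTF-8 sequences: each byte is treated as a whole character,
    // so Japanese (and any other non-ASCII) text comes out mangled.
    var nextChar = (char)byteAsInt;
    messageBuilder.Append(nextChar);

    if (nextChar == '\r')
    {
        ProcessBuffer(messageBuilder.ToString());
        messageBuilder.Clear();
    }
}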

Thanks to ‘Richard’ on Stack Overflow, the answer turned out to be the Decoder class. It buffers the bytes of incomplete UTF-8 characters, allowing you to keep stacking up bytes until they are complete. Here’s a revised example that works great with Japanese tweets:

int byteAsInt = 0;
var messageBuilder = new StringBuilder();
var decoder = Encoding.UTF8.GetDecoder();
var nextChar = new char[1];

while ((byteAsInt = stream.ReadByte()) != -1)
{
    var charCount = decoder.GetChars(new[] { (byte)byteAsInt }, 0, 1, nextChar, 0);
    if (charCount == 0) continue;

    Console.Write(nextChar[0]);
    messageBuilder.Append(nextChar);

    if (nextChar[0] == '\r')
    {
        ProcessBuffer(messageBuilder.ToString());
        messageBuilder.Clear();
    }
}

Sprache – A Monadic Parser For C#


Recently I had a requirement to parse AMQP error messages. A typical message looks something like this:

The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, 
code=406, text="PRECONDITION_FAILED - parameters for queue 'my.redeclare.queue'
in vhost '/' not equivalent"
, classId=50, methodId=10, cause=

It starts off with the high-level message text ‘The AMQP operation was interrupted’, then a colon, then some comma-separated values, some of which are key-value pairs.

I wanted to parse these into a ‘semantic model’ – an object graph that represents the structure of the error. Now I could have done some pretty nasty string manipulation: looking for the first colon, separating the rest by commas, looking for ‘=’ and separating out the key-value pairs – but the code would have been rather ugly to say the least. I could have used regular expressions, but once again I doubt very much that I would have been able to read the resulting expression if I revisited the code in a couple of weeks’ time.

Then I remembered Sprache, a little monadic parser by Nicholas Blumhardt that I’d encountered last year when I was writing about Monads. The lovely thing about Sprache is that you write your parser in readable C# code and build the semantic model directly in the parser code. It’s very easy to use and very readable. Nicholas has an excellent step-by-step post here that I’d strongly recommend reading.
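
To give a flavour of what that looks like, here’s a tiny sketch (my own illustration, not the actual AMQP parser) that pulls a single key=value pair such as ‘code=406’ out of the error text, using the same Sprache combinators that appear later in this post:

// A sketch only: parse "code=406" style fragments into key-value pairs.
public static class AmqpErrorGrammar
{
    public static readonly Parser<KeyValuePair<string, string>> KeyValue =
        from key in Parse.CharExcept('=').Many().Text()
        from separator in Parse.Char('=')
        from value in Parse.CharExcept(',').Many().Text()
        select new KeyValuePair<string, string>(key.Trim(), value.Trim());
}

// usage: var pair = AmqpErrorGrammar.KeyValue.Parse("code=406");
// pair.Key == "code", pair.Value == "406"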

I found it on NuGet, but whoever had put it up there had since disappeared. After contacting Nicholas I decided to adopt it, first moving the source to GitHub and setting up continuous deployment to NuGet via TeamCity.CodeBetter.com.

If you go to the NuGet.org Sprache page, you’ll see that its owners are myself and Nicholas and each push to the GitHub repository results in a new package upload (the last three all done today while I was getting it working ;).

So if you need to do some parsing, give Sprache a try; it’s much easier than writing your own parser from scratch, and unlike regex you can actually read the code after you’ve written it.

You can see my AMQP parser experiment here.

REST – Epic Semantic Fail


Roy Fielding writes a PhD dissertation describing the architectural style of the World Wide Web. He coins the term ‘Representational State Transfer’ (REST) to describe it – after all, if you’re going to talk about something, you need to give it a name. Somehow, in a semantic shift of epic fail proportions, the term REST has also come to mean something completely different to most people, a ‘style’ of writing web services. This style has no agreed protocol.

The result? The internet is ablaze with an out of control REST flame war. It seems that many people think there’s a REST protocol when in fact there’s no such thing. Looking for a protocol in Roy Fielding’s dissertation will get you nowhere because it’s an academic paper describing an architectural style, there’s no protocol to be had. The only contribution Mr Fielding makes to the debate is to tell almost anyone who describes their API as RESTful, that it is not.

Writing RESTful web services, in practice – in the real world – means that you are on your own. You have to write your own protocol (probably implicitly, because you don’t even realise that’s what you’re doing). Now the whole thing about a protocol – TCP/IP, HTTP, SMTP, SOAP – is that everyone agrees on a set of (reasonably) unambiguous rules for communication. These can then be coded into libraries, toolkits, servers, what have you, and my Linux web server written in PHP can communicate with your .NET client running on Windows because the TCP/IP, HTTP and HTML specs are unambiguous enough to ensure that if you follow them stuff will work. If you write your own protocol and nobody else adopts it, it’s not very useful. If I want to write a client to communicate with your REST API I’m in a world of pain; there’s no serviceutil I can point at your API to generate me a proxy; instead I have to carefully read your documentation (if it exists) and then lovingly handcraft my client using low level HTTP API calls.

Now don’t get me wrong, I think a web service protocol based on a RESTful architectural style would be a wonderful thing, but let’s not kid ourselves that such a thing exists.

Show me the metadata.

The core missing pieces of any RESTful web service protocol are agreed standards on media type and link metadata. Everyone seems to agree that the Content-Type header should describe the structure and purpose of the resource, but currently it’s up for grabs how you might navigate from a media type description (like ‘application/vnd.sutekishop.customer+json’) to a machine readable schema definition for the customer – should it be XSD? JSON Schema? The same goes for hyperlinks. Although there’s an established HTML standard (the A tag), how links should be encoded in an XML or JSON representation is up to the individual API author. Similarly, although there seems to be agreement that the link metadata should live at a URI described by the ‘rel’ attribute, what that metadata should look like is also undefined.

Sure, there are some valiant attempts to come up with a protocol – I quite liked HAL, and the HAL browser is an interesting take on how RIA UIs might be purely metadata driven at runtime – but these are all still just proposals.

I think we’ll know when we have an established RESTful web service protocol, or protocols. It will be when we stop using the term ‘REST’ to describe what we are doing. When we’re writing SOAP based services we call them just that. “I’m writing a SOAP web service.” Not, “I’m writing ‘XML based RPC tunnelled over HTTP’”, which of course could be open to interpretation about exactly how we’re doing it. When we have an established protocol, ‘REST’ will be returned to its rightful place, and the only people who will use the term will be software architecture academics like Mr Fielding.

Evolution works for me

So far the tone of this rant has been somewhat negative; it seems like I’ve been rubbishing the whole ‘REST’ project. Actually I think the current situation is healthy. Monolithic ‘enterprise’ protocols like SOAP usually end up being a bad idea. The internet is a soup of small layered protocols, each one simple in itself, but that work together to make a much larger whole. The debate around REST has reminded us that much of the infrastructure to create hugely scalable services already exists. If the community can build on this, and fill in the missing pieces, preferably with small protocols that solve each problem individually, then I think we will arrive at a much better place. My reason for writing this piece is simply to warn the unwary, regular ‘Morts’ like myself, that when someone says, “Mike, can you write us a REST API?” the rule book has not yet been written and you will be making much of it up as you go along.

Using Git and GitHub in a Microsoft Development Team


The team at 15Below, my excellent clients, have been using Git and GitHub since last September. Although I’ve been using GitHub for open source projects for several years now, this is the first time I’ve worked with it in a largish (20+ developers) team. The default VCS for a Microsoft shop is, of course, TFS, so deciding to use GitHub might be seen as somewhat curious. This post describes why we came to the decision, how we integrate GitHub into our development process, and our experience so far.

So why did we choose Git as our VCS?

  • I, and several of my colleagues, had had experience with distributed VCSs, specifically Git, from working on open source projects. Having your own local repository gives you so much more flexibility and opportunity for experimentation that a non-distributed VCS seemed like a step backwards.
  • The team is split into small project teams of 2 or 3 developers, each working on different features, so being able to branch-by-feature was also a requirement. We needed a VCS with excellent branching and merging capabilities.
  • We also have a distributed team with members in the UK, India and Australia, so a cloud based solution seemed appropriate. Our OSS experience with GitHub made it the obvious choice.
  • Whenever one is choosing tools, the level of adoption in the wider development community should be a consideration, and although Git is rare in the Microsoft world it’s fast becoming the default VCS elsewhere.

GitHub is Git’s killer app. Without GitHub, Git would simply be just another DVCS. After you’ve used a cloud based VCS like GitHub it feels like overkill to even consider hosting one’s own master repository. We pay $25 per month for the basic Bronze plan, which is a trivial cost for an organisation of our size, yet it allows us to host our 5GB core repository and access for 20+ committers. I’m constantly amazed at Git and GitHub’s performance; I can pull the entire master branch down in just a few minutes and most normal pulls and pushes take a few seconds. Just to give you some idea of the scale of our software, running NDepend on our master branch gives:

  • 635,884 Lines of code
  • 435 Assemblies
  • 17,831 Types
  • 185,423 Methods
  • And we have 7665 commits since we started using GitHub last September.

So you can see that we are far from trivial users. GitHub and Git have proven to be reliable, scalable and fast (no, really fast) even for our rather bloated codebase.

The GitHub UI has also proved to be very useful. It gives a clear view of commits, and makes it easy to browse and comment on changes. Another nice touch is GitHub’s support for markdown syntax. We’ve started keeping technical documentation next to the code as .md files. This is great when you’re branching and merging because the documentation branches and merges along with the code. It also makes finding the docs for a particular assembly trivial since they’re part of the VS project.

Having decided on Git and GitHub, how did we integrate it into our existing tools and development process?

One lesson we’ve learnt is that source control tools that integrate into Visual Studio are problematic:

  • They tend to obfuscate changes to source code on disk with changes in the IDE. Weaning developers away from seeing everything from the view of the Solution Explorer has led to far fewer problems with inadvertently changed files and corrupted solution and project files.
  • Source controlled assets that are not controlled by the IDE get forgotten. ‘Everything that the IDE cares about’ is different from ‘Everything that’s not ignored in this directory tree’. Using a source control tool that’s not integrated into VS gives a much cleaner view of the repository.

I still use the command line tools via Cygwin, but I’m in a minority of one; most of the team use Git Extensions and fall back on the bash shell when they need to do something complex. We initially tried Tortoise Git, but it wasn’t ready for prime time. We’ve also looked at GitHub for Windows, but I don’t think anyone is using it day-to-day.

We have a single master repository on GitHub with multiple branches for releases and development. Each developer is a committer to the master repository. This is closest to the way one would work with a more old-fashioned client server tool like SVN and it seemed like the obvious model when we initially considered using GitHub. So far it’s worked reasonably well. We ‘branch-per-feature’, so each team works in their own feature branches and then merges into the development branch when they are done. We have discussed feature switches, but felt that it introduces an orthogonal source control concern into our code base.

We have also discussed using the GitHub model more directly with each developer having their own GitHub repository and issuing pull requests to a core repository. I quite like the idea of having a code review process built into the source control model, so it’s something I’d like to consider using in the future. I guess you’d have to have a ‘repo-guardian’ who handled all the pull requests. Whether this would be a single individual’s full time job, or something that would be shared around the team, is an interesting question.

We use TeamCity to manage our CI build process. It integrates well with GitHub and it only takes a few clicks to get it pulling on each push to GitHub. An essential piece of the branch-per-feature pattern is to have CI on every branch. Luckily TeamCity makes this pretty easy to do and with the new feature branch feature it should become trivial.

Problems with Git and GitHub

  • The security model doesn’t integrate with Active Directory, so we have to manage users and logins separately which is a pain. People often required help with the SSH keys when getting started.
  • Git is hard to learn. I think Git’s greatest strength and its greatest weakness is that there is no abstraction on top of the architecture of its implementation. You really have to understand how it works internally in order to use it correctly. This means there’s a non-trivial learning curve. Having said that, even our most junior developers now use it successfully, so the excuse that ‘it’s far too difficult for my team to learn’, means that you are probably underestimating your team.
  • Some developers might worry that not having TFS experience on their CV could hurt their employment opportunities. On the other hand, our top developers think it’s pretty cool that we use the same tools that they use for their open source projects.

So …

On the whole our experience of Git and GitHub has been good. Our primary fear, that some of the junior people would find it too difficult to learn, has proved to be unfounded. There’s no doubt that the learning curve is greater than for TFS or SVN, but the power is also greater. The performance of Git and GitHub continues to impress, and we have no complaints with the robustness or stability of either tool. The merging and branching power of Git has allowed us to introduce a far more flexible product strategy and the repo-in-the-cloud has made the geographic spread of the team a non-issue. In short, GitHub is a compelling alternative to TFS and is a choice that I’m happy we made.

Wiring up TopShelf, Windsor and EasyNetQ


EasyNetQ is my simple to use .NET API for the awesome RabbitMQ messaging broker. Architecting a system around a message bus involves writing many small focussed components that sit on the bus waiting for messages they care about to arrive. These are best implemented as Windows services. My favourite way of implementing Windows services is to use TopShelf. This very nice open source library grew out of the excellent MassTransit project. It makes writing Windows services super easy; you simply create a console project, “install-package Topshelf”, and use the neat fluent API to describe your service. An IoC container should be at the heart of any but the simplest .NET application. If you’ve not used an IoC container before I talked about why you should several years ago. There are quite a few IoC containers to choose from; my weapon of choice is Castle Windsor.

In this post I want to show how I wire up TopShelf, Windsor and EasyNetQ.

The Zen of IoC says that any application that uses an IoC container should reference it in only two places. First to create an instance of the container that lasts the lifetime of the application. Second to resolve a root service from the container. All the other dependencies are supplied invisibly by the container itself. This magic is what makes IoC containers such awesome and essential frameworks. Following this rule, in the Main() function we create the IoC container and resolve the root service of our application, in this case IVaultService, all within the fluent API provided by TopShelf.

public class Program
{
    public static void Main(string[] args)
    {
        // create the container and run any installers in this assembly
        var container = new WindsorContainer().Install(FromAssembly.This());

        // start of the TopShelf configuration
        HostFactory.Run(x =>
        {
            x.Service<IVaultService>(s =>
            {
                // resolve the root IVaultService service
                s.ConstructUsing(name => container.Resolve<IVaultService>());
                s.WhenStarted(tc => tc.Start());
                s.WhenStopped(tc =>
                {
                    tc.Stop();
                    // with Windsor you must _always_ release any components that you resolve.
                    container.Release(tc);
                    // make sure the container is disposed
                    container.Dispose();
                });
            });

            x.RunAsLocalSystem();

            x.SetDescription("Vault service.");
            x.SetDisplayName("My.Vault.Service");
            x.SetServiceName("My.Vault.Service");
        });
    }
}

A cardinal rule of Windsor is that you must release any components that you resolve. Windsor tracks any components that implement IDisposable and ensures that Dispose is called no matter where the component gets resolved in the dependency graph, but you need to call Release for this to happen correctly. The ‘tc’ variable is the instance of IVaultService that gets resolved in the ‘ConstructUsing’ call, so we can use it in the Release call.

What about EasyNetQ? The Zen of EasyNetQ says that you should create a single instance of IBus that lasts the lifetime of the application. Now we could have created our IBus instance in Main(), alongside the TopShelf setup and the newing-up of the container, but since we’ve got a container we want it to manage the lifetimes of all the components used in the application. First let’s create a simple factory method that gets the connection string for EasyNetQ and creates a new instance of IBus:

public class BusBuilder
{
    public static IBus CreateMessageBus()
    {
        var connectionString = ConfigurationManager.ConnectionStrings["easynetq"];
        if (connectionString == null || connectionString.ConnectionString == string.Empty)
        {
            throw new VaultServiceException("easynetq connection string is missing or empty");
        }

        return RabbitHutch.CreateBus(connectionString.ConnectionString);
    }
}

Now we can write our IWindsorInstaller to register our services:

public class Installer : IWindsorInstaller
{
    public void Install(IWindsorContainer container, IConfigurationStore store)
    {
        container.Register(
            Component.For<IVaultService>().ImplementedBy<VaultService>().LifestyleTransient(),
            Component.For<IBus>().UsingFactoryMethod(BusBuilder.CreateMessageBus).LifestyleSingleton()
        );
    }
}

Note that we tell Windsor to create our IBus instance using our factory with ‘UsingFactoryMethod’ rather than ‘Instance’. The Instance method would tell Windsor that we are taking responsibility for the lifetime of the service, but we want Windsor to call Dispose when the application shuts down; UsingFactoryMethod tells Windsor that it needs to manage the IBus lifestyle itself. We declare it as ‘LifestyleSingleton’ because we only want a single instance of IBus for the entire lifetime of the application.
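
For contrast, the ‘Instance’ style of registration that we avoided would look something like this (a sketch; as noted above, Windsor would then treat the bus as externally owned):

// the alternative we avoided: we create the bus ourselves and hand Windsor
// a ready-made instance, so Windsor won't dispose it for us on shutdown.
var bus = BusBuilder.CreateMessageBus();
container.Register(
    Component.For<IBus>().Instance(bus)
);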

Now we can reference IBus in our IVaultService implementation:

public interface IVaultService
{
    void Start();
    void Stop();
}

public class VaultService : IVaultService
{
    private readonly IBus bus;

    public VaultService(IBus bus)
    {
        this.bus = bus;
    }

    public void Start()
    {
        bus.SubscribeAsync<MyMessage>("vault_handler", msg =>
        {
            // handle the message here
        });
    }

    public void Stop()
    {
        // any shutdown code needed
    }
}

Here we are simply subscribing to MyMessage in the Start method of VaultService. I would probably also have an IMyMessageHandler service referenced by IVaultService to do the message handling itself.
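
That handler might look something like this (a sketch; IMyMessageHandler and its implementation are hypothetical and not part of the sample above):

public interface IMyMessageHandler
{
    Task HandleAsync(MyMessage message);
}

public class VaultService : IVaultService
{
    private readonly IBus bus;
    private readonly IMyMessageHandler handler;

    public VaultService(IBus bus, IMyMessageHandler handler)
    {
        this.bus = bus;
        this.handler = handler;
    }

    public void Start()
    {
        // delegate the actual message handling to the injected handler
        bus.SubscribeAsync<MyMessage>("vault_handler", msg => handler.HandleAsync(msg));
    }

    public void Stop()
    {
    }
}

// and register the hypothetical handler in the installer alongside the other components:
// Component.For<IMyMessageHandler>().ImplementedBy<MyMessageHandler>().LifestyleTransient()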

So there you have it, a simple recipe for using these three excellent OSS projects together. As a side note it’s worth pointing out that they play together without depending on each other directly. I think this is the way to go with OSS components; they should provide points that allow you to plug in other pieces without mandating them.

Return a Task From BeginExecuteNonQuery


Blog as notepad time. Just a little reminder for myself on how to return the result from BeginExecuteNonQuery as a Task<int>:

public Task<int> Save(string value)
{
    var taskCompletionSource = new TaskCompletionSource<int>();

    var connection = new SqlConnection(connectionString);
    connection.Open();
    var command = new SqlCommand("uspSaveSomeValue", connection)
    {
        CommandType = CommandType.StoredProcedure
    };
    command.Parameters.AddWithValue("@myparam", value);
    command.BeginExecuteNonQuery(asyncResult =>
    {
        var result = command.EndExecuteNonQuery(asyncResult);
        command.Dispose();
        connection.Dispose();
        taskCompletionSource.SetResult(result);
    }, null);

    return taskCompletionSource.Task;
}

If you know a better way, please comment below.

Yes, yes, I know, but I’m not working on 4.5 yet  :(

Update: Ken Egozi suggested using Task<int>.Factory.FromAsync. Of course! I’d been doing so much TaskCompletionSource manual task creation recently that I’d forgotten about this useful shortcut. Here’s a more succinct version using FromAsync:

public Task<int> Save(string value)
{
    var connection = new SqlConnection(connectionString);
    connection.Open();
    var command = new SqlCommand("uspSaveSomeValue", connection)
    {
        CommandType = CommandType.StoredProcedure
    };
    command.Parameters.AddWithValue("@myparam", value);

    return Task<int>.Factory.FromAsync(command.BeginExecuteNonQuery(), asyncResult =>
    {
        var result = command.EndExecuteNonQuery(asyncResult);
        command.Dispose();
        connection.Dispose();
        return result;
    });
}

Replacing EasyNetQ Components


EasyNetQ, my simple .NET API for RabbitMQ, is a library composed of small components. Until today, the code simply wired these components together in a messy hard-coded routine. Now it has its own tiny internal IoC container. When you write:

var bus = RabbitHutch.CreateBus("host=localhost");

... the static method CreateBus registers the components with the container and then resolves the IBus instance. The really cool thing about this is that it allows you, the user, to replace any of the internal components, including IBus, with your own implementations. An overload of the CreateBus method provides the hook which gives you access to the component registration. The signature looks like this:

public static IBus CreateBus(string connectionString, Action<IServiceRegister> registerServices)

The IServiceRegister interface provides a single method:

public interface IServiceRegister
{
IServiceRegister Register<TService>(Func<IServiceProvider, TService> serviceCreator) where TService : class;
}

So to register your own logger, based on IEasyNetQLogger, you'd write this code:

var logger = new MyLogger(); // MyLogger implements IEasyNetQLogger
var bus = RabbitHutch.CreateBus(connectionString,
serviceRegister => serviceRegister.Register(serviceProvider => logger));

The Register method's argument, Func<IServiceProvider, TService>, is a function that's run when CreateBus pulls together the components to make an IBus instance. IServiceProvider looks like this:

public interface IServiceProvider
{
TService Resolve<TService>() where TService : class;
}

This allows you to access other services that EasyNetQ provides. If for example you wanted to replace the default serializer with your own implementation of ISerializer, and you wanted to construct it with a reference to the internal logger, you could do this:

var bus = RabbitHutch.CreateBus(connectionString, serviceRegister => serviceRegister.Register(
serviceProvider => new MySerializer(serviceProvider.Resolve<IEasyNetQLogger>())));

There’s nothing to stop you registering your own interfaces with the container that you can then use with your implementations of EasyNetQ’s service interfaces.
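
For example, something like this registers a custom auditing interface and then resolves it from a replacement serializer (a sketch; IMessageAuditor, MessageAuditor and AuditingSerializer are hypothetical types, while ISerializer is the EasyNetQ interface mentioned above):

// IMessageAuditor is our own interface, not part of EasyNetQ (hypothetical example)
var bus = RabbitHutch.CreateBus(connectionString, serviceRegister => serviceRegister
    .Register<IMessageAuditor>(serviceProvider => new MessageAuditor())
    .Register<ISerializer>(serviceProvider =>
        new AuditingSerializer(serviceProvider.Resolve<IMessageAuditor>())));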

To see the complete list of components that make up the IBus instance, and how they are assembled, take a look at the ComponentRegistration class.


Parsing a Connection String With Sprache


Sprache is a very cool lightweight parser library for C#. Today I was experimenting with parsing EasyNetQ connection strings, so I thought I’d have a go at getting Sprache to do it. An EasyNetQ connection string is a list of key-value pairs like this:

key1=value1;key2=value2;key3=value3

The motivation for looking at something more sophisticated than simply chopping strings based on delimiters is that I’m thinking of having more complex values that would themselves need parsing. But that’s for the future, today I’m just going to parse a simple connection string where the values can be strings or numbers (ushort to be exact).

So, I want to parse a connection string that looks like this:

virtualHost=Copa;username=Copa;host=192.168.1.1;password=abc_xyz;port=12345;requestedHeartbeat=3

… into a strongly typed structure like this:

public class ConnectionConfiguration : IConnectionConfiguration
{
    public string Host { get; set; }
    public ushort Port { get; set; }
    public string VirtualHost { get; set; }
    public string UserName { get; set; }
    public string Password { get; set; }
    public ushort RequestedHeartbeat { get; set; }
}

I want it to be as easy as possible to add new connection string items.

First let’s define a name for a function that updates a ConnectionConfiguration. An uncommonly used version of the ‘using’ statement allows us to give a short name to a complex type:

using UpdateConfiguration = Func<ConnectionConfiguration, ConnectionConfiguration>;

Now let’s define a little function that creates a Sprache parser for a key-value pair. We supply the key and a parser for the value, and get back a parser that can update the ConnectionConfiguration.

public static Parser<UpdateConfiguration> BuildKeyValueParser<T>(
    string keyName,
    Parser<T> valueParser,
    Expression<Func<ConnectionConfiguration, T>> getter)
{
    return
        from key in Parse.String(keyName).Token()
        from separator in Parse.Char('=')
        from value in valueParser
        select (Func<ConnectionConfiguration, ConnectionConfiguration>)(c =>
        {
            CreateSetter(getter)(c, value);
            return c;
        });
}

CreateSetter is a little function that turns a property expression (like x => x.Name) into an Action<TTarget, TProperty>.
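
CreateSetter isn’t shown in the post, but one possible implementation looks something like this (a sketch that assumes the expression is a simple property access, and needs System.Reflection and System.Linq.Expressions):

public static Action<TTarget, TProperty> CreateSetter<TTarget, TProperty>(
    Expression<Func<TTarget, TProperty>> getter)
{
    // assumes the body is a simple member access like x => x.Name
    var propertyInfo = (PropertyInfo)((MemberExpression)getter.Body).Member;
    return (target, value) => propertyInfo.SetValue(target, value, null);
}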

Next let’s define parsers for string and number values:

public static Parser<string> Text = Parse.CharExcept(';').Many().Text();
public static Parser<ushort> Number = Parse.Number.Select(ushort.Parse);

Now we can chain a series of BuildKeyValueParser invocations and Or them together so that we can parse any of our expected key-values:

public static Parser<UpdateConfiguration> Part = new List<Parser<UpdateConfiguration>>
{
    BuildKeyValueParser("host", Text, c => c.Host),
    BuildKeyValueParser("port", Number, c => c.Port),
    BuildKeyValueParser("virtualHost", Text, c => c.VirtualHost),
    BuildKeyValueParser("requestedHeartbeat", Number, c => c.RequestedHeartbeat),
    BuildKeyValueParser("username", Text, c => c.UserName),
    BuildKeyValueParser("password", Text, c => c.Password),
}.Aggregate((a, b) => a.Or(b));

Each invocation of BuildKeyValueParser defines an expected key-value pair of our connection string. We just give the key name, the parser that understands the value, and the property on ConnectionConfiguration that we want to update. In effect we’ve defined a little DSL for connection strings. If I want to add a new connection string value, I simply add a new property to ConnectionConfiguration and a single line to the above code.
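
For example, adding a hypothetical ‘prefetchCount’ setting would just mean a new property and one extra parser line, something like this:

// a hypothetical new setting, for illustration only

// 1. add a property to ConnectionConfiguration:
public ushort PrefetchCount { get; set; }

// 2. add one more line to the Part parser list:
BuildKeyValueParser("prefetchCount", Number, c => c.PrefetchCount),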

Now let’s define a parser for the entire string, by saying that we’ll parse any number of key-value parts:

public static Parser<IEnumerable<UpdateConfiguration>> ConnectionStringBuilder =
    from first in Part
    from rest in Parse.Char(';').Then(_ => Part).Many()
    select Cons(first, rest);
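
Cons isn’t shown here; it’s just a small helper that sticks a single item on the front of a sequence, something like this sketch:

public static IEnumerable<T> Cons<T>(T head, IEnumerable<T> rest)
{
    yield return head;
    foreach (var item in rest)
    {
        yield return item;
    }
}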

All we have to do now is parse the connection string and apply the chain of update functions to a ConnectionConfiguration instance:

public IConnectionConfiguration Parse(string connectionString)
{
    var updater = ConnectionStringGrammar.ConnectionStringBuilder.Parse(connectionString);
    return updater.Aggregate(new ConnectionConfiguration(), (current, updateFunction) => updateFunction(current));
}

We get lots of nice things out of the box with Sprache, one of the best is the excellent error messages:

Parsing failure: unexpected 'x'; expected host or port or virtualHost or requestedHeartbeat or username or password (Line 1, Column 1).

Sprache is really nice for this kind of task. I’d recommend checking it out.

EasyNetQ Cluster Support


EasyNetQ, my super simple .NET API for RabbitMQ, now (from version 0.7.2.34) supports RabbitMQ clusters without any need to deploy a load balancer.

Simply list the nodes of the cluster in the connection string ...

var bus = RabbitHutch.CreateBus("host=ubuntu:5672,ubuntu:5673");

In this example I have set up a cluster on a single machine, 'ubuntu', with node 1 on port 5672 and node 2 on port 5673. When the CreateBus statement executes, EasyNetQ will attempt to connect to the first host listed (ubuntu:5672). If it fails to connect it will attempt to connect to the second host listed (ubuntu:5673). If neither node is available it will sit in a re-try loop attempting to connect to both servers every five seconds. It logs all this activity to the registered IEasyNetQLogger. You might see something like this if the first node was unavailable:

DEBUG: Trying to connect
ERROR: Failed to connect to Broker: 'ubuntu', Port: 5672 VHost: '/'. ExceptionMessage: 'None of the specified endpoints were reachable'
DEBUG: OnConnected event fired
INFO: Connected to RabbitMQ. Broker: 'ubuntu', Port: 5674, VHost: '/'

If the node that EasyNetQ is connected to fails, EasyNetQ will attempt to connect to the next listed node. Once connected, it will re-declare all the exchanges and queues and re-start all the consumers. Here's an example log record showing one node failing then EasyNetQ connecting to the other node and recreating the subscribers:

INFO: Disconnected from RabbitMQ Broker
DEBUG: Trying to connect
DEBUG: OnConnected event fired
DEBUG: Re-creating subscribers
INFO: Connected to RabbitMQ. Broker: 'ubuntu', Port: 5674, VHost: '/'

You get automatic fail-over out of the box. That’s pretty cool.

If you have multiple services using EasyNetQ to connect to a RabbitMQ cluster, they will all initially connect to the first listed node in their respective connection strings. For this reason the EasyNetQ cluster support is not really suitable for load balancing high throughput systems. I would recommend that you use a dedicated hardware or software load balancer instead, if that’s what you want.

A Functional IoC Container


Today I was idly thinking about an idea I had a couple of years ago for a functional IoC container. I’d had a go at implementing such a beast, but soon got bogged down in a tangled mess of spaghetti reflection code and gave it up as too much bother. But today it suddenly occurred to me that there was no need for any reflection voodoo; the C# type system is powerful enough to do all the work for us.

In object oriented programming languages we build programs from classes. Classes declare the contract(s) they support with interfaces and declare their dependencies with constructor arguments. We use an IoC container to wire instances of our classes together to make a running program.

Pure functional languages, like Haskell, don’t have any concept of class, instead they use currying and partial application to compose hierarchies of functions.

Here’s an example of a purely functional program written in C#.

public static class Module
{
    public static Data GetAndTransform(Func<Input, Data> dataAccsessor, Func<Data, Data> transformer, int id)
    {
        var input = new Input() { Id = id };
        var data = dataAccsessor(input);
        var transformed = transformer(data);
        return transformed;
    }

    public static Data DataAccsessor(Input input)
    {
        return new Data
        {
            Id = input.Id,
            Name = "Test"
        };
    }

    public static Data Transformer(Data original)
    {
        original.Name = original.Name + " transformed";
        return original;
    }
}

GetAndTransform simply takes an int id argument, does some work, and then returns some data. It needs a dataAccsessor and a transformer in order to do its job.

C# doesn’t support currying or partial application, so in order to run it we have to compose the program and execute it all in one step. For example:

var id = 10;
var data = Module.GetAndTransform(Module.DataAccsessor, Module.Transformer, id);

Console.Out.WriteLine("data.Id = {0}", data.Id);
Console.Out.WriteLine("data.Name = {0}", data.Name);

But what if we had a ‘currying container’, one that could compose the program in one step and then return a function for us to execute in another? Here is such a container at work with our program:

var registration = new Container()
    .Register<Func<Input, Data>, Func<Data, Data>, int, Data>(Module.GetAndTransform)
    .Register<Input, Data>(Module.DataAccsessor)
    .Register<Data, Data>(Module.Transformer);

var main = registration.Get<Func<int, Data>>();

var data = main(10);

Console.Out.WriteLine("data.Id = {0}", data.Id);
Console.Out.WriteLine("data.Name = {0}", data.Name);

In the first line, we create a new instance of our container. On the next three lines we register our functions. Unfortunately C#’s type inference isn’t powerful enough to let us do away with the tedious type annotations; we have to explicitly declare the argument and return types of each of our functions.

Once our functions are registered we can ask the container for a program (main) that takes an int and returns a Data instance. The container works out that it needs to curry GetAndTransform and then partially apply DataAccsessor and Transformer to it to produce the desired function.

We can then run our ‘main’ function which gives us our expected output:

data.Id = 10
data.Name = Test transformed

The container turns out to be very simple, just a dictionary that’s keyed by type and contains a collection of constructor functions that know how to build the target (key) type.

public interface IRegistration
{
    void Add(Type target, Func<object> constructor);
    T Get<T>();
}

public class Container : IRegistration
{
    private readonly Dictionary<Type, Func<object>> registrations = new Dictionary<Type, Func<object>>();

    public void Add(Type target, Func<object> constructor)
    {
        registrations.Add(target, constructor);
    }

    public T Get<T>()
    {
        return (T)registrations[typeof(T)]();
    }
}

The magic sauce is in the Registration function overloads. If you take the standard functional idea that a function should only have one argument and one return type, you can take any input function, curry it, and then partially apply arguments until you are left with a Func<X,Y>. So you know what the ‘target’ type of each function should be, a function from the last argument to the return type. A Func<A,B,C,R> gets resolved to a Func<C,R>. There’s no need to explicitly register a target, it’s implicit from the type of the provided function:

public static class RegistrationExtensions
{
    public static IRegistration Register<A, R>(this IRegistration registration, Func<A, R> source)
    {
        var targetType = typeof(Func<A, R>);
        var curried = Functional.Curry(source);

        registration.Add(targetType, () => curried);

        return registration;
    }

    public static IRegistration Register<A, B, R>(this IRegistration registration, Func<A, B, R> source)
    {
        var targetType = typeof(Func<B, R>);
        var curried = Functional.Curry(source);

        registration.Add(targetType, () => curried(
            registration.Get<A>()
        ));

        return registration;
    }

    public static IRegistration Register<A, B, C, R>(this IRegistration registration, Func<A, B, C, R> source)
    {
        var targetType = typeof(Func<C, R>);
        var curried = Functional.Curry(source);

        registration.Add(targetType, () => curried(
            registration.Get<A>()
        )(
            registration.Get<B>()
        ));

        return registration;
    }
}

Each overload deals with an input function with a different number of arguments. My simple experiment only works with functions with up to three arguments (two dependencies and an input type), but it would be easy to extend for higher numbers. The Curry function is stolen from Oliver Sturm and looks like this:

public static class Functional
{
    public static Func<A, R> Curry<A, R>(Func<A, R> input)
    {
        return input;
    }

    public static Func<A, Func<B, R>> Curry<A, B, R>(Func<A, B, R> input)
    {
        return a => b => input(a, b);
    }

    public static Func<A, Func<B, Func<C, R>>> Curry<A, B, C, R>(Func<A, B, C, R> input)
    {
        return a => b => c => input(a, b, c);
    }
}

Rather nice, even if I say so myself.

Of course this little experiment has many limitations. For a start it only understands functions in terms of Func< … >, so you can’t have more than one function of each ‘type’. You couldn’t have two Func<int,int> for example, which might be somewhat limiting.

The code is on GitHub here if you want to have a play.

EasyNetQ Publisher Confirms


EasyNetQ is my easy-to-use .NET API for RabbitMQ.

The default AMQP publish is not transactional and doesn't guarantee that your message will actually reach the broker. AMQP does specify a transactional publish, but with RabbitMQ it is extremely slow, around 200 times slower than a non-transactional publish, so we haven't supported it via the EasyNetQ API. For high-performance guaranteed delivery it's recommended that you use 'Publisher Confirms'. Simply speaking, this is an extension to AMQP that provides a call-back when your message has been successfully received by the broker.

What does 'successfully received' mean? It depends ...

  • A transient message is confirmed the moment it is enqueued.
  • A persistent message is confirmed as soon as it is persisted to disk, or when it is consumed on every queue.
  • An unroutable transient message is confirmed as soon as it is published.

For more information on publisher confirms, please read the announcement on the RabbitMQ blog.

To use publisher confirms, you must first create the publish channel with publisher confirms on:

var channel = bus.OpenPublishChannel(x => x.WithPublisherConfirms())

Next you must specify success and failure callbacks when you publish your message:

channel.Publish(message, x =>
    x.OnSuccess(() =>
    {
        // do success processing here
    })
    .OnFailure(() =>
    {
        // do failure processing here
    }));

Be careful not to dispose the publish channel before your call-backs have had a chance to execute.

Here's an example of a simple test. We're publishing 10,000 messages and then waiting for them all to be acknowledged before disposing the channel. There's a timeout, so if the batch takes longer than 10 seconds we abort with an exception.

const int batchSize = 10000;
var callbackCount = 0;
var stopwatch = new Stopwatch();
stopwatch.Start();

using (var channel = bus.OpenPublishChannel(x => x.WithPublisherConfirms()))
{
    for (int i = 0; i < batchSize; i++)
    {
        var message = new MyMessage { Text = string.Format("Hello Message {0}", i) };
        channel.Publish(message, x =>
            x.OnSuccess(() =>
            {
                callbackCount++;
            })
            .OnFailure(() =>
            {
                callbackCount++;
            }));
    }

    // wait until all the publications have been acknowledged.
    while (callbackCount < batchSize)
    {
        if (stopwatch.Elapsed.Seconds > 10)
        {
            throw new ApplicationException("Aborted batch with timeout");
        }
        Thread.Sleep(10);
    }
}

Nicer Client Properties For EasyNetQ


EasyNetQ is my lightweight easy-to-use .NET API for RabbitMQ.

Today I added a small but very nice feature, better client properties. Now when you look at connections created by EasyNetQ you can see the machine that connected, the application and the application’s location on disk. It also gives you the date and time that EasyNetQ first connected. Very useful for debugging.

Here’s an example. Check out the ‘Client Properties’ section.

[Screenshot: the ‘Client Properties’ section of a connection in the RabbitMQ management UI]

A C# .NET Client Proxy For The RabbitMQ Management API


RabbitMQ comes with a very nice Management UI and an HTTP JSON API that allows you to configure and monitor your RabbitMQ broker. From the website:

The rabbitmq-management plugin provides an HTTP-based API for management and monitoring of your RabbitMQ server, along with a browser-based UI and a command line tool, rabbitmqadmin. Features include:

  • Declare, list and delete exchanges, queues, bindings, users, virtual hosts and permissions.
  • Monitor queue length, message rates globally and per channel, data rates per connection, etc.
  • Send and receive messages.
  • Monitor Erlang processes, file descriptors, memory use.
  • Export / import object definitions to JSON.
  • Force close connections, purge queues.

Wouldn’t it be cool if you could do all these management tasks from your .NET code? Well now you can. I’ve just added a new project to EasyNetQ called EasyNetQ.Management.Client. This is a .NET client-side proxy for the HTTP-based API.

It’s on NuGet, so to install it, you simply run:

PM> Install-Package EasyNetQ.Management.Client

To give an overview of the sort of things you can do with EasyNetQ.Management.Client, have a look at this code. It first creates a new Virtual Host and a User, and gives the User permissions on the Virtual Host. Then it re-connects as the new user, creates an exchange and a queue, binds them, and publishes a message to the exchange. Finally it gets the first message from the queue and outputs it to the console.

var initial = new ManagementClient("http://localhost", "guest", "guest");

// first create a new virtual host
var vhost = initial.CreateVirtualHost("my_virtual_host");

// next create a user for that virtual host
var user = initial.CreateUser(new UserInfo("mike", "topSecret"));

// give the new user all permissions on the virtual host
initial.CreatePermission(new PermissionInfo(user, vhost));

// now log in again as the new user
var management = new ManagementClient("http://localhost", user.name, "topSecret");

// test that everything's OK
management.IsAlive(vhost);

// create an exchange
var exchange = management.CreateExchange(new ExchangeInfo("my_exchange", "direct"), vhost);

// create a queue
var queue = management.CreateQueue(new QueueInfo("my_queue"), vhost);

// bind the exchange to the queue
management.CreateBinding(exchange, queue, new BindingInfo("my_routing_key"));

// publish a test message
management.Publish(exchange, new PublishInfo("my_routing_key", "Hello World!"));

// get any messages on the queue
var messages = management.GetMessagesFromQueue(queue, new GetMessagesCriteria(1, false));

foreach (var message in messages)
{
    Console.Out.WriteLine("message.payload = {0}", message.payload);
}

This library is also ideal for monitoring queue levels, channels and connections on your RabbitMQ broker. For example, this code prints out details of all the current connections to the RabbitMQ broker:

var connections = managementClient.GetConnections();

foreach (var connection in connections)
{
    Console.Out.WriteLine("connection.name = {0}", connection.name);
    Console.WriteLine("user:\t{0}", connection.client_properties.user);
    Console.WriteLine("application:\t{0}", connection.client_properties.application);
    Console.WriteLine("client_api:\t{0}", connection.client_properties.client_api);
    Console.WriteLine("application_location:\t{0}", connection.client_properties.application_location);
    Console.WriteLine("connected:\t{0}", connection.client_properties.connected);
    Console.WriteLine("easynetq_version:\t{0}", connection.client_properties.easynetq_version);
    Console.WriteLine("machine_name:\t{0}", connection.client_properties.machine_name);
}

On my machine, with one consumer running it outputs this:
 
connection.name = [::1]:64754 -> [::1]:5672
user: guest
application: EasyNetQ.Tests.Performance.Consumer.exe
client_api: EasyNetQ
application_location: D:\Source\EasyNetQ\Source\EasyNetQ.Tests.Performance.Consumer\bin\Debug
connected: 14/11/2012 15:06:19
easynetq_version: 0.9.0.0
machine_name: THOMAS

You can see the name of the application that’s making the connection, the machine it’s running on and even its location on disk. That’s rather nice. From this information it wouldn’t be too hard to auto-generate a complete system diagram of your distributed messaging application. Now there’s an idea :)

For more information, check out the documentation.

Using BlockingCollection To Communicate Between Threads


Consider these (somewhat) common programming challenges:

  • I’m using a third party library that is not thread safe, but I want my application to share work between multiple threads. How do I marshal calls from my multi-threaded code to the single-threaded library?
  • I have a single source of events on a single thread, but I want to share the work between a pool of multiple threads.
  • I have multiple threads emitting events, but I want to consume them on a single thread.

One way of doing this would be to have some shared state, a field or a property on a static class, and wrap locks around it so that multiple threads can access it safely. This is a pretty common way of trying to skin this particular cat, but it’s shot through with traps for the unwary. Also, it can hurt performance because access to the shared resource is serialized, even though the things accessing it are running in parallel.
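
For illustration, the shared-state approach I mean looks something like this (a sketch of the pattern I’m arguing against, not code from this example):

// Shared state protected by a lock. It works, but every access is
// serialized, and it's easy to get the locking subtly wrong.
public static class SharedWork
{
    private static readonly object padlock = new object();
    private static readonly Queue<string> queue = new Queue<string>();

    public static void Add(string item)
    {
        lock (padlock)
        {
            queue.Enqueue(item);
        }
    }

    public static bool TryTake(out string item)
    {
        lock (padlock)
        {
            if (queue.Count > 0)
            {
                item = queue.Dequeue();
                return true;
            }
            item = null;
            return false;
        }
    }
}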

A better way is to use a BlockingCollection and have your threads communicate via message classes.

BlockingCollection is a class in the new System.Collections.Concurrent namespace that arrived with .NET 4.0. It contains a ConcurrentQueue, although you can swap this for a ConcurrentStack or a ConcurrentBag if you want. You push objects in at one end and sit in a loop consuming them from the other. The (multiple) producer(s) and (multiple) consumer(s) can be running on different threads without any locks. That’s OK because the Concurrent namespace collection classes are guaranteed to be thread safe. The ‘blocking’ part of the name is there because the consuming end blocks until an object is available. Justin Etheredge has an excellent post that looks at BlockingCollection in more detail here.

For an example, let’s implement a parallel pipeline. A ventilator produces tasks to be processed in parallel, a set of workers process the tasks on separate threads, and a sink collects the results back together again. It shows both one-to-many and many-to-one thread communication. I’ve stolen the idea and the diagram from the excellent ZeroMQ Guide:

[Diagram: a ventilator pushes tasks to a set of parallel workers, which push their results to a sink – from the ZeroMQ Guide]

First we’ll need a class that represents a piece of work, we’ll keep it super simple for this example:

public class WorkItem
{
    public string Text { get; set; }
}

We’ll need two BlockingCollections, one to take the tasks from the ventilator to the workers, and another to take the finished work from the workers to the sink:

var ventilatorQueue = new BlockingCollection<WorkItem>();
var sinkQueue = new BlockingCollection<WorkItem>();

Now let’s write our ventilator:

public static void StartVentilator(BlockingCollection<WorkItem> ventilatorQueue)
{
    Task.Factory.StartNew(() =>
    {
        for (int i = 0; i < 100; i++)
        {
            ventilatorQueue.Add(new WorkItem { Text = string.Format("Item {0}", i) });
        }
    }, TaskCreationOptions.LongRunning);
}

It just iterates 100 times creating work items and pushing them on the ventilatorQueue.

Here is a worker:

public static void StartWorker(int workerNumber,
    BlockingCollection<WorkItem> ventilatorQueue,
    BlockingCollection<WorkItem> sinkQueue)
{
    Task.Factory.StartNew(() =>
    {
        foreach (var workItem in ventilatorQueue.GetConsumingEnumerable())
        {
            // pretend to take some time to process
            Thread.Sleep(30);
            workItem.Text = workItem.Text + " processed by worker " + workerNumber;
            sinkQueue.Add(workItem);
        }
    }, TaskCreationOptions.LongRunning);
}

BlockingCollection provides a GetConsumingEnumerable method that yields each item in turn. It blocks if there are no items on the queue. Note that I’m not worrying about shutdown patterns in this code. In production code you’d need to worry about how to close down your worker threads.
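As a rough sketch of one such shutdown pattern (not part of the example above): if the ventilator calls CompleteAdding once it has queued everything, GetConsumingEnumerable stops yielding when the queue drains and the worker loops exit cleanly. With multiple workers you would also need to decide when to call CompleteAdding on the sinkQueue, for example after waiting on all the worker tasks.

// Hypothetical shutdown-aware variant of the ventilator.
public static void StartVentilator(BlockingCollection<WorkItem> ventilatorQueue)
{
    Task.Factory.StartNew(() =>
    {
        for (int i = 0; i < 100; i++)
        {
            ventilatorQueue.Add(new WorkItem { Text = string.Format("Item {0}", i) });
        }

        // Signal that no more items will be added; the workers'
        // GetConsumingEnumerable loops end once the queue is drained.
        ventilatorQueue.CompleteAdding();
    }, TaskCreationOptions.LongRunning);
}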

Next let’s write our sink:

public static void StartSink(BlockingCollection<WorkItem> sinkQueue)
{
    Task.Factory.StartNew(() =>
    {
        foreach (var workItem in sinkQueue.GetConsumingEnumerable())
        {
            Console.WriteLine("Processed Message: {0}", workItem.Text);
        }
    }, TaskCreationOptions.LongRunning);
}

Once again, this sits in an infinite foreach loop consuming items from the sinkQueue.

Finally we need to wire up the pieces and kick it off:

StartSink(sinkQueue);

StartWorker(0, ventilatorQueue, sinkQueue);
StartWorker(1, ventilatorQueue, sinkQueue);
StartWorker(2, ventilatorQueue, sinkQueue);

StartVentilator(ventilatorQueue);

I’ve started the sink first, then the workers and finally the producer. It doesn’t overly matter what order they start in since the queues will store any tasks the ventilator creates before the workers and the sink start.

Running the code I get output something like this:

Processed Message: Item 1 processed by worker 1
Processed Message: Item 2 processed by worker 0
Processed Message: Item 0 processed by worker 2
Processed Message: Item 5 processed by worker 2
Processed Message: Item 3 processed by worker 1

....

Processed Message: Item 95 processed by worker 0
Processed Message: Item 98 processed by worker 0
Processed Message: Item 97 processed by worker 2
Processed Message: Item 96 processed by worker 1
Processed Message: Item 99 processed by worker 0

This pattern is a great way of decoupling the communication between a source and a sink, or a producer and a consumer. It also allows you to have multiple sources and multiple sinks, but primarily it’s a safe way for multiple threads to interact.

The complete example is here on GitHub.


RabbitMQ On Windows With .NET, A Case Study


Any reader of this blog will know that my big project over the last year has been to create a simple .NET API for RabbitMQ called EasyNetQ. I’ve been working as a software architect at 15Below for the last year and a half. The prime motivation for writing EasyNetQ was so that our developers would have an easy API for working with RabbitMQ on .NET. I was very fortunate that our founder and technical director, John Clynes, supported my wish to build it as an open source library. I originally wrote this post for the VMWare blog as a case study of running RabbitMQ in a Microsoft environment.

15Below is based in Brighton on the south coast of England, famous for its Regency pavilion and Victorian pier. We provide messaging and integration services for the travel industry. Our clients include Ryan Air, Qantas, JetBlue, Thomas Cook and around 30 other airline and rail customers. We send hundreds of millions of transactional notifications each year to our customers’ passengers.

RabbitMQ has helped us to significantly simplify and stabilise our software. It’s one of those black boxes that you install, configure, and then really don’t have to worry about. In over a year of production we’ve found it to be extremely stable and reliable.

Prior to introducing RabbitMQ our applications would use SQL Server as a queuing mechanism. Each task would be represented by a row in a workflow table. Each process in the workflow would poll the table looking for rows that matched its status, process the rows in a batch, and then update the rows’ status field for the next process to pick up. Each step in the process would be hosted by an application service that implemented its own threading model, often using a different approach to all the other services. This created highly coupled software, with workflow steps and infrastructure concerns, such as threading and load balancing, mixed together with business logic. We also discovered that a relational database is not a natural fit for a queuing system. The contention on the workflow tables is high, with constant inserts, selects and updates causing locking issues. Deleting completed items is also problematic on highly indexed tables and we had considerable problems with continuously growing tables.

I wrote about the ‘Database As Queue Anti-Pattern’ in a previous post in more detail.

RabbitMQ provides a number of features that helped us overcome these problems. Firstly it is designed from the beginning as a high-performance messaging platform. It easily outperformed our SQL Server based solution with none of its locking or deletion problems. Rabbit’s event-oriented messaging model also takes away much of the need for complex multi-threaded batch processing code that was previously a cause of instability in our systems.

We first introduced RabbitMQ about 18 months ago as the core infrastructure behind our Flight Status product. We wanted a high performance messaging product with a proven track record that supported common messaging patterns, such as publish/subscribe and request/response. A further requirement was that it should provide automatic work distribution and load balancing.

The need to support messaging patterns ruled out simple store-and-forward queues such as MSMQ and ActiveMQ. We were very impressed by ZeroMQ, but decided that we really needed the centralised manageability of a broker based product. This left RabbitMQ. Although support for AMQP, an open messaging standard, wasn’t in our list of requirements, its implementation by RabbitMQ made us more confident that we were choosing a sustainable strategy.

We are very much a Microsoft shop, so we had some initial concerns about RabbitMQ’s performance and stability on Windows. We were reassured by reading some accounts of RabbitMQ’s and indeed Erlang’s use on Windows by organisations with some very impressive load requirements. Subsequent experience has borne these reports out, and we have found RabbitMQ on Server 2008 to be rock solid.

As a Microsoft shop, our development platform is .NET. Although VMWare provide an AMQP C# client, it is a low-level API, not suitable for use by more junior developers. For this reason we created our own high-level .NET API for RabbitMQ that provides simple single method access to common messaging patterns and does not require a deep knowledge of AMQP.  This API is called EasyNetQ. We’ve open sourced it and, with over 3000 downloads, it is now the leading high-level API for RabbitMQ with .NET. You can find more information about it at EasyNetQ.com. We would recommend looking at it if you are a .NET shop using RabbitMQ.
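To give a flavour of what that looks like (a sketch only; EasyNetQ’s method signatures have changed between versions, and the message class here is invented for the example), publishing and subscribing is a single method call each:

// FlightStatusMessage is a made-up example type for this sketch.
public class FlightStatusMessage
{
    public string FlightNumber { get; set; }
    public string Status { get; set; }
}

// Create the bus once and share it across the application.
var bus = RabbitHutch.CreateBus("host=localhost");

// Publish a message; EasyNetQ takes care of exchanges, queues and routing.
bus.Publish(new FlightStatusMessage { FlightNumber = "XX123", Status = "Delayed" });

// Subscribe elsewhere; instances that share a subscription id share the work.
bus.Subscribe<FlightStatusMessage>("flightStatus", message =>
    Console.WriteLine("{0} is {1}", message.FlightNumber, message.Status));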

15Below’s Flight-Status product provides real-time flight information to passengers and their family and friends. We interface with the airline’s real-time flight information stream, generated from their operational systems, and provide a platform that allows them to apply complex business logic to this stream. We render customer-tailored output, and communicate with the airline’s customers via a range of channels, including email, SMS, voice and iPhone/Android push. RabbitMQ allows us to build each piece (the client for the flight information stream, the message renderer, the sink channels and the business logic) as separate components that communicate using structured messages. Our architecture looks something like this:

rabbit_based_architecture

The green boxes are our core product systems, the blue boxes represent custom code that we write for each customer. A ‘customer saga’ is code that models a long-running business process and includes all the workflow logic for a particular customer’s flight information requirements. A ‘core product service’ is an independent service that implements a feature of our product. An example would be the service that takes flight information and combines it with a customer defined template to create an email to be sent to a passenger. Constructing services as independently deployable and runnable applications gives us great flexibility and scalability. If we need to scale up a particular component, we simply install more copies. RabbitMQ’s automatic work sharing feature means that we can do this without any reconfiguration of existing components. This architecture also makes it easy to test each application service in isolation since it’s simply a question of firing messages at the service and watching its response.

In conclusion, RabbitMQ has provided a rock solid piece of infrastructure with the features to allow us to significantly reduce the architectural complexity of our systems. We can now build software for our clients faster and more reliably. It scales to higher loads than our previous relational-database based systems and is more flexible in the face of changing customer requirements.

The Onion Of Compromise


yellow_onion

I love little phrases that sum up large scale behaviours in software systems and the organisations that produce them. One of my favourites is “The Onion Of Compromise.” I first heard this gem from my excellent friend Iain Holder. Iain doesn’t claim to be the author; that honour goes to a mysterious third person named ‘Mike’.

Being a programmer is all about making decisions. Lots and lots of little decisions. In fact every line of code is a decision; a little cog in the wheel of a grander machine. The simple thing that separates a good programmer from a poor programmer is that they tend to make relatively more good decisions and fewer bad ones.

Incidentally, that’s why it’s a mistake to think that you can hire an experienced ‘chief architect’ who ‘designs’ your system, while rooms full of junior/cheap developers churn out the code - and expect anything other than a disaster to occur. The decisions are just too granular to be made by one person on a large project.

Good decisions are ones which aid the evolution and stability of an application. They are summed up by epithets that describe general good practice, such as ‘Don’t Repeat Yourself’, ‘Open Closed Principle’ and a hundred others. An experienced programmer will employ a range of these rules-of-thumb to ensure that they don’t get tangled up in needless complexity as their application grows. You can tell a project where good decisions have been made; it’s easy to add new features and there are few bugs.

A bad decision often doesn’t seem like a bad decision at first, merely a way of implementing a feature or fixing a bug with the least possible impact on the code. Often the bad decision will introduce a constraint on further evolution of the software or a special case given a particular combination of factors. If a bad decision isn’t rolled back it can quickly lead to further bad decisions as the programmer works around it. Soon layers of poor design wrap that initial poor decision. This is ‘The Onion of Compromise’. That initial first mistake (or compromise) leads to a cascade of poor choices. Another name for the layers of the onion is ‘Technical Debt’.

It’s easy to spot software that has suffered from The Onion of Compromise; it’s brittle, you change one thing and it breaks seemingly unrelated parts of the system; it seems to take ages to implement new features; and there’s a high bug count.

WebRequest Throws On 404 Status Code


WebRequest, or rather HttpWebRequest, has the annoying behaviour of throwing a WebException when the server returns a 404 ‘not found’ status, or in fact any unexpected status code. It would be much better if it didn’t do this and simply allowed the application to decide what it should do with different status codes. At the very least there should be some way of turning this behaviour on or off. In fact it would be nice if the whole WebRequest class wasn’t a monolithic black box, but a toolbox of components that allowed you to tailor an HTTP client to your requirements. I was a little surprised when I did some Googling earlier and couldn’t find a nice open source alternative to WebRequest; it’s the sort of thing that the community is usually quite good at coding around. Oh well, I’ll add it to my ever growing list of potential future GitHub projects (that will never happen).

My quick and dirty fix for this problem was an extension method that catches the WebException, checks whether its Status is WebExceptionStatus.ProtocolError (the status you get when the server returns an unexpected HTTP status code), and then returns the response from the exception’s Response property. It’s horrible, but it seems to work:

public static HttpWebResponse GetHttpResponse(this HttpWebRequest request)
{
    HttpWebResponse response = null;

    try
    {
        response = (HttpWebResponse)request.GetResponse();
    }
    catch (WebException exception)
    {
        if (exception.Status == WebExceptionStatus.ProtocolError)
        {
            response = (HttpWebResponse)exception.Response;
        }
        else
        {
            throw;
        }
    }

    return response;
}

It conveniently returns an HttpWebResponse instead of a WebResponse.

You could use it like this …

var response = request.GetHttpResponse();
if (response.StatusCode != HttpStatusCode.NotFound)
{
    // handle appropriately
}

Of course, if you want to handle the response asynchronously, you’ll have to write an extension method for EndGetResponse as well.
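A minimal sketch of that asynchronous counterpart might look like this (EndGetHttpResponse is a name I’ve made up to mirror the synchronous method above):

// Hypothetical companion to GetHttpResponse for the Begin/End async pattern.
public static HttpWebResponse EndGetHttpResponse(this HttpWebRequest request, IAsyncResult asyncResult)
{
    try
    {
        return (HttpWebResponse)request.EndGetResponse(asyncResult);
    }
    catch (WebException exception)
    {
        if (exception.Status == WebExceptionStatus.ProtocolError)
        {
            return (HttpWebResponse)exception.Response;
        }
        throw;
    }
}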

… or, if you’re using .NET 4.5, you could use HttpClient instead. It wraps HttpWebRequest internally, but it seems to return status codes rather than throwing.
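Here’s a quick sketch of the HttpClient equivalent (assuming .NET 4.5, inside an async method):

// HttpClient hands back a response object for non-success status codes,
// so you can inspect StatusCode rather than catching exceptions.
using (var client = new HttpClient())
{
    var response = await client.GetAsync("http://www.google.com/doesnotexist");

    if (response.StatusCode == HttpStatusCode.NotFound)
    {
        Console.WriteLine("not found");
    }
    else
    {
        // do something with the response body
        var responseBody = await response.Content.ReadAsStringAsync();
    }
}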

My Super Simple Node Twitter Re-Tweeter


I’ve been having a lot of fun writing a little ‘re-tweeter’ this morning. We basically want to monitor our user stream and then re-tweet any status with a particular hash tag. I thought this would be an excellent little project for node, and indeed it proved to be extremely easy to do. I used the node-twitter library which worked fine for what I wanted to do.

If you want to use this code, you’ll need to do the following:

First you’ll need to go to https://dev.twitter.com/apps and register a new app. You can then copy and paste your consumer key, consumer secret, access token and access token secret into the ‘xxx’ fields.

Next install node-twitter with npm:

npm install twitter

Then just run the code with node (I’m a poet and I didn’t know it):

node twitter-retweeter.js

Here’s the code in the twitter-retweeter.js file:

var util = require('util');
var twitter = require('twitter');

var twit = new twitter({
    consumer_key: 'xxx',
    consumer_secret: 'xxx',
    access_token_key: 'xxx',
    access_token_secret: 'xxx'
});

var hashtag = '#iloveprog';

function write(data) {
    if (typeof data === 'string') {
        console.log(data);
    }
    else if (data.text && data.user && data.user.screen_name) {
        console.log(data.user.screen_name + ": " + data.text);
        testForHashtag(data);
    }
    else if (data.delete) {
        console.log('DELETE');
    }
    else if (data.message) {
        console.log('ERROR ' + data.message);
    }
    else {
        console.log(util.inspect(data));
    }
}

function testForHashtag(data) {
    if (data.retweeted) return;
    if (data.text.indexOf(hashtag) != -1) {
        twit.retweetStatus(data.id_str, function () {
            console.log('retweet callback');
        });
    }
}

function reconnect() {
    setTimeout(startStreaming, 1000);
}

function startStreaming() {
    twit.stream('user', function (stream) {
        console.log('starting stream');
        stream.on('data', write);
        stream.on('end', reconnect);
    });
}

startStreaming();

console.log('listening for tweets');

It’s all really straightforward. The startStreaming function kicks off the callback on the twitter user stream. Each time an event occurs it calls the write function, which checks for the given hashtag and then retweets the status if there’s a match.

Lovely!

A Geek Christmas Quiz


God rest ye merry gentlemen! Welcome to my 2012 Geek Christmas Quiz. Every Friday morning at 15below we have a ‘DevEd’ session. Usually this is a presentation about some interesting tech, or a new way we want to do something at the company, but today I thought I would try to gauge the true geekiness of our development team with a quiz. The winners, and therefore crowned top geeks, were Toby and Linda who got a total of 32 points. See if you can do better dear reader.

You get one point for each correct answer. The quiz is split into six sections:  Computers, ‘Name That Geek’, Science, Space, ‘Name That Spaceship’, and Geek Culture.

Computers

  1. What does G.N.U. stand for?
  2. What did the A in ARM originally stand for?
  3. What does TCP stand for?
  4. Who founded Microsoft with Bill Gates?
  5. What is F2 (hexadecimal) in decimal?
  6. Which operating system's development was based on the 'Balmer Peak'?
  7. Who was the first programmer?
  8. What year does UNIX time start?
  9. What did SGI stand for?
  10. Write down the type signature of the Monadic Bind method.

Name that Geek

Name_that_geek

Science

  1. What are the four letters of DNA?
  2. What does the 'c' in E = mc2 stand for?
  3. What is the next number in this sequence: 1 1 2 3 5 8 _ ?
  4. What is C8 H10 N4 02 ?
  5. When did Australopithecus become extinct? (in millions of years ago)
  6. Which of the following would you not expect to find in an atomic nucleus (electron, neutron, proton)
  7. What is the most common gas in the Earth's atmosphere?
  8. Write the formula for Ohm's law.
  9. If, after you fold a piece of paper in half, the ratio between its longest side and its shortest side is the same, what is that ratio?
  10. What living land mammal is the closest evolutionary relative to Whales? (cetaceans)

Space

  1. What rocket engine powered the 2nd stage of the Saturn V?
  2. What is Saturn's largest moon?
  3. What fraction of the Earth's gravity would you experience on the moon?
  4. Astronauts are weightless in space because there is no gravity. true or false?
  5. What is the orbital period of a geosynchronous satellite?
  6. What is the furthest planet from the sun? (now that Pluto has been demoted)
  7. How many people are currently living aboard the ISS?
  8. To the nearest thousand, how many satellites are currently orbiting the earth?
  9. What was the name of the only satellite launched by the UK?
  10. Who was the second man on the moon?

Name that spaceship

Name_that_spaceship

Geek Culture

  1. Who was Spooky Mulder?
  2. Was Kiki a trainee witch or an evil princess?
  3. "Humans are a _____" (Agent Smith)
  4. Who is Peter Parker?
  5. "It's a b... It's a b... It's a small, off-duty Czechoslovakian traffic warden!" What is it really?
  6. What does 'Otaku' (Japanese) mean?
  7. What does R2D2 stand for?
  8. What is Clarke's 3rd law?
  9. What is the air speed velocity of an unladen swallow?
  10. Open ___ ___ ___ ____ please H.A.L