The next versions of the .NET Framework and Visual Studio both have some pretty cool features to help programmers work with multiple cores, which is great but doesn’t help the majority of us who are stuck with .NET 3.5 for the foreseeable future. Luckily, Erik Meijer and the Cloud Programmability Team have backported the Parallel Extensions Framework (PFX) to .NET 3.5 and Silverlight 3 as part of the Reactive Extensions for .NET (Rx). Rx adds the IObservable<T> and IObserver<T> interfaces, which are the mathematical duals of IEnumerable<T> and IEnumerator<T> and provide tools for doing reactive programming. There are many different ways to use Rx, but internally they all use the Task Parallel Library (TPL) as the “special sauce” to automate processing tasks across multiple threads.
Parallel.ForEach is the part of the TPL that takes the iterations of an outer loop and runs them across multiple threads. Take the following example. This is standard single-threaded code that loops through a collection of two-card Texas Hold'em starting hands and evaluates all possible seven-card hands that include those two cards:
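A minimal sketch of such a single-threaded loop might look like the following. The card numbering (0 to 51 packed into a 64-bit mask), the `Boards` helper, the class name, and especially `Evaluate` are assumptions made for illustration: `Evaluate` is a placeholder that does not rank poker hands, so swap in whatever hand evaluator you actually use.

```csharp
using System;
using System.Collections.Generic;

static class SequentialHoldem
{
    // Hypothetical placeholder, NOT a real poker ranking: it only gives
    // the inner loop something to call for each 7-card mask.
    static uint Evaluate(ulong handMask)
    {
        return (uint)(handMask ^ (handMask >> 32));
    }

    // All 16 Ace-King starting hands (4 suited, 12 offsuit) as bit masks.
    // Assumed layout: card = rank * 4 + suit, with deuce = rank 0.
    public static List<ulong> AceKingHands()
    {
        const int King = 11, Ace = 12;
        var hands = new List<ulong>();
        for (int aceSuit = 0; aceSuit < 4; aceSuit++)
            for (int kingSuit = 0; kingSuit < 4; kingSuit++)
                hands.Add((1UL << (Ace * 4 + aceSuit)) | (1UL << (King * 4 + kingSuit)));
        return hands;
    }

    // Yields every 5-card board that avoids the cards set in 'dead'.
    public static IEnumerable<ulong> Boards(ulong dead)
    {
        var deck = new List<ulong>();
        for (int card = 0; card < 52; card++)
        {
            ulong bit = 1UL << card;
            if ((dead & bit) == 0) deck.Add(bit);
        }
        int n = deck.Count;
        for (int a = 0; a < n - 4; a++)
            for (int b = a + 1; b < n - 3; b++)
                for (int c = b + 1; c < n - 2; c++)
                    for (int d = c + 1; d < n - 1; d++)
                        for (int e = d + 1; e < n; e++)
                            yield return deck[a] | deck[b] | deck[c] | deck[d] | deck[e];
    }

    // Outer loop: starting hands. Inner loop: every board for that hand.
    public static long CountHands(IEnumerable<ulong> pockets)
    {
        long lCount = 0;
        foreach (ulong pocket in pockets)
        {
            foreach (ulong board in Boards(pocket))
            {
                Evaluate(pocket | board);
                lCount++;
            }
        }
        return lCount;
    }

    static void Main()
    {
        // 16 starting hands x C(50,5) boards each
        Console.WriteLine(CountHands(AceKingHands())); // prints 33900160
    }
}
```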
There are just over 2.1 million hands in the inner loop (five board cards chosen from the remaining 50 gives 2,118,760 combinations), and it gets run once for each of the possible starting hands in the outer loop. There are 12 offsuit starting hands and 4 suited starting hands with an Ace and a King, which means that the outer loop runs 16 times. Those 16 executions are independent of each other, so they could easily be run across multiple threads. That is where Parallel.ForEach comes in. Here is the same code, which will automatically be scheduled across multiple threads:
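A sketch of the parallel version, under the same assumptions as before: the card layout, the `Boards` helper, the class name, and the placeholder `Evaluate` are all hypothetical. It also assumes the backported System.Threading.dll exposes `Parallel` in the `System.Threading.Tasks` namespace, as .NET 4 does.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ParallelHoldem
{
    // Hypothetical placeholder, NOT a real poker ranking.
    static uint Evaluate(ulong handMask)
    {
        return (uint)(handMask ^ (handMask >> 32));
    }

    // All 16 Ace-King starting hands; assumed layout: card = rank * 4 + suit.
    public static List<ulong> AceKingHands()
    {
        const int King = 11, Ace = 12;
        var hands = new List<ulong>();
        for (int aceSuit = 0; aceSuit < 4; aceSuit++)
            for (int kingSuit = 0; kingSuit < 4; kingSuit++)
                hands.Add((1UL << (Ace * 4 + aceSuit)) | (1UL << (King * 4 + kingSuit)));
        return hands;
    }

    // Yields every 5-card board that avoids the cards set in 'dead'.
    public static IEnumerable<ulong> Boards(ulong dead)
    {
        var deck = new List<ulong>();
        for (int card = 0; card < 52; card++)
        {
            ulong bit = 1UL << card;
            if ((dead & bit) == 0) deck.Add(bit);
        }
        int n = deck.Count;
        for (int a = 0; a < n - 4; a++)
            for (int b = a + 1; b < n - 3; b++)
                for (int c = b + 1; c < n - 2; c++)
                    for (int d = c + 1; d < n - 1; d++)
                        for (int e = d + 1; e < n; e++)
                            yield return deck[a] | deck[b] | deck[c] | deck[d] | deck[e];
    }

    public static long CountHands(IEnumerable<ulong> pockets)
    {
        long lCount = 0;
        // The body of the old outer loop becomes a lambda; the TPL decides
        // how to spread the starting hands across worker threads.
        Parallel.ForEach(pockets, pocket =>
        {
            long lLocal = 0; // private to this iteration, so no contention
            foreach (ulong board in Boards(pocket))
            {
                Evaluate(pocket | board);
                lLocal++;
            }
            // One atomic update per starting hand instead of one per board.
            Interlocked.Add(ref lCount, lLocal);
        });
        return lCount;
    }

    static void Main()
    {
        Console.WriteLine(CountHands(AceKingHands())); // prints 33900160
    }
}
```

Each call to `Boards` builds its own deck and enumerator, so the iterations share nothing except `lCount`, which is only touched through `Interlocked.Add`.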
The bulk of the code is the same. The only changes are that the body of the outer loop gets converted into a lambda expression (an anonymous delegate would also work), and that instead of incrementing lCount in the inner loop we increment a variable local to each outer iteration and then atomically add it to the shared total. We could have used System.Threading.Interlocked.Increment to atomically increment lCount inside the inner loop, but this adds a lot of unneeded synchronization: every atomic update makes the threads contend for the same memory location, which slows all of them down. Keeping a local count and synchronizing only once at the end of each outer iteration provides much better performance.
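To see the difference between the two counting strategies in isolation, here is a rough sketch that strips out the poker logic and just counts. The class name, method names, and loop sizes are made up for illustration, and because the loop bodies are otherwise empty the timings will exaggerate the gap compared to a loop that does real work.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class CounterContention
{
    // Every inner iteration performs an atomic update on the shared counter.
    public static long CountShared(int outer, int inner)
    {
        long total = 0;
        Parallel.For(0, outer, i =>
        {
            for (int j = 0; j < inner; j++)
                Interlocked.Increment(ref total);
        });
        return total;
    }

    // Each outer iteration counts privately and publishes its subtotal once.
    public static long CountLocal(int outer, int inner)
    {
        long total = 0;
        Parallel.For(0, outer, i =>
        {
            long local = 0;
            for (int j = 0; j < inner; j++)
                local++;
            Interlocked.Add(ref total, local);
        });
        return total;
    }

    static void Main()
    {
        const int Outer = 16, Inner = 2000000; // arbitrary sizes

        Stopwatch sw = Stopwatch.StartNew();
        long shared = CountShared(Outer, Inner);
        Console.WriteLine("Interlocked.Increment per item: {0} in {1} ms", shared, sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        long local = CountLocal(Outer, Inner);
        Console.WriteLine("Local subtotal + Interlocked.Add: {0} in {1} ms", local, sw.ElapsedMilliseconds);
    }
}
```

Both methods return the same total; only the number of atomic operations differs (32 million in the first case versus 16 in the second).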
With those few small changes we are able to start using multiple threads. On my local machine, with 4 cores and 8 hardware threads, the processing time dropped from 1.55 seconds to 0.67 seconds (roughly a 2.3x speedup), and the number of hands processed per second more than doubled, from 16,420,998 to 37,816,595.
If you want to start using the TPL you can download it here and add a reference to System.Threading.dll, located in the “C:\Program Files (x86)\Microsoft Reactive Extensions\Redist\DesktopV2” folder. And while you are at it, you might as well play around with Rx too!