Thursday 28 January 2010

Distributed caching with protobuf-net

No, this isn't my server farm.

I’ve been playing with distributed caches lately, and of course this is a really good use-case for something like protobuf-net – if there is anywhere you want to minimise your memory costs, network IO costs and CPU costs, it is your cache (it gets kinda busy).

As a pure example of “which system can I get installed and running the quickest”, I looked at memcached, using the windows service from here and the enyim.com client. Not only were they a breeze to get working, but this setup seems pretty popular, and also covers memcached providers, which uses enyim under the hood.
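
For context, basic usage of the enyim client looks something like this (a minimal sketch; the Customer type and cache key are just illustrative, and the default transcoder uses BinaryFormatter, hence the [Serializable] marker):

using System;
using Enyim.Caching;
using Enyim.Caching.Memcached;

[Serializable] // required by the default BinaryFormatter-based transcoder
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

static class Demo
{
    static void Main()
    {
        // picks up the <enyim.com>/<memcached> section from the config file
        using (var client = new MemcachedClient())
        {
            client.Store(StoreMode.Set, "customer:123",
                new Customer { Id = 123, Name = "Fred" });

            Customer fetched = client.Get<Customer>("customer:123");
            Console.WriteLine(fetched.Name);
        }
    }
}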

Fortunately, the enyim client has an extension-point that allows you to nominate a serializer through your config file (actually it calls it a transcoder, but I’m not sure it fits the definition). This allows us to swap BinaryFormatter for protobuf-net, which (for suitable types) has huge savings – in my crude tests I was seeing only 20% of the original network traffic, and roughly 75% of the original CPU costs.
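
By "suitable types" I mean types that protobuf-net knows how to serialize – typically decorated with contract attributes; for example, a hypothetical DTO like this:

using System;
using ProtoBuf;

[ProtoContract]
public class Customer
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
    [ProtoMember(3)] public DateTime LastOrder { get; set; }
}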

Enough bluster…

OK, so in r282 I’ve added just such a transcoder (I still don’t like the name…) – and the good thing is that it is effortless to configure:

Before:

<enyim.com>
  <memcached>
    <servers>
      <add address="127.0.0.1" port="11211" />
    </servers>
  </memcached>
</enyim.com>

After:

<enyim.com>
  <memcached transcoder="ProtoBuf.Caching.Enyim.NetTranscoder, protobuf-net.Extensions">
    <servers>
      <add address="127.0.0.1" port="11211" />
    </servers>
  </memcached>
</enyim.com>

Wasn't that painless? And it brings some huge benefits (although obviously you’d want to run some integration/regression tests if applying this to an existing system). Oh, and of course you need to include protobuf-net.dll and protobuf-net.Extensions.dll in your project, in addition to the enyim dll.

My only niggle is that because there is no way of knowing the type up-front, I had to include the outermost type information on the wire (hence the “Net” in the transcoder’s name, since it won’t be truly portable). I could get around this by having the caller indicate the type, but the enyim dll doesn’t play that game.
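
In rough terms, the idea is along these lines (a simplified sketch, not the actual transcoder source; it assumes protobuf-net’s non-generic Serializer helpers):

using System;
using System.IO;
using ProtoBuf;

static class TypePrefixedSerializer
{
    // serialize: the assembly-qualified type name first, then the protobuf payload
    public static byte[] Serialize(object value)
    {
        using (var ms = new MemoryStream())
        {
            var writer = new BinaryWriter(ms);
            writer.Write(value.GetType().AssemblyQualifiedName);
            writer.Flush();
            Serializer.NonGeneric.Serialize(ms, value);
            return ms.ToArray();
        }
    }

    // deserialize: read the type name back, then hand the rest to protobuf-net
    public static object Deserialize(byte[] buffer)
    {
        using (var ms = new MemoryStream(buffer))
        {
            var reader = new BinaryReader(ms);
            Type type = Type.GetType(reader.ReadString(), true);
            return Serializer.NonGeneric.Deserialize(type, ms);
        }
    }
}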

What next?

If you find it good/bad/pointless/buggy please let me know.

If you’ve got another cache that might benefit from some protobuf-net love, please let me know. Velocity, perhaps? Azure?

Monday 4 January 2010

Reflections on Optimisation – Part 3, Fields and Properties

Short version:

Unless you are doing something interesting in your accessors, then for “full” .NET on classes, it makes no difference. Use automatically-implemented properties, and “commit”.

Much ado about nothing:

Previously, I discussed ways of modelling a general-purpose property accessor, and looked at existing APIs. So what next? I’m rather bored of conversations like this on forums:

OP: My flooble-gunger application uses public fields to hold the flipple-count, but I’m getting a problem…

Me: back up there…. why does it use public fields?

OP: For optimisation, obviously!

But is there anything “obvious” here? I don’t think this is news to most people, but public fields are not generally a good idea – they break encapsulation, swapping them for properties later is (despite belief otherwise) a breaking change (and sometimes a build-breaking one), they don’t work with data-binding, they can’t participate in polymorphism or interface implementation, etc.
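
To make the “breaking change” point concrete, here’s a hypothetical example (names borrowed from the forum exchange above):

using System.Threading;

public class FloobleGunger
{
    public int FlippleCount; // public field, as per the forum example
}

public static class Caller
{
    public static void Bump(FloobleGunger gunger)
    {
        // compiles fine against a field...
        Interlocked.Increment(ref gunger.FlippleCount);

        // ...but change FlippleCount to a property:
        //     public int FlippleCount { get; set; }
        // and this line no longer compiles (you can't take a ref to a property).
        // Even simple callers need recompiling, since reading a field (ldfld)
        // and calling a property getter are different IL.
    }
}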

With regards to any performance difference, you would expect that in most “regular” code, member access is nowhere near the dominant cost. Realistically, the only time member access becomes truly interesting is (as in my case) when you are writing serialization / materialization code. Still, let’s plough on with this scenario, and check things out.

In particular, for simple properties (for example, C# 3.0 “automatically implemented properties”) our expectation is that they will commonly be “inlined” by the JIT. The big caveat here (where I might forgive you) is Compact Framework (for example XNA on XBox 360), which has a much weaker JIT.
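
For clarity, these are the sort of trivial accessors I mean (illustrative names only):

public class Flooble
{
    // C# 3.0 automatically implemented property; the compiler emits a hidden
    // backing field plus trivial get/set accessors
    public int FlippleCount { get; set; }

    // the manual equivalent, which the JIT should treat just the same
    private int flippleCount;
    public int FlippleCountManual
    {
        get { return flippleCount; }
        set { flippleCount = value; }
    }
}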

So, as before, let’s test it. Another very simple test rig; I want to compare the following (each in a loop; a sketch of the rig follows the list):

  • Direct property access on an object
  • Direct field access on an object
  • A wrapper (as before) that talks to a public property
  • A wrapper (as before) that talks to a private property
  • (automatically implemented properties vs manual properties)
  • A wrapper (as before) that talks to a public field
  • A wrapper (as before) that talks to a private field
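
Something along these lines (an illustrative sketch of the kind of rig used, not my exact harness; the “set” side and the remaining wrapper variants follow the same pattern):

using System;
using System.Diagnostics;

public interface IAccessor
{
    int Get(TestObject obj);
    void Set(TestObject obj, int value);
}

public class TestObject
{
    public int FieldValue;                 // public field
    public int PropValue { get; set; }     // automatically implemented property

    // nested, so the same pattern can also reach private members
    public class PropWrapper : IAccessor
    {
        public int Get(TestObject obj) { return obj.PropValue; }
        public void Set(TestObject obj, int value) { obj.PropValue = value; }
    }
}

static class Program
{
    const int Iterations = 650000000; // 650M, as per the results below

    static void Main()
    {
        var obj = new TestObject();
        IAccessor wrapper = new TestObject.PropWrapper();
        int value = 0;

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) value = obj.PropValue;      // direct property
        Console.WriteLine("direct prop get: {0}ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) value = obj.FieldValue;     // direct field
        Console.WriteLine("direct field get: {0}ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) value = wrapper.Get(obj);   // via wrapper
        Console.WriteLine("wrapper prop get: {0}ms", watch.ElapsedMilliseconds);

        GC.KeepAlive(value);
    }
}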

To achieve the private tests, I’ve created the wrapper implementation as a nested type of the object I’m testing. As it happens, a few oddities popped up along the way, but I’ll discuss those next time; this post has been long enough, so here’s the very uninteresting result (sorry for the anti-climax):

Implementation          Get performance    Set performance
Prop, no wrapper               6                 25
Field, no wrapper              6                 28
Public prop wrapper           29                 52
Private prop wrapper          28                 50
Auto prop wrapper             29                 50
Public field wrapper          29                 50
Private field wrapper         28                 50

(numbers scaled, based on 650M iterations; lower is better)

Both directly and via the abstraction, the cost is essentially identical for fields and properties. Quite simply – don’t stress over it.