Code, code and more code.: 2010

Wednesday, 13 October 2010

DataTable – life in the old beast?

Updated: Billy referred me to the binary remoting format for DataTable; stats for BinaryFormatter now include both xml and binary.

This week, someone asked the fastest way to send a lot of data, only currently available in a DataTable (as xml), over the wire.

My immediate thought was simply GZIP the xml and have done with it, but I was intrigued… can we do better? In particular (just for a complete surprise) I was thinking whether protobuf-net could be used to write them. Now don’t get me wrong; I’m not a big supporter of DataTable, but they are still alive in the wild. And protobuf-net is a general-purpose serializer, so… why not try to be helpful, eh?

v1 won’t help at all with this, since the object-model just isn’t primed for extension, but in v2 we have many options. I thought I’d look at some typical data (SalesOrderDetail from AdventureWorks2008R2; 121317 rows and 11 columns), comparing the inbuilt xml support, BinaryFormatter support, and protobuf-net. And each of those uncompressed, via gzip, and via deflate.

At this point my new friend Lucy (right) is pestering me for a walk, but… it worked; purely experimental, but committed (r356). Results shown below:

Table loaded    11 cols 121317 rows
DataTable (xml) (vanilla)               2269ms/6039ms
                                       64,150,771 bytes
DataTable (xml) (gzip)                  4881ms/6714ms
                                       7,136,821 bytes
DataTable (xml) (deflate)               4475ms/6351ms
                                       7,136,803 bytes
BinaryFormatter (rf:xml) (vanilla)      3710ms/6623ms
                                       74,969,470 bytes
BinaryFormatter (rf:xml) (gzip)         6879ms/8312ms
                                       12,495,791 bytes
BinaryFormatter (rf:xml) (deflate)      5979ms/7472ms
                                       12,495,773 bytes
BinaryFormatter (rf:binary) (vanilla)   2006ms/3366ms
                                       11,272,592 bytes
BinaryFormatter (rf:binary) (gzip)      3332ms/4267ms
                                       8,265,057 bytes
BinaryFormatter (rf:binary) (deflate)   3216ms/4130ms
                                       8,265,039 bytes
protobuf-net v2 (vanilla)               316ms/773ms
                                       8,708,846 bytes
protobuf-net v2 (gzip)                  932ms/1096ms
                                       4,797,856 bytes
protobuf-net v2 (deflate)               872ms/1050ms
                                       4,797,838 bytes

Since “walkies” beckons, I’ll be brief:

the inbuilt xml and BinaryFormatter (as xml) are both very large natively and compress pretty well, but with noticeable additional CPU costs
with BinaryFormatter using the binary encoding it does much less work, getting (before compression) into the same size-bracket as the xml encoding managed with compression
protobuf-net is much faster in terms of CPU, but slightly larger than the gzip xml output
unusually for protobuf, it compresses quite will – presumably there is enough text in there to make it worthwhile; this then takes roughly half the size and much less CPU (compared to gzip xml)

Overall the current experimental code is exceptionally rough and ready, mainly as an investigation. But… should I pursue this? do people still care about DataTable enough? Or just use gzip/xml in this scenario?

Thursday, 23 September 2010

Other news…

Sorry if this is turning into a bit of a serialization fest; I do have other coding hobbies; simply this is the thing I’ve been spending my spare time on lately ;p

Silverlight WCF

It recently came to my attention that possibly Silverlight does now support the WCF extension points for swapping the serializer at runtime. Not via xml configuration, but via code configuration. Full details on Carlos’ Blog, but that is definitely an area I want to investigate – at least, if @abdullin will let me.

Interface serialization

Something a few people have asked for, and definitely not outside reach. In fact, Wallace Turner kindly donated a v1 patch to enable interface-based serialization. I’m keeping this “on ice” for now, simply to keep my main focus on finishing v2. But this exists and is certainly on the table for consideration.

Anyway, enough for now; the day job starts again in not-a-lot-of-hours…

protobuf-net on MonoDroid

I was very pleased to get an invite to the MonoDroid (Mono tools / environment for Android) beta earlier in the week, so in addition to a few v2 commits I’ve been playing with shiny new toys.

Wow; decent reflection!

Initially I made the mistake of thinking that MonoDroid would have the same reflection limitations as (say) MonoTouch. I even tied myself in knots trying to get around them! Then Jb Evian pointed out that actually MonoDroid can do full meta-programming at runtime! That is a huge plus for me… so; whack in a few conditional-compilation symbols and – wow! It works!

A taster

Don’t get too excited – this is only going to be brief…

Yes, another amazing example of my l33t UI skillz

Only intended to show something working, but here is v2 on MonoDroid, compiled at runtime, round-tripping some trivial data.

A small (but significant) step. Actually, the time taken to compile the model and run it the first time is still a bit slow for my liking – not sure if that is the emulator, the beta, or a general JIT issue – I’ll dig more and investigate rebasing further.

No really, this is all I have for gfx And of course as a fallback, I can still use runtime usage without “emit” – it is noticeably slower, but still works.

Again, without an actual device it is hard to gauge typical performance at this point, but I’m pretty hopeful.

So… if that works, what are you waiting for?

I still need to finish v2; some decent commits lately now that I’m all refreshed. The finishing line is within sight… oh, and MonoDroid needs to be released ;p

Sunday, 19 September 2010

protobuf-net; ushort glitch before r274

Dammit, I hate it when a gremlin shows up. I hate it even more when they show up a year later…

It turns out that a bug “fix” to ushort (unsigned 16-bit integers) handling back in October 2009 introduced a slight… “nuance” when moving between versions < r274 and >= r274. Basically, due to a braindead bug on my part, it was treating ushort data as strings rather than handling appropriately. This was fixed a year ago, but if you are moving between versions is a PITA.

Sorry for any inconvenience folks, but… well, it happened. I can’t change that. Anyway, how to fix?! The good news is NO DATA IS LOST - it is just a bit trickier to access. You wouldn't have encountered the issue if working against rigid .proto / cross-platform definitions (since .proto doesn't have direct ushort support), so I’ll limit myself to the pure .NET scenario. In which case, IMO the easiest fix is to introduce a shim property, so:

[ProtoMember(1)]   
public ushort Foo {get;set;}

might be updated to:

[ProtoMember(1)]   
private string FooLegacy { // a pass-thru    
    get {return Foo.ToString();}    
    set {Foo = ushort.Parse(value);}    
}    
private bool FooLegacySpecified { // suppress serialization    
    get {return false;}    
    set {}    
}    
[ProtoMember(42)] // any unused field number    
public ushort Foo {get;set;}

This allows old-style and the correct data to be used side-by-side (always writing to the new format). For arrays, see my longer answer here, which also switches to the more efficient "packed" encoding, introduced later to the protobuf spec.

Sorry for any inconvenience folks. Fortunately use of ushort is fairly uncommon in .NET, so my hope is that there aren't vast hordes of people impacted by this unfortunate foible.

As a side note, I should add: I considered trying to add automatic handling of old data, but it transpires that the borked data is so similar to "packed" encoding as to be ambiguous in some cases, so trying to guess was too risky. And it would compound the issue by layering hack upon bug.

Wednesday, 8 September 2010

Truer than true

(edit: apols for the formatting; google ate my images - have replaced with source, but raw)

I’ve been quiet for a little while, due to general fatigue, changing job etc – but after a holiday I’m back refreshed and renewed.

So let’s start with a fun edge condition, inspired by my colleague balpha, who asked me “does C#/.NET/LINQ guarantee false < true?”.

So, is it?

I’m leaving LINQ aside, as for anything other than LINQ-to-Objects, all bets are off; so let’s just limit ourselves to C#; in that case the answer is pretty simple, “yes”. System.Boolean implements IComparable and IComparable<bool> such that this will always work. But not all .NET is C# ;p So how can we be evil here?

What is a boolean, anyways?

First, we need to understand that (in common with many runtimes/languages) the CLI doesn’t really know much about booleans – under the hood they are treated pretty-much the same as integers, with special treatment for “zero” and “not zero”. That means that technically we aren’t restricted to the expected values – it is simply that C# handles all the logic to make sure you only get sane values. But we can write our own IL – here I’m using DynamicMethod to create a bool with underlying value 2:


// write an evil non-C# function that returns something 
// unexpected for a bool (but legal IL)
var dm = new DynamicMethod("mwahaha", typeof(bool), null);
var il = dm.GetILGenerator();
il.Emit(OpCodes.Ldc_I4_2);
il.Emit(OpCodes.Ret);
var func = (Func<bool>)dm.CreateDelegate(typeof(Func<bool>));
var superTrue = func();

The first thing to note is that you shouldn’t do this. So what does such a beast do? The first thing to note is that at first glance it appears to work as expected:


// is it true/false?
Console.WriteLine(superTrue); // prints: true
Console.WriteLine(!superTrue); // prints: false
// and does it *equal* true/false?
Console.WriteLine(superTrue == true); // prints: true
Console.WriteLine(superTrue == false); // prints: false

so the non-zero / zero handling is just like we expect? Well, no. The compiler isn’t helping us here, as it spots the constants. However, if we compare to variables, it uses the underlying value:


// really?
bool normalTrue = bool.Parse("true"), normalFalse = bool.Parse("false");
Console.WriteLine(superTrue == normalTrue); // prints: false
Console.WriteLine(superTrue == normalFalse); // prints: false

(I’m using bool.Parse here to avoid the compiler getting clever)

And ordering?

We could postulate that this “super-true” is more than “true”, or maybe we might suppose the implementation is going to just treat it as true. In fact, neither is correct – the implementation of CompareTo (testing in .NET 4) means that it simply breaks:


// how about sorting...
Console.WriteLine(superTrue.CompareTo(true)); // prints: 1
Console.WriteLine(true.CompareTo(superTrue)); // prints: 1 - oops!

That is going to make sorting very painful.

So what is your point?

Well, the first thing to note is that I haven’t disproven the original question; as far as I can tell, false is always less than true. However, you can get some very bizarrely behaved booleans. You would have to be crazy to deliberately introduce such shenanigans into your code, but it is something to perhaps at least know about as part of defensive programming / data sanitization – I’m fairly sure you could cause some nasty effects by passing 17 in as a bool to a few places…

Tuesday, 15 June 2010

Extending ASP.NET MVC with custom binders and results

Recently, I was putting together a brief sample for someone who wanted to use protobuf-net with ASP.NET as part of a basic HTTP transport; fairly simple stuff – just using protobuf as the binary body of the HTTP request/response.

It was a simple enough demo, but it got me wondering: what would be required to do this (more cleanly) in ASP.NET MVC? In many ways we just want something pretty similar to how many people currently use JSON: forget the formal API – just tell me the routes and let me know what data I need.

It turns out that this is astonishingly easy…

The controller

A good way of understanding the big picture is to look at the controller. So let’s take a peek:

One very basic controller, with one action. Things to note:

We want to read the request parameter (“req”) from the request body
To do that, we’re using a custom binder, via [ProtoPost] which we’ll see shortly
We could have exposed different input arguments separately – it would have added complexity, and isn’t very compatible with the way that requests are written in the .proto language, so I’ve restricted myself to binding a single (composite) parameter
We run some logic, same as normal
We return an action-result that will write the response back as protobuf

The nice thing about this is that (no matter which option you use to control your routes) it is very simple to add actions (as different urls).

That might be all you need to know!

I’ll include more explanation for completeness, but if you want a simple way of throwing objects over http efficiently and in an interoperable way, that might be enough.

The binder

The binder’s job is it understand the incoming request, and map that into things the controller can understand, such as method variables. There are two parts to a binder: writing the code to handle the input, and telling the system to use it. Writing a binder, it turns out, can be pretty simple (well – if you already have the serialization engine available); here’s all there is:

This simply checks that the caller is doing a POST, and if they are it passes the input stream to protobuf-net, based on the parameter type (which is available from the binding-context).

The other part is to tell ASP.NET MVC to use it; in my case I’m using a new [ProtoPost] attribute which simply creates the binder:

The result

Just as a controller doesn’t need to know how to parse input, it also doesn’t know how to format output – that is the job of an action-result; we’ll write our own, that we can re-use to write objects to the response stream. Which is actually less work than writing the last sentence!

Again – some very basic code that simply takes an object and writes it to the output stream via protobuf-net.

Anything else?

OK, I’ve taken some shortcuts there. To do it properly you might want to check for an expected content-type, etc – and things like security is left open for your own implementation (just like it would be for any other similar object-passing API). I’m just showing the core object <===> stream code here. But it works, and is painless.

The example code (include the ASP.NET / IHttpHandler equivalent, and an example client) is all available in the protobuf-net trunk, here.

Sunday, 6 June 2010

protobuf-net; a status update

I keep promising “v2” is just around the corner, and (quite reasonably) I get a steady stream of “when” e-mails / tweets etc, so it is time for a status update…

…but first I must apologise for the delay; the short version is that in the last few weeks I’ve changed jobs, and it has taken a bit of time to get my daily pattern (work time, family time, geek time, etc) sorted. Since I no longer spend 4 hours a day commuting, in theory I have more time available. In reality I’m spending lots more time with my family, which I don’t regret. But I’m getting back into the “crazy geek” thing ;p

So where is protobuf-net?

The first thing to note that “v2” is not a minor change; it completely changes the crazy-complex generics pipeline (which killed CF, and was generally a bad design choice for lots of reasons), opting instead for a non-generic pipeline coupled with an IL emitter to get the maximum performance. It also introduces a whole new metadata abstraction layer, and static dll usage (so it works on iPhone (but don’t ask me how the SDK rules affect this; IANAL), Phone 7, and so that CF doesn’t have to use reflection).

The numbers

Tests for new new features (metadata abstraction, IL generation, etc): 100%
Compatibility tests:79% (228 / 287, plus some skipped that won’t be in the first alpha release)

I should note: my unit tests for this tend, right or wrong, to each cover a number of features; it is very hard to make the regression tests overly granular, but to be honest I think we can all agree that the main objective is not to have the world’s most beautiful tests, but rather: to check that the system works robustly.

The gaps

Mapped enums; it’ll handle enums, but it currently only treats them as pass-thrus to the underlying primitive value. It doesn’t currently process [ProtoEnum]. Not hard, just time.
Member inference (which was the best “v1” could do to allow serialization of types outside your control); this doesn’t involve IL, so not too tricky – and not used by the majority.
Roundtrip for unexpected fields (currently dropped; this also impacts the memcached transcoder)
Parse/ToString fallback for unknown types (another “v1” feature)
Packed arrays (an optional, alternative wire-format for lists/arrays; this won’t affect you if you haven’t heard of it, since it is opt-in)
*WithLengthPrefix – for reading/writing multiple objects to/from a single stream
GetProto (this is deferred from the alpha release, as it isn’t “core” functionality, and is significant work to do properly)

No show-stoppers; just work to do. Not long now. I’m reluctant to promise a “when” since I’ve delayed already…

Sunday, 30 May 2010

When to optimize?

Prompted by a tweet earlier, I thought I’d share some thoughts on optimization. In code, there are diverging views here, varying between the extremes:

Forget the performance; write an awesome application, then see (perhaps profile, perhaps just use it) where it sucks
Performance rules! Give yourself an ulcer trying to eek out every last picosecond of CPU and nibble of RAM!

We've all been there Obviously there is some pragmatic middle-ground in the middle where you need to design an app deliberately to allow performance, without stressing over every line of code, every stack-frame, etc.

Is this juxtaposition valid?

I suspect it is. In my day job I’m an app developer: what matters is shipping an app that works, is usable, is clear, doesn’t corrupt the database, doesn’t e-mail real clients from the test server*, etc. Performance is important, but management aren’t going to be impressed with “I shaved 3ms/cycle from the ‘frobing’ loop! oh, yes, but the login page is still broken”.

But my “for kicks” OSS alter-ego is a library developer. Shipping your library with great features is important, but library developers are (in my experience at least) far more interested in all the micro-optimisations. And this makes sense; the people using your library are trusting you to do it well – they’re too busy getting the login page working. And for a library developer, getting these details right is part of how you deliver the best product.

Summary

So next time you hear somebody worrying about whether to use string.Concat(string[]), string.Concat(object[]) or StringBuilder: it could be random micro-optimisation gone wild (and please do kick them); but maybe, just maybe, you’re dealing with a closet library developer.

*=yes, I’ve accidentally done that. Once. Fortunately we caught it quickly, and the data wasn’t rude or anything. Lawsuit avoided.

Thursday, 6 May 2010

Strings: sorted

Sorting data… I don’t mean complex bespoke types that you’ve written – I mean inbuilt types like “string”. You probably wouldn’t think too long about trying to sort strings – you just expect them to work.

Oddly enough, this wasn’t actually always the case. OK, it is an edge case, but string comparison wasn’t actually guaranteed to be transitive. By this I mean that if we’ve got three values A, B and C, and we’ve tested that A comes before B, and B comes before C – then you might reasonably deduce that A comes before C. This is a key requirement for sorting (and has always been documented as such). But it wasn’t always the case!

Why is this bad? At best case, your data can choose random-looking sort orders. At worst case a “sort” operation may loop forever, attempting to shuffle data that is teasing it.

Even though this oddity was rare, it was problem enough for it to come up in the wild, way back when usenet was in vogue (in other news: Microsoft is shutting down their usenet farm… is that news to anyone?).

It was news to me, but while discussing this curio, somebody observed that it is now fixed in .NET 4.0; the nature of the fix means that the code in the connect article (above) needs re-ordering to show it working and not-working as you flip between framework versions; here’s the updated sample:

        string s1 = "-0.67:-0.33:0.33";
      string s2 = "-0.67:0.33:-0.33";
      string s3 = "0.67:-0.33:0.33";
    
      Console.WriteLine(s1.CompareTo(s2));
      Console.WriteLine(s2.CompareTo(s3));
      Console.WriteLine(s1.CompareTo(s3));

In .NET < 4.0 this shows “-1 –1 1” (this is a fail). In .NET 4.0, it shows “1 1 1” (a pass). I tried a range of combinations and couldn’t make it misbehave. A small matter, maybe, but I’m much happier knowing that this has finally been tucked up in bed.

Friday, 30 April 2010

Walkthrough: protobuf-net on Phone 7

UPDATE: better tools for this now exist; see precompile

Now’s my chance to demonstrate the (pre-alpha) “v2” protobuf-net API, while at the same time showing it working on Phone 7. In particular, this shows the new pre-compilation functionality, which is vital for good performance on Phone 7, iPhone, XNA, Compact Framework, etc (since they have scant meta-programming).

But first…

Why? Just why?

You have a mobile device; data should be considered a limited resource, both in terms of financial cost (data plans etc) and bandwidth-related performance. Therefore we want to send less over the wire.

Google’s protobuf wire-format is extremely dense. In theory you can get a few bytes tighter if you squeeze every nibble manually, but you’ll go crazy in the process.
Mobile devices rarely have full meta-programming; we don’t want reflection (too slow), so we want fully pre-compiled serializers.
Code-generation is one option, but most code-gen APIs want you to use their object model. Thanks very much, but I already have an object model. I’ll use that, ta.
Shiny.

Building your domain model

You can use protobuf-net from a .proto (Google’s DSL for describing protobuf schemas), but it is my belief that most .NET developers want to work with their existing model. So let’s start with a basic model.

For this demo, I’m starting with a Silverlight library project to represent my model (or it could also be a DTO layer). There are various ways of setting this up, but this is the simplest while I work out the kinks…

My model is very basic; just an order header / detail pair. For simplicity I’m using the protobuf-net attributes to allocate field-numbers to the properties, but there are may other ways of doing this now (including support for vanilla objects with no adornments). Here we go; very simple:

Generating a serialization assembly

This is the most obvious difference in “v2”; you can now pre-generate the serialization code into a fully static-typed assembly. The “generate at runtime” model is still supported for “regular” .NET, and has also had a complete overhaul improving performance noticeably.

“The plan” here is to write a simply utility exe that you can call in your build process to do this step, but I don’t have that complete yet. Instead, just for this demo, I’m going to use “regular” .NET console exe, referencing the “regular” protobuf-net assembly (and the DTO assembly), with an entire 4 lines of code:

This tells the API what types we are interested in, and offers a wide range of options for changing how the model is mapped. We’re passing in “true” to let it figure out the details itself.

Running this exe generates “MySerializer.dll”, which is our serialization assembly with a class “OrderSerializer”. It still needs some utility code from protobuf-net, so in our Phone 7 app we’ll need references to the DTO assembly, the serializer assembly and the Phone 7 version of protobuf-net.

Now all we need to do is use the serializer! In “v1” we would have used static methods on “ProtoBuf.Serializer”, but instead we want to use our specific pre-generated type:

We also need to create some sample data to play with:

Next, purely for illustration we’ll ignore the handy “DeepClone” method, and serialize / deserialize the object via a MemoryStream:

And because I have extremely limited UI skills, I’ll just throw the cloned data into a ListBox:

Et voila; working protobuf-net serialization via a pre-generated serialization dll, on Phone 7.

Initial performance tests seem to indicate that it is in the region of 30-50 times faster than the DataContractSerializer that ships with Phone 7, and the protobuf wire format typically takes about 20% of the bandwidth. Win win.

Caveats

The “v2” code isn’t complete yet! I’m still working through all the compatibility tests. It is getting there, though. It is a week or so away from a stable beta, I suspect.

Additionally, I’m still working through some of the issues associated with using “regular” .NET to generate the assembly for a light framework such as CF / Phone 7. Some combinations seem to work better than others.

protobuf-net v2 on Phone 7

I mentioned previously how protobuf-net “v2” works on iPhone; well, I guess we’re all still waiting to see how §3.3.1 plays out there…

So rather than mope, here it is working on Phone 7 Series emulator (¿simulator?), from the new CTP released today.

Importantly, Phone 7 lives somewhere between Silverlight and Compact Framework. It does not have any meta-programming support (no ILGenerator etc). So in “v1” (the currently downloadable dll) it had to fall back to reflection – slow, but it will get there.

In “v2”, we get the best of both worlds; on “full” .NET wecan use runtime meta-programming to generate fast IL on the fly, but if we know we are targeting a slim framework (or we just don’t want to use meta-programming at runtime) we can pre-generate a serialization dll.

I wish the screenshot was more exciting, but I’m a bit busy working through the last few “v1” compatibility checks. For obvious reasons I haven’t tried this on a physical device. When they’re ready, if Microsoft want to send me a unit for testing purposes, that would be fine… [holds breath].

Strong Naming on Phone 7

I spent a merry lunchtime looking at the new CTP of Phone 7. It is looking very promising, but interestingly it doesn’t (yet?) include options in the UI to strong-name your library projects. It does, however, still seem to work if you just edit the csproj by hand.

Notepad will do the job, or in Visual Studio right-click in Solution Explorer to “Unload Project”, then right-click again to “Edit …” it.

Add the lines below, and finally right-click to “Reload Project”. Inelegant, but I can live with it ;-p

<SignAssembly>true</SignAssembly>

<AssemblyOriginatorKeyFile>your.snk</AssemblyOriginatorKeyFile>

I have no idea if this is a sane thing to do, but it makes my code work (I’m doing something a bit out of the ordinary), so I’m happy.

Tuesday, 20 April 2010

Using dynamic to simplify reflection (and: generic specialization)

As I’ve noted in the past; reflection and generics don’t mix very well. The story isn’t quite as bad in .NET 4 though; here’s an example of something that is just painful to call (for the correct T known only at runtime) in C# 3.0:

And remember – we’re talking about the scenario where T is only known at runtime:

Previously we needed to use lots of MakeGenericMethod (or MakeGenericType) – things like:

You can see how that might quickly become boring. But in .NET 4 / C# 4.0 we can exploit “dynamic” to do all the heavy lifting and cache a delegate per scenario for us:

Much better!

Another possible use of this is generic type specialization:

Now this version is called for `int` arguments, otherwise the generic version is called – just as would have been the case from static C#. Magic!

Thursday, 15 April 2010

The spec is a lie! (the cake is probably real)

My current bugbear: the C# 4.0 spec. Obviously, with a new toy like VS2010 the first thing you do is read all the documentation, yes? yes? anyone? Oh, just me then…

For whatever reason, the version that shipped in the release version of VS2010 isn’t the proper version. We can conclude this at a minimum because it doesn’t include the lock or event changes.

Now call me crazy, but I’m of the opinion that the spec is quite important in something like a programming language, yet at the moment the new shiny compiler is (by definition) not a compliant C# 4.0 compiler; it generates the wrong IL (according to the spec) for locks and field-like events.

I’m also still very unclear whether any other subtle changes have sneaked in (excluding the obvious changes for things like dynamic, variance, named arguments, optional parameters, etc).

I should stress: my aim here is in part to say “don’t bother re-reading the spec yet”, and in part to add one more grain of sand to the nagging I’ve been doing behind the scenes for ages on this topic. The right people already know about this – I just really want them to get it done and publish the darned spec! And ideally a change log, if they are feeling generous.

You know it is the right thing!

Wednesday, 14 April 2010

protobuf-net / VS2010 (and writing VS2010 packages)

With the release of VS2010, I was torn – between:

re-releasing the current VS add-in (for protobuf-net v1, but with VS2010 support)
holding off (building VS packages is fiddly) and releasing the (hopefully not-very-far-away-at-all-honest-no-really) v2 with VS2010 support

It quickly became apparent that getting VS2010 support now was more important than getting the new shiny v2 bits, so without further ado protobuf-net (v1) with VS2010 support is now available on the project site.

As I say, building VS packages isn’t exactly something you want to do often (unless you do it routinely enough to have a highly polished and fully automated process), and I wasn’t 100% sure that the VS9 SDK API would work against VS10. Fortunately, it all went smoothly. Too smoothly! If you have your own custom packages, you may well find that simply duplicating your 9.0 hive changes into the 10.0 key does the job, giving you an installer that targets both IDEs.

On a v2 note; this really is getting there now; most of the core logic is implemented (although there is still a fair bit of testing to do), and it is looking very promising. I hope to be able to release something over the next few weeks.

You might guess (correctly) that I’m a bit peeved about the Apple shenanigans – but whichever way that plays out I’m content: protobuf-net v2 is quite simply a superior product to v1, and I’ve had lots of geek-fun (which is like regular fun, but harder to explain) learning the guts of IL; previously largely a read-only topic to me.

Any problems / feedback – please let me know.

Thursday, 18 March 2010

Revisited : Fun with field-like events

Previously, I spoke about some problems with field-like events, in particular exacerbated due to two points:

the synchronization strategy is different between the ECMA and MS specifications (so you can't robustly mimic the compiler's lock behaviour)
inside the type, += talks directly to the field and does not use the accessor (so you can't ask the compiler nicely if you can please use the thread-safe code that it wrote for you)

leaving you very few options (except manual backing delegate fields and manual synchronization) for writing properly thread-safe event code in some corner-cases (note: most events don't need this level of attention!).

Well interestingly enough, things have changed in 4.0; I must have missed the big announcement, but Chris Burrows has three posts covering this. The last is largely additional info, but the short version of the first two is:

The generated code is no longer a "lock(this)" (or "lock(type)" for static) - it uses lock-free exchange to do away with this ugly requirement (amusingly, the ECMA spec always maintained that the "how" was an implementation detail; only the MS spec insisted on a specific pattern)
Inside the type, += and -= now use the add and remove accessors respectively, not the fields

Basically, my previous problem-case would now have worked just fine. I feel vindicated, but also very happy that this has been fixed.

On a related note; if you aren't already aware, note that even "lock(someObj)" gets an overhaul in the 4.0 compiler - see Eric Lippert's post: Locks and Exceptions do not mix

Monday, 15 March 2010

When is an int[] not an int[]?

I’ve spent my entire train journey trying to get to the bottom of this, so I thought I'd blog it for posterity. In my crazed Reflection.Emit frenzy, my unit tests were erroring with PEVerify complaining about illegal ldlen codes:

[offset 0x....] Expected single-dimension zero-based array.

If you're doing meta-programming, tools like PEVerify and Reflector are your closest allies, but this took some head-scratching. I even distilled the code down to two seemingly identical bits of code that read and discard the length of an array variable initialized to null:

The first pane declares “loc 0” and “loc 2” as a local int[] variables; forget about “loc 1” – it is unrelated. The second pane initializes each array variable as a null reference, obtains the length (which is a “native int” which I immediately convert to Int32), and then discards the value.

So why the error? And why one error and not two? PEVerify is, after all, a chatty beast… Either I’ve gone crazy in my code, or somebody is lying to me! Actually, both it turns out.

Pop quiz: what is the difference between these two Type instances representing a 1-dimension array of int:

Type explicitRank = typeof(int).MakeArrayType(1),
implicitRank = typeof(int).MakeArrayType();

The second is our friend, int[]. The first is something different, though; it is a 1-dimensional array of int sure enough, but it isn’t explicitly zero-based! (correction due: see comments) D’oh! It goes by the moniker int[*].

Simply; you can’t use ldlen on an int[*] – only an int[]. What I don’t yet understand is why the upstream code (when it assigned the array “for real”) didn’t complain about the very attempt to assign an int[] value (from a standard “get” accessor) to an int[*] local variable. Presumably the PEVerify authors didn’t think anyone would be stupid enough to try ;-p

The moral here; sometimes it pays to be less explicit (and I don’t just mean the language I used when I found the problem). I’ve also left feedback with Red Gate to tweak how it displays, but to be honest the number of people this cosmetic glitch will affect is minimal.

Sunday, 14 March 2010

Binary data and strings

Don't expose your bits in public You have no idea how often I get an e-mail that talks about binary data (for me, usually protobuf data) and string encoding (usually via utf-8).

Here’s the thing folks: most binary data cannot be translated to a string with a utf-* translation. It just isn’t possible. If you are talking about an arbitrary byte sequence that you need to store or transmit as a string, then your best bet is base-64.

Fortunately, in .NET Convert.ToBase64String and Convert.FromBase64String deal with all the details, making it trivial to use.

I keep talking myself into (and then back out-of) adding string-based overloads (etc) to protobuf-net; maybe I’ll add them just to stop the internal debate…

Sunday, 7 March 2010

The last will be first and the first will be last

I’ve probably mentioned that I’m currently re-writing some code using IL emit. Very interesting work, and I’ve learned a lot about the CLI in the process.

The work I’m doing at the moment works using the decorator pattern, giving each node the chance to manipulate the value, which is usually expected to be at the top of the stack.

The problem is… imagine we want to call:

someWriter.WriteInt32(theValueOnTheStack);

To call this, we need “someWriter” before the value on the stack (i.e. push “someWriter”, push the value, callvirt). There are two options at this point:

Change the calling code so that we already pushed “someWriter” earlier in the code (very problematic – it makes the stack very complex, especially considering branching etc)
Declare a local variable, store the value on the stack into the variable, push “someWriter”, push the value from the local variable, callvirt

Only the second is an attractive option, but in itself this causes me problems – not least in some code where this would lead to a lot of locals being declared (even though I have some fancy local pooling to re-use like-typed variables as far as possible; the problem is when lots of different types are involved).

So how about we change the API? As it happens, “someWriter” is easily available (as arg1 or arg2). What about if we made it a static method, and passed the instance in last?

SomeWriterType.WriteInt32(theValueOnTheStack, someWriter);

OK – this is a pretty freaky calling convention, but it fits our typical usage perfectly. Given that the value is already on the stack, now I just need to push “someWriter”, call (not callvirt). Obviously the ex-instance-method code needs changing – essentially the different between:

Note that it is important that I am not using “virtual” methods in all this, as that would demand the original usage.

But does it cause a performance problem? (note: yes I know this is all micro-optimisation, but I have a genuine reason for trying to minimise the stack size, which is the real purpose behind this).

So I fired up a typical test-rig that calls the methods an insane number of times:

The results aren’t necessarily surprising – simply confirming that this doesn’t have any negative impact on performance. It actually makes things slightly faster, but all I really needed to know is whether it would end up being 100 times slower or not. It isn’t. Not having the locals is my main aim.

So it looks like I’m going to end up with backwards methods. Which is fine – only my code will see it (it isn’t part of the user-facing API, although it does have to be public to allow for calling from other assemblies).

In this case, I can live with an odd calling convention ;-p

Friday, 26 February 2010

protobuf-net v2 on the iPhone

A basic iPhone application showing two fields with values in; there isn't much to see.

It might not look very impressive to you, but this is pure joy to me. It is the first evidence I’ve seen that the ~~revised~~ rewritten protobuf-net can work in MonoTouch for use on the iPhone – and presumably MonoDroid in the future.

A big thanks here to Christian Weyer and Andrew Rimmer who both offered some help here (since I don’t actually own a mac, an iPhone or MonoTouch!). The grab is from Andrew (he assures me it works fine on the actual device too).

This piece of work has been a slog, and I kept putting it off (because frankly, re-writing the entire stack is a huge job). The goals have not been trivial:

Continue to support all the existing runtime-only usage…
..while also making it support fully static / AOT compatible
Allow it to write serialization dlls in advance (like sgen)
Fix the stability issues on Compact Framework (caused by overuse of generics)
Remove some performance bottlenecks on Compact Framework
Support POCO (or more accurately, work without requiring special [attributes]) so you can use it with types outside your influence, or for inheritance that isn’t known at compile-time
Make it work on iPhone etc
Make it even faster
Make it more extensible (for example, what if I want to write a proxy that writes a DataTable to a protobuf stream?)
Make it support mutable structs (yes I know that they are usually evil, but people simply keep asking for this, in particular for XNA)
Remove a heap of code-debt caused by an evolving project that grew without a 100% clear direction
Throw in some other feature-requests and bug-fixes for good measure
And as a side-benefit, it even compiles and runs on .NET 1.1 or via mcs on Mono (with minimal #if usage)

I should stress that it is not yet complete – but the core of the engine is there and working, demonstrating most of the key points above (albeit in a very raw state). There’s still a way to go, but it is getting there.

Saturday, 13 February 2010

Everything “is” an object?

This nugget of the .NET unified type system. And the cause of oh so much misunderstanding.

My way of looking at it is that it isn’t helpful to think of value-types (structs in C# parlance) “as objects”. You can treat them as though they are, but you need to fundamentally change them (box them) in order to do it. Which has lots of uses for sure; but just make sure you understand what you are doing and why. If a value-type truly was an object you wouldn’t have to do this.

It is a bit like saying all people are meat; technically true, but if you start manipulating me like you would beef, don’t expect me to be quite the same afterwards.

Instead, I try to emphasize that everything can be treated as an object.

Oh, and mutable structs are evil, but you already knew that.

Tuesday, 2 February 2010

New spike - looking hopeful

After a few false starts, I've finally found some time to spike the much-needed re-work for protobuf-net; this pretty much means "re-write it from scratch, learning from design mistakes and copying the occasional block of code".

Key changes / goals:

Generics: for the most part, gone from the guts of the implementation; it now uses a non-generic "decorator" approach - you wouldn't believe how much pain they have caused me (here)
This should hopefully fix compact-framework (here, here), and might (no promises at this stage) even allow it to work on .NET 1.1 and micro-framework
Build around a runtime model rather than attribute reflection (but with attribute reflection to be added as a model provider)
Much easier to unit-test individual decorators
Double implementation; vanilla decorator for "light" frameworks, and compiled implementation for the full-fat versions
I lack the hardware to test, but I'm hopeful that the vanilla decorators will work on iPhone - I don't suppose you have an iPhone and iMac to spare?
Potential to emit the compiled version to disk (think: sgen)
Added buffer-pooling (makes it possible to use a larger buffer with minimal allocation costs)
Re-written the string handling to use pointer (unsafe) code where appropriate, and better incremental encoder handling

There is still masses to do - but I'm feeling pretty happy with the initial results... I've got the core engine in place, and enough decorators to write some basic messages. Timings below, with meanings:

New (decorator) - vanilla decorator implementation - no optimisations (yet; although I'm sure we can improve this once I can tell what works/where)
New (compiled) - compile/flatten the decorators into a single operation
New (delegate) - same, but to a delegate rather than class (just for fun)
Old: the existing protobuf-net "generic" Serializer

   New (decorator): 3399ms
  New (compiled): 788ms
  New (delegate): 619ms
  Old: 758ms
  New (decorator): 2569ms
  New (compiled): 642ms
  New (delegate): 604ms
  Old: 754ms
  New (decorator): 2549ms
  New (compiled): 631ms
  New (delegate): 623ms
  Old: 740ms
  New (decorator): 2589ms
  New (compiled): 640ms
  New (delegate): 608ms
  Old: 750ms
  New (decorator): 2587ms
  New (compiled): 638ms
  New (delegate): 614ms
  Old: 750ms
  New (decorator): 2433ms
  New (compiled): 519ms
  New (delegate): 504ms
  Old: 664ms
  New (decorator): 2318ms
  New (compiled): 512ms
  New (delegate): 506ms
  Old: 596ms

Oh, and I need to fix all the other things, too... sigh.

Thursday, 28 January 2010

Distributed caching with protobuf-net

No, this isn't my server farm.

I’ve been playing with distributed caches lately, and of course this is a really good use-case for something like protobuf-net – if there is anywhere you want to minimise your memory costs, network IO costs and CPU costs, it is your cache (it gets kinda busy).

As a pure example of “which system can I get installed and running the quickest”, I looked at memcached, using the windows service from here and the enyim.com client. Not only were they a breeze to get working, but this setup seems pretty popular, and also covers memcached providers, which uses enyim under the hood.

Fortunately, the enyim client has an extension-point that allows you to nominate a serializer through your config file (actually it calls it a transcoder, but I’m not sure it fits the definition). This allows us to swap BinaryFormatter for protobuf-net, which (for suitable types) has huge savings – in my crude tests I was seeing only 20% of the original network traffic, and roughly 75% of the original CPU costs.

Enough bluster…

OK, so in r282 I’ve added just such a transcoder (I still don’t like the name…) – and the good thing it is so effortless to configure it:

Before:

<enyim.com>
   <memcached>
       <servers>
           <add address="127.0.0.1" port="11211" />

After:

<enyim.com>
   <memcached transcoder="ProtoBuf.Caching.Enyim.NetTranscoder, protobuf-net.Extensions">
       <servers>
           <add address="127.0.0.1" port="11211" />

Wasn't that was painless? and some huge benefits (although obviously you’d want to run some integration/regression tests if applying this to an existing system). Oh, and of course you need to include protobuf-net and protobuf-net.Extensions.dll in your project, in addition to the enyim dll.

My only niggle is that because there is no way of knowing the type up-front, I had to include the outermost type information on the wire (hence the “Net” in the transcoder’s name, since it won’t be truly portable). I could get around this by having the caller indicate the type, but the enyim dll doesn’t play that game.

What next?

If you find it good/bad/pointless/buggy please let me know.

If you’ve got another cache that might benefit from some protobuf-net love, please let me know. Velocity, perhaps? Azure?

Monday, 4 January 2010

Reflections on Optimisation – Part 3, Fields and Properties

Short version:

Unless you are doing something interesting in your accessors, then for “full” .NET on classes, it makes no difference. Use automatically-implemented properties, and “commit”.

Much ado about nothing:

Previously, I discussed ways of modelling a general-purpose property accessor, and looked at existing APIs. So what next? I’m rather bored of conversations like this on forums:

OP: My flooble-gunger application uses public fields to hold the flipple-count, but I’m getting problem…

Me: back up there…. why does it use public fields?

OP: For optimisation, obviously!

But is there anything “obvious” here? I don’t think this is news to most people, but public fields are not generally a good idea – it breaks encapsulation, is (despite belief otherwise) a breaking change (and sometimes build-breaking) to swap for properties, doesn’t work with data-binding, can’t participate in polymorphism or interface impementation, etc.

With regards to any performance difference, you would expect that in most “regular” code, member-access does not represent the overhead. We would hope that the only times when member-access might be truly interesting is (like for me) if you are writing serialization / materialization code. Still, let’s plough on with this scenario, and check things out.

In particular, for simple properties (for example, C# 3.0 “automatically implemented properties”) our expectation is that they will commonly be “inlined” by the JIT. The big caveat here (where I might forgive you) is Compact Framework (for example XNA on XBox 360), which has a much weaker JIT.

So as before; let’s test it. Another very simple test rig, I want to compare (each in a loop):

Direct property access on an object
Direct field access on an object
A wrapper (as before) that talks to a public property
A wrapper (as before) that talks to a private property
(automatically implemented properties vs manual properties)
A wrapper (as before) that talks to a public field
A wrapper (as before) that talks to a private field

To achieve the private tests, I’ve created the wrapper implementation as a nested type of the object I’m testing. Oddly enough, there were some oddities that popped up, but I’ll discuss those next time; this post has been long enough, so here’s the very uninteresting result (sorry for the anti-climax):

Implementation	Get performance	Set performance
Prop, no wrapper Field, no wrapper Public prop wrapper Private prop wrapper Auto prop wrapper Public field wrapper Private field wrapper

(numbers scaled, based on 650M iterations, lower is better)

Both direct and via abstraction, the cost is identical for fields and properties. Quite simply – don’t stress over it.