Code, code and more code.

Thursday, 23 September 2010

Other news…

Sorry if this is turning into a bit of a serialization fest; I do have other coding hobbies; simply this is the thing I’ve been spending my spare time on lately ;p

Silverlight WCF

It recently came to my attention that possibly Silverlight does now support the WCF extension points for swapping the serializer at runtime. Not via xml configuration, but via code configuration. Full details on Carlos’ Blog, but that is definitely an area I want to investigate – at least, if @abdullin will let me.

Interface serialization

Something a few people have asked for, and definitely not outside reach. In fact, Wallace Turner kindly donated a v1 patch to enable interface-based serialization. I’m keeping this “on ice” for now, simply to keep my main focus on finishing v2. But this exists and is certainly on the table for consideration.

Anyway, enough for now; the day job starts again in not-a-lot-of-hours…

protobuf-net on MonoDroid

I was very pleased to get an invite to the MonoDroid (Mono tools / environment for Android) beta earlier in the week, so in addition to a few v2 commits I’ve been playing with shiny new toys.

Wow; decent reflection!

Initially I made the mistake of thinking that MonoDroid would have the same reflection limitations as (say) MonoTouch. I even tied myself in knots trying to get around them! Then Jb Evian pointed out that actually MonoDroid can do full meta-programming at runtime! That is a huge plus for me… so; whack in a few conditional-compilation symbols and – wow! It works!

A taster

Don’t get too excited – this is only going to be brief…

Yes, another amazing example of my l33t UI skillz

Only intended to show something working, but here is v2 on MonoDroid, compiled at runtime, round-tripping some trivial data.

A small (but significant) step. Actually, the time taken to compile the model and run it the first time is still a bit slow for my liking – not sure if that is the emulator, the beta, or a general JIT issue – I’ll dig more and investigate rebasing further.

No really, this is all I have for gfx And of course as a fallback, I can still use runtime usage without “emit” – it is noticeably slower, but still works.

Again, without an actual device it is hard to gauge typical performance at this point, but I’m pretty hopeful.

So… if that works, what are you waiting for?

I still need to finish v2; some decent commits lately now that I’m all refreshed. The finishing line is within sight… oh, and MonoDroid needs to be released ;p

Sunday, 19 September 2010

protobuf-net; ushort glitch before r274

Dammit, I hate it when a gremlin shows up. I hate it even more when they show up a year later…

It turns out that a bug “fix” to ushort (unsigned 16-bit integers) handling back in October 2009 introduced a slight… “nuance” when moving between versions < r274 and >= r274. Basically, due to a braindead bug on my part, it was treating ushort data as strings rather than handling appropriately. This was fixed a year ago, but if you are moving between versions is a PITA.

Sorry for any inconvenience folks, but… well, it happened. I can’t change that. Anyway, how to fix?! The good news is NO DATA IS LOST - it is just a bit trickier to access. You wouldn't have encountered the issue if working against rigid .proto / cross-platform definitions (since .proto doesn't have direct ushort support), so I’ll limit myself to the pure .NET scenario. In which case, IMO the easiest fix is to introduce a shim property, so:

[ProtoMember(1)]   
public ushort Foo {get;set;}

might be updated to:

[ProtoMember(1)]   
private string FooLegacy { // a pass-thru    
    get {return Foo.ToString();}    
    set {Foo = ushort.Parse(value);}    
}    
private bool FooLegacySpecified { // suppress serialization    
    get {return false;}    
    set {}    
}    
[ProtoMember(42)] // any unused field number    
public ushort Foo {get;set;}

This allows old-style and the correct data to be used side-by-side (always writing to the new format). For arrays, see my longer answer here, which also switches to the more efficient "packed" encoding, introduced later to the protobuf spec.

Sorry for any inconvenience folks. Fortunately use of ushort is fairly uncommon in .NET, so my hope is that there aren't vast hordes of people impacted by this unfortunate foible.

As a side note, I should add: I considered trying to add automatic handling of old data, but it transpires that the borked data is so similar to "packed" encoding as to be ambiguous in some cases, so trying to guess was too risky. And it would compound the issue by layering hack upon bug.

Wednesday, 8 September 2010

Truer than true

(edit: apols for the formatting; google ate my images - have replaced with source, but raw)

I’ve been quiet for a little while, due to general fatigue, changing job etc – but after a holiday I’m back refreshed and renewed.

So let’s start with a fun edge condition, inspired by my colleague balpha, who asked me “does C#/.NET/LINQ guarantee false < true?”.

So, is it?

I’m leaving LINQ aside, as for anything other than LINQ-to-Objects, all bets are off; so let’s just limit ourselves to C#; in that case the answer is pretty simple, “yes”. System.Boolean implements IComparable and IComparable<bool> such that this will always work. But not all .NET is C# ;p So how can we be evil here?

What is a boolean, anyways?

First, we need to understand that (in common with many runtimes/languages) the CLI doesn’t really know much about booleans – under the hood they are treated pretty-much the same as integers, with special treatment for “zero” and “not zero”. That means that technically we aren’t restricted to the expected values – it is simply that C# handles all the logic to make sure you only get sane values. But we can write our own IL – here I’m using DynamicMethod to create a bool with underlying value 2:


// write an evil non-C# function that returns something 
// unexpected for a bool (but legal IL)
var dm = new DynamicMethod("mwahaha", typeof(bool), null);
var il = dm.GetILGenerator();
il.Emit(OpCodes.Ldc_I4_2);
il.Emit(OpCodes.Ret);
var func = (Func<bool>)dm.CreateDelegate(typeof(Func<bool>));
var superTrue = func();

The first thing to note is that you shouldn’t do this. So what does such a beast do? The first thing to note is that at first glance it appears to work as expected:


// is it true/false?
Console.WriteLine(superTrue); // prints: true
Console.WriteLine(!superTrue); // prints: false
// and does it *equal* true/false?
Console.WriteLine(superTrue == true); // prints: true
Console.WriteLine(superTrue == false); // prints: false

so the non-zero / zero handling is just like we expect? Well, no. The compiler isn’t helping us here, as it spots the constants. However, if we compare to variables, it uses the underlying value:


// really?
bool normalTrue = bool.Parse("true"), normalFalse = bool.Parse("false");
Console.WriteLine(superTrue == normalTrue); // prints: false
Console.WriteLine(superTrue == normalFalse); // prints: false

(I’m using bool.Parse here to avoid the compiler getting clever)

And ordering?

We could postulate that this “super-true” is more than “true”, or maybe we might suppose the implementation is going to just treat it as true. In fact, neither is correct – the implementation of CompareTo (testing in .NET 4) means that it simply breaks:


// how about sorting...
Console.WriteLine(superTrue.CompareTo(true)); // prints: 1
Console.WriteLine(true.CompareTo(superTrue)); // prints: 1 - oops!

That is going to make sorting very painful.

So what is your point?

Well, the first thing to note is that I haven’t disproven the original question; as far as I can tell, false is always less than true. However, you can get some very bizarrely behaved booleans. You would have to be crazy to deliberately introduce such shenanigans into your code, but it is something to perhaps at least know about as part of defensive programming / data sanitization – I’m fairly sure you could cause some nasty effects by passing 17 in as a bool to a few places…

Tuesday, 15 June 2010

Extending ASP.NET MVC with custom binders and results

Recently, I was putting together a brief sample for someone who wanted to use protobuf-net with ASP.NET as part of a basic HTTP transport; fairly simple stuff – just using protobuf as the binary body of the HTTP request/response.

It was a simple enough demo, but it got me wondering: what would be required to do this (more cleanly) in ASP.NET MVC? In many ways we just want something pretty similar to how many people currently use JSON: forget the formal API – just tell me the routes and let me know what data I need.

It turns out that this is astonishingly easy…

The controller

A good way of understanding the big picture is to look at the controller. So let’s take a peek:

One very basic controller, with one action. Things to note:

We want to read the request parameter (“req”) from the request body
To do that, we’re using a custom binder, via [ProtoPost] which we’ll see shortly
We could have exposed different input arguments separately – it would have added complexity, and isn’t very compatible with the way that requests are written in the .proto language, so I’ve restricted myself to binding a single (composite) parameter
We run some logic, same as normal
We return an action-result that will write the response back as protobuf

The nice thing about this is that (no matter which option you use to control your routes) it is very simple to add actions (as different urls).

That might be all you need to know!

I’ll include more explanation for completeness, but if you want a simple way of throwing objects over http efficiently and in an interoperable way, that might be enough.

The binder

The binder’s job is it understand the incoming request, and map that into things the controller can understand, such as method variables. There are two parts to a binder: writing the code to handle the input, and telling the system to use it. Writing a binder, it turns out, can be pretty simple (well – if you already have the serialization engine available); here’s all there is:

This simply checks that the caller is doing a POST, and if they are it passes the input stream to protobuf-net, based on the parameter type (which is available from the binding-context).

The other part is to tell ASP.NET MVC to use it; in my case I’m using a new [ProtoPost] attribute which simply creates the binder:

The result

Just as a controller doesn’t need to know how to parse input, it also doesn’t know how to format output – that is the job of an action-result; we’ll write our own, that we can re-use to write objects to the response stream. Which is actually less work than writing the last sentence!

Again – some very basic code that simply takes an object and writes it to the output stream via protobuf-net.

Anything else?

OK, I’ve taken some shortcuts there. To do it properly you might want to check for an expected content-type, etc – and things like security is left open for your own implementation (just like it would be for any other similar object-passing API). I’m just showing the core object <===> stream code here. But it works, and is painless.

The example code (include the ASP.NET / IHttpHandler equivalent, and an example client) is all available in the protobuf-net trunk, here.

Sunday, 6 June 2010

protobuf-net; a status update

I keep promising “v2” is just around the corner, and (quite reasonably) I get a steady stream of “when” e-mails / tweets etc, so it is time for a status update…

…but first I must apologise for the delay; the short version is that in the last few weeks I’ve changed jobs, and it has taken a bit of time to get my daily pattern (work time, family time, geek time, etc) sorted. Since I no longer spend 4 hours a day commuting, in theory I have more time available. In reality I’m spending lots more time with my family, which I don’t regret. But I’m getting back into the “crazy geek” thing ;p

So where is protobuf-net?

The first thing to note that “v2” is not a minor change; it completely changes the crazy-complex generics pipeline (which killed CF, and was generally a bad design choice for lots of reasons), opting instead for a non-generic pipeline coupled with an IL emitter to get the maximum performance. It also introduces a whole new metadata abstraction layer, and static dll usage (so it works on iPhone (but don’t ask me how the SDK rules affect this; IANAL), Phone 7, and so that CF doesn’t have to use reflection).

The numbers

Tests for new new features (metadata abstraction, IL generation, etc): 100%
Compatibility tests:79% (228 / 287, plus some skipped that won’t be in the first alpha release)

I should note: my unit tests for this tend, right or wrong, to each cover a number of features; it is very hard to make the regression tests overly granular, but to be honest I think we can all agree that the main objective is not to have the world’s most beautiful tests, but rather: to check that the system works robustly.

The gaps

Mapped enums; it’ll handle enums, but it currently only treats them as pass-thrus to the underlying primitive value. It doesn’t currently process [ProtoEnum]. Not hard, just time.
Member inference (which was the best “v1” could do to allow serialization of types outside your control); this doesn’t involve IL, so not too tricky – and not used by the majority.
Roundtrip for unexpected fields (currently dropped; this also impacts the memcached transcoder)
Parse/ToString fallback for unknown types (another “v1” feature)
Packed arrays (an optional, alternative wire-format for lists/arrays; this won’t affect you if you haven’t heard of it, since it is opt-in)
*WithLengthPrefix – for reading/writing multiple objects to/from a single stream
GetProto (this is deferred from the alpha release, as it isn’t “core” functionality, and is significant work to do properly)

No show-stoppers; just work to do. Not long now. I’m reluctant to promise a “when” since I’ve delayed already…

Sunday, 30 May 2010

When to optimize?

Prompted by a tweet earlier, I thought I’d share some thoughts on optimization. In code, there are diverging views here, varying between the extremes:

Forget the performance; write an awesome application, then see (perhaps profile, perhaps just use it) where it sucks
Performance rules! Give yourself an ulcer trying to eek out every last picosecond of CPU and nibble of RAM!

We've all been there Obviously there is some pragmatic middle-ground in the middle where you need to design an app deliberately to allow performance, without stressing over every line of code, every stack-frame, etc.

Is this juxtaposition valid?

I suspect it is. In my day job I’m an app developer: what matters is shipping an app that works, is usable, is clear, doesn’t corrupt the database, doesn’t e-mail real clients from the test server*, etc. Performance is important, but management aren’t going to be impressed with “I shaved 3ms/cycle from the ‘frobing’ loop! oh, yes, but the login page is still broken”.

But my “for kicks” OSS alter-ego is a library developer. Shipping your library with great features is important, but library developers are (in my experience at least) far more interested in all the micro-optimisations. And this makes sense; the people using your library are trusting you to do it well – they’re too busy getting the login page working. And for a library developer, getting these details right is part of how you deliver the best product.

Summary

So next time you hear somebody worrying about whether to use string.Concat(string[]), string.Concat(object[]) or StringBuilder: it could be random micro-optimisation gone wild (and please do kick them); but maybe, just maybe, you’re dealing with a closet library developer.

*=yes, I’ve accidentally done that. Once. Fortunately we caught it quickly, and the data wasn’t rude or anything. Lawsuit avoided.