Sunday, 19 September 2010

protobuf-net; ushort glitch before r274

Dammit, I hate it when a gremlin shows up. I hate it even more when they show up a year later…

It turns out that a bug “fix” to ushort (unsigned 16-bit integers) handling back in October 2009 introduced a slight… “nuance” when moving between versions < r274 and >= r274. Basically, due to a braindead bug on my part, it was treating ushort data as strings rather than handling appropriately. This was fixed a year ago, but if you are moving between versions is a PITA.

Sorry for any inconvenience folks, but… well, it happened. I can’t change that. Anyway, how to fix?! The good news is NO DATA IS LOST - it is just a bit trickier to access. You wouldn't have encountered the issue if working against rigid .proto / cross-platform definitions (since .proto doesn't have direct ushort support), so I’ll limit myself to the pure .NET scenario. In which case, IMO the easiest fix is to introduce a shim property, so:

public ushort Foo {get;set;}

might be updated to:

private string FooLegacy { // a pass-thru
    get {return Foo.ToString();}
    set {Foo = ushort.Parse(value);}
private bool FooLegacySpecified { // suppress serialization
    get {return false;}
    set {}
[ProtoMember(42)] // any unused field number
public ushort Foo {get;set;}

This allows old-style and the correct data to be used side-by-side (always writing to the new format). For arrays, see my longer answer here, which also switches to the more efficient "packed" encoding, introduced later to the protobuf spec.

Sorry for any inconvenience folks. Fortunately use of ushort is fairly uncommon in .NET, so my hope is that there aren't vast hordes of people impacted by this unfortunate foible.

As a side note, I should add: I considered trying to add automatic handling of old data, but it transpires that the borked data is so similar to "packed" encoding as to be ambiguous in some cases, so trying to guess was too risky. And it would compound the issue by layering hack upon bug.