Monday, 15 August 2011

Automatic serialization; what’s in a tuple?

I recently had a lot of reason (read: a serialization snafu) to think about tuples in the context of serialization. Historically, protobuf-net has focused on mutable types, which are convenient to manipulate while processing a protobuf stream (setting fields/properties on a per-field basis). However, if you think about it, tuples have an implicit, obvious positional “contract”, so it makes a lot of sense to serialize them automatically. If I write:

var tuple = Tuple.Create(123, "abc");

then it doesn’t take a lot of initiative to think of tuple as an implicit contract with two fields:

  • field 1 is an integer with value 123
  • field 2 is a string with value “abc”

Since a tuple like this is deeply immutable, at the moment we would need to either abuse reflection to mutate private fields, or write a mutable surrogate type for serialization, with conversion operators, and tell protobuf-net about the surrogate. Wouldn’t it be nice if protobuf-net could make this leap for us?

Well, after a Sunday-night hack it now (in the source code) does.

The rules are:

  • it must not already be marked as an explicit contract
  • only public fields / properties are considered
  • any public fields (spit) must be readonly
  • any public properties must have a get but not a set (on the public API, at least)
  • there must be exactly one interesting constructor, with parameters that are a case-insensitive match for each field/property in some order (i.e. there must be an obvious 1:1 mapping between members and constructor parameter names)
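To make the rules above concrete, here is a rough reflection sketch of how such a check might look. It is not protobuf-net's actual implementation: the names TupleContractSketch and FindTupleConstructor are made up for illustration, the "not already an explicit contract" rule is skipped (it would need protobuf-net's attributes), and it assumes "interesting constructor" simply means a public instance constructor whose parameter names match the members.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper for illustration only; not part of protobuf-net.
static class TupleContractSketch
{
    // Returns the single constructor that maps 1:1 onto the type's public
    // readonly fields and get-only properties, or null if the rules fail.
    public static ConstructorInfo FindTupleConstructor(Type type)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;

        // Any public fields must be readonly.
        FieldInfo[] fields = type.GetFields(flags);
        if (fields.Any(f => !f.IsInitOnly)) return null;

        // Any public properties must have a public get and no public set.
        // Indexers are rejected too (an assumption of this sketch).
        PropertyInfo[] props = type.GetProperties(flags);
        if (props.Any(p => p.GetIndexParameters().Length != 0
                        || p.GetGetMethod() == null
                        || p.GetSetMethod() != null)) return null;

        string[] names = fields.Select(f => f.Name)
                               .Concat(props.Select(p => p.Name))
                               .ToArray();
        if (names.Length == 0) return null;

        // Exactly one constructor may match the member names.
        ConstructorInfo[] matches = type.GetConstructors(flags)
            .Where(c => IsOneToOne(c.GetParameters(), names))
            .ToArray();
        return matches.Length == 1 ? matches[0] : null;
    }

    // Case-insensitive 1:1 match between parameter names and member names,
    // in any order.
    static bool IsOneToOne(ParameterInfo[] parameters, string[] names)
    {
        if (parameters.Length != names.Length) return false;
        var cmp = StringComparer.OrdinalIgnoreCase;
        return parameters.Select(p => p.Name).OrderBy(n => n, cmp)
            .SequenceEqual(names.OrderBy(n => n, cmp), cmp);
    }
}
```

With this sketch, Tuple<int, string> (properties Item1/Item2, constructor parameters item1/item2) and KeyValuePair<,> (Key/Value, key/value) both yield a constructor, while a type with public setters such as StringBuilder does not. A real implementation would presumably also check that each parameter's type agrees with its member.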

If all of the above conditions are met then it is now capable of behaving as you might hope and expect, deducing the contract and using the chosen constructor to rehydrate the objects. Which is nice! As a few side-benefits:

  • this completely removes the need for the existing KeyValuePairSurrogate<,>, which conveniently meets all of the above requirements
  • it also works for C# anonymous types if we want, since they too have an implicit positional contract (I am not convinced this is significant, but it may have uses)

This should make it into the next deploy, once I’m sure there are no obvious failures in my assumptions.

Just one more thing, sir…

While I’m on the subject of serialization (which, to be fair, I often am) – I have now also completed some changes to use RuntimeHelpers.GetHashCode() for reference-tracking during serialization. This lets me construct a reference-preserving hash-based lookup to minimise the cost of checking whether an object has already been seen (and if so, fetch the existing token). Wins all round.
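For the curious, the general shape of such a lookup might be sketched as follows. This is only an illustration of the idea, not the code in protobuf-net: ReferenceComparer, ReferenceTracker and TryGetOrAdd are hypothetical names, and the token numbering is an arbitrary choice. The point is that RuntimeHelpers.GetHashCode() and ReferenceEquals() ignore any overridden GetHashCode()/Equals(), so two distinct objects that happen to be "equal" by value still get separate entries.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical comparer: identity-based hashing and equality.
sealed class ReferenceComparer : IEqualityComparer<object>
{
    bool IEqualityComparer<object>.Equals(object x, object y)
        => ReferenceEquals(x, y);

    int IEqualityComparer<object>.GetHashCode(object obj)
        => RuntimeHelpers.GetHashCode(obj);
}

// Hypothetical tracker: hands out a token per distinct object instance.
sealed class ReferenceTracker
{
    private readonly Dictionary<object, int> tokens =
        new Dictionary<object, int>(new ReferenceComparer());

    // Returns true if this exact instance was seen before; either way,
    // 'token' holds the token associated with the instance.
    public bool TryGetOrAdd(object value, out int token)
    {
        if (tokens.TryGetValue(value, out token)) return true;
        token = tokens.Count + 1; // arbitrary numbering for this sketch
        tokens.Add(value, token);
        return false;
    }
}
```

Two separately allocated strings with identical contents would receive different tokens here, whereas passing the same instance twice returns the original token on the second call.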