Sunday 7 March 2010

The last will be first and the first will be last

I’ve probably mentioned that I’m currently re-writing some code using IL emit. Very interesting work, and I’ve learned a lot about the CLI in the process.

The code I’m working on at the moment uses the decorator pattern, giving each node the chance to manipulate the value, which is usually expected to be at the top of the stack.

The problem is… imagine we want to call:

someWriter.WriteInt32(theValueOnTheStack);

To call this, we need “someWriter” before the value on the stack (i.e. push “someWriter”, push the value, callvirt). There are two options at this point:

  • Change the calling code so that we have already pushed “someWriter” earlier in the code (very problematic – it makes the stack very complex, especially considering branching etc)
  • Declare a local variable, store the value on the stack into the variable, push “someWriter”, push the value from the local variable, callvirt
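To make the second option concrete, here is a minimal sketch using DynamicMethod. It is not the actual code I’m writing: “SomeWriter”, “LocalShuffleSketch” and the argument positions (writer as arg0, value as arg1) are all made up for illustration, and the initial ldarg simply stands in for whatever earlier node left the value on the stack.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical writer type, for illustration only.
public class SomeWriter
{
    public int LastValue;
    public void WriteInt32(int value) { LastValue = value; }
}

public static class LocalShuffleSketch
{
    public static Action<SomeWriter, int> Build()
    {
        var method = new DynamicMethod("WriteViaLocal", typeof(void),
            new[] { typeof(SomeWriter), typeof(int) });
        ILGenerator il = method.GetILGenerator();

        // Stand-in for the earlier nodes: the value is now on top of the stack.
        il.Emit(OpCodes.Ldarg_1);

        // The shuffle: park the value in a local so the writer can go underneath it.
        LocalBuilder tmp = il.DeclareLocal(typeof(int));
        il.Emit(OpCodes.Stloc, tmp);   // pop the value into the local
        il.Emit(OpCodes.Ldarg_0);      // push the writer
        il.Emit(OpCodes.Ldloc, tmp);   // push the value back

        MethodInfo write = typeof(SomeWriter).GetMethod("WriteInt32", new[] { typeof(int) });
        il.Emit(OpCodes.Callvirt, write);
        il.Emit(OpCodes.Ret);

        return (Action<SomeWriter, int>)method.CreateDelegate(typeof(Action<SomeWriter, int>));
    }
}
```

Every value type that passes through this shuffle needs a local of its own type, which is where the growth in locals comes from.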

Only the second is an attractive option, but in itself this causes me problems – not least in some code where this would lead to a lot of locals being declared (even though I have some fancy local pooling to re-use like-typed variables as far as possible; the problem is when lots of different types are involved).

So how about we change the API? As it happens, “someWriter” is easily available (as arg1 or arg2). What if we made it a static method, and passed the instance in last?

SomeWriterType.WriteInt32(theValueOnTheStack, someWriter);

OK – this is a pretty freaky calling convention, but it fits our typical usage perfectly. Given that the value is already on the stack, now I just need to push “someWriter”, call (not callvirt). Obviously the ex-instance-method code needs changing – essentially the difference between:

[image: the instance method alongside its static equivalent]
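In text form, the change might look something like the sketch below. The names (“BackwardsWriter”, “BackwardsCallSketch”) and the field are hypothetical, the writer is taken from arg0 purely for brevity, and the real methods obviously do more than store a value. The first type shows the two method shapes side by side; the second shows the emit sequence shrinking to “push writer, call”.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical writer type, for illustration only.
public sealed class BackwardsWriter
{
    public int LastValue;

    // Original shape: instance method, so the instance must be pushed before the value.
    public void WriteInt32(int value) { LastValue = value; }

    // "Backwards" shape: static, value first, ex-instance last.
    public static void WriteInt32(int value, BackwardsWriter writer)
    {
        writer.LastValue = value;
    }
}

public static class BackwardsCallSketch
{
    public static Action<BackwardsWriter, int> Build()
    {
        var method = new DynamicMethod("WriteBackwards", typeof(void),
            new[] { typeof(BackwardsWriter), typeof(int) });
        ILGenerator il = method.GetILGenerator();

        // Stand-in for the earlier nodes: the value is now on top of the stack.
        il.Emit(OpCodes.Ldarg_1);

        // No local needed: push the writer on top of the value and call the static method.
        il.Emit(OpCodes.Ldarg_0);
        MethodInfo write = typeof(BackwardsWriter).GetMethod("WriteInt32",
            new[] { typeof(int), typeof(BackwardsWriter) });
        il.Emit(OpCodes.Call, write);
        il.Emit(OpCodes.Ret);

        return (Action<BackwardsWriter, int>)method.CreateDelegate(typeof(Action<BackwardsWriter, int>));
    }
}
```

One thing to watch: unlike callvirt on an instance method, a plain call to the static version does no null check on the writer by itself, so a null only shows up when the method body first touches it.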

Note that it is important that I am not using “virtual” methods in all this: a virtual call needs the instance pushed before the arguments, which would demand the original usage.

But does it cause a performance problem? (note: yes I know this is all micro-optimisation, but I have a genuine reason for trying to minimise the stack size, which is the real purpose behind this).

So I fired up a typical test-rig that calls the methods an insane number of times:

[image: the test rig]
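The rig itself is nothing clever. A rough sketch of that kind of loop, with hypothetical names (“TimedWriter”, “CallTimingSketch”) and a running sum so the calls can’t be optimised away, would be something like this; it is not the exact code I ran, and it makes no claim about the numbers you’ll see.

```csharp
using System;
using System.Diagnostics;

// Hypothetical writer type, for illustration only.
public sealed class TimedWriter
{
    public long Sum;

    // Instance shape.
    public void WriteInt32(int value) { Sum += value; }

    // "Backwards" static shape: value first, ex-instance last.
    public static void WriteInt32(int value, TimedWriter writer) { writer.Sum += value; }
}

public static class CallTimingSketch
{
    // Times 'iterations' calls to the instance method.
    public static TimeSpan TimeInstance(TimedWriter writer, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            writer.WriteInt32(i);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Times 'iterations' calls to the static method.
    public static TimeSpan TimeStatic(TimedWriter writer, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            TimedWriter.WriteInt32(i, writer);
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Run each a few times in a release build (the first pass includes JIT time) and compare the elapsed values.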

The results aren’t necessarily surprising – simply confirming that this doesn’t have any negative impact on performance. It actually makes things slightly faster, but all I really needed to know is whether it would end up being 100 times slower or not. It isn’t. Not having the locals is my main aim.

So it looks like I’m going to end up with backwards methods. Which is fine – only my code will see it (it isn’t part of the user-facing API, although it does have to be public to allow for calling from other assemblies).

In this case, I can live with an odd calling convention ;-p