Friday, 28 March 2014

Windows Redis-64

As you know, I’m a huge fan of Redis. This week, a team at the Microsoft Open Tech group released a new drop of redis-64 for Windows (note: all links there are old – GitHub is fresher), with binaries available on both NuGet and Chocolatey. It is even marked by them as “production ready”. The current drop is 2.8.4 in Redis terms (the last public drop was 2.6.12).

So congratulations, thanks and kudos to the team involved: an important milestone.

In production, at Stack Exchange we use CentOS to host Redis, so I’m simply not in a position to comment on how well it works in production; but for Windows developers using Redis, it is really handy to have convenient access to a Redis server without having to spin up a VM.

One really important thing to watch

From reading the "Redis Release Notes.docx" and "redis.windows.conf" files in the packages/Redis-64.2.8.4 folder, the approach they used to solve the fork problem was to introduce a shared, memory-mapped file, which defaults to the size of your physical memory (and that is before it has forked, when it can grow due to copy-on-write). So whether you are using it as a production server or a dev tool, you might want to pay particular attention to the “maxheap” setting in your .conf file. For my local dev work I usually spin up 5 servers, and I currently have 24GB of physical memory in my machine – so by default, when running 5 completely empty databases, it instantly chewed up 120GB of hard disk space (you get it back when the server exits). If you don’t have enough disk space: the server won’t start. You'll probably see:

QForkMasterInit: system error caught. error code=0x000005af, message=VirtualAllocEx failed.

Fortunately it is simple to configure a memory bound:

maxheap 4gb # or whatever

Please note – this really isn’t a criticism; I understand the “why” – they need the shared memory-mapped file for the pseudo-fork, and they need to allocate the “maxheap” amount from the outset because IIRC you can’t grow such a map very flexibly. My intent is merely to flag it in flashing colours that if you’re playing with Redis-64, you want to think about your memory vs disk.
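
For instance (values purely illustrative), each local dev instance can get its own port and a modest cap in its .conf file, and then be launched with that file:

# redis-6380.windows.conf – one of several local dev instances
port 6380
maxheap 1gb

redis-server.exe redis-6380.windows.conf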

Now go, play, have some Redis fun.

Tuesday, 18 March 2014

So I went and wrote another Redis client…

…aka: Introducing StackExchange.Redis (nuget | github)

The observant out there are probably thinking “Wut?” about now, after all didn’t I write this one? Yes, yes I did. This one too, although that was just a mechanism to illustrate the C# dynamic API. So you would be perfectly justified in thinking that I had finally lost the plot and meandered into crazy-town. Actually, you’d have been reasonably justified in thinking that already. So let me take a step back and explain….

Why? Just… why?

Firstly, BookSleeve has served us well: it has got the job done with ever-increasing load through the network. If it is possible to salute code as a trusted friend, then BookSleeve: I salute you. But: some things had to change. There were a number of problems and design decisions that were conspiring against me, including:

  • Robustness: BookSleeve acts as a wrapper around a single connection to a single endpoint; while it is up: great! But occasionally sockets die, and it was incredibly hard to do anything useful, either to investigate or to recover. In our internal code we had a whole second library that existed mainly to hide this wrinkle (and for things like emergency slave fallback), but even then it wasn’t perfect, and we were getting increasing problems with network issues causing downstream impact.
  • Ability to identify performance issues: BookSleeve has only minimal instrumentation – it wasn’t a first class feature, and it showed. Again; fine when everything works great, but if you hit any load issues, there was virtually nothing you could do to resolve them.
  • Single-server: in common with “Robustness” above, the single-endpoint nature of BookSleeve means that it isn’t in a great position if you want to talk to multiple nodes: to fully exploit the available master and slave nodes, or (thinking ahead) in consideration of “redis cluster” (currently in beta) – and simply wrapping multiple RedisConnection instances doesn’t play nicely in terms of efficient network API usage
  • The socket query API: again, partly tied up with the multi-server concerns above, but also tied up with things like the thread pool and IO pool (the issues described there apply equally to the async network read API, not just the thread-pool)

And those are just the immediate problems.

There were some other longstanding glitches in the API that deserved some love, but didn’t justify huge work by themselves – things like the fact that only string keys were supported (some folks make use of binary keys), and things like constantly having to specify the database number (rather than abstracting over that). There was also the necessity of involving the TPL when you didn’t actually want to be truly “async” (forcing you into sync-over-async, an ugly anti-pattern), and the fact that the names didn’t follow the correct async convention (although I can’t honestly remember whether the current conventions existed at the time).
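
For anyone unfamiliar with the term, “sync-over-async” means blocking a thread on a Task because the API offers no truly synchronous call – a minimal sketch (the names here are hypothetical, not BookSleeve’s actual API):

using System;
using System.Threading.Tasks;

static class SyncOverAsyncDemo
{
    // stand-in for any Task-only API surface
    static Task<string> GetAsync(string key)
    {
        return Task.FromResult("value-for-" + key);
    }

    static void Main()
    {
        // sync-over-async: a thread sits blocked on .Result until the
        // task completes - wasteful, and a deadlock risk in some contexts
        string value = GetAsync("some-key").Result;
        Console.WriteLine(value);
    }
}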

Looking ahead, “redis cluster” as mentioned above introduced a range of concerns; it probably would have been possible to wrap multiple connections in a super connection, but the existing implementations didn’t really make that feasible without significant work. Also, when looking at “redis cluster”, it is critically important that you know at every point whether a particular value represents a key versus a value – keys impact how commands are targeted for sharding (and impact whether a multi-key operation is even possible); values do not – and the distinction between them had been lost. Recovering it would basically need an operation-by-operation review of the entire codebase.

In short, to address any of:

  • our immediate internal needs
  • the community needs that weren’t internal priorities
  • any future “redis cluster” needs
  • providing a generally usable platform going forward

was going to require significant work, and would have by necessity involved significant breaking API changes. If I had reworked the existing code, not only would it have shattered the old API, but it would have meant compromise both for users of the old code and users of the new. And that simply wasn’t acceptable. So: I drew a line, and went for a clean break.

So what does this mean for users?

Firstly, if you are using BookSleeve, feel free to continue using it; I won’t delete the files or anything silly like that – but: my main development focus going forward is going to be in StackExchange.Redis, not BookSleeve. The API is basically similar – the work to migrate between them is not huge, but first – why would you? How about:

  • Full multi-server connection abstraction including automatic reconnect and fallback (so read operations continue on a slave if the master is unavailable), and automatic pub/sub resubscription
  • Ability to express preferences to target work at slaves, or demand a certain operation happens on a slave, etc – trivially
  • Full support for “redis cluster”
  • Completely reworked network layer, designed to avoid stalls due to worker starvation while efficiently scaling to multiple connections – also reducing overheads by moving steps like text encode/decode to the caller/consumer rather than the network layer
  • Full support for both binary and string keys, while reducing (not increasing) the methods necessary (you no longer need to tell it which you want in advance)
  • Observes TPL guidance: no more sync-over-async (there is a full sync API that does not involve the TPL), and the TPL/async methods are now named appropriately
  • Instrumentation as a design feature
  • And lots more…
  • … heck, when I get a moment I might also throw our 2-tier cache (local in-memory cache with a shared redis central cache, including pub/sub-based cache expiry notification etc) down into the client library too

Is it ready?

Let’s consider it a “late beta” (edit: fully released now); on the Q&A sites we have now replaced all of our BookSleeve code with StackExchange.Redis code, which means that hopefully we’ve already stubbed our toes on all the big bugs. I don’t plan on any breaking changes to the API (and will try hard to avoid them). Lua script support is not yet implemented (edit: it is now), and “redis cluster” isn’t yet released, so support for it is still a work in progress – but basically: it works, and works fine.

A full introduction and example basic usage is shown on the project site; please do take a look, and let me know if I’ve moved too much cheese!
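
In the meantime, here’s a minimal sketch of basic usage (connection string, key and channel names all illustrative):

using System;
using StackExchange.Redis;

static class Demo
{
    static void Main()
    {
        // one multiplexer is created once and shared for the life of the app
        ConnectionMultiplexer muxer = ConnectionMultiplexer.Connect("localhost");

        // plain synchronous usage - no TPL involved
        IDatabase db = muxer.GetDatabase();
        db.StringSet("greeting", "hello world");
        string value = db.StringGet("greeting");
        Console.WriteLine(value);

        // pub/sub through the same multiplexer
        ISubscriber sub = muxer.GetSubscriber();
        sub.Subscribe("some-channel", (channel, message) => Console.WriteLine((string)message));
        sub.Publish("some-channel", "ping");
    }
}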

Thursday, 13 March 2014

Beware jamming the thread pool

 

I’ve been struggling over the last few days to diagnose and eradicate a fun little bug in the cache tier of Stack Overflow / Stack Exchange; for context, we use redis extensively as a shared cache (and for some other things – including the realtime updates via web-sockets, which rely heavily on redis pub/sub) – and have this week deployed a new implementation of our redis communications stack. Brain-dead bugs aside (and I really did manage a howler, for which I apologise: sorry), we got it in and stable: it would be working just fine, processing a few million messages without issue, and then out of the blue… WHAM! 10 thousand timeouts in a second, and when you immediately go to look, everything is happy again, merrily churning through load as though you had imagined things. Local load testing failed to reproduce this issue.

As always, I got bitten by the rule: interesting problems only happen at scale.

ONE DOES NOT SIMPLY ... DO ANYTHING AT STACKEXCHANGE SCALE

By which, I don’t mean to imply that we’re the biggest site out there, or that we’re doing anything especially clever (quite the opposite in my case, ahem), but we take great pride in the fact that we run a very busy site on very small numbers of servers. We accidentally found out on Tuesday (again, my bad) that we can still run the entire Stack Exchange Network on two web-servers. Not quite as well as normal, but it worked. edit - proof (courtesy of @Nick_Craver):

The Stack Exchange Network running on two servers

But unless you have a very expensive test lab and dedicated load-test team, it is really quite hard to simulate realistic load for the network.

Enough rambling; what went wrong?

I got hit by the thread-pool. You see, like BookSleeve (which I will be talking about, probably next blog), the new client is an async-enabled multiplexer. As outbound messages for an endpoint come in, we dispatch them to a queue (you need to serialize work on a socket, and know which requests marry to which responses), and if we don’t already have a writer for that queue, we request one. The problem here was: I requested it from the thread-pool. Now, the thread-pool is very, very clever – it has lots of smarts in there to handle automatic growth and shrinking, etc. It works perfectly well under sane conditions. But I asked too much of it: asp.net will already be chewing through workers for requests, and we don’t currently use much async code – our requests are processed pretty much synchronously. This means that during a storm (and only then) we were essentially getting into a deadlock condition:

  • asp.net requests were using lots of workers
  • which were blocked waiting on a response from redis
  • which hadn’t been sent the request yet, because no thread-pool thread could be allocated to write the queue

Fortunately, this is self-curing; eventually either:

  • the blocking requests will timeout (we always specify timeouts, and pretty short ones), allowing the requests to complete and making more writers available
  • the thread-pool will grow and allocate a writer (although to be honest, when competing with constant inbound asp.net requests, it is dubious whether the writer would get there fast enough)

That is why, when you looked at the system a second after the trouble, it was all healthy again – the timeouts had happened, releasing enough workers to the pool to service the queue.

Sigh; all fun.

The moral of the story

See, I do have morals! Don’t use the thread-pool for time-critical operations if there’s a good chance you’ll already be stressing the thread-pool. The fix was pretty simple once we understood the problem: we now retain a dedicated writer thread per multiplexer (note: not per socket). We do reserve the right to seek help from the thread-pool if there is a backlog, but frankly that is pretty rare – the dedicated writer is usually more than able to keep up (in local testing, I’ve seen the multiplexer servicing around 400k ops/s – far above what most people need).
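
To make that concrete, here’s a minimal sketch of the shape of the fix (not the actual StackExchange.Redis code – the type and member names are illustrative):

using System;
using System.Collections.Concurrent;
using System.Threading;

sealed class MultiplexerWriter : IDisposable
{
    private readonly BlockingCollection<Action> writeQueue =
        new BlockingCollection<Action>();
    private readonly Thread writer;

    public MultiplexerWriter()
    {
        writer = new Thread(WriteLoop);
        writer.IsBackground = true; // don't keep the process alive
        writer.Name = "RedisWriter";
        writer.Start();
    }

    // callers enqueue write work; they never need a pool thread for it
    public void Enqueue(Action writeWork)
    {
        writeQueue.Add(writeWork);
    }

    private void WriteLoop()
    {
        // all writes are serialized onto this one long-lived thread, so a
        // saturated thread-pool can no longer starve the writer
        foreach (var work in writeQueue.GetConsumingEnumerable())
        {
            work();
        }
    }

    public void Dispose()
    {
        writeQueue.CompleteAdding(); // lets WriteLoop drain and exit
    }
}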

Next time: announcing a new redis client for .NET! (also: reasons why, and what about BookSleeve?)

Saturday, 8 March 2014

Cleaner code: using partial methods for debugging utilities

Often in a complex code base, there are additional bits of work you need to do to help with debugging. The easy but hard-to-maintain way to do this is with #if:

obj.Wow();
#if DEBUG
// only want the overhead of checking this when debugging...
Very("abc");
#endif
Such();

This gets the job done, but can lead to a whole heap of problems:

  • general ugliness – especially when you have lots of different #if pieces all over the file
  • hard to track what uses each method (many code tools won’t look inside inactive conditional compilation blocks)
  • problems with automated tools, such as using directive removal or code organization tools, which get very confused

We can do better!

Partial classes were added to C# a long time ago – with one of their main aims being to help with extension points for generated code. You can then split a logical class into multiple files, as long as you use the partial modifier in each file; the contents get merged by the compiler. So what about partial methods? These are like regular methods, but with the odd property that they completely evaporate if you aren’t doing anything useful; as in: the compiler completely ignores all mention of them. You declare a partial method like so:

partial void OnSomethingAwesome(int someArg);

(noting that they follow the same rules as [Conditional(…)] methods: you cannot specify a return value or an access modifier, and the parameters cannot be declared out (although they can be ref) – again, this is because the compiler might be removing all trace of them, and we can’t do anything that would upset “definite assignment”). Then elsewhere in the code you can use that method like normal:

Write("tralala");
OnSomethingAwesome(123);
return true;

That looks convincing, right? Except: until we write the body of OnSomethingAwesome, it does not exist. At all. For example, in another file somewhere we could add:

partial class TheSameClass // and the same namespace!
{
    partial void OnSomethingAwesome(int number)
    {
        Trace(number);
    }
}

And only now does our code do anything useful. It is also important to note that, just like [Conditional(…)] methods, the evaluation of the arguments and target is also removed, so you need to be a little careful… there is a huge difference between:

OnSomethingAwesome(SomethingCritical());

and

var value = SomethingCritical();
OnSomethingAwesome(value);

In reality, this is rarely an actual issue.

Applying this to debugging operations

Hopefully, it should be obvious that we can use this to move all of our debugging operations out of the main code file – perhaps into a TheSameClass.debugging.cs file (or whatever you want) – which can then legitimately have a single #if conditional region. So to take our earlier example:

obj.Wow();
OnVery("abc");
Such();

with (elsewhere):

partial void OnVery(string much);
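
And the debug-only implementation can then live in that separate file – a minimal sketch (assuming some Trace helper is in scope):

#if DEBUG
partial class TheSameClass
{
    partial void OnVery(string much)
    {
        Trace(much); // debug-only work; in release builds the call sites vanish
    }
}
#endif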

How about interfaces?

EDIT: It turns out I was completely wrong here; partial interface works fine – my mistake. I'm leaving this here as a mark of my public shame, but: ignore me. You can use interfaces much like the above.

There is no such thing as a partial interface, but what you can do is declare a separate debug-only interface, and then extend the type in a partial class:

partial class TheSameClass : IMagicDebugger
{
    void IMagicDebugger.So() { /* ... */ }
}

Real world example

Here’s something from some real code, where during debugging and testing I need to keep additional counters that are not needed in the production code:

#if DEBUG
partial class ResultBox
{
    internal static long allocations;

    public static long GetAllocationCount()
    {
        return Interlocked.Read(ref allocations);
    }

    static partial void OnAllocated()
    {
        Interlocked.Increment(ref allocations);
    }
}
#endif

The intent should be pretty obvious from the context, but note that here everything to do with this debug-only feature is now neatly packaged together.

Enjoy.

Monday, 3 March 2014

Be strict with me, dear CLI

There is an old adage in code:

Be liberal in what you accept, and conservative in what you send

And right now, this adage can [obscenity]. The week before last, my main gripe was with broken browser implementations (not behind proxy servers) that clearly didn’t read RFC 6455 (aka WebSocket) – or at least, they read it enough to know to send a Sec-WebSocket-Key, but not enough to understand client-to-server masking, or even simply to add a Connection or Upgrade header. But that’s not what I’m moaning about today; last week, my gripe was .NET. Specifically, IL emit.

(wibbly-wobbly screen-fade going to backstory)

A good number of the tools I work on involve metaprogramming, some of them to quite a scary degree. Which means lots of keeping track of the state of the stack as you bounce around the IL you are writing. And to be fair, yes: it is my responsibility to make sure that the IL I emit is meaningful… I just wish that the CLI was more consistent in terms of what it will allow.

You see, there’s an interesting rule in the CLI specification:

In particular, if that single-pass analysis arrives at an instruction, call it location X, that immediately follows an unconditional branch, and where X is not the target of an earlier branch instruction, then the state of the evaluation stack at X, clearly, cannot be derived from existing information. In this case, the CLI demands that the evaluation stack at X be empty.

The scenario this describes is actually very common – for example in a while loop (which is also the core of foreach), the most common way of setting those out is:

  • (preceding code)
  • unconditional-branch: {condition test}
  • label: {the actual code}
  • (the actual code)
  • label: {condition test}
  • (condition test)
  • branch-if-true: {the actual code}
  • (subsequent code)

The important point is that “{the actual code}” meets this special case; it is precisely the X mentioned in the specification: it immediately follows an unconditional branch, and is not the target of an earlier branch instruction (it is, however, the target of a later branch instruction). This means that to be valid, the stack (relative to that method) must be empty.
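
To make that layout concrete, here’s a minimal System.Reflection.Emit sketch (the method and label names are mine) that emits a simple “sum 1..n” loop in exactly that shape – the “body” label is the X from the specification:

using System;
using System.Reflection.Emit;

static class LoopEmitDemo
{
    // emits: int Sum(int n) { int total = 0; while (n > 0) { total += n; n--; } return total; }
    public static Func<int, int> BuildSum()
    {
        var dm = new DynamicMethod("Sum", typeof(int), new[] { typeof(int) });
        var il = dm.GetILGenerator();
        var total = il.DeclareLocal(typeof(int));
        var body = il.DefineLabel();
        var test = il.DefineLabel();

        il.Emit(OpCodes.Ldc_I4_0);            // (preceding code)
        il.Emit(OpCodes.Stloc, total);
        il.Emit(OpCodes.Br, test);            // unconditional-branch: {condition test}

        il.MarkLabel(body);                   // label: {the actual code} - this is X
        il.Emit(OpCodes.Ldloc, total);        // (the actual code: total += n; n--)
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Stloc, total);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Sub);
        il.Emit(OpCodes.Starg_S, (byte)0);

        il.MarkLabel(test);                   // label: {condition test}
        il.Emit(OpCodes.Ldarg_0);             // (condition test: n > 0)
        il.Emit(OpCodes.Ldc_I4_0);
        il.Emit(OpCodes.Cgt);
        il.Emit(OpCodes.Brtrue, body);        // branch-if-true: {the actual code}

        il.Emit(OpCodes.Ldloc, total);        // (subsequent code)
        il.Emit(OpCodes.Ret);

        return (Func<int, int>)dm.CreateDelegate(typeof(Func<int, int>));
    }
}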

This would actually be easy enough to smoke-test… just set up some simple IL that forces this condition, press the button, and wait for the CLI to complain about it. Here’s the C# for that, on pastie.org. The only problem is that it runs without complaining. Well, maybe DynamicMethod is a special case… we’ll try a full assembly instead. And again: it works without the slightest complaint. To get it to notice, we need to write it to disk (assembly.Save("Broken.dll");) and then run peverify Broken.dll, which finally gives us the complaint we wanted:

[IL]: Error: [Broken.dll : BrokenType::BrokenMethod][offset 0x00000002] Stack height at all points must be determinable in a single forward scan of IL.
1 Error(s) Verifying Broken.dll
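
For reference, the kind of IL that trips this rule can be forced very simply – this is a hypothetical reconstruction (the original sample is on the pastie.org link above), using DynamicMethod:

using System;
using System.Reflection.Emit;

static class BrokenIlDemo
{
    static void Main()
    {
        var dm = new DynamicMethod("Broken", typeof(int), Type.EmptyTypes);
        var il = dm.GetILGenerator();
        var x = il.DefineLabel();
        var after = il.DefineLabel();

        il.Emit(OpCodes.Ldc_I4_1);    // stack: [1]
        il.Emit(OpCodes.Br, after);   // unconditional branch, carrying [1]
        il.MarkLabel(x);              // X: follows an unconditional branch, and
                                      // no *earlier* instruction branches here...
        il.Emit(OpCodes.Ret);         // ...so a single forward scan says the
                                      // stack is empty, making this ret invalid
        il.MarkLabel(after);
        il.Emit(OpCodes.Br, x);       // only this *later* branch targets X

        var f = (Func<int>)dm.CreateDelegate(typeof(Func<int>));
        Console.WriteLine(f());       // in-memory: runs happily, prints 1
    }
}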

You might think I’m being fussy… I mean, if the CLI runs it anyway then what is the problem? A fair question, but the truth is more complicated. When the CLI is loading an assembly from disk it is often more strict. This depends on a lot of things, including the framework version, the framework platform, and the trust settings.

Oh for a simple “strict mode” for running in-memory dynamic assemblies: that would make it so much easier for us long-suffering metaprogrammers. And I’m not alone here: a quick google search shows this issue has bitten mono, ikvm, mono-cecil, and many others.

A silver lining…

The good news is that the excellent Sigil utility (by my colleague Kevin Montrose) now has support for this, via the strictBranchVerification flag. It also makes the IL emit a lot easier to grok – here’s the same example via Sigil. Sadly, I can’t use it in my particular scenario, unless I create a custom Sigil build that uses IKVM for cross-platform emit, but for most people this should help a lot.
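
For comparison, the earlier loop might look something like this in Sigil – a sketch, assuming strictBranchVerification is exposed as an optional parameter on NewDynamicMethod (check the Sigil docs for the exact signature):

using System;
using Sigil;

static class SigilLoopDemo
{
    public static Func<int, int> BuildSum()
    {
        var e = Emit<Func<int, int>>.NewDynamicMethod("Sum", strictBranchVerification: true);
        var total = e.DeclareLocal<int>("total");
        var body = e.DefineLabel("body");
        var test = e.DefineLabel("test");

        e.LoadConstant(0);
        e.StoreLocal(total);
        e.Branch(test);               // Sigil tracks stack state across branches

        e.MarkLabel(body);            // total += n; n--
        e.LoadLocal(total);
        e.LoadArgument(0);
        e.Add();
        e.StoreLocal(total);
        e.LoadArgument(0);
        e.LoadConstant(1);
        e.Subtract();
        e.StoreArgument(0);

        e.MarkLabel(test);            // n > 0
        e.LoadArgument(0);
        e.LoadConstant(0);
        e.CompareGreaterThan();
        e.BranchIfTrue(body);

        e.LoadLocal(total);
        e.Return();

        // Sigil throws a descriptive exception here if the IL is invalid
        return e.CreateDelegate();
    }
}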