Wednesday, 4 March 2020

Why do I rag on BinaryFormatter?

tl;dr: seriously, stop using BinaryFormatter

The other evening, in the context of protobuf-net.Grpc, someone asked me whether it was possible to use BinaryFormatter as the marshaller. This isn't an unreasonable question, especially as protobuf-net.Grpc is designed to allow you to swap out the marshaller (gRPC is usually used with protobuf, but it isn't restricted to that as long as both ends understand what they're dealing with).

This made me realise that while I've spent over a decade telling people variants of "don't use BinaryFormatter", I don't think I've ever collated the reasons in one place. I suspect that many people think I'm being self-serving by saying this - after all it is so easy to use BinaryFormatter, and I'm not exactly a disinterested observer when it comes to serialization tools.

So! I thought I'd take this opportunity to put together my thoughts and reasons in one place, while also providing a "custom marshaller" example for protobuf-net.Grpc. Because "reasons", I've done this as comments in the example, but I present them below. There are four sections, but if you aren't sold by the time you've finished the first ("Security") section, then frankly: I give up. Everything beyond that first section is just decoration!

So; if you're still using BinaryFormatter, I implore you: please just stop.

And without further embellishment, I present my thesis. If I missed anything, please let me know and we can add more. But again, no more should be needed.

Tuesday, 7 January 2020

.NET Core, .NET 5; the exodus of .NET Framework?

tl,dr; opinion: ongoing .NET Framework support for F/OSS libraries may quickly start evaporating, and this should be a consideration in migration planning.


First, a clarification of terms, because they matter:

  • .NET Framework - the original .NET, the one that ships on Windows and only on Windows; the current (and probably final) version of .NET Framework is 4.8
  • .NET Core - the evolution of .NET, that is not tied to the OS as much, with slightly different feature sets, and where most of the Microsoft .NET effort has been for the last few years; .NET Core 3.1 shipped recently
  • .NET Standard - an API definition (not implementation - akin to an interface) that allows a library to target a range of platforms in a single build, i.e. by targeting .NET Standard 2.0 a library can in theory run equivalently on .NET Core 3 and .NET Framework 4.6.2 (ish...) and others (Mono, Unity, etc), without needing to target each individually
  • .NET 5 - the next version of .NET Core; the naming deliberately emphasizes that there isn't a two-pronged development future consisting of "Framework" and "Core", but just one - this one - which isn't "Core" in the "minimal" sense, but is in fact now a very rich and powerful runtime; .NET 4 was avoided to prevent versioning confusion between .NET 4.* and .NET Framework 4.* (and again, to emphasize that this is the future direction of .NET, including if you are currently on .NET Framework)

The first thing we must be clear about, in case it isn't 100% clear from the above, is that .NET Framework is legacy completed. There isn't going to be a .NET Framework 4.9 or a .NET Framework 5. There might be some critical security fixes, but there aren't going to be feature additions, unless those additions come from out-of-band NuGet (etc) packages that just happened to work on .NET Framework on the first (or maybe second) try.

I commented on Twitter yesterday about my perceptions on the status of this, and how we (the .NET community) should look at the landscape; it goes without saying that I'm merely opining here - I'm not a spokesperson for Microsoft, but I am a library author and consumer, and I work extensively in the .NET space. Other views and conclusions are possible! But: I wanted to take the time to write up a more long-form version of what I see, with space to give reasons and discuss consequences.

What I said yesterday

The short version is: I expect that 2020 will see a lot of library authors giving serious consideration as to whether to continue shipping .NET Framework support on new library versions. There are lots of reasons for this, including:

  • increasing feature gaps making it increasingly expensive to support multiple frameworks, either via compatibility shims or framework-dependent feature sets
  • as more and more library authors complete their own migrations to .NET Core, the effort required to support a framework that they aren't using increases:
    • bugs don't get spotted until they've shipped to consumers
    • a lot of knowledge of "the old framework" needs to be retained and kept in mind - a particular issue with new contributors who might never have used that framework (and yes, there are some huge gotchas)
    • there are often two (or more) code implementations to support
    • builds are more complicated than necessary (requiring either Windows or the build-pack), and take longer
    • tests take longer and require Windows
    • packages and dependency trees are larger than necessary
  • not all new language features are equal citizens on down-level frameworks
    • some features, such as default interface methods, will not work on down-level frameworks
    • some important features like C# 8 nullability are in a weird middle ground where some bits kinda work sometimes most of the time except when it doesn't
    • some, like IAsyncEnumerable<T> may have compatibility shims, but that only allows minimal support on library surfaces, since of course many framework level pieces to produce or consume such will be missing
  • some APIs are fundamentally brittle on .NET Framework, especially when multi-targeting, with the breaks happening only at run-time (they are not obvious at build, and may not be obvious until a very specific code-path is hit, which might be a long time after initial deployment); a lot of this comes does to the assembly loader and assembly-binding-redirects (a problem that simply does not exist in .NET Core / .NET 5)
    • if you want to see a library author cry, mention System.ValueTuple, System.Numerics.Vectors, or System.Runtime.CompilerServices.Unsafe. Why? Because they are deployment nightmares if you are targeting multiple platforms, because .NET Framework makes a complete pig's ear of them; you can just about fix it up with assembly-binding-redirects some of the time, but the tooling will not and can not do this for you, which is pure pain for a library author
    • recall that .NET Framework is "complete"; the loader isn't going to be fixed (also, nobody wants to touch it); alternatively, it could be said that the loader has already been fixed; the fix is called .NET Core / .NET 5
  • a lot of recent performance-focused APIs are not available on .NET Framework, or perform very differently (which is almost the worst possible outcome for performance-focused APIs!); for example:
    • concurrency: a lot of async APIs designed for highly concurrent systems (servers, in particular) will be simply missing on .NET Framework, or may be implemented via async-over-sync / sync-over-async, which significantly changes the characteristics
    • allocations: ther are a lot of new APIs designed to avoid allocations, typically in library code related to IO, data-processing etc - things like Span<T>; the APIs to interact with the framework with these directly with these won't exist on .NET Framework, forcing dual code paths, but even when they do, .NET Framework uses a different (and less optimal) Span<T> implementation, and the JIT lacks the knowledge to make Span<T> be magical; you can hack over some of the API gaps using pointer-based APIs when they exist, but then you might be tempted to use Unsafe.*, which as already mentioned: wants to kill you
    • processing: one of the most powerful new toolkits in .NET for CPU-focused work is access to SIMD and CPU intrinsics; both of these work especially well when mixed with spans, due to the ability to coerce between spans and vectors - but we just saw how Span<T> is problematic; full CPU intrinsics are only available on .NET Core / .NET 5, but you can still get a lot done by using Vector<T> which allows SIMD on .NET Framework... except I already mentioned that System.Numerics.Vectors is one of the trifecta of doom - so yes, you can use it, but: brace yourself.
    • now consider that a lot of libraries - including Microsoft libraries on NuGet, and F/OSS libraries - are starting to make more and more use of these features for performance, and you start to see how brittle things get, and it often won't be the library author that sees the problem.
  • as .NET Core / .NET 5 expand our ability to reach more OSes, we already have enough permutations of configurations to worry about.
  • often, the issues here may not be just down to a library, but may be due to interactions of multiple libraries (or indeed, conflicting dependencies of multiple libraries), so the issues may be unique to specific deployments.

How about just offering patch support, not feature support?

So the theory here is that we can throw our hands in the air, and declare "no new features in the .NET Framework version - but we'll bugfix". This sounds great to the consumer, but... it isn't really very enticing to the maintainer. In reality, this means branching at some point, and now ... what happens? We still retain all of the build, test, deploy problems (although now we might need completely different build/CI tools for each), but now we have two versions of the code that are drifting apart; we need to keep all the old things in our mind for support, and when we bugfix the current code, we might also need to backport that bug into a branch that uses very different code, and test that. On a platform that the library maintainers aren't using.

F/OSS isn't free; it is paid for by the maintainers. When proposing something like the above, we need to be very clear about whose time we are committing, and why we feel entitled to commit it. Fundamentally, I don't think that option scales very well. At some point, I think it becomes increasingly necessary to think of .NET Framework in the same way that we have thought of .NET 1.* for a very long time - it is interesting to know that it exists, but the longer you stay stuck on that island, the harder life is going to become for you.

In particular, to spell it out explicitly; I expect a number of libraries will start rebasing to .NET Standard 2.1 and .NET Core 3.0 or 3.1 as their minimum versions, carving off .NET Framework. The choice of .NET Standard 2.1 here isn't necessarily "because we want to use APIs only available in 2.1", but is instead: "because we actively don't want .NET Framework trying to run this, and .NET Framework thinks, often mistakenly, that it works with .NET Standard 2.0" (again, emphasis here is that .NET Framework 4.6.2 only sort of implements .NET Standard 2.0, and even when it does, it drags in a large dependency graph; this is partly resolved if you also target .NET Framework 4.7.2, but your list of TFMs is now growing even further).

So what happens to .NET Framework folks?

I totally get that a lot of people will be stuck on .NET Framework for the foreseeable future. Hell, a lot of our code at Stack Overflow is still .NET Framework (we're working through migration). I completely understand and empathize with all the familiar topics of service lifetimes, SLAs, budgets, clients, contracts/legals, and all of those things.

Just like nobody is coming to take .NET Framework off your machine, nobody is coming to take F/OSS libraries either. What I'm saying is that a time may come - and it is getting closer on the horizon - when you just won't get updates. The library you have today will continue working, and will still be on NuGet, but there won't be feature updates, and very few (for the reasons above) bug fixes.

I know I've spoken about open source funding before, but: at some point, if your business genuinely needs additional support on .NET Framework where it is going to create significant extra work (see: everything above) for the maintainers, perhaps at some point this is simply a supply-chain issue, and one solution is to sponsor that work and the ongoing support. Another option may be to fork the project yourself at the point where you're stuck, and maintain all the changes there, perhaps even supporting the other folks using that level. If you're thinking "but that sounds like a lot of effort": congratulations, you're right - it is! That's why it isn't already being done. All such work is zero sum; time spent on the additional work needed to support .NET Framework is time not being spent actually developing the library for what the maintainer wants and needs, and: it is their time being spent.

Conclusion

A lot of what I've discussed here is opinion; I can't say for sure how it will play out, but I think it is a very real (and IMO likely) possibility. As such, I think it is just one facet of the matrix you should be considering in terms of "should we, or when should we, look to migrate to .NET Core / .NET 5"; key point: .NET Core 3.1 is a LTS release, so frankly, there's absolutely no better time than now. Is migrating work? Yes, it is. But staying put also presents challenges, and I do not believe that .NET Framework consumers can reasonably expect the status-quo of F/OSS support (for .NET Framework) to continue.

(the Twitter thread)