Preamble - not a part 2
A little while ago I blogged here and I set it up to be a "continues..." style post. I haven't had the energy to continue it in that context, and this fact was putting me off concluding the post. I then realised: the thing that matters isn't some overarching narrative structure, but that I get my ideas down. So: I'm aborting any attempt at making this post a continuation, and just focusing on the content!
Prefer ValueTask[<T>] to Task[<T>], always.
There's been a lot of confusion over when to use Task[<T>] vs ValueTask[<T>] (note: I'm going to drop the [<T>] from now on; just pretend they're there when you see Task / ValueTask etc).
Context: what are Task and ValueTask?
In case you don't know, Task and ValueTask are the two primary implementations of "awaitable" types in .NET; "awaitable" here means that there is a duck-typed signature that allows the compiler to turn this:
int i = await obj.SomeMethodAsync();
into something like this:
var awaiter = obj.SomeMethodAsync().GetAwaiter();
if (!awaiter.IsCompleted)
{
    // voodoo here that schedules a
    // continuation that resumes here
    // once the result becomes available
}
int i = awaiter.GetResult();
Task is the original and most well known API, since it shipped with the TPL, but it means that an object allocation is necessary even for scenarios where it turns out that the result was already available, i.e. awaiter.IsCompleted returned true. The ValueTask value-type (struct) acts as a hybrid that can represent either an already completed result without allocating, or an incomplete pending operation. You can implement your own custom awaitables, but it isn't common.
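To make the "hybrid" idea concrete, here's a minimal sketch. Everything in it (CachedLookup, GetLengthAsync, LoadAsync, the Task.Delay standing in for real I/O) is a made-up illustration rather than anything from a real library, and it isn't thread-safe:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical example type: a cache whose hits complete synchronously.
public sealed class CachedLookup
{
    private readonly Dictionary<string, int> _cache = new Dictionary<string, int>();

    public ValueTask<int> GetLengthAsync(string key)
    {
        if (_cache.TryGetValue(key, out int cached))
        {
            // Result already known: wrap the value directly, no Task allocated.
            return new ValueTask<int>(cached);
        }

        // Result not known yet: here we fall back to a Task-backed ValueTask.
        return new ValueTask<int>(LoadAsync(key));
    }

    private async Task<int> LoadAsync(string key)
    {
        await Task.Delay(10); // assumed stand-in for genuinely asynchronous work
        int value = key.Length;
        _cache[key] = value;
        return value;
    }
}
```

The first call for a given key goes down the asynchronous path; later calls for the same key hand back an already completed ValueTask.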
When to choose each, the incorrect version
If you'd asked me a while back about when to choose each, I might have incorrectly said something like:
Use Task when something is usually or always going to be genuinely asynchronous, i.e. not immediately complete; use ValueTask when something is usually or always going to be synchronous, i.e. the value will be known inline; also use ValueTask in a polymorphic scenario (virtual, interface) where you can't know the answer.
The logic behind this incorrect statement is that if something is incomplete, your ValueTask is going to end up being backed by a Task anyway, so you might as well return the Task directly and skip the extra indirection and false promise of ValueTask. This is incorrect, though, because it is based on the premise that a ValueTask is a composite of "known result (T)" and "Task". In fact, ValueTask is also a composite of a third thing: IValueTaskSource[<T>].
What is IValueTaskSource[<T>]?
IValueTaskSource is an abstraction that allows you to represent the logical behaviour of a task separately to the result itself. That's a little vague, so an example:
IValueTaskSource<int> someSource = // ...
short token = // ...
var vt = new ValueTask<int>(someSource, token);
// ...
int i = await vt;
This now functions like you'd expect from an awaitable, but even in the incomplete/asynchronous case the logic about how everything works is now down to whatever implements the interface - it does not need to be backed by a Task. You might be thinking:
ah, but we still need an instance of whatever is implementing the interface, and we're treating it as a reference, so: we're still going to allocate; what's the point? what have you gained?
And that's when I need to point out the short token. This little gem allows us to use the same interface instance with multiple value-tasks, and have them know the difference. There are two ways you could use this:
- keep the state for multiple asynchronous operations concurrently, using the token to pick the correct state (presumably from a vector)
- keep a single piece of state for multiple consecutive operations, using the token to guarantee that we're talking about the correct one
The second is actually by far the more common implementation, and in fact is now included in the BCL for you to make direct use of - see ManualResetValueTaskSourceCore<T>.
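Here's a rough sketch of that second approach, wrapping ManualResetValueTaskSourceCore<int> in a reusable source. ReusableSource, NextOperation and Complete are names I've invented for illustration; the sketch isn't thread-safe, and it assumes the previous operation has been fully consumed before the next one starts:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical example type: one source object reused for consecutive operations.
public sealed class ReusableSource : IValueTaskSource<int>
{
    // Mutable struct: must not be marked readonly.
    private ManualResetValueTaskSourceCore<int> _core;

    // Begin a new logical operation; the token ties the ValueTask to it.
    public ValueTask<int> NextOperation()
    {
        _core.Reset(); // bumps the version, so older tokens stop being valid
        return new ValueTask<int>(this, _core.Version);
    }

    // Signal completion of the current operation.
    public void Complete(int value) => _core.SetResult(value);

    // The interface members simply forward to the core, which validates the token.
    public int GetResult(short token) => _core.GetResult(token);

    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);

    public void OnCompleted(Action<object> continuation, object state,
        short token, ValueTaskSourceOnCompletedFlags flags)
        => _core.OnCompleted(continuation, state, token, flags);
}
```

Used like this, a single ReusableSource instance serves any number of operations one after another:

```csharp
var source = new ReusableSource();

ValueTask<int> first = source.NextOperation();
source.Complete(42);
int a = await first;

ValueTask<int> second = source.NextOperation(); // same object, new token
source.Complete(7);
int b = await second;
```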
So what? How does this help me?
OK; so - we've seen that this alternative exists. There are two ways that people commonly author awaitable APIs today:
- using TaskCompletionSource<T> and handing the caller the .Task (perhaps wrapped in a ValueTask), and calling TrySetResult etc when we want to trigger completion
- using async and await, having the compiler generate all the machinery behind the scenes - noting that this currently involves creating a Task in the incomplete case, even for ValueTask methods (because it has to come from somewhere)
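For reference, the first of those approaches might look something like the sketch below. PendingAnswer, GetAnswerAsync and Publish are invented names, and the RunContinuationsAsynchronously option is my own choice here rather than a requirement:

```csharp
using System.Threading.Tasks;

// Hypothetical example type: hands out a Task-backed ValueTask and completes it later.
public sealed class PendingAnswer
{
    // One allocation per operation: the TaskCompletionSource and its Task.
    private readonly TaskCompletionSource<int> _tcs =
        new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);

    // The caller gets the Task, wrapped in a ValueTask.
    public ValueTask<int> GetAnswerAsync() => new ValueTask<int>(_tcs.Task);

    // Returns false if the operation was already completed.
    public bool Publish(int value) => _tcs.TrySetResult(value);
}
```

Every new operation needs a fresh PendingAnswer (and so a fresh Task), which is exactly the allocation that a reusable IValueTaskSource lets you avoid.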
Hopefully you can see that if we have ValueTask available to us it is relatively easy to substitute in a ManualResetValueTaskSourceCore backer, allowing us to reuse the same IValueTaskSource instance multiple times, avoiding lots of allocations. But: there's an important caveat - it changes the API. No, really. Let's take a stroll to discuss how...
Don't await twice
Right now, the following code works - assuming the result is backed by either a fixed T or a Task<T>:
var pending = obj.SomeMethodAsync();
int i = await pending;
// ...
int j = await pending;
You'll get the same answer from each await, unsurprisingly - but the actual operation (the method) is only performed once. But: if we switch to ManualResetValueTaskSourceCore, we should only assume that each token is valid exactly once; once we've awaited the result, the entire point is that the backing implementation is free to re-use that IValueTaskSource with a different token for another consumer. That means that the code shown above is no longer legal, and we should expect that the second await can now throw an exception about the token being incorrect.
This is a pretty rare thing to see in code, so personally I'm OK with saying "tough; await once only". Think of it in human terms; this is like a manager going to someone's desk and saying:
Hi, I need the answer to (some topical question); do you know that now? if so, tell me now; otherwise, when you have the answer, bring it (somewhere) and nudge me.
All fine and reasonable so far; our office hero didn't know the answer right away, so they went away and got it, took it where instructed and handed the answer to the manager.
20 minutes later (or 2 days later), the manager stops by their desk again:
Hey, give me that answer
At this point, our hero might reasonably say:
Boss, I already gave it you; I only printed it out once - you have the copy; I deal with lots of requests each day, and I can't even remember what you asked about, let alone what the answer was; if you've forgotten the answer, that's on you - feel free to ask again, it's all billable
This is kinda how I anthropomorphize ValueTask, especially in the context of IValueTaskSource. So key point: don't await twice. Treat the results of awaitables exactly the same as you would the result of any other expression: if you are going to need the value twice, store it in a local when you first fetch it.
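In code, "await once" could look like the sketch below. AwaitOnce and its methods are hypothetical, and SomeMethodAsync is just a stand-in that completes synchronously. The second method shows an escape hatch that isn't discussed above: if you genuinely need to await repeatedly, ValueTask<T>.AsTask() converts to a Task (consuming the ValueTask once), and a Task is fine to await as often as you like:

```csharp
using System.Threading.Tasks;

// Hypothetical example type showing two safe ways to use a result twice.
public static class AwaitOnce
{
    // Stand-in for any ValueTask-returning API.
    public static ValueTask<int> SomeMethodAsync() => new ValueTask<int>(42);

    public static async Task<int> UseLocalAsync()
    {
        int i = await SomeMethodAsync(); // await exactly once
        int j = i;                       // reuse the local, not the awaitable
        return i + j;
    }

    public static async Task<int> UseTaskAsync()
    {
        // AsTask() consumes the ValueTask once; the resulting Task can be awaited repeatedly.
        Task<int> task = SomeMethodAsync().AsTask();
        int i = await task;
        int j = await task;
        return i + j;
    }
}
```

The local is the cheaper option; AsTask() may allocate, so it gives back some of what ValueTask saved.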
How else can we benefit from IValueTaskSource?
So; we've seen how we can manually use an IValueTaskSource to efficiently issue ValueTask awaitable results; but if we use async/await, in the incomplete / asynchronous case the compiler is still going to be generating a Task - and also generating a bunch of other state boxes associated with the continuation voodoo. But... it doesn't have to! A while ago I did some playing in this area that resulted in "Pooled Await"; I'm not going to go into details about this here, and for reasons that will become clear in a moment, I don't recommend switching to this, but the short version is: you can write a method that behaves exactly like a ValueTask awaitable method (including async), but the library makes the compiler generate different code that uses IValueTaskSource to avoid the Task allocation, and uses state machine boxing to reduce the other allocations. It works pretty well, but as you might expect, it has the above caveat about awaiting things more than once.
So; why am I saying don't leap at this? That's because the BCL folks are also now playing in this space, as evidenced by this PR, which has pretty much the exact same feature set, but with the advantages of:
- being written by people who really, really understand async
- it not adding any dependencies
- it would just work out of the box for ValueTask awaitables
If that happens, then a lot of asynchronous code will magically get less allocatey all at once. I know this is something they've discussed in the past, so maybe my "Pooled Await" stuff gave them the metaphorical kick to go and take another look at implementing it for real; or maybe it was just a timing coincidence.
For both my own implementation and the BCL version, it can't do all the magic if you return Task - for best results, a ValueTask is needed (although "Pooled Await" still reuses the state-machine boxes for Task APIs).
Conclusion
So, going back to the earlier question of when to use Task vs ValueTask, IMO the answer is now obvious:
Use ValueTask[<T>], unless you absolutely can't because the existing API is Task[<T>], and even then: at least consider an API break
And also keep in mind:
Only await any single awaitable expression once
If we put those two things together, libraries and the BCL are free to work miracles in the background to improve performance without the caller needing to care.