Consider this Windows Forms code (a similar WPF analogue could be written):
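using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
using System.Windows.Forms;

public partial class Form1 : Form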
{
    public Form1()
    {
        InitializeComponent();
    }
    private void TraceThreadInfo([CallerMemberName]string callerName = null)
    {
        Trace.WriteLine($"{callerName} is running on UI thread: {!this.InvokeRequired}");
    }
    private void DoCpuBoundWork([CallerMemberName]string callerName = null)
    {
        TraceThreadInfo(callerName);
        for (var i = 0; i < 1000000000; i++)
        {
            // do some work here
        }
    }
    private async Task Foo()
    {
        DoCpuBoundWork();
        await Bar();
    }
    private async Task Bar()
    {
        DoCpuBoundWork();
        await Boo();
    }
    private async Task Boo()
    {
        DoCpuBoundWork();
        // e.g., saving changes to database
        await Task.Delay(1000);
    }
    private async void button1_Click(object sender, EventArgs e)
    {
        TraceThreadInfo();
        await Foo();
        Trace.WriteLine("Complete.");
        TraceThreadInfo();
    }
}
Here is a chain of Foo/Bar/Boo methods that I want to execute asynchronously, without blocking the UI thread. The methods are similar in that each of them does some CPU-bound work and ultimately calls a "true" asynchronous operation (e.g., performs some heavy calculations and saves the result to the database).
The output from the code above is this:
button1_Click is running on UI thread: True
Foo is running on UI thread: True
Bar is running on UI thread: True
Boo is running on UI thread: True
Complete.
button1_Click is running on UI thread: True
So all of this executes synchronously on the UI thread.
I know that the built-in awaitables capture the current context. So I thought it would be enough to call ConfigureAwait(false) like this:
    private async Task Foo()
    {
        await Task.Delay(0).ConfigureAwait(false);
        DoCpuBoundWork();
        await Bar();
    }
but this doesn't actually change anything.
How can this work be "pushed" to a thread pool thread, given that at the end of the button1_Click method I need to return to the UI thread?
Edit.
Task.Delay actually short-circuits when its argument is 0: it returns an already-completed task, so the await never yields (thanks to @usr for the note). This:
    private async Task Foo()
    {
        await Task.Delay(1).ConfigureAwait(false);
        DoCpuBoundWork();
        await Bar();
    }
works as expected (everything except button1_Click's own code executes on the thread pool). But this is even worse: whether the context is effectively captured or not depends on how the awaitable is implemented.
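To make the difference between the two Task.Delay calls easier to see outside of a form, here is a minimal console sketch. It is only an illustration: DelayDemo and ResumedOnDifferentThreadAsync are hypothetical names that are not part of the code above, and it assumes a plain console application with no SynchronizationContext installed.

```csharp
using System;
using System.Threading.Tasks;

public static class DelayDemo
{
    // Hypothetical helper: reports whether the code after the await
    // resumed on a different thread than the one that called the method.
    public static async Task<bool> ResumedOnDifferentThreadAsync(int delayMs)
    {
        int before = Environment.CurrentManagedThreadId;

        // If the task is already completed, the await does not yield,
        // so ConfigureAwait(false) has nothing to act on.
        await Task.Delay(delayMs).ConfigureAwait(false);

        return Environment.CurrentManagedThreadId != before;
    }

    public static void Main()
    {
        // Task.Delay(0) hands back an already-completed task.
        Console.WriteLine($"Task.Delay(0).IsCompleted: {Task.Delay(0).IsCompleted}");

        // With a zero delay the method runs to completion synchronously.
        bool hopped0 = ResumedOnDifferentThreadAsync(0).GetAwaiter().GetResult();
        Console.WriteLine($"Delay(0) resumed on a different thread: {hopped0}");

        // With a non-zero delay the task is normally still pending at the await,
        // so the continuation is scheduled elsewhere (typically the thread pool).
        bool hopped1 = ResumedOnDifferentThreadAsync(1).GetAwaiter().GetResult();
        Console.WriteLine($"Delay(1) resumed on a different thread: {hopped1}");
    }
}
```

The zero-delay case never leaves the calling thread, which matches what I see with Foo above. The non-zero case usually does leave it, but only because the task happens to be incomplete at the moment it is awaited.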
 
     
    