__thread Foo foo;
How is foo actually resolved? Does the compiler silently replace every use of foo with a function call? Or does each thread reserve a region at a known position near the bottom of its stack, so that the compiler can address foo as "offset x from the bottom of the stack"?
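To make the question concrete, here is a minimal sketch of the behaviour I am asking about. It assumes GCC or Clang on a POSIX system with pthreads; the one-field Foo and the names worker and tls_copies_are_separate are hypothetical stand-ins, not part of any real code base. It only shows that each thread ends up with its own foo at its own address. It says nothing about which mechanism produces that address.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the real Foo type. */
typedef struct { int value; } Foo;

/* One instance of foo per thread (GCC/Clang extension). */
static __thread Foo foo;

/* Worker: write to this thread's copy of foo and report its address. */
static void *worker(void *out) {
    foo.value = 2;
    *(uintptr_t *)out = (uintptr_t)&foo;
    return NULL;
}

/* Returns 1 if the worker thread saw a different &foo than the caller
   and the worker's write left the caller's copy untouched, else 0. */
int tls_copies_are_separate(void) {
    pthread_t t;
    uintptr_t worker_addr = 0;
    int rc;

    foo.value = 1;                      /* the calling thread's copy */

    rc = pthread_create(&t, NULL, worker, &worker_addr);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);

    printf("caller &foo = %p, worker &foo = %p\n",
           (void *)&foo, (void *)worker_addr);

    return worker_addr != (uintptr_t)&foo && foo.value == 1;
}
```

Calling tls_copies_are_separate() from main (built with something like gcc -pthread) should return 1: the two addresses differ, and the caller's foo.value is still 1 after the worker wrote 2 to its own copy. What I want to understand is how the generated code arrives at those two different addresses from the same source-level name.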