In that particular example:
template <int N>
struct Factorial {
    enum { value = N * Factorial<N - 1>::value };
};
template <>
struct Factorial<0> {
    enum { value = 1 };
};
The compiler generates the classes recursively. Remember, a template by itself is not a class, only a set of rules after which classes are generated.
Initially, only Factorial<0> exists. Then you write:
const int x = Factorial<4>::value;
which tells it it needs to generate the class Factorial<4> which would translate to:
template <>
struct Factorial<4> {
    enum { value = 4 * Factorial<3>::value };
};
which again tells it to generate the class Factorial<3> and so on. This stops when it reaches Factorial<0> because it was already defined and no new classes have to be generated to compute its value member.
Basically, it moves the computation from run-time to compile-time. Note that this doesn't always work (depending on the value of N, your compiler might not support that many levels of template recursion). Also, this increases code size. 
This is for the how - why this works it's because it's permitted by the standard.