Because it allows for a much more succinct and useful comparison.
Using C, the definition effectively says:

f grows at roughly (asymptotically) the same speed as g, or slower.
Without the C, we'd lose the "roughly (asymptotically)" part: we couldn't just say f = O(n^2), we'd have to say something like f = O(1.7 n^2) (unless the factor happened to be exactly 1, of course). That's not particularly useful.
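For reference, a standard formulation of that definition (the exact quantifier phrasing varies a bit from textbook to textbook):

```latex
% A standard definition of Big O; quantifier phrasing varies by textbook.
\[
  f(n) = O\bigl(g(n)\bigr)
  \;\iff\;
  \exists\, C > 0 \;\exists\, n_0 \;\forall n \ge n_0:\;
  f(n) \le C \cdot g(n).
\]
```

The C is exactly the knob that absorbs constant factors like that 1.7.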
Big O is generally used to talk about classes of algorithms, and about ratios. f = O(n^2) says: "When you double n, the computation grows roughly 4 times." This would still be true for O(4 n^2), O(1.7 n^2), or even O(0.01 n^2).
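A minimal sketch of that ratio argument, using a hypothetical quadratic routine with an arbitrary constant factor of 3 (the names and numbers are made up for illustration):

```python
# Hypothetical quadratic routine; the constant factor (3 here) is arbitrary.
def quadratic_work(n, factor=3):
    """Count the basic operations if each of n elements needs factor*n of them."""
    ops = 0
    for _ in range(n):
        ops += factor * n  # factor*n operations per element -> factor*n^2 total
    return ops

# Doubling n quadruples the count, regardless of the constant factor.
for n in (1_000, 2_000, 4_000):
    print(n, quadratic_work(n), quadratic_work(2 * n) / quadratic_work(n))
```

The printed ratio is 4 no matter what factor you plug in, which is exactly what the O(n^2) label promises.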
The entire point of the Big O notation is to express asymptotic complexity: how the running time trends as n gets large. We don't care whether it takes four times as much or half as much time at a particular n; we care how it scales when n scales, and such scaling is invariant under a multiplicative constant.
Not to mention that pinning down the exact constant would be really difficult in specific cases. It's generally easy to show that, say, an algorithm performs roughly n operations for each piece of its input, and so has n^2 complexity. But it would be quite painful to analyse whether it performs 3n operations for n/2 of the input elements and 2n operations for the other n/2, or something else entirely.
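As a hypothetical illustration (the original answer doesn't name a particular algorithm), here is a case where the per-element cost depends on the data, so the exact constant is awkward to state, yet the n^2 class is obvious:

```python
import random

def process(data):
    """Count operations for a made-up algorithm whose inner cost varies with the data."""
    n = len(data)
    ops = 0
    for x in data:
        # Even elements cost about 2n operations, odd ones about 3n; the exact
        # constant in front of n^2 depends on the mix, but the total is
        # still between 2*n^2 and 3*n^2, i.e. O(n^2) either way.
        ops += 2 * n if x % 2 == 0 else 3 * n
    return ops

data = [random.randrange(100) for _ in range(1_000)]
print(process(data))       # somewhere between 2*n^2 and 3*n^2
print(process(data * 2))   # doubling the input still roughly quadruples the count
```

Working out the precise constant here means reasoning about the distribution of the inputs; the Big O class needs none of that.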