Disclaimer - I'm not an information theorist, just a code monkey who works primarily in C and C++ (and thus, with fixed-width types), and my answer is going to be from that particular perspective.
It takes on average 3.2 bits to represent a single decimal digit - 0 through 7 can be represented in 3 bits, while 8 and 9 require 4. (8*3 + 2*4)/10 == 3.2.
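If you want to sanity-check that arithmetic, here's a throwaway C sketch (nothing beyond the standard library; it just hard-codes the convention of charging every digit 0 through 7 a full 3 bits):

```c
#include <stdio.h>

/* Quick check of the arithmetic above: charge each of 0-7 three
 * bits and 8-9 four bits, then average over the ten digits. */
int main(void)
{
    int total = 0;
    for (int d = 0; d <= 9; d++) {
        int bits = (d <= 7) ? 3 : 4;
        printf("digit %d -> %d bits\n", d, bits);
        total += bits;
    }
    printf("average: %.1f bits per digit\n", total / 10.0); /* 3.2 */
    return 0;
}
```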
This is less useful than it sounds. For one thing, you obviously don't have fractions of a bit. For another, if you're using native integer types (i.e., not BCD or BigInt), you're not storing values as a sequence of decimal digits (or their binary equivalents). An 8-bit type can store some values that take up to 3 decimal digits, but you can't represent all 3-decimal-digit values in 8 bits - the range is [0..255]. You cannot represent the values [256..999] in only 8 bits.
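To make that concrete, here's a minimal sketch using a fixed-width 8-bit type - anything above 255 is reduced modulo 256 on conversion, so the 3-digit values 256 through 999 just don't fit:

```c
#include <stdint.h>
#include <stdio.h>

/* An 8-bit unsigned type holds exactly [0..255]; larger values are
 * reduced modulo 256 on conversion, so 256..999 simply don't fit. */
int main(void)
{
    uint8_t fits    = 255;  /* largest 3-decimal-digit value that fits */
    uint8_t no_fits = 300;  /* wraps to 300 % 256 == 44 */
    printf("255 stored as %d\n", fits);
    printf("300 stored as %d\n", no_fits);
    return 0;
}
```

(A decent compiler will likely warn about the second initializer, which is rather the point.)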
When we're talking about values, we'll use decimal if the application expects it (e.g., a digital banking application). When we're talking about bits, we'll usually use hex or binary (I almost never use octal since I work on systems that use 8-bit bytes and 32-bit words, which aren't divisible by 3).
Values expressed in decimal don't map cleanly onto binary sequences. Take the decimal value 255. The binary equivalents of each digit would be 010, 101, 101. Yet the binary representation of the value 255 is 11111111. There's simply no correspondence between any of the decimal digits in the value and the binary sequence. But there is a direct correspondence with hex digits - F == 1111, so that value can be represented as FF in hex.
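Here's a small sketch that prints the two forms side by side (print_bits is just a helper I'm making up for the illustration):

```c
#include <stdio.h>

/* Helper for the illustration: print the low `width` bits of v. */
static void print_bits(unsigned v, int width)
{
    for (int i = width - 1; i >= 0; i--)
        putchar(((v >> i) & 1u) ? '1' : '0');
}

int main(void)
{
    unsigned value = 255;

    /* Digit by digit: 2, 5, 5 -> 010 101 101 */
    printf("decimal digits:  ");
    print_bits(2, 3); putchar(' ');
    print_bits(5, 3); putchar(' ');
    print_bits(5, 3); putchar('\n');

    /* The value itself: 11111111 - no relation to the groups above */
    printf("value in binary: ");
    print_bits(value, 8); putchar('\n');

    /* But each hex digit is exactly one 4-bit group: FF == 1111 1111 */
    printf("value in hex:    %X\n", value);
    return 0;
}
```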
If you're on a system where 9-bit bytes and 36-bit words are the norm, then octal makes more sense since bits group naturally into threes.
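You can see the mismatch on an ordinary machine, too - each octal digit is still a 3-bit group, it's just that the leading group doesn't fill out an 8-bit byte. A quick sketch:

```c
#include <stdio.h>

/* 255 == 0377: the octal digits 3, 7, 7 are the 3-bit groups
 * 11 111 111. The leading group is only 2 bits wide, which is why
 * octal reads awkwardly with 8-bit bytes and 32-bit words but lines
 * up perfectly with 9-bit bytes and 36-bit words. */
int main(void)
{
    unsigned value = 255;
    printf("decimal: %u\n", value);  /* 255 */
    printf("octal:   %o\n", value);  /* 377 */
    printf("hex:     %X\n", value);  /* FF  */
    return 0;
}
```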
- Actually, the average per digit is smaller, since 0 and 1 only require a single bit, 2 and 3 only require 2 bits, and only 4 through 7 need the full 3: (2*1 + 2*2 + 4*3 + 2*4)/10 == 2.6. But, in practice, we consider 0 through 7 to take 3 bits. Just makes life easier in a lot of ways.