26

I was taking a look at an HDD and found a document (from Toshiba, link: 2.5-Inch SATA HDD mq01abdxxx) that says:

"One Gigabyte (1GB) means 10^9 = 1,000,000,000 bytes using powers of 10. A computer operating system, however, reports storage capacity using powers of 2 for the definition of 1GB = 2 ^30 = 1,073,741,824 bytes, and therefore shows less storage capacity."

So powers of 10 are bigger than powers of 2, OK.

For example, 10^2 = 100 and 2^2 = 4.

But I do not understand the document, which says, for the same storage capacity:

1 GB is 1,000,000,000 bytes (powers of 10) and 1,073,741,824 bytes (powers of 2), and yet the powers-of-2 figure is said to show less storage capacity. Why is it less? To me, 1 GB looks like more storage capacity in powers of 2 than in powers of 10.

Burgi
  • 6,768

4 Answers

61

The historical reason for using powers of 2 is that memory and hard disks are accessed by the CPU through an address space expressed in binary. Hardware manufacturers chose the names this way:

2^10 = 1024 bytes, and since that is almost 1000, call it 1 kilobyte

2^20 = 1048576 bytes, and since that is almost 1000000, call it 1 megabyte

For the ordinary user this is confusing and cumbersome. In addition, the prefixes "kilo", "mega", etc. conflict with the International System of Units (SI), where "kilo" always means 10^3, so 1 kilowatt is exactly 1000 watts.

To solve the problem, in 1998 the International Electrotechnical Commission (IEC) proposed a notation scheme for units based on powers of 2, which is now part of the ISO/IEC 80000-13 standard.

The new names were created by replacing the second syllable of the old name with "bi" (for "binary"). A kilobyte becomes a kibibyte, and so on. The new units also have their own symbols, so "10 kibibytes" is written as 10 KiB instead of 10 kB. This is the correspondence table:

Notation      Symbol    Value
1 kilobyte    1 kB      10^3  = 1,000 bytes
1 megabyte    1 MB      10^6  = 1,000,000 bytes
1 gigabyte    1 GB      10^9  = 1,000,000,000 bytes
1 terabyte    1 TB      10^12 = 1,000,000,000,000 bytes

1 kibibyte    1 KiB     2^10 = 1,024 bytes
1 mebibyte    1 MiB     2^20 = 1,048,576 bytes
1 gibibyte    1 GiB     2^30 = 1,073,741,824 bytes
1 tebibyte    1 TiB     2^40 = 1,099,511,627,776 bytes
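
To see how far the two systems drift apart at each step, here is a small Python sketch based only on the values in the table above (nothing beyond the table is assumed):

```python
# Compare the SI (powers of 10) and IEC (powers of 2) prefixes from the table.
prefixes = [
    ("kilo", "kibi", 10**3,  2**10),
    ("mega", "mebi", 10**6,  2**20),
    ("giga", "gibi", 10**9,  2**30),
    ("tera", "tebi", 10**12, 2**40),
]

for si_name, iec_name, si_value, iec_value in prefixes:
    ratio = iec_value / si_value          # how much bigger the binary unit is
    print(f"1 {iec_name}byte = {iec_value:,} bytes = {ratio:.3f} {si_name}bytes")
```

The gap grows from about 2.4% at kilo/kibi to almost 10% at tera/tebi, which is why the mismatch is so much more visible on today's large drives.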

Nearly two decades later, many hardware and software vendors still refer to the base-2 units by their SI names. A "megabyte" can mean either 1,000,000 bytes or 1,048,576 bytes.

If you buy a 100 GB hard drive, the capacity is 100 × 10^9 = 10^11 bytes. But, and this is the big but, the operating system divides by 2^30 and reports the drive as having a capacity of only about 93 GB (10^11 / 2^30 ≈ 93.1). You bought a 100-gigabyte drive, which is equivalent to a 93-gibibyte drive; the operating system is the one using the wrong notation.
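
A minimal Python sketch of that arithmetic (100 GB is the advertised capacity from the example above):

```python
advertised_bytes = 100 * 10**9   # "100 GB" as the manufacturer counts it

gb  = advertised_bytes / 10**9   # decimal gigabytes (SI)
gib = advertised_bytes / 2**30   # binary gibibytes (IEC)

print(f"{gb:.1f} GB")            # 100.0 GB
print(f"{gib:.1f} GiB")          # 93.1 GiB -- what the OS labels "93 GB"
```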

Drive manufacturers hide this issue with disclaimers and explanations that always lead to the conclusion that “actual formatted capacity may be less”.

jcbermu
  • 17,822
23

In short: it was all about marketing.

jcbermu explained it well, but I don't agree with the reasoning behind all of that.

Every computing system works in binary, so counting bits and bytes in powers of 2 is normal. It's not the operating system or the software that is at fault for the confusion; everything is binary there.

It's the fault of the HDD manufacturers, who state drive capacities in powers of 10, which robs you of quite a few practical GB: a 20 GB drive will actually show up as roughly 18.6 GB, a 1 TB drive as roughly 931 GB, and so on. The 'bibyte' units were invented to try to prevent some of the confusion, but they have utterly failed to be adopted in practice.
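
A quick Python check of those figures (20 GB and 1 TB are the advertised decimal capacities mentioned above):

```python
# Convert advertised decimal capacities to the binary units the OS reports.
for label, advertised_bytes in [("20 GB", 20 * 10**9), ("1 TB", 1 * 10**12)]:
    gib = advertised_bytes / 2**30
    print(f"{label} advertised -> {gib:.1f} GiB reported by the OS")
# 20 GB advertised -> 18.6 GiB reported by the OS
# 1 TB advertised -> 931.3 GiB reported by the OS
```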

Overmind
  • 10,308
16

jcbermu's answer is good, but I want to approach this from a different angle.

1 GB is 1,000,000,000 bytes (powers of 10) and 1,073,741,824 bytes (powers of 2), and yet the powers-of-2 figure is said to show less storage capacity. Why is it less? To me, 1 GB looks like more storage capacity in powers of 2 than in powers of 10.

A storage medium -- any storage medium -- can store a specific number of accessible bits. In general-purpose computing this is usually expressed as bytes or some multiple of bytes, but if you start looking at, for example, memory ICs (integrated circuits, chips), you will see their capacity expressed in terms of accessible bits.

A hard disk will store some specific number of bits or bytes which, for technical reasons, are addressed in terms of sectors. For example, a 4 TB drive might have 7,814,037,168 sectors of 512 bytes each, which works out to a storage capacity of 4,000,787,030,016 bytes. That's what you actually get. (In practice, you then lose some of that to the computer's bookkeeping information: file system, journal, partitioning, etc. However, the bytes are still there, you just can't use them to store files, because they are needed to store the data that effectively allows you to store the files.)
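
A minimal Python sketch of that sector arithmetic (the sector count and sector size are the example figures above):

```python
sectors     = 7_814_037_168       # logical sectors of the example 4 TB drive
sector_size = 512                 # bytes per sector

total_bytes = sectors * sector_size
print(f"{total_bytes:,} bytes")   # 4,000,787,030,016 bytes
```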

Of course, the number 4,000,787,030,016 is somewhat unwieldy. For that reason, we choose to represent this information in some other way. But as jcbermu illustrated, we choose to do so in two different ways: in powers of ten, or powers of two.

In powers of ten, 4,000,787,030,016 bytes is 4.000787030016 * 10^12 bytes, which rounds quite nicely; with four significant digits, it rounds to 4.001 TB, for the SI definition of "tera": 10^12. Our hard disk can store more than 4 * 10^12 bytes, so in SI terms, it is a 4 terabyte storage device.

In powers of two, 4,000,787,030,016 bytes is 3.638694607 * 2^40 bytes, which doesn't round quite so nicely. It also looks like a smaller quantity, because 3.639 is less than 4.001, and that is bad for marketing (who wants to buy a 3.6 TB drive when the manufacturer next door sells a 4.0 TB drive for the same price?). Expressed with the binary prefix, this is about 3.6 "tebibytes" (TiB), where the "bi" indicates that it is a base-two quantity.

In reality, however, it's exactly the same number of bytes; the number is only expressed differently! If you do the math again, you will see that 3.638694607 * 2^40 = 4.000787030016 * 10^12, so you get the same storage capacity in the end.
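
The same comparison in Python, using the byte count from the example drive above:

```python
total_bytes = 4_000_787_030_016   # capacity of the example 4 TB drive

tb  = total_bytes / 10**12        # decimal terabytes
tib = total_bytes / 2**40         # binary tebibytes

print(f"{tb:.4f} TB")             # 4.0008 TB
print(f"{tib:.4f} TiB")           # 3.6387 TiB
# Both figures describe exactly the same 4,000,787,030,016 bytes;
# only the unit used to express them differs.
```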

user
  • 30,336
6

Other answers have addressed the historical reason for the difference, but it seems to me like you are asking about the difference according to the mathematics.

You are correct that one power of 10 is larger than one power of 2, and yet one gigabyte (10^9 bytes) is smaller than one gibibyte (2^30 bytes).

The reversal of sizes is explained by the fact that there are more factors in one gibibyte (30 factors of 2) than in one gigabyte (9 factors of 10), and the number of factors has a larger effect on the final size than the size of each individual factor does. Concretely, 2^30 = (2^10)^3 = 1024^3, which is larger than 1000^3 = 10^9.

As to why the reported size of a disk is smaller when measured in gibibytes (2^30) than when measured in gigabytes (10^9), it is natural that, when measuring a fixed quantity, a larger unit of measure gives a smaller number. For example, consider height in inches versus height in centimetres. Because one inch is larger than one centimetre, the same height will measure fewer inches (e.g. 72 inches) than centimetres (e.g. 183 centimetres). The height is the same physical distance in both cases; each measurement just gives a different number according to the unit of measure.
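
A small Python sketch of the analogy (the 72 in / 183 cm height comes from the paragraph above; the "1 TB" drive is an assumed example for illustration):

```python
# Same height, different units: the bigger unit yields the smaller number.
height_cm = 183
print(f"{height_cm} cm = {height_cm / 2.54:.0f} inches")    # 183 cm = 72 inches

# Same disk, different units: the bigger unit (GiB) yields the smaller number.
disk_bytes = 1 * 10**12                                     # an assumed "1 TB" drive
print(f"{disk_bytes / 10**9:.0f} GB = {disk_bytes / 2**30:.0f} GiB")  # 1000 GB = 931 GiB
```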