
As I understand it, a storage device has some number of bits available for storage. Various software stores binary data in 32-bit or 64-bit units, some basic text files are stored as 8-bit ASCII, and images, video and music may be something in between.

a) Is this understanding correct?

b) Why measure everything in (8-bit) byte units if things are not all stored in 8-bit chunks?


3 Answers


The majority of today's computer systems work internally with multiples of 8 bits. At the lowest level, smaller quantities do get transferred (e.g. nibbles (4 bits) to the PHY of a 100 Mbit/s Ethernet connection), but everything exposed to higher levels is in multiples of 8 bits. This leads to memories working in 8-bit (or larger) chunks. The lowest common denominator of everything sensible to measure for the end user is what we today call a byte, so that's the unit of choice.

Data in files is usually aligned to byte boundaries, because accessing single bits is a more expensive operation. The smallest quantity supported by most of today's computer instruction sets is 8 bits of data, and that is what a memory address points to. Thus, some parts of files may be 32-bit chunks or 64-bit chunks, but you will only rarely find a 7-bit chunk that is not padded out to 8 bits (such as 7-bit ASCII).
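
As a rough illustration (plain C, not tied to any particular file format or hardware), reading a whole byte is a direct, addressable access, while pulling out an individual bit needs extra shift-and-mask work on top of that:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t data[] = { 0x41, 0x42, 0x43 };   /* "ABC" in 8-bit ASCII */

    /* Whole-byte access: a memory address points directly at an 8-bit unit. */
    uint8_t second = data[1];

    /* Single-bit access: shift and mask are needed on top of the byte load. */
    int bit5 = (data[1] >> 5) & 1;

    printf("byte: 0x%02X, bit 5: %d\n", (unsigned)second, bit5);
    return 0;
}
```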


Yes, your understanding is correct.

Sizes are typically specified in bytes for a variety of reasons. One is that specifying numbers in bits would result in larger, less convenient numbers. Another is that units smaller than a single byte are almost never transferred, so there is no reason to use a smaller unit.
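
To give a sense of the scale difference, here is a minimal sketch in C; the 500 GB figure is just an illustrative capacity, not taken from the question:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* Hypothetical 500 GB drive, using decimal units as drive vendors do. */
    uint64_t capacity_bytes = 500ULL * 1000 * 1000 * 1000;
    uint64_t capacity_bits  = capacity_bytes * 8;

    printf("%llu bytes\n", (unsigned long long)capacity_bytes); /* 500000000000 */
    printf("%llu bits\n",  (unsigned long long)capacity_bits);  /* 4000000000000 */
    return 0;
}
```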


Historically, computers have worked with widely varying word sizes. For example, 36- or 40-bit words were not entirely uncommon in early electronic computers (these led to convenient 18- or 20-bit "half words", which were sufficient for many purposes, while the full-length words allowed larger quantities or more precision where that was needed).

These days, almost all general-purpose computers work with data in power-of-two multiples of eight bits. Eight bits is a convenient quantity to use as a baseline, and it fits nicely into the power-of-two scheme that computers, because of their binary nature, handle easily.

Consequently, hardware is designed to work with such multiples of eight-bit quantities, in a sort of self-reinforcing loop.

In all honesty, today's computers are often designed to work efficiently with significantly larger quantities than eight bits at a time: commonly 32, 64 or even 128 bits. Note that all of these are power-of-two multiples of eight bits, and as such can easily be deconstructed or combined if necessary.
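
As a small sketch of what "deconstructed or combined" means in practice (plain C; the byte values are arbitrary), four 8-bit bytes can be packed into one 32-bit word with shifts and pulled apart again:

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t b[4] = { 0x12, 0x34, 0x56, 0x78 };   /* arbitrary example bytes */

    /* Combine four 8-bit bytes into one 32-bit word (big-endian order here). */
    uint32_t word = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
                  | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];

    /* Deconstruct it again by shifting and masking. */
    uint8_t hi = (word >> 24) & 0xFF;
    uint8_t lo = word & 0xFF;

    printf("word: 0x%08lX, hi: 0x%02X, lo: 0x%02X\n",
           (unsigned long)word, (unsigned)hi, (unsigned)lo);
    return 0;
}
```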

At a lower level, storage capacities are often specified in bits, because some systems don't work in terms of bytes at all. A bit count is also unambiguous: eight-bit bytes suit most uses, but not all, so whereas "bytes" might not apply in every situation, the number of available bits always stays the same.

As David Schwartz pointed out, showing bit quantities to the user would simply inflate the numbers without providing much (if indeed any) actual additional information. While an electronics engineer or firmware programmer can be expected to know how to work in bits, the average computer user cannot be expected to have such knowledge.

Early personal computers also used encoding schemes that always encoded a single character as a single byte (or in a few cases, some small multiple of bytes), so the concept of "character = byte" was easy to convey. This isn't quite the case today with variable-length encodings such as UTF-8, but on the other hand storage capacities are so great these days that we don't normally need to worry about those details.
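
For illustration, here is a minimal C sketch of why "character = byte" no longer holds under UTF-8; it counts code points by skipping continuation bytes, which is enough to show the idea though it is not a full Unicode treatment:

```c
#include <stdio.h>
#include <string.h>

int main(void) {
    /* "héllo" encoded as UTF-8: the 'é' takes two bytes (0xC3 0xA9). */
    const char *s = "h\xC3\xA9llo";

    size_t bytes = strlen(s);
    size_t chars = 0;
    for (const char *p = s; *p; p++) {
        /* UTF-8 continuation bytes look like 10xxxxxx; anything else starts a character. */
        if (((unsigned char)*p & 0xC0) != 0x80)
            chars++;
    }

    printf("bytes: %zu, characters: %zu\n", bytes, chars); /* 6 vs 5 */
    return 0;
}
```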
