On the fascinating topic of bits, bytes, octets, words and smallest addressable units

Below I try to condense the various (and often contradictory) bits (pun not intended) of information found in different sources and to present my own understanding of the subject.

It turns out that the only two precisely defined terms whose meaning is context-invariant are "bit" and "octet", so I'll start with these:

bit

A single binary digit

octet

Eight binary digits
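
To make these two definitions concrete, here is a minimal Python sketch that writes a value out as the eight bits of an octet. The function name octet_to_bits is my own, made up for illustration.

```python
def octet_to_bits(value: int) -> str:
    """Return the eight binary digits of an octet, most significant bit first."""
    # An octet holds exactly 8 bits, so it can take 2**8 = 256 distinct values (0..255).
    if not 0 <= value <= 0xFF:
        raise ValueError("an octet only holds values from 0 to 255")
    return format(value, "08b")


print(octet_to_bits(5))    # 00000101
print(octet_to_bits(255))  # 11111111
```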

word

"Word" is usually defined as "the natural size with which a processor is handling data (the register size)". Wikipedia "defines" it as "the natural unit of data used by a particular processor design". Obviously this is a vague and ambiguous "definition".

The most common word sizes encountered today are 8, 16, 32 and 64 bits, but other sizes are possible. For example, there were a few 36-bit machines, and even 12-bit machines. In other words, a "word" is the chunk of bits that a processor most commonly processes (adds, subtracts and so on) in a single operation. That definition is a bit fuzzy, as some processors have different word sizes for different tasks (integer vs. floating-point arithmetic, for example). The word size is what the majority of operations work with (source). By that definition, the word size of a CPU is also the width of the ALU's data path (NB: not to be confused with the memory data bus).
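
One practical consequence of the word size is where arithmetic wraps around. The sketch below is only a rough model of that: it assumes unsigned addition in which the carry out of the top bit is simply discarded, and it ignores flags and the signed-number representations that real (especially historical) machines used. The name add_in_word is hypothetical.

```python
def add_in_word(a: int, b: int, word_bits: int) -> int:
    """Add two unsigned integers the way a word_bits-wide adder would.

    Assumption: the carry out of the most significant bit is thrown away,
    so the result is (a + b) modulo 2**word_bits.
    """
    mask = (1 << word_bits) - 1  # word_bits ones, e.g. 0xFF for 8 bits
    return (a + b) & mask


# The largest value of each word size wraps to 0 when 1 is added.
for bits in (8, 12, 16, 36):
    largest = (1 << bits) - 1
    print(bits, largest, add_in_word(largest, 1, bits))
```

With an 8-bit word, add_in_word(200, 100, 8) gives 44 rather than 300, while the same addition fits without wrapping in a 16-bit word.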

On byte-addressable machines the word size is some multiple of the byte size. At any rate, what precisely a "word" is remains ambiguous and context-specific. According to some authors, the word size is the same as the register size (64 bits on x86-64). But according to section 4.1 (“Fundamental Data Types”) of the Intel architecture manual, a word on x86 processors is 16 bits, even though the general-purpose registers of an x86-64 processor are 64 bits wide. So the term "word" can have several different meanings depending on the context.
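
To see how far the two meanings drift apart, the following sketch puts Intel's names for its fundamental data types next to the x86-64 register width. The table is typed in by hand from the manual's naming (byte, word, doubleword, quadword); nothing here is queried from the hardware, and the names INTEL_DATA_TYPE_BITS, REGISTER_BITS and describe_width are my own.

```python
# Sizes in bits of the fundamental data types as Intel names them.
INTEL_DATA_TYPE_BITS = {
    "byte": 8,
    "word": 16,
    "doubleword": 32,
    "quadword": 64,
}

# Width of a general-purpose register on x86-64.
REGISTER_BITS = 64


def describe_width(bits: int) -> str:
    """Return Intel's name for a width in bits, or a fallback description."""
    for name, width in INTEL_DATA_TYPE_BITS.items():
        if width == bits:
            return name
    return f"{bits} bits (no Intel name in this table)"


print(describe_width(REGISTER_BITS))                  # quadword
print(REGISTER_BITS // INTEL_DATA_TYPE_BITS["word"])  # 4
```

So in Intel's vocabulary a 64-bit register holds a quadword, that is, four "words", whereas under the register-size definition it holds exactly one word.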

smallest addressable unit or byte

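A byte is the smallest unit of memory that has its own address. For example, in a program on my machine 0x20aa87c68 might be the address of one byte; 0x20aa87c69 is then the address of the next byte. On practically all current machines a byte is eight bits, but the term itself does not guarantee that, which is why "octet" exists as the unambiguous name.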
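
The "next address, next byte" relationship can be observed from Python with the standard ctypes module. This is a small sketch, not a portable memory inspector: the helper name bytes_at_consecutive_addresses is made up, and the addresses it prints will differ on every run and every machine.

```python
import ctypes


def bytes_at_consecutive_addresses(data: bytes) -> list:
    """Copy data into a raw buffer and read it back one address at a time.

    Returns a list of (address, value) pairs, one per byte.
    """
    buffer = (ctypes.c_ubyte * len(data)).from_buffer_copy(data)
    base = ctypes.addressof(buffer)  # address of the first byte
    return [
        # Reading at base + i yields the i-th byte: each address names one byte.
        (base + i, ctypes.c_ubyte.from_address(base + i).value)
        for i in range(len(data))
    ]


for address, value in bytes_at_consecutive_addresses(b"\x10\x20\x30"):
    print(hex(address), hex(value))
```

Each printed address is exactly one larger than the one before it, and each holds the next byte of the input.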

pointer size

TBD

data bus size

TBD

cache line

TBD