## What is Normal – Part 5: Numbers and Counting Systems

© 2012 Rex Jaeschke. All rights reserved.

We all know how to count. How do we count? Well, we just "know" how; there probably are some rules but most of us have long forgotten them.

In so-called "primitive" civilizations, they had (and perhaps still have) symbols for no more than the numbers 1, 2, and many. Didn't they ever need to count things, and, if so, how did they do it? The answer is, "They used tally sticks", which, as Wikipedia states, "… was an ancient memory aid device to record and document numbers, quantities, or even messages". Basically, the first time a shepherd let his sheep out to pasture, he carved a notch in a stick for each sheep that passed him. Then when he brought them back home again, he checked that he had one for each notch. [Tally sticks are not to be confused with tally marks.]

Now the Roman numeral system has been around for a while, in fact, since the Romans; hmm, funny about that! And while you might have amused yourself on occasion when trying to decipher the vintage of a movie copyright, if you've ever tried to do arithmetic using that system, you'll soon see why it didn't take off. Besides, you needed different symbols for the digit 1 in the numbers 1, 10, 100, 1000, and so on. Fortunately, the idea of place-value notation came along in the 5^{th} century. But even then we didn't get a symbol for the idea of zero for another 100 years. [I well remember observing a heated discussion some 25 years ago between people who thought zero was special while others thought it "was just another point on the number line".]

For quite a few centuries now, the predominant number system has been one based on the number 10; hence the term *decimal*, from the Latin *decimus*. We'll look at that number system later, along with number systems in general. The system most of us use now has the digits 0–9, which are referred to as Arabic numerals or, more correctly, Hindu-Arabic numerals.

**Things You Might be Taking for Granted**

When writing large numbers, we often talk of having *thousands* separators, as in 1,234,567, which many Europeans would write as 1.234.567 or even 1 234 567, where the space separators are non-breaking spaces. (For a discussion of decimal comma see the March 2010 essay, "What is Normal? – Part 1: Getting Started".) But life is not like a box of chocolates, and some people group their digits in twos or threes, or combinations thereof. (Is that normal?)
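To make those separator conventions concrete, here is a minimal Python sketch. The helper name `group_thousands` is my own invention for illustration; it leans on Python's built-in `,` format option, which always groups in threes, and then swaps in whichever separator is wanted. It does not attempt the twos-and-threes groupings mentioned above.

```python
def group_thousands(n: int, sep: str = ",") -> str:
    """Group the digits of n in threes, using sep as the separator."""
    # The "," format option groups in threes with a comma;
    # we then substitute the separator the reader expects.
    return f"{n:,}".replace(",", sep)

print(group_thousands(1234567))            # 1,234,567
print(group_thousands(1234567, "."))       # 1.234.567
print(group_thousands(1234567, "\u00a0"))  # 1 234 567, with non-breaking spaces
```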

Then there is the issue of how to write negative values. In some contexts, people use a preceding minus sign. In others, they enclose the number in parentheses, while in others they might use a DB prefix or suffix to indicate a financial debit (with CR being used to indicate a credit).

The digits in numbers are written left-to-right in decreasing position value, right? So if what we commonly use are referred to as Arabic digits and Arabic is written right-to-left, how does one write a multi-digit number in Arabic? Although I'd learned the answer to that question some time ago, it wasn't until I actually edited in Microsoft's Word some Arabic text containing numbers, that I fully appreciated it. Yes, the text is written right-to-left, but within that text, numbers are written left-to-right. Now while a word-processor can handle that by dynamically inserting extra digits to the right as they are typed, that is not true with handwriting. That is, when writing such text by hand, one needs to leave sufficient space for the whole number to the left of the previously written text, and then proceed to write the digits left-to-right.

When English speakers speak multi-digit numbers, they do so in the order the digits are written. That is, 57 is said "fifty seven" with each word separate. But that's not necessarily how the Germans do it; to them, 57 is written and pronounced as one word, "siebenundfünfzig", which literally is "seven and fifty". And yes, Germans like to write a large number as a single very long word. For example, 3,526 (or 3.526 using their period thousands-separator) is written as "dreitausendfünfhundertsechsundzwanzig". That makes for interesting problems when breaking such long words across lines when typesetting. And what's with the French word for 80? I'm referring to "quatre-vingts", which literally is "four twenties". Sacré bleu!

**The Decimal Number System**

Let's look more formally at the decimal number system, the one we use every day, for things like prices, street addresses, and telephone numbers.

Consider the number 12,458. (The comma here represents the thousands separator, but as we know, that is not universal.) We all "know" that number represents the value twelve thousand four hundred and fifty eight. How do we know that? Well for starters, the decimal number system has 10 distinct symbols, the digits 0–9. Decimal numbers are written such that the value of a digit is based on its position within that number. For example, the digit 5 can represent five, fifty, or five hundred. (This is different from some other number systems such as those developed by the Romans and Japanese.) Let's break down the decimal number 12,458 into its parts, as follows:

| | | | | |
|---:|:---:|---:|:---:|---:|
| 1 × 10,000 | = | 1 × 10^{4} | = | 10,000 |
| 2 × 1,000 | = | 2 × 10^{3} | = | 2,000 |
| 4 × 100 | = | 4 × 10^{2} | = | 400 |
| 5 × 10 | = | 5 × 10^{1} | = | 50 |
| 8 × 1 | = | 8 × 10^{0} | = | 8 |
| | | | | 12,458 |
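The breakdown of 12,458 can be reproduced mechanically. The following is a small Python sketch under my own naming (the function `expand` is hypothetical); it peels off digits from the right by repeated division by the base and pairs each one with its place value.

```python
def expand(n: int, base: int = 10) -> list[tuple[int, int]]:
    """Return (digit, place value) pairs for n >= 0, most significant first."""
    parts = []
    place = 1
    while True:
        n, digit = divmod(n, base)   # split off the right-most digit
        parts.append((digit, place))
        place *= base                # the next digit to the left is worth more
        if n == 0:
            break
    return parts[::-1]

for digit, place in expand(12458):
    print(f"{digit} x {place:>6,} = {digit * place:>6,}")

# Adding the parts back together recovers the original number.
print(sum(digit * place for digit, place in expand(12458)))  # 12458
```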

The digits, starting from the right-most one going left, are referred to as the *ones* digit, the *tens* digit, the *hundreds* digit, the *thousands* digit, the *ten thousands* digit, and so on, for obvious reasons.

Consider the following example in which we add two numbers together:

| |
|---:|
| 546 |
| + 478 |
| 1,024 |

When two digits are added, the result may be larger than one digit, in which case we *carry* 1 to the next column to the left. For example, 6 + 8 = 14, so we carry 1 and put down 4. The next column to the left now becomes 4 + 7 + 1 = 12, so the 1 is carried and the 2 is put down. And so the process is repeated. With subtraction, when we subtract one digit from another we may have to *borrow* 1 instead of carrying it. For example:

| |
|---:|
| 1,024 |
| − 478 |
| 546 |
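The column-by-column addition described earlier is easy to write down as a procedure. Here is a minimal sketch in Python; the function name `add_columns` is hypothetical, and it deliberately works on strings of digits rather than using the language's own `+` on whole numbers, so that the carrying is visible. Subtraction with borrowing follows the same pattern in reverse and is not shown.

```python
def add_columns(a: str, b: str) -> str:
    """Add two non-negative decimal numbers given as strings of digits."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)   # pad the shorter number with zeros
    carry = 0
    result = []
    # Start at the ones column and work leftward.
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        carry, digit = divmod(total, 10)    # carry is 0 or 1; digit is put down
        result.append(str(digit))
    if carry:
        result.append(str(carry))           # a final carry adds a new column
    return "".join(reversed(result))

print(add_columns("546", "478"))  # 1024
```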

Whether we realize it or not, we need to know 400 rules to do basic addition, subtraction, multiplication, and division of two decimal digits. We simply have to remember them! And even though we can take a few shortcuts (3 + 4 is equivalent to 4 + 3, for example) there are still a lot of rules.

If we were to build a machine to perform these basic arithmetic operations, it would have to know all these rules. It would also need a way to represent each of the 10 digits. Mechanically, this is possible; an old-style automobile mileage odometer is one example. However, representing 10 different states makes the machine more complicated/expensive. This is particularly so if the machine is based on magnetic, electrical, or electronic properties, as are modern computers.

The decimal number system is based on the number 10. The numbers we normally write then are base-10 numbers. We can write such numbers with the base explicitly shown. For example, the number 123 can be written formally with a decimal base using the notation 123_{10}, where the base is shown as a subscript. In ordinary usage, the subscript is omitted because the writer means "base-10 number system" and most readers know only that number system. However, other number systems are in common use in computing, as we shall see below.

For the most part, people other than computer programmers, engineers, and mathematicians can ignore number systems other than base-10.

**The Binary Number System**

The rules we have seen for writing base-10 numbers can be applied to numbers of any arbitrary base. In fact, it's really the other way around; the base-10 system is a specific instance of the general number system idea.

To have a number system with base *B*, we simply need to pick *B* unique symbols, assign distinct values to them, and specify their value order. Therefore, to have a base-2 (or binary) number system, we need two symbols; 0 and 1 are used where 0 is less than 1. Here are some base-10 numbers and their base-2 equivalents, with the latter written in parentheses: 0 (0_{2}), 1 (1_{2}), 2 (10_{2}), 3 (11_{2}), 4 (100_{2}), 5 (101_{2}), 6 (110_{2}), 7 (111_{2}), 8 (1000_{2}), 9 (1001_{2}), 10 (1010_{2}), 100 (1100100_{2}), 1,000 (1111101000_{2}), 10,000 (10011100010000_{2}), 100,000 (11000011010100000_{2}), and 1,000,000 (11110100001001000000_{2}).
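If you'd like to check those equivalents yourself, one common approach is repeated division by the base, collecting the remainders. The sketch below is in Python; the function `to_base` and its symbol table are my own illustrative choices (Python's built-in `bin` would also do for base 2).

```python
def to_base(n: int, base: int) -> str:
    """Write the non-negative integer n using the symbols of the given base."""
    symbols = "0123456789ABCDEF"          # enough symbols for bases 2 to 16
    if not 2 <= base <= len(symbols):
        raise ValueError("base must be between 2 and 16")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)    # the remainder is the next digit, right to left
        digits.append(symbols[remainder])
    return "".join(reversed(digits))

for n in (0, 1, 2, 3, 8, 10, 100, 1_000):
    print(n, to_base(n, 2))
```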

The larger the number system base (or *radix*), the bigger the number that can be expressed with a given number of digit positions. Conversely, the smaller the radix, the more digit positions are required to express a given number. Since 2 is the smallest possible radix, it produces the most unwieldy numbers, as shown in the examples above.
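To see that trade-off in numbers, here is a short Python sketch that counts how many digit positions one million needs in a few bases. The helper `digit_count` is a name I made up for this illustration.

```python
def digit_count(n: int, base: int) -> int:
    """Number of digit positions needed to write n (n >= 0) in the given base."""
    count = 1
    while n >= base:
        n //= base      # drop the right-most digit
        count += 1
    return count

for base in (2, 8, 10, 16):
    print(f"base {base:>2}: {digit_count(1_000_000, base)} digits")
```

One million takes 20 digits in base 2 but only 5 in base 16.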

Numbers written in binary can get verbose, but they have one very important property: The number of rules needed to perform arithmetic operations on them is very small compared to the base-10 system. For example, in addition, we have 0 + 0, 0 + 1, 1 + 0, and 1 + 1, with only the last one producing a carry.
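The difference in the number of rules can be counted directly. This Python sketch builds the full single-digit addition table for a given base; the function name `addition_rules` is hypothetical, and each rule is recorded as the pair (carry, digit put down).

```python
from itertools import product

def addition_rules(base: int) -> dict[tuple[int, int], tuple[int, int]]:
    """Map every pair of digits (a, b) to (carry, digit put down) for a + b."""
    return {
        (a, b): divmod(a + b, base)
        for a, b in product(range(base), repeat=2)
    }

binary = addition_rules(2)
for (a, b), (carry, digit) in binary.items():
    print(f"{a} + {b}: put down {digit}, carry {carry}")

print(len(binary))               # 4 rules in base 2
print(len(addition_rules(10)))   # 100 rules in base 10
```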

If we were to build a machine to perform these basic arithmetic operations, it would have to know only a few rules. It would also need a way to represent each of the two digits. This is easy, regardless of whether the machine is mechanical, magnetic, electrical, or electronic, so much so, that modern computers deal directly with base-2 numbers.

We've all heard the word *bit*. Just what is a bit? The term *bit* comes from the words *binary digit*. The longer term is cumbersome; while we can say that a given number can be represented in 16 binary digits, it's much easier to say it can be represented in 16 bits. An 8-bit computer processes values eight bits at a time while a 16-bit computer can manipulate 16 bits at a time. From this, we can reasonably assume that a 64-bit computer is more capable in some sense than a 32-bit computer, a 16-bit computer, and an 8-bit computer, in that order.
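As a rough guide to what those bit counts mean, the following Python sketch shows the largest value that fits in a given number of bits, assuming the bits are read as a plain unsigned number. The helper `max_unsigned` is my own name; `int.bit_length` is a standard Python method.

```python
def max_unsigned(bits: int) -> int:
    """Largest value that fits in the given number of bits, read as unsigned."""
    return 2**bits - 1

for bits in (8, 16, 32, 64):
    print(f"{bits:>2} bits: 0 to {max_unsigned(bits):,}")

# bit_length() reports how many bits a particular value needs.
print((65_535).bit_length())  # 16
print((65_536).bit_length())  # 17
```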

**The Hexadecimal Number System**

After decimal, the most commonly used number system these days is hexadecimal (or *hex* for short). As its name implies, it has a base of 16 and, therefore, needs 16 symbols. The symbols used are 0–9 and A–F where A–F have the values 10–15, respectively. Lowercase a–f can be used instead of A–F, and upper- and lowercase letters can be mixed in the same number. If you think a number containing letters looks silly, remember in the hexadecimal system A–F *are* digits. Just take a mental step back and remember that the assignment of symbols to meanings (as with letters and digits) is simply a convention.

Here are some base-10 numbers and their base-16 equivalents, with the latter written in parentheses: 0 (0_{16}), 1 (1_{16}), 2 (2_{16}), 3 (3_{16}), 4 (4_{16}), 5 (5_{16}), 6 (6_{16}), 7 (7_{16}), 8 (8_{16}), 9 (9_{16}), 10 (A_{16}), 100 (64_{16}), 1,000 (3E8_{16}), 10,000 (2710_{16}), 100,000 (186A0_{16}), and 1,000,000 (F4240_{16}).
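Most programming languages can do these conversions for you. As one example, here is a short Python sketch using only built-in facilities (`format` with the `X` code, and `int` with an explicit base); the variable name `hex_table` is just for illustration.

```python
# Decimal values and their hexadecimal spellings.
hex_table = {n: format(n, "X") for n in (10, 100, 1_000, 10_000, 100_000, 1_000_000)}
for n, digits in hex_table.items():
    print(f"{n:>9,} -> {digits}")

# Going the other way, upper- and lowercase digits are interchangeable,
# even within the same number.
print(int("F4240", 16))  # 1000000
print(int("f4240", 16))  # 1000000
print(int("fF", 16))     # 255
```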

Mere mortals generally don't have to understand the hexadecimal number system, but if you have ever configured a wireless device to connect to the internet using a password or access key, you may have found yourself entering a "secret code" that looks something like 4749D8B754. That's because these things are often required to be hexadecimal numbers. [I've also seen screens full of hexadecimal numbers scrolling or frozen on malfunctioning airport flight displays and even once on an in-flight screen when the video system was being reinitialized.]

**Other Number Systems**

There are an infinite number of number systems; pick a base, choose the corresponding number of distinct symbols, and, *voilà*, you have a new number system. However, you might have difficulty convincing others to use it.

Now you don't have to be a computer or math nerd to deal with number systems other than decimal, although it does help. Back in the *really* old days (even before your parents were born, as in the 3^{rd} millennium BCE), the Sumerians used the sexagesimal number system, which had a base of 60. They very generously passed that along to the Babylonians. Clearly, back then, counting on one's fingers was not possible! There are traces of this concept with us today. For example, in our time system, there are 60 seconds in a minute and 60 minutes in an hour. And when looking at latitude or longitude on a globe, there are 60 seconds in a minute and 60 minutes in a degree.
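You can think of a clock time as a number whose minutes and seconds positions are base-60 digits. Here is a minimal Python sketch of that idea; the function `to_hms` is a hypothetical name, and it leaves the hours position unbounded rather than wrapping at 24.

```python
def to_hms(total_seconds: int) -> tuple[int, int, int]:
    """Split a count of seconds into (hours, minutes, seconds)."""
    minutes, seconds = divmod(total_seconds, 60)   # 60 seconds in a minute
    hours, minutes = divmod(minutes, 60)           # 60 minutes in an hour
    return hours, minutes, seconds

print(to_hms(3725))  # (1, 2, 5), that is, 1 x 3600 + 2 x 60 + 5
```

The same arithmetic converts an angle given in seconds of arc into degrees, minutes, and seconds.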

Another culture that was very advanced with respect to math and science was the Maya. Their number system used base twenty (vigesimal).

While an understanding of alternate number systems and computer-based arithmetic is not necessary for most computer programming tasks, without such knowledge programmers cannot exploit certain kinds of hardware or programming languages, or understand some of the principles on which modern computing is based.

Back in the old days of computing (1970s and earlier) many of us programmers had to be familiar with the octal number system, which has base 8. As you might expect, the digits in that system are 0–7. [I really *do* miss the DEC PDP-11!]
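Octal has not entirely disappeared; many languages still let you write it. The Python sketch below uses the built-in `0o` literal prefix, `oct`, and the `o` format code. The variable name `value` is arbitrary, and the last line shows why octal was convenient: each octal digit stands for exactly three binary digits.

```python
value = 0o755                 # an octal literal: 7 x 64 + 5 x 8 + 5
print(value)                  # 493
print(oct(value))             # 0o755
print(format(0b111_101_101, "o"))  # 755, one octal digit per group of three bits
```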

To read more than you probably want to know about number systems, click here.

**Conclusion**

With respect to numbers and counting I do have a pet peeve, namely, the misuse of *number* when *digit* is meant. The number 123 contains three digits, 1, 2, and 3. It does *not* contain three numbers! That is, a number consists of one or more digits. A number cannot contain another number. So use the terms correctly or *Mr. and Mrs. Spank will have to make a trip to bottyland*! [That's a famous quote from QEI's nanny in an episode of the second series of *Black Adder*.]

We all know about percentages, and are used to seeing signs like "Sale, 50% off!" But did you know that percent had a sibling, per mil? A per mil is a tenth of a percent or one part per thousand, and is written using the symbol ‰. Now let's see who's the first to post a comment about seeing this symbol in a publication.

A number of companies sell calculators that allow numbers to be entered and displayed in various bases. I've been a fan of HP scientific and programmable calculators since 1970, and I own an HP-16C. This calculator supports binary, octal, decimal, and hexadecimal modes of operation; allows ones- or twos-complement or unsigned representation; has bit-manipulation operations; and supports word sizes up to 64 bits. It's so reliable I can always count on it!
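For readers without such a calculator, here is a rough Python sketch of what twos-complement representation in a fixed word size looks like. It is not a model of how the HP-16C works internally; the function `twos_complement` is a hypothetical helper of my own.

```python
def twos_complement(value: int, bits: int) -> str:
    """Bit pattern of value in a word of the given size, twos-complement."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError("value does not fit in the word size")
    mask = (1 << bits) - 1                 # keep only the low-order bits
    return format(value & mask, f"0{bits}b")

print(twos_complement(5, 8))    # 00000101
print(twos_complement(-5, 8))   # 11111011
print(twos_complement(-1, 16))  # 1111111111111111
```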