ASCII is an acronym that you might have heard in relation to computer text, but it’s a term that's rapidly falling out of use thanks to a more powerful newcomer. But what is ASCII, and what is it used for?

What Does ASCII Stand For?

Perhaps the easiest place to begin is the acronym itself, so let’s expand it:

American Standard Code for Information Interchange

This mouthful of a phrase doesn’t really give the complete picture, but some parts immediately offer some clues, notably the first two words. ASCII is an American Standard, the significance of which will soon become apparent.

“Code for Information Interchange” suggests we’re talking about a format for passing data back and forth. Specifically, ASCII deals with textual data: characters making up words in a typically human-readable language.

ASCII solves the problem of how to assign values to letters and other characters so that, when they’re stored as ones and zeroes in a file, they can be translated back into letters when the file is read later. If different computer systems agree on the same code to use, such information can be interchanged reliably.
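
As a rough sketch of that idea, the short Python snippet below (the message text is just an example) turns a string into the numeric codes ASCII assigns to each character, then translates those numbers back into text:

    # Map each character to the numeric code ASCII assigns it,
    # then translate those numbers back into the original text.
    message = "Hi!"
    codes = [ord(ch) for ch in message]
    print(codes)                                    # [72, 105, 33]
    restored = "".join(chr(code) for code in codes)
    print(restored)                                 # Hi!

As long as both sides agree that 72 means "H", 105 means "i", and 33 means "!", the text survives the round trip intact.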

Related: How to Find Symbols and Look Up Their Meanings

The History of ASCII

Sometimes referred to as US-ASCII, ASCII was an American innovation developed in the 1960s. The standard has been revised several times since, most notably in 1977 and in 1986, when it received its last update.

Extensions and variations have built upon ASCII over the years, mainly to cater for the fact that ASCII omits many characters used, or even required, by languages other than US English. ASCII does not even cater for the UK currency symbol (“£”), although the pound is present in Latin-1, an 8-bit extension developed in the 1980s, which encodes several other currencies too.

ASCII was greatly extended and succeeded by Unicode, a much more comprehensive and ambitious standard, which is discussed below. In 2008, Unicode (in its UTF-8 encoding) overtook ASCII as the most popular character encoding on the web.

What Characters Does ASCII Represent?

To a computer, the letter “A” is just as unfamiliar as the color purple or the feeling of jealousy. Computers deal in ones and zeroes, and it’s up to humans to decide how to use those ones and zeroes to represent numbers, words, images, and anything else.

You can think of ASCII as the Morse code of the digital world—the first attempt, anyway. Whilst Morse code is used to represent just 36 different characters (26 letters and 10 digits), ASCII was designed to represent up to 128 different characters in 7 bits of data.

ASCII is case-sensitive, so it includes 52 letters: the 26 letters of the English alphabet in both upper and lower case. Together with the 10 digits, that accounts for roughly half of the 128 available values.

Punctuation, mathematical, and typographic symbols occupy much of the remainder, along with a collection of control characters: special non-printable codes with functional meanings (see below for more).

Here are some typical characters that ASCII encodes:

Binary      Decimal   Character
010 0001    33        !
011 0000    48        0
011 1001    57        9
011 1011    59        ;
100 0001    65        A
100 0010    66        B
101 1010    90        Z
101 1011    91        [
110 0001    97        a
110 0010    98        b
111 1101    125       }

Note that the values chosen have some useful properties, in particular:

  • Letters of the same case can always be sorted numerically since they're in order. For example, A has a lower value than B, which has a lower value than Z.
  • Letters of different cases are offset by exactly 32. This makes it very easy to translate between lower and upper case since just a single bit needs to be switched for each letter, either way.
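
Here's a minimal Python sketch of that second property (the letters chosen are just examples), toggling bit 5, which has the value 32, to switch a letter's case:

    # Upper and lower case letters differ only in bit 5 (value 32),
    # so flipping that single bit switches the case either way.
    for letter in "aZ":
        flipped = chr(ord(letter) ^ 0b0010_0000)
        print(letter, "->", flipped)               # a -> A, Z -> z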

Control Characters

Other than letters, punctuation, and digits, ASCII can represent a number of control characters: special codes that don't produce a visible character but instead carry formatting or signalling information for whatever is consuming the data.

For example, ASCII 000 1001 is the horizontal tab character. It represents the space you’ll get when you press the TAB key. You won’t typically see such characters directly, but their effect will often be shown. Here are some more examples:

Binary      Decimal   Character
000 1001    9         Horizontal Tab
000 1010    10        Line Feed
001 0111    23        End of Transmission Block
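
You can see this in any language that lets you inspect character codes. In the small Python illustration below (the string is arbitrary), the tab and line feed appear as codes 9 and 10, and they affect layout rather than printing as visible symbols:

    # Control characters occupy a code each but don't print as symbols.
    # "\t" is ASCII 9 (horizontal tab) and "\n" is ASCII 10 (line feed).
    line = "name\tvalue\n"
    print([ord(ch) for ch in line])   # [110, 97, 109, 101, 9, 118, 97, 108, 117, 101, 10]
    print(line, end="")               # the tab shows as spacing, the line feed ends the line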

What About Other Characters?

ASCII was enormously successful during the early days of computing since it was simple and widely adopted. However, in a world with a more international outlook, one writing system just won’t cut it. Modern communications need to be possible in French, Japanese—in fact, any language we might want to store text in.

The Unicode character set can address a total of 1,112,064 different characters, although only about one-tenth of those are actually currently defined. That might sound like a lot, but the encoding aims not only to cater for tens of thousands of Chinese characters, it also covers emoji (nearly one and a half thousand) and even extinct writing systems such as Egyptian hieroglyphs.

Related: The 100 Most Popular Emojis Explained
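
To get a feel for how far beyond ASCII's 128 values Unicode reaches, here's a small Python sketch printing the code point of a few characters (the examples are arbitrary); only the first falls within ASCII's range:

    # Every Unicode character has a numeric code point; only values
    # below 128 (hex 0x80) overlap with ASCII.
    for ch in ["A", "£", "日", "😀"]:
        print(ch, hex(ord(ch)))        # 0x41, 0xa3, 0x65e5, 0x1f600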

Unicode acknowledged ASCII’s dominance in its choice of the first 128 characters: they are exactly the same as ASCII. This allows ASCII-encoded files to be used in situations where Unicode is expected, providing backward compatibility.
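
That backward compatibility is easy to check in practice. In the sketch below, text containing only ASCII characters produces exactly the same bytes whether it's encoded as ASCII or as UTF-8, the most common Unicode encoding:

    # Pure ASCII text encodes to identical bytes under ASCII and UTF-8,
    # which is why old ASCII files read correctly as UTF-8.
    text = "Plain ASCII text!"
    print(text.encode("ascii") == text.encode("utf-8"))   # True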

Summary

ASCII text represents the 26 letters of the English alphabet (in both upper and lower case), with digits, punctuation, and a few other symbols thrown in. It served its purpose very well for the best part of half a century.

It has now been superseded by Unicode, which supports a huge number of languages and other symbols, including emoji. UTF-8 is, for all practical purposes, the encoding that should be used to represent Unicode characters online.