We invented computers, and we made them in our image and likeness. As a result, computers inherited all our intellectual traits, including our way of processing, sending, and receiving information. So, if you thought computers had no humanity, you couldn't be more wrong. If anything, they are so much like us that they could be called our greatest impersonators. In this article, I will explain how computers exchange textual information.
We'll start with our way of exchanging semantic (textual) information. Our vocal organs can generate various sounds that we assign to particular alphabetic letters. The sounds are arranged into words, and the words into sentences. Sounds propagate through the air as waves carrying information from one person to another. The process of exchanging information can be divided into the following stages.
1. Our brain analyses the message we want to pass to another person and converts it into the symbolic language of sounds. Let's name this process encoding.
2. Our vocal organs generate the required sound signals, which travel through the air and reach another person. Let's name the vocal organs a transmitter.
3. The other person picks up the sounds from the air with their ears. Let's name the ears a receiver.
4. The other person's brain converts these sounds back into semantic information. Let's name this process decoding.
Computers copy our mode of communication nearly to a T. But instead of sound signals, they use far superior electromagnetic ones. Electromagnetic signals also propagate as waves. But unlike sounds, they can also travel through empty space, allowing for the development of satellite internet. The main advantage, however, is speed. Electromagnetic waves are light (visible and invisible) and move nearly 900,000 times faster than sound does in air (about 300,000,000 m/s against about 340 m/s), which means the information they carry also moves nearly 900,000 times faster. Apart from that, computers follow the same pattern.
1. The program installed in your computer processes the information you type in and converts it into the symbolic language of electromagnetic signals (encoding).
2. The transmitter in your computer generates the required signals, which travel through space and reach another computer.
3. The receiver in the other computer picks these signals up from space.
4. The program installed in that computer converts the symbolic language of electromagnetic signals back into text, which appears on the computer screen (decoding).
How computers encode text
We could assign a particular electromagnetic signal to each alphabetic letter by analogy with our way of encoding sounds. But our vocal organs come for free, whereas building a transmitter that can generate 26 different signals will cost money. Besides, apart from the letters (uppercase and lowercase), our texts include numbers, punctuation marks, and math, currency and other symbols. If you look at your keyboard, you will also find control keys, such as "Delete", "Backspace", and "Escape", responsible for making texts presentable. Adding together all letters, numbers, punctuation marks, symbols, and controls, we come to a set of 256 characters, with which modern computers can transmit texts of any length and complexity.
The problem of building a transmitter capable of generating 256 different signals can be resolved if we number the characters. Our numeral system is very compact and uses only ten digits (0-9) to denote any number, big or small. This way, we can reduce the variety of signals to 10. And if we assign a "no signal" pause to digit 0, we can reduce the signals to 9 in total.
Note that the transmission time for each character should be the same to avoid any confusion between them. But characters numbered 0-9 will have a single-digit format, characters numbered 10-99 a two-digit format, and characters numbered 100-255 a three-digit format. To bring all characters into one format, we must add 00 in front of all single-digit numbers and 0 in front of all two-digit numbers. Zeros in front of a number don't change its value: number 006, for example, remains number 6.
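The padding rule can be tried out in a few lines of Python. This is only a sketch of the decimal illustration above, not of how computers really send text, and the function name decimal_code is made up for the example.

```python
def decimal_code(number: int, width: int = 3) -> str:
    """Return a character number as a fixed-width decimal string."""
    if not 0 <= number <= 255:
        raise ValueError("character numbers run from 0 to 255")
    # zfill adds zeros in front until the string is `width` digits long.
    return str(number).zfill(width)


for number in (6, 45, 255):
    print(number, "->", decimal_code(number))
# 6 -> 006
# 45 -> 045
# 255 -> 255
```

Converting "006" back with int("006") gives 6 again, which is why the extra zeros are harmless.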
Why do computers use a binary system?
The variety of signals can be reduced to an absolute minimum if we adopt a binary system. The advantage of the binary over decimal is that it uses only two digits (0 and 1) to denote any number, big or small. We have already assigned a "no signal" pause to digit 0. So, all the transmitter needs to do now is generate one kind of signal, switching it on for a digit 1 and off for a digit 0. It really can't get any better than that!
Expressing characters in binary also helps to solve the problem of noise in communication systems. Noise is the unwanted electromagnetic waves present in the atmosphere and unwanted electrical current inside the computer circuitry. Noise can interfere with the information-carrying signals and distort their characteristics. Consequently, one character can be mistaken for another, causing a breakdown in communication. But when the receiving computer must distinguish between just two options, such as the presence of a signal and the absence of it, the problem of bad reception can be reduced to a minimum.
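Here is a small Python sketch of why two signal levels tolerate noise so well. The noise values and the 0.5 threshold are numbers I made up for illustration; real receivers are more sophisticated, but the idea of deciding between "signal" and "no signal" is the same.

```python
def receive(levels, threshold=0.5):
    """Decide each bit by comparing the received level with a threshold."""
    return [1 if level > threshold else 0 for level in levels]


sent = [0, 1, 0, 0, 1, 1, 1, 0]  # eight on/off signals
# Made-up disturbance added to every signal on its way to the receiver.
noise = [0.2, -0.3, 0.1, -0.1, 0.25, -0.2, 0.3, 0.15]
received = [bit + disturbance for bit, disturbance in zip(sent, noise)]

decoded = receive(received)
print(decoded == sent)
# True
```

Every received level is distorted, yet each one still lands on the correct side of the threshold, so the message survives intact.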
Binary is not just a numeral system. It's also a logical tool for finding solutions to problems with two mutually exclusive alternatives, denoted as true-false or present-absent. In that sense, binary gives us extra flexibility in choosing an information carrier. In satellite internet, the carrier is the electromagnetic wave propagating through space. But in the wired internet, the chosen carrier is the electrical current flowing through copper cables. A binary platform works equally well for both, with no additional requirements on the computer design. The signal is generated in both cases by turning the voltage on for the digit 1 and off for the digit 0. It's hardly surprising then that modern communication systems have widely adopted the binary platform as the most adaptable and reliable.
How we encode text in binary
So, we have 256 characters, numbered from 0 to 255. The lowest number, 0, is 0 in binary and has a one-digit format. The highest number, 255, is 1111 1111 in binary and has an eight-digit format. We must bring all binary numbers into one format by adding 0s in front of those that need them. Binary 0 (0) will become 0000 0000, binary 1 (1) will become 0000 0001, and binary 101101 (45) will become 0010 1101. All 0s in front of the first 1 in binary codes have no meaning apart from bringing all codes into one format. We did the same with decimal numbers earlier. Computers don't send signals in decimal, so that exercise served only as an illustration.
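The same conversion can be checked in Python. This is a minimal sketch, and the helper name to_byte is my own; the space in the middle of each code is only there to match the way the codes are written in this article.

```python
def to_byte(number: int) -> str:
    """Return a number from 0 to 255 as eight binary digits, grouped in fours."""
    if not 0 <= number <= 255:
        raise ValueError("only numbers from 0 to 255 fit into eight digits")
    bits = format(number, "08b")  # binary, padded with 0s to eight digits
    return bits[:4] + " " + bits[4:]


for number in (0, 1, 45, 255):
    print(number, "->", to_byte(number))
# 0 -> 0000 0000
# 1 -> 0000 0001
# 45 -> 0010 1101
# 255 -> 1111 1111
```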
A binary digit is called a bit.
0 = one bit
1 = one bit
A sequence of 8 bits is called a byte.
8 bits = one byte.
The character set that uses 256 characters, 8 bits each, is called the extended ASCII (American Standard Code for Information Interchange). The original ASCII was developed in the 1960s for the English-speaking population and can still be used in simple texts like emails. It contains 128 characters (0-127), the highest of which, #127, is 1111111 in binary. Consequently, it has a 7-bit format.
It was extended in the 1970s to incorporate other Latin-based alphabets, such as French and Spanish, as well as additional currency and other symbols. The software installed on your computers and mobiles uses it to encode the text you send into binary signals and to decode the binary signals you receive back into text. The extended ASCII includes all ASCII characters, adding a 0 in front of them to make them the standard 8 bits wide, another proof of how adaptable a binary platform is.
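To see the 7-bit and 8-bit forms side by side, here is a short Python sketch. It relies on the fact that Python's built-in ord and chr agree with ASCII for the numbers 0-127; the function names encode_char and decode_char are invented for the example.

```python
def encode_char(character: str, width: int = 8) -> str:
    """Return the ASCII number of a character as a binary code of `width` bits."""
    number = ord(character)
    if number > 127:
        raise ValueError("not an ASCII character")
    return format(number, f"0{width}b")


def decode_char(code: str) -> str:
    """Turn a binary code back into the character it stands for."""
    return chr(int(code, 2))


for character in ("N", "o", "."):
    print(ord(character), encode_char(character, 7), encode_char(character, 8))
# 78 1001110 01001110
# 111 1101111 01101111
# 46 0101110 00101110

print(decode_char("01001110"))
# N
```

The 8-bit code is just the 7-bit code with a 0 in front, and both decode to the same character.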
The byte (1 byte = 8 bits) has been accepted as the standard unit of information, denoted as B. Now you know what you pay for when buying, say, 500 MB (500 x 1,000,000 bytes) of internet data for your mobile or computer.
By the end of the millennium, computers had firmly established themselves worldwide, becoming a household staple and prompting the development of the Unicode standard, which incorporated other known writing systems, such as Chinese, Arabic, and even ancient Egyptian hieroglyphics. Then arrived a new emoji craze from Japan, and Unicode gained popularity by giving a home to thousands of them on its extensive platform. An increased number of characters resulted in lengthier binary codes. As a result, Unicode started out with a 16-bit = 2-byte format, where 1 character = 2 bytes. Its current encodings, such as UTF-8 and UTF-16, use from one to four bytes per character.
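How many bytes a character actually occupies depends on the encoding used to store Unicode text. The sketch below uses Python's built-in UTF-8 and UTF-16 codecs to count the bytes for a few sample characters. The samples and the helper name byte_length are my own choices for illustration.

```python
def byte_length(character: str, encoding: str) -> int:
    """Return how many bytes one character takes in the given encoding."""
    return len(character.encode(encoding))


samples = {
    "Latin letter N": "N",
    "French e with acute accent": "\u00e9",
    "Chinese character zhong": "\u4e2d",
    "grinning face emoji": "\U0001F600",
}

for name, character in samples.items():
    # "utf-16-be" is UTF-16 without the byte order mark, so only the
    # character itself is counted.
    print(name, byte_length(character, "utf-8"), byte_length(character, "utf-16-be"))
# Latin letter N 1 2
# French e with acute accent 2 2
# Chinese character zhong 3 2
# grinning face emoji 4 4
```

Plain English text stays at one byte per character in UTF-8, while the emoji needs four bytes in either encoding.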
With further progress in the computer industry, 32-bit and later 64-bit operating systems have been developed to increase the data processing power of computers. A 32-bit system can process 32 bits (4 bytes) of data at a time, while a 64-bit system takes this further to 64 bits (8 bytes). To ensure a seamless implementation of new programs into an existing framework, the bit/byte count doubles at each step, following the powers of two. The interim counts, like 24-bit and 40-bit, are rarely used.
2^3 = 8-bit = 1-byte. Can hold 2^8 = 256 different characters
2^4 = 16-bit = 2-byte. Can hold 2^16 = 65,536 different characters
2^5 = 32-bit = 4-byte. Can hold 2^32 = 4,294,967,296 different characters
2^6 = 64-bit = 8-byte. Operating systems
2^7 = 128-bit = 16-byte. Nothing so far
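The numbers in this list are easy to reproduce. The following Python sketch recomputes them; the function name capacity is made up for the example.

```python
def capacity(bits: int) -> int:
    """Return how many different codes fit into the given number of bits."""
    return 2 ** bits


for exponent in range(3, 8):
    bits = 2 ** exponent  # 8, 16, 32, 64, 128
    print(f"2^{exponent} = {bits}-bit = {bits // 8}-byte: {capacity(bits):,} codes")
```

Each extra bit doubles the number of available codes, which is why 16 bits hold 256 times more codes than 8 bits, not twice as many.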
How we transmit text in binary
Let's assume we want to send the word No in the message. The uppercase letter N is #78 in the extended ASCII and has binary code 0100 1110. The transmitter will generate a sequence of 0 and 1 signals, as shown below. If N starts a new sentence, it will be preceded by the #46 Full stop character (binary code 0010 1110) and the #32 Space character (binary code 0010 0000). It will be followed by #111, the letter o (binary code 0110 1111). Note that all 0 and 1 signals have equal time intervals.
Transmission of a letter N coded 01001110 in the extended ASCII.
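The whole journey of the word No can be imitated in Python. This is a simplified sketch that assumes one on/off signal per bit and ignores everything a real transmitter adds on top; the function names to_signal and from_signal are invented for the example.

```python
def to_signal(text: str) -> list[int]:
    """Encode text as a flat list of on (1) and off (0) signals, 8 per character."""
    signal = []
    for character in text:
        number = ord(character)
        if number > 255:
            raise ValueError("character does not fit into one byte")
        signal.extend(int(bit) for bit in format(number, "08b"))
    return signal


def from_signal(signal: list[int]) -> str:
    """Decode a list of on/off signals back into text, 8 signals at a time."""
    characters = []
    for start in range(0, len(signal), 8):
        byte = signal[start:start + 8]
        characters.append(chr(int("".join(str(bit) for bit in byte), 2)))
    return "".join(characters)


signal = to_signal("No")
# Draw the signal: "#" means the signal is on, "_" means a pause.
print("".join("#" if bit else "_" for bit in signal))
# _#__###__##_####
print(from_signal(signal))
# No
```

The receiver only has to count eight equal time intervals per character to know where one letter ends and the next begins.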