Text B
Information Theory
Claude Shannon laid the foundation of information theory in 1948. His paper “A Mathematical Theory of Communication,” published in the Bell System Technical Journal, is the basis for all the telecommunications developments that have taken place during the last five decades. A good understanding of the concepts proposed by Shannon is a must for every budding telecommunications professional.
In any communication system, there will be an information source that produces information in some form and an information sink that absorbs the information. The communication medium connects the source and the sink. The purpose of a communication system is to transmit the information from the source to the sink without errors. However, the communication medium always introduces some errors because of noise. The fundamental requirement of a communication system is to transmit the information without errors in spite of the noise.
The requirement of a communication system is to transmit the information from the source to the sink without errors, in spite of the fact that noise is always introduced in the communication medium.
In a generic communication system, the information source produces symbols (English letters, speech, video, etc.) that are sent through the transmission medium by the transmitter. The communication medium introduces noise, and so errors are introduced in the transmitted data. At the receiving end, the receiver decodes the data and gives it to the information sink.
As an example, consider an information source that produces two symbols, A and B. The transmitter codes the data into a bit stream. For example, A can be coded as 1 and B as 0. The stream of 1's and 0's is transmitted through the medium. Because of noise, a 1 may become a 0 or a 0 may become a 1 at random places, as illustrated below:

Transmitted: 1 0 0 1 0 1 1 0
Received:    1 0 0 1 1 1 1 0
At the receiver, one bit is received in error. How can we ensure that the received data is made error free? Shannon provides the answer.
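The effect of channel noise can be simulated in a few lines of code. The following sketch (an illustration, not from the text) models a channel that flips each transmitted bit with a small probability; the flip probability and the message are illustrative assumptions.

```python
import random

def transmit(bits, flip_prob=0.05):
    """Model a noisy channel: each bit is flipped with probability flip_prob."""
    return [bit ^ 1 if random.random() < flip_prob else bit for bit in bits]

# Code A as 1 and B as 0, as in the example above.
message = "ABBA"
sent = [1 if symbol == "A" else 0 for symbol in message]
received = transmit(sent)
print("sent:    ", sent)
print("received:", received)  # may differ from 'sent' at random places
```

Run repeatedly, the received stream occasionally differs from the sent stream; this is exactly the problem the rest of the system is designed to solve.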
In this block diagram (Figure 1.2), the information source produces the symbols that are coded using two types of coding (source encoding and channel encoding) and then modulated and sent over the medium. At the receiving end, the modulated signal is demodulated, and the inverse operations of channel encoding and source encoding (channel decoding and source decoding) are performed. Then the information is presented to the information sink.
As proposed by Shannon, the communication system consists of a source encoder, channel encoder, and modulator at the transmitting end, and a demodulator, channel decoder, and source decoder at the receiving end.
Figure 1.2 The communication system model
Information source: The information source produces the symbols. If the information source is, for example, a microphone, the signal is in analog form. If the source is a computer, the signal is in digital form (a set of symbols).
Source encoder: The source encoder converts the signal produced by the information source into a data stream. If the input signal is analog, it can be converted into digital form using an analog-to-digital converter. If the input to the source encoder is a stream of symbols, it can be converted into a stream of 1's and 0's using some type of coding mechanism. For instance, if the source produces the symbols A and B, A can be coded as 1 and B as 0. Shannon's source coding theorem tells us how to do this coding efficiently.
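As a sketch of what coding "efficiently" means, the following example (an illustration, not Shannon's own construction) builds a Huffman code, a standard prefix code in which frequent symbols get shorter codewords; the symbol probabilities are assumptions chosen for illustration.

```python
import heapq

def huffman_code(probabilities):
    """Build a prefix code: frequent symbols receive shorter bit strings."""
    # Each heap entry: (probability, unique tie-breaker, {symbol: codeword-so-far})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)    # two least probable groups
        p2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

print(huffman_code({"A": 0.5, "B": 0.25, "C": 0.15, "D": 0.10}))
# -> {'A': '0', 'B': '10', 'D': '110', 'C': '111'}: A gets the shortest code
```

With these probabilities, the average codeword length is 1.75 bits per symbol, compared with 2 bits for a fixed-length code.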
Source encoding is done to reduce the redundancy in the signal. Source coding techniques can be divided into lossless encoding techniques and lossy encoding techniques. In lossy encoding techniques, some information is lost.
In source coding, there are two types of coding: lossless coding and lossy coding. In lossless coding, no information is lost. When we compress our computer files using a compression technique (for instance, WinZip), there is no loss of information. Such coding techniques are called lossless coding techniques. In lossy coding, some information is lost while doing the source coding. As long as the loss is not significant, we can tolerate it. When an image is converted into JPEG format, the coding is lossy coding because some information is lost. Most of the techniques used for voice, image, and video coding are lossy coding techniques.
The compression utilities we use to compress data files use lossless encoding techniques. JPEG image compression is a lossy technique because some information is lost.
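A lossless round trip is easy to demonstrate with Python's standard zlib module (the same family of techniques used by file compression utilities); the sample data is an illustrative assumption.

```python
import zlib

original = b"ABBA " * 1000            # highly redundant data compresses well
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "->", len(compressed), "bytes")
assert restored == original           # lossless: every byte is recovered exactly
```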
Channel encoder: If we have to decode the information correctly even when errors are introduced in the medium, we need to put some additional bits in the source-encoded data so that this additional information can be used to detect and correct the errors. This process of adding bits is done by the channel encoder. Shannon's channel coding theorem tells us how to achieve this.
In channel encoding, redundancy is introduced so that at the receiving end, the redundant bits can be used for error detection or error correction.
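The simplest illustration of channel encoding is a repetition code. The sketch below (an illustration, not a practical code) sends each bit three times, and the decoder takes a majority vote, so any single bit error within a group of three is corrected.

```python
def channel_encode(bits):
    """Repeat every bit three times (the redundant bits)."""
    return [b for b in bits for _ in range(3)]

def channel_decode(coded):
    """Majority vote over each group of three received bits."""
    return [1 if sum(coded[i:i + 3]) >= 2 else 0 for i in range(0, len(coded), 3)]

coded = channel_encode([1, 0, 1])     # -> [1, 1, 1, 0, 0, 0, 1, 1, 1]
coded[4] = 1                          # the channel flips one bit
print(channel_decode(coded))          # -> [1, 0, 1]: the error is corrected
```

Practical systems use far more efficient codes, but the principle of adding redundancy is the same.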
Modulation: Modulation is the process of transforming the signal so that it can be transmitted through the medium.
Demodulator: The demodulator performs the inverse operation of the modulator.
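As a rough sketch of these two operations, the following example assumes a simple binary scheme (1 maps to +1.0 and 0 to -1.0, a baseband stand-in for BPSK), lets the medium add noise, and demodulates by thresholding; the noise level is an illustrative assumption.

```python
import random

def modulate(bits):
    """Map bits onto signal levels suitable for the medium."""
    return [1.0 if b else -1.0 for b in bits]

def demodulate(signal):
    """Inverse operation: decide each bit from the received level."""
    return [1 if s > 0.0 else 0 for s in signal]

sent = [1, 0, 1, 1]
noisy = [s + random.gauss(0, 0.3) for s in modulate(sent)]  # medium adds noise
print(demodulate(noisy))   # usually recovers [1, 0, 1, 1]
```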
Channel decoder: The channel decoder analyzes the received bit stream and detects and corrects the errors, if any, using the additional data introduced by the channel encoder.
Source decoder: The source decoder converts the bit stream back into the actual information. If analog-to-digital conversion is done at the source encoder, digital-to-analog conversion is done at the source decoder. If the symbols are coded into 1's and 0's at the source encoder, the bit stream is converted back into the symbols by the source decoder.
Information sink: The information sink absorbs the information.
What is information? How do we measure information? These are fundamental questions for which Shannon provided the answers. We can say that we have received some information if there is a decrease in uncertainty. Consider an information source that produces two symbols, A and B. The source has sent A, B, B, A, and now we are waiting for the next symbol. Which symbol will it produce? If it produces A, the uncertainty that was there during the waiting period is gone, and we say that information has been produced. Note that we are using the term information from a communication theory point of view; it has nothing to do with the usefulness of the information.
Shannon proposed a formula to measure information. The information measure is called the entropy of the source. If a source produces N symbols, and all the symbols are equally likely to occur, the entropy of the source is given by:

H = log2 N bits per symbol
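For a source whose symbols are not equally likely, the general formula is H = -sum(p_i * log2 p_i), which reduces to log2 N when all probabilities are equal. The sketch below computes it; the probabilities are illustrative assumptions.

```python
from math import log2

def entropy(probabilities):
    """Entropy in bits per symbol: H = -sum(p * log2(p))."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit/symbol: two equally likely symbols
print(entropy([0.25] * 4))   # 2.0 bits/symbol: log2(4) = 2
print(entropy([0.9, 0.1]))   # ~0.469 bits: a predictable source carries less information
```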
Shannon introduced the concept of channel capacity, the limit at which data can be transmitted through a medium. The errors in the transmission medium depend on the energy of the signal, the energy of the noise, and the bandwidth of the channel. Conceptually, if the bandwidth is high, we can pump more data into the channel. If the signal energy is high, the effect of noise is reduced. According to Shannon, the channel capacity, the bandwidth of the channel, the signal energy, and the noise energy are related by the formula:

C = W log2 (1 + S/N)
where C is the channel capacity in bits per second (bps), W is the bandwidth of the channel in Hz, and S/N is the signal-to-noise power ratio (SNR). The channel capacity obtained using this formula is a theoretical maximum: we cannot transmit data through a given channel, such as a voice-grade line, at a rate faster than this value. An important point to note is that in the above formula, Shannon assumes only thermal noise. To increase C, can we increase W? No, because increasing W increases the noise as well, and the SNR will be reduced. To increase C, can we increase the signal power instead? Only up to a point, because high signal power introduces another kind of noise, called intermodulation noise.
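As a worked instance of the formula, assume typical voice-grade-line figures of W = 3100 Hz and an SNR of 30 dB (a power ratio of 1000); these numbers are assumptions chosen for illustration.

```python
from math import log2

W = 3100                 # bandwidth of the channel in Hz
snr = 1000               # S/N as a power ratio (30 dB)
C = W * log2(1 + snr)    # Shannon capacity formula
print(round(C))          # ~30898 bps: the theoretical maximum for this channel
```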
The entropy of the information source and the channel capacity are two important concepts, based on which Shannon proposed his theorems.
In a digital communication system, the aim of the designer is to convert any information into a digital signal, pass it through the transmission medium and, at the receiving end, reproduce the digital signal exactly. To achieve this objective, two important requirements are:
▶ To code any type of information into digital format. Note that the world is analog: voice signals are analog, images are analog. We need to devise mechanisms to convert analog signals into digital format. If the source produces symbols (such as A and B), we also need to convert these symbols into a bit stream. This coding has to be done efficiently so that the smallest number of bits is required for coding (a sketch follows this list).
▶ To ensure that the data sent over the channel is not corrupted. We cannot eliminate the noise introduced on the channel, and hence we need to introduce special coding techniques to overcome the effect of noise.
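As a sketch of the first requirement, the following example samples an analog waveform (a sine wave) and quantizes each sample to 8 levels, i.e. 3 bits per sample; the number of samples and levels are illustrative assumptions.

```python
from math import sin, pi

SAMPLES = 8
LEVELS = 8                                   # 8 levels -> 3 bits per sample

for n in range(SAMPLES):
    x = sin(2 * pi * n / SAMPLES)            # analog value in [-1, 1]
    level = min(int((x + 1) / 2 * LEVELS), LEVELS - 1)   # quantize to 0..7
    print(f"{x:+.3f} -> {level:03b}")        # 3-bit digital representation
```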
These two aspects were addressed by Claude Shannon in his classic paper “A Mathematical Theory of Communication,” published in 1948 in the Bell System Technical Journal, which laid the foundation of information theory. Shannon addressed these two aspects through his source coding theorem and channel coding theorem.