In the beginning was the Word.
Information Theory: taught in English and Japanese in alternate years
This year the course is taught in Japanese, but the slides are in English.
Video-recorded English classes: Lecture Archives 2011
Information Theory （情報理論）
In this class, we learn the basics of information theory.
(half undergraduate level + half graduate school level)
Claude E. Shannon
This class consists of four chapters (+ this introduction):
To understand our problem, let us go back to the 1940s...
They already had “digital communication”.
No computers yet, but there were “machines”...
Teletype model 14-KTR, 1940
Efficiency and reliability were the two major problems.
A communication system can be modeled as:
C.E. Shannon, A Mathematical Theory of Communication,
The Bell System Technical Journal, 27, pp. 379–423, 623–656, 1948.
Communication is efficient if the size of B is small.
Example: You need to record the weather of Tokyo every day.
Can we shorten the representation?
Code B gives a shorter representation than code A.
Sometimes, events are not equally likely...
2×0.5 + 2×0.3 + 1×0.2 = 1.8 bits / event on average
1×0.5 + 2×0.3 + 2×0.2 = 1.5 bits / event on average
Can we represent information with 0.00000000001 bit per event?
→ For this event set, we need 1.485 bits or more per event.
This is the amount of information
which must be carried by the code.
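The averages and the lower bound quoted above can be checked numerically. The sketch below assumes three events with probabilities 0.5, 0.3 and 0.2, and codeword lengths (2, 2, 1) and (1, 2, 2) bits as read off the two averages; it takes the lower bound to be the entropy of that distribution. All names are illustrative.

```python
import math

# Event probabilities (assumed from the arithmetic in the text).
probs = [0.5, 0.3, 0.2]

# Codeword lengths in bits for the two candidate codes (same assumption).
lengths_first = [2, 2, 1]
lengths_second = [1, 2, 2]


def average_length(probs, lengths):
    """Expected number of bits per event."""
    return sum(p * n for p, n in zip(probs, lengths))


def entropy(probs):
    """Shannon entropy in bits: -sum of p * log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


avg_first = average_length(probs, lengths_first)
avg_second = average_length(probs, lengths_second)
bound = entropy(probs)

print(f"first code : {avg_first:.3f} bits/event")
print(f"second code: {avg_second:.3f} bits/event")
print(f"entropy    : {bound:.3f} bits/event")
```

Giving the shortest codeword to the most likely event lowers the average, but neither code goes below the entropy.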
information in a quantitative manner.
Communication is reliable if A = D or A ≈ D.
Communication is not always reliable.
Alpha, Bravo, Charlie
→ use this mechanism over 0-1 data, and we can correct errors!
redundant (冗長な) information
for correcting possible errors
Q. Can we add “redundancy” to binary data?
A. Yes, use parity bits.
A parity bit is...
a binary digit added so that the number of 1's in the data becomes even.
One parity bit can tell you that an odd number of errors has occurred,
but nothing more than that.
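A minimal sketch of a single even-parity bit, showing that one flipped bit is detected while two flipped bits go unnoticed. The function names `add_parity` and `has_odd_errors` are illustrative.

```python
def add_parity(bits):
    """Append one bit so that the total number of 1's is even."""
    parity = sum(bits) % 2
    return list(bits) + [parity]


def has_odd_errors(word):
    """True if the number of 1's is odd, i.e. an odd number of bits flipped."""
    return sum(word) % 2 == 1


word = add_parity([1, 0, 1, 1])   # three 1's, so the parity bit is 1

one_error = word.copy()
one_error[0] ^= 1                 # flip one bit

two_errors = word.copy()
two_errors[0] ^= 1                # flip two bits
two_errors[1] ^= 1

print(has_odd_errors(word))        # no error
print(has_odd_errors(one_error))   # detected
print(has_odd_errors(two_errors))  # not detected
```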
basic idea: use several parity bits to correct errors
Example: Add five parity bits to four-bit data (a0, a1, a2, a3).
(a0, a1, a2, a3, p0, p1, q0, q1, r)
This code corrects a one-bit error,
but it is too straightforward.
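The slide does not spell out how the five parity bits are computed, so the sketch below assumes a common layout: the data bits are arranged in a 2×2 square, p0 and p1 are row parities, q0 and q1 are column parities, and r is the parity of p0 and p1. Under that assumption, a single flipped bit can be located by trying each position; `encode`, `is_codeword` and `correct` are hypothetical names.

```python
def encode(a):
    """Assumed layout:   a0 a1 | p0
                         a2 a3 | p1
                         ------+---
                         q0 q1 | r
    """
    a0, a1, a2, a3 = a
    p0, p1 = a0 ^ a1, a2 ^ a3      # row parities
    q0, q1 = a0 ^ a2, a1 ^ a3      # column parities
    r = p0 ^ p1                    # parity of the parity bits
    return [a0, a1, a2, a3, p0, p1, q0, q1, r]


def is_codeword(word):
    """A 9-bit word is valid if re-encoding its data bits reproduces it."""
    return encode(word[:4]) == list(word)


def correct(word):
    """Return the codeword within one bit flip of `word`, or None."""
    word = list(word)
    if is_codeword(word):
        return word
    for i in range(len(word)):
        candidate = word.copy()
        candidate[i] ^= 1          # try flipping bit i back
        if is_codeword(candidate):
            return candidate
    return None                    # more than one error


sent = encode([1, 0, 1, 1])
received = sent.copy()
received[2] ^= 1                   # one-bit error on a2
print(correct(received) == sent)
```

Trying every position is the "too straightforward" approach in spirit: it works for nine bits but says nothing about how few parity bits are really needed.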
statistics in 2011: A ... 51 / B ... 20 / C ... 18 / did not pass ... 13
“To tell plenty of things, we need more words.”
...maybe true, but can you give the proof of this statement?
We will need to...
and its representation.
Chapter 1 focuses on the first step above.
Information tells what has happened at the information source.
the difference of uncertainty = the amount of information
FIRST, we need to measure the uncertainty of the information source.
this difference indicates
the amount of information
The uncertainty is defined according to the statistics (統計量), but
we do not have enough time today...
In the rest of today’s talk,
we study two typical information sources.
In this class, we assume that...
(discrete-time information source)
(digital information source)
Note however that, in the real world,
there are continuous-time and/or analogue information sources.
(S is said to be a k-ary information source.)
Example: S = fair dice
if the message is
A memoryless & stationary information source satisfies...
“A symbol is chosen independently from past symbols.”
“The probability distribution is time invariant.”
the same probability distribution
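A memoryless and stationary source can be simulated by drawing every symbol independently from one fixed distribution. The sketch below uses a fair die as the example; the helper name `memoryless_source` and the seed parameter are illustrative.

```python
import random


def memoryless_source(symbols, probs, n, seed=0):
    """Produce n symbols, each drawn independently from the same distribution."""
    rng = random.Random(seed)      # fixed seed only for reproducibility
    return rng.choices(symbols, weights=probs, k=n)


# Fair die: six symbols, each with probability 1/6 at every time step.
die = memoryless_source([1, 2, 3, 4, 5, 6], [1 / 6] * 6, n=12)
print(die)
```

Because the draw ignores all earlier symbols and the weights never change, both properties hold by construction.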
Examples of memoryless & stationary information sources:
information sources with memory:
non-stationary information sources:
Markov information source
at most m previous symbols
(m-th order Markov source)
m = 0: memoryless source
m = 1: simple Markov source
1-bit register
Example of (simple) Markov source
S ... memoryless & stationary source with P(0) = q, P(1) = 1 – q
[State diagram; edge labels (output / probability): 1 / 1–q, 0 / q, 1 / q, 0 / 1–q]
Markov source as a finite state machine
m-th order k-ary Markov source:
finite state machine
two important properties
irreducible (既約) Markov source:
this example is NOT irreducible
this example is NOT aperiodic
irreducible + aperiodic = regular
start from the state 0
start from the state 1
converge (収束する) to
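1 / 0.2
computation of the stationary probabilities
Let $\alpha_t$ and $\beta_t$ be the probabilities of being in state 0 and state 1 at time t:
$\alpha_{t+1} = 0.9\,\alpha_t + 0.8\,\beta_t$
$\beta_{t+1} = 0.1\,\alpha_t + 0.2\,\beta_t$
$\alpha_{t+1} + \beta_{t+1} = 1$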
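The convergence can be checked by iterating the recurrence from both starting states. This sketch assumes a two-state source in which state 0 is entered with probability 0.9 from state 0 and 0.8 from state 1 (the coefficients above); `alpha` and `beta` are illustrative names for the two state probabilities.

```python
def step(alpha, beta):
    """One time step of the state probabilities (state 0, state 1)."""
    return 0.9 * alpha + 0.8 * beta, 0.1 * alpha + 0.2 * beta


def iterate(alpha, beta, n):
    """Apply `step` n times."""
    for _ in range(n):
        alpha, beta = step(alpha, beta)
    return alpha, beta


from_state_0 = iterate(1.0, 0.0, 50)   # start from the state 0
from_state_1 = iterate(0.0, 1.0, 50)   # start from the state 1

print(from_state_0)
print(from_state_1)
```

Both runs approach (8/9, 1/9), which is also what one gets by setting the two time indices equal and solving the equations together with the sum-to-one condition.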
1 / 0.2
Markov source as a stationary source
After enough time has elapsed...
a regular Markov source can be regarded as a stationary source
0 will be produced with probability P(0) = 0.9α + 0.8β = 0.889
1 will be produced with probability P(1) = 0.1α + 0.2β = 0.111
(α = 8/9 and β = 1/9 are the stationary probabilities of state 0 and state 1)
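These two values can also be estimated empirically by running the source for a long time. The sketch assumes that the state is simply the previously produced symbol, and that 0 is produced with probability 0.9 in state 0 and 0.8 in state 1; `simulate` and `P_ZERO` are hypothetical names.

```python
import random

# P(next symbol is 0 | current state); the state is assumed to be the last symbol.
P_ZERO = {0: 0.9, 1: 0.8}


def simulate(n, start=0, seed=1):
    """Run the two-state Markov source for n steps and return the symbols."""
    rng = random.Random(seed)
    state = start
    symbols = []
    for _ in range(n):
        symbol = 0 if rng.random() < P_ZERO[state] else 1
        symbols.append(symbol)
        state = symbol             # the produced symbol becomes the next state
    return symbols


symbols = simulate(200_000)
freq0 = symbols.count(0) / len(symbols)
freq1 = 1.0 - freq0
print(f"P(0) ~ {freq0:.3f}, P(1) ~ {freq1:.3f}")
```

The observed frequencies should be close to 0.889 and 0.111 regardless of the starting state, which is what allows the regular Markov source to be treated as stationary.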
This is to check your understanding.
This is not a report assignment.