Information Theory: From Wireless Communication to DNA Sequencing. David Tse, Dept. of EECS, U.C. Berkeley. Gilbreth Lecture. Information in an Information Age. Some fundamental questions: How to quantify information?
Given statistical models for source and channel:
A unified way of looking at all communication problems.
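As one concrete instance of pairing a statistical source model with a channel model, the sketch below computes the Shannon entropy of a binary source and the capacity of a binary symmetric channel. Both models, and the function names, are assumptions chosen for illustration; the slides do not specify them.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a Bernoulli(p) source."""
    if p in (0.0, 1.0):
        return 0.0  # a deterministic source carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(flip_prob):
    """Capacity, in bits per channel use, of a binary symmetric channel
    that flips each transmitted bit with probability flip_prob."""
    return 1.0 - binary_entropy(flip_prob)

# Hypothetical numbers: a biased source sent over a noisy channel,
# one source symbol per channel use.
source_bits = binary_entropy(0.1)    # information produced per source symbol
channel_bits = bsc_capacity(0.05)    # information carried per channel use
print(f"source entropy  : {source_bits:.3f} bits/symbol")
print(f"channel capacity: {channel_bits:.3f} bits/use")
print("reliable communication possible:", source_bits < channel_bits)
```

Comparing the two numbers is the unifying step: the source is summarized by its entropy, the channel by its capacity, and reliable communication is possible when the first is below the second.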
(a gigantic jigsaw puzzle)
~0 mobile phones in the mid-90s, ~6 billion now
low-rate voice → high-rate data
Engineering meets science.
New points of views arise.
Classical view: fading channels are unreliable
line-of-sight is best.
Compensates for deep fades via diversity techniques over time, frequency and space.
line-of-sight like channel
to achieve capacity, transmit opportunistically.
(Goldsmith & Varaiya 96)
Knopp & Humblet 95
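The idea of transmitting opportunistically can be illustrated with a small simulation, in the spirit of the multiuser setting. This is a minimal sketch under stated assumptions, not the scheme from the cited papers: independent Rayleigh fading per user and per slot, equal average SNR, and one user served per slot. The function name and parameters are hypothetical.

```python
import math
import random

def average_rates(num_users=8, num_slots=20000, mean_snr=1.0, seed=0):
    """Return (round_robin, opportunistic) average rates in bits/s/Hz."""
    rng = random.Random(seed)
    round_robin = 0.0
    opportunistic = 0.0
    for slot in range(num_slots):
        # Under Rayleigh fading the channel power gain is exponentially
        # distributed; the mean is normalized to 1 here.
        gains = [rng.expovariate(1.0) for _ in range(num_users)]
        # Round robin ignores the channel state: users take turns.
        round_robin += math.log2(1 + mean_snr * gains[slot % num_users])
        # Opportunistic: serve whichever user currently has the best channel.
        opportunistic += math.log2(1 + mean_snr * max(gains))
    return round_robin / num_slots, opportunistic / num_slots

rr, opp = average_rates()
print(f"round robin  : {rr:.2f} bits/s/Hz")
print(f"opportunistic: {opp:.2f} bits/s/Hz")
```

In this toy model the opportunistic rate grows as users are added, because the chance that at least one user sits at a fading peak increases. Here fading is exploited rather than compensated for.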
number of antennas per device
Line-of-sight allows more power transfer via beamforming.
Multipath provides more signal dimensions for spatial multiplexing.
Information theory: more dimensions is better than more power.
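A rough numerical comparison of power against dimensions, under idealized assumptions that are not from the slides: n antennas at each end, a fixed total transmit power, a rank-one line-of-sight channel that yields an n² beamforming power gain on a single stream, and a rich-multipath channel that yields n equal-strength parallel streams. The function names are hypothetical.

```python
import math

def beamforming_capacity(n, snr):
    """Rank-one (line-of-sight-like) channel: a single stream whose
    received SNR is boosted by the n * n array gain."""
    return math.log2(1 + n * n * snr)

def multiplexing_capacity(n, snr):
    """Full-rank (rich multipath) channel, idealized as n parallel
    streams that each see the per-antenna SNR."""
    return n * math.log2(1 + snr)

snr = 100.0  # 20 dB, an illustrative operating point
for n in (1, 2, 4, 8):
    bf = beamforming_capacity(n, snr)
    mux = multiplexing_capacity(n, snr)
    print(f"n={n}: beamforming {bf:5.1f}  multiplexing {mux:5.1f}  bits/s/Hz")
```

Power enters inside the logarithm while dimensions multiply it, so at moderate to high SNR the multiplexing rate grows linearly in n and the beamforming rate only logarithmically. In this toy model the ordering reverses at very low SNR, where the channel is power-limited.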
Process of obtaining the sequence of nucleotides.
A basic workhorse of modern biology and medicine.
3 billion basepairs
Cost of one human genome
Time to sequence one genome: years/months → hours
100 million species
7 billion individuals
(SNP, personal genomics)
10^13 cells in a human
(e.g. somatic mutations
such as HIV, cancer)
Reads are assembled to reconstruct the original DNA sequence.
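The assembly step can be sketched with a greedy merge: repeatedly join the two reads that share the longest suffix-prefix overlap. This is an illustrative toy, assuming error-free reads and a sequence without confusing repeats; it is not presented as the assembler used in practice or in the cited work, and the function names are hypothetical.

```python
def overlap(a, b):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads):
    """Greedily merge the pair of reads with the largest overlap until
    no overlapping pair remains. Returns the resulting contigs."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps; leave the remaining contigs separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs

# Reads of length 6 taken every 2 bases from ATGCGTACCTGA, given out of order.
reads = ["GTACCT", "ATGCGT", "ACCTGA", "GCGTAC"]
print(greedy_assemble(reads))
```

Repeats longer than the overlaps are what defeat this kind of merging, which is why read length and the statistics of the sequence both matter for reconstruction.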
Motahari, Bresler & Tse 12
Question: what is the max. sequencing rate such that reliable reconstruction is possible?
H_2(p) is the (Rényi) entropy rate
of the DNA sequence.
The higher the entropy,
the easier the problem!
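A small sketch of why higher entropy helps, assuming an i.i.d. model in which each base is drawn independently with probabilities p. The order-2 Rényi entropy is H_2(p) = -ln Σ p_i², where Σ p_i² is the probability that two independent bases match. The read-length threshold 2 ln G / H_2(p) used below is an assumption about the form of the result for this i.i.d. model; it is not stated on the slides, and the function names are hypothetical.

```python
import math

def renyi2_entropy(p):
    """Order-2 Renyi entropy in nats: -ln(sum of squared probabilities)."""
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -math.log(sum(x * x for x in p))

def min_read_length(genome_length, p):
    """Assumed threshold 2 * ln(G) / H_2(p): below roughly this read
    length, chance repeats are expected to make reconstruction ambiguous."""
    return 2 * math.log(genome_length) / renyi2_entropy(p)

G = 3_000_000_000                  # about 3 billion base pairs
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.4, 0.1, 0.1, 0.4]      # hypothetical lower-entropy base composition

for name, p in (("uniform", uniform), ("skewed", skewed)):
    print(f"{name}: H_2 = {renyi2_entropy(p):.3f} nats, "
          f"read length threshold ~ {min_read_length(G, p):.1f}")
```

A lower-entropy sequence makes two random stretches more likely to coincide, so longer reads are needed to tell repeats apart. That is the sense in which higher entropy makes the problem easier.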