Jointly Optimal Transmission and Probing Strategies for Multichannel Systems

Jointly Optimal Transmission and Probing Strategies for Multichannel Systems Saswati Sarkar University of Pennsylvania Joint work with Sudipto Guha (Upenn) and Kamesh Munagala (Duke)

Multichannel Systems • Current wireless networks have access to multiple channels • Frequencies in FDMA, codes in CDMA, antennas in MIMO, polarization states of antennas, access points in systems with multiple access points • IEEE 802.11a, IEEE802.11b, IEEE802.11h • Future wireless networks are likely to have access to larger number of channels • Potential deregulation of spectrum • Advances in device technology

Multichannel systems • Transmission quality in different channels stochastically vary with time. • A user is likely to find a channel with acceptable transmission quality when the number of channels is large, • if he samples all the channels • Sampling consumes energy and generates interference for other users • Amount of information a user acquires about its channels becomes an important decision variable.

Policy Decisions • Probing policy • How many channels to probe? • Which channels to probe? • Sequence of probing • Selection policy • Which channel to transmit (probed channel or unprobed channel)?

System Goal • Gain is the difference between the expected rate of successful transmission and the cost of probing • Maximize the system gain • Need to jointly optimize the probing and selection policies • Policies will be adaptive • Policies may depend on higher order statistics and cross statistics of channels.

Sense of déjà vu? • Stopping Time problem (Bertsekas, Dynamic Programming, Vol. 1) • Maximize reward by optimally selecting between two control actions at any given time: (a) continue or (b) stop • IID channels (Kanodia et al, 2004, Ji et al, 2004) • Generalized stopping time problems • Select between multiple control actions • Only limited results are known

Sense of déjà vu? • Stochastic multi-armed bandit problem (Bertsekas, Dynamic Programming, Vol. 2) • Observe the state of only one arm at each slot • Select the arm to observe and get reward from it • State of an arm changes only after you observe it • State changes are stochastic • Gittins index based policies maximize the reward (Gittins-Jones, 1974)

Sense of déjà vu? • Adversarial multi-armed bandit problems (Auer et al, 2002) • State of an arm may change even when you don’t observe it • State changes are adversarial • Some versions allow you to get reward from only one selected arm and subsequently observe the states of all arms in the slot • Given a class of N policies and a time horizon of T slots, bound the performance loss as compared to the best of the policies in terms of O(sqrt(T * log N)).

Our Contribution • Polynomial complexity optimal solution when channels are mutually and temporally independent and each channels has two states. • Sigmetrics, 2006, poster paper • Polynomial complexity approximate solution when channels have multiple states • Approximation ratio ½ for arbitrary number of states • Approximation ration 2/3 for 3 states

System Model • Single user with access to n channels • Each channel can be in one of K states, 0,…K-1. • Channel i is in state j with probability pji • Probability of successful transmission in state j is rj • User knows the state of channel i in a slot only if it probes it. • Cost of probing channel i is ci • Channel states are temporally and mutually important

Optimal policy for K = 2 (Sigmetrics, 2006) • Exhaustive (S, i) policy: • probes all channels in set S, • if no probed channel is in state 1, transmit in channel i (backup). • Optimal policy is exhaustive (Si, i) for some i • Si = {j: p1j(1-p1i) > cj} • Probes all channels with high cross-uncertainty with the backup channel • Probes channels in decreasing order of p1j/cj • Search space for optimal policy restricted (n policies) • Determine the optimal by explicitly evaluating the gain in the above class.

Additional challenges for K > 2 • For K = 2, the optimal policy is static • Actions can be determined before observation • For K > 3, the optimal policy may be adaptive • While probed channels are in state 0, focus on channels with high expected reward • After observing a channel in intermediate state, focus on channels which have higher probabilities of being in the highest state • After observing a channel in intermediate state, may select either the observed channel or the backup. • Optimal policy is a decision tree • Exponential number of decision trees • Storage of decision tree consumes exponential complexity

Constant Factor Approximation Algorithm for K > 2 • Let policy A be the best among those that always transmits in a probed channel • Gain GA • Let policy B be the best among those that always transmits in an unprobed channel • Policy B is clearly the one that never probes any channel and transmits in the channel with the highest expected transmission rate. • Gain GB • Theorem 1: Max(GA , GB) >= ½ Maximum Gain • The best among A and B attains half the maximum gain

Proof of Theorem 1 • Q: Probability that the optimal policy transmit in an unprobed channel • OPT: Expected gain of the optimal policy • G: Expected gain of the optimal policy given that it transmits in an unprobed channel

Proof of Theorem 1 • Modify the optimal so that it transmits in the best probed channel whenever it was using the backup • Expected gain OPT’ which is less than or equal to GA • Expected Gain given that the optimal uses a backup x • OPT – OPT’ <= Q (G – x) <= QG <= GB OPT <= OPT’ + GB <= GA +GB <= 2 max(GA , GB)

Missing link • Need to compute the policy that maximizes gains among those that transmit only in probed channels. • Once a channel is observed to be in state u, probe only those channels for which the expected improvement above u exceeds cost.

Special case: K = 3 • Theorem 2: There exists an optimum policy which uses a unique backup channel. • There exists a channel l which satisfies the following characteristic: if the optimum policy uses a backup it uses l.

Approximation algorithm for attaining 2/3 approximation ratio • Let P(l) be the class of policies which • never probe channel l and • never uses any channel as a backup other than l. • Let C be the best policy in P(l) for the best choice of l. • Theorem 3: Max(GA , GB , GC) >= 2/3 Maximum Gain • Best among A, B, C attains 2/3 of the optimum gain.

Open Problems • Is the optimization problem NP-hard when K exceeds 2? • Can we get better approximation ratios? • What happens when channels are not mutually independent? • What happens when channels are not temporally independent? • What happens when you can simultaneously transmit in multiple channels? • Power control • What happens when you need not transmit in every slot? • Lazy scheduling

Jointly Optimal Transmission and Probing Strategies for Multichannel Systems

Jointly Optimal Transmission and Probing Strategies for Multichannel Systems

Presentation Transcript

MultiChannel Data Strategies

Seldon: Probing Terrorist Mitigation Strategies

Optimal Racing Strategies

Psychological Strategies for Optimal Performance

Optimal Option Portfolio Strategies

Optimal conservation strategies for dynamic landscapes

Telecommunication Transmission and Switching Systems

ELG4125 Transmission and Distribution Systems

Transmission Systems

Multichannel Learning Systems Working Group

Telecommunication Transmission and Switching Systems

Optimal Depth-First Strategies for And-Or Trees

Account Strategies for Optimal SEM Results

Degree–Optimal Routing for P2P Systems

Optimal Power-Down Strategies

Learning Optimal Strategies for Spoken Dialogue Systems

Mental Game Strategies for Optimal Concentration

Optimal search strategies for hidden targets

OPTIMAL CONTROL SYSTEMS

Optimal Sampling Strategies for Multiscale Stochastic Processes

Choosing the Optimal Sequences and Strategies for Functional MRI

Discussion on portable multichannel systems for MPGDs