Optimal Defense Against Jamming Attacks in Cognitive Radio Networks

Presenter: Wayne Hsiao. Advisor: Frank Yeong-Sung Lin.



Optimal Defense Against Jamming Attacks in Cognitive Radio Networks Using the Markov Decision Process Approach

Yongle Wu, Beibei Wang, and K. J. Ray Liu

Presenter: Wayne Hsiao

Advisor: Frank Yeong-Sung Lin


Agenda

  • Introduction

  • Related Works

  • System Model

  • Optimal Strategy with Perfect Knowledge

    • Markov Models

    • Markov Decision Process

  • Learning the Parameters

  • Simulation Results



Introduction

  • Cognitive radio technology has been receiving growing attention

  • In a cognitive radio network

    • Unlicensed users (secondary users)

    • Spectrum holders (primary users)

  • Secondary users usually compete for limited spectrum resources

    • Game theory has been widely applied as a flexible and appropriate tool to model and analyze their behavior in the network


  • Cognitive radio networks are vulnerable to malicious attacks

  • Security countermeasures

    • Crucial to the successful deployment of cognitive radio networks

  • We mainly focus on the jamming attack

    • One of the major threats to cognitive radio networks

    • Several malicious attackers intend to interrupt the communications of secondary users by injecting interference


  • Secondary user could hop across multiple bands in order to reduce the probability of being jammed

    • Optimal defense strategy

    • Markov decision process (MDP)

  • The optimal strategy strikes a balance between the cost associated with hopping and the damage caused by attackers


  • To determine the optimal strategy, the secondary user needs to know some information

    • For example, the number of attackers

  • Maximum Likelihood Estimation (MLE)

    • A learning process in which the secondary user estimates the useful parameters from past observations



Related Works

  • The problem becomes more complicated in a cognitive radio network

    • Primary users’ access has to be taken into consideration

  • We consider the scenario

    • A single-radio secondary user

    • Defense strategy is to hop across different bands



System Model

  • A secondary user opportunistically accesses one of the predefined M licensed bands

  • Each licensed band is time-slotted

  • The access pattern of primary users can be characterized by an ON-OFF model
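
The ON-OFF model can be illustrated with a short simulation. This is a minimal sketch, not the authors' code. It assumes that β is the per-slot probability that an idle (OFF) band becomes busy (ON) and that γ is the probability that a busy band becomes idle, which is how the values β = 0.01 and γ = 0.1 from the simulation settings are read here. The function names are hypothetical.

```python
import random

def steady_state_on(beta, gamma):
    """Long-run fraction of slots in which the primary user is ON."""
    return beta / (beta + gamma)

def simulate_on_off(beta, gamma, slots, seed=0):
    """Simulate one band; returns a list of booleans (True = primary ON)."""
    rng = random.Random(seed)
    on = False
    history = []
    for _ in range(slots):
        if on:
            on = rng.random() >= gamma   # leave ON with probability gamma
        else:
            on = rng.random() < beta     # enter ON with probability beta
        history.append(on)
    return history
```

Under this reading, a band is busy about 9% of the time (0.01 / 0.11), so most slots are available to the secondary user.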


  • Assume all bands share the same channel model and parameters

  • But different bands are used by independent primary users


  • Secondary user has to detect the presence of the primary user at the beginning of each time slot


  • Communication gain R

    • When the primary user is absent in that band

  • The cost associated with hopping is C

  • We assume there are m (m ≥ 1) malicious single-radio attackers

  • Attackers do not want to interfere with primary users

    • Because primary users’ usage of spectrum is enforced by their ownership of bands


  • On finding the secondary user

    • Attacker will immediately inject jamming power which makes the secondary user fail to decode data packets

  • We assume that the secondary user suffers from a significant loss L when jammed

  • When all the attackers coordinate to maximize the damage

    • they detect m channels in a time slot


  • The longer the secondary user stays in a band, the higher the risk of being exposed to attackers

  • At the end of each time slot the secondary user decides

    • to stay

    • to hop

  • The secondary user receives an immediate payoff U(n) in the nth time slot


  • 1(·) is an indicator function

    • Returning 1 when the statement in the parentheses holds true

    • 0 otherwise
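
The slide's payoff formula is not preserved in the transcript. From the gain R, the jamming loss L, the hopping cost C, and the indicator function, one plausible form is U(n) = R·1(successful transmission) − L·1(jammed) − C·1(hopping). The sketch below encodes that assumed form; the function name is hypothetical.

```python
def immediate_payoff(success, jammed, hopped, R, L, C):
    """Assumed U(n) = R*1(success) - L*1(jammed) - C*1(hopped).

    The three flags are booleans and play the role of the indicator function.
    """
    return R * int(success) - L * int(jammed) - C * int(hopped)
```

With R = 5 and C = 1 from the simulation settings and a hypothetical L = 20, a successful slot without hopping yields 5, while a slot that is jammed right after a hop yields −21.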


  • Average Payoff Ū

    • The secondary user wants to maximize

    • Malicious attackers want to minimize

  • The discount factor δ (0 < δ < 1) measures how much the secondary user values a future payoff over the current one
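
The role of the discount factor can be sketched with a discounted sum over a finite run of per-slot payoffs. The exact expression for Ū is not preserved in the transcript, so the indexing (slots numbered from 1, weight δ^n on slot n) is an assumption, and the function name is hypothetical.

```python
def discounted_payoff(payoffs, delta):
    """Sum of delta**n * U(n) over observed slots, with n starting at 1 (assumed)."""
    return sum(delta ** n * u for n, u in enumerate(payoffs, start=1))
```

A δ close to 1 makes later slots count almost as much as the current one; a small δ makes the user short-sighted.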



Optimal Strategy with Perfect Knowledge

  • Attack strategy

    • Attackers coordinately tune their radios randomly to m undetected bands in each time slot

    • When either all bands have been sensed or the secondary user has been found and jammed

  • The jamming game can be reduced to a Markov decision process

    • We first show how to model the scenario as an MDP

    • Then solve it using standard approaches


  • At the end of the nth time slot

    • The secondary user observes the state of the current time slot S(n)

    • And chooses an action a(n)

      • Whether to tune the radio to a new band or not, which takes effect at the beginning of the next time slot

  • S(n) = P

    • The primary user occupied the band in the nth time slot

  • S(n) = J

    • The secondary user was jammed in the nth time slot


  • a(n) = h

    • The secondary user hops to a new band

  • The secondary user has transmitted a packet successfully in the time slot

    • ‘to hop’ (a(n) = h)

    • ‘to stay’ (a(n) = s)

  • S(n) = K

    • This is the Kth consecutive slot with successful transmission in the same band


  • The immediate payoff depends on both the state and the action

  • p(S’|S, h)

    • The transition probability from an old state S to a new state S’ when taking the action h

  • p(S’|S, s)

    • The transition probability from an old state S to a new state S’ when taking the action s


  • If the secondary user hops to a new band, transition probabilities do not depend on the old state

  • The only possible new states are

    • P (the new band is occupied by the primary user)

    • J (transmission in the new band is detected by an attacker)

    • 1 (successful transmission begins in the new band)


  • When the total number of bands M is large

    • M ≫ 1

  • Assume that the probability of the primary user’s presence in the new band equals the steady-state probability of the ON-OFF model

    • This neglects the case in which the secondary user hops back to some band within a very short time


  • The secondary user will be jammed with the probability m/M

    • Each attacker detects one band without overlapping

  • Transition probabilities are
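
The formulas from this slide are missing in the transcript. The sketch below rebuilds the hop transitions from the statements above: the new band is occupied with the steady-state ON probability, and otherwise the user is jammed with probability m/M. It assumes β is the OFF-to-ON probability and γ the ON-to-OFF probability, so the steady-state ON probability is β/(β + γ). The function name is hypothetical.

```python
def hop_transitions(M, m, beta, gamma):
    """Return {new_state: probability} after action h (independent of old state)."""
    p_on = beta / (beta + gamma)      # assumed steady-state ON probability
    p_jam = m / M                     # m attackers each sense a distinct band
    return {
        "P": p_on,
        "J": (1 - p_on) * p_jam,
        1: (1 - p_on) * (1 - p_jam),
    }
```

The jamming term is multiplied by (1 − p_on) because attackers are assumed not to interfere with primary users, so the user can be jammed only when the primary user is absent.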


  • Note that s is not a feasible action when the state is J or P

  • At state K, only max(M−Km,0) bands have not been detected by attackers

    • But another m bands will be detected in the upcoming time slot

    • The probability of jamming conditioned on the absence of primary user


  • To sum up, transition probabilities associated with the action s are as follows: ∀K ∈ {1,2,3,...}
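
The equations themselves are not preserved in the transcript. A sketch consistent with the preceding slides is given below: at state K, M − Km bands remain unsensed and m more are sensed in the next slot, so the jamming probability conditioned on the primary user's absence is m/(M − Km), capped at 1. It assumes that β is the per-slot probability that the primary user reclaims the band. The function name is hypothetical.

```python
def stay_transitions(K, M, m, beta):
    """Return {new_state: probability} when staying in state K (K = 1, 2, ...)."""
    remaining = max(M - K * m, 0)          # bands not yet sensed by attackers
    # Conditional jamming probability given the primary user is absent.
    p_jam = 1.0 if remaining <= m else m / remaining
    return {
        "P": beta,
        "J": (1 - beta) * p_jam,
        K + 1: (1 - beta) * (1 - p_jam),
    }
```

With M = 60 and a hypothetical m = 4, the probability of reaching state K + 1 drops to zero once K reaches 14, since all remaining bands are then sensed in the next slot.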



Markov Decision Process

  • If the secondary user stays in the same band for too long, he/she will eventually be found by an attacker

    • p(K + 1|K,s) = 0 if K > M/m − 1

  • Therefore, we can limit the state S to a finite set, where
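
The definition of the finite set is missing from the transcript. As a rough sketch, the success counters can be truncated at the smallest integer K that satisfies K > M/m − 1, which is ⌊M/m⌋, because no higher counter can be reached by staying. This truncation point is an assumption and may differ from the slide's definition; the function name is hypothetical.

```python
def state_space(M, m):
    """States P, J and the success counters 1..k_max (k_max is an assumption)."""
    k_max = M // m   # smallest integer K with K > M/m - 1
    return ["P", "J"] + list(range(1, k_max + 1))
```

For M = 60 and a hypothetical m = 4 this gives 17 states: P, J, and the counters 1 to 15.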


  • An MDP consists of four important components

    • a finite set of states

    • a finite set of actions

    • transition probabilities

    • immediate payoffs

  • The optimal defense strategy can be obtained by solving the MDP


  • A policy is defined as a mapping from a state to an action

    • π : S(n) → a(n)

  • A policy π specifies an action π(S) to take whenever the user is in state S

  • Among all possible policies, the optimal policy is the one that maximizes the expected discounted payoff


  • The value of a state S is defined as the highest expected payoff given the MDP starts from state S

  • The optimal policy is the optimal defense strategy that the secondary user should adopt since it maximizes the expected payoff


  • After a first move the remaining part of an optimal policy should still be optimal

  • The first move should maximize the sum of immediate payoff and expected payoff conditioned on the current action

    • Bellman equation
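
The Bellman recursion can be solved numerically by value iteration, one of the standard approaches. The sketch below is illustrative only and rests on several assumptions that the transcript does not confirm: the payoff U(S, a) is R after a successful slot, −L after a jammed slot, 0 after a slot occupied by the primary user, minus C when hopping; β is the OFF-to-ON probability and γ the ON-to-OFF probability; counters are truncated at ⌊M/m⌋. The function name `solve_mdp` is hypothetical.

```python
def solve_mdp(M, m, beta, gamma, R, L, C, delta, tol=1e-9):
    """Value iteration over states P, J, 1..k_max; returns (values, policy)."""
    k_max = M // m                       # assumed largest success counter
    states = ["P", "J"] + list(range(1, k_max + 1))
    p_on = beta / (beta + gamma)         # assumed steady-state ON probability

    def payoff(S, a):
        # Assumed U(S, a): gain R after success, loss L after jamming, cost C to hop.
        u = -L if S == "J" else (0.0 if S == "P" else R)
        return u - C if a == "h" else u

    def transitions(S, a):
        if a == "h":
            return {"P": p_on, "J": (1 - p_on) * m / M,
                    1: (1 - p_on) * (1 - m / M)}
        remaining = max(M - S * m, 0)    # bands the attackers have not sensed
        p_jam = 1.0 if remaining <= m else m / remaining
        probs = {"P": beta, "J": (1 - beta) * p_jam}
        if p_jam < 1.0:
            probs[S + 1] = (1 - beta) * (1 - p_jam)
        return probs

    def actions(S):
        # Staying is not feasible after a primary-user or jammed slot.
        return ["h"] if S in ("P", "J") else ["h", "s"]

    def q_value(S, a, V):
        # Bellman backup: immediate payoff plus discounted expected value.
        expected = sum(p * V[S2] for S2, p in transitions(S, a).items())
        return payoff(S, a) + delta * expected

    V = {S: 0.0 for S in states}
    while True:
        Q = {S: {a: q_value(S, a, V) for a in actions(S)} for S in states}
        new_V = {S: max(Q[S].values()) for S in states}
        done = max(abs(new_V[S] - V[S]) for S in states) < tol
        V = new_V
        if done:
            break
    policy = {S: max(Q[S], key=Q[S].get) for S in states}
    return V, policy
```

A call such as `solve_mdp(60, 4, 0.01, 0.1, 5, 20, 1, 0.95)` uses the simulation settings for M, β, γ, R, C, and δ, with m = 4 and L = 20 chosen here purely for illustration.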


  • Critical state K*(K∗≤ )

  • K∗ can be obtained from solving the MDP, and the optimal strategy becomes
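
The formula for the resulting strategy is missing from the transcript. A threshold rule of the kind the slide describes can be sketched as follows, assuming the convention that the user hops from P or J and as soon as the success counter reaches K∗. Whether the slide hops at K∗ or only after it is not recoverable here. The function name is hypothetical.

```python
def threshold_policy(state, k_star):
    """Hop from P or J, and once the success counter reaches k_star (assumed)."""
    if state in ("P", "J"):
        return "h"
    return "h" if state >= k_star else "s"
```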



Learning the Parameters

  • A learning scheme

    • Maximum Likelihood Estimation (MLE)

  • The secondary user simply sets a value as an initial guess of the optimal critical state K∗

  • And follows the strategy (10) with the estimate during the whole learning period


  • This guess need not be accurate

  • After the learning period, the secondary user updates the critical state K∗ accordingly

  • F

    • The total number of transitions from S to S’ with the action h taken

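
Counting transitions from an observed history can be sketched as below. The history format, a time-ordered list of (state, action) pairs with states encoded as "P", "J", or an integer counter, is an assumption made for illustration, and the function name is hypothetical.

```python
from collections import Counter

def count_transitions(history):
    """Count (state, action, next_state) tuples in a time-ordered history.

    history: list of (state, action) pairs, one per time slot.
    """
    counts = Counter()
    for (S, a), (S_next, _) in zip(history, history[1:]):
        counts[(S, a, S_next)] += 1
    return counts
```

The last entry of the history contributes only as a destination state, since its own successor has not been observed yet.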


  • The likelihood that such a sequence has occurred

    • A product over all feasible transition tuples

    • (S, a, S’) ∈ {P, J, 1, 2, 3, ..., K_L + 1} × {s, h} × {P, J, 1, 2, 3, ..., K_L + 1}

  • Define

  • The following proposition gives the MLE of the parameters β, γ, and ρ


  • Proposition 1: Given the transition counts obtained from the history of transitions, the MLEs of the primary users’ parameters are


  • The MLE of the attackers’ parameter, ρ_ML, is the unique root within the interval (0, 1/(K_L + 1)) of the following polynomial of order (K_L + 1)

  • Proof


  • With transition probabilities specified in (4) – (7)

  • The likelihood of observed transitions (11) can be decoupled into a product of three terms Λ = Λ_β Λ_γ Λ_ρ


  • By differentiating ln Λ_β, ln Λ_γ, and ln Λ_ρ and equating them to 0

    • Obtain the MLEs (12), (13), and (14)

  • To ensure that the likelihood is positive, ρ has to lie in the interval (0, 1/(K_L + 1))

    • The left-hand side of equation (14) decreases monotonically and approaches positive infinity as ρ goes to 0

    • The right-hand side increases monotonically and approaches positive infinity as ρ goes to 1/(K_L + 1)
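
Equation (14) is not preserved in the transcript, so the sketch below does not solve the polynomial. It swaps in a brute-force grid search that maximizes an assumed ρ-dependent log-likelihood over the same interval (0, 1/(K_L + 1)). The assumptions are ρ = m/M, a conditional jamming probability of ρ after a hop and ρ/(1 − Kρ) when staying in state K, and stay actions taken only from states K ≤ K_L. The function names and the count format, a mapping from (state, action, next_state) to a count, are hypothetical.

```python
import math

def log_likelihood_rho(rho, counts):
    """Log-likelihood of the rho-dependent factor under the assumed model."""
    total = 0.0
    for (S, a, S_next), n in counts.items():
        if S_next == "P":
            continue                      # depends on primary-user parameters only
        K = 0 if a == "h" else S          # hopping resets the exposure counter
        p_jam = rho / (1 - K * rho)       # jamming probability, primary absent
        p = p_jam if S_next == "J" else 1 - p_jam
        total += n * math.log(p)
    return total

def estimate_rho(counts, k_l, grid=10000):
    """Grid search over (0, 1/(k_l + 1)); a brute-force stand-in for root finding."""
    upper = 1.0 / (k_l + 1)
    candidates = [upper * i / grid for i in range(1, grid)]
    return max(candidates, key=lambda r: log_likelihood_rho(r, counts))
```

The grid stays strictly inside the open interval, which keeps every logarithm well defined for the same reason the slide gives: outside it, some transition probability would be non-positive.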


  • After the learning period, the secondary user rounds M · ρ_ML to the nearest integer as an estimate of m

  • Calculate the optimal strategy using the MDP approach described in the previous section
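
The rounding step can be sketched in one line. The lower bound of one attacker reflects the model's assumption m ≥ 1 and is an addition made here, not something stated on this slide. The function name is hypothetical.

```python
def estimate_attackers(M, rho_ml):
    """Round M * rho_ml to the nearest integer, keeping at least one attacker."""
    return max(1, round(M * rho_ml))
```

Note that Python's built-in `round` resolves exact .5 ties to the nearest even integer, which only matters when M · ρ_ML falls exactly halfway between two integers.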



Simulation Results

  • Communication gain R = 5

  • Hopping cost C = 1

  • Total number of bands M = 60

  • Discount factor δ = 0.95

  • Primary users’ access pattern

    • β = 0.01, γ = 0.1


  • When the threat from attackers is stronger, the secondary user should proactively hop more frequently

    • To avoid being jammed


  • Always hopping: the secondary user will hop every time slot

  • Staying whenever possible: the secondary user will always stay in the band unless the primary user reclaims the band or the band is jammed by attackers
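
The two baseline strategies can be written as simple policy functions. This sketch assumes states are encoded as "P", "J", or an integer success counter and actions as "h" (hop) and "s" (stay); the function names are hypothetical.

```python
def always_hop(state):
    """Baseline: hop in every time slot, regardless of the state."""
    return "h"

def stay_whenever_possible(state):
    """Baseline: hop only when forced to, after a primary-user or jammed slot."""
    return "h" if state in ("P", "J") else "s"
```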

