Optimal Defense Against Jamming Attacks in Cognitive Radio Networks Using the Markov Decision Process Approach
Yongle Wu, Beibei Wang, and K. J. Ray Liu
Presenter: Wayne Hsiao
Advisor: Frank Yeong-Sung Lin

Agenda
Introduction
Related Works
System Model
Optimal Strategy with Perfect Knowledge
o Markov Models
o Markov Decision Process
Learning the Parameters
Simulation Results

Introduction
Cognitive radio technology has been receiving growing attention. In a cognitive radio network there are
o Unlicensed users (secondary users)
o Spectrum holders (primary users)
Secondary users usually compete for limited spectrum resources
o Game theory has been widely applied as a flexible and suitable tool to model and analyze their behavior in the network

Introduction
Cognitive radio networks are vulnerable to malicious attacks
Security countermeasures
o Crucial to the successful deployment of cognitive radio networks
We mainly focus on the jamming attack
o One of the major threats to cognitive radio networks
o Several malicious attackers intend to interrupt the communications of secondary users by injecting interference

Introduction
The secondary user could hop across multiple bands in order to reduce the probability of being jammed
o Optimal defense strategy
o Markov decision process (MDP)
The optimal strategy strikes a balance between the cost associated with hopping and the damage caused by attackers

Introduction
In order to determine the optimal strategy, the secondary user needs to know some information
o For example, the number of attackers
Maximum Likelihood Estimation (MLE)
o A learning process in this paper in which the secondary user estimates the useful parameters based on past observations

Agenda Introduction Related Works System Model Optimal Strategy with Perfect Knowledge o Markov Models o Markov Decision Process Learning the Parameters Simulation Results

Related Works
The problem becomes more complicated in a cognitive radio network
o Primary users' access has to be taken into consideration
We consider the following scenario
o A single-radio secondary user
o The defense strategy is to hop across different bands

Agenda Introduction Related Works System Model Optimal Strategy with Perfect Knowledge o Markov Models o Markov Decision Process Learning the Parameters Simulation Results

System Model
A secondary user opportunistically accesses one of M predefined licensed bands
Each licensed band is time-slotted
The access pattern of primary users can be characterized by an ON-OFF model
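As a small illustration of this access model, here is a minimal sketch of one band's ON-OFF occupancy process. It assumes (consistent with the parameter values on the simulation slide) that β is the per-slot OFF-to-ON probability and γ the ON-to-OFF probability; the exact convention used in the paper may differ.

```python
import random

def simulate_primary_user(beta, gamma, n_slots, start_on=False):
    """Simulate one band's ON-OFF primary-user occupancy, slot by slot.

    Assumed convention: beta = P(OFF -> ON), gamma = P(ON -> OFF) per slot,
    so the steady-state busy probability is beta / (beta + gamma).
    """
    on = start_on
    occupancy = []
    for _ in range(n_slots):
        on = (random.random() >= gamma) if on else (random.random() < beta)
        occupancy.append(on)
    return occupancy

# Example: with beta = 0.01 and gamma = 0.1 the band is busy roughly 9% of the time.
busy = simulate_primary_user(0.01, 0.1, 100_000)
print(sum(busy) / len(busy))
```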

System Model
Assume all bands share the same channel model and parameters, but different bands are used by independent primary users

System Model Secondary user has to detect the presence of the primary user at the beginning of each time slot

System Model
Communication gain R
o Obtained when the primary user is absent in that band
The cost associated with hopping is C
We assume there are m (m ≥ 1) malicious single-radio attackers
Attackers do not want to interfere with primary users
o Because primary users' usage of the spectrum is enforced by their ownership of the bands

System Model
On finding the secondary user
o An attacker will immediately inject jamming power, which makes the secondary user fail to decode data packets
We assume that the secondary user suffers a significant loss L when jammed
When all the attackers coordinate to maximize the damage
o They can detect m bands in a time slot

System Model
The longer the secondary user stays in a band, the higher the risk of being exposed to attackers
At the end of each time slot the secondary user decides
o To stay
o To hop
The secondary user receives an immediate payoff U(n) in the nth time slot

System Model
1(·) is an indicator function
o Returning 1 when the statement in the parentheses holds true
o Returning 0 otherwise
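The payoff expression itself is not reproduced in this transcript; a plausible form, consistent with the gain R, hopping cost C, and jamming loss L defined on the earlier slides, is:

```latex
U(n) = R \cdot \mathbf{1}(\text{successful transmission in slot } n)
     \;-\; L \cdot \mathbf{1}(\text{jammed in slot } n)
     \;-\; C \cdot \mathbf{1}\big(a(n) = h\big).
```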

System Model
Average payoff Ū
o The secondary user wants to maximize it
o Malicious attackers want to minimize it
The discount factor δ (0 < δ < 1) measures how much the secondary user values a future payoff over the current one
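One standard way to write the discounted average payoff referred to here; the normalization factor is an assumption, since the original slide's formula is not reproduced in this transcript:

```latex
\bar{U} = (1 - \delta) \sum_{n=1}^{\infty} \delta^{\,n-1} \, U(n).
```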

Agenda Introduction Related Works System Model Optimal Strategy with Perfect Knowledge o Markov Models o Markov Decision Process Learning the Parameters Simulation Results

Optimal Strategy with Perfect Knowledge
Attack strategy
o Attackers tune their radios, in a coordinated way, to m randomly chosen undetected bands in each time slot
o This continues until either all bands have been sensed or the secondary user has been found and jammed
The jamming game can be reduced to a Markov decision process
o We first show how to model the scenario as an MDP
o Then solve it using standard approaches

Optimal Strategy with Perfect Knowledge
At the end of the nth time slot
o The secondary user observes the state of the current time slot, S(n)
o And chooses an action a(n): whether to tune the radio to a new band or not, which takes effect at the beginning of the next time slot
S(n) = P
o The primary user occupied the band in the nth time slot
S(n) = J
o The secondary user was jammed in the nth time slot

Optimal Strategy with Perfect Knowledge
a(n) = h
o The secondary user hops to a new band
If the secondary user has transmitted a packet successfully in the time slot, he/she can choose either
o 'to hop' (a(n) = h)
o 'to stay' (a(n) = s)
S(n) = K
o This is the Kth consecutive slot with successful transmission in the same band

Optimal Strategy with Perfect Knowledge
The immediate payoff depends on both the state and the action
p(S'|S, h)
o The transition probability from an old state S to a new state S' when taking the action h
p(S'|S, s)
o The transition probability from an old state S to a new state S' when taking the action s

Optimal Strategy with Perfect Knowledge
If the secondary user hops to a new band, the transition probabilities do not depend on the old state
The only possible new states are
o P (the new band is occupied by the primary user)
o J (transmission in the new band is detected by an attacker)
o 1 (successful transmission begins in the new band)

Optimal Strategy with Perfect Knowledge
When the total number of bands M is large (M ≫ 1)
o Assume that the probability of the primary user's presence in the new band equals the steady-state probability of the ON-OFF model
o Neglect the case that the secondary user hops back to some band within a very short time

Optimal Strategy with Perfect Knowledge
The secondary user will be jammed with probability m/M
o Each attacker detects one band, without overlapping
The transition probabilities for the hop action are summarized below
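A reconstruction of the hop transitions from the bullets above (not copied from the paper), where π_ON denotes the steady-state probability that a band is occupied by its primary user:

```latex
p(P \mid S, h) = \pi_{\mathrm{ON}}, \qquad
p(J \mid S, h) = (1 - \pi_{\mathrm{ON}}) \frac{m}{M}, \qquad
p(1 \mid S, h) = (1 - \pi_{\mathrm{ON}}) \Big(1 - \frac{m}{M}\Big),
\quad \text{for any old state } S.
```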

Optimal Strategy with Perfect Knowledge
Note that s is not a feasible action when the state is J or P
At state K, only max(M − Km, 0) bands have not been detected by attackers
o But another m bands will be detected in the upcoming time slot
o The probability of jamming, conditioned on the absence of the primary user, is therefore m/(M − Km)

Optimal Strategy with Perfect Knowledge
To sum up, the transition probabilities associated with the action s, for all K ∈ {1, 2, 3, ...}, are summarized below
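A sketch of these stay transitions, assuming β denotes the per-slot probability that the primary user returns to an idle band (the paper's own equations are not reproduced in this transcript):

```latex
p(P \mid K, s) = \beta, \qquad
p(J \mid K, s) = (1 - \beta) \frac{m}{M - Km}, \qquad
p(K{+}1 \mid K, s) = (1 - \beta) \Big(1 - \frac{m}{M - Km}\Big),
\qquad \forall\, K \in \{1, 2, 3, \dots\} \text{ with } Km < M.
```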

Agenda Introduction Related Works System Model Optimal Strategy with Perfect Knowledge o Markov Models o Markov Decision Process Learning the Parameters Simulation Results

Markov Decision Process
If the secondary user stays in the same band for too long, he/she will eventually be found by an attacker
o p(K + 1|K, s) = 0 if K > M/m − 1
Therefore, we can limit the state S to a finite set {P, J, 1, 2, ..., K_L + 1}, where K_L is the largest K for which p(K + 1|K, s) > 0

Markov Decision Process
An MDP consists of four important components
o A finite set of states
o A finite set of actions
o Transition probabilities
o Immediate payoffs
The optimal defense strategy can be obtained by solving the MDP

Markov Decision Process
A policy is defined as a mapping from a state to an action
o π : S(n) → a(n)
A policy π specifies an action π(S) to take whenever the user is in state S
Among all possible policies, the optimal policy is the one that maximizes the expected discounted payoff

Markov Decision Process
The value of a state S is defined as the highest expected payoff given that the MDP starts from state S
The optimal policy is the optimal defense strategy that the secondary user should adopt, since it maximizes the expected payoff

Markov Decision Process
After a first move, the remaining part of an optimal policy should still be optimal
The first move should maximize the sum of the immediate payoff and the expected future payoff conditioned on the current action
o This yields the Bellman equation
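In standard MDP notation, with U(S, a) the immediate payoff and V the value function, the Bellman optimality equation reads:

```latex
V(S) = \max_{a \in \{s,\, h\}} \Big[ \, U(S, a) + \delta \sum_{S'} p(S' \mid S, a) \, V(S') \, \Big].
```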

Markov Decision Process
There is a critical state K* (K* ≤ K_L + 1)
o K* can be obtained from solving the MDP, and the optimal strategy (10) is a threshold rule determined by K*: the secondary user stays in the band until the number of consecutive successful slots reaches the critical state, and hops otherwise
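The slides do not reproduce the strategy (10) or the solution procedure, so below is a minimal value-iteration sketch built on the transition probabilities reconstructed above. The state encoding, the reward convention (R per successful slot, -L when jammed, -C per hop), the choice K_L ≈ M/m − 1, and all function and parameter names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def solve_mdp_threshold(M, m, beta, gamma, R, C, L, delta, tol=1e-9):
    """Value iteration for a sketch of the anti-jamming MDP on these slides.

    Assumed model: beta = P(OFF -> ON), gamma = P(ON -> OFF) for the primary
    user, so pi_on = beta / (beta + gamma) is the busy probability of a fresh
    band. States: P (busy), J (jammed), K = 1..K_L+1 consecutive successes.
    """
    pi_on = beta / (beta + gamma)
    K_L = max(1, int(np.floor(M / m)) - 1)   # last K from which staying can still succeed
    K_max = K_L + 1
    n_states = K_max + 2                     # indices: 0 -> P, 1 -> J, 2.. -> K = 1..K_max

    # Hop: the old state is forgotten; land on a busy band, get jammed, or start K = 1.
    hop_dist = np.zeros(n_states)
    hop_dist[0] = pi_on
    hop_dist[1] = (1 - pi_on) * m / M
    hop_dist[2] = (1 - pi_on) * (1 - m / M)

    def stay_distribution(K):
        # Staying after K successes: primary returns w.p. beta; otherwise the band is
        # one of M - K*m undetected bands, of which m are probed in the next slot.
        dist = np.zeros(n_states)
        remaining = M - K * m
        p_jam = m / remaining if remaining > m else 1.0
        dist[0] = beta
        dist[1] = (1 - beta) * p_jam
        if K + 1 <= K_max:
            dist[K + 2] = (1 - beta) * (1 - p_jam)   # index of state K + 1
        return dist

    def reward(state_idx, action):
        r = 0.0
        if state_idx == 1:
            r -= L            # jammed in the current slot
        elif state_idx >= 2:
            r += R            # successful transmission in the current slot
        if action == "h":
            r -= C            # hopping cost
        return r

    V = np.zeros(n_states)
    while True:
        V_new = np.empty_like(V)
        for s in range(n_states):
            q_hop = reward(s, "h") + delta * hop_dist @ V
            K = s - 1                         # state index s >= 2 corresponds to K = s - 1
            if s >= 2 and K <= K_L:           # staying is only feasible while K <= K_L
                q_stay = reward(s, "s") + delta * stay_distribution(K) @ V
                V_new[s] = max(q_hop, q_stay)
            else:
                V_new[s] = q_hop
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new

    # Recover the threshold: the smallest K at which hopping is (weakly) preferred.
    for K in range(1, K_L + 1):
        s = K + 1
        q_hop = reward(s, "h") + delta * hop_dist @ V
        q_stay = reward(s, "s") + delta * stay_distribution(K) @ V
        if q_hop >= q_stay:
            return K
    return K_L + 1
```

The returned value is the first K at which hopping is preferred, which plays the role of the critical state K* in the threshold strategy described above.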

Agenda Introduction Related Works System Model Optimal Strategy with Perfect Knowledge o Markov Models o Markov Decision Process Learning the Parameters Simulation Results

Learning the Parameters
A learning scheme
o Maximum Likelihood Estimation (MLE)
The secondary user simply sets a value as an initial guess of the optimal critical state K*, and follows the strategy (10) with this estimate during the whole learning period

Learning the Parameters
This guess need not be accurate
After the learning period, the secondary user updates the critical state K* accordingly
o F denotes the transition counts, e.g., the total number of observed transitions from S to S' with the action h taken

Learning the Parameters
The likelihood that such a sequence has occurred
o A product over all feasible transition tuples
o (S, a, S') ∈ {P, J, 1, 2, 3, ..., K_L + 1} × {s, h} × {P, J, 1, 2, 3, ..., K_L + 1}
Define ρ = m/M
The following proposition gives the MLE of the parameters β, γ, and ρ

Learning the Parameters
Proposition 1: Given the transition counts for every state S and both actions, collected from the history of transitions, the MLEs of the primary users' parameters β and γ are given by the closed-form expressions (12) and (13)

Learning the Parameters
The MLE of the attackers' parameter ρ_ML is the unique root, within the interval (0, 1/(K_L + 1)), of a (K_L + 1)-order polynomial equation (14)
Proof:

Learning the Parameters
With the transition probabilities specified in (4)–(7), the likelihood of the observed transitions (11) can be decoupled into a product of three terms: Λ = Λ_β Λ_γ Λ_ρ

Learning the Parameters
By differentiating ln Λ_β, ln Λ_γ, and ln Λ_ρ and equating the derivatives to 0
o We obtain the MLEs (12), (13), and (14)
To ensure that the likelihood is positive, ρ has to lie in the interval (0, 1/(K_L + 1))
o The left-hand side of equation (14) decreases monotonically and approaches positive infinity as ρ goes to 0
o The right-hand side increases monotonically and approaches positive infinity as ρ goes to 1/(K_L + 1)

Learning the Parameters
After the learning period, the secondary user rounds M · ρ_ML to the nearest integer as an estimate of m, and then calculates the optimal strategy using the MDP approach described in the previous section
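The closed-form estimators (12)–(14) are not reproduced on these slides. As a rough stand-in, the sketch below derives simple counting-based (plug-in) estimates from a hypothetical log of (state, action, next state) transitions and rounds M·ρ̂ to obtain an attacker count, mirroring the step described above; it is not the paper's MLE.

```python
from collections import Counter

def estimate_parameters(history, M):
    """Plug-in estimates from a log of (state, action, next_state) transitions.

    Rough illustration only: beta is the fraction of 'stay' transitions in which
    the primary user reappeared; rho (roughly m/M) is the fraction of 'hop'
    transitions that ended jammed, conditioned on the new band being idle; gamma
    is backed out from the busy fraction pi_on = beta / (beta + gamma). The paper
    instead uses closed-form MLEs and the polynomial equation (14) for rho_ML.
    """
    counts = Counter((a, s_next) for _, a, s_next in history)
    stay_total = sum(v for (a, _), v in counts.items() if a == "s")
    hop_total = sum(v for (a, _), v in counts.items() if a == "h")

    beta_hat = counts[("s", "P")] / stay_total if stay_total else 0.0

    hop_idle = hop_total - counts[("h", "P")]          # hops that landed on an idle band
    rho_hat = counts[("h", "J")] / hop_idle if hop_idle else 0.0

    p_busy = counts[("h", "P")] / hop_total if hop_total else 0.0
    gamma_hat = beta_hat * (1 - p_busy) / p_busy if p_busy else 0.0

    m_hat = max(1, round(M * rho_hat))   # paper: round M * rho_ML to the nearest integer
    return beta_hat, gamma_hat, rho_hat, m_hat
```

In the paper's formulation, ρ_ML would instead be found by numerically locating the unique root of the polynomial (14) inside (0, 1/(K_L + 1)), for example with a one-dimensional root finder.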

Agenda Introduction Related Works System Model Optimal Strategy with Perfect Knowledge o Markov Models o Markov Decision Process Learning the Parameters Simulation Results

Simulation Results
Communication gain R = 5
Hopping cost C = 1
Total number of bands M = 60
Discount factor δ = 0.95
Primary users' access pattern: β = 0.01, γ = 0.1
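To see how the hopping threshold reacts to the attacker population under these settings, the slide's parameters can be fed into the value-iteration sketch from earlier. The jamming loss L is not listed on this slide, so the value used below is only an assumption for illustration.

```python
# Parameters taken from this slide; L = 20 is an assumed jamming loss, not from the slides.
for m in (1, 2, 5, 10):
    k_star = solve_mdp_threshold(M=60, m=m, beta=0.01, gamma=0.1,
                                 R=5, C=1, L=20, delta=0.95)
    print(f"m = {m:2d} attackers -> hop after {k_star} consecutive successful slots")
```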

Simulation Results
When the threat from attackers is stronger, the secondary user should proactively hop more frequently
o To avoid being jammed

Simulation Results
Two baseline strategies are used for comparison:
o Always hopping: the secondary user hops every time slot
o Staying whenever possible: the secondary user always stays in the band unless the primary user reclaims the band or the band is jammed by attackers