Game Theory Applications in Network Design
Game Models in Various Applications In game theory, many game models have been developed, each with its own features. Usually, games are broadly divided based on whether players make decisions independently or not. According to this criterion, game models can be classified into two groups: non-cooperative games and cooperative games. However, some special games combine features of both non-cooperative and cooperative games.
Non-cooperative Games In non-cooperative games, players make decisions independently and are unable to make any collaboration contracts with the other players in the game. Therefore, a family of non-cooperative games is presented in which players do not cooperate and selfishly select a strategy that maximizes their own utility. The traditional applications of game theory initially developed from these games. There are various kinds of non-cooperative games.
Non-cooperative Games Static game If all players select their strategies simultaneously, without knowledge of the other players' strategies, the game is called a static game. Static games are also called strategic games, one-shot games, single-stage games or simultaneous games. Traditionally, static games are represented in normal form; if two players play a static game, it can be represented in a matrix format in which each element represents the pair of payoffs obtained when a certain combination of strategies is used. Therefore, these games are also called matrix games, and many classic examples are coordination games.
One example of a matrix game is the stag hunt game. In this game, two hunters go out on a hunt. Each can individually choose to hunt a stag or a hare, and each must choose an action without knowing the choice of the other. If an individual hunts a stag, he must have the cooperation of his partner in order to succeed. An individual can get a hare by himself, but a hare is worth less than a stag. This is taken to be an important analogy for social cooperation, and the stag hunt game is therefore used to describe the conflict between safety and social cooperation.
In normal form, the stag hunt game can be written as the payoff matrix below, where the row player's payoff is listed first and the payoffs satisfy A (a) > B (b) ≥ D (d) > C (c):

           Stag                    Hare
  Stag     (A, a), e.g., (2, 2)    (C, b), e.g., (0, 1)
  Hare     (B, c), e.g., (1, 0)    (D, d), e.g., (1, 1)

A solution concept for static games is the Nash equilibrium. The stag hunt game has two pure Nash equilibria: both hunters hunt a stag, or both hunters hunt a hare. Sometimes, as in the Prisoner's Dilemma, an equilibrium is not an efficient solution even though a Pareto-efficient outcome is available to the players. For this reason, many researchers focus on how to drive a game in which players behave non-cooperatively toward an optimal outcome.
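As a small illustration of the equilibrium check described above, the sketch below enumerates the pure Nash equilibria of a two-player matrix game; the numeric payoffs are the example stag hunt values from the matrix, and the function name is ours.

```python
import numpy as np

# Stag hunt payoffs: row index = hunter 1's choice (0 = Stag, 1 = Hare),
# column index = hunter 2's choice; values follow the example matrix above.
U1 = np.array([[2, 0],
               [1, 1]])   # payoff to hunter 1
U2 = np.array([[2, 1],
               [0, 1]])   # payoff to hunter 2

def pure_nash_equilibria(U1, U2):
    """Return all strategy pairs from which neither player can profitably deviate."""
    equilibria = []
    for i in range(U1.shape[0]):          # hunter 1's strategy
        for j in range(U1.shape[1]):      # hunter 2's strategy
            best_for_1 = U1[i, j] >= U1[:, j].max()   # no better row against column j
            best_for_2 = U2[i, j] >= U2[i, :].max()   # no better column against row i
            if best_for_1 and best_for_2:
                equilibria.append((i, j))
    return equilibria

print(pure_nash_equilibria(U1, U2))   # [(0, 0), (1, 1)]: (Stag, Stag) and (Hare, Hare)
```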
Non-cooperative Games Dynamic game Dynamic games are mathematical models of the interaction between different players who are controlling a dynamic situation. Players in a dynamic game have at least some information about the strategies chosen by the others and repeatedly play a similar stage game over time. Therefore, the players' strategies influence the evolution of the game over time. Based on the history of selected strategies, players select their strategies sequentially, according to a mapping between the information available to each player and his strategy set. Unlike in static games, threats from other players can encourage cooperation without the need for communication among the players. Dynamic games are classified into three different classes: repeated games, sequential games and stochastic games.
Non-cooperative Games Sequential game Sequential games constitute a major class of dynamic games in which players select their strategies following a certain predefined order. In a sequential game, a player can observe the strategies of the other players who acted before him and make a strategic choice accordingly. Therefore, players take alternating turns to make their selections, given the information available on the strategies already selected by the other players. In sequential games, the sequence of strategic selections made by the players strongly impacts the outcome of the game. If players cannot observe the actions of previous players, the game reduces to a static game; a static game is therefore a special case of a sequential game.
According to the role of information, sequential games can be divided into perfect-information sequential games and imperfect-information sequential games. If a player can observe the strategies of every other player who has moved before him, the game is a perfect-information game. If some, but not all, players observe prior strategies, while other players act simultaneously, the game is an imperfect-information game. Sequential games are represented in extensive form. In the extensive form, the game is represented as a tree, where the root of the tree is the start of the game. One level of the tree is referred to as a stage. The nodes of the tree show the possible unfolding of the game, meaning that they represent the sequence relation of the players' moves.
Sequential games are solved using the solution concept of subgame perfect equilibrium, which is a refinement of the Nash equilibrium for dynamic games: the players' strategies constitute a Nash equilibrium in every subgame of the original game. It may be found by backward induction, an iterative process for solving finite sequential games: working from the end of the game back to the root of the game tree, the acting player's best response is taken at each stage. As a consequence of working backwards through this sequence of best responses, the subgame perfect equilibrium is obtained.
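As a minimal sketch of backward induction, the following code solves a hypothetical two-stage game tree (the tree, payoffs and function names are illustrative, not taken from the text); it returns the subgame perfect payoffs and the equilibrium path rather than a full strategy profile.

```python
# Each internal node is (player, {action: subtree}); each leaf is a payoff tuple (u1, u2).
tree = ("P1", {
    "Left":  ("P2", {"l": (3, 1), "r": (1, 2)}),
    "Right": ("P2", {"l": (2, 1), "r": (0, 0)}),
})

def backward_induction(node):
    """Return (payoffs, equilibrium_path) for the subgame rooted at node."""
    if not isinstance(node[1], dict):
        return node, []                        # leaf: its payoff tuple, empty path
    player, actions = node
    idx = 0 if player == "P1" else 1           # index of the mover's own payoff
    best = None
    for action, subtree in actions.items():
        payoff, path = backward_induction(subtree)
        if best is None or payoff[idx] > best[0][idx]:
            best = (payoff, [action] + path)   # keep the mover's best response
    return best

print(backward_induction(tree))   # ((2, 1), ['Right', 'l']) for this example tree
```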
Non-cooperative Games Repeated game When a static non-cooperative strategic game is repeated over time, players interact iteratively by playing a similar stage game; this type of game is called a repeated game. By repeating a game over time, the players may become aware of the past behavior of the other players and change their strategies accordingly. Therefore, a repeated game is an extensive-form game which consists of some number of repetitions of a base game. Sequential games and repeated games are very similar, but there is a difference. In sequential games, players make decisions following a certain predefined order. Beyond sequential games, a repeated game allows a strategy to be contingent on past actions, allowing for reputation effects and retribution.
A repeated game captures the idea that a player has to take into account the impact of his current strategy on the future strategies of the other players; this is called the player's reputation, and this approach can encourage cooperation. Repeated games may be divided into finitely and infinitely repeated games. Generally, infinitely repeated games can encourage cooperation: since a player will play the game again with the same players, the threat of retaliation is real. Therefore, one essential part of infinitely repeated games is punishing players who deviate from the cooperative strategy. The punishment may be, for example, playing a strategy which leads to a reduced payoff for both players for the rest of the game.
For a T-period repeated game, at each period t, the actions during periods 1, ..., t-1 are known to every player. Let $\beta$ be the discount factor; the valuation of the game diminishes with time depending on $\beta$ ($0 < \beta \le 1$). The total discounted payoff for each player is computed as $\sum_{t=1}^{T} \beta^{t-1} U_k(t)$, where $U_k(t)$ is the payoff to player k in the t-th game round. If $T = \infty$, the game is referred to as an infinitely repeated game. The average payoff to player k is then given by $U_k = (1-\beta) \sum_{t=1}^{\infty} \beta^{t-1} U_k(t)$.
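As a quick numeric sketch of these two formulas (the payoff stream, discount factor and function names below are hypothetical), the total and average discounted payoffs can be computed as follows:

```python
def total_discounted_payoff(payoffs, beta):
    """sum_{t=1..T} beta^(t-1) * U_k(t) for a finite payoff stream."""
    return sum(beta ** (t - 1) * u for t, u in enumerate(payoffs, start=1))

def average_payoff(payoffs, beta):
    """(1 - beta) * sum_t beta^(t-1) * U_k(t); exact for an infinite stream,
    here truncated to the given finite horizon as an approximation."""
    return (1 - beta) * total_discounted_payoff(payoffs, beta)

stream = [1.0] * 200   # hypothetical stage payoff of 1 in every round
print(total_discounted_payoff(stream, beta=0.9))   # close to 1 / (1 - 0.9) = 10
print(average_payoff(stream, beta=0.9))            # close to the stage payoff 1.0
```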
Note that for each player, maximizing the total payoff is the same as maximizing the average payoff. One way to interpret the discount factor $\beta$ is as an expression of traditional time preference. For example, if you received a dollar today, you could deposit it in the bank and it would be worth $(1+r)$ dollars tomorrow, where $r$ is the per-period rate of interest. You would be indifferent between a payment of $x_t$ today and $x_{t+\tau}$ received $\tau$ periods later only if the future values were equal: $(1+r)^{\tau} x_t = x_{t+\tau}$. Comparing this indifference condition with discounting by $\beta$, we see that the two representations of inter-temporal preference are equivalent when $\beta = 1/(1+r)$.
We can relate a player's discount factor $\beta$ to the player's patience. How much more does a player value a dollar today, at time t, than a dollar received $\tau > 0$ periods later? As $\beta \to 1$, that is, as the player's discount factor increases, the player values the later amount nearly as much as the earlier payment. A person is more patient the less she minds waiting for something valuable rather than receiving it immediately. So we interpret higher discount factors as higher levels of patience.
A history in a repeated game is a list $h^t = (s^0, s^1, ..., s^{t-1})$ of what has previously occurred. Let $H^t$ be the set of t-period histories. A strategy for player i is a sequence of maps $s_i^t : H^t \to S_i$. A mixed strategy $\sigma_i$ is a sequence of maps $\sigma_i^t : H^t \to \Delta(S_i)$, and a strategy profile is $\sigma = (\sigma_1, ..., \sigma_I)$. The optimal way of playing a repeated game is not to repeatedly play a Nash strategy of the static game, but to cooperate and play a socially optimal strategy. Many theorems deal with how to achieve and maintain a socially optimal equilibrium in repeated games; these results are collectively called 'Folk Theorems'.
Folk Theorem: Let $(\alpha_1, ..., \alpha_n)$ be the payoffs from a Nash equilibrium of the game and let $(\alpha'_1, ..., \alpha'_n)$ be any feasible payoffs. There exists an equilibrium of the infinitely repeated game that attains $(\alpha'_1, ..., \alpha'_n)$ as the average payoff, provided that $\alpha'_i > \alpha_i$ for all i and that $\beta$ is sufficiently close to 1. From the Folk Theorem, we know that in an infinitely repeated game, any feasible outcome that gives each player a better payoff than the Nash equilibrium can be obtained. Therefore, repeated games can be used to improve on the performance of the Nash equilibrium of the one-shot interaction, and repeated game strategies are considered an alternative method to expand the set of equilibrium outcomes.
By using a repeated game, any feasible payoffs better than the Nash equilibrium can be obtained: the greedy players are forced to cooperate and obtain better payoffs. The reason is that, with enough patience, a player's non-cooperative behavior will be punished by future retaliation from the other, cooperative players. So, a central issue in repeated games is defining a good rule to enforce cooperation. Tit-for-tat and cartel maintenance are well-known rules for repeated games.
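As a small worked example of how patience makes punishment credible, assume hypothetical Prisoner's Dilemma stage payoffs (temptation 5, mutual cooperation 3, mutual punishment 1) and a permanent-punishment rule of the kind described earlier; cooperation is then sustainable only when the discount factor is large enough.

```python
# Hypothetical stage payoffs: temptation T_p, mutual cooperation R, mutual punishment P.
T_p, R, P = 5, 3, 1

def cooperation_sustainable(beta):
    # Cooperating forever yields R / (1 - beta); deviating once and then being
    # punished forever yields T_p + beta * P / (1 - beta).
    return R / (1 - beta) >= T_p + beta * P / (1 - beta)

threshold = (T_p - R) / (T_p - P)    # closed form: beta >= 0.5 for these numbers
print(threshold)                     # 0.5
print(cooperation_sustainable(0.4))  # False: too impatient, deviation pays
print(cooperation_sustainable(0.6))  # True: patient enough, cooperation is enforced
```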
Tit-for-tat is a type of trigger strategy in which a player responds in one period with the same action his opponent used in the previous period. The advantage of tit-for-tat is its implementation simplicity. However, there are two potential problems. First, simply copying the opponent's previous action is generally not a best response. Second, information about the other players' actions may be hard to obtain.
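A minimal iterated Prisoner's Dilemma sketch of tit-for-tat follows; the payoff values, strategy functions and round count are hypothetical and only meant to show the mirroring rule in action.

```python
# Actions: "C" (cooperate) or "D" (defect); PAYOFF maps an action pair to (u1, u2).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, opp_history):
    """Cooperate first, then repeat the opponent's previous action."""
    return "C" if not opp_history else opp_history[-1]

def always_defect(my_history, opp_history):
    return "D"

def play(strategy1, strategy2, rounds=10):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        a1, a2 = strategy1(h1, h2), strategy2(h2, h1)
        u1, u2 = PAYOFF[(a1, a2)]
        h1.append(a1); h2.append(a2)
        score1 += u1; score2 += u2
    return score1, score2

print(play(tit_for_tat, tit_for_tat))     # (30, 30): mutual cooperation every round
print(play(tit_for_tat, always_defect))   # (9, 14): exploited once, then mutual defection
```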
Cartel maintenance is another well-known design rule. In the repeated game model, the basic idea of cartel maintenance is to provide enough of a threat to greedy players to prevent them from deviating from cooperation. If cooperation is maintained, all players perform better than under non-cooperative play. However, if any player deviates from cooperation while the others still play cooperatively, the deviating player obtains a better payoff while the others obtain relatively worse payoffs. If no rule is employed, the cooperative players will also have an incentive to deviate, and the total payoff deteriorates due to the resulting non-cooperation.
Therefore, most repeated game models provide a punishment mechanism so that the current defection gains of a selfish player are outweighed by future punishment strategies from the other players. For rational players, this threat of punishment prevents deviation, so cooperation is enforced. Based on cartel maintenance, a self-learning repeated game model was developed with the parameters (V, T, N): T is the punishment time, N is a predefined time period and V is a trigger threshold. N is constant, but T and V are variables that are dynamically adjusted during the repeated game. At the initial time, all players play the game with the same strategy s, and the trigger threshold V is the payoff of the strategy s. In each step, players play the repeated game strategy.
If all players play cooperatively, every player obtains some benefit. If any player deviates from cooperation by playing non-cooperatively while the other players still play cooperatively, that player obtains more benefit while the others suffer lower benefits. To implement the punishment mechanism, each player can observe the public information $I_t$ (e.g., the outcome of the game) at time t. In a punishment mechanism, the greedy player's short-term benefit should be eliminated by the long-term punishment.
Based on the assumption of rational players, we can assume that all players are concerned with their long-term payoff; therefore, no player has an incentive to deviate from cooperation. The self-learning repeated game model lets distributed players learn the optimal strategy step by step, while within each step the repeated game strategy is applied to ensure cooperation among the players. By increasing T, the benefit of a one-time deviation is eventually eliminated, so the selfish players who deviate obtain a much lower payoff during the punishment period. Finally, no player wants to deviate and $I_t$ is better than V.
If the game status remains stably cooperative during a period of time N, the self-learning repeated game model assumes that cooperation has been enforced and moves to the next step to improve the current cooperation. In the next step, the algorithm tries to self-learn the optimal strategy by modifying the current strategy with the goal of optimizing the payoff. Therefore, each player dynamically adjusts his strategy, and different players may have different strategies. In the next game step, all players observe whether their payoffs have improved. If not, the adjusted strategy is reverted to the previous strategy. Otherwise, based on the adjusted strategies, the trigger threshold V is updated to the current average payoff ($V = I_t$), the punishment time T is adaptively adjusted, and a new repeated game is re-started. Finally, the game converges to a stable status.
The practical algorithm is given as follows:
1) Player i plays the strategy s of the cooperation phase.
2) If the cooperation phase is played in period t and $I_t > V$, player i plays the cooperation phase in the next period t+1.
   2.1) Each player dynamically adjusts his strategy.
   2.2) Based on the current $I_t$, strategies are decided and V and T are adaptively modified.
3) If the cooperation phase is played in period t and $I_t < V$ (i.e., someone deviates), player i switches to a punishment phase: the players play non-cooperatively during T time periods.
   3.1) After the T punishment periods, players return to the cooperative phase.
4) The self-learning repeated game is re-started until it reaches the stable state.
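A simplified simulation sketch of this (V, T, N) loop is shown below; the stage payoffs, the random deviation event and all function names are hypothetical placeholders rather than part of the original model.

```python
import random

def cooperative_payoff():
    return 3.0                       # placeholder payoff of the common strategy s

def someone_deviates():
    return random.random() < 0.1     # hypothetical chance that a player deviates

def self_learning_repeated_game(T=5, N=20, rounds=200):
    V = cooperative_payoff()         # trigger threshold = payoff of the common strategy s
    punish_left = 0                  # remaining punishment periods
    stable_count = 0                 # consecutive cooperative periods, compared against N
    outcomes = []
    for _ in range(rounds):
        if punish_left > 0:          # punishment phase: everyone plays non-cooperatively
            I_t = 1.0                # placeholder non-cooperative outcome
            punish_left -= 1
        else:                        # cooperation phase
            I_t = cooperative_payoff()
            if someone_deviates() or I_t < V:
                punish_left = T              # trigger T punishment periods
                stable_count = 0
            else:
                stable_count += 1
                if stable_count >= N:        # stable for N periods: adjust strategies,
                    V = I_t                  # update the trigger threshold,
                    stable_count = 0         # and restart the repeated game
        outcomes.append(I_t)
    return outcomes

print(sum(self_learning_repeated_game()) / 200)  # average observed outcome over the run
```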
Even though repeated games have some advantages, there are shortcomings for practical implementation. In a repeated game, players perform monitoring and provide incentives in a distributed way, and the game model is assumed to involve a stable long-term relationship among the players. However, many real-world systems such as mobile and vehicular networks do not involve such stable long-term relationships, which can make the repeated game neither applicable nor useful. Moreover, repeated game strategies are constrained by the selfish behavior of the players. In particular, equilibrium strategies must guarantee that players execute punishment and reward as intended out of their own self-interest, which may require a complex structure of strategies.
Non-cooperative Games Stochastic game In the early 1950s, Lloyd Shapley introduced the concept of a stochastic game, which is composed of a number of states. A stochastic game is a dynamic game with probabilistic transitions between the different states of the game, and it can be viewed as a generalization of the Markov decision process. Stochastic games generalize both Markov decision processes and repeated games, which correspond to special cases: a one-state stochastic game is equal to a repeated game, and a one-player stochastic game is equal to a Markov decision process. At the beginning of each stage, the game is in a particular state. In this state, the players select their actions and each player receives a payoff that depends on the current state and the chosen strategies.
The game then moves to a new random state whose distribution depends on the previous state and the actions chosen by the players. Therefore, the stochastic game is played as a sequence of states: the procedure is repeated at the new state and the game continues for a finite or infinite number of stages. Formally, an n-player stochastic game consists of a finite, non-empty set of states $S$, a finite set $\mathcal{A}_i$ of strategies for player i, a conditional probability distribution $p$ on $S \times \mathcal{A}_1 \times \mathcal{A}_2 \times \cdots \times \mathcal{A}_n$, and payoff functions defined on the history space $\mathbb{H} = S \times \mathcal{A} \times S \times \mathcal{A} \times \cdots$, where $\mathcal{A} = \prod_{i=1}^{n} \mathcal{A}_i$. In particular, the game is called an n-player deterministic game if, for each state $s \in S$ and each strategy selection $\mathbf{a} = (a_1, a_2, ..., a_n)$, there is a unique state $s'$ such that $p(s' \mid s, \mathbf{a}) = 1$.
If the number of players, the strategy sets and the set of states are all finite, then a stochastic game with a finite number of stages always has a Nash equilibrium. For stochastic games, the Markov perfect equilibrium is another solution concept. It is a set of mixed strategies, one for each player, and a refinement of the subgame perfect Nash equilibrium concept to stochastic games. In a Markov perfect equilibrium, the strategies have the Markov property of memorylessness, meaning that each player's mixed strategy can be conditioned only on the current state of the game, and the state captures only the payoff-relevant information. Therefore, strategies that depend on signals, negotiation, or cooperation between the players are excluded.
Finally, these strategies form a subgame perfect equilibrium of the stochastic game; simply put, a Markov perfect equilibrium can be defined as a subgame perfect equilibrium in which all players use Markov strategies. Stochastic games have applications in industrial organization, macroeconomics, political economy, evolutionary biology and computer networks. The Markov game is an extension of game theory to the Markov decision process. In a Markov decision process, an optimal strategy is one that maximizes the expected sum of discounted rewards and is undominated, meaning that there is no state from which any other strategy can achieve a better expected sum of discounted rewards.
For many Markov games, there is no strategy that is undominated, because performance depends critically on the choice of opponent. Therefore, the optimal stationary strategy is sometimes probabilistic, mapping states to discrete probability distributions over actions. A classic example is the 'rock, paper, scissors' game, in which any deterministic policy can be consistently defeated.
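The sketch below illustrates this point for rock, paper, scissors (the representation and function names are ours): every deterministic policy is consistently beaten by its counter-action, while the uniform mixed strategy earns an expected payoff of roughly zero against any fixed opponent action.

```python
import random

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(a1, a2):
    """Payoff to player 1: +1 for a win, 0 for a tie, -1 for a loss."""
    if a1 == a2:
        return 0
    return 1 if BEATS[a1] == a2 else -1

# Any deterministic policy is consistently defeated by its counter-action...
for a in ACTIONS:
    counter = next(x for x in ACTIONS if BEATS[x] == a)
    print(a, "loses every round against", counter, ":", payoff(a, counter))

# ...while the uniform mixed strategy earns expected payoff ~0 against anything.
def expected_payoff_uniform(opponent_action, samples=100_000):
    return sum(payoff(random.choice(ACTIONS), opponent_action)
               for _ in range(samples)) / samples

print(round(expected_payoff_uniform("rock"), 2))   # approximately 0.0
```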
Non-cooperative Games Potential game The concept of a potential game grew out of the congestion games introduced by Robert W. Rosenthal in 1973 and was later formalized by Monderer and Shapley. A game is said to be a potential game if the incentive of all players to change their strategy can be expressed in one global function, the potential function. Generally, potential games can be categorized as ordinal potential games or exact potential games. In ordinal potential games, it must be possible to construct a single-dimensional potential function for which the sign of the change in its value equals the sign of the change in payoff of the deviating player. An ordinal potential function $P : S \to \mathbb{R}$ satisfies, for all $s'_i \in S_i$, $P(s'_i, \mathbf{s}_{-i}) - P(s_i, \mathbf{s}_{-i}) > 0$ iff $u_i(s'_i, \mathbf{s}_{-i}) - u_i(s_i, \mathbf{s}_{-i}) > 0$.
If the change caused by any player's unilateral deviation is exactly the same as the change in the potential function, the game is called an exact potential game. A potential game is an exact potential game when, for all $s'_i \in S_i$, $P(s'_i, \mathbf{s}_{-i}) - P(s_i, \mathbf{s}_{-i}) = u_i(s'_i, \mathbf{s}_{-i}) - u_i(s_i, \mathbf{s}_{-i})$. The global objective function P is called an exact potential function: the change in an individual payoff due to a unilateral deviation is exactly reflected in this global function. The existence of a potential function that reflects the change in the utility function of any unilaterally deviating player is the defining characteristic of a potential game.
If the potential of every strategy profile is finite, every sequence of improvement steps is finite. A Nash equilibrium corresponds to a local maximum (or minimum) point of the potential function, that is, a strategy profile in which changing one coordinate cannot yield a greater potential value. Therefore, any sequence of unilateral improvement steps converges to a pure-strategy Nash equilibrium, which is also a local optimum of the global objective given by the potential function. To summarize, an important feature of a potential game is that it has been shown to always converge to a Nash equilibrium when best response dynamics are performed.
Best response dynamics is a dynamic process of updating strategies in which a player chooses the strategy that maximizes his utility, given that the current strategies of the other players remain fixed. The best response $s_i^{k+1}(\mathbf{s}_{-i})$ of player i to the strategy profile $\mathbf{s}_{-i}$ at time k+1 is a strategy that satisfies $s_i^{k+1}(\mathbf{s}_{-i}) \in \arg\max_{s'_i \in S_i} u_i(s'_i, \mathbf{s}^k_{-i})$, where $\mathbf{s}^k_{-i}$ denotes the other players' action profile at time k. The potential function is a useful tool for analyzing the equilibrium properties of games, since the incentives of all players are mapped into one function, and the set of pure Nash equilibria can be found by locating the local optima of the potential function.
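The sketch below runs best response dynamics on a small, hypothetical two-player coordination game (a potential game); the strategy set, utility function and starting profile are illustrative only, and the loop stops once no unilateral improvement exists, i.e., at a pure Nash equilibrium.

```python
STRATEGIES = [0, 1]            # e.g., two channels each player can pick

def utility(i, profile):
    """Coordination payoff: a player earns 1 if both players pick the same channel."""
    return 1 if profile[0] == profile[1] else 0

def best_response_dynamics(profile=(0, 1), max_rounds=100):
    profile = list(profile)
    for _ in range(max_rounds):
        changed = False
        for i in range(len(profile)):
            def u_of(s):                       # utility of playing s with others fixed
                trial = profile.copy()
                trial[i] = s
                return utility(i, trial)
            best = max(STRATEGIES, key=u_of)
            if u_of(best) > u_of(profile[i]):
                profile[i] = best              # unilateral improvement step
                changed = True
        if not changed:                        # no profitable deviation: pure NE reached
            return tuple(profile)
    return tuple(profile)

print(best_response_dynamics())   # converges to (1, 1) from the start profile (0, 1)
```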
Non-cooperative Games Congestion game Congestion games are a special case of potential games and were first proposed by Robert W. Rosenthal in 1973; Rosenthal proved that any congestion game is a potential game. In 1996, Monderer and Shapley proved the converse: for any potential game, there is a congestion game with the same potential function. In a congestion game, players and resources are defined, and the payoff of each player depends on the resources he chooses and the number of players choosing the same resources. A congestion game $\Gamma$ is defined as a tuple $\{\mathcal{N}, \mathcal{R}, \{S_i\}_{i \in \mathcal{N}}, \{c_r\}_{r \in \mathcal{R}}\}$, where $\mathcal{N} = \{1, ..., n\}$ is the set of players, $\mathcal{R}$ is the set of resources, $S_i \subseteq 2^{\mathcal{R}}$ is the strategy space of player i, and $c_r : \mathbb{N} \to \mathbb{R}$ is a cost function associated with resource $r \in \mathcal{R}$.
This cost function depends on the total number of players using resource r. $S = (s_1, ..., s_n)$ is a state of the game in which player i chooses strategy $s_i \in S_i$. Players are assumed to act selfishly and aim to choose strategies that minimize their individual costs. The cost incurred by player i is a function of the strategy $s_i \in S_i$ selected by player i and the current strategy profile of the other players, usually denoted $\mathbf{s}_{-i}$, and is defined by $c^f_i(s_i, \mathbf{s}_{-i}) = \sum_{r \in s_i} c_r(n_r)$, where $n_r$ is the number of players using resource r. A player in this game aims to minimize his total cost, which is the sum of the costs over all resources that his strategy involves.
Given any state S, an improvement step of player i is a change of his strategy from $s_i$ to $s'_i$ such that the cost of player i decreases. A classical result of Rosenthal's work shows that sequences of improvement steps do not run into cycles, but reach a Nash equilibrium after a finite number of steps. This proposition is shown by a potential function argument. In particular, the potential function $\Phi : S_1 \times \cdots \times S_n \to \mathbb{R}$ is defined as $\Phi(S) = \sum_{r \in \mathcal{R}} \sum_{i=1}^{n_r} c_r(i)$, where $n_r$ is the total number of players using resource r.
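Below is a small congestion-game sketch with three players, two links and a linear per-link cost (all hypothetical), showing that each improvement step lowers both the deviating player's cost and Rosenthal's potential until a pure Nash equilibrium is reached.

```python
from collections import Counter

RESOURCES = ["link_A", "link_B"]

def c_r(load):
    """Hypothetical per-resource cost: equal to the number of players using the link."""
    return load

def player_cost(i, profile):
    loads = Counter(profile)
    return c_r(loads[profile[i]])      # each player here uses exactly one resource

def rosenthal_potential(profile):
    loads = Counter(profile)
    return sum(c_r(k) for n_r in loads.values() for k in range(1, n_r + 1))

def improvement_steps(profile):
    profile = list(profile)
    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            for r in RESOURCES:
                trial = profile.copy()
                trial[i] = r
                if player_cost(i, trial) < player_cost(i, profile):
                    print(f"player {i}: {profile[i]} -> {r}, potential "
                          f"{rosenthal_potential(profile)} -> {rosenthal_potential(trial)}")
                    profile, improved = trial, True
    return profile

print(improvement_steps(["link_A", "link_A", "link_A"]))  # ends at a pure Nash equilibrium
```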
Rosenthal's potential function can be shown to be an exact potential function, so a congestion game is easily demonstrated to be a potential game; in fact, the congestion game is a special case of the potential game. Therefore, a Nash equilibrium is the only fixed point of the dynamics defined by improvement steps. In the same way as for a potential game, a pure Nash equilibrium is easily obtained in a congestion game through a sequence of improvement steps.
Non-cooperative Games Stackelberg game In 1934, the German economist H. von Stackelberg proposed a hierarchical strategic game model based on two kinds of decision makers. Under a hierarchical decision-making structure, one or more players declare and announce their strategies before the other players choose theirs. In game-theoretic terms, the declaring players are called leaders, while the players who react to the leaders are called followers. Leaders are in a position to enforce their own strategies on the followers: a leader has commitment power and makes his decisions by considering the possible reactions of the followers. The followers react to the decision of the leader while attempting to maximize their own satisfaction.
Therefore, the leader and followers each have their own hierarchy level, utility function and strategies, and they are forced to act according to their hierarchy level. The Stackelberg game model is mathematically formulated as the bi-level problem $\min_x F(x, y)$ s.t. $g(x, y) \le 0$, $y \in \arg\min \{ f(x, y) : h(x, y) \le 0 \}$, where $F(x, y)$, $g(x, y)$ and $x$ are the higher-level (leader's) objective function, constraint and control variable, respectively, and $f(x, y)$, $h(x, y)$ and $y$ are the corresponding objective function, constraint and variable of the follower.
The Stackelberg model can be solved to find the subgame perfect Nash equilibrium, i.e., the strategy profile that best serves each player, given the strategies of the other players, and that entails every player playing a Nash equilibrium in every subgame. This Nash-equilibrium solution concept in a Stackelberg game is the so-called Stackelberg equilibrium. It provides a reasonable hierarchical equilibrium solution concept when the roles of the players are asymmetric, i.e., when one of the players has the ability to enforce his strategy on the other players. Usually, the Stackelberg equilibrium is more efficient than the Nash equilibrium.
To formally express the Stackelberg equilibrium, let $\mathcal{G} = (\mathcal{K}, \{\mathcal{A}_k\}, \{U_k\})$ represent a game, where $\mathcal{K} = \{1, ..., K\}$ is the set of players, $\mathcal{A}_k$ is the set of actions available to user k, and $U_k$ is user k's payoff. The action $a_k^*$ is a best response to the actions $\mathbf{a}_{-k}$, and the set of user k's best responses (BR) to $\mathbf{a}_{-k}$ is denoted $BR_k(\mathbf{a}_{-k})$, if $U_k(BR_k(\mathbf{a}_{-k}), \mathbf{a}_{-k}) = U_k(a_k^*, \mathbf{a}_{-k}) \ge U_k(a_k, \mathbf{a}_{-k})$ for all $a_k \in \mathcal{A}_k$ and $k \in \mathcal{K}$. With a leader and a follower, the action profile $\mathbf{a}$ is a Stackelberg equilibrium if the leader maximizes his payoff subject to the constraint that the follower chooses according to his best response function.
The Stackelberg equilibrium prescribes an optimal strategy for the leader if its followers always react by playing their Nash equilibrium strategies in the smaller sub-game. For example, an action $a_l^*$ is the Stackelberg equilibrium strategy for the leader if $U_l(a_l^*, BR_f(a_l^*)) \ge U_l(a_l, BR_f(a_l))$ for all $a_l \in \mathcal{A}_l$, where $BR_f(\cdot)$ is the follower's best response. Let $NE(a_k)$ be the Nash equilibrium strategy of the remaining players if player k chooses to play $a_k$: $NE(a_k) = \{\mathbf{a}_{-k} : a_i = BR_i(\mathbf{a}_{-i}), a_i \in \mathcal{A}_i, \forall i \ne k\}$.
Finally, the Stackelberg equilibrium can be defined in the general case: the strategy profile $(a_k^*, NE(a_k^*))$ is a Stackelberg equilibrium with user k as the leader iff $U_k(a_k^*, NE(a_k^*)) \ge U_k(a_k, NE(a_k))$ for all $a_k \in \mathcal{A}_k$. Specifically, for the Stackelberg game, the equilibrium strategy can be derived by solving a bi-level program. The concept of bi-level programming can be generalized to allow an arbitrary number of levels and an arbitrary number of decision makers. The decision makers at the upper level make their decisions first; then, the decision makers at the lower level specify their decisions given the decisions made at the upper level.
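For finite action sets, the leader's problem can be sketched by simple enumeration: compute the follower's best response to each leader action, then let the leader pick the action that maximizes his own payoff against that response. The action names and payoff table below are hypothetical.

```python
LEADER_ACTIONS   = ["high", "low"]
FOLLOWER_ACTIONS = ["aggressive", "cautious"]

# Hypothetical payoffs U[(leader_action, follower_action)] = (U_leader, U_follower).
U = {("high", "aggressive"): (2, 1), ("high", "cautious"): (4, 2),
     ("low",  "aggressive"): (1, 3), ("low",  "cautious"): (3, 1)}

def follower_best_response(a_l):
    """The follower reacts to the observed leader action by maximizing his own payoff."""
    return max(FOLLOWER_ACTIONS, key=lambda a_f: U[(a_l, a_f)][1])

def stackelberg_equilibrium():
    # The leader anticipates the follower's reaction and optimizes against it.
    a_l = max(LEADER_ACTIONS, key=lambda a: U[(a, follower_best_response(a))][0])
    return a_l, follower_best_response(a_l)

print(stackelberg_equilibrium())   # ("high", "cautious") for this payoff table
```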
All of the decision makers at the lower level react simultaneously to the preemptive decisions from the upper level. Bi-level programming problems can be analyzed using concepts from game theory. Within each level, the decision makers play an n-person non-zero-sum game similar to those studied and solved by J. Nash. Between levels, the sequential decision process is an n-person leader-follower game similar to those studied and solved by von Stackelberg. Thus, the overall bi-level programming problem can be thought of as a Stackelberg game with embedded 'Nash-type' decision problems at each level. For this reason, the bi-level programming problem is called a Nash-Stackelberg game.