Normal-Form Game: Prisoners' Dilemma
Example: Prisoners' Dilemma
▶ Two prisoners are questioned in isolated cells.
▶ Each prisoner can Cooperate (C) or Defect (D).
▶ Utilities (row: agent 1, column: agent 2):
          C         D
   C   -1, -1    -5,  0
   D    0, -5    -3, -3
Normal-Form Game: Rock-Paper-Scissors
Example: Rock-Paper-Scissors
▶ Two players, three actions.
▶ Rock beats Scissors, Scissors beats Paper, Paper beats Rock.
▶ Utilities (row: player 1, column: player 2):
          R         P         S
   R    0,  0    -1,  1     1, -1
   P    1, -1     0,  0    -1,  1
   S   -1,  1     1, -1     0,  0
Optimality Concepts
Optimality Concepts in Normal-Form Games:
▶ Best-Response Function: the set of optimal strategies for agent i given the other agents' current strategies.
   $\pi_i^* \in BR_i(\pi_{-i})$ iff $\forall \pi_i \in PD(A_i):\ R_i(\langle \pi_i^*, \pi_{-i} \rangle) \ge R_i(\langle \pi_i, \pi_{-i} \rangle)$
   Here $PD(A_i)$ is the set of probability distributions over agent i's actions, and $\pi_{-i}$ denotes the strategies of all agents other than i.
▶ Nash Equilibria: all agents are using best-response strategies.
   $\forall i = 1 \dots n:\ \pi_i \in BR_i(\pi_{-i})$
▶ Every finite Normal-Form Game has at least one Nash Equilibrium, possibly in mixed strategies.
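The best-response and Nash conditions can be checked by brute force on small games. The following is a minimal sketch, not a general solver: it assumes the standard Prisoners' Dilemma and Rock-Paper-Scissors utilities, and it restricts attention to pure (deterministic) strategies, whereas the definition above ranges over all of $PD(A_i)$. The names `pure_best_responses` and `pure_nash_equilibria` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Payoff tables: game[(a1, a2)] = (utility of agent 1, utility of agent 2).
PRISONERS_DILEMMA = {
    ("C", "C"): (-1, -1), ("C", "D"): (-5, 0),
    ("D", "C"): (0, -5),  ("D", "D"): (-3, -3),
}


def _rps_payoff(a, b):
    """Utilities for Rock-Paper-Scissors: +1 to the winner, -1 to the loser."""
    beats = {"R": "S", "S": "P", "P": "R"}
    if a == b:
        return (0, 0)
    return (1, -1) if beats[a] == b else (-1, 1)


ROCK_PAPER_SCISSORS = {
    (a, b): _rps_payoff(a, b) for a, b in product("RPS", repeat=2)
}


def actions(game, agent):
    """Actions available to `agent` (0 = row agent, 1 = column agent)."""
    return sorted({joint[agent] for joint in game})


def pure_best_responses(game, agent, other_action):
    """Pure actions of `agent` that maximize its utility against `other_action`."""
    def joint(own):
        return (own, other_action) if agent == 0 else (other_action, own)

    payoffs = {own: game[joint(own)][agent] for own in actions(game, agent)}
    best = max(payoffs.values())
    return {own for own, value in payoffs.items() if value == best}


def pure_nash_equilibria(game):
    """Joint actions in which each agent plays a best response to the other."""
    return [
        (a1, a2)
        for a1, a2 in product(actions(game, 0), actions(game, 1))
        if a1 in pure_best_responses(game, 0, a2)
        and a2 in pure_best_responses(game, 1, a1)
    ]


print(pure_nash_equilibria(PRISONERS_DILEMMA))    # [('D', 'D')]
print(pure_nash_equilibria(ROCK_PAPER_SCISSORS))  # []
```

The empty result for Rock-Paper-Scissors does not contradict the existence theorem: the game has no equilibrium in pure strategies, so its equilibrium must be a mixed strategy, which this pure-strategy search cannot find.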
Game Classification: Zero-sum
▶ 2 players with opposing objectives.
▶ There is usually only one Nash equilibrium.
▶ Minimax to find it.
[Figure: (a) Reward function for player 1; (b) Reward function for player 2]
Two-Player Zero-Sum Games
▶ Characteristics:
   ▶ Two opponents play against each other.
   ▶ Symmetrically opposed rewards (they always sum to zero).
   ▶ Usually only one equilibrium, and if more exist they are interchangeable.
   ▶ Interchangeable: if $\langle \pi_1, \pi_2 \rangle$ and $\langle \mu_1, \mu_2 \rangle$ are both Nash equilibria, then $\langle \pi_1, \mu_2 \rangle$ and $\langle \mu_1, \pi_2 \rangle$ are also Nash equilibria, and all of them yield the same utility.
▶ Minimax to find an equilibrium of the game $(2, A, O, R, -R)$, where $A$ and $O$ are the action sets of the player and the opponent and $R$ is the player's reward:
   $\max_{\pi \in PD(A)} \min_{o \in O} \sum_{a \in A} \pi(a) R(a, o)$
▶ Formulated as a Linear Program.
▶ Solution in the strategy space: because the players move simultaneously, any deterministic strategy can be exploited by the opponent, so the solution is in general a probability distribution over actions.
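To illustrate why deterministic strategies fail under simultaneous play, the sketch below evaluates the inner part of the minimax objective, $\min_{o \in O} \sum_{a \in A} \pi(a) R(a, o)$, for a few candidate strategies in Rock-Paper-Scissors. It does not solve the Linear Program; it only scores strategies that are supplied by hand. The function name `security_level` and the two example strategies are assumptions made for this example.

```python
from fractions import Fraction

ACTIONS = ("R", "P", "S")
BEATS = {"R": "S", "S": "P", "P": "R"}  # key beats value


def reward(a, o):
    """R(a, o): reward to the maximizing player for action a against o."""
    if a == o:
        return 0
    return 1 if BEATS[a] == o else -1


def security_level(strategy):
    """Worst-case expected reward: min over o of sum_a pi(a) * R(a, o)."""
    return min(
        sum(prob * reward(a, o) for a, prob in strategy.items())
        for o in ACTIONS
    )


# Exact arithmetic with Fraction avoids floating-point noise around zero.
uniform = {a: Fraction(1, 3) for a in ACTIONS}
always_rock = {"R": Fraction(1), "P": Fraction(0), "S": Fraction(0)}

print(security_level(uniform))      # 0
print(security_level(always_rock))  # -1
```

Every deterministic strategy guarantees only -1, because the opponent can always pick the action that beats it, while the uniform mixture guarantees 0. The Linear Program mentioned above searches for the best such guarantee directly: maximize $v$ subject to $\sum_{a \in A} \pi(a) R(a, o) \ge v$ for every $o \in O$, with $\sum_{a \in A} \pi(a) = 1$ and $\pi(a) \ge 0$.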