11. Game Theory (not video-game theory ...)

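Life is full of conflict and competition, and many decisions are not ours alone to make. In all the preceding analyses there was a single decision maker; every other factor was data accumulated from experience or a probability obtained from statistics. What happens when the decision is influenced by more than one person?

A game is a set of competition rules, known to all participants, that governs everyone's actions and outcomes, together with the payoffs each player receives as a result of their own decisions. Under those rules, each player chooses among alternatives and tries to reach a set goal in the most efficient way. In golf, for example, the goal is to complete 18 holes in the fewest strokes; in a pool match, whoever clears the table first wins. The latter is plainly competitive, and reaching the goal involves conflict, such as playing a shot that leaves the opponent snookered. Golf has nothing of the sort.

Reaching the goal usually brings some payoff (prize money, for instance). In this chapter we assume that every payoff can be expressed in money. Note that game theory does not teach you how to play (compete). It concerns the methods and principles for choosing a strategy while playing. Game theory is therefore decision theory under competition.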

11.1 Definition of Payoff matrix (or table)
• An m×n matrix A = [aij] is called a payoff matrix of a game if it satisfies:
  • There are only two players, R and C (such as 2 persons, 2 companies or 2 nations ...).
  • Player R has m choices, and player C has n choices.
  • If R chooses alternative Ri and C chooses alternative Cj, then aij denotes the payoff from C to R.

Payoff matrix:
  a11  a12  ...  a1n
  ...  ...  aij  ...
  am1  am2  ...  amn

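Example 11.1
R and C play scissors/rock/paper. The agreed payoffs are as follows (a positive value is a gain for R, a negative value a loss for R).
Rules:
1. Same choice: R wins 2 dollars.
2. Different choices: R loses 1 dollar.

Payoff matrix (rows: R's choice, columns: C's choice):
              scissors   rock   paper
  scissors        2       -1     -1
  rock           -1        2     -1
  paper          -1       -1      2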
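The two rules of Example 11.1 can be turned into the payoff matrix mechanically. The sketch below is only an illustration: the function name `build_payoff_matrix` and the ordering scissors, rock, paper are assumptions made here, not part of the original slides.

```python
# Order of strategies used for both rows (player R) and columns (player C).
# The ordering is an assumption; the matrix is symmetric, so it does not matter.
CHOICES = ["scissors", "rock", "paper"]


def build_payoff_matrix(choices, same=2, different=-1):
    """Return R's payoff a_ij for every pair (R's choice i, C's choice j).

    Rule 1: identical choices -> R wins `same`.
    Rule 2: different choices -> R loses, i.e. payoff `different`.
    """
    return [[same if r == c else different for c in choices] for r in choices]


A = build_payoff_matrix(CHOICES)
for name, row in zip(CHOICES, A):
    print(f"{name:>8}: {row}")
```

Each row printed corresponds to one choice of R, and each position in the row to one choice of C.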

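11.2 Zero sum game
A two-person zero-sum game is one in which the total payoffs of players R and C sum to zero. In other words, one player's gain is the other's loss. Payoffs are stated from R's point of view.
• Row player R tries to increase his gain and has m strategies to choose from. x = (x1, x2, ..., xm) is a probability vector, where xi is the probability that R adopts strategy Ri, so x1 + x2 + ... + xm = 1.
• Column player C tries to reduce his loss and has n strategies to choose from. y = (y1, y2, ..., yn) is a probability vector, where yj is the probability that C adopts strategy Cj, so y1 + y2 + ... + yn = 1.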

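Example 11.1 (continued)
R and C play scissors/rock/paper under the payoffs agreed above (a positive value is a gain for R, a negative value a loss for R). If R's probability vector is (0.5, 0.25, 0.25), R plays scissors half the time and rock or paper the other half, a quarter each.
[Figure: a wheel divided into quadrants I to IV, labelled scissors, scissors, rock and paper.]
Is spinning a wheel a strategy, or is it gambling?!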
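To see what the probability vector (0.5, 0.25, 0.25) means in money terms, one can compute R's expected payoff against each pure strategy of C, and against any mixed strategy y. This is a minimal sketch using exact fractions; the helper names and the uniform y are assumptions introduced only for illustration.

```python
from fractions import Fraction as F

# Payoff matrix of Example 11.1 (rows and columns: scissors, rock, paper).
A = [[2, -1, -1],
     [-1, 2, -1],
     [-1, -1, 2]]


def expected_payoffs_vs_columns(x, A):
    """For each pure strategy j of C, return sum_i x_i * a_ij."""
    return [sum(x[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def expected_payoff(x, A, y):
    """Expected payoff to R when R mixes with x and C mixes with y."""
    return sum(p * yj for p, yj in zip(expected_payoffs_vs_columns(x, A), y))


x = [F(1, 2), F(1, 4), F(1, 4)]          # R's vector from the example
per_column = expected_payoffs_vs_columns(x, A)
y_uniform = [F(1, 3)] * 3                # hypothetical strategy for C

print("R's expected payoff vs scissors, rock, paper:", per_column)
print("R's expected payoff vs uniform C:", expected_payoff(x, A, y_uniform))
```

With this x, R expects to gain only when C plays scissors, so a C who learns x would avoid scissors. That is one way to read the slide's question about whether spinning the wheel is a strategy or a gamble.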

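Example 11.2
A neighbourhood originally had a single convenience store, 7-11; recently a new one, FamilyMart, arrived. FamilyMart has three advertising strategies for attracting customers, so 7-11 has drawn up three counter-advertising strategies to keep its customer loss to a minimum. The payoff table, compiled from an analysis of past competition between the two chains, lists the minimum payoffs. Here the payoff is the minimum number of customers, in thousands, that FamilyMart can attract.

In competition, keeping your own strategy hidden from the opponent is crucial. Without knowing which strategy the opponent has adopted, how do you choose the strategy that is best for you in a zero-sum game?
• FamilyMart must use the maxmin rule, whose aim is to maximize the minimal payoff. It therefore chooses advertisement 3.
• 7-11 must use the minmax rule, whose aim is to minimize the maximal loss. It therefore chooses counter-measure 2.

Does the definition of a zero-sum game really fit this example? After FamilyMart advertises, is its gain necessarily 7-11's loss?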

11.3 Saddle point
Let G = (R, C, A) denote a two-person zero-sum game. If there is a value VG such that
  VG = MaxR MinC A(ri, cj) = MinC MaxR A(ri, cj)
then VG is called the pure value of G, and its location in A is called a saddle point. A game with a saddle point is called a "strictly determined game".

In the accompanying example:
• MaxR{MinC A(ri, cj)} = 3, so the R player's best strategy is r2.
• MinC{MaxR A(ri, cj)} = 3, so the C player's best strategy is c2.
• The pure value of G is VG = 3, located at the saddle point (r2, c2).

Example 11.3
Complete the payoff matrix A so that it becomes a non-strictly determined game, that is, one in which no saddle point can appear. The analysis, for the entry a12, rests on the following theorem.

Theorem 11.1
A 2×2 payoff matrix
  A = [ a  b ]
      [ c  d ]
is a non-strictly determined game if and only if
  Max{b, c} < Min{a, d}  or  Max{a, d} < Min{b, c}
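Theorem 11.1 can be checked numerically against the saddle-point definition (maximin equals minimax). The sketch below compares the two tests over a small grid of integer entries. The grid range and the sample matrix [[10, 3], [0, 6]] are arbitrary choices made here, not values from the slides.

```python
from itertools import product


def has_saddle_point(A):
    """A game is strictly determined when max of row minima == min of column maxima."""
    row_min = [min(row) for row in A]
    col_max = [max(col) for col in zip(*A)]
    return max(row_min) == min(col_max)


def non_strictly_determined_2x2(a, b, c, d):
    """Condition of Theorem 11.1 for the matrix [[a, b], [c, d]]."""
    return max(b, c) < min(a, d) or max(a, d) < min(b, c)


# Brute-force comparison on all 2x2 matrices with entries in -3..3.
values = range(-3, 4)
agree = all(
    non_strictly_determined_2x2(a, b, c, d) == (not has_saddle_point([[a, b], [c, d]]))
    for a, b, c, d in product(values, repeat=4)
)
print("Theorem 11.1 agrees with the saddle-point test on the grid:", agree)

# A hypothetical matrix with no saddle point: max{3, 0} < min{10, 6}.
print(non_strictly_determined_2x2(10, 3, 0, 6))
```

The brute-force check is evidence rather than proof, but it makes the meaning of the theorem concrete: one diagonal must strictly dominate the other.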

11.4 Mixed strategy
For a given two-person zero-sum game G = (R, C, A), a strategy x or y is said to be a pure strategy if its probability vector has a component with value 1 (i.e., the others are all zero); otherwise it is a mixed strategy.

Saddle-point example, with row minima on the right and column maxima at the bottom:
                    c1    c2    c3    c4   | MinC A(ri, cj)
  r1                 6     1     0     2   |    0
  r2                 5     3     5     7   |    3
  r3                 1    -3     6    -4   |   -4
  MaxR A(ri, cj)     6     3     6     7

• MaxR{MinC A(ri, cj)} = 3: the R player's best strategy is r2, so the probability vector is [0, 1, 0].
• MinC{MaxR A(ri, cj)} = 3: the C player's best strategy is c2, so the probability vector is [0, 1, 0, 0].

In other words, in a game with saddle points, the strategies adopted by the row and column players are necessarily pure strategies.
• A strictly determined game has pure strategy.
• A non-strictly determined game has mixed strategy.
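Returning to the saddle-point example, the row minima, column maxima and saddle point can be found with a few lines of code. The matrix below is read from the example's grid (its row minima 0, 3, -4 and column maxima 6, 3, 6, 7 match the listed values); the function name is an assumption for this sketch.

```python
# 3x4 payoff matrix of the saddle-point example (rows r1..r3, columns c1..c4).
A = [[6, 1, 0, 2],
     [5, 3, 5, 7],
     [1, -3, 6, -4]]


def saddle_points(A):
    """Return (row minima, column maxima, list of saddle positions (i, j)).

    An entry is a saddle point when it is both the minimum of its row
    and the maximum of its column. Indices are 0-based.
    """
    row_min = [min(row) for row in A]
    col_max = [max(col) for col in zip(*A)]
    points = [
        (i, j)
        for i, row in enumerate(A)
        for j, v in enumerate(row)
        if v == row_min[i] and v == col_max[j]
    ]
    return row_min, col_max, points


row_min, col_max, points = saddle_points(A)
print("row minima:   ", row_min)
print("column maxima:", col_max)
for i, j in points:
    print(f"saddle point at (r{i + 1}, c{j + 1}) with value {A[i][j]}")
```

An empty list of saddle positions would indicate a non-strictly determined game, for which a mixed strategy is needed.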

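So the difficulty lies not in strictly determined games but in non-strictly determined ones, because only with mixed strategies does the question of choosing among alternatives arise. Which alternative is better? This is usually decided by the oddments.

Example 11.3 (continued)
• R player: oddments 6 : 7, giving the mixed strategy [6/13, 7/13].
• C player: oddments 3 : 10, giving the mixed strategy [3/13, 10/13].
Once the C player has chosen strategy c2, what the R player forgoes is the difference from the payoff that C's choice of c1 could have brought. (This is the only thing known for certain after the C player has chosen.)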
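The oddment rule for a 2×2 game without a saddle point can be sketched as follows: each row's oddment is the absolute difference of the entries in the other row, and likewise for columns. The matrix of Example 11.3 is not recoverable from the text, so the matrix [[10, 3], [0, 6]] below is a hypothetical one, chosen only because it reproduces the ratios [6/13, 7/13] and [3/13, 10/13].

```python
from fractions import Fraction as F


def oddments_2x2(A):
    """Mixed strategies (x for R, y for C) of a 2x2 game with no saddle point.

    For A = [[a, b], [c, d]]:
      row oddments    = |c - d| : |a - b|
      column oddments = |b - d| : |a - c|
    """
    (a, b), (c, d) = A
    r_odds = [abs(c - d), abs(a - b)]
    c_odds = [abs(b - d), abs(a - c)]
    x = [F(o, sum(r_odds)) for o in r_odds]
    y = [F(o, sum(c_odds)) for o in c_odds]
    return x, y


A = [[10, 3],
     [0, 6]]          # hypothetical matrix, not taken from the slides
x, y = oddments_2x2(A)

# With the oddment strategy, R's expected payoff is the same against c1 and c2.
vs_c1 = x[0] * A[0][0] + x[1] * A[1][0]
vs_c2 = x[0] * A[0][1] + x[1] * A[1][1]
print("R mixes with", x, "and C mixes with", y)
print("R's expected payoff vs c1 and c2:", vs_c1, vs_c2)
```

The equal expected payoffs are the point of the rule: R's mix leaves C indifferent between columns, so C cannot exploit it.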

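Example 11.4: 2×3 Payoff matrix
The payoff matrix, as read off the calculations below, is:
         c1    c2    c3
  r1     -6    -1     4
  r2      7    -2    -5

If C does not know which mixed strategy R has adopted, how should the C player choose a strategy? Dropping one column at a time gives three 2×2 subgames, each with its own oddments.

• Columns c2 and c3: R player [3/8, 5/8], C player [9/10, 1/10].
  R's expected payoffs are c1: 17/8, c2: -13/8, c3: -13/8.
  So C's most favourable mixed strategy is [0, 9/10, 1/10].

• Columns c1 and c3: R player [12/22, 10/22], C player [9/22, 13/22].
  Here C firmly refuses the mixed strategy and plays the pure strategy c2. Why?
  If C chooses c1, R's expected payoff is -6×(12/22) + 7×(10/22) = -2/22.
  If C chooses c2, it is -1×(12/22) + (-2)×(10/22) = -32/22.
  If C chooses c3, it is 4×(12/22) + (-5)×(10/22) = -2/22.
  The mixed strategy [12/22, 10/22] is clearly very unfavourable to R, because C wins whatever he chooses.

• Columns c1 and c2: R player [9/14, 5/14], C player [1/14, 13/14].
  If C chooses c1, R's expected payoff is -6×(9/14) + 7×(5/14) = -19/14.
  If C chooses c2, it is -1×(9/14) + (-2)×(5/14) = -19/14.
  If C chooses c3, it is 4×(9/14) + (-5)×(5/14) = 11/14, which is unfavourable to the C player.
  So C's most favourable mixed strategy is [1/14, 13/14, 0].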
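The comparison in Example 11.4 can be reproduced by enumerating the three 2×2 subgames, computing R's oddment strategy for each, and evaluating it against all three columns. The matrix entries are the ones appearing in the expected-payoff calculations above. Picking the candidate with the largest worst-case payoff is a simplification assumed for this sketch, not a general solver; note also that the (c2, c3) subgame happens to have a saddle point, so its oddments are computed only to mirror the example.

```python
from fractions import Fraction as F
from itertools import combinations

# Payoff matrix of Example 11.4 (rows r1, r2; columns c1, c2, c3).
A = [[-6, -1, 4],
     [7, -2, -5]]


def r_oddments(a, b, c, d):
    """R's oddment strategy for the 2x2 subgame [[a, b], [c, d]]."""
    odds = [abs(c - d), abs(a - b)]
    total = sum(odds)
    return [F(o, total) for o in odds]


def payoffs_per_column(x, A):
    """R's expected payoff against each pure strategy of C."""
    return [x[0] * A[0][j] + x[1] * A[1][j] for j in range(len(A[0]))]


# For every pair of columns, store R's strategy and its payoff vs c1, c2, c3.
candidates = {}
for j, k in combinations(range(3), 2):
    x = r_oddments(A[0][j], A[0][k], A[1][j], A[1][k])
    candidates[(j, k)] = (x, payoffs_per_column(x, A))

for (j, k), (x, payoffs) in candidates.items():
    print(f"columns c{j + 1}, c{k + 1}: R = {x}, payoffs = {payoffs}")

# R prefers the candidate whose worst case (C's best reply) is largest.
best_cols, (best_x, best_payoffs) = max(
    candidates.items(), key=lambda kv: min(kv[1][1])
)
print("R's choice:", best_x, "worst-case payoff:", min(best_payoffs))
```

Fractions are printed in lowest terms, so 12/22 appears as 6/11. The selected strategy is [9/14, 5/14], against which C's best replies are c1 and c2, matching the conclusion that C should play [1/14, 13/14, 0].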

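Example 11.5: 3×3 Payoff matrix
The payoff matrix consistent with the calculations below is:
         c1    c2    c3
  r1      6     0     6
  r2      8    -2     0
  r3      4     6     5

Find the R oddments = [r1 odd : r2 odd : r3 odd]. First reduce the columns (c1 - c2 and c2 - c3):
   6   -6
  10   -2
  -2    1
Each row's oddment is the absolute determinant of the other two rows:
  |10×1 - (-2)×(-2)| = 6   ← r1 odd
  |6×1 - (-2)×(-6)| = 6    ← r2 odd
  |6×(-2) - (-6)×10| = 48  ← r3 odd
R oddments = [6 : 6 : 48] = [1 : 1 : 8]

Find the C oddments = [c1 odd : c2 odd : c3 odd]. First reduce the rows (r1 - r2 and r2 - r3):
  -2    2    6
   4   -8   -5
Each column's oddment is the absolute determinant of the other two columns:
  |2×(-5) - 6×(-8)| = 38   ← c1 odd
  |-2×(-5) - 6×4| = 14     ← c2 odd
  |-2×(-8) - 2×4| = 8      ← c3 odd
C oddments = [38 : 14 : 8] = [19 : 7 : 4]

1. If C takes pure strategy c1, the expected payoff of R is (6×1 + 8×1 + 4×8)/10 = 23/5, and the same expected payoff results in the c2 and c3 cases.
2. If R takes pure strategy r1, r2 or r3, C has the same expected payoff (loss) of 23/5.
Whatever happens, R wins.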
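The 3×3 oddment procedure of Example 11.5 (subtract adjacent columns or rows, then take 2×2 determinants) can be sketched as below. The matrix is the one implied by the differences and expected payoffs quoted in the example, so treat it as a reconstruction. The procedure is only trustworthy when the resulting strategies equalize the payoffs, which the last lines check.

```python
from fractions import Fraction as F

# Payoff matrix implied by the calculations of Example 11.5 (a reconstruction).
A = [[6, 0, 6],
     [8, -2, 0],
     [4, 6, 5]]


def det2(m):
    """Determinant of a 2x2 matrix given as two rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def row_oddments(A):
    """Oddments for the row player of a 3x3 game.

    Step 1: replace each row by (c1 - c2, c2 - c3).
    Step 2: the oddment of row i is |det| of the two other reduced rows.
    """
    D = [[row[0] - row[1], row[1] - row[2]] for row in A]
    return [abs(det2([D[i] for i in range(3) if i != skip])) for skip in range(3)]


def col_oddments(A):
    """Oddments for the column player: the same procedure on the transpose."""
    return row_oddments([list(col) for col in zip(*A)])


r_odds, c_odds = row_oddments(A), col_oddments(A)
x = [F(o, sum(r_odds)) for o in r_odds]
y = [F(o, sum(c_odds)) for o in c_odds]

# Check: every pure strategy of the opponent yields the same expected payoff.
vs_columns = [sum(x[i] * A[i][j] for i in range(3)) for j in range(3)]
vs_rows = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]
print("R oddments:", r_odds, " C oddments:", c_odds)
print("R's expected payoff vs c1, c2, c3:", vs_columns)
print("R's expected payoff when playing r1, r2, r3 against y:", vs_rows)
```

If the two lists of expected payoffs were not constant, some strategy would be inactive and the oddments would not describe the solution; in that case the game would need to be reduced first.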
