The strategy profile defined in the proof of Proposition 149.1, in which players are punished for failing to mete out the punishment that they are assigned, may fail to be a subgame perfect equilibrium when the players' preferences are represented by the discounting criterion. The reason is as follows. Under the strategy profile a player who fails to participate in a punishment that was supposed to last, say, $t'$ periods, is himself punished for, say, $t''$ periods, where $t''$ may be much larger than $t'$. Further deviations may require even longer punishments, with the result that the strategies must be designed to carry out punishments that are unboundedly long. However slight the discounting, there may thus be some punishment that results in losses that can never be recovered. Consequently, the strategy profile may not be a subgame perfect equilibrium if the players' preferences are represented by the discounting criterion.
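The arithmetic behind this observation can be sketched numerically (the numbers below are illustrative, not from the text): under discounting, the present value of the payoffs a punisher forgoes during a $T$-period punishment grows with $T$, while the present value of any bounded compensation that begins only after the punishment ends shrinks like $\delta^T$, so for long enough punishments the loss can never be recovered.

```python
# Illustrative sketch (hypothetical numbers): why unboundedly long punishments
# are problematic under discounting.  A punisher who forgoes a payoff of 1 in
# each of T punishment periods loses (1 - delta**T) / (1 - delta) in present
# value; the most he can later be compensated, with stage payoffs bounded by
# M, is delta**T * M / (1 - delta), which vanishes as T grows.

delta, M = 0.99, 1.0

def discounted_loss(T):
    # present value of forgoing payoff 1 for T periods, starting today
    return (1 - delta ** T) / (1 - delta)

def max_compensation(T):
    # present value of receiving the maximal payoff M forever, starting at period T
    return delta ** T * M / (1 - delta)

# For a short punishment, later compensation can cover the loss...
print(discounted_loss(10) < max_compensation(10))    # True
# ...but for a long one it cannot, since delta**T has fallen below 1/2.
print(discounted_loss(100) > max_compensation(100))  # True
```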
To establish an analog to Proposition 149.1 for the case that the players' preferences are represented by the discounting criterion, we construct a new strategy. In this strategy players who punish deviants as the strategy dictates are subsequently rewarded, making it worthwhile for them to complete their assignments. As in the previous section we construct a strategy profile only for the case in which the equilibrium path consists of the repetition of a single (strictly enforceable) outcome. The result requires a restriction on the set of games that is usually called full dimensionality.
Proposition 151.1 (Perfect folk theorem for the discounting criterion) Let $a^*$ be a strictly enforceable outcome of $G = \langle N, (A_i), (u_i) \rangle$. Assume that there is a collection $(a(i))_{i \in N}$ of strictly enforceable outcomes of $G$ such that for every player $i \in N$ we have $u_i(a^*) > u_i(a(i))$ and $u_i(a(j)) > u_i(a(i))$ for all $j \in N \setminus \{i\}$. Then there exists $\bar\delta < 1$ such that for all $\delta \in (\bar\delta, 1)$ there is a subgame perfect equilibrium of the $\delta$-discounted infinitely repeated game of $G$ that generates the path $(a^t)$ in which $a^t = a^*$ for all $t$.
Proof. The strategy profile in which each player uses the following machine is a subgame perfect equilibrium that supports the outcome $a^*$ in every period. The machine has three types of states. In state $C(0)$ the action profile chosen by the players is $a^*$. For each $j \in N$ the state $C(j)$ is a state of "reconciliation" that is entered after any punishment of player $j$ is complete; in this state the action profile that is chosen is $a(j)$. For each player $j$ and period $t$ between $1$ and some number $t^*$ that we specify later, the state $P(j,t)$ is one in which there remain $t$ periods in which player $j$ is supposed to be punished; in this state every player $i$ other than $j$ takes the action $(p_{-j})_i$, which holds $j$ down to his minmax payoff $v_j$. If any player $i$ deviates in any state there is a transition to the state $P(i, t^*)$ (that is, the other players plan to punish player $i$ for $t^*$ periods). If in none of the $t^*$ periods of punishment there is a deviation by a single punisher the state changes to $C(i)$. The set $\{C(j) : j \in N\}$ of states serves as a system that punishes players who misbehave during a punishment phase: if player $i$ does not punish player $j$ as he is supposed to, then instead of the state becoming $C(j)$, in which the outcome is $a(j)$, player $i$ is punished for $t^*$ periods, after which the state becomes $C(i)$, in which the outcome is $a(i)$.
To summarize, the machine of player $i$ is defined as follows, where for convenience we write $a(0) = a^*$; we specify $t^*$ later.
· Set of states: $\{C(j) : j \in N \cup \{0\}\} \cup \{P(j,t) : j \in N \text{ and } 1 \le t \le t^*\}$.
· Initial state: $C(0)$.
· Output function: In $C(j)$ choose $a_i(j)$. In $P(j,t)$ choose $(p_{-j})_i$ if $i \neq j$, and $b_j$, a best response of player $j$ to $p_{-j}$, if $i = j$.
· Transitions in response to an outcome $a \in A$:
o From $C(j)$ stay in $C(j)$ unless a single player $i$ deviated from $a(j)$, in which case move to $P(i, t^*)$.
o From $P(j,t)$:
§ If a single player $i \neq j$ deviated from the prescribed action profile then move to $P(i, t^*)$.
§ Otherwise move to $P(j, t-1)$ if $t \ge 2$ and to $C(j)$ if $t = 1$.
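The machine just summarized can be sketched in code. The following is a minimal illustration (the encoding of states and profiles is ours, not the text's): states are tuples $C(j)$ or $P(j,t)$, the output function returns the prescribed action profile, and the transition function implements the three rules, ignoring deviations by the punished player himself.

```python
# A minimal sketch of the machine described above (encoding ours).  States:
# ("C", j) plays a(j) (with a(0) = a*); ("P", j, t) means t periods remain
# in the punishment of player j.

def make_machine(n, t_star, a_star, a_reconcile, punish_profile):
    """Return (initial_state, output, transition).

    a_reconcile[j] is the outcome a(j); punish_profile[j] is p_{-j} completed
    with player j's best response b_j, i.e. the full profile played in P(j, t).
    Action profiles are tuples indexed by player (player i at index i - 1).
    """
    reconcile = {0: a_star, **a_reconcile}

    def output(state):
        if state[0] == "C":
            return reconcile[state[1]]
        return punish_profile[state[1]]

    def transition(state, observed):
        prescribed = output(state)
        deviators = [i for i in range(1, n + 1)
                     if observed[i - 1] != prescribed[i - 1]]
        if state[0] == "P":
            _, j, t = state
            deviators = [i for i in deviators if i != j]  # j's own moves are ignored
            if len(deviators) == 1:
                return ("P", deviators[0], t_star)        # punish the deviating punisher
            return ("P", j, t - 1) if t >= 2 else ("C", j)
        if len(deviators) == 1:
            return ("P", deviators[0], t_star)            # punish a lone deviant
        return state

    return ("C", 0), output, transition

# Hypothetical two-player illustration with t* = 2:
init, out, step = make_machine(
    n=2, t_star=2,
    a_star=("c", "c"),
    a_reconcile={1: ("r1a", "r1b"), 2: ("r2a", "r2b")},
    punish_profile={1: ("b1", "p1"), 2: ("p2", "b2")},
)
s = step(init, ("d", "c"))   # player 1 deviates from a*  -> ("P", 1, 2)
s = step(s, ("b1", "p1"))    # punishment proceeds        -> ("P", 1, 1)
s = step(s, ("b1", "p1"))    # punishment complete        -> ("C", 1)
print(s)
```

Note that, as in the text, a simultaneous deviation by more than one player triggers no transition: only unilateral deviations need to be deterred in equilibrium.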
We now specify the values of $t^*$ and $\bar\delta$. As before, let $M$ be the maximum of $u_i(a)$ over all $i \in N$ and $a \in A$. We choose $t^*$ and $\bar\delta$ to be large enough that all possible deviations are deterred. To deter a deviation of any player in any state $C(j)$ we take $t^*$ large enough that

$$M + t^* v_i < (1 + t^*)\, u_i(a(j))$$

for all $i \in N$ and all $j \in N \cup \{0\}$, and choose $\bar\delta$, where $\bar\delta$ is close enough to $1$ that for all $\delta \in (\bar\delta, 1)$ we have

$$M + (\delta + \cdots + \delta^{t^*})\, v_i < (1 + \delta + \cdots + \delta^{t^*})\, u_i(a(j)).$$

(This condition is sufficient since for every $j$ we have $u_i(a(j)) \ge u_i(a(i))$, so that the outcome $a(i)$ that prevails after the punishment is no better for player $i$ than the outcome $a(j)$ that prevails if he does not deviate.) If a player $i$ deviates in some state $P(j,t)$ for $j \neq i$ then he obtains at most $M$ in the period that he deviates followed by $t^*$ periods of $v_i$ and $u_i(a(i))$ subsequently. If he does not deviate then he obtains $u_i(p_{-j}, b_j)$ for between $1$ and $t^*$ periods and $u_i(a(j))$ subsequently. Thus to deter a deviation it is sufficient to choose $\bar\delta$ close enough to one that for all $\delta \in (\bar\delta, 1)$, every $i \in N$, every $j \neq i$, and every $t$ with $1 \le t \le t^*$ we have

$$M + (\delta + \cdots + \delta^{t^*})\, v_i + \frac{\delta^{t^*+1}}{1-\delta}\, u_i(a(i)) < (1 + \delta + \cdots + \delta^{t-1})\, u_i(p_{-j}, b_j) + \frac{\delta^{t}}{1-\delta}\, u_i(a(j)).$$

(Such a value of $\bar\delta$ exists because of our assumption that $u_i(a(j)) > u_i(a(i))$ if $j \neq i$.)
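As a numerical sanity check, the two sufficient conditions above can be evaluated for a hypothetical symmetric example (the payoff numbers below are ours, chosen to satisfy the hypotheses: $M = 3$, $v_i = 0$, $u_i(a^*) = 2$, $u_i(a(i)) = 1$, $u_i(a(j)) = 1.5$ for $j \neq i$, and a punisher's stage payoff of $-1$ while punishing):

```python
# Illustrative check (hypothetical payoff numbers, not from the text) that a
# pair (t*, delta) satisfies both deterrence conditions derived above.

def geom(delta, first, last):
    # sum of delta**s for s = first, ..., last
    return sum(delta ** s for s in range(first, last + 1))

def deters_in_C(delta, t_star, M, v, u_comply):
    # deviating in C(j): at most M today, then t* minmax periods, versus
    # complying: u_i(a(j)) throughout the same t* + 1 periods
    return M + geom(delta, 1, t_star) * v < geom(delta, 0, t_star) * u_comply

def deters_in_P(delta, t_star, t, M, v, u_ai, u_aj, punisher_payoff):
    # deviating with t punishment periods remaining, versus completing them
    lhs = M + geom(delta, 1, t_star) * v + delta ** (t_star + 1) / (1 - delta) * u_ai
    rhs = geom(delta, 0, t - 1) * punisher_payoff + delta ** t / (1 - delta) * u_aj
    return lhs < rhs

t_star, delta = 3, 0.95   # t* satisfies M + t* v < (1 + t*) u for u in {1, 1.5, 2}
ok_C = all(deters_in_C(delta, t_star, 3, 0, u) for u in (2, 1.5, 1))
ok_P = all(deters_in_P(delta, t_star, t, 3, 0, 1, 1.5, -1)
           for t in range(1, t_star + 1))
print(ok_C, ok_P)   # True True
```

Lowering $\delta$ to, say, $0.5$ makes both checks fail, which is the expected behavior: the conditions hold only for discount factors close enough to one.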
Exercise 152.1 Consider the three-player symmetric infinitely repeated game in which each player's preferences are represented by the discounting criterion and the constituent game is $\langle \{1,2,3\}, (A_i), (u_i) \rangle$, where for $i = 1, 2, 3$ we have $A_i = \{A, B\}$ and $u_i(a_1, a_2, a_3) = 1$ if $a_1 = a_2 = a_3$ and $u_i(a_1, a_2, a_3) = 0$ otherwise, for all $(a_1, a_2, a_3) \in A$.
a) Find the set of enforceable payoff profiles of the constituent game.
b) Show that for any discount factor the payoff of any player in any subgame perfect equilibrium of the repeated game is at least $\frac{1}{4}$.
c) Reconcile these results with Proposition 151.1.