The Prisoner's Dilemma
General Model of the Two-Player Game
The Prisoner's Dilemma model, as presented by Robert Axelrod,
Douglas Hofstadter, and others (see References at the end),
goes as follows:
Two prisoners, let's call them Joe and Sam, are being held for
trial. They are held in separate cells with no means of
communication. The prosecutor offers each of them a deal and
discloses to each that the same deal has been offered to the
other. The deal he offers is this:
a) If you will confess that the two of you committed the crime
and the other guy denies it, we will let you go free and send
him up for five years.
b) If you both deny the crime, we have enough circumstantial
evidence to put both of you away for two years.
c) If both of you confess to the crime, then you'll both get
4-year sentences.
Put yourself in Joe's position. If Sam stays mum and you
sing, you get zero years. If he stays mum and you stay mum,
you will each get 2 years. On the other hand, if both of you
confess, you both get 4 years. Finally, if he confesses and
you don't, you will get 5 years. Whatever Sam does, it is to
your advantage to admit your wrongdoing. Of course, Sam is
also a rational person, and he will therefore come to the same
conclusion. So you both end up confessing, which nets a total
of 8 man-years in the pokey. The paradox is that if you had
both denied the crime, a total of only 4 man-years would have
been spent behind bars.
Wait a minute! Can it really be that rationality leads to an
inferior result? Let's look at this one more time. We will use
a payoff matrix, a common tool of the game theoreticians.
The payoff matrix is usually presented in the following form:
           ACTION                      PAYOFF
     Joe         Sam              Joe        Sam
   Cooperate   Cooperate        -2 (R)     -2 (R)
   Cooperate   Defect           -5 (S)      0 (T)
   Defect      Cooperate         0 (T)     -5 (S)
   Defect      Defect           -4 (P)     -4 (P)
(The codes represent standard terminology for each action:
R Reward for mutual cooperation
S Sucker's payoff
T Temptation to defect
P Punishment for mutual defection )
The general form of the Prisoner's Dilemma model requires that
the preference ranking of the four payoffs be, from best to
worst, T, R, P, S (that is, T > R > P > S) and that R be
greater than the average of T and S (R > (T + S)/2). Any
situation that meets these conditions is a "Prisoner's
Dilemma".
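As a minimal sketch (Python is my choice here, not part of the
original essay), we can encode the payoff matrix above and
check both defining conditions, plus the dominance of
defection discussed earlier:

```python
# Payoffs from the matrix above (years lost, so negative).
T, R, P, S = 0, -2, -4, -5

# Condition 1: preference ranking T > R > P > S.
assert T > R > P > S

# Condition 2: mutual cooperation beats the average of
# exploiting and being exploited.
assert R > (T + S) / 2

# Dominance: whatever the other player does, defecting pays
# at least as well as cooperating.
assert T > R   # other cooperates: defect (T) beats cooperate (R)
assert P > S   # other defects:    defect (P) beats cooperate (S)

print("These payoffs form a Prisoner's Dilemma.")
```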
In summary, the Prisoner's Dilemma model postulates a
condition in which the rational action of each individual is
to not cooperate (that is, to defect), yet, if both parties
act rationally, each party's reward is less than it would have
been if both had acted irrationally and cooperated.
The model can be applied to many real world situations, from
genetics to business transactions to international politics.
Iterated "Prisoner's Dilemma" with multiple participants
If the game is played only once there is no incentive for
either player to do anything but defect, as discussed above.
In fact, if the game is to be played a known number of rounds,
there is no better choice than to defect. (Why? Because you
both know you will defect on the last move. That puts you in
the same situation for the next-to-last move, and so on back
to the first.) But if the game is to be played an indefinite
number of times, then, under certain conditions, cooperation
will evolve as the best policy.
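The iterated game can be sketched in a few lines of Python (a
hypothetical illustration with names of my choosing, using the
payoff matrix from above):

```python
# Payoff matrix: keys are (my move, opponent's move),
# values are (my payoff, opponent's payoff).
PAYOFF = {('C', 'C'): (-2, -2), ('C', 'D'): (-5, 0),
          ('D', 'C'): (0, -5),  ('D', 'D'): (-4, -4)}

def play(strategy_a, strategy_b, rounds):
    """Each strategy sees the opponent's past moves and
    returns 'C' (cooperate) or 'D' (defect)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)   # a sees b's past moves
        move_b = strategy_b(history_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

always_defect = lambda opp: 'D'
always_cooperate = lambda opp: 'C'

print(play(always_defect, always_defect, 10))        # (-40, -40)
print(play(always_cooperate, always_cooperate, 10))  # (-20, -20)
```

Ten rounds of mutual defection cost each player 40 years;
ten rounds of mutual cooperation cost only 20.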
Another addition to the game that makes it more realistic is
to assume that each player interacts with a multitude of other
players. Additionally, it is assumed that each player
remembers the past history of the interactions with each of
the other players and that past history is the only
information he has.
The Iterated "Prisoner's Dilemma" has been the subject of much
study and computer simulation (see references). An interesting
and possibly useful result of these studies is that a player's
best strategy in this "game" is "Tit for Tat", with the
additional proviso that the player be initially cooperative.
That is, "I'll start off being nice but from that point on,
whatever you do to me, I will do to you on the next
interaction". This strategy has been shown to be clearly more
productive than "The Golden Rule"!
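The Tit for Tat rule described above can be sketched as
follows (a self-contained Python illustration with my own
helper names, not code from the studies cited):

```python
# Payoff matrix: (my payoff, opponent's payoff).
PAYOFF = {('C', 'C'): (-2, -2), ('C', 'D'): (-5, 0),
          ('D', 'C'): (0, -5),  ('D', 'D'): (-4, -4)}

def tit_for_tat(opponent_history):
    # Cooperate on the first move; thereafter mirror the
    # opponent's previous move.
    return 'C' if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return 'D'

def play(strat_a, strat_b, rounds=10):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

# Tit for Tat is exploited only once, then defends itself:
print(play(tit_for_tat, always_defect))  # (-41, -36)
# Against itself it cooperates throughout:
print(play(tit_for_tat, tit_for_tat))    # (-20, -20)
```

Against a pure defector, Tit for Tat pays the sucker's payoff
once and then matches defection with defection; against a
cooperative partner it sustains full cooperation.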
Note that we are discussing multiple participants in which
activities are between pairs of "actors". There is yet another
more complex situation in which an individual is interacting
with ALL of the other participants at once. This situation,
which is more common in the real world, is called the
"Many-person-dilemma" or "Voter's Paradox". See the companion
essay, "Voter's Paradox", at this and other sites.
Author: Leon Felkins
Email: leonf@perspicuity.net
Written: 10/13/95
References:
1. Axelrod, Robert. 1984. The Evolution of Cooperation. New
York: Basic Books.
2. Hofstadter, Douglas R. 1983. "Metamagical Themas: Computer
Tournaments of the Prisoner's Dilemma Suggest How
Cooperation Evolves". Scientific American 248 (no. 5): 16-26.
3. On the Internet: http://pespmc1.vub.ac.be/PRISDIL.html.
Author: F.Heylighen. Date: Apr 13, 1995 (modified)
4. Several other essays are on the Internet. Just do a search
on "Prisoner's Dilemma".