posted on: Tuesday, August 20, 2013 (7:28 am) by Chase Stevens
Recently, a couple of friends and I entered the Brilliant.org Hunger Games competition, in which competitors must create programs which are pitted against others in an arena of sorts. In any given round of the competition, the programs must choose for each other program whether to cooperate with them or take advantage of them. In the Hunger Games competition, these are framed as "hunting" and "slacking".
At face value, the competition is a very standard iterated prisoner’s dilemma. For the general case, the optimal strategy for this game has already been discovered: tit-for-tat. The essence of this strategy is to reciprocate everything done to you in the previous round, and to cooperate with everyone in the first round in a show of "good faith". However, the Hunger Games competition had a slight twist: although your program would be able to see the overall reputation of each opponent (id est how many times they hunted versus slacked), no other identifying information would be supplied. This limited knowledge makes true tit-for-tat impossible, since your opponents are effectively anonymized. Although you may know that a player you decided to cooperate with in the previous round defected against you, there’s no way to retaliate against the same player in this round with 100% confidence.
My team’s strategy, nicknamed "The Underminer", both compensated for this lack of knowledge and managed to exploit it. We started with the assumption that tit-for-tat is possible, to a degree. As the number of rounds increases, the reputations of individual opponents become solidified, which makes reputation itself a means of identification. Although a player with a reputation of 1.0 in round one could drop to a reputation of 0.5 in round two, a player with a reputation of 1.0 in round 20 can only drop to roughly 0.95. Based on this, one can say that the ranking of players by reputation remains mostly static: the worst-ranked opponent will have a very difficult time becoming the highest-ranked. While this is very untrue in the first rounds, at a certain point changing ranks becomes almost impossible. This phenomenon can be imagined like a zipper: before any zipping has occurred, there’s a very large degree of freedom of motion available. However, as you begin to zip, the freedom of motion becomes more and more constrained, until none remains.
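To make the zipper effect concrete, here is a minimal sketch of how far a reputation can move in a single round. It assumes that reputation is simply hunts divided by total decisions and that every round carries the same number of decisions; the function name `reputation_bounds` is made up for this illustration and is not taken from our competition code.

```python
def reputation_bounds(reputation, rounds_played):
    """Return the (lowest, highest) reputation reachable after one more round.

    Assumption: every round contains the same number of hunt/slack
    decisions, so each round weighs equally in the running average.
    """
    hunted_rounds = reputation * rounds_played
    lowest = hunted_rounds / (rounds_played + 1)         # slack against everyone
    highest = (hunted_rounds + 1) / (rounds_played + 1)  # hunt with everyone
    return lowest, highest


# The window of reachable reputations narrows as rounds accumulate.
for rounds in (1, 5, 20, 100):
    low, high = reputation_bounds(1.0, rounds)
    print(f"after {rounds:3d} rounds, a 1.0 reputation can fall to {low:.3f}")
```

With one round played, a perfect reputation can be halved; with twenty rounds played it can only fall to 20/21, which is about 0.95.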
While our program implements tit-for-tat as described above in most circumstances, there’s a very specific scenario in which it deviates from this otherwise optimal strategy. As mentioned, the "tit-for-tat" allowed by the Hunger Games competition is not foolproof, since player identities can only be approximately tracked at the beginning of the game. Assuming other players are also tracking opponents by reputation, we can exploit this limitation in knowledge by occasionally attempting to "undermine" another player, assuming their identity. This technique is probably best illustrated by example. Suppose in some round we’re ranked 2nd by reputation at 0.7, while another player is ranked 3rd at 0.6 reputation. Assuming both we and the 3rd ranked player slacked completely this round, there would be no way for us to displace our opponent as 3rd ranked player, since they already have the "bad reputation" head-start. However, the likelihood of our opponent slacking completely this round is very low. In fact, the decisions of our opponent can be estimated given their current reputation, the number of rounds elapsed so far, and a rating of how certain we want our estimation to be, by using the lower bound of the Wilson score interval. While this formula is most often used to rank items based on user reviews (and is employed perhaps most famously by Reddit’s "best" comment-ranking algorithm), in this case we can use it to infer the "true" cooperation rate of opponents and, based on this, their (probable) reputation at the end of the round. Supposing in this circumstance that we predict with a large amount of certainty that our opponent’s reputation at the end of this round will be at worst 0.55, and we can manage to lower our reputation below that, then we choose to slack completely this round.
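The estimate described above can be sketched roughly as follows. This is an illustration rather than our actual code: the function names, the use of the number of observed decisions as the sample size, and the default `z = 1.96` (about 95% confidence) are all assumptions made for the example.

```python
import math


def wilson_lower_bound(reputation, decisions, z=1.96):
    """Lower bound of the Wilson score interval for a cooperation rate.

    reputation: observed fraction of hunts (0.0 to 1.0)
    decisions:  number of observed decisions (assumed sample size)
    z:          normal quantile for the desired certainty
    """
    if decisions == 0:
        return 0.0
    z2 = z * z
    centre = reputation + z2 / (2 * decisions)
    margin = z * math.sqrt(
        reputation * (1 - reputation) / decisions + z2 / (4 * decisions ** 2)
    )
    return (centre - margin) / (1 + z2 / decisions)


def worst_case_next_reputation(reputation, decisions_so_far,
                               decisions_this_round, z=1.96):
    """Pessimistic estimate of an opponent's reputation after this round."""
    rate = wilson_lower_bound(reputation, decisions_so_far, z)
    hunts = reputation * decisions_so_far
    total = decisions_so_far + decisions_this_round
    return (hunts + rate * decisions_this_round) / total


def should_undermine(our_lowest_reachable, opponent_worst_case):
    """Undermine only if we can end the round below the opponent's floor."""
    return our_lowest_reachable < opponent_worst_case
```

Raising `z` makes the prediction more conservative, so the undermining move is attempted less often but with more confidence that it will actually land us below the target.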
Assuming that the other player remained at 0.6 reputation, while we dove down to 0.5, this is what happened from an outside observer’s perspective: the 2nd ranked player went from 0.7 reputation to 0.6 reputation, and the 3rd ranked player went from 0.6 reputation to 0.5 reputation. For the observer to assume instead that the 2nd ranked player went under the 3rd ranked player, going from 0.7 reputation to 0.5, would be a strange and unintuitive leap of logic. So, in this way, we can choose to take the advantageous route of fully slacking while passing off some of the negative repercussions of this to a worse-ranked player.
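The observer's mistake can be reproduced with a small sketch. It assumes the observer tracks identities purely by rank, matching the n-th highest reputation before the round to the n-th highest after it; `track_by_rank` and the extra 0.9 "leader" player are hypothetical, added only to round out the example.

```python
def track_by_rank(before, after):
    """Pair old and new reputations by rank, as a rank-tracking observer would."""
    old_sorted = sorted(before, reverse=True)
    new_sorted = sorted(after, reverse=True)
    return list(zip(old_sorted, new_sorted))


# True identities: we drop from 0.7 to 0.5, the target stays at 0.6.
before = {"leader": 0.9, "us": 0.7, "target": 0.6}
after = {"leader": 0.9, "us": 0.5, "target": 0.6}

perceived = track_by_rank(before.values(), after.values())
print(perceived)  # [(0.9, 0.9), (0.7, 0.6), (0.6, 0.5)]
```

The observer sees a mild drop from 0.7 to 0.6 for "us" and pins the slide from 0.6 to 0.5 on the target, which is exactly the swap the undermining move relies on.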
In the above, we can see The Underminer repeatedly performing this tactic against a standard tit-for-tat bot. After each undermine, The Underminer reverts to a basic tit-for-tat strategy, which in this simulation caused its reputation to increase over time. As soon as it’s again in a position where it can undermine the tit-for-tat player, it does, repeatedly reaping the rewards of doing so.
Unfortunately, I can’t yet reveal whether The Underminer was successful or not, mostly because I haven’t found out myself. Hopefully, however, in two weeks’ time the results of the Hunger Games competition will be announced, at which point I’ll write another analysis on how our program fared overall. In the meantime, you can go check out the source code for our program as well as the testing/simulation program we used at this GitHub repo.
Tags: anonymity, game theory, probability, programming, python
posted on: Friday, November 25, 2011 (1:56 pm) by Chase Stevens
Human beings are social animals. We live and interact in societies, adopt social norms, and incorporate our society into our identity. I would even say that our sense of justice, morality, and fairness stems from the fact that we are constantly interacting with each other within (in the ancestral environment) a mostly closed community. Doing so means that we are in a constant, multiplayer iterated prisoner's dilemma, wherein we can earn reputations which will influence our interactions with others. For those not familiar with this concept, allow me to explain: the prisoner's dilemma can be described as a two-player game. Each of the two players has but a single choice to make: to either cooperate or to defect (in the original scenario, there were two prisoners who could either choose to "rat out" their criminal accomplice or to remain silent). Both players cooperating produces a moderately good outcome for both, whereas both defecting produces a bad outcome for both. Should one player choose to cooperate while the other chooses to defect, the cooperating player gets the worst possible outcome and the defecting player gets the best. This can be represented in the table below:
| | Player 1 Cooperates | Player 1 Defects |
|---|---|---|
| Player 2 Cooperates | Player 1: 3, Player 2: 3 | Player 1: 5, Player 2: 0 |
| Player 2 Defects | Player 1: 0, Player 2: 5 | Player 1: 1, Player 2: 1 |
If two players choose to play only one round of this game, the optimal strategy is to defect, as defection will net an average of 3 points, whereas cooperation will net only 1.5. Moreover, if this game is viewed under the lens of the maximin principle (which states that you should opt for the decision that will have the best worst-case scenario), the worst case for cooperating gets you 0 points, whereas defecting ensures you are rewarded with at least one point.
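The arithmetic behind both claims can be checked with a short sketch. The payoffs come from the table above; the averages assume the other player is equally likely to cooperate or defect, and the names `PAYOFF`, `average_payoff` and `worst_case` are just labels chosen for this example.

```python
# PAYOFF[my_move][their_move] = my points, taken from the payoff table.
PAYOFF = {
    "cooperate": {"cooperate": 3, "defect": 0},
    "defect": {"cooperate": 5, "defect": 1},
}


def average_payoff(move):
    """Average points for a move if the other player is a coin flip."""
    outcomes = PAYOFF[move].values()
    return sum(outcomes) / len(outcomes)


def worst_case(move):
    """Fewest points a move can possibly earn."""
    return min(PAYOFF[move].values())


# Maximin: pick the move whose worst case is best.
maximin_choice = max(PAYOFF, key=worst_case)

print(average_payoff("cooperate"), average_payoff("defect"))  # 1.5 3.0
print(worst_case("cooperate"), worst_case("defect"))          # 0 1
print(maximin_choice)                                         # defect
```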
However, if two players choose to play this game for a number of rounds, the general best strategy is to behave in an altruistic but reciprocal manner (although for larger groups and more complex but similar games, other more refined strategies come into play). This is to say, initially assume in good faith that your partner is going to cooperate and do so yourself, but if they defect, defect yourself in turn. When this is applied to a larger group, this translates into looking into the reputation of the person you have been partnered with, seeing what their record is in terms of defection and cooperation, and responding accordingly.
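As a rough illustration of why reciprocity pays off over repeated rounds, the sketch below plays the two-player game using the payoffs from the table. The strategy functions and the `play` helper are hypothetical names, and the ten-round length is an arbitrary choice for the example.

```python
# (move_a, move_b) -> (points_a, points_b); "C" cooperates, "D" defects.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect no matter what the opponent has done."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the total score of each player."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each player sees only the other's past moves
        b = strategy_b(moves_a)
        points_a, points_b = PAYOFF[(a, b)]
        score_a += points_a
        score_b += points_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (30, 30)
print(play(tit_for_tat, always_defect))    # (9, 14)
print(play(always_defect, always_defect))  # (10, 10)
```

Two reciprocators earn far more than two defectors, and a reciprocator facing a defector is exploited only once before it starts retaliating.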
As has been a fairly recent focus of attention, this poses a problem on the internet. Online, being anonymous (or at least having the ability to operate under a discardable pseudonym) is the rule, not the exception, the result being that you either are incapable of gaining a reputation or can shed one easily if necessary. This breaks the connection between iterations that allows for cooperative strategies in the prisoner's dilemma, essentially turning the game into a series of one-off rounds, as neither party can ever truly know whom they're interacting with. This is only compounded by the fact that human empathy grossly decreases when we're not face-to-face with one another; inability to observe the facial expressions, verbal tone, and body language of a person you're interacting with impairs your ability to create a model of that person and thereby your ability to empathize with them. This leads to people on the internet often making incredibly callous, distasteful and offensive statements, the likes of which they would never normally make "in real life," the obvious example of this being the infamous 4chan. If given the opportunity, such as in the massively multiplayer EVE Online, people will lie, cheat, steal, betray, backstab, defraud and destroy each other with nary a second thought. However, when these people are encountered offline they are typically soft-spoken, courteous and otherwise normal.
Of course, the internet isn't the only environment in which this can happen. Driving also grants us an unusually high level of anonymity while cars themselves abstract our fellow humans into mere "drivers". This results in a casual level of rudeness that seems to avoid descending into complete sociopathic chaos only by the virtue of everyone having some degree of self-preservation.
Where else are these environmental conditions manifested? The voting booth. Although obviously a tad more complex than a prisoner's dilemma, it stands that voters quite often are put in a position where they could vote in a manner directly in accordance with their own self-interest, and not that of the community/country/what have you. Given what things people do in other circumstances when granted anonymity, it does not seem overly bold to assume that at least a certain percentage of people vote selfishly. The natural question that therefore arises (and which I'd like to address) is that of the benefits of voting records becoming public.
Before delving into that, however, we should investigate whether or not votes being made public would really change anything. It could perhaps be the case that people would continue to vote the same way regardless. However, the Bradley effect would seem to claim otherwise. To be concise, the Bradley effect refers to a phenomenon wherein non-white candidates score higher in polls than they do once votes have actually been tallied, the presumption being that people being polled claim that they'll vote for the non-white candidate when asked by a pollster to avoid a potential accusation of racism. Historically the effect seems to have been responsible for polling discrepancies of up to 12 points. A reverse-Bradley effect has also been observed, such as in the case with former Louisiana State Representative David Duke, a Nazi sympathizer and former KKK Grand Wizard, who won his position despite few people being willing to admit to pollsters that they supported him. So indeed, it seems that, at least in some cases, people will change their votes (or at least admitted voting intentions) if the way they voted will be made public. Although perhaps the difference wouldn't be as great as observed in the above cases (more being at stake when actually voting, as opposed to telling someone how you'll vote), it would be reasonable to expect to see some change.
What is the advantage of having people's votes be influenced by societal pressure? Well, for one, it would seem as though we might be able to avoid having KKK members as state representatives, which most people would agree is a definite boon. We could also expect to see more effects such as this, where people wouldn't vote in ways that others might find objectionable. One example might be gay marriage: those who privately oppose it could be far less willing to have their objections exposed publicly (this, of course, assumes that the legalization of gay marriage would be a "good thing"). Public voting outcomes could be as different from the anonymous voting outcomes we have now as behavior in public is from anonymous online behavior.
Of course, there are numerous objections to this course of action. What if society's standards are unconscionable? One might easily imagine an area of the country in which the pressure would be to vote for the KKK member, as opposed to against him. Could those who would otherwise oppose racism then be peer pressured into voting against their morals? Moreover, in a less extreme case, could the Bradley effect cause people to vote for non-white candidates not because of political ideology, but instead to not appear racist? Such a system would also enable a great deal of coercion to come into play, enabling unscrupulous groups or individuals to influence votes through bribery or threat. Buying votes would be a simple matter if only one could verify how someone voted. Lastly, does one not have a right to vote selfishly? It could easily be argued that the voting booth is one of the few places in which one can express their opinions and influence the world around them in an honest way, free from repercussions or persecution.
While the suggestion of having one's votes be made public might not be a likelihood, it's probably not as fanciful as we might be inclined to imagine. Although currently no one has access to how you voted, in the United States any interested party can find out whether or not you voted. In fact, in 2009 a group in Virginia called "The Know Campaign" sent out a letter to voters informing them not only of their past voting history, but also that of their neighbors, all in an effort to use societal pressure in order to get people to vote more (the implication being that the neighbors have also received your voting record). The campaign was directly stated by executive director Debra Girvin as being not a "shame tactic", but instead a "peer pressure tactic." Although the efficacy of the campaign hasn't (to my knowledge) been disclosed, it did generate at least 1,000 letters of complaint.
Tags: anonymity, democracy, ethics, evolution, game theory, gaming, government, internet, morality, philosophy, probability, society, united states, voting