posted on: Wednesday, December 4, 2013 (4:02 am) by Chase Stevens
In this blog post, Iíd like to introduce a project Iíve been tinkering on and off (mostly off) with for the past few months. The idea behind the project was pretty simple: to write a program which, given a set number of triangles and a blank canvas, adjusts the coordinates, color, and transparency of the triangles over time in an attempt to replicate a given image. There were two primary motivations behind my project, at least initially:
to create (lossily) compressed versions of the source images that still look decent (and might even reduce image noise), and
to have those compressed versions themselves be scalable to any size.
While I found that my program met these goals fairly adequately, what I was both surprised and delighted by was its aesthetically pleasing and almost artful representations of the source images. Therefore, before delving into the more technical aspects of the implementation, Iíd like to present a few samples of the programís results. The first example uses 50 triangles, while the other two use 100.
This was one of the very first tests of my program, and is in my opinion also one of its most successful and impressive recreations. As you might infer from the stark difference between iteration 9,500 and 50,000 and moderate difference between iteration 50,000 and 1,293,000, the programís output approaches complete similarity to the target image roughly logarithmically with each iteration; this trend is plotted in the graph below.
The corresponding vector for the final result (below) is actually noticeably less accurate than the JPEG posted above, because the program has exploited JPEG artifacts as well as the specific dimensions of the target in creating its reproduction. When the triangles are used to create an SVG, these artifacts are no longer present and the image becomes less faithful.
One of the most noticeable issues with the programís image is that it entirely fails to capture the eyes in the source image. I explored various methods of solving this issue, which I will discuss in a later post on implementation. Overall, though, I thought that the program produced a more pleasing (if less accurate) image, kilobyte-for-kilobyte, than comparable "traditional" compression techniques (as pictured below), especially if enlarged.
Programís final output Image size: 2.74KB Compressed SVG size: 0.88KB
Original image Size: 6.53KB
Image compressed with Paint.NET Size: 1.91KB Compressed size: 1.58KB
To finish off this example: a video of the programís progress in making its output played at 30 frames per second, with frames produced every 500th iteration. The "jumps" in activity visible at 0:23 and 0:56 are instances when I made updates to the program.
This example yet again highlighted an issue with the image similarity evaluation methods employed by the program. While many aspects of the source image are replicated with high fidelity, the facial features of Christian Bale are not. Below again is a video of the programís progress over time.
In this instance, I experimented with another evaluation method that incorporated a saliency comparison between the target image and the output. Although the results presented here seem promising, other tests using the method were lacklustre, especially given that the saliency map calculations added a lot of processing time to each iteration.
Recently, a couple of friends and I entered the Brilliant.org Hunger Games competition, in which competitors must create programs which are pitted against others in an arena of sorts. In any given round of the competition, the programs must choose for each other program whether to cooperate with them or take advantage of them. In the Hunger Games competition, these are framed as "hunting" and "slacking".
At face value, the competition is a very standard iterated prisonerís dilemma. For the general case, the optimal strategy for this game has already been discovered: tit-for-tat. This essence of this strategy is to reciprocate everything done to you in the previous round, and cooperate with everyone in the first round in a show of "good faith". However, the Hunger Games competition had a slight twist: although your program would be able to see the overall reputation of each opponent (id est how many times they hunted versus slacked), no other identifying information would be supplied. This limited knowledge makes true tit-for-tat impossible, since your opponents are effectively anonymized. Although you may know that a player you decided to cooperate with in the previous round defected against you, thereís no way to retaliate against the same player in this round with 100% confidence.
My teamís strategy, nicknamed "The Underminer", both compensated for this lack of knowledge and managed to exploit it. We started with the assumption that tit-for-tat is possible, to a degree. As the number of rounds increases, the reputations of individual opponents becomes solidified, thus making this a means of identification. Although a player with a reputation of 1.0 in round one could drop to a reputation of 0.5 in round two, a player with a reputation of 1.0 in round 20 can only drop to 0.95. Based on this, one can say that the ranking of players by reputation remains mostly static: the worst-ranked opponent will have a very difficult time becoming the highest-ranked. While this is very untrue in the first rounds, at a certain point changing ranks becomes almost impossible. This phenomenon can be imagined like a zipper: before any zipping has occurred, thereís a very large degree of freedom of motion available. However, as you begin to zip, the freedom of motion becomes more and more constrained, until none remains.
While our program implements tit-for-tat as described above in most circumstances, thereís a very specific scenario in which it deviates from this otherwise optimal strategy. As mentioned, the "tit-for-tat" allowed by the Hunger Games competition is not foolproof, since player identities can only be approximately tracked at the beginning of the game. Assuming other players are also tracking opponents by reputation, we can exploit this limitation in knowledge by occasionally attempting to "undermine" another player, assuming their identity. This technique is probably best illustrated by example. Suppose in some round weíre ranked 2nd by reputation at 0.7, while another player is ranked 3rd at 0.6 reputation. Assuming both we and the 3rd ranked player slacked completely this round, there would be no way for us to displace our opponent as 3rd ranked player, since they already have the "bad reputation" head-start. However, the likelihood of our opponent slacking completely this round is very low. In fact, the decisions of our opponent can be estimated given their current reputation, the number of rounds elapsed so far, and a rating of how certain we want our estimation to be by using the lower bound of the Wilson score interval. While this formula is most often used to rank items based on user reviews (and is employed perhaps most famously by Redditís "hot" algorithm), in this case we can use it to infer the "true" cooperation rate of opponents and, based on this, their (probable) reputation at the end of the round. Supposing in this circumstance that we predict with a large amount of certainty that our opponentís reputation at the end of this round will be at worst 0.55, and we can manage to lower our reputation below that, then we choose to slack completely this round. Assuming that the other player remained at 0.6 reputation, while we dove down to 0.5, from a third playerís perspective, this is what happened: the 2nd ranked player went from 0.7 reputation to 0.6 reputation, and the 3rd ranked player went from 0.6 reputation to 0.5 reputation. For the third player to make the assumption that the 2nd ranked player went under the 3rd ranked player - going from 0.7 reputation to 0.5 - would be a strange and unintuitive leap of logic. So, in this way, we can choose to take the advantageous route of fully slacking while passing off some of the negative repercussions of this to a worse-ranked player.
In the above, we can see The Underminer repeatedly performing this tactic against a standard tit-for-tat bot. After each undermine, The Underminer reverts to a basic tit-for-tat strategy, which in this simulation caused its reputation to increase over time. As soon as itís again in a position where it can undermine the tit-for-tat player, it does, repeatedly reaping the rewards of doing so.
Unfortunately, I canít yet reveal whether The Underminer was successful or not Ė mostly because I havenít found out myself. Hopefully, however, in two weeksí time the results of the Hunger Games competition will be announced, at which point Iíll write another analysis on how our program fared overall. In the meantime, you can go check out the source code for our program as well as the testing/simulation program we used at this github repo.
Tags: anonymity, game theory, probability, programming, python
In the field of Computational Neuroscience, saliency maps are a means of graphically representing the areas of any visual scene presenting the most "bottom-up" saliency to a human observer (i.e. those most likely to draw the viewer's attention). Although the generation of these maps is not particularly difficult on a conceptual level, doing so is quite computationally expensive if using a serial approach. Below, I provide code for quickly generating the component contrast maps needed to build a saliency map by parallelizing the task on the GPU, as adapted from MATLAB code provided by Vicente Ordonez of SUNY. To run this, you'll need pyopencl v0.92, numpy, and PIL.
Having learned a little about d3.js at a hackathon I attended, I decided to put it to good use. Below is an interactive implementation of the k-means clustering algorithm, as visualized using d3.js. Click to place points. Centroids are (initially) set on randomly chosen points, and the number of clusters is set by a supposed rule of thumb. You can choose to step through the algorithm one centroid-readjustment/simultaneous point assignment at a time, or view the process in real time.
Our genetic algorithm had an incredibly large search space to explore, given this, it was imperative that multiple instances of the algorithm be run in parallel. Several different populations were maintained across multiple computers, with the standard population consisting of 150 AIs. The most successful members of different populations were occasionally manually merged into a single population, so as to allow for cross-population competition. Particularly successful members were also manually archived for later inclusion in a sample pack of pre-evolved AIs (available now on StratLoc's homepage). Overall, while many hundreds of thousands of games of StratLoc were played as part of the genetic algorithm, it was possible only to go through 40 or 50 generations of evolution per population.
Contrary to most genetic algorithms, one of the easiest parts of evolving AIs for StratLoc was the creation of the fitness function. Given that our goal was to have AIs capable of being competent players of the game, it made sense to rank AI success based on the percentage of played games they were able to win, when pitted against other (randomly selected) AIs in the same gene pool.
After each AI had participated in at least 5 games, all AIs which had a below-average fitness rating were discarded from the population. The empty slots were re-filled by children of various successful AIs. Three different methods were used to mate AIs. The first took the first half of the genome of one parent and combined it with the second half of the second parent's genome. The second mating method was similar, but instead of splitting the genomes evenly it would take a random number of bytes from the genome of one parent, then fill in the rest of the genome using the remaining parent. The third, final mating method was a simple "zipper" operation whereby the bytes of the child genome alternated in their parental source. Mutation was performed after mating on 5% of all children. If selected for mutation, each byte of an AI's genome would have a 6.67% chance of being replaced by a new, random byte.
Issues during evolution process
As the AIs were being evolved, members of the development team would occasionally take several members of a population and begin a game using them. Several days into AI generation, we observed that, although the AIs seemed to be successfully winning games of StratLoc against each other, their play was severely uninteresting. AIs never seemed to make any cities or even build mines. Instead, the metagame of the populations had seemed to favor the evolution of zerg-rush-like strategies. In the case of StratLoc, this meant spending the initial resources given to a player on a military unit instead of a worker (with which to cultivate an economy). The AIs would then be in a position of being unable to afford anything else, and were stuck with a single anti-air unit. However, the fast production of said unit would more often than not allow them to defeat other AIs who were "properly" playing the game. When several AIs deployed the same rush strategy, the game essentially became a matter of which AI could get their unit to opponents' undefended cities first.
That this was allowed to happen was a mixture of several failures on my part. The first was a misunderstanding of what we were really attempting to evolve: while the fitness function used by the genetic algorithm was successfully producing AIs that could win the game, what we wanted were AIs that were interesting and challenging to play against for humans. This is to say, while rushing may have technically been the best strategy for StratLoc, it certainly wasn't one that made for enjoyable and engrossing gameplay. Solving this involved awarding not only a single point to AIs for winning games, but also additional partial points for each city owned at the end of the game. Doing this meant that even AIs that didn't flat-out win games could still be rewarded for having built (or at least captured) a large number of cities; it also meant that, overall, games wherein more cities were produced ended up awarding higher fitness ratings to participant AIs.
The second issue highlighted by the evolution and dominance of the rush strategy was one of game balance, for which I was also responsible. To counteract the success of the rush, I adjusted the costs of units and initial starting resources for players such that it was impossible to build a military unit on the first turn. In this way, the only possible path to building military units was to first build a worker with which to develop an economy, then go from there. While this simple change would have eventually resulted in AIs which were more economically-minded during the early game, the prevalence of the rush strategy amongst the AIs would have meant that, for many generations, AIs would continue to try (unsuccessfully) to build military units at the game's start. It was therefore necessary (for the sake of time, at least) to hard-code actions for the AIs for the first few turns of the game, which would lead them to build workers and create mines. While this may have in some sense ruined the "purity" of the genetic algorithm, it ultimately made the game more interesting and playable for humans and prevented the AIs from having to re-evolve more sensible strategies for the beginning of the game.