posted on: Wednesday, December 4, 2013 (4:02 am) by Chase Stevens
In this blog post, I’d like to introduce a project I’ve been tinkering with on and off (mostly off) for the past few months. The idea behind the project was pretty simple: to write a program which, given a set number of triangles and a blank canvas, adjusts the coordinates, color, and transparency of the triangles over time in an attempt to replicate a given image. There were two primary motivations behind my project, at least initially:
to create (lossily) compressed versions of the source images that still look decent (and might even reduce image noise), and
to have those compressed versions themselves be scalable to any size.
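As a rough illustration of the idea, here is a minimal sketch of such a search loop. It assumes simple hill climbing (mutate one triangle, keep the change if the error does not get worse), a white starting canvas, and a sum-of-squares pixel error; none of these choices is confirmed by this post, and every name below is hypothetical.

```python
import random

def inside(tri, px, py):
    """Return True if the point (px, py) lies inside or on the triangle."""
    (x0, y0), (x1, y1), (x2, y2) = tri["pts"]
    # The sign of each cross product says which side of an edge the point is on.
    d0 = (x1 - x0) * (py - y0) - (y1 - y0) * (px - x0)
    d1 = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
    d2 = (x0 - x2) * (py - y2) - (y0 - y2) * (px - x2)
    has_neg = d0 < 0 or d1 < 0 or d2 < 0
    has_pos = d0 > 0 or d1 > 0 or d2 > 0
    return not (has_neg and has_pos)

def render(triangles, width, height):
    """Alpha-blend the triangles, in order, onto a white canvas (assumed)."""
    canvas = []
    for y in range(height):
        for x in range(width):
            px, py = (x + 0.5) / width, (y + 0.5) / height  # pixel centre
            r, g, b = 1.0, 1.0, 1.0
            for t in triangles:
                if inside(t, px, py):
                    a = t["alpha"]
                    cr, cg, cb = t["rgb"]
                    r, g, b = r + a * (cr - r), g + a * (cg - g), b + a * (cb - b)
            canvas.append((r, g, b))
    return canvas

def error(canvas, target):
    """Sum of squared per-channel differences between two pixel lists."""
    return sum((c - t) ** 2 for p, q in zip(canvas, target) for c, t in zip(p, q))

def random_triangle(rng):
    return {"pts": [(rng.random(), rng.random()) for _ in range(3)],
            "rgb": (rng.random(), rng.random(), rng.random()),
            "alpha": rng.random()}

def mutate(triangles, rng):
    """Return a copy of the list with one triangle randomly perturbed."""
    def jitter(v):
        return min(1.0, max(0.0, v + rng.gauss(0, 0.1)))
    new = list(triangles)
    i = rng.randrange(len(new))
    t = new[i]
    new[i] = {"pts": [(jitter(x), jitter(y)) for x, y in t["pts"]],
              "rgb": tuple(jitter(c) for c in t["rgb"]),
              "alpha": jitter(t["alpha"])}
    return new

def evolve(target, width, height, n_triangles, iterations, seed=0):
    """Hill-climb a set of triangles towards the target pixel list."""
    rng = random.Random(seed)
    best = [random_triangle(rng) for _ in range(n_triangles)]
    best_err = error(render(best, width, height), target)
    for _ in range(iterations):
        candidate = mutate(best, rng)
        candidate_err = error(render(candidate, width, height), target)
        if candidate_err <= best_err:  # keep only changes that do not hurt
            best, best_err = candidate, candidate_err
    return best, best_err
```

Here `target` is a flat, row-by-row list of `(r, g, b)` tuples with channels in the range 0 to 1. Because a mutation is only kept when the error does not increase, the error can never rise from one iteration to the next.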
While I found that my program met these goals fairly adequately, what I was both surprised and delighted by was its aesthetically pleasing and almost artful representations of the source images. Therefore, before delving into the more technical aspects of the implementation, I’d like to present a few samples of the program’s results. The first example uses 50 triangles, while the other two use 100.
This was one of the very first tests of my program, and is in my opinion also one of its most successful and impressive recreations. As you might infer from the stark difference between iteration 9,500 and 50,000 and moderate difference between iteration 50,000 and 1,293,000, the program’s output approaches complete similarity to the target image roughly logarithmically with each iteration; this trend is plotted in the graph below.
The corresponding vector for the final result (below) is actually noticeably less accurate than the JPEG posted above, because the program has exploited JPEG artifacts as well as the specific dimensions of the target in creating its reproduction. When the triangles are used to create an SVG, these artifacts are no longer present and the image becomes less faithful.
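To make the SVG step concrete, here is a minimal sketch of how a list of triangles could be serialized. The triangle layout (three points in unit-square coordinates, an RGB tuple, and an alpha value) and the function name are assumptions for illustration, not the program's actual format.

```python
def triangles_to_svg(triangles, width, height):
    """Serialize triangles as an SVG string.

    Each triangle is (points, (r, g, b), alpha), where points holds three
    (x, y) pairs in the range 0..1, colors are 0..255 and alpha is 0..1.
    """
    parts = [
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 %d %d">' % (width, height),
        '<rect width="100%" height="100%" fill="white"/>',  # blank canvas
    ]
    for pts, (r, g, b), alpha in triangles:
        points = " ".join("%.2f,%.2f" % (x * width, y * height) for x, y in pts)
        parts.append(
            '<polygon points="%s" fill="rgb(%d,%d,%d)" fill-opacity="%.3f"/>'
            % (points, r, g, b, alpha)
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

Because the file only declares a `viewBox` and no fixed pixel size, a viewer can scale the result to any dimensions, which is what makes the output resolution-independent.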
One of the most noticeable issues with the program’s image is that it entirely fails to capture the eyes in the source image. I explored various methods of solving this issue, which I will discuss in a later post on implementation. Overall, though, I thought that the program produced a more pleasing (if less accurate) image, kilobyte-for-kilobyte, than comparable "traditional" compression techniques (as pictured below), especially if enlarged.
Program’s final output. Image size: 2.74 KB; compressed SVG size: 0.88 KB
Original image. Size: 6.53 KB
Image compressed with Paint.NET. Size: 1.91 KB; compressed size: 1.58 KB
To finish off this example: a video of the program’s progress in making its output played at 30 frames per second, with frames produced every 500th iteration. The "jumps" in activity visible at 0:23 and 0:56 are instances when I made updates to the program.
This example yet again highlighted an issue with the image similarity evaluation methods employed by the program. While many aspects of the source image are replicated with high fidelity, the facial features of Christian Bale are not. Below again is a video of the program’s progress over time.
In this instance, I experimented with another evaluation method that incorporated a saliency comparison between the target image and the output. Although the results presented here seem promising, other tests using the method were lacklustre, especially given that the saliency map calculations added a lot of processing time to each iteration.
Recently, a couple of friends and I entered the Brilliant.org Hunger Games competition, in which competitors must create programs which are pitted against others in an arena of sorts. In any given round of the competition, the programs must choose for each other program whether to cooperate with them or take advantage of them. In the Hunger Games competition, these are framed as "hunting" and "slacking".
At face value, the competition is a very standard iterated prisoner’s dilemma. For the general case, the optimal strategy for this game has already been discovered: tit-for-tat. The essence of this strategy is to reciprocate everything done to you in the previous round, and cooperate with everyone in the first round in a show of "good faith". However, the Hunger Games competition had a slight twist: although your program would be able to see the overall reputation of each opponent (id est how many times they hunted versus slacked), no other identifying information would be supplied. This limited knowledge makes true tit-for-tat impossible, since your opponents are effectively anonymized. Although you may know that a player you decided to cooperate with in the previous round defected against you, there’s no way to retaliate against the same player in this round with 100% confidence.
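For reference, classic tit-for-tat with fully identified opponents fits in a few lines. This is a minimal sketch of the textbook strategy, not of our entry; the function names are made up for illustration, and `True` stands for cooperating ("hunting").

```python
def tit_for_tat(opponent_history):
    """Cooperate in the first round, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else True

def always_defect(opponent_history):
    """A maximally uncooperative opponent, used here only for comparison."""
    return False

def play(strategy_a, strategy_b, rounds):
    """Play two strategies against each other and return both move lists."""
    moves_a, moves_b = [], []
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each side sees only the other's past moves
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
    return moves_a, moves_b
```

Note that `tit_for_tat` needs the history of one specific opponent, which is exactly the information the competition withholds.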
My team’s strategy, nicknamed "The Underminer", both compensated for this lack of knowledge and managed to exploit it. We started with the assumption that tit-for-tat is possible, to a degree. As the number of rounds increases, the reputations of individual opponents become solidified, thus making reputation a means of identification. Although a player with a reputation of 1.0 in round one could drop to a reputation of 0.5 in round two, a player with a reputation of 1.0 in round 20 can only drop to 0.95. Based on this, one can say that the ranking of players by reputation remains mostly static: the worst-ranked opponent will have a very difficult time becoming the highest-ranked. While this is very untrue in the first rounds, at a certain point changing ranks becomes almost impossible. This phenomenon can be imagined like a zipper: before any zipping has occurred, there’s a very large degree of freedom of motion available. However, as you begin to zip, the freedom of motion becomes more and more constrained, until none remains.
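The arithmetic behind the zipper effect can be sketched as follows, assuming every player makes the same number of decisions in each round (so that reputation is simply the fraction of past decisions that were hunts). The function name is hypothetical.

```python
def reputation_bounds(reputation, rounds_played):
    """Lowest and highest reputation reachable after one more round."""
    hunts = reputation * rounds_played           # rounds' worth of hunting so far
    lowest = hunts / (rounds_played + 1)         # slack on every decision next round
    highest = (hunts + 1) / (rounds_played + 1)  # hunt on every decision next round
    return lowest, highest
```

Under that assumption, the window a player can move within during one round is `1 / (rounds_played + 1)` wide, so it shrinks steadily as the game goes on.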
While our program implements tit-for-tat as described above in most circumstances, there’s a very specific scenario in which it deviates from this otherwise optimal strategy. As mentioned, the "tit-for-tat" allowed by the Hunger Games competition is not foolproof, since player identities can only be approximately tracked at the beginning of the game. Assuming other players are also tracking opponents by reputation, we can exploit this limitation in knowledge by occasionally attempting to "undermine" another player, assuming their identity. This technique is probably best illustrated by example. Suppose in some round we’re ranked 2nd by reputation at 0.7, while another player is ranked 3rd at 0.6 reputation. If both we and the 3rd ranked player slacked completely this round, there would be no way for us to displace our opponent as 3rd ranked player, since they already have the "bad reputation" head-start. However, the likelihood of our opponent slacking completely this round is very low. In fact, the decisions of our opponent can be estimated given their current reputation, the number of rounds elapsed so far, and a rating of how certain we want our estimation to be by using the lower bound of the Wilson score interval. While this formula is most often used to rank items based on user reviews (and is employed perhaps most famously by Reddit’s "best" comment ranking), in this case we can use it to infer the "true" cooperation rate of opponents and, based on this, their (probable) reputation at the end of the round. Supposing in this circumstance that we predict with a large amount of certainty that our opponent’s reputation at the end of this round will be at worst 0.55, and we can manage to lower our reputation below that, then we choose to slack completely this round.
Assuming that the other player remained at 0.6 reputation, while we dove down to 0.5, from a third player’s perspective, this is what happened: the 2nd ranked player went from 0.7 reputation to 0.6 reputation, and the 3rd ranked player went from 0.6 reputation to 0.5 reputation. For the third player to make the assumption that the 2nd ranked player went under the 3rd ranked player - going from 0.7 reputation to 0.5 - would be a strange and unintuitive leap of logic. So, in this way, we can choose to take the advantageous route of fully slacking while passing off some of the negative repercussions of this to a worse-ranked player.
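The estimate and the resulting decision can be sketched roughly as below. The Wilson lower bound is the standard formula; the bookkeeping around it (counting hunts and total decisions, the `decisions` per round, and the function names) is an assumption for illustration and may differ from what our submitted program does.

```python
import math

def wilson_lower_bound(successes, trials, z=1.96):
    """Lower bound of the Wilson score interval (z=1.96 is roughly 95%)."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denominator = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denominator

def should_undermine(our_hunts, our_total, opp_hunts, opp_total, decisions, z=1.96):
    """Decide whether slacking completely would drop us below the opponent.

    *_hunts and *_total are counts of past decisions; `decisions` is the
    number of decisions each player makes this round (assumed equal).
    """
    # Our reputation if we slack on every decision this round.
    our_floor = our_hunts / (our_total + decisions)
    # Pessimistic estimate of how often the opponent will hunt this round.
    opp_rate = wilson_lower_bound(opp_hunts, opp_total, z)
    opp_worst = (opp_hunts + opp_rate * decisions) / (opp_total + decisions)
    return our_floor < opp_worst
```

With reputations of 0.7 and 0.6 built on only 10 decisions each, `should_undermine(7, 10, 6, 10, 10)` says the undermine is feasible; with the same reputations built on 100 decisions, `should_undermine(70, 100, 60, 100, 10)` says it is not, which matches the zipper intuition.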
In the above, we can see The Underminer repeatedly performing this tactic against a standard tit-for-tat bot. After each undermine, The Underminer reverts to a basic tit-for-tat strategy, which in this simulation caused its reputation to increase over time. As soon as it’s again in a position where it can undermine the tit-for-tat player, it does, repeatedly reaping the rewards of doing so.
Unfortunately, I can’t yet reveal whether The Underminer was successful or not – mostly because I haven’t found out myself. Hopefully, however, in two weeks’ time the results of the Hunger Games competition will be announced, at which point I’ll write another analysis on how our program fared overall. In the meantime, you can go check out the source code for our program as well as the testing/simulation program we used at this GitHub repo.
Tags: anonymity, game theory, probability, programming, python
In the field of Computational Neuroscience, saliency maps are a means of graphically representing the areas of any visual scene presenting the most "bottom-up" saliency to a human observer (i.e. those most likely to draw the viewer's attention). Although the generation of these maps is not particularly difficult on a conceptual level, doing so is quite computationally expensive if using a serial approach. Below, I provide code for quickly generating the component contrast maps needed to build a saliency map by parallelizing the task on the GPU, as adapted from MATLAB code provided by Vicente Ordonez of SUNY. To run this, you'll need pyopencl v0.92, numpy, and PIL.
posted on: Thursday, September 13, 2012 (4:01 pm) by Chase Stevens
I've recently been working on implementing various methods of image reconstruction in Python. The idea is to take several imperfect (that is to say, noisy, incomplete, photoshopped, or otherwise damaged) copies of some original image and combine them to arrive at something close to (if not precisely) the original. By these means, an approximation of the original image can be generated should the original be lost, itself damaged, irretrievable, or otherwise unavailable or useless. In writing the various functions for doing this, I implemented techniques used in signal averaging and variants thereof. I also implemented a “modal image” function which, for each pixel, uses the “modal” RGB values across all image copies or, failing that, performs a simple mean of values.
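To illustrate the “modal image” idea, here is a minimal sketch that works on plain lists of `(r, g, b)` tuples rather than real image objects. It assumes the mode is taken per channel and that "failing that" means no single value is strictly the most common; both are guesses on my part, and the function names are hypothetical.

```python
from collections import Counter

def modal_value(values):
    """Return the unique most common value, or the integer mean on a tie."""
    counts = Counter(values).most_common(2)
    if len(counts) == 1 or counts[0][1] > counts[1][1]:
        return counts[0][0]
    return sum(values) // len(values)

def modal_image_sketch(copies):
    """Combine equally sized images, each a list of (r, g, b) tuples."""
    result = []
    for pixels in zip(*copies):   # the same pixel position in every copy
        channels = zip(*pixels)   # all reds, all greens, all blues
        result.append(tuple(modal_value(list(channel)) for channel in channels))
    return result
```

With three copies in which only one disagrees at a given pixel, the two agreeing copies win the vote; when every copy disagrees, the result falls back to the mean.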
Examples and Analysis
For the following examples, I modified the above image of actor Christian Bale. Ironically enough, while testing for this article, I overwrote the original image and had to use Google's reverse image search to find it again.
Function results (listed with difference from original image as given by get_delta, lower is better):
<function average_images_add at 0x00000000030E8F28> : 155.35693875
<function average_images_sub at 0x00000000030E95F8> : 72.4844575
<function average_images at 0x0000000002EF0208> : 43.92254625
<function average_noisefilter_all_sub at 0x0000000002EF0278> : 51.1805645833
<function average_noisefilter_all_delta at 0x0000000002EF02E8> : 36.9071316667
<function modal_image at 0x0000000002EF0358> : 42.53322
As is readily visible, in this example the naïve “voting”-type approach used by modal_image is deceived in any case where forms of damage across multiple images “agree” with each other. Given a larger number of copies or a less artificial form of damage, this would likely cease to be an issue; in theory, modal_image could even go so far as to reconstruct the original image perfectly. average_noisefilter_all_delta produced the best results on average, although, to its detriment, its output relies on the order of the list of image copies passed to it. In addition, while it manages to be closer to the original image, the subtraction of image differences it employs creates a slightly “jarring” visual effect (as seen above).
The inherent issue in reconstructing damaged images is one of knowledge. To humans, it seems obvious that the original image of Christian Bale didn't have streaks of white across it. However, this is a mere assumption based on our vast pool of experience in viewing photographs. Who's to say that the original photo didn't have white scribbles on it? The computer is hampered, from our perspective, by its inability to make these assumptions when reconstructing the original image, so although it produces results better than a mere superposition of the copies, they rarely are better than what could be accomplished by human hands.
Function results (listed with difference from original image as given by get_delta, lower is better):
<function average_images_add at 0x00000000030E8F28> : 103.01672125
<function average_images_sub at 0x00000000030E95F8> : 81.929801875
<function average_images at 0x0000000002EF0208> : 82.971985
<function average_noisefilter_all_sub at 0x0000000002EF0278> : 69.6356495833
<function average_noisefilter_all_delta at 0x0000000002EF02E8> : 75.23682
<function modal_image at 0x0000000002EF0358> : 65.2880416667
In this case, modal_image produced the best results, although its output is still quite noisy. Again, having more copies would significantly improve the end product. The output also appears to be slightly brighter than would be expected. average_noisefilter_all_sub comes in at a close second, although (to the human eye) its results appear to be quite a bit noisier than those of modal_image.
I tried to use these image reconstruction methods on compressed images with JPEG artifacts as well, although the results were much poorer. Output deltas ended up being worse than that of the least compressed copy of a given set, although better than that of an average copy. modal_image and average_images_add seemed to perform best. The overall problem was again one of knowledge: if less compressed images could be prioritized or weighted according to their closeness to the source, results would likely be much better. However, the computer has no means by which to determine which copy is closest to the original, and thus cannot exploit this. Some kind of programmatic detection of JPEG artifacts could be of great help in improving this situation.
As a final note before my code, I just recently launched Broverbs. Please do check it out if you've got the time!
A while ago, a friend of mine who was unfamiliar with Python asked me how to go about parsing a CSV into a list of lists. Moreover, the data included integers which needed to be parsed as such (and not as strings). My solution was as follows:
# Takes an open file; returns a list of rows, with all-digit fields (including "0") converted to int.
(lambda f: [[int(x) if x.isdigit() else x for x in line.rstrip('\r\n').split(',')] for line in f])
Just yesterday, another friend of mine accidentally merged and subsequently sorted two Excel spreadsheets, and overwrote one with the result. She asked if I could take the merged file and the remaining original file and reconstruct the file she had overwritten. Incorporating the CSV parsing function I had written earlier, I wrote this:
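from tkFileDialog import asksaveasfile, askopenfile  # Python 2 module name; in Python 3 it is tkinter.filedialog
# Parse an open CSV file into rows, converting all-digit fields to int.
parse = lambda f: [[int(x) if x.isdigit() else x for x in line.rstrip('\r\n').split(',')] for line in f]
full = parse(askopenfile())     # the merged spreadsheet
exclude = parse(askopenfile())  # the surviving original
# Write every merged row that is absent from the surviving original, converting ints back to text.
(lambda f: f.writelines(','.join(map(str, row)) + '\n' for row in full if row not in exclude))(asksaveasfile())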
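For anyone who prefers something more readable, the same idea can be sketched in Python 3 with the standard `csv` module, which also copes with quoted fields that contain commas. This is a swapped-in alternative, not the code I actually ran, and the function names are made up for illustration.

```python
import csv
import io

def parse_rows(text):
    """Parse CSV text into a list of rows, converting all-digit fields to int."""
    reader = csv.reader(io.StringIO(text))
    return [[int(cell) if cell.isdigit() else cell for cell in row] for row in reader]

def subtract_rows(full, exclude):
    """Return the rows of `full` that do not appear anywhere in `exclude`."""
    return [row for row in full if row not in exclude]

def rows_to_csv(rows):
    """Serialize rows back to CSV text, one row per line."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

One caveat that applies to both versions: a field such as `007` is converted to the integer 7, so leading zeros are lost when the rows are written back out.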