Our genetic algorithm had an incredibly large search space to explore; given this, it was imperative that multiple instances of the algorithm be run in parallel. Several different populations were maintained across multiple computers, with the standard population consisting of 150 AIs. The most successful members of different populations were occasionally manually merged into a single population, so as to allow for cross-population competition. Particularly successful members were also manually archived for later inclusion in a sample pack of pre-evolved AIs (available now on StratLoc's homepage). Overall, while many hundreds of thousands of games of StratLoc were played as part of the genetic algorithm, it was possible only to go through 40 or 50 generations of evolution per population.
In contrast to many applications of genetic algorithms, one of the easiest parts of evolving AIs for StratLoc was the creation of the fitness function. Given that our goal was to have AIs capable of being competent players of the game, it made sense to rank AI success by the percentage of played games they were able to win when pitted against other (randomly selected) AIs in the same gene pool.
After each AI had participated in at least 5 games, all AIs with a below-average fitness rating were discarded from the population. The empty slots were refilled by children of various successful AIs. Three different methods were used to mate AIs. The first took the first half of the genome of one parent and combined it with the second half of the second parent's genome. The second mating method was similar, but instead of splitting the genomes evenly it would take a random number of bytes from the genome of one parent, then fill in the rest of the genome using the remaining parent. The third, final mating method was a simple "zipper" operation whereby the bytes of the child genome alternated in their parental source. Mutation was performed after mating on 5% of all children. If selected for mutation, each byte of an AI's genome would have a 6.67% chance of being replaced by a new, random byte.
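The three mating methods and the mutation step can be sketched as follows. This is a minimal sketch: the genome length and function names here are my own, and the real StratLoc genome was vastly larger.

```python
import random

GENOME_LEN = 64                 # illustrative only; the real genome was megabytes long
MUTATION_RATE = 0.05            # 5% of children are selected for mutation
BYTE_MUTATION_CHANCE = 0.0667   # ~6.67% chance per byte once selected

def mate_half_split(a, b):
    """First half of parent a's genome joined to the second half of parent b's."""
    mid = len(a) // 2
    return a[:mid] + b[mid:]

def mate_random_split(a, b):
    """A random-length prefix of a, with the remainder filled in from b."""
    cut = random.randrange(len(a) + 1)
    return a[:cut] + b[cut:]

def mate_zipper(a, b):
    """Alternate bytes between the two parents."""
    return [a[i] if i % 2 == 0 else b[i] for i in range(len(a))]

def maybe_mutate(child):
    """5% of children mutate; each byte then has a ~6.67% replacement chance."""
    if random.random() < MUTATION_RATE:
        return [random.randrange(256) if random.random() < BYTE_MUTATION_CHANCE else byte
                for byte in child]
    return child
```

The half-split and zipper operators are deterministic given the parents; only the random-split and mutation steps draw on the random number generator.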
Issues During the Evolution Process
As the AIs were being evolved, members of the development team would occasionally take several members of a population and begin a game using them. Several days into AI generation, we observed that, although the AIs seemed to be successfully winning games of StratLoc against each other, their play was severely uninteresting. The AIs never seemed to build any cities or even mines. Instead, the metagame of the populations seemed to favor the evolution of zerg-rush-like strategies. In the case of StratLoc, this meant spending the initial resources given to a player on a military unit instead of a worker (with which to cultivate an economy). The AIs would then be unable to afford anything else, and were stuck with a single anti-air unit. However, the fast production of said unit would more often than not allow them to defeat other AIs who were "properly" playing the game. When several AIs deployed the same rush strategy, the game essentially became a matter of which AI could get its unit to opponents' undefended cities first.
That this was allowed to happen was the result of several failures on my part. The first was a misunderstanding of what we were really attempting to evolve: while the fitness function used by the genetic algorithm was successfully producing AIs that could win the game, what we wanted were AIs that were interesting and challenging for humans to play against. That is to say, while rushing may have technically been the best strategy for StratLoc, it certainly wasn't one that made for enjoyable and engrossing gameplay. Solving this involved awarding not only a single point to AIs for winning games, but also additional partial points for each city owned at the end of the game. Doing this meant that even AIs that didn't outright win games could still be rewarded for having built (or at least captured) a large number of cities; it also meant that, overall, games in which more cities were produced awarded higher fitness ratings to the participant AIs.
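The revised fitness calculation might look something like this. The city-bonus weight is illustrative; the post does not give the exact value used.

```python
def fitness(wins, games_played, cities_owned_per_game, city_bonus=0.1):
    """Average points per game: one point per win, plus a partial
    bonus for each city owned at the end of each game.

    `city_bonus` is an assumed weight -- the original value isn't stated."""
    if games_played == 0:
        return 0.0
    win_points = wins
    city_points = city_bonus * sum(cities_owned_per_game)
    return (win_points + city_points) / games_played
```

Under this scheme, an AI that loses every game but consistently holds several cities still outscores one that loses quickly with nothing, which is exactly the pressure away from pure rushing that was wanted.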
The second issue highlighted by the evolution and dominance of the rush strategy was one of game balance, for which I was also responsible. To counteract the success of the rush, I adjusted the costs of units and initial starting resources for players such that it was impossible to build a military unit on the first turn. In this way, the only possible path to building military units was to first build a worker with which to develop an economy, then go from there. While this simple change would have eventually resulted in AIs which were more economically-minded during the early game, the prevalence of the rush strategy amongst the AIs would have meant that, for many generations, AIs would continue to try (unsuccessfully) to build military units at the game's start. It was therefore necessary (for the sake of time, at least) to hard-code actions for the AIs for the first few turns of the game, which would lead them to build workers and create mines. While this may have in some sense ruined the "purity" of the genetic algorithm, it ultimately made the game more interesting and playable for humans and prevented the AIs from having to re-evolve more sensible strategies for the beginning of the game.
As mentioned in this series' previous post, the aim of evolving AIs for StratLoc was not to generate systems capable of manipulating individual units but, instead, to have them issue overall goals or directives for their turns in response to environmental data fed to them from the game. These directives, in turn, would be implemented by hardcoded finite state machines. An immediately obvious failure of this design was that AIs could easily issue directives that were flat-out impossible, such as attempting to attack an enemy player without possessing any units. There exist two naïve solutions to this issue, neither of which works particularly well. The first would be to ignore unattainable goals or unimplementable directives. This has the distinct disadvantage of allowing AIs to accomplish nothing during a turn, simply because they have not evolved to have goals compatible with the current state of the game. Moreover, because of the extremely large search space we were already tasking our genetic algorithm with exploring (thanks to the number of environmental variables we were feeding it, as will be discussed below), even AIs that for the most part had evolved excellent strategies could find themselves left dead in the water when presented with a novel situation. The other solution would be to simply feed more data to the AI, so that it could adapt to any situation, and could evolve in a manner such that, if it did not have military units, it would not attempt to attack anyone. However, as mentioned, the search space we were already operating in was quite large – it would have taken several years to fully explore it with our available hardware, according to my calculations. Each additional bit of information exposed to the genetic algorithm would have increased the search space by a factor of two, as well as doubling the size of the genome (which I had already calculated as being at least a sizable 6.4 megabytes).
My solution to this was to imbue the AI's directives with decompositionality. This is to say, for any action the AI might attempt to make that had circumstances under which it would fail, the state machine charged with executing that action would, under those circumstances, defer to an alternate state machine in an attempt to remedy the circumstances. As an example, suppose that in some situation the AI attempted to build a tank but had insufficient resources to do so. The task of making a tank would then decompose into the task of seizing additional resources. Should this be impossible (due, perhaps, to a lack of workers with which to build mines), the task would decompose into building workers, and so on. Even when an entire chain of actions led to a dead end, my design attempted to let the AI do something by (in most circumstances) deferring to a "default action" specified by the genome. Only in extraordinary cases would the end result of an AI directive lead to nothing being done.
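The tank example above can be sketched as a chain of fallbacks. The costs, state fields, and action names here are my own illustrations, not StratLoc's actual values.

```python
TANK_COST = 30    # assumed metal cost, for illustration
WORKER_COST = 10  # assumed metal cost, for illustration

def try_build_tank(state):
    """Attempt the directive; on failure, decompose into a remedial task."""
    if state["metal"] >= TANK_COST:
        return "build_tank"
    return try_seize_resources(state)

def try_seize_resources(state):
    """Can't afford the tank: try to expand the economy instead."""
    if state["workers"] > 0 and state["unmined_metal_tiles"] > 0:
        return "build_mine"
    return try_build_worker(state)

def try_build_worker(state):
    """No usable workers: try to make one, else fall back to the
    genome-specified default action."""
    if state["metal"] >= WORKER_COST:
        return "build_worker"
    return state["default_action"]
```

Each state machine in the chain either succeeds or hands control to the next, so a directive almost always bottoms out in some executable action.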
This section will give a general layout of the genome. If you are interested in specifics, I recommend taking a look at the official structure specification I wrote up before StratLoc was implemented. Overall, the genome can be broken into two sections. The first contains a series of "threshold" values and takes up 26 bits in total. The threshold values encode variables referenced by the action-implementing finite state machines, such as "combat win chance tolerance", variables that are used in the generation of environmental inputs, such as the "[unit] queue length tolerance", and the aforementioned "default" AI action. The second can be imagined as a 2D array, with the first index being a 21-bit value representing environmental conditions, and the second index pointing to any of five 5-bit AI action encodings, all of which are intended to be executed during a given turn. This allows for 32 possible actions. While most actions are straightforward orders to create units, attack enemy players, build structures or fortify locations, seven actions are reserved for manipulating the internal state of the AI. This allows it to alter its combat win chance tolerance (which will result in different combat behavior), alter its turns left threshold (which is reflected in the environmental conditions), or clear the AI's unit queue. Specifics as to the execution of AI actions can be found here, but be forewarned, these underwent some changes in the final implementation. Environmental conditions encode what occurred last round (such as being attacked or building cities), the ranking of the AI (in several regards), which military unit type was most numerous, and whether any of the AI's thresholds had been triggered. While it might seem like these sparse details would provide insufficient information to the AI, any increase in the size of the environmental conditions would have (quite literally) exponentially increased the difficulty of the genetic algorithm's task. 
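The action-table lookup described above can be sketched as follows. The bit-packing shown here is my own illustration of the idea, not the documented StratLoc genome layout.

```python
ENV_BITS = 21            # width of the environmental-condition index
ACTIONS_PER_TURN = 5     # five action slots per table entry
ACTION_BITS = 5          # 2**5 = 32 possible actions

def env_index(flags):
    """Pack the environmental-condition booleans into a single table index."""
    assert len(flags) == ENV_BITS
    idx = 0
    for flag in flags:
        idx = (idx << 1) | int(flag)
    return idx

def actions_for(env_idx, genome_bits):
    """Extract the five 5-bit action codes stored for this environment.

    `genome_bits` is the whole action table packed into one big integer --
    an illustrative encoding of my own, not the actual on-disk format."""
    offset = env_idx * ACTIONS_PER_TURN * ACTION_BITS
    return [(genome_bits >> (offset + i * ACTION_BITS)) & 0b11111
            for i in range(ACTIONS_PER_TURN)]
```

Note that a table of 2^21 entries at 25 bits each comes to roughly 6.5 megabytes, which is consistent with the genome size mentioned earlier.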
Overall, however, the AIs seemed to do quite well with the details they were provided and, regardless, the intelligent decomposition of AI actions ensured that even totally randomly generated AIs would have been moderately successful at playing StratLoc.
In this series of posts, I’ll be discussing my experience designing and helping to implement AI opponents for the turn-based, open-source strategy game StratLoc. The AIs were created using a genetic algorithm, which crudely simulated natural selection and evolution, allowing our AIs (over the generations) to become competent players of the game. Although the use of genetic algorithms in video games is not uncommon, the technique is often applied when something needs fine-tuning, such as the numeric value of some "aggressiveness" variable, or for the optimization of well-defined subtasks with limited scope, such as situational unit movement patterns or the micromanagement of a base. Our AI, on the other hand, was tasked with providing turn-by-turn, high-level strategies and actions in response to the overall climate of the game, the execution of which was then handled by a hardcoded system.
This first introductory post will give an overview of StratLoc as a game, so as to provide some context. StratLoc is a turn-based strategy game played on a hexagonal board, à la Civilization (sans a great deal of complexity). The objective of the game is to eliminate your opponents by capturing their cities. The game is played on a randomly generated map featuring plains, hills, mountains and water tiles (the latter two being impassable), metal tiles, and four starting cities (one for each player). From cities, players can build military units and workers; players must also control at least one city to avoid losing the game. Cities are permanent fixtures which can be captured by opposing players.
The game has a single overt resource, "metal", which can be harvested from metal tiles by building mines. Mines provide a steady stream of metal per turn, inversely proportional to the mine's distance from its owner’s nearest city. Mines, once built, cannot be destroyed, but can switch ownership if captured by another player. The player must also manage "power", which acts as a supply cost for units. Power is provided in small quantities by cities and in bulk by power stations. Power stations are not capturable, but may be destroyed by enemies. A final building, the fortification, can be built in order to provide a defensive bonus and gain vision around a tile. Fortifications are permanent but capturable. All buildings (including cities) must be built using a "worker" unit, which is destroyed in the process.
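The mine income rule can be expressed as a one-liner. The base rate and the exact shape of the formula are my guesses; the post states only that income is inversely proportional to city distance.

```python
def mine_income(distance_to_nearest_city, base_rate=12):
    """Metal per turn from a mine, inversely proportional to the
    distance (in tiles) to the owner's nearest city.

    `base_rate` is an assumed constant, not StratLoc's actual number."""
    return base_rate / max(1, distance_to_nearest_city)
```

This rule rewards compact empires: a mine adjacent to a city yields the full base rate, while a distant mine yields only a trickle.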
There are three military units in the game: anti-air, tanks, and air. Anti-air units have a standard range and low base damage, but are cheap and gain a combat advantage against air units. Tanks are slightly more expensive, with standard base damage but high range. Tanks also suffer a penalty against air. Air units have a high metal cost, high power usage, and low HP. However, they make up for this by providing a huge line of sight and high mobility. In this way, combat strategy is more or less a game of rock-paper-scissors, with air countering tanks, tanks countering anti-air, and anti-air (obviously) countering air. Units can gain a defensive bonus by being located on a city, fortification, or hill, or by accumulating a fortification bonus by remaining in the same spot for several turns.
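The rock-paper-scissors relationship can be captured in a small lookup table. The multiplier values here are illustrative, not StratLoc's actual combat numbers.

```python
# Counter relationships: each unit type -> the type it gains a bonus against.
COUNTERS = {"air": "tank", "tank": "anti_air", "anti_air": "air"}

def damage_modifier(attacker, defender, bonus=1.5, penalty=0.75):
    """Damage multiplier for attacker vs. defender.

    The 1.5x / 0.75x figures are assumed for illustration."""
    if COUNTERS[attacker] == defender:
        return bonus      # attacker counters the defender
    if COUNTERS[defender] == attacker:
        return penalty    # attacker is itself countered
    return 1.0            # mirror matches are neutral
```

Because the counter graph is a three-cycle, every unit type has exactly one favorable and one unfavorable matchup, which is what keeps any single unit composition from dominating outright.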
posted on: Thursday, September 13, 2012 (4:01 pm) by Chase Stevens
I've recently been working on implementing various methods of image reconstruction in python. The idea is to, given several imperfect (that is to say, noisy, incomplete, photoshopped, or otherwise damaged) copies of some original image, attempt to arrive at something close to (if not precisely) the original image by combining said copies. Through these means, an approximation of the original image can be generated should the original be lost, itself damaged, irretrievable, or otherwise unavailable or useless. In writing the various functions for doing this, I implemented techniques used in signal averaging and variants thereof. I also implemented a “modal image” function which, for each pixel, uses the “modal” RGB values across all image copies or, failing that, performs a simple mean of values.
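The modal-image idea can be sketched as follows, shown on flat, single-channel images for brevity (the real code works per RGB channel; function and variable names here are my own).

```python
from collections import Counter

def modal_pixel(values):
    """For one pixel position: return the modal value across all copies,
    or fall back to the mean when no value repeats."""
    value, count = Counter(values).most_common(1)[0]
    if count > 1:
        return value
    return sum(values) / len(values)

def modal_image(copies):
    """Pixel-wise reconstruction from several damaged copies
    (flat single-channel pixel lists, all the same length)."""
    return [modal_pixel(pixel_values) for pixel_values in zip(*copies)]
```

With three or more copies, any form of damage that hits only one copy at a given pixel is voted out entirely; the mean fallback only kicks in when every copy disagrees.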
Examples and Analysis
For the following examples, I modified the above image of actor Christian Bale. Ironically enough, in testing for this article, I overwrote the original image and had to employ Google's reverse image-search in order to find it again.
Function results (listed with difference from original image as given by get_delta, lower is better):
average_images_add : 155.35693875
average_images_sub : 72.4844575
average_images : 43.92254625
average_noisefilter_all_sub : 51.1805645833
average_noisefilter_all_delta : 36.9071316667
modal_image : 42.53322
As is readily visible, in this example the naïve “voting”-type approach used by modal_image is deceived in any case where forms of damage across multiple images “agree” with each other. Given a larger number of copies or a less artificial form of damage, this would likely cease to be an issue; in theory, modal_image could even go so far as to reconstruct the original image perfectly. average_noisefilter_all_delta produced the best results on average, although, to its detriment, its output relies on the order of the list of image copies passed to it. In addition, while it manages to be closer to the original image, the subtraction of image differences it employs creates a slightly “jarring” visual effect (as seen above).

The inherent issue in reconstructing damaged images is one of knowledge. To humans, it seems obvious that the original image of Christian Bale didn't have streaks of white across it. However, this is a mere assumption based on our vast pool of experience in viewing photographs. Who's to say that the original photo didn't have white scribbles on it? The computer is hampered, from our perspective, by its inability to make these assumptions when reconstructing the original image, so although it produces results better than a mere superposition of the copies, they rarely are better than what could be accomplished by human hands.
Function results (listed with difference from original image as given by get_delta, lower is better):
average_images_add : 103.01672125
average_images_sub : 81.929801875
average_images : 82.971985
average_noisefilter_all_sub : 69.6356495833
average_noisefilter_all_delta : 75.23682
modal_image : 65.2880416667
In this case, modal_image produced the best results, although its output is still quite noisy. Again, having more copies would significantly improve the end product. The output also appears to be slightly brighter than would be expected. average_noisefilter_all_sub comes in at a close second, although (to the human eye) its results appear to be quite a bit noisier than modal_image.
I tried to use these image reconstruction methods on compressed images with JPEG artifacts as well, although the results were much poorer. Output deltas ended up being worse than the least compressed copy of a given set, although better than an average copy. modal_image and average_images_add seemed to perform best. The overall problem was again one of knowledge: had less compressed images been prioritized or weighted by their closeness to the source, results would likely have been much better. However, the computer has no means by which to determine which copy is closest to the original, and thus fails to leverage this. Some kind of programmatic detection of JPEG artifacts could be of great help in improving this situation.
As a final note before my code, I just recently launched Broverbs. Please do check it out if you've got the time!
posted on: Thursday, September 6, 2012 (1:28 pm) by Chase Stevens
People don't think philosophy is useful. Why is this? Macmillan Dictionary has two definitions of philosophy. The first, "the study of theories about the meanings of things such as life, knowledge, and beliefs", is what people think when they hear "philosophy". They think of boring and antiquated texts on the nature of what is sacred. They think of white-bearded men debating over the trivialities of pointless and vague quandaries. They think of glassy-eyed teenagers wondering aloud, "What if, like, our universe is just inside an atom in another universe?" These are not philosophy. Or, at least, these are not the heart of philosophy, but, rather, its side-effects. The second definition offered by Macmillan is more true to what philosophy is at its core: "a system of beliefs that influences someone's decisions and behaviour." Much like the purpose of the brain is to control the body's movement, the purpose of philosophy is to act as a tool for making decisions in your life. In trying to do so, philosophy might delve into complex and counter-intuitive ethical systems, philosophers might write long, tedious and precisely-worded treatises, and strange hypothetical situations or worlds might be concocted and considered, but philosophy is not a collection of books, or a series of musings, or a set of questions, or even a group of people. Philosophy is a process, a methodology, and, ultimately, a mode of thinking about things and examining situations. As I've previously bemoaned, not having philosophical foundations for decision-making or at least a sense of what things you value and what goals you wish to achieve (both of which philosophy can help determine) can lead to terrible, contradictory choices that are without sense and without purpose.
In the above video, Jonathan Blow, a wildly popular figure in the gaming world, talks about game design and what makes a game worth playing. He finds several things to be valuable: not only "fun", as traditionally emphasized in gaming, but also the ability to make one think, or to evoke an emotional response. This is exactly what has made Jonathan Blow so popular, at least from my perspective: he's taken the time to philosophize about video games, and examined the gaming world through the philosophical lens he's constructed. Far beyond that, he's used his philosophical framework to inform the design and development of his indie game hit, Braid. He decided, from a philosophical and ethical perspective, that games shouldn't try to "trick" the player but, instead, should respect the player's intelligence and time. The decisions he made in creating Braid reflected this considered viewpoint and (at least in part) made it the excellent and critically lauded game that it is. Regardless of whether you agree or disagree with his views, Jonathan Blow has become an influential and well-respected developer because he has internalized and applied the spirit of philosophy. And this, furthermore, is what I encourage you to do. Next time you need to make a decision, spend some time considering what you think has inherent value, how your goals reflect that, and how (if at all) your options appeal to those values. You might not end up making the right decision, but you'll at least make a well-reasoned and justifiable decision.