Evolving Game AI – Part 2 (Design and Genome)
posted on: Sunday, January 6, 2013 by Chase Stevens
This is part two of a three-part series. For the first installment, see Part One (Overview).
As mentioned in this series' previous post, the aim of evolving AIs for StratLoc was not to generate systems capable of manipulating individual units but, instead, to have them issue overall goals or directives for their turns in response to environmental data fed to them from the game. These directives would, in turn, be implemented by hardcoded finite state machines. An immediately obvious flaw in this design was that AIs could easily issue directives that were flat-out impossible, such as attempting to attack an enemy player without possessing any units.

There exist two naïve solutions to this issue, neither of which works particularly well. The first would be to ignore unattainable goals or unimplementable directives. This has the distinct disadvantage of allowing AIs to accomplish nothing during a turn simply because they have not evolved goals compatible with the current state of the game. Moreover, because of the extremely large search space we were already tasking our genetic algorithm with exploring (thanks to the number of environmental variables we were feeding it, as will be discussed below), even AIs that had for the most part evolved excellent strategies could find themselves left dead in the water when presented with a novel situation. The other solution would be to simply feed more data to the AI, so that it could adapt to any situation and could evolve such that, if it did not have military units, it would not attempt to attack anyone. However, as mentioned, the search space we were already operating in was quite large: by my calculations, it would have taken several years to explore fully with our available hardware. Each additional bit of information exposed to the genetic algorithm would have increased the search space by a factor of two, as well as doubling the size of the genome (which I had already calculated as being at least a sizable 6.4 megabytes).
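The doubling claim can be made concrete with a little arithmetic. This is just an illustrative sketch (the 21-bit figure comes from the genome layout described later in this post):

```python
# Each bit of environmental input doubles the number of distinct
# environment states the genome must map to actions.
def environment_states(env_bits: int) -> int:
    return 2 ** env_bits

environment_states(21)  # 2,097,152 distinct environment states
# One more input bit doubles both the search space and the genome's
# action table:
assert environment_states(22) == 2 * environment_states(21)
```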
My solution to this was to imbue the AI's directives with decompositionality. That is to say, for any action the AI might attempt that had circumstances under which it would fail, the state machine charged with executing that action would, under those circumstances, defer to an alternate state machine in an attempt to remedy the situation. As an example, suppose the AI attempted to build a tank but had insufficient resources to do so. The task of making a tank would then decompose into the task of seizing additional resources. Should this be impossible (due, perhaps, to a lack of workers with which to build mines), the task would decompose into building workers, and so on. Even when an entire chain of actions led to a dead end, my design attempted to let the AI do something by (in most circumstances) deferring to a "default action" specified by the genome. Only in extraordinary cases would the end result of an AI directive lead to nothing being done.
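The decomposition chain described above can be sketched as a series of fallback handlers. This is an illustrative sketch, not StratLoc's actual code; the costs, yields, and function names are all invented:

```python
# Hypothetical costs and yields, purely for illustration.
TANK_COST, WORKER_COST, MINE_YIELD = 10, 2, 3

def build_tank(state):
    if state["resources"] >= TANK_COST:
        state["resources"] -= TANK_COST
        state["tanks"] += 1
        return True
    # Can't afford a tank: decompose into seizing more resources.
    return seize_resources(state)

def seize_resources(state):
    if state["workers"] > 0:
        state["resources"] += MINE_YIELD * state["workers"]
        return True
    # No workers to mine with: decompose further into building one.
    return build_worker(state)

def build_worker(state):
    if state["resources"] >= WORKER_COST:
        state["resources"] -= WORKER_COST
        state["workers"] += 1
        return True
    # Dead end: fall back to the genome-specified default action.
    return default_action(state)

def default_action(state):
    # Stand-in for whatever the genome encodes; here, pass the turn.
    return False

# A directive to build a tank with 3 resources and no workers ends
# up building a worker instead of accomplishing nothing.
state = {"resources": 3, "workers": 0, "tanks": 0}
build_tank(state)
```

The point is that a single "build tank" directive almost always terminates in some concrete action (or the default), rather than being silently dropped.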
This section will give a general layout of the genome. If you are interested in specifics, I recommend taking a look at the official structure specification I wrote up before StratLoc was implemented. Overall, the genome can be broken into two sections. The first contains a series of "threshold" values and takes up 26 bits in total. The threshold values encode variables referenced by the action-implementing finite state machines, such as the "combat win chance tolerance"; variables used in the generation of environmental inputs, such as the "[unit] queue length tolerance"; and the aforementioned "default" AI action. The second section can be imagined as a 2D array, with the first index being a 21-bit value representing environmental conditions and the second index pointing to any of five 5-bit AI action encodings, all of which are intended to be executed during a given turn. Five bits allow for 32 possible actions. While most actions are straightforward orders to create units, attack enemy players, build structures, or fortify locations, seven are reserved for manipulating the internal state of the AI. This allows it to alter its combat win chance tolerance (which will result in different combat behavior), alter its turns left threshold (which is reflected in the environmental conditions), or clear its unit queue. Specifics as to the execution of AI actions can be found here, but be forewarned: these underwent some changes in the final implementation. Environmental conditions encode what occurred last round (such as being attacked or building cities), the AI's ranking (in several regards), which military unit type was most numerous, and whether any of the AI's thresholds had been triggered. While it might seem that these sparse details would provide insufficient information to the AI, any increase in the size of the environmental conditions would have (quite literally) exponentially increased the difficulty of the genetic algorithm's task. Overall, however, the AIs seemed to do quite well with the details they were provided and, regardless, the intelligent decomposition of AI actions ensured that even totally randomly generated AIs would have been moderately successful at playing StratLoc.
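To make the sizes above concrete, here is a sketch of reading the genome's action table, along with a size calculation. The bit widths are taken from the description above; the packing order (lowest bits first) and the helper names are my own assumptions, not StratLoc's actual layout:

```python
ENV_BITS = 21          # environmental-conditions index
ACTIONS_PER_TURN = 5   # five action slots per environment state
ACTION_BITS = 5        # 5 bits per action -> 32 possible actions
THRESHOLD_BITS = 26    # the "threshold" section of the genome

def decode_actions(row: int) -> list:
    """Unpack five 5-bit action codes from a 25-bit table row."""
    return [(row >> (ACTION_BITS * i)) & (2 ** ACTION_BITS - 1)
            for i in range(ACTIONS_PER_TURN)]

def genome_size_bytes() -> int:
    """Total genome size: thresholds plus the full action table."""
    table_bits = (2 ** ENV_BITS) * ACTIONS_PER_TURN * ACTION_BITS
    return (THRESHOLD_BITS + table_bits + 7) // 8

# A row holding action codes 2 and 3 in its first two slots:
row = (3 << ACTION_BITS) | 2
decode_actions(row)    # -> [2, 3, 0, 0, 0]
```

Under these assumptions the genome comes to roughly 6.5 million bytes, which squares with the "at least a sizable 6.4 megabytes" figure from the previous section.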
In my next post, I'll go over the implementation details of the genetic algorithm itself and discuss some of the problems we ran into with the AI.