Saturday, March 28, 2009

Typing Monkeys and Genetic Algorithms

Recently I spent time playing with generic algorithms. I was looking for something easy enough to implement (read: PLAIN LAZY) and I found the Hello World! (literally) of genetic algorithms.

Gathering the algorithm itself from the C++ example (in the link above), I re-designed and re-coded the whole thing in C#.

TypingMonkey is a very simple application which uses a number (defined by the user) of random strings to evolve the user input text.

From the screen-shot below you can see how it took the algorithm 87 generations to evolve the string ".NET Butchering" from an initial population of 2500 random 15-characters strings:



Why "typing monkey"? Inspired by the Infinite Monkey Theorem, I am pretending our population of random string is generated by a typing monkey:

/// The Typing monkey generates random strings - can't be static 'cause it's a monkey.
/// If you wait long enough it will eventually produce Shakespeare.
class TypingMonkey
{
private const string legalCharacters = " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890.";

static Random random = new Random();

/// The Typing Monkey Generates a random string with the given length.
public string TypeAway(int size)
{
StringBuilder builder = new StringBuilder();
char ch;

for (int i = 0; i < size; i++)
{
ch = legalCharacters[random.Next(0, legalCharacters.Length)];
builder.Append(ch);
}

return builder.ToString();
}
}

You can find the source code here: http://github.com/JohnIdol/typingmonkey/tree/master

Feel free to contribute or despise.

6 comments:

Anonymous said...

"Feel free to contribute or despise." - We can't hate you, we love you too much Johnny Idol!

Johnny Idol said...

@Anonymous

thanks for the love dawg.

P.S. I obviously hope you are a girl

Anonymous said...

Only on Fridays, when my name is Phyllis!

Johnny Idol said...

@Anomynmous

if that's a joke you're gonna have to explain it

EJ Alexandra said...

Just out of curiosity, your "EvolveGeneration" method preserves the "elite" members of that generation, but then randomly mates all of the rest.

The algorithm seems to be much more efficient if, rather than just preserving the "most elite", you simply mate the "most elite" with the other members of the populate - no? Isn't this what happens in nature? The most elite don't live longer, but they get to mate with whomever they choose and therefore have more children.

It gets even more efficient if the most elite only mate with the the top 50% of the rest of the populate (for example).

I just changed lines 82 and 83 in the evolution manager to:

int activeMateIndex = Dice.Roll(0, this.population.Count / 5);
int passiveMateIndex = Dice.Roll(0, this.population.Count / 2);

And it seems to find the target string in ~30-40 generations (rather than 100's). Am I missing something, or is "mating the fittest" a closer match to nature?

ej

Johnny Idol said...

@ej

Awesome comment man, thanks! your findings are extremely interesting and you do make great points.

"Preserving the elite" is a trick/cheat often used in GAs and I blindly applied it.

I think "mating the fittest" is definitely a closer match to nature and I like it much more. I would prefer to mate the fittest with all the candidates (including the other elite candidates) without constraining to the top 50% to retain a bit more randomness and keep the door open to "creative" solutions (which will often help you escape local fitness maxima in more complicated problems), but in this case it seems to do the job!

Thanks again!