# Haskell Genetic Algorithm "Hello, world!"

March 14, 2016: Reformatting

A post on reddit linked to several implementations of a cute "Hello, world!" program demonstrating a genetic algorithm that evolves towards a target string. Example programs were written in several languages, and I thought a Haskell version could be worthwhile as it demonstrates the use of random numbers, an issue that frustrates many newcomers to the language.

I use the MonadRandom package to deal with plumbing a random number generator through the code, but otherwise stick to lists and the usual cast of characters from the Prelude.

```{-# LANGUAGE NamedFieldPuns, TupleSections #-}
import Control.Applicative
import Control.Arrow (second)
import Control.Monad (liftM, replicateM)
import Control.Monad.Random
import Data.Function (on)
import Data.List (minimumBy, sortBy, nub, (\))
import Data.Ord (comparing)
import Text.Printf (printf)
```

The example programs associated with the original post all come with tests. Some of these are standard unit tests that verify hand-picked examples, while others verify properties of random processes. To support both kinds of tests, we use HUnit for unit tests, and QuickCheck for property testing.

```import Test.QuickCheck
import Test.QuickCheck.Monadic
import Test.HUnit hiding (assert)
```

For clarity, we create a type alias for the raw data we will be working with.

```type Gene = String

target :: Gene
target = "Hello, world!"
```

Next are the functions for working with individual `Gene`s.

```mate :: RandomGen g => Gene -> Gene -> Rand g Gene
mate g1 g2 = (++) <\$> flip take g1 <*> flip drop g2 <\$> pivot
where pivot = getRandomR (0, length g1 - 1)

mutate :: RandomGen g => Gene -> Rand g Gene
mutate g = (uncurry (++) .) . second . (:) <\$> delta <*> parts
where delta = getRandomR (' ', 'z')
idx = getRandomR (0, length g - 1)
parts = second tail . flip splitAt g <\$> idx

fitness :: Gene -> Int
fitness = sum . map abs . zipWith ((-) `on` fromEnum) target
```

We also want a random `Gene` generator.

```randomGene :: RandomGen g => Rand g Gene
randomGene = replicateM (length target) \$ getRandomR (' ', 'z')
```

Finally we can start thinking about populations. The example Python program I looked at had several parameters for initializing the population and controlling how it evolves. We capture those parameters in a record.

```data PopInfo = PopInfo { size      :: Int
, crossover :: Float
, elitism   :: Float
, mutation  :: Float }
```

A `Population` is a pair of a record describing the population and a collection of `Gene`s.

```type Population = (PopInfo, [Gene])

defaultPop :: PopInfo
defaultPop = PopInfo 1024 0.8 0.1 0.03
```

We will use a helper function to produce a randomized initial population.

```randomPop :: RandomGen g => PopInfo -> Rand g Population
randomPop = liftA2 (,) <\$> pure <*> flip replicateM randomGene . size
```

A tournament selection method is used to select parent `Gene`s from a `Population`.

```tournamentSize :: Int
tournamentSize = 3

tournamentSelection :: RandomGen g => Population -> Rand g Gene
tournamentSelection (info, genes) =
minimumBy (comparing fitness) .  map (genes !!) <\$>
replicateM tournamentSize (getRandomR (0, size info - 1))

twoM :: Monad m => m a -> m (a, a)
twoM = liftM (\[x,y] -> (x,y)) . replicateM 2

selectParents :: RandomGen g => Population -> Rand g (Gene, Gene)
selectParents = twoM . tournamentSelection
```

The meat of the algorithm is the `evolve` function that, appropriately enough, evolves a `Population`.

```evolve :: RandomGen g => Population -> Rand g Population
evolve p@(info@(PopInfo {size, crossover, elitism, mutation}), genes) =
(info,) . sortBy (comparing fitness) . (take idx genes ++) <\$>
replicateM (size - idx) (twoM getRandom >>= go)
where idx = round (fromIntegral size * elitism)
go (r1,r2) | r1 <= crossover =
selectParents p >>= uncurry mate >>= addChild r2
| otherwise = addMutation r2
addChild r c
| r <= mutation = mutate c
| otherwise = return c
addMutation r
| r <= mutation = mutate . (genes !!) =<< getRandomR (idx, size - 1)
| otherwise = (genes !!) <\$> getRandomR (idx, size - 1)
```

Now we're ready to kick off a multi-generation quest for the string we already wrote down up top.

```iterateUntil :: Monad m => (a -> Bool) -> (a -> m a) -> a -> m a
iterateUntil stop f = go
where go x | stop x = return x
| otherwise = f x >>= go

maxGenerations :: Int
maxGenerations = 16384

main = evalRandIO (randomPop defaultPop >>= iterateUntil done step . (, 0))
>>= result
where step (p,gen) = (,) <\$> evolve p <*> pure (gen+1)
done ((_, g:_), generation) =
generation == maxGenerations || fitness g == 0
result ((_, g:_), generation)
| generation == maxGenerations =
putStrLn "Maximum generations reached without success."
| fitness g == 0 = printf "Reached target (%d): %s\n" generation g
| otherwise = putStrLn "Evolution is hard. Let's go shopping."
```

In keeping with the spirit of the original programs, we include tests to exercise some of the component functionality we've put together.

```testGen = run (evalRandIO randomGene) >>= assert . check
where check g = and \$ map (\$ g) [ (>= 0) . fitness
, (== 13) . length
, all (between 32 122 . fromEnum) ]
between l r x = l <= x && x <= r

testMut = run (evalRandIO \$ randomGene >>= pairWithMutant) >>= assert . check
where pairWithMutant = liftA2 (,) <\$> pure <*> mutate
check (g,m) = length g == length m && length (nub g \ nub m) <= 1

testMate = run (evalRandIO \$ twoM randomGene >>= pairWithChild) >>=
assert . check
where pairWithChild (mom,dad) = (mom,dad,) <\$> mate mom dad
check (m,d,c) = length c == 13 &&
(and . map (\(_,y,z) -> y == z) .
dropWhile (\(x,y,_) -> x == y) \$ zip3 m c d)

unitTests = test [ "fitness1" ~: 0 ~=? fitness "Hello, world!"
, "fitness2" ~: 399 ~=? fitness "H5p&J;!l<X\7l"
, "fitness3" ~: 297 ~=? fitness "Vc;fx#QRP8V\\$"
, "fitness4" ~: 415 ~=? fitness "t\O`E_Jx\$n=NF" ]

runTests = do mapM_ (quickCheck . monadicIO) [testGen, testMut, testMate]
runTestTT unitTests
```

You can compile this post with GHC, or just run it in GHCi to poke around and ensure that the tests all pass.