Home > Programming > Haskell Genetic Algorithm “Hello, world!”

Haskell Genetic Algorithm “Hello, world!”

April 15th, 2011

A post on reddit linked to several implementations of a cute "Hello, world!" program demonstrating a genetic algorithm that evolves towards a target string. Example programs were written in several languages, and I thought a Haskell version could be worthwhile as it demonstrates the use of random numbers, an issue that frustrates many newcomers to the language.

I use the MonadRandom package to deal with plumbing a random number generator through the code, but otherwise stick to lists and the usual cast of characters from the Prelude.

> {-# LANGUAGE NamedFieldPuns, TupleSections #-}
> import Control.Applicative
> import Control.Arrow (second)
> import Control.Monad (liftM, replicateM)
> import Control.Monad.Random
> import Data.Function (on)
> import Data.List (minimumBy, sortBy, nub, (\\))
> import Data.Ord (comparing)
> import Text.Printf (printf)

 
The example programs associated with the original post all come with tests. Some of these are standard unit tests that verify hand-picked examples, while others verify properties of random processes. To support both kinds of tests, we use HUnit for unit tests, and QuickCheck for property testing.

> import Test.QuickCheck
> import Test.QuickCheck.Monadic
> import Test.HUnit hiding (assert)

 
For clarity, we create a type alias for the raw data we will be working with.

> type Gene = String
>
> target :: Gene
> target = "Hello, world!"

 
Next are the functions for working with individual Genes.

> mate :: RandomGen g => Gene -> Gene -> Rand g Gene
> mate g1 g2 = (++) <$> flip take g1 <*> flip drop g2 <$> pivot
> where pivot = getRandomR (0, length g1 - 1)
>
> mutate :: RandomGen g => Gene -> Rand g Gene
> mutate g = (uncurry (++) .) . second . (:) <$> delta <*> parts
> where delta = getRandomR (' ', 'z')
> idx = getRandomR (0, length g - 1)
> parts = second tail . flip splitAt g <$> idx
>
> fitness :: Gene -> Int
> fitness = sum . map abs . zipWith ((-) `on` fromEnum) target

 
We also want a random Gene generator.

> randomGene :: RandomGen g => Rand g Gene
> randomGene = replicateM (length target) $ getRandomR (' ', 'z')

 
Finally we can start thinking about populations. The example Python program I looked at had several parameters for initializing the population and controlling how it evolves. We capture those parameters in a record.

> data PopInfo = PopInfo { size      :: Int
> , crossover :: Float
> , elitism :: Float
> , mutation :: Float }

 
A Population is a pair of a record describing the population and a collection of Genes.

> type Population = (PopInfo, [Gene])
>
> defaultPop :: PopInfo
> defaultPop = PopInfo 1024 0.8 0.1 0.03

 
We will use a helper function to produce a randomized initial population.

> randomPop :: RandomGen g => PopInfo -> Rand g Population
> randomPop = liftA2 (,) <$> pure <*> flip replicateM randomGene . size

 
A tournament selection method is used to select parent Genes from a Population.

> tournamentSize :: Int
> tournamentSize = 3
>
> tournamentSelection :: RandomGen g => Population -> Rand g Gene
> tournamentSelection (info, genes) =
> minimumBy (comparing fitness) . map (genes !!) <$>
> replicateM tournamentSize (getRandomR (0, size info - 1))
>
> twoM :: Monad m => m a -> m (a, a)
> twoM = liftM (\[x,y] -> (x,y)) . replicateM 2
>
> selectParents :: RandomGen g => Population -> Rand g (Gene, Gene)
> selectParents = twoM . tournamentSelection

 
The meat of the algorithm is the evolve function that, appropriately enough, evolves a Population.

> evolve :: RandomGen g => Population -> Rand g Population
> evolve p@(info@(PopInfo {size, crossover, elitism, mutation}), genes) =
> (info,) . sortBy (comparing fitness) . (take idx genes ++) <$>
> replicateM (size - idx) (twoM getRandom >>= go)
> where idx = round (fromIntegral size * elitism)
> go (r1,r2) | r1 <= crossover =
> selectParents p >>= uncurry mate >>= addChild r2
> | otherwise = addMutation r2
> addChild r c
> | r <= mutation = mutate c
> | otherwise = return c
> addMutation r
> | r <= mutation = mutate . (genes !!) =<< getRandomR (idx, size - 1)
> | otherwise = (genes !!) <$> getRandomR (idx, size - 1)

 
Now we’re ready to kick off a multi-generation quest for the string we already wrote down up top.

> iterateUntil :: Monad m => (a -> Bool) -> (a -> m a) -> a -> m a
> iterateUntil stop f = go
> where go x | stop x = return x
> | otherwise = f x >>= go
>
> maxGenerations :: Int
> maxGenerations = 16384
>
> main = evalRandIO (randomPop defaultPop >>= iterateUntil done step . (, 0))
> >>= result
> where step (p,gen) = (,) <$> evolve p <*> pure (gen+1)
> done ((_, g:_), generation) =
> generation == maxGenerations || fitness g == 0
> result ((_, g:_), generation)
> | generation == maxGenerations =
> putStrLn "Maximum generations reached without success."
> | fitness g == 0 = printf "Reached target (%d): %s\n" generation g
> | otherwise = putStrLn "Evolution is hard. Let's go shopping."

 
In keeping with the spirit of the original programs, we include tests to exercise some of the component functionality we’ve put together.

> testGen = run (evalRandIO randomGene) >>= assert . check
> where check g = and $ map ($ g) [ (>= 0) . fitness
> , (== 13) . length
> , all (between 32 122 . fromEnum) ]
> between l r x = l <= x && x <= r
>
> testMut = run (evalRandIO $ randomGene >>= pairWithMutant) >>= assert . check
> where pairWithMutant = liftA2 (,) <$> pure <*> mutate
> check (g,m) = length g == length m && length (nub g \\ nub m) <= 1
>
> testMate = run (evalRandIO $ twoM randomGene >>= pairWithChild) >>=
> assert . check
> where pairWithChild (mom,dad) = (mom,dad,) <$> mate mom dad
> check (m,d,c) = length c == 13 &&
> (and . map (\(_,y,z) -> y == z) .
> dropWhile (\(x,y,_) -> x == y) $ zip3 m c d)
>
> unitTests = test [ "fitness1" ~: 0 ~=? fitness "Hello, world!"
> , "fitness2" ~: 399 ~=? fitness "H5p&J;!l<X\\7l"
> , "fitness3" ~: 297 ~=? fitness "Vc;fx#QRP8V\\$"
> , "fitness4" ~: 415 ~=? fitness "t\\O`E_Jx$n=NF" ]
>
> runTests = do mapM_ (quickCheck . monadicIO) [testGen, testMut, testMate]
> runTestTT unitTests

 
You can compile this post with GHC, or just run it in GHCi to poke around and ensure that the tests all pass.

Share
Categories: Programming Tags:
  1. John Svazic
    April 15th, 2011 at 17:05 | #1

    Many thanks for the Haskell implementation! I’ll be adding it to the main list (with proper credit of course) over the next few days.

  2. Thomas Miedema
    June 13th, 2011 at 10:24 | #2

    Nice code. Your mate function only produces 1 child though, where the other implementations produce 2. Would it complicate the evolve function too much?

Comments are closed.