Copyright | (c) 2012 Alp Mestanogullari |
---|---|
License | BSD3 |
Maintainer | alpmestan@gmail.com |
Stability | experimental |
Portability | GHC |
Safe Haskell | None |
Language | Haskell98 |
An implementation of feed-forward neural networks in pure Haskell.
It uses weight matrices between each layer to represent the connections between neurons from a layer to the next and exports only the useful bits for a user of the library.
Here is an example of using this module to create a feed-forward neural network with 2 inputs, 2 neurons in a hidden layer and one neuron in the output layer, with random weights, and compute its output for [1, 1] using the sigmoid function as the activation function for all the neurons.
    import AI.HNN.FF.Network
    import Numeric.LinearAlgebra

    main = do
      n <- createNetwork 2 [2] 1 :: IO (Network Double)
      print $ output n sigmoid (fromList [1, 1])
Note: Here, I create a Network Double, but you can replace Double with any number type that implements the appropriate typeclasses you can see in the signatures of this module. Having your number type implement the Floating typeclass too is a good idea, since that's what most of the common activation functions require.
Note 2: You can also initialize the neural network with specific weights using fromWeightMatrices, or restore a neural network you had saved using loadNetwork.
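To make the weight-matrix representation described above more concrete, here is a minimal sketch of a forward pass written with plain lists. It is not hnn's implementation: the library works on hmatrix vectors and matrices, and this sketch ignores any bias handling the library may perform. The names Layer, sigmoidSketch, applyLayer, forward and exampleNet, as well as the weights themselves, are made up for illustration.

```haskell
-- A list-based sketch of a feed-forward pass (NOT hnn's implementation).
-- Each layer holds one row of incoming weights per neuron.
type Layer = [[Double]]

-- The usual logistic sigmoid, named differently from hnn's export.
sigmoidSketch :: Double -> Double
sigmoidSketch x = 1 / (1 + exp (negate x))

-- Apply one layer: a weighted sum per neuron, then the activation.
applyLayer :: (Double -> Double) -> [Double] -> Layer -> [Double]
applyLayer act input layer =
  [ act (sum (zipWith (*) row input)) | row <- layer ]

-- Feed the input through every layer in order.
forward :: (Double -> Double) -> [Layer] -> [Double] -> [Double]
forward act layers input = foldl (applyLayer act) input layers

-- Hypothetical hand-picked weights for a 2-2-1 network.
exampleNet :: [Layer]
exampleNet =
  [ [[0.5, -0.5], [0.25, 0.75]]  -- hidden layer: 2 neurons, 2 inputs each
  , [[1.0, -1.0]]                -- output layer: 1 neuron, 2 inputs
  ]
```

In GHCi, forward sigmoidSketch exampleNet [1, 1] returns a one-element list whose value lies strictly between 0 and 1, mirroring the shape of the result printed by the hnn example.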
Here is an example of how to train a neural network to learn the XOR function (for reference: XOR(0, 0) = 0, XOR(0, 1) = 1, XOR(1, 0) = 1, XOR(1, 1) = 0).
First, let's import hnn's feedforward neural net module, and hmatrix's vector types.
import AI.HNN.FF.Network import Numeric.LinearAlgebra
Now, we will specify our training set (what the net should try to learn).
    samples :: Samples Double
    samples = [ fromList [0, 0] --> fromList [0]
              , fromList [0, 1] --> fromList [1]
              , fromList [1, 0] --> fromList [1]
              , fromList [1, 1] --> fromList [0]
              ]
You can see that this is basically a list of pairs of vectors, the first vector being the input given to the network, the second one being the expected output. Of course, this implies working on a neural network with 2 inputs and a single neuron on the output layer. So let's create one!
    main = do
      net <- createNetwork 2 [2] 1
You may have noticed we haven't specified a signature this time, unlike in the earlier snippet. Since we gave a signature to samples, specifying we're working with Double numbers, and since we are going to tie net and samples by a call to a learning function, GHC will gladly figure out that net is working with Double.
Now, it's time to train our champion. But first, let's see how bad he is now. The weights are most likely not close to those that will give a good result for simulating XOR. Let's compute the output of the net on the input vectors of our samples, using tanh as the activation function.
mapM_ (print . output net tanh . fst) samples
Ok, you've tested this, and it gives terrible results. Let's fix this by letting trainNTimes teach our neural net how to behave. Since we're using tanh as our activation function, we pass it to the training function, along with its derivative tanh'.
let smartNet = trainNTimes 1000 0.8 tanh tanh' net samples
So, this tiny piece of code will run the backpropagation algorithm on the samples 1000 times, with a learning rate of 0.8. The learning rate is basically how strongly we modify the weights when we try to correct the error the net makes on our samples. The bigger it is, the more the weights change at each step. Depending on the case, this can be good, but sometimes it makes the backprop algorithm oscillate around good weight values without actually reaching them. You usually want to test several values and see which ones get you the nicest neural net: one that generalizes well to samples that are not in the training set while giving decent results on the training set.
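The effect of the learning rate can be seen on a much smaller problem than a neural network. The sketch below runs plain gradient descent on the one-variable function f(w) = (w - 3)^2; it stands in for backpropagation only as an analogy and does not use hnn. The names gradient, step and descend are made up for this illustration.

```haskell
-- Gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3.
-- A toy stand-in for backpropagation; this is not hnn code.
gradient :: Double -> Double
gradient w = 2 * (w - 3)

-- One update: move against the gradient, scaled by the learning rate.
step :: Double -> Double -> Double
step rate w = w - rate * gradient w

-- The weight after n updates, starting from w0.
descend :: Double -> Int -> Double -> Double
descend rate n w0 = iterate (step rate) w0 !! n
```

With a rate of 0.1, descend 0.1 50 0 ends up within about 1e-4 of the minimum at 3. With a rate of 1.0, each update maps w to 6 - w, so starting from 0 the weight bounces between 0 and 6 forever, which is the kind of oscillation described above.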
Now, let's see how that worked out for us:
mapM_ (print . output smartNet tanh . fst) samples
You could even save that neural network's weights to a file using saveNetwork, so that you don't need to train it again in the future:
saveNetwork "smartNet.nn" smartNet
Please note that saveNetwork is just a wrapper around zlib compression + serialization using the binary package. AI.HNN.FF.Network also provides a Binary instance for Network, which means you can also simply use encode and decode to write your own saving/restoring routines, or simply to get a bytestring you can send over the network, for example.
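As a rough sketch of both routes, the code below round-trips a network through a lazy ByteString using encode and decode from the binary package, and through a file using saveNetwork and loadNetwork. It assumes the Binary instance is usable at type Network Double. The helper names networkToBytes, networkFromBytes, saveAndReload and agreeOn are made up for this example and are not part of hnn.

```haskell
import AI.HNN.FF.Network
import Data.Binary (decode, encode)
import qualified Data.ByteString.Lazy as BL
import Numeric.LinearAlgebra (fromList)

-- Serialize a network to a lazy ByteString through its Binary instance.
networkToBytes :: Network Double -> BL.ByteString
networkToBytes = encode

-- Rebuild a network from bytes produced by networkToBytes.
networkFromBytes :: BL.ByteString -> Network Double
networkFromBytes = decode

-- Save a network with saveNetwork, then read it back with loadNetwork.
-- The file path is whatever the caller chooses.
saveAndReload :: FilePath -> Network Double -> IO (Network Double)
saveAndReload path net = do
  saveNetwork path net
  loadNetwork path

-- Compare two networks by their sigmoid output on one input vector.
agreeOn :: [Double] -> Network Double -> Network Double -> Bool
agreeOn xs a b =
  output a sigmoid (fromList xs) == output b sigmoid (fromList xs)
```

A restored network should behave like the original, so agreeOn [1, 1] net (networkFromBytes (networkToBytes net)) is expected to hold for any net of matching input size.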
Here's a run of the program we described on my machine (with the timing): the first set of fromList's is the output of the initial neural network, the second one is the output of smartNet :-)
    fromList [0.574915179613429]
    fromList [0.767589097192215]
    fromList [0.7277396607146663]
    fromList [0.8227114080561128]
    ------------------
    fromList [6.763498312099933e-2]
    fromList [0.9775186355284375]
    fromList [0.9350823296850516]
    fromList [-4.400205702560454e-2]

    real 0m0.365s
    user 0m0.072s
    sys 0m0.016s
Rejoice! Feel free to play around with the library and report any bug, feature request and whatnot to us on our GitHub repository https://github.com/alpmestan/hnn/issues using the appropriate tags. Also, you can see the simple program we studied here with pretty colors at https://github.com/alpmestan/hnn/blob/master/examples/ff/xor.hs and other ones at https://github.com/alpmestan/hnn/tree/master/examples/ff.
- newtype Network a = Network {}
- type ActivationFunction a = a -> a
- type ActivationFunctionDerivative a = a -> a
- type Sample a = (Vector a, Vector a)
- type Samples a = [Sample a]
- (-->) :: Vector a -> Vector a -> Sample a
- createNetwork :: (Variate a, Storable a) => Int -> [Int] -> Int -> IO (Network a)
- fromWeightMatrices :: Storable a => Vector (Matrix a) -> Network a
- output :: (Floating (Vector a), Numeric a, Storable a, Num (Vector a)) => Network a -> ActivationFunction a -> Vector a -> Vector a
- tanh :: Floating a => a -> a
- tanh' :: Floating a => a -> a
- sigmoid :: Floating a => a -> a
- sigmoid' :: Floating a => a -> a
- trainUntil :: (Floating (Vector a), Floating a, Numeric a, Num (Vector a), Container Vector a) => (Int -> Network a -> Samples a -> Bool) -> a -> ActivationFunction a -> ActivationFunctionDerivative a -> Network a -> Samples a -> Network a
- trainNTimes :: (Floating (Vector a), Floating a, Numeric a, Num (Vector a), Container Vector a) => Int -> a -> ActivationFunction a -> ActivationFunctionDerivative a -> Network a -> Samples a -> Network a
- trainUntilErrorBelow :: (Floating (Vector a), Floating a, Numeric a, Normed (Vector a), Ord a, Container Vector a, Num (RealOf a), a ~ RealOf a, Show a) => a -> a -> ActivationFunction a -> ActivationFunctionDerivative a -> Network a -> Samples a -> Network a
- quadError :: (Floating (Vector a), Floating a, Fractional (RealOf a), Normed (Vector a), Numeric a) => ActivationFunction a -> Network a -> Samples a -> RealOf a
- loadNetwork :: (Storable a, Element a, Binary a) => FilePath -> IO (Network a)
- saveNetwork :: (Storable a, Element a, Binary a) => FilePath -> Network a -> IO ()
Types
type ActivationFunction a = a -> a
The type of an activation function, mostly used for clarity in signatures
type ActivationFunctionDerivative a = a -> a
The type of an activation function's derivative, mostly used for clarity in signatures
(-->) :: Vector a -> Vector a -> Sample a
Handy operator to describe your learning set, avoiding unnecessary parentheses. It's just a synonym for '(,)'. Generally you'll load your learning set from a file, a database or something like that, but it can be nice for quickly playing with hnn or for simple problems where you manually specify your learning set. That is, instead of writing:
    samples :: Samples Double
    samples = [ (fromList [0, 0], fromList [0])
              , (fromList [0, 1], fromList [1])
              , (fromList [1, 0], fromList [1])
              , (fromList [1, 1], fromList [0])
              ]
You can write:
    samples :: Samples Double
    samples = [ fromList [0, 0] --> fromList [0]
              , fromList [0, 1] --> fromList [1]
              , fromList [1, 0] --> fromList [1]
              , fromList [1, 1] --> fromList [0]
              ]
Creating a neural network
createNetwork :: (Variate a, Storable a) => Int -> [Int] -> Int -> IO (Network a)
The following creates a neural network with n inputs. If l is [n1, n2, ...], the net will have n1 neurons on the first hidden layer, n2 neurons on the second, and so on, ending with k neurons on the output layer. The weight matrices are random, courtesy of uniformVector.
createNetwork n l k
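As a small illustration of the argument order, the sketch below builds a network with two hidden layers and runs it once. The layer sizes are arbitrary values picked for the example, and makeDeepNet and outputsFor are hypothetical helper names, not part of hnn.

```haskell
import AI.HNN.FF.Network
import Numeric.LinearAlgebra (fromList, toList)

-- 3 inputs, hidden layers of 4 and 2 neurons, 1 output neuron.
-- The sizes are arbitrary and only serve as an illustration.
makeDeepNet :: IO (Network Double)
makeDeepNet = createNetwork 3 [4, 2] 1

-- Run the network on one input and return the outputs as a list.
outputsFor :: Network Double -> [Double] -> [Double]
outputsFor net xs = toList (output net sigmoid (fromList xs))
```

Since the output layer has a single neuron here, outputsFor applied to a 3-element input should return a one-element list.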
fromWeightMatrices :: Storable a => Vector (Matrix a) -> Network a
Creates a neural network with exactly the weight matrices given as input here. We don't check that the numbers of rows/columns are compatible, etc.
Computing a neural network's output
output :: (Floating (Vector a), Numeric a, Storable a, Num (Vector a)) => Network a -> ActivationFunction a -> Vector a -> Vector a
Computes the output of the network on the given input vector with the given activation function
sigmoid' :: Floating a => a -> a
Derivative of the sigmoid function: sigmoid x * (1 - sigmoid x)
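The formula above can be checked numerically. The sketch below assumes sigmoid is the usual logistic function 1 / (1 + exp (-x)), which is consistent with the stated derivative, and compares the closed form against a central finite difference. The names logistic, logistic', finiteDiff and maxGap are standalone illustrations, deliberately different from hnn's exports.

```haskell
-- Standalone copy of the logistic sigmoid (an assumption about what
-- hnn's sigmoid computes, consistent with the derivative formula).
logistic :: Double -> Double
logistic x = 1 / (1 + exp (negate x))

-- The closed form quoted above: s(x) * (1 - s(x)).
logistic' :: Double -> Double
logistic' x = logistic x * (1 - logistic x)

-- Central finite difference, used only as a numerical cross-check.
finiteDiff :: (Double -> Double) -> Double -> Double
finiteDiff f x = (f (x + h) - f (x - h)) / (2 * h)
  where
    h = 1e-5

-- Largest gap between the closed form and the numerical estimate.
maxGap :: [Double] -> Double
maxGap xs = maximum [ abs (logistic' x - finiteDiff logistic x) | x <- xs ]
```

At x = 0 the sigmoid is 0.5, so the derivative there is 0.25, and maxGap over a handful of points stays far below any tolerance that matters for training.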
Training a neural network
trainUntil :: (Floating (Vector a), Floating a, Numeric a, Num (Vector a), Container Vector a) => (Int -> Network a -> Samples a -> Bool) -> a -> ActivationFunction a -> ActivationFunctionDerivative a -> Network a -> Samples a -> Network a
Generic training function.
The first argument is a predicate that tells the backpropagation algorithm when to stop. The first argument to the predicate is the epoch, i.e. the number of times the backprop has been executed on the samples. The second argument is the current network, and the third is the list of samples. You can thus combine these arguments to create your own criterion.
For example, if you want to stop learning either when the network's quadratic error on the samples, using the tanh function, is below 0.01, or after 1000 epochs, whichever comes first, you could use the following predicate:
pred epochs net samples = if epochs == 1000 then True else quadError tanh net samples < 0.01
You could even use trace to print the error and see how it evolves while the network is learning, or redirect this output to a file from your shell in order to generate a pretty graph and whatnot.
The second argument (after the predicate) is the learning rate. Then come the activation function you want, its derivative, the initial neural network, and your training set.
Note that we provide trainNTimes and trainUntilErrorBelow for common use cases.
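Putting the pieces together, here is a sketch of a complete call to trainUntil with a custom stopping predicate that also logs the error through trace. It reuses the XOR training set from the introduction. The names xorSamples, stopCriterion and trainXor are made up for this example, and the predicate uses >= rather than == for the epoch cap as a defensive choice.

```haskell
import AI.HNN.FF.Network
import Debug.Trace (trace)
import Numeric.LinearAlgebra (fromList)

-- The XOR training set used earlier in this documentation.
xorSamples :: Samples Double
xorSamples =
  [ fromList [0, 0] --> fromList [0]
  , fromList [0, 1] --> fromList [1]
  , fromList [1, 0] --> fromList [1]
  , fromList [1, 1] --> fromList [0]
  ]

-- Stop after 1000 epochs or once the quadratic error drops below 0.01,
-- logging the error at each check through trace (written to stderr).
stopCriterion :: Int -> Network Double -> Samples Double -> Bool
stopCriterion epochs net samples =
  trace ("epoch " ++ show epochs ++ ": error " ++ show err)
        (epochs >= 1000 || err < 0.01)
  where
    err = quadError tanh net samples

-- Train with learning rate 0.8, tanh activation and its derivative.
trainXor :: Network Double -> Network Double
trainXor net = trainUntil stopCriterion 0.8 tanh tanh' net xorSamples
```

Because of the epoch cap, this variant cannot loop forever even when the error threshold is never reached. The logged lines can be redirected from the shell with 2> to collect the error curve.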
trainNTimes :: (Floating (Vector a), Floating a, Numeric a, Num (Vector a), Container Vector a) => Int -> a -> ActivationFunction a -> ActivationFunctionDerivative a -> Network a -> Samples a -> Network a
Trains the neural network with backpropagation the number of times specified by the Int argument, using the given learning rate (second argument).
trainUntilErrorBelow :: (Floating (Vector a), Floating a, Numeric a, Normed (Vector a), Ord a, Container Vector a, Num (RealOf a), a ~ RealOf a, Show a) => a -> a -> ActivationFunction a -> ActivationFunctionDerivative a -> Network a -> Samples a -> Network a
Trains the neural network until the quadratic error (quadError) falls below the given value (first argument), using the given learning rate (second argument).
Note: this can loop pretty much forever when you're using a bad architecture for the problem, or inappropriate activation functions.
quadError :: (Floating (Vector a), Floating a, Fractional (RealOf a), Normed (Vector a), Numeric a) => ActivationFunction a -> Network a -> Samples a -> RealOf a
Quadratic error on the given training set using the given activation function. Useful to create your own predicates for trainUntil.
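For intuition about what such an error measures, here is a list-based sketch of one common definition: half the sum of squared differences between predicted and expected outputs. This is an assumption made for illustration; hnn's quadError may use a different scaling or norm. SampleL and quadErrorSketch are hypothetical names.

```haskell
-- A list-based sketch of a quadratic error over a training set.
-- Assumption: error = 1/2 * sum of squared differences between the
-- predicted and expected outputs; hnn's quadError may differ.
type SampleL = ([Double], [Double])  -- (input, expected output)

quadErrorSketch :: ([Double] -> [Double]) -> [SampleL] -> Double
quadErrorSketch predict samples =
  0.5 * sum [ (p - e) ^ (2 :: Int)
            | (input, expected) <- samples
            , (p, e) <- zip (predict input) expected
            ]
```

On the four XOR samples, a predictor that always answers [0.5] is off by 0.5 each time, giving 0.5 * (4 * 0.25) = 0.5, while a predictor that matches every expected output gives 0.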