Copyright | (c) Edward Kmett 2010-2021
License | BSD3
Maintainer | ekmett@gmail.com
Stability | experimental
Portability | GHC only
Safe Haskell | Safe-Inferred
Language | Haskell2010
This module provides reverse-mode Automatic Differentiation using post-hoc linear-time topological sorting.

For reverse-mode AD we use StableName to recover sharing information from the tape, avoiding a combinatorial explosion and thus running asymptotically faster than would be possible without that sharing information. The use of side effects this entails is benign.
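For example, a function that reuses a let-bound subexpression only records that subexpression once on the tape. A minimal, purely illustrative sketch (the function shared and the inputs below are not part of this module):

import Numeric.AD.Mode.Kahn (grad)

-- 's' is shared: StableName-based sharing recovery records it once on the
-- tape instead of duplicating its subgraph for every use.
shared :: Num a => [a] -> a
shared [x, y] = let s = x + y in s * s * s
shared _      = error "shared: expects exactly two inputs"

main :: IO ()
main = print (grad shared [1.0, 2.0 :: Double])
-- [27.0,27.0], i.e. 3*(x+y)^2 for each input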
Synopsis
- data Kahn a
- auto :: Mode t => Scalar t -> t
- grad :: (Traversable f, Num a) => (f (Kahn a) -> Kahn a) -> f a -> f a
- grad' :: (Traversable f, Num a) => (f (Kahn a) -> Kahn a) -> f a -> (a, f a)
- gradWith :: (Traversable f, Num a) => (a -> a -> b) -> (f (Kahn a) -> Kahn a) -> f a -> f b
- gradWith' :: (Traversable f, Num a) => (a -> a -> b) -> (f (Kahn a) -> Kahn a) -> f a -> (a, f b)
- jacobian :: (Traversable f, Functor g, Num a) => (f (Kahn a) -> g (Kahn a)) -> f a -> g (f a)
- jacobian' :: (Traversable f, Functor g, Num a) => (f (Kahn a) -> g (Kahn a)) -> f a -> g (a, f a)
- jacobianWith :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (f (Kahn a) -> g (Kahn a)) -> f a -> g (f b)
- jacobianWith' :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (f (Kahn a) -> g (Kahn a)) -> f a -> g (a, f b)
- hessian :: (Traversable f, Num a) => (f (On (Kahn (Kahn a))) -> On (Kahn (Kahn a))) -> f a -> f (f a)
- hessianF :: (Traversable f, Functor g, Num a) => (f (On (Kahn (Kahn a))) -> g (On (Kahn (Kahn a)))) -> f a -> g (f (f a))
- diff :: Num a => (Kahn a -> Kahn a) -> a -> a
- diff' :: Num a => (Kahn a -> Kahn a) -> a -> (a, a)
- diffF :: (Functor f, Num a) => (Kahn a -> f (Kahn a)) -> a -> f a
- diffF' :: (Functor f, Num a) => (Kahn a -> f (Kahn a)) -> a -> f (a, a)
- vgrad :: Grad i o o' a => i -> o
- vgrad' :: Grad i o o' a => i -> o'
- class Num a => Grad i o o' a | i -> a o o', o -> a i o', o' -> a i o
Documentation
Kahn is a Mode using reverse-mode automatic differentiation that provides fast diffFU, diff2FU, grad, grad2 and a fast jacobian when you have a significantly smaller number of outputs than inputs.
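Reverse mode pays once per output, so it shines when a function has many inputs and few outputs. A hedged sketch using grad and jacobian from this module (the example functions are illustrative only):

import Numeric.AD.Mode.Kahn (grad, jacobian)

-- Many inputs, one output: a single reverse pass yields the full gradient.
sumOfSquares :: Num a => [a] -> a
sumOfSquares = sum . map (^ 2)

-- Two inputs, three outputs: one reverse pass per output row.
curve :: Floating a => [a] -> [a]
curve [x, y] = [x * y, sin x, exp (x + y)]
curve _      = error "curve: expects exactly two inputs"

main :: IO ()
main = do
  print (grad sumOfSquares [1.0, 2.0, 3.0 :: Double])  -- [2.0,4.0,6.0]
  print (jacobian curve [1.0, 2.0 :: Double])          -- one row per output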
Instances
Gradient
grad :: (Traversable f, Num a) => (f (Kahn a) -> Kahn a) -> f a -> f a Source #
The grad function calculates the gradient of a non-scalar-to-scalar function with Kahn-mode AD in a single pass.

>>> grad (\[x,y,z] -> x*y+z) [1.0,2.0,3.0]
[2.0,1.0,1.0]
grad' :: (Traversable f, Num a) => (f (Kahn a) -> Kahn a) -> f a -> (a, f a) Source #
The grad' function calculates the result and gradient of a non-scalar-to-scalar function with Kahn-mode AD in a single pass.

>>> grad' (\[x,y,z] -> x*y+z) [1.0,2.0,3.0]
(5.0,[2.0,1.0,1.0])
gradWith :: (Traversable f, Num a) => (a -> a -> b) -> (f (Kahn a) -> Kahn a) -> f a -> f b Source #
gradWith' :: (Traversable f, Num a) => (a -> a -> b) -> (f (Kahn a) -> Kahn a) -> f a -> (a, f b) Source #
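'gradWith g f' combines each input with its partial derivative using g, and gradWith' additionally returns the primal result. A hedged sketch, assuming g receives the input first and the derivative second (by analogy with jacobian == jacobianWith (\_ dx -> dx) below):

import Numeric.AD.Mode.Kahn (gradWith, gradWith')

main :: IO ()
main = do
  -- Pair each input with its partial derivative.
  print (gradWith (,) (\[x, y, z] -> x * y + z) [1.0, 2.0, 3.0 :: Double])
  -- [(1.0,2.0),(2.0,1.0),(3.0,1.0)]
  -- Also recover the primal result, here combining as input + derivative.
  print (gradWith' (\x dx -> x + dx) (\[x, y, z] -> x * y + z) [1.0, 2.0, 3.0 :: Double])
  -- (5.0,[3.0,3.0,4.0])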
Jacobian
jacobian :: (Traversable f, Functor g, Num a) => (f (Kahn a) -> g (Kahn a)) -> f a -> g (f a) Source #
The jacobian function calculates the Jacobian of a non-scalar-to-non-scalar function with Kahn AD lazily in m passes for m outputs.

>>> jacobian (\[x,y] -> [y,x,x*y]) [2.0,1.0]
[[0.0,1.0],[1.0,0.0],[1.0,2.0]]
jacobian' :: (Traversable f, Functor g, Num a) => (f (Kahn a) -> g (Kahn a)) -> f a -> g (a, f a) Source #
The jacobian' function calculates both the result and the Jacobian of a non-scalar-to-non-scalar function, using m invocations of Kahn AD, where m is the output dimensionality. Applying fmap snd to the result will recover the result of jacobian.

>>> jacobian' (\[x,y] -> [y,x,x*y]) [2.0,1.0]
[(1.0,[0.0,1.0]),(2.0,[1.0,0.0]),(2.0,[1.0,2.0])]
jacobianWith :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (f (Kahn a) -> g (Kahn a)) -> f a -> g (f b) Source #
'jacobianWith g f' calculates the Jacobian of a non-scalar-to-non-scalar function f with Kahn AD lazily in m passes for m outputs.

Instead of returning the Jacobian matrix, the elements of the matrix are combined with the input using the function g.

jacobian == jacobianWith (\_ dx -> dx)
jacobianWith const == (\f x -> const x <$> f x)
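For example, pairing each Jacobian entry with the input it was taken with respect to (a small, illustrative sketch reusing the example function from jacobian above):

import Numeric.AD.Mode.Kahn (jacobianWith)

main :: IO ()
main = print (jacobianWith (,) (\[x, y] -> [y, x, x * y]) [2.0, 1.0 :: Double])
-- [[(2.0,0.0),(1.0,1.0)],[(2.0,1.0),(1.0,0.0)],[(2.0,1.0),(1.0,2.0)]]
-- Each entry is (input, partial derivative); one inner list per output.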
jacobianWith' :: (Traversable f, Functor g, Num a) => (a -> a -> b) -> (f (Kahn a) -> g (Kahn a)) -> f a -> g (a, f b) Source #
'jacobianWith' g f' calculates both the result and the Jacobian of a non-scalar-to-non-scalar function f, using m invocations of Kahn AD, where m is the output dimensionality. Applying fmap snd to the result will recover the result of jacobianWith.

Instead of returning the Jacobian matrix, the elements of the matrix are combined with the input using the function g.

jacobian' == jacobianWith' (\_ dx -> dx)
Hessian
hessian :: (Traversable f, Num a) => (f (On (Kahn (Kahn a))) -> On (Kahn (Kahn a))) -> f a -> f (f a) Source #
Compute the hessian via the jacobian of the gradient. The gradient is computed in Kahn mode and then the jacobian is computed in Kahn mode.

However, since grad f :: f a -> f a is square, this is not as fast as using the forward-mode jacobian of a reverse-mode gradient provided by hessian.

>>> hessian (\[x,y] -> x*y) [1.0,2.0]
[[0.0,1.0],[1.0,0.0]]
hessianF :: (Traversable f, Functor g, Num a) => (f (On (Kahn (Kahn a))) -> g (On (Kahn (Kahn a)))) -> f a -> g (f (f a)) Source #
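hessianF computes a Hessian for every output component of a non-scalar-to-non-scalar function, i.e. an order-3 tensor indexed by output first. A hedged, illustrative sketch:

import Numeric.AD.Mode.Kahn (hessianF)

main :: IO ()
main = print (hessianF (\[x, y] -> [x * y, x + y]) [1.0, 2.0 :: Double])
-- [[[0.0,1.0],[1.0,0.0]],[[0.0,0.0],[0.0,0.0]]]
-- One 2x2 matrix of second partial derivatives per output component.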
Derivatives
diff :: Num a => (Kahn a -> Kahn a) -> a -> a Source #
Compute the derivative of a function.
>>> diff sin 0
1.0

>>> cos 0
1.0
diff' :: Num a => (Kahn a -> Kahn a) -> a -> (a, a) Source #
The diff' function calculates the value and derivative, as a pair, of a scalar-to-scalar function.

>>> diff' sin 0
(0.0,1.0)
diffF :: (Functor f, Num a) => (Kahn a -> f (Kahn a)) -> a -> f a Source #
Compute the derivatives of a function that returns a vector with regard to its single input.

>>> diffF (\a -> [sin a, cos a]) 0
[1.0,0.0]
diffF' :: (Functor f, Num a) => (Kahn a -> f (Kahn a)) -> a -> f (a, a) Source #
Compute the derivatives of a function that returns a vector with regard to its single input, as well as the primal answer.

>>> diffF' (\a -> [sin a, cos a]) 0
[(0.0,1.0),(1.0,0.0)]
Unsafe Variadic Gradient
Variadic combinators for variadic mixed-mode automatic differentiation.

Unfortunately, variadicity comes at the expense of being able to use quantification to avoid sensitivity confusion, so be careful when counting the number of auto calls you use when taking the gradient of a function that takes gradients!
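A hedged sketch of the variadic interface, assuming the usual Grad instances in which vgrad takes one argument per input and returns the partial derivatives as a list, while vgrad' also returns the primal value:

import Numeric.AD.Mode.Kahn (vgrad, vgrad')

-- One Kahn-mode input per argument; the gradient comes back as a list.
f :: Num a => a -> a -> a -> a
f x y z = x * y + z

main :: IO ()
main = do
  print (vgrad  f 1 2 3 :: [Double])            -- [2.0,1.0,1.0]
  print (vgrad' f 1 2 3 :: (Double, [Double]))  -- (5.0,[2.0,1.0,1.0])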