Safe Haskell | None |
---|---|
Language | Haskell2010 |
Generic Boltzmann samplers.
Here, the words "sampler" and "generator" are used interchangeably.
Given an algebraic datatype:
data A = A1 B C | A2 D
a Boltzmann sampler is recursively defined by choosing a constructor with some fixed distribution, and independently generating values for the corresponding fields with the same method.
A key component is the aforementioned distribution, defined for every type such that the resulting generator produces a finite value in the end. These distributions are obtained from a precomputed object called oracle, which we will not describe further here.
Oracles depend on the target size of the generated data (except for singular samplers), and can be fairly expensive to compute repeatedly, hence some of the functions below attempt to avoid (re)computing too many of them even when the required size changes.
When these functions are specialized, oracles are memoized and will be reused for different sizes.
- type Size' = Int
- generatorSR :: (Data a, MonadRandomLike m) => Size' -> m a
- generatorP :: (Data a, MonadRandomLike m) => Size' -> m a
- generatorPR :: (Data a, MonadRandomLike m) => Size' -> m a
- generatorR :: (Data a, MonadRandomLike m) => Size' -> m a
- generatorP' :: (Data a, MonadRandomLike m) => Size' -> m a
- generatorPR' :: (Data a, MonadRandomLike m) => Size' -> m a
- generatorR' :: (Data a, MonadRandomLike m) => Size' -> m a
- generator' :: (Data a, MonadRandomLike m) => Size' -> m a
- generatorSRWith :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a
- generatorPWith :: (Data a, MonadRandomLike m) => [Alias m] -> Size' -> m a
- generatorPRWith :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a
- generatorRWith :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a
- generatorPWith' :: (Data a, MonadRandomLike m) => [Alias m] -> Size' -> m a
- generatorPRWith' :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a
- generatorRWith' :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a
- generatorWith' :: (Data a, MonadRandomLike m) => [Alias m] -> Size' -> m a
- type Points = Int
- generatorM :: (Data a, MonadRandomLike m) => [Alias m] -> Points -> Size' -> m a
- generatorMR :: (Data a, MonadRandomLike m) => [AliasR m] -> Points -> Size' -> (Size', Size') -> m a
- generator_ :: (Data a, MonadRandomLike m) => [Alias m] -> Points -> Maybe Size' -> m a
- generatorR_ :: (Data a, MonadRandomLike m) => [AliasR m] -> Points -> Maybe Size' -> (Size', Size') -> m a
- class Monad m => MonadRandomLike m where
- newtype AMonadRandom m a = AMonadRandom {
- asMonadRandom :: m a
- alias :: (Monad m, Data a, Data b) => (a -> m b) -> Alias m
- aliasR :: (Monad m, Data a, Data b) => (a -> m b) -> AliasR m
- coerceAlias :: Coercible m n => Alias m -> Alias n
- coerceAliases :: Coercible m n => [Alias m] -> [Alias n]
- data Alias m where
- type AliasR m = Alias (RejectT m)
Documentation
The size of a value is its number of constructors.
Here, however, the Size'
type is interpreted differently to make better
use of QuickCheck's size parameter provided by the sized
combinator, so that we generate non-trivial data even at very small size
values.
For infinite types, with objects of unbounded sizes > minSize
, given a
parameter delta ::
, the produced values have an average size close
to Size'
minSize + delta
.
For example, values of type Either () [Bool]
have at least two constructors,
so
generator
delta ::Gen
(Either () [Bool])
will target sizes close to 2 + delta
;
the offset becomes less noticeable as delta
grows to infinity.
For finite types with sizes in [minSize, maxSize]
, the target expected
size is obtained by clamping a Size'
to [0, 99]
and applying an affine
mapping.
Main functions
Suffixes
S
- Singular sampler.
This works with recursive tree-like structures, as opposed to (lists of) structures with bounded size. More precisely, the generating function of the given type should have a finite radius of convergence, with a singularity of a certain kind (see Duchon et al., reference in the README), so that the oracle can be evaluated at that point.
This has the advantage of using the same oracle for all size parameters, which simply specify a target size interval.
P
- Generator of pointed values.
It usually has a flatter distribution of sizes than a simple Boltzmann sampler, making it an efficient alternative to rejection sampling.
It also works on more types, particularly lists and finite types, but relies on multiple oracles.
R
- Rejection sampling.
These generators filter out values whose sizes are not within some interval. In the first two sections, that interval is implicit:
[(1-
, forepsilon
)*size', (1+epsilon
)*size']
.epsilon
= 0.1The generator restarts as soon as it has produced more constructors than the upper bound, this strategy is called ceiled rejection sampling.
Pointing
The pointing of a type t
is a derived type whose values are essentially
values of type t
, with one of their constructors being "pointed".
Alternatively, we may turn every constructor into variants that indicate
the position of points.
-- Original type data Tree = Node Tree Tree | Leaf -- Pointing of Tree data Tree' = Tree' Tree -- Point at the root | Node'0 Tree' Tree -- Point to the left | Node'1 Tree Tree' -- Point to the right
Pointed values are easily mapped back to the original type by erasing the point. Pointing makes larger values occur much more frequently, while preserving the uniformness of the distribution conditionally to a fixed size.
generatorSR :: (Data a, MonadRandomLike m) => Size' -> m a Source #
generatorSR
:: Int ->Gen
aasMonadRandom
.generatorSR
::MonadRandom
m => Int -> m a
Singular ceiled rejection sampler.
generatorP :: (Data a, MonadRandomLike m) => Size' -> m a Source #
generatorP
:: Int ->Gen
aasMonadRandom
.generatorP
::MonadRandom
m => Int -> m a
Generator of pointed values.
generatorPR :: (Data a, MonadRandomLike m) => Size' -> m a Source #
Pointed generator with rejection.
generatorR :: (Data a, MonadRandomLike m) => Size' -> m a Source #
Generator with rejection and dynamic average size.
Fixed size
The '
suffix indicates functions which do not do any
precomputation before passing the size parameter.
This means that oracles are computed from scratch for every size value, which may incur a significant overhead.
generatorP' :: (Data a, MonadRandomLike m) => Size' -> m a Source #
Pointed generator.
generatorPR' :: (Data a, MonadRandomLike m) => Size' -> m a Source #
Pointed generator with rejection.
generatorR' :: (Data a, MonadRandomLike m) => Size' -> m a Source #
Ceiled rejection sampler with given average size.
generator' :: (Data a, MonadRandomLike m) => Size' -> m a Source #
Basic boltzmann sampler with no optimization.
Generators with aliases
Boltzmann samplers can normally be defined only for types a
such that:
- they are instances of
Data
; - the set of types of subterms of values of type
a
is finite; - and all of these types have at least one finite value (i.e., values with finitely many constructors).
Examples of misbehaving types are:
a -> b -- Not Data
data E a = L a | R (E [a]) -- Contains a, [a], [[a]], [[[a]]], etc.
data I = C I -- No finite value
Alias
The Alias
type works around these limitations (AliasR
for rejection
samplers).
This existential wrapper around a user-defined function f :: a -> m b
makes boltzmann-samplers
view occurences of the type b
as a
when
processing a recursive system of types, possibly stopping some infinite
unrolling of type definitions. When a value of type b
needs to be
generated, it generates an a
which is passed to f
.
let as = [aliasR
$ \() -> return (L []) ::Gen
(E [[Int]])] ingeneratorSRWith
asasGen
::Size
->Gen
(E Int)
Another use case is to plug in user-defined generators where the default is
not satisfactory, for example, to generate positive Int
s:
let as = [alias
$ \() ->choose
(0, 100) ::Gen
Int)] ingeneratorPWith
asasGen
::Size
->Gen
[Int]
or to modify the weights assigned to some types. In particular, in some
cases it seems preferable to make String
(and Text
) have the same weight
as Int
and ()
.
let as = [alias
$ \() -> arbitrary ::Gen
String] ingeneratorPWith
asasGen
::Size
->Gen
(Either Int String)
generatorSRWith :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a Source #
generatorPWith :: (Data a, MonadRandomLike m) => [Alias m] -> Size' -> m a Source #
generatorPRWith :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a Source #
generatorRWith :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a Source #
Fixed size
generatorPWith' :: (Data a, MonadRandomLike m) => [Alias m] -> Size' -> m a Source #
generatorPRWith' :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a Source #
generatorRWith' :: (Data a, MonadRandomLike m) => [AliasR m] -> Size' -> m a Source #
generatorWith' :: (Data a, MonadRandomLike m) => [Alias m] -> Size' -> m a Source #
Other generators
Used in the implementation of the generators above. These also allow to apply pointing more than once.
Suffixes
M
- Sized generators are memoized for some sparsely chosen values of
sizes. Subsequently supplied sizes are approximated by the closest larger
value. This strategy avoids recomputing too many oracles. Aside from
singular samplers, all other generators above not marked by
'
use this.
_
- If the size parameter is
Nothing
, produces the singular generator (associated with the suffixS
); otherwise the generator produces values with average size equal to the given value.
generatorM :: (Data a, MonadRandomLike m) => [Alias m] -> Points -> Size' -> m a Source #
generatorMR :: (Data a, MonadRandomLike m) => [AliasR m] -> Points -> Size' -> (Size', Size') -> m a Source #
generator_ :: (Data a, MonadRandomLike m) => [Alias m] -> Points -> Maybe Size' -> m a Source #
Boltzmann sampler without rejection.
generatorR_ :: (Data a, MonadRandomLike m) => [AliasR m] -> Points -> Maybe Size' -> (Size', Size') -> m a Source #
Boltzmann sampler with rejection.
Auxiliary definitions
Type classes
class Monad m => MonadRandomLike m where Source #
defines basic components to build generators,
allowing the implementation to remain abstract over both the
MonadRandomLike
mGen
type and MonadRandom
instances.
For the latter, the wrapper AMonadRandom
is provided to avoid
overlapping instances.
Called for every constructor. Counter for ceiled rejection sampling.
doubleR :: Double -> m Double Source #
doubleR upperBound
: generates values in [0, upperBound]
.
integerR :: Integer -> m Integer Source #
integerR upperBound
: generates values in [0, upperBound-1]
.
Default Int
generator.
Default Double
generator.
Default Char
generator.
MonadRandomLike Gen Source # | |
MonadRandom m => MonadRandomLike (AMonadRandom m) Source # | |
MonadRandomLike m => MonadRandomLike (RejectT m) Source # | |
newtype AMonadRandom m a Source #
AMonadRandom | |
|
MonadTrans AMonadRandom Source # | |
Monad m => Monad (AMonadRandom m) Source # | |
Functor m => Functor (AMonadRandom m) Source # | |
Applicative m => Applicative (AMonadRandom m) Source # | |
MonadRandom m => MonadRandomLike (AMonadRandom m) Source # | |
Alias
coerceAlias :: Coercible m n => Alias m -> Alias n Source #
coerceAlias :: Alias m -> Alias (AMonadRandom m)
coerceAliases :: Coercible m n => [Alias m] -> [Alias n] Source #
coerceAliases :: [Alias m] -> [Alias (AMonadRandom m)]