Copyright | Copyright (c) 2016 the Hakaru team |
---|---|
License | BSD3 |
Maintainer | wren@community.haskell.org |
Stability | experimental |
Portability | GHC-only |
Safe Haskell | None |
Language | Haskell2010 |
Disintegration via lazy partial evaluation.
N.B., the forward direction of disintegration is not just
partial evaluation! In the version discussed in the paper we
must also ensure that no heap-bound variables occur in the result,
which requires using HNFs rather than WHNFs. That condition is
sound, but a bit too strict; we could generalize this to handle
cases where there may be heap-bound variables remaining in neutral
terms, provided we (a) don't end up trying to go both forward
and backward on the same variable, and (more importantly) (b)
end up with the proper Jacobian. The paper version is rigged to
ensure that whenever we recurse into two subexpressions (e.g.,
the arguments to addition) one of them has a Jacobian of zero,
thus when going from x*y
to dx*y + x*dy
one of the terms
cancels out.
Developer's Note: To help keep the code clean, we use the
worker/wrapper transform. However, due to complexities in
typechecking GADTs, this often confuses GHC if you don't give
just the right type signature on definitions. This confusion
shows up whenever you get error messages about an "ambiguous"
choice of ABT
instance, or certain types of "couldn't match
a
with a1
" error messages. To eliminate these issues we use
-XScopedTypeVariables
. In particular, the abt
type variable
must be bound by the wrapper (i.e., the top-level definition),
and the workers should just refer to that same type variable
rather than quantifying over abother abt
type. In addition,
whatever other type variables there are (e.g., the xs
and a
of an abt xs a
argument) should be polymorphic in the workers
and should not reuse the other analogous type variables bound
by the wrapper.
Developer's Note: In general, we'd like to emit weights and
guards "as early as possible"; however, determining when that
actually is can be tricky. If we emit them way-too-early then
we'll get hygiene errors because bindings for the variables they
use have not yet been emitted. We can fix these hygiene erors
by calling atomize
, to ensure that all the necessary bindings
have already been emitted. But that may still emit things too
early, because emitting th variable-binding statements now means
that we can't go forward/backward on them later on; which may
cause us to bot unnecessarily. One way to avoid this bot issue
is to emit guards/weights later than necessary, by actually
pushing them onto the context (and then emitting them whenever
we residualize the context). This guarantees we don't emit too
early; but the tradeoff is we may end up generating duplicate
code by emitting too late. One possible (currently unimplemented)
solution to that code duplication issue is to let these statements
be emitted too late, but then have a post-processing step to
lift guards/weights up as high as they can go. To avoid problems
about testing programs/expressions for equality, we can use a
hash-consing trick so we keep track of the identity of guard/weight
statements, then we can simply compare those identities and only
after the lifting do we replace the identity hash with the actual
statement.
- disintegrateWithVar :: ABT Term abt => Text -> Sing a -> abt '[] (HMeasure (HPair a b)) -> [abt '[] (a :-> HMeasure b)]
- disintegrate :: ABT Term abt => abt '[] (HMeasure (HPair a b)) -> [abt '[] (a :-> HMeasure b)]
- densityWithVar :: ABT Term abt => Text -> Sing a -> abt '[] (HMeasure a) -> [abt '[] (a :-> HProb)]
- density :: ABT Term abt => abt '[] (HMeasure a) -> [abt '[] (a :-> HProb)]
- observe :: ABT Term abt => abt '[] (HMeasure a) -> abt '[] a -> [abt '[] (HMeasure a)]
- determine :: ABT Term abt => [abt '[] a] -> Maybe (abt '[] a)
- perform :: forall abt. ABT Term abt => MeasureEvaluator abt (Dis abt)
- atomize :: ABT Term abt => TermEvaluator abt (Dis abt)
- constrainValue :: ABT Term abt => abt '[] a -> abt '[] a -> Dis abt ()
- constrainOutcome :: forall abt a. ABT Term abt => abt '[] a -> abt '[] (HMeasure a) -> Dis abt ()
the Hakaru API
disintegrateWithVar :: ABT Term abt => Text -> Sing a -> abt '[] (HMeasure (HPair a b)) -> [abt '[] (a :-> HMeasure b)] Source #
Disintegrate a measure over pairs with respect to the lebesgue
measure on the first component. That is, for each measure kernel
n <- disintegrate m
we have that m == bindx lebesgue n
. The
first two arguments give the hint and type of the lambda-bound
variable in the result. If you want to automatically fill those
in, then see disintegrate
.
N.B., the resulting functions from a
to 'HMeasure b
are
indeed measurable, thus it is safe/appropriate to use Hakaru's
(':->)
rather than Haskell's (->)
.
BUG: Actually, disintegration is with respect to the Borel
measure on the first component of the pair! Alas, we don't really
have a clean way of describing this since we've no primitive
MeasureOp
for Borel measures.
Developer's Note: This function fills the role that the old
runDisintegrate
did (as opposed to the old function called
disintegrate
). [Once people are familiar enough with the new
code base and no longer recall what the old code base was doing,
this note should be deleted.]
disintegrate :: ABT Term abt => abt '[] (HMeasure (HPair a b)) -> [abt '[] (a :-> HMeasure b)] Source #
A variant of disintegrateWithVar
which automatically computes
the type via typeOf
.
densityWithVar :: ABT Term abt => Text -> Sing a -> abt '[] (HMeasure a) -> [abt '[] (a :-> HProb)] Source #
Return the density function for a given measure. The first two
arguments give the hint and type of the lambda-bound variable
in the result. If you want to automatically fill those in, then
see density
.
TODO: is the resulting function guaranteed to be measurable? If so, update this documentation to reflect that fact; if not, then we should make it into a Haskell function instead.
density :: ABT Term abt => abt '[] (HMeasure a) -> [abt '[] (a :-> HProb)] Source #
A variant of densityWithVar
which automatically computes the
type via typeOf
.
observe :: ABT Term abt => abt '[] (HMeasure a) -> abt '[] a -> [abt '[] (HMeasure a)] Source #
Constrain a measure such that it must return the observed value. In other words, the resulting measure returns the observed value with weight according to its density in the original measure, and gives all other values weight zero.
determine :: ABT Term abt => [abt '[] a] -> Maybe (abt '[] a) Source #
Arbitrarily choose one of the possible alternatives. In the future, this function should be replaced by a better one that takes some sort of strategy for deciding which alternative to choose.
Implementation details
perform :: forall abt. ABT Term abt => MeasureEvaluator abt (Dis abt) Source #
Simulate performing HMeasure
actions by simply emiting code
for those actions, returning the bound variable.
This is the function called (|>>)
in the disintegration paper.
atomize :: ABT Term abt => TermEvaluator abt (Dis abt) Source #
The goal of this function is to ensure the correctness criterion that given any term to be emitted, the resulting term is semantically equivalent but contains no heap-bound variables. That correctness criterion is necessary to ensure hygiene/scoping.
This particular implementation calls evaluate
recursively,
giving us something similar to full-beta reduction. However,
that is considered an implementation detail rather than part of
the specification of what the function should do. Also, it's a
gross hack and prolly a big part of why we keep running into
infinite looping issues.
This name is taken from the old finally tagless code, where "atomic" terms are (among other things) emissible; i.e., contain no heap-bound variables.
BUG: this function infinitely loops in certain circumstances (namely when dealing with neutral terms)
constrainValue :: ABT Term abt => abt '[] a -> abt '[] a -> Dis abt () Source #
Given an emissible term v0
(the first argument) and another
term e0
(the second argument), compute the constraints such
that e0
must evaluate to v0
. This is the function called
(<|)
in the disintegration paper, though notably we swap the
argument order so that the "value" is the first argument.
N.B., this function assumes (and does not verify) that the first
argument is emissible. So callers (including recursive calls)
must guarantee this invariant, by calling atomize
as necessary.
TODO: capture the emissibility requirement on the first argument in the types, to help avoid accidentally passing the arguments in the wrong order!
constrainOutcome :: forall abt a. ABT Term abt => abt '[] a -> abt '[] (HMeasure a) -> Dis abt () Source #
This is a helper function for constrainValue
to handle SBind
statements (just as the perform
argument to evaluate
is a
helper for handling SBind
statements).
N.B., We assume that the first argument, v0
, is already
atomized. So, this must be ensured before recursing, but we can
assume it's already been done by the IH. Technically, we con't
care whether the first argument is in normal form or not, just
so long as it doesn't contain any heap-bound variables.
This is the function called (<<|)
in the paper, though notably
we swap the argument order.
TODO: find some way to capture in the type that the first argument must be emissible, to help avoid accidentally passing the arguments in the wrong order!
TODO: under what circumstances is constrainOutcome x m
different
from constrainValue x =<< perform m
? If they're always the
same, then we should just use that as the definition in order
to avoid repeating ourselves