Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
- wilson :: Double -> Int -> Int -> (Double, Double, Double)
- invnormcdf :: (Ord a, Floating a) => a -> a
- choose :: Integral a => a -> a -> a
- estimateComplexity :: (Integral a, Floating b, Ord b) => a -> a -> Maybe b
- showNum :: Show a => a -> String
- showOOM :: Double -> String
- log1p :: (Floating a, Ord a) => a -> a
- expm1 :: (Floating a, Ord a) => a -> a
- (<#>) :: (Floating a, Ord a) => a -> a -> a
- log1mexp :: (Floating a, Ord a) => a -> a
- log1pexp :: (Floating a, Ord a) => a -> a
- lsum :: (Floating a, Ord a) => [a] -> a
- llerp :: (Floating a, Ord a) => a -> a -> a -> a
- sigmoid2 :: (Fractional a, Floating a) => a -> a
- isigmoid2 :: (Fractional a, Floating a) => a -> a
Documentation
wilson :: Double -> Int -> Int -> (Double, Double, Double)
Random useful stuff I didn't know where to put.
Calculates the Wilson score interval. If (l,m,h) = wilson c x n, then m is the binomial proportion and (l,h) is its c-confidence interval for x positive examples out of n observations. c is typically something like 0.05, so c appears to denote the significance level (giving a 95% interval) rather than the coverage.
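The interval described above can be sketched as follows. This is a minimal illustration, not this module's implementation: wilsonZ is a hypothetical name, and it takes the standard normal quantile z as an explicit argument instead of deriving it from c (assumption: z = 1.959964 is the two-sided quantile for c = 0.05).

```haskell
-- Sketch of the Wilson score interval with the normal quantile z supplied
-- by the caller. wilsonZ is a hypothetical helper, not part of this module.
wilsonZ :: Double -> Int -> Int -> (Double, Double, Double)
wilsonZ z x n = ((centre - half) / denom, p, (centre + half) / denom)
  where
    nn     = fromIntegral n
    p      = fromIntegral x / nn          -- observed proportion x / n
    centre = p + z * z / (2 * nn)         -- shifted midpoint, before scaling
    half   = z * sqrt (p * (1 - p) / nn + z * z / (4 * nn * nn))
    denom  = 1 + z * z / nn               -- common scaling factor
```

For example, wilsonZ 1.959964 5 10 yields a proportion of 0.5 with bounds that are symmetric around it, because the observed proportion is exactly one half.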
invnormcdf :: (Ord a, Floating a) => a -> a
estimateComplexity :: (Integral a, Floating b, Ord b) => a -> a -> Maybe b
Try to estimate the complexity of a whole from a sample. Suppose we sampled total things, and among those, singles occurred only once. How many different things are there?
Let the total number be m. The copy number follows a Poisson distribution with parameter \( \lambda \). Let \( z := e^{\lambda} \), then we have:
\[ P(0) = e^{-\lambda} = 1/z \]
\[ P(1) = \lambda e^{-\lambda} = \ln z / z \]
\[ P(\ge 1) = 1 - e^{-\lambda} = 1 - 1/z \]
\[ \mathrm{singles} = m \ln z / z \]
\[ \mathrm{total} = m (1 - 1/z) \]
\[ D := \mathrm{total} / \mathrm{singles} = (1 - 1/z) \cdot z / \ln z \]
\[ f := z - 1 - D \ln z = 0 \]
To get z, we solve using Newton iteration and then substitute to get m:
\[ df/dz = 1 - D/z \]
\[ z' := z - z (z - 1 - D \ln z) / (z - D) \]
\[ m = \mathrm{singles} \cdot z / \ln z \]
It converges as long as the initial z is large enough, and 10D (in the line for zz below) appears to work well.
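The Newton iteration above can be sketched as follows. This is an illustration under stated assumptions rather than this module's code: estimateSketch is a hypothetical name, the types are fixed to Int and Double, and the iteration cap of 50, the relative tolerance of 1e-12, and the choice to return Nothing when singles is zero or not smaller than total (D <= 1, where only the trivial root z = 1 exists) are all assumptions.

```haskell
-- Sketch of the Newton iteration for f(z) = z - 1 - D ln z, started at 10 D.
-- estimateSketch is a hypothetical name; cap and tolerance are assumptions.
estimateSketch :: Int -> Int -> Maybe Double
estimateSketch total singles
  | singles <= 0 || singles >= total = Nothing   -- no usable root (assumption)
  | otherwise                        = go (50 :: Int) (10 * d)
  where
    d = fromIntegral total / fromIntegral singles

    -- Give up after a fixed number of steps instead of looping forever.
    go 0 _ = Nothing
    go k z
      | abs (z' - z) <= 1e-12 * abs z' = Just (fromIntegral singles * z' / log z')
      | otherwise                      = go (k - 1) z'
      where
        z' = z - z * (z - 1 - d * log z) / (z - d)   -- one Newton step
```

As a sanity check, a population of 1000 things sampled with lambda = 1 gives roughly 632 distinct things and 368 singles, and estimateSketch 632 368 recovers a value close to 1000.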
log1p :: (Floating a, Ord a) => a -> a
Computes log (1+x) to a relative precision of 10^-8 even for very small x. Stolen from http://www.johndcook.com/cpp_log_one_plus_x.html
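One common way to reach this precision is to switch to a short Taylor series when x is small. The sketch below illustrates that idea; log1pSketch is a hypothetical name, and the threshold 1e-4 is an assumption chosen so that the truncation error x^2/3 stays below 10^-8.

```haskell
-- Sketch: log (1 + x) via a two-term Taylor series for small x.
-- log1pSketch and the 1e-4 threshold are assumptions for illustration.
log1pSketch :: Double -> Double
log1pSketch x
  | abs x > 1e-4 = log (1 + x)         -- rounding in 1 + x is harmless here
  | otherwise    = (1 - 0.5 * x) * x   -- log (1+x) = x - x^2/2 + O(x^3)
```

The naive log (1 + x) loses most significant digits of a tiny x when it is added to 1, which is the failure mode the series branch avoids.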
expm1 :: (Floating a, Ord a) => a -> a
Computes exp x - 1 to a relative precision of 10^-10 even for very small x. Stolen from http://www.johndcook.com/cpp_expm1.html
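The same small-argument trick applies here. In this sketch, expm1Sketch is a hypothetical name and the threshold 1e-5 is an assumption chosen so that the truncation error x^2/6 stays below 10^-10.

```haskell
-- Sketch: exp x - 1 via a two-term Taylor series for small x.
-- expm1Sketch and the 1e-5 threshold are assumptions for illustration.
expm1Sketch :: Double -> Double
expm1Sketch x
  | abs x < 1e-5 = x + 0.5 * x * x   -- exp x - 1 = x + x^2/2 + O(x^3)
  | otherwise    = exp x - 1         -- cancellation is mild for larger |x|
```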
(<#>) :: (Floating a, Ord a) => a -> a -> a infixl 5
Computes log (exp x + exp y) without leaving the log domain and hence without losing precision.
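A usual way to stay in the log domain is to factor out the larger argument, so that only a non-positive difference is ever exponentiated. The sketch below shows that technique; logAdd is a hypothetical name, it uses log1p from base's Numeric module rather than this module's version, and the handling of infinite inputs is an assumption.

```haskell
import Numeric (log1p)

-- Sketch: log (exp x + exp y) = hi + log (1 + exp (lo - hi)).
-- logAdd is a hypothetical name, not this module's operator.
logAdd :: Double -> Double -> Double
logAdd x y
  | isInfinite hi = hi                       -- avoids NaN from inf - inf (assumption)
  | otherwise     = hi + log1p (exp (lo - hi))
  where
    hi = max x y
    lo = min x y
```

With this form, logAdd (-1000) (-1000) gives about -1000 + log 2, whereas evaluating log (exp (-1000) + exp (-1000)) directly underflows to log 0.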
log1mexp :: (Floating a, Ord a) => a -> a
Computes log (1 - exp x), following Martin Mächler.
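Mächler's recommendation is to pick between two formulas at the cutoff -log 2. A minimal sketch of that scheme, assuming x < 0 and using log1p and expm1 from base's Numeric module rather than this module's versions (log1mexpSketch is a hypothetical name):

```haskell
import Numeric (expm1, log1p)

-- Sketch of log (1 - exp x) for x < 0 with a cutoff at -log 2.
-- log1mexpSketch is a hypothetical name used for illustration.
log1mexpSketch :: Double -> Double
log1mexpSketch x
  | x > negate (log 2) = log (negate (expm1 x))   -- x close to 0
  | otherwise          = log1p (negate (exp x))   -- x far below 0
```

For x = -1e-20 the direct expression 1 - exp x rounds to 0, while the expm1 branch still returns about log 1e-20.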
log1pexp :: (Floating a, Ord a) => a -> a
Computes log (1 + exp x), following Martin Mächler.
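Mächler's scheme for this function splits the real line into four ranges. The sketch below uses the cutoffs -37, 18 and 33.3 that he suggests for double precision; log1pexpSketch is a hypothetical name, and whether this module uses the same cutoffs is an assumption.

```haskell
import Numeric (log1p)

-- Sketch of log (1 + exp x) with range-dependent formulas.
-- log1pexpSketch is a hypothetical name; cutoffs assume Double.
log1pexpSketch :: Double -> Double
log1pexpSketch x
  | x <= -37  = exp x                 -- log (1 + e) is e for tiny e
  | x <= 18   = log1p (exp x)         -- general case
  | x <= 33.3 = x + exp (negate x)    -- log (1 + exp x) = x + log (1 + exp (-x))
  | otherwise = x                     -- correction is below Double precision
```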
lsum :: (Floating a, Ord a) => [a] -> a
Computes \( \log ( \sum_i e^{x_i} ) \) sensibly. The list must be sorted in descending(!) order.
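One plausible reading of the ordering requirement is that the head of the list is then the maximum and can be factored out of the sum. The sketch below follows that reading; lsumSketch is a hypothetical name, it uses log1p from base's Numeric module, and the results for empty lists and infinite heads are assumptions.

```haskell
import Numeric (log1p)

-- Sketch: log (sum_i exp x_i) for a list sorted in descending order,
-- factoring out the head. lsumSketch is a hypothetical name.
lsumSketch :: [Double] -> Double
lsumSketch []       = -1 / 0          -- log of an empty sum (assumption)
lsumSketch (x : xs)
  | isInfinite x = x                  -- avoids NaN from inf - inf (assumption)
  | otherwise    = x + log1p (sum [exp (y - x) | y <- xs])
```

Every exponent y - x is non-positive, so no term can overflow, and the largest term is represented exactly by the leading x.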
llerp :: (Floating a, Ord a) => a -> a -> a -> a
Computes \( \log \left( c e^x + (1-c) e^y \right) \).
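The same shift-by-the-maximum idea gives a stable form of this interpolation. In the sketch below, llerpSketch is a hypothetical name, and returning y or x directly when c is outside the open interval (0,1) is an assumption, not documented behaviour.

```haskell
-- Sketch: log (c * exp x + (1 - c) * exp y), shifted by max x y so that
-- neither exponential can overflow. llerpSketch is a hypothetical name.
llerpSketch :: Double -> Double -> Double -> Double
llerpSketch c x y
  | c <= 0    = y                     -- all weight on y (assumption)
  | c >= 1    = x                     -- all weight on x (assumption)
  | otherwise = m + log (c * exp (x - m) + (1 - c) * exp (y - m))
  where
    m = max x y
```

For example, llerpSketch 0.25 (log 4) (log 8) is log 7, since 0.25 * 4 + 0.75 * 8 = 7.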
sigmoid2 :: (Fractional a, Floating a) => a -> a
Kind-of sigmoid function that maps the reals to the interval [0,1). Good to compute a probability without introducing boundary conditions.