Copyright	(c) 2019 Composewell Technologies (c) 2013 Gabriel Gonzalez
License	BSD3
Maintainer	streamly@composewell.com
Stability	experimental
Portability	GHC
Safe Haskell	Safe-Inferred
Language	Haskell2010

Streamly.Internal.Data.Fold.Type

Contents

Imports
Types
Constructors
Folds
Combinators
Running A Fold
Transforming inner monad
Deprecated

Description

Stream Consumers

We can classify stream consumers in the following categories in order of increasing complexity and power:

Accumulators: Tee/Zip is simple, cannot be appended, good for scanning.
Terminating folds: Tee/Zip varies based on termination, can be appended, good for scanning, nesting (many) is easy.
Non-failing (backtracking only) parsers: cannot be used as scans because of backtracking, nesting is complicated because of backtracking, appending is efficient because of no Alternative, Alternative does not make sense because it cannot fail.
Parsers: Alternative on failure, appending is not as efficient because of buffering for Alternative.

First two are represented by the Fold type and the last two by the Parser type.

Folds that never terminate (Accumulators)

An Accumulator is the simplest type of fold, it never fails and never terminates. It can always accept more inputs (never terminates) and the accumulator is always valid. For example sum. Traditional Haskell left folds like foldl are accumulators.

Accumulators can be composed in parallel where we distribute the input stream to all accumulators. Since accumulators never terminate they cannot be appended.

An accumulator can be represented as:

data Fold0 m a b =
  forall s. Fold0
     (s -> a -> m s) -- step
     (m s)           -- initial
     (s -> m b)      -- extract

This is just a traditional left fold, compare with foldl. The driver of the fold would call initial at the beginning and then keep accumulating inputs into its result using step and finally extract the result using extract.

Folds that terminate after one or more input

Terminating folds are accumulators that can terminate, like accumulators they do not fail. Once a fold terminates it no longer accepts any more inputs. Terminating folds can be appended, the next fold can be applied after the first one terminates. Because they cannot fail, they do not need backtracking.

The take operation is an example of a terminating fold. It terminates after consuming n items. Coupled with an accumulator (e.g. sum) it can be used to process the stream into chunks of fixed size.

A terminating fold can be represented as:

data Step s b
    = Partial !s -- the fold can accept more input
    | Done !b    -- the fold is done

data Fold1 m a b =
  forall s. Fold1
     (s -> a -> m (Step s b)) -- step
     (m s)                    -- initial
     (s -> m b)               -- extract

The fold driver stops driving the fold as soon as the fold returns a Done. extract is required only if the fold has not stopped yet and the input ends. extract can never be called if the fold is Done.

Notice that the initial of Fold1 type does not return a Step type, therefore, it cannot say Done in initial. It always has to consume at least one element before it can say Done for termination, via the step function.

Folds that terminate after 0 or more input

The Fold1 type makes combinators like take 0 impossible to implement because they need to terminate even before they can consume any elements at all. Implementing this requires the initial function to be able to return Done.

data Fold m a b =
  forall s. Fold
     (s -> a -> m (Step s b)) -- step
     (m (Step s b))           -- initial
     (s -> m b)               -- extract

This is also required if we want to compose terminating folds using an Applicative or Monadic composition. pure needs to yield an output without having to consume an input.

initial now has the ability to terminate the fold without consuming any input based on the state of the monad.

In some cases it does not make sense to use a fold that does not consume any items at all, and it may even lead to an infinite loop. It might make sense to use a Fold1 type for such cases because it guarantees to consume at least one input, therefore, guarantees progress. For example, in classifySessionsBy or any other splitting operations it may not make sense to pass a fold that never consumes an input. However, we do not have a separate Fold1 type for the sake of simplicity of types/API.

Adding this capability adds a certain amount of complexity in the implementation of fold combinators. initial has to always handle two cases now. We could potentially not implement this in folds to keep fold implementation simpler, and these use cases can be transferred to the parser type. However, it would be a bit inconvenient to not have a take operation or to not be able to use `take 0` if we have it. Also, applicative and monadic composition of folds would not be possible.

Terminating Folds with backtracking

Consider the example of takeWhile operation, it needs to inspect an element for termination decision. However, it does not consume the element on which it terminates. To implement takeWhile a terminating fold will have to implement a way to return the unconsumed input to the fold driver.

Single element leftover case is quite common and its easy to implement it in terminating folds by adding a Done1 constructor in the Step type which indicates that the last element was not consumed by the fold. The following additional operations can be implemented as terminating folds if we do that.

takeWhile
groupBy
wordBy

However, it creates several complications. The most important one is that we cannot use such folds for scanning. We cannot backtrack after producing an output in a scan.

Nested backtracking

Nesting of backtracking folds increases the amount of backtracking required exponentially.

For example, the combinator many inner outer applies the outer fold on the input stream and applies the inner fold on the results of the outer fold.

many :: Monad m => Fold m b c -> Fold m a b -> Fold m a c

If the inner fold itself returns a Done1 then we need to backtrack all the elements that have been consumed by the outer fold to generate that value. We need backtracking of more than one element.

Arbitrary backtracking requires arbitrary buffering. However, we do not want to buffer unconditionally, only if the buffer is needed. One way to do this is to use a Continue constructor like parsers. When we have nested folds, the top level fold always returns a Continue to the driver until an output is generated by it, this means the top level driver keeps buffering until an output is generated via Partial or Done. Intermediate level Continue keep propagating up to the top level.

Parallel backtracking

In compositions like Alternative and Distributive we may have several branches. Each branch can backtrack independently. We need to keep the input as long as any of the branches need it. We can use a single copy of the buffer and maintain it based on all the branches, or we can make each branch have its own buffer. The latter approach may be simpler to implement. Whenever we branch we can introduce an independent buffer for backtracking. Or we can use a newtype that allows branched composition to handle backtracking.

Implementation Approach

To avoid these issues we can enforce, by using types, that the collecting folds can never return a leftover. This leads us to define a type that can never return a leftover. The use cases of single leftover can be transferred to parsers where we have general backtracking mechanism and single leftover is just a special case of backtracking.

This means: takeWhile, groupBy, wordBy would be implemented as parsers.

A proposed design is to use the same Step type with Error in Folds as well as Parsers. Folds won't use the Error constructor and even if they use, it will be equivalent to just throwing an error. They won't have an alternative.

Because of the complexity of implementing a distributive composition in presence of backtracking we could possibly have a type without backtracking but with the Continue constructor, and use either the Parser type or another type for backtracking.

Folds with an additional input

The Fold type does not allow a dynamic input to be used to generate the initial value of the fold accumulator. We can extend the type further to allow that:

data Refold m i a b =
  forall s. Refold
     (s -> a -> m (Step s b)) -- step
     (i -> m (Step s b))      -- initial
     (s -> m b)               -- extract

Parsers

The next upgrade after terminating folds with a leftover are parsers. Parsers are terminating folds that can fail and backtrack. Parsers can be composed using an alternative style composition where they can backtrack and apply another parser if one parser fails. satisfy is a simple example of a parser, it would succeed if the condition is satisfied and it would fail otherwise, on failure an alternative parser can be used on the same input.

We add Error and Continue to the Step type of fold. Continue is to skip producing an output or to backtrack. We also add the ability to backtrack in Partial and Done.:

Also extract now needs to be able to express an error. We could have it return the Step type as well but that makes the implementation more complicated.

data Step s b =
      Partial Int s   -- partial result and how much to backtrack
    | Done Int b      -- final result and how much to backtrack
    | Continue Int s  -- no result and how much to backtrack
    | Error String    -- error

data Parser a m b =
  forall s. Fold
     (s -> a -> m (Step s b))   -- step
     (m (Step s b))             -- initial
     (s -> m (Either String b)) -- extract

Types for Stream Consumers

We do not have a separate type for accumulators. Terminating folds are a superset of accumulators and to avoid too many types we represent both using the same type, Fold.

We do not club the leftovers functionality with terminating folds because of the reasons explained earlier. Instead combinators that require leftovers are implemented as the Parser type. This is a sweet spot to balance ease of use, type safety and performance. Using separate Accumulator and terminating fold types would encode more information in types but it would make ease of use, implementation, maintenance effort worse. Combining Accumulator, terminating folds and Parser into a single Parser type would make ease of use even better but type safety and performance worse.

One of the design requirements that we have placed for better ease of use and code reuse is that Parser type should be a strict superset of the Fold type i.e. it can do everything that a Fold can do and more. Therefore, folds can be easily upgraded to parsers and we can use parser combinators on folds as well when needed.

Fold Design

A fold is represented by a collection of "initial", "step" and "extract" functions. The "initial" action generates the initial state of the fold. The state is internal to the fold and maintains the accumulated output. The "step" function is invoked using the current state and the next input value and results in a Partial or Done. A Partial returns the next intermediate state of the fold, a Done indicates that the fold has terminated and returns the final value of the accumulator.

Every Partial indicates that a new accumulated output is available. The accumulated output can be extracted from the state at any point using "extract". "extract" can never fail. A fold returns a valid output even without any input i.e. even if you call "extract" on "initial" state it provides an output. This is not true for parsers.

In general, "extract" is used in two cases:

When the fold is used as a scan extract is called on the intermediate state every time it is yielded by the fold, the resulting value is yielded as a stream.
When the fold is used as a regular fold, extract is called once when we are done feeding input to the fold.

Alternate Designs

An alternate and simpler design would be to return the intermediate output via Partial along with the state, instead of using "extract" on the yielded state and remove the extract function altogether.

This may even facilitate more efficient implementation. Extract from the intermediate state after each yield may be more costly compared to the fold step itself yielding the output. The fold may have more efficient ways to retrieve the output rather than stuffing it in the state and using extract on the state.

However, removing extract altogether may lead to less optimal code in some cases because the driver of the fold needs to thread around the intermediate output to return it if the stream stops before the fold could return Done. When using this approach, the parseMany (FL.take filesize) benchmark shows a 2x worse performance even after ensuring everything fuses. So we keep the "extract" approach to ensure better perf in all cases.

But we could still yield both state and the output in Partial, the output can be used for the scan use case, instead of using extract. Extract would then be used only for the case when the stream stops before the fold completes.

Monoids

Monoids allow generalized, modular folding. The accumulators in this module can be expressed using mconcat and a suitable Monoid. Instead of writing folds we can write Monoids and turn them into folds.

Synopsis

data Step s b
- = Partial !s
- | Done !b
data Fold m a b = forall s. Fold (s -> a -> m (Step s b)) (m (Step s b)) (s -> m b)
foldl' :: Monad m => (b -> a -> b) -> b -> Fold m a b
foldlM' :: Monad m => (b -> a -> m b) -> m b -> Fold m a b
foldl1' :: Monad m => (a -> a -> a) -> Fold m a (Maybe a)
foldlM1' :: Monad m => (a -> a -> m a) -> Fold m a (Maybe a)
foldt' :: Monad m => (s -> a -> Step s b) -> Step s b -> (s -> b) -> Fold m a b
foldtM' :: (s -> a -> m (Step s b)) -> m (Step s b) -> (s -> m b) -> Fold m a b
foldr' :: Monad m => (a -> b -> b) -> b -> Fold m a b
foldrM' :: Monad m => (a -> b -> m b) -> m b -> Fold m a b
fromPure :: Applicative m => b -> Fold m a b
fromEffect :: Applicative m => m b -> Fold m a b
fromRefold :: Refold m c a b -> c -> Fold m a b
drain :: Monad m => Fold m a ()
toList :: Monad m => Fold m a [a]
toStreamK :: Monad m => Fold m a (StreamK n a)
toStreamKRev :: Monad m => Fold m a (StreamK n a)
rmapM :: Monad m => (b -> m c) -> Fold m a b -> Fold m a c
lmap :: (a -> b) -> Fold m b r -> Fold m a r
lmapM :: Monad m => (a -> m b) -> Fold m b r -> Fold m a r
postscan :: Monad m => Fold m a b -> Fold m b c -> Fold m a c
catMaybes :: Monad m => Fold m a b -> Fold m (Maybe a) b
scanMaybe :: Monad m => Fold m a (Maybe b) -> Fold m b c -> Fold m a c
filter :: Monad m => (a -> Bool) -> Fold m a r -> Fold m a r
filtering :: Monad m => (a -> Bool) -> Fold m a (Maybe a)
filterM :: Monad m => (a -> m Bool) -> Fold m a r -> Fold m a r
catLefts :: Monad m => Fold m a c -> Fold m (Either a b) c
catRights :: Monad m => Fold m b c -> Fold m (Either a b) c
catEithers :: Fold m a b -> Fold m (Either a a) b
take :: Monad m => Int -> Fold m a b -> Fold m a b
taking :: Monad m => Int -> Fold m a (Maybe a)
dropping :: Monad m => Int -> Fold m a (Maybe a)
splitWith :: Monad m => (a -> b -> c) -> Fold m x a -> Fold m x b -> Fold m x c
split_ :: Monad m => Fold m x a -> Fold m x b -> Fold m x b
data ManyState s1 s2
many :: Monad m => Fold m a b -> Fold m b c -> Fold m a c
manyPost :: Monad m => Fold m a b -> Fold m b c -> Fold m a c
groupsOf :: Monad m => Int -> Fold m a b -> Fold m b c -> Fold m a c
refoldMany :: Monad m => Fold m a b -> Refold m x b c -> Refold m x a c
refoldMany1 :: Monad m => Refold m x a b -> Fold m b c -> Refold m x a c
concatMap :: Monad m => (b -> Fold m a c) -> Fold m a b -> Fold m a c
duplicate :: Monad m => Fold m a b -> Fold m a (Fold m a b)
refold :: Monad m => Refold m b a c -> Fold m a b -> Fold m a c
teeWith :: Monad m => (a -> b -> c) -> Fold m x a -> Fold m x b -> Fold m x c
teeWithFst :: Monad m => (b -> c -> d) -> Fold m a b -> Fold m a c -> Fold m a d
teeWithMin :: Monad m => (b -> c -> d) -> Fold m a b -> Fold m a c -> Fold m a d
shortest :: Monad m => Fold m x a -> Fold m x b -> Fold m x (Either a b)
longest :: Monad m => Fold m x a -> Fold m x b -> Fold m x (Either a b)
extractM :: Monad m => Fold m a b -> m b
reduce :: Monad m => Fold m a b -> m (Fold m a b)
snoc :: Monad m => Fold m a b -> a -> m (Fold m a b)
addOne :: Monad m => a -> Fold m a b -> m (Fold m a b)
snocM :: Monad m => Fold m a b -> m a -> m (Fold m a b)
snocl :: Monad m => Fold m a b -> a -> Fold m a b
snoclM :: Monad m => Fold m a b -> m a -> Fold m a b
close :: Monad m => Fold m a b -> Fold m a b
isClosed :: Monad m => Fold m a b -> m Bool
morphInner :: (forall x. m x -> n x) -> Fold m a b -> Fold n a b
generalizeInner :: Monad m => Fold Identity a b -> Fold m a b
foldr :: Monad m => (a -> b -> b) -> b -> Fold m a b
serialWith :: Monad m => (a -> b -> c) -> Fold m x a -> Fold m x b -> Fold m x c

Imports

>>> :m
>>> :set -XFlexibleContexts
>>> import Control.Monad (void)
>>> import qualified Data.Foldable as Foldable
>>> import Data.Function ((&))
>>> import Data.Functor.Identity (Identity, runIdentity)
>>> import Data.IORef (newIORef, readIORef, writeIORef)
>>> import Data.Maybe (fromJust, isJust)
>>> import Data.Monoid (Endo(..), Last(..), Sum(..))

>>> import Streamly.Data.Array (Array)
>>> import Streamly.Data.Fold (Fold, Tee(..))
>>> import Streamly.Data.Stream (Stream)

>>> import qualified Streamly.Data.Array as Array
>>> import qualified Streamly.Data.Fold as Fold
>>> import qualified Streamly.Data.MutArray as MutArray
>>> import qualified Streamly.Data.Parser as Parser
>>> import qualified Streamly.Data.Stream as Stream
>>> import qualified Streamly.Data.StreamK as StreamK
>>> import qualified Streamly.Data.Unfold as Unfold

For APIs that have not been released yet.

>>> import qualified Streamly.Internal.Data.Fold as Fold
>>> import qualified Streamly.Internal.Data.Fold.Window as FoldW

Types

data Step s b Source #

Represents the result of the step of a Fold. Partial returns an intermediate state of the fold, the fold step can be called again with the state or the driver can use extract on the state to get the result out. Done returns the final result and the fold cannot be driven further.

Pre-release

Constructors

Partial !s
Done !b

Instances

Instances details

Bifunctor Step Source #	`first` maps over `Partial` and `second` maps over `Done`.
Instance details Defined in Streamly.Internal.Data.Fold.Step Methods bimap :: (a -> b) -> (c -> d) -> Step a c -> Step b d # first :: (a -> b) -> Step a c -> Step b c # second :: (b -> c) -> Step a b -> Step a c #
Functor (Step s) Source #	`fmap` maps over `Done`. fmap = `second`
Instance details Defined in Streamly.Internal.Data.Fold.Step Methods fmap :: (a -> b) -> Step s a -> Step s b # (<$) :: a -> Step s b -> Step s a #

data Fold m a b Source #

The type Fold m a b having constructor Fold step initial extract represents a fold over an input stream of values of type a to a final value of type b in Monad m.

The fold uses an intermediate state s as accumulator, the type s is internal to the specific fold definition. The initial value of the fold state s is returned by initial. The step function consumes an input and either returns the final result b if the fold is done or the next intermediate state (see Step). At any point the fold driver can extract the result from the intermediate state using the extract function.

NOTE: The constructor is not yet released, smart constructors are provided to create folds.

Constructors

forall s. Fold (s -> a -> m (Step s b)) (m (Step s b)) (s -> m b)

Fold step initial extract

Instances

Instances details

Monad m => Applicative (Fold m a) Source #	`Applicative` form of `splitWith`. Split the input serially over two folds. Note that this fuses but performance degrades quadratically with respect to the number of compositions. It should be good to use for less than 8 compositions.
Instance details Defined in Streamly.Internal.Data.Fold.Type Methods pure :: a0 -> Fold m a a0 # (<>) :: Fold m a (a0 -> b) -> Fold m a a0 -> Fold m a b # liftA2 :: (a0 -> b -> c) -> Fold m a a0 -> Fold m a b -> Fold m a c # (>) :: Fold m a a0 -> Fold m a b -> Fold m a b # (<*) :: Fold m a a0 -> Fold m a b -> Fold m a a0 #
Functor m => Functor (Fold m a) Source #	Maps a function on the output of the fold (the type `b`).
Instance details Defined in Streamly.Internal.Data.Fold.Type Methods fmap :: (a0 -> b) -> Fold m a a0 -> Fold m a b # (<$) :: a0 -> Fold m a b -> Fold m a a0 #

Constructors

foldl' :: Monad m => (b -> a -> b) -> b -> Fold m a b Source #

Make a fold from a left fold style pure step function and initial value of the accumulator.

If your Fold returns only Partial (i.e. never returns a Done) then you can use foldl'* constructors.

A fold with an extract function can be expressed using fmap:

mkfoldlx :: Monad m => (s -> a -> s) -> s -> (s -> b) -> Fold m a b
mkfoldlx step initial extract = fmap extract (foldl' step initial)

foldlM' :: Monad m => (b -> a -> m b) -> m b -> Fold m a b Source #

Make a fold from a left fold style monadic step function and initial value of the accumulator.

A fold with an extract function can be expressed using rmapM:

mkFoldlxM :: Functor m => (s -> a -> m s) -> m s -> (s -> m b) -> Fold m a b
mkFoldlxM step initial extract = rmapM extract (foldlM' step initial)

foldl1' :: Monad m => (a -> a -> a) -> Fold m a (Maybe a) Source #

Make a strict left fold, for non-empty streams, using first element as the starting value. Returns Nothing if the stream is empty.

Pre-release

foldlM1' :: Monad m => (a -> a -> m a) -> Fold m a (Maybe a) Source #

Like 'foldl1'' but with a monadic step function.

Pre-release

foldt' :: Monad m => (s -> a -> Step s b) -> Step s b -> (s -> b) -> Fold m a b Source #

Make a terminating fold using a pure step function, a pure initial state and a pure state extraction function.

Pre-release

foldtM' :: (s -> a -> m (Step s b)) -> m (Step s b) -> (s -> m b) -> Fold m a b Source #

Make a terminating fold with an effectful step function and initial state, and a state extraction function.

>>> foldtM' = Fold.Fold

We can just use Fold but it is provided for completeness.

Pre-release

foldr' :: Monad m => (a -> b -> b) -> b -> Fold m a b Source #

Make a fold using a right fold style step function and a terminal value. It performs a strict right fold via a left fold using function composition. Note that a strict right fold can only be useful for constructing strict structures in memory. For reductions this will be very inefficient.

Definitions:

>>> foldr' f z = fmap (flip appEndo z) $ Fold.foldMap (Endo . f)
>>> foldr' f z = fmap ($ z) $ Fold.foldl' (\g x -> g . f x) id

Example:

>>> Stream.fold (Fold.foldr' (:) []) $ Stream.enumerateFromTo 1 5
[1,2,3,4,5]

foldrM' :: Monad m => (a -> b -> m b) -> m b -> Fold m a b Source #

Like foldr' but with a monadic step function.

Example:

>>> toList = Fold.foldrM' (\a xs -> return $ a : xs) (return [])

Folds

fromPure :: Applicative m => b -> Fold m a b Source #

Make a fold that yields the supplied value without consuming any further input.

Pre-release

fromEffect :: Applicative m => m b -> Fold m a b Source #

Make a fold that yields the result of the supplied effectful action without consuming any further input.

Pre-release

fromRefold :: Refold m c a b -> c -> Fold m a b Source #

Make a fold from a consumer.

Internal

drain :: Monad m => Fold m a () Source #

A fold that drains all its input, running the effects and discarding the results.

>>> drain = Fold.drainMapM (const (return ()))
>>> drain = Fold.foldl' (\_ _ -> ()) ()

toList :: Monad m => Fold m a [a] Source #

Folds the input stream to a list.

Warning! working on large lists accumulated as buffers in memory could be very inefficient, consider using Streamly.Data.Array instead.

>>> toList = Fold.foldr' (:) []

toStreamK :: Monad m => Fold m a (StreamK n a) Source #

A fold that buffers its input to a pure stream.

>>> toStreamK = foldr StreamK.cons StreamK.nil
>>> toStreamK = fmap StreamK.reverse Fold.toStreamKRev

Internal

toStreamKRev :: Monad m => Fold m a (StreamK n a) Source #

Buffers the input stream to a pure stream in the reverse order of the input.

>>> toStreamKRev = Foldable.foldl' (flip StreamK.cons) StreamK.nil

This is more efficient than toStreamK. toStreamK has exactly the same performance as reversing the stream after toStreamKRev.

Pre-release

Combinators

Mapping output

rmapM :: Monad m => (b -> m c) -> Fold m a b -> Fold m a c Source #

Map a monadic function on the output of a fold.

Mapping Input

lmap :: (a -> b) -> Fold m b r -> Fold m a r Source #

lmap f fold maps the function f on the input of the fold.

Definition:

>>> lmap = Fold.lmapM return

Example:

>>> sumSquared = Fold.lmap (\x -> x * x) Fold.sum
>>> Stream.fold sumSquared (Stream.enumerateFromTo 1 100)
338350

lmapM :: Monad m => (a -> m b) -> Fold m b r -> Fold m a r Source #

lmapM f fold maps the monadic function f on the input of the fold.

postscan :: Monad m => Fold m a b -> Fold m b c -> Fold m a c Source #

Postscan the input of a Fold to change it in a stateful manner using another Fold.

postscan scanner collector

Pre-release

Filtering

catMaybes :: Monad m => Fold m a b -> Fold m (Maybe a) b Source #

Modify a fold to receive a Maybe input, the Just values are unwrapped and sent to the original fold, Nothing values are discarded.

>>> catMaybes = Fold.mapMaybe id
>>> catMaybes = Fold.filter isJust . Fold.lmap fromJust

scanMaybe :: Monad m => Fold m a (Maybe b) -> Fold m b c -> Fold m a c Source #

Use a Maybe returning fold as a filtering scan.

>>> scanMaybe p f = Fold.postscan p (Fold.catMaybes f)

Pre-release

filter :: Monad m => (a -> Bool) -> Fold m a r -> Fold m a r Source #

Include only those elements that pass a predicate.

>>> Stream.fold (Fold.filter (> 5) Fold.sum) $ Stream.fromList [1..10]
40

>>> filter p = Fold.scanMaybe (Fold.filtering p)
>>> filter p = Fold.filterM (return . p)
>>> filter p = Fold.mapMaybe (\x -> if p x then Just x else Nothing)

filtering :: Monad m => (a -> Bool) -> Fold m a (Maybe a) Source #

A scanning fold for filtering elements based on a predicate.

filterM :: Monad m => (a -> m Bool) -> Fold m a r -> Fold m a r Source #

Like filter but with a monadic predicate.

>>> f p x = p x >>= \r -> return $ if r then Just x else Nothing
>>> filterM p = Fold.mapMaybeM (f p)

catLefts :: Monad m => Fold m a c -> Fold m (Either a b) c Source #

Discard Rights and unwrap Lefts in an Either stream.

Pre-release

catRights :: Monad m => Fold m b c -> Fold m (Either a b) c Source #

Discard Lefts and unwrap Rights in an Either stream.

Pre-release

catEithers :: Fold m a b -> Fold m (Either a a) b Source #

Remove the either wrapper and flatten both lefts and as well as rights in the output stream.

Definition:

>>> catEithers = Fold.lmap (either id id)

Pre-release

Trimming

take :: Monad m => Int -> Fold m a b -> Fold m a b Source #

Take at most n input elements and fold them using the supplied fold. A negative count is treated as 0.

>>> Stream.fold (Fold.take 2 Fold.toList) $ Stream.fromList [1..10]
[1,2]

taking :: Monad m => Int -> Fold m a (Maybe a) Source #

dropping :: Monad m => Int -> Fold m a (Maybe a) Source #

Sequential application

splitWith :: Monad m => (a -> b -> c) -> Fold m x a -> Fold m x b -> Fold m x c Source #

Sequential fold application. Apply two folds sequentially to an input stream. The input is provided to the first fold, when it is done - the remaining input is provided to the second fold. When the second fold is done or if the input stream is over, the outputs of the two folds are combined using the supplied function.

Example:

>>> header = Fold.take 8 Fold.toList
>>> line = Fold.takeEndBy (== '\n') Fold.toList
>>> f = Fold.splitWith (,) header line
>>> Stream.fold f $ Stream.fromList "header: hello\n"
("header: ","hello\n")

Note: This is dual to appending streams using append.

Note: this implementation allows for stream fusion but has quadratic time complexity, because each composition adds a new branch that each subsequent fold's input element has to traverse, therefore, it cannot scale to a large number of compositions. After around 100 compositions the performance starts dipping rapidly compared to a CPS style implementation. When you need scaling use parser monad instead.

Time: O(n^2) where n is the number of compositions.

split_ :: Monad m => Fold m x a -> Fold m x b -> Fold m x b Source #

Same as applicative *>. Run two folds serially one after the other discarding the result of the first.

This was written in the hope that it might be faster than implementing it using splitWith, but the current benchmarks show that it has the same performance. So do not expose it unless some benchmark shows benefit.

Repeated Application (Splitting)

data ManyState s1 s2 Source #

many :: Monad m => Fold m a b -> Fold m b c -> Fold m a c Source #

Collect zero or more applications of a fold. many first second applies the first fold repeatedly on the input stream and accumulates it's results using the second fold.

>>> two = Fold.take 2 Fold.toList
>>> twos = Fold.many two Fold.toList
>>> Stream.fold twos $ Stream.fromList [1..10]
[[1,2],[3,4],[5,6],[7,8],[9,10]]

Stops when second fold stops.

Nested Application

concatMap :: Monad m => (b -> Fold m a c) -> Fold m a b -> Fold m a c Source #

Map a Fold returning function on the result of a Fold and run the returned fold. This operation can be used to express data dependencies between fold operations.

Let's say the first element in the stream is a count of the following elements that we have to add, then:

>>> import Data.Maybe (fromJust)
>>> count = fmap fromJust Fold.one
>>> total n = Fold.take n Fold.sum
>>> Stream.fold (Fold.concatMap total count) $ Stream.fromList [10,9..1]
45

This does not fuse completely, see refold for a fusible alternative.

Time: O(n^2) where n is the number of compositions.

Parallel Distribution

teeWith :: Monad m => (a -> b -> c) -> Fold m x a -> Fold m x b -> Fold m x c Source #

teeWith k f1 f2 distributes its input to both f1 and f2 until both of them terminate and combines their output using k.

Definition:

>>> teeWith k f1 f2 = fmap (uncurry k) (Fold.tee f1 f2)

Example:

>>> avg = Fold.teeWith (/) Fold.sum (fmap fromIntegral Fold.length)
>>> Stream.fold avg $ Stream.fromList [1.0..100.0]
50.5

For applicative composition using this combinator see Streamly.Data.Fold.Tee.

Parallel Alternative

shortest :: Monad m => Fold m x a -> Fold m x b -> Fold m x (Either a b) Source #

Shortest alternative. Apply both folds in parallel but choose the result from the one which consumed least input i.e. take the shortest succeeding fold.

If both the folds finish at the same time or if the result is extracted before any of the folds could finish then the left one is taken.

Pre-release

longest :: Monad m => Fold m x a -> Fold m x b -> Fold m x (Either a b) Source #

Longest alternative. Apply both folds in parallel but choose the result from the one which consumed more input i.e. take the longest succeeding fold.

If both the folds finish at the same time or if the result is extracted before any of the folds could finish then the left one is taken.

Pre-release

Running A Fold

extractM :: Monad m => Fold m a b -> m b Source #

Extract the accumulated result of the fold.

Definition:

>>> extractM = Fold.drive Stream.nil

Example:

>>> Fold.extractM Fold.toList
[]

Pre-release

reduce :: Monad m => Fold m a b -> m (Fold m a b) Source #

Evaluate the initialization effect of a fold. If we are building the fold by chaining lazy actions in fold init this would reduce the actions to a strict accumulator value.

Pre-release

snoc :: Monad m => Fold m a b -> a -> m (Fold m a b) Source #

Append a singleton value to the fold, in other words run a single step of the fold.

Example:

>>> import qualified Data.Foldable as Foldable
>>> Foldable.foldlM Fold.snoc Fold.toList [1..3] >>= Fold.drive Stream.nil
[1,2,3]

Pre-release

addOne :: Monad m => a -> Fold m a b -> m (Fold m a b) Source #

Append a singleton value to the fold.

See examples under addStream.

Pre-release

snocM :: Monad m => Fold m a b -> m a -> m (Fold m a b) Source #

Append a singleton value to the fold in other words run a single step of the fold.

Definition:

>>> snocM f = Fold.reduce . Fold.snoclM f

Pre-release

snocl :: Monad m => Fold m a b -> a -> Fold m a b Source #

Append a singleton value to the fold lazily, in other words run a single step of the fold.

Definition:

>>> snocl f = Fold.snoclM f . return

Example:

>>> import qualified Data.Foldable as Foldable
>>> Fold.extractM $ Foldable.foldl Fold.snocl Fold.toList [1..3]
[1,2,3]

Pre-release

snoclM :: Monad m => Fold m a b -> m a -> Fold m a b Source #

Append an effect to the fold lazily, in other words run a single step of the fold.

Pre-release

close :: Monad m => Fold m a b -> Fold m a b Source #

Close a fold so that it does not accept any more input.

isClosed :: Monad m => Fold m a b -> m Bool Source #

Check if the fold has terminated and can take no more input.

Pre-release

Transforming inner monad

morphInner :: (forall x. m x -> n x) -> Fold m a b -> Fold n a b Source #

Change the underlying monad of a fold. Also known as hoist.

Pre-release

generalizeInner :: Monad m => Fold Identity a b -> Fold m a b Source #

Adapt a pure fold to any monad.

>>> generalizeInner = Fold.morphInner (return . runIdentity)

Pre-release

Deprecated

foldr :: Monad m => (a -> b -> b) -> b -> Fold m a b Source #

Deprecated: Please use foldr' instead.

serialWith :: Monad m => (a -> b -> c) -> Fold m x a -> Fold m x b -> Fold m x c Source #

Deprecated: Please use "splitWith" instead