map-reduce-folds-0.1.0.0: foldl wrappers for map-reduce

Copyright(c) Adam Conner-Sax 2019
LicenseBSD-3-Clause
Maintaineradam_conner_sax@yahoo.com
Stabilityexperimental
Safe HaskellSafe
LanguageHaskell2010

Control.MapReduce.Engines

Contents

Description

Types and functions used by all the engines. Notes:

  1. The provided grouping functions group elements into a Seq as this is a good default choice.
  2. The Streamly engine is the fastest in my benchmarks. It's the engine used by default if you import Control.MapReduce.Simple.
  3. All the engines take a grouping function as a parameter and default ones are provided. For simple map/reduce, the grouping step may be the bottleneck and I wanted to leave room for experimentation. I've tried (and failed!) to find anything faster than using Map or HashMap via toList . fromListWith (<>).
Synopsis

Fold Types

type MapReduceFold y k c q x d = Unpack x y -> Assign k y c -> Reduce k c d -> Fold x (q d) Source #

Type-alias for a map-reduce-fold engine

type MapReduceFoldM m y k c q x d = UnpackM m x y -> AssignM m k y c -> ReduceM m k c d -> FoldM m x (q d) Source #

Type-alias for a monadic (effectful) map-reduce-fold engine

Engine Helpers

reduceFunction :: (Foldable h, Functor h) => Reduce k x d -> k -> h x -> d Source #

Turn Reduce into a function we can apply

reduceFunctionM :: (Traversable h, Monad m) => ReduceM m k x d -> k -> h x -> m d Source #

Turn ReduceM into a function we can apply

groupBy Helpers

fromListWithHT :: forall h k v s. (HashTable h, Eq k, Hashable k) => (v -> v -> v) -> [(k, v)] -> ST s (h s k v) Source #

an implementation of fromListWith for mutable hashtables from the hastables package. Basically a copy fromList from that package using mutate instead of insert to apply the given function if the was already in the map. Might not be the ideal implementation. Notes:

  • This function is specific hashtable agnostic so you'll have to supply a specific implementation from the package via TypeApplication
  • This function returns the hash-table in the ST monad. You can fold over it (using foldM from hashtables) and then use runST to get the grouped structure out.