Frames-map-reduce-0.4.1.1: Frames wrapper for map-reduce-folds and some extra folds helpers.
Copyright: (c) Adam Conner-Sax 2019
License: BSD
Maintainer: adam_conner_sax@yahoo.com
Stability: experimental
Safe Haskell: None
Language: Haskell2010

Frames.Aggregation.Maybe

Description

Frames.Aggregation.Maybe contains types and functions to support a specific but common map/reduce operation. Frequently, data is given with more specificity than downstream operations require. Perhaps an age is given in years and we only need to know the age band. Assuming we know how to aggregate the data columns, we want to perform that aggregation on each subset of rows that collapses to the same simpler key, while perhaps leaving some other key columns alone. aggregateFold does this.

This module specializes the general versions to the (Maybe :. ElField) interpretation functor, since that is a frequent use case.

Type-alias for maps from one record key to another

type RecordKeyMap record f k k' = record (f :. ElField) k -> record (f :. ElField) k'

Type-alias for key aggregation functions.

Aggregation Function combinators

combineKeyAggregations :: forall (a :: [(Symbol, Type)]) b a' b' record. (a ⊆ (a ++ b), b ⊆ (a ++ b), Disjoint a' b' ~ 'True, RCastC a (a ++ b) record Maybe, RCastC b (a ++ b) record Maybe, IsoRec a' record Maybe, IsoRec b' record Maybe, IsoRec (a' ++ b') record Maybe) => RecordKeyMap record Maybe a a' -> RecordKeyMap record Maybe b b' -> RecordKeyMap record Maybe (a ++ b) (a' ++ b')

Combine 2 key aggregation functions over disjoint columns.

keyMap :: forall a b record. (KnownField a, KnownField b, RecGetFieldC a record Maybe '[a], IsoRec '[b] record Maybe, Applicative Maybe) => (Snd a -> Snd b) -> RecordKeyMap record Maybe '[a] '[b]

Promote an ordinary function a -> b to a RecordKeyMap aCol bCol where aCol holds values of type a and bCol holds values of type b.
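The two combinators above can be pictured with ordinary tuples standing in for records. The following is only a sketch of the idea in plain Haskell, not the library's implementation: `liftKey`, `combineKeys`, `AgeBand`, `Region`, `ageBand`, and `region` are hypothetical names, and it assumes that keyMap applies the function inside the Maybe layer (which the Applicative Maybe constraint suggests but the documentation does not state).

```haskell
import Data.Bifunctor (bimap)

-- Hypothetical coarse keys used only for this illustration.
data AgeBand = Under45 | AtLeast45 deriving (Eq, Show)
data Region = Northeast | West | Other deriving (Eq, Show)

ageBand :: Int -> AgeBand
ageBand age = if age < 45 then Under45 else AtLeast45

region :: String -> Region
region st
  | st `elem` ["NY", "MA"] = Northeast
  | st `elem` ["CA", "WA"] = West
  | otherwise = Other

-- Tuple analogue of keyMap under the Maybe interpretation (assumption):
-- the function is applied to a present value, and a missing key stays missing.
liftKey :: (a -> b) -> Maybe a -> Maybe b
liftKey = fmap

-- Tuple analogue of combineKeyAggregations: two key maps over
-- disjoint columns act side by side on the combined key.
combineKeys :: (a -> a') -> (b -> b') -> (a, b) -> (a', b')
combineKeys = bimap

main :: IO ()
main = print (combineKeys (liftKey ageBand) (liftKey region) (Just 30, Nothing))
```

In the library the same roles are played by records of named columns rather than tuple positions, which is why the real signatures carry the disjointness and subset constraints.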

Aggregation Folds

aggregateAllFold

Arguments

:: forall (ak :: [(Symbol, Type)]) ak' d record. ((ak' ++ d) ⊆ ((ak ++ d) ++ ak'), ak ⊆ (ak ++ d), ak' ⊆ (ak' ++ d), d ⊆ (ak' ++ d), Ord (record (Maybe :. ElField) ak'), Ord (record (Maybe :. ElField) ak), RCastC (ak' ++ d) ((ak ++ d) ++ ak') record Maybe, RCastC ak (ak ++ d) record Maybe, RCastC ak' (ak' ++ d) record Maybe, RCastC d (ak' ++ d) record Maybe, IsoRec d record Maybe, IsoRec (ak ++ d) record Maybe, IsoRec (ak' ++ d) record Maybe, IsoRec ak' record Maybe, IsoRec ((ak ++ d) ++ ak') record Maybe)
=> RecordKeyMap record Maybe ak ak'

get aggregated key from key

-> Fold (record (Maybe :. ElField) d) (record (Maybe :. ElField) d)

aggregate data

-> Fold (record (Maybe :. ElField) (ak ++ d)) [record (Maybe :. ElField) (ak' ++ d)] 

Given some group keys in columns k, some keys to aggregate over in columns ak, some keys to aggregate into in (new) columns ak', a (hopefully surjective) map from records of ak to records of ak', and a fold over the data in columns d that aggregates the rows where ak was distinct but ak' is not, produce a fold that transforms data keyed by k and ak into data keyed by k and ak', with the appropriate aggregations done in the d columns. Note that the signature above has no k columns: its input rows are keyed by ak alone, and aggregateFold is the variant that carries k through.

For example, suppose you have voter turnout data for all 50 US states, keyed by state and by voter age in years. The data has two columns: total votes cast and turnout as a percentage. You want to aggregate the ages into two bands, over and under some age. Then k is the state column, ak is the age column, and ak' is a new column whose type indicates over/under. The Fold has to sum the total votes and take a weighted sum of the percentages.
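The voter-turnout example can be sketched without the library, using tuples for rows and Data.Map (from the containers package) for grouping. This is an illustration of the operation only, under stated assumptions: `AgeBand`, `ageBand`, `mergeData`, `aggregateAges`, and `sample` are hypothetical names, the cutoff age of 45 and the sample figures are made up, and weighting turnout by votes cast is a simplification chosen for brevity rather than something the documentation specifies.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical aggregated key (the ak' column).
data AgeBand = Under45 | AtLeast45 deriving (Eq, Ord, Show)

-- Map from the fine key (age in years, ak) to the coarse key (ak').
ageBand :: Int -> AgeBand
ageBand age = if age < 45 then Under45 else AtLeast45

-- Aggregate two data cells (the d columns): sum the votes and take
-- the vote-weighted average of the turnout percentages (assumption).
mergeData :: (Int, Double) -> (Int, Double) -> (Int, Double)
mergeData (v1, t1) (v2, t2) = (v, t)
  where
    v = v1 + v2
    t | v == 0 = 0
      | otherwise = (fromIntegral v1 * t1 + fromIntegral v2 * t2) / fromIntegral v

-- Rows keyed by (state, age) become rows keyed by (state, band).
-- The state (k) is carried through untouched; rows whose ages fall
-- in the same band are merged with mergeData.
aggregateAges :: [((String, Int), (Int, Double))]
              -> Map.Map (String, AgeBand) (Int, Double)
aggregateAges rows =
  Map.fromListWith mergeData
    [ ((st, ageBand age), d) | ((st, age), d) <- rows ]

-- Made-up sample rows: ((state, age), (votes, turnout %)).
sample :: [((String, Int), (Int, Double))]
sample =
  [ (("NY", 30), (100, 50.0))
  , (("NY", 40), (300, 70.0))
  , (("NY", 60), (200, 80.0))
  , (("CA", 50), (400, 60.0))
  ]

main :: IO ()
main = mapM_ print (Map.toList (aggregateAges sample))
```

The two New York rows under 45 collapse into a single row, while the state key plays the role of k and is left alone.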

aggregateFold

Arguments

:: forall (k :: [(Symbol, Type)]) ak ak' d record. ((ak' ++ d) ⊆ ((ak ++ d) ++ ak'), ak ⊆ (ak ++ d), ak' ⊆ (ak' ++ d), d ⊆ (ak' ++ d), Ord (record (Maybe :. ElField) ak'), Ord (record (Maybe :. ElField) ak), (k ++ (ak' ++ d)) ~ ((k ++ ak') ++ d), Ord (record (Maybe :. ElField) k), k ⊆ ((k ++ ak') ++ d), k ⊆ ((k ++ ak) ++ d), (ak ++ d) ⊆ ((k ++ ak) ++ d), RCastC ak (ak ++ d) record Maybe, RCastC ak' (ak' ++ d) record Maybe, RCastC d (ak' ++ d) record Maybe, RCastC k ((k ++ ak) ++ d) record Maybe, RCastC (ak ++ d) ((k ++ ak) ++ d) record Maybe, RCastC (ak' ++ d) ((ak ++ d) ++ ak') record Maybe, IsoRec k record Maybe, IsoRec d record Maybe, IsoRec ((k ++ ak') ++ d) record Maybe, IsoRec (ak ++ d) record Maybe, IsoRec (ak' ++ d) record Maybe, IsoRec ak' record Maybe, IsoRec ((ak ++ d) ++ ak') record Maybe)
=> RecordKeyMap record Maybe ak ak'

get aggregated key from key

-> Fold (record (Maybe :. ElField) d) (record (Maybe :. ElField) d)

aggregate data

-> Fold (record (Maybe :. ElField) ((k ++ ak) ++ d)) [record (Maybe :. ElField) ((k ++ ak') ++ d)] 

Aggregate key columns ak into ak' while leaving key columns k alone. This allows aggregation over only some of the key fields. It will often require a type application to specify what k is.

mergeDataFolds :: forall (a :: (Symbol, Type)) b d record. (IsoRec '[b] record Maybe, IsoRec '[a] record Maybe, IsoRec '[a, b] record Maybe) => Fold (record (Maybe :. ElField) d) (record (Maybe :. ElField) '[a]) -> Fold (record (Maybe :. ElField) d) (record (Maybe :. ElField) '[b]) -> Fold (record (Maybe :. ElField) d) (record (Maybe :. ElField) '[a, b])
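Judging from its signature, mergeDataFolds takes two folds over the same data rows, each producing one column, and returns a fold producing both columns. A minimal sketch of that shape in plain Haskell follows. Everything here is an assumption made for illustration: the local `Fold` type is a stand-in modeled on a left fold with hidden state, not the type the library uses, and `runFold`, `mergeFolds`, `totalVotes`, and `maxTurnout` are hypothetical names.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.List (foldl')

-- Local stand-in for a left fold: a step function, an initial
-- accumulator, and an extraction function. The accumulator type is hidden.
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)

-- Run a fold over a list.
runFold :: Fold a b -> [a] -> b
runFold (Fold step begin done) = done . foldl' step begin

-- Run two folds over the same input in a single pass and pair the results,
-- the tuple analogue of producing a record with columns '[a, b].
mergeFolds :: Fold r a -> Fold r b -> Fold r (a, b)
mergeFolds (Fold stepA beginA doneA) (Fold stepB beginB doneB) =
  Fold step (beginA, beginB) done
  where
    step (xa, xb) r = (stepA xa r, stepB xb r)
    done (xa, xb) = (doneA xa, doneB xb)

-- Two single-column folds over (votes, turnout %) rows.
totalVotes :: Fold (Int, Double) Int
totalVotes = Fold (\acc (v, _) -> acc + v) 0 id

maxTurnout :: Fold (Int, Double) Double
maxTurnout = Fold (\acc (_, t) -> max acc t) 0 id

main :: IO ()
main = print (runFold (mergeFolds totalVotes maxTurnout) [(100, 50.0), (300, 70.0)])
```

Building the data fold column by column this way keeps each aggregation rule (sum, weighted average, and so on) separate until the final merge.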