monoidmap: Monoidal map type

[ apache, data-structures, library, monoidal ] [ Propose Tags ] [ Report a vulnerability ]

Monoidal map type with support for semigroup and monoid subclasses.


[Skip to Readme]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

  • No Candidates
Versions [RSS] 0.0.0.0, 0.0.0.1, 0.0.1.0, 0.0.1.1, 0.0.1.2, 0.0.1.3, 0.0.1.4, 0.0.1.5, 0.0.1.6, 0.0.1.7, 0.0.1.8, 0.0.1.9, 0.0.2.0
Change log CHANGELOG.md
Dependencies base (>=4.14.3.0 && <4.21), containers (>=0.6.5.1 && <0.8), deepseq (>=1.4.4.0 && <1.6), groups (>=0.5.3 && <0.6), monoid-subclasses (>=1.2.3 && <1.3), monoidmap, nonempty-containers (>=0.3.4.4 && <0.4), nothunks (>=0.1.3 && <0.3) [details]
License Apache-2.0
Copyright 2022–2024 Jonathan Knowles
Author Jonathan Knowles
Maintainer mail@jonathanknowles.net
Category Data Structures
Bug tracker https://github.com/jonathanknowles/monoidmap/issues
Source repo head: git clone https://github.com/jonathanknowles/monoidmap
Uploaded by JonathanKnowles at 2024-07-22T06:21:59Z
Distributions LTSHaskell:0.0.1.9, Stackage:0.0.2.0
Downloads 1005 total (112 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2024-07-22 [all 1 reports]

Readme for monoidmap-0.0.1.5

[back to package description]

monoidmap

Overview

This library provides a MonoidMap type that:

Relationship between keys and values

A map of type MonoidMap k v associates every possible key of type k with a value of type v:

MonoidMap.get :: (Ord k, Monoid v) => k -> MonoidMap k v -> v

The empty map associates every key k with a default value of mempty:

∀ k. MonoidMap.get k MonoidMap.empty == mempty

Comparison with standard Map type

The MonoidMap type differs from the standard containers Map type in how it relates keys to values:

Type Models a total function with finite support
Map k v from keys of type k to values of type Maybe v.
MonoidMap k v from keys of type k to values of type v.

This difference can be illustrated by comparing the type signatures of operations to query a key for its value, for both types:

      Map.lookup ::             k ->       Map k v -> Maybe v
MonoidMap.get    :: Monoid v => k -> MonoidMap k v ->       v

Whereas a standard Map has a default value of Nothing, a MonoidMap has a default value of mempty:

∀ k.       Map.lookup k       Map.empty == Nothing
∀ k. MonoidMap.get    k MonoidMap.empty == mempty

In practice, the standard Map type uses Maybe to indicate the presence or absence of a value for a particular key. This representation is necessary because the Map type imposes no constraints on value types.

However, monoidal types already have a natural way to represent null or empty values: the mempty constant, which represents the neutral or identity element of a Monoid.

Consequently, using a standard Map with a monoidal value type gives rise to two distinct representations for null or empty values:

Map.lookup k m Interpretation
Nothing Map m has no entry for key k.
Just mempty Map m has an entry for key k, but the value is empty.

In constrast, the MonoidMap type provides a single, canonical representation for null or empty values, according to the following conceptual mapping:

Map.lookup k m MonoidMap.get k m
Nothing mempty
Just v | v == mempty mempty
Just v | v /= mempty v

Advantages of using a canonical representation

A canonical representation for mempty values can make it easier to correctly implement operations that compare or combine pairs of maps.

When comparing or combining maps of the standard containers Map type, there are two cases to consider for each key k in each map:

  • Map m associates k with Nothing.
  • Map m associates k with Just v.

With a pair of maps, there are four possible cases to consider for each key.

For maps with monoidal values, and in contexts that assume or require a default value of mempty, there are now three cases to consider for each map:

  • Map m associates k with Nothing.
  • Map m associates k with Just v where v == mempty.
  • Map m associates k with Just v where v /= mempty.

With a pair of maps, there are now nine possible cases to consider for each key.

Mishandling cases such as these can give rise to subtle bugs that manifest in unexpected places. For maps with more complex value types (such as maps that nest other maps), the number of cases requiring consideration can easily multiply further, making it even easier to introduce bugs.

Since all MonoidMap operations provide a canonical representation for mempty values, it's possible to write functions that compare or combine maps without having to consider Nothing and Just mempty as separate cases.

Encoding

A MonoidMap only encodes mappings from keys to values that are not equal to mempty.

The total function \(T\) modelled by a MonoidMap is encoded as a support map \(S\), where \(S\) is the finite subset of key-value mappings in \(T\) for which values are not equal to mempty (denoted by \(\varnothing\)):

$S = \{ (k, v) \in T \ |\ v \ne \varnothing \} $

Automatic minimisation

All MonoidMap operations perform automatic minimisation of the support map, so that mempty values do not appear in:

Constraints on values

MonoidMap operations require the monoidal value type to be an instance of MonoidNull.

Instances of MonoidNull must provide a null indicator function that satisfies the following law:

null v == (v == mempty)

MonoidMap operations use the null indicator function to detect and exclude mempty values from the support map.

Note that it is not generally necessary for the value type to be an instance of Eq.

Justification

The set of monoidal types that admit a MonoidNull instance is strictly larger than the set of monoidal types that admit an Eq instance.

For any type v that is an instance of both Eq and Monoid, it is always possible to define a MonoidNull instance:

instance MonoidNull v where
    null = (== mempty)

However, there are monoidal types for which it is possible to define a MonoidNull instance, but not practical (or possible) to define a lawful Eq instance.

For example, consider the following type:

Maybe (String -> Sum Natural)

Requiring a MonoidNull constraint instead of an Eq constraint allows MonoidMap to be usable with a greater range of monoidal value types.

Examples of automatic minimisation

Updating a map with a value that may be mempty

Consider the following operation, which constructs a map of type MonoidMap Int String:

>>> m0 = fromList [(1, "hello"), (2, "brave"), (3, "new"), (4, "world")]
>>> m0
fromList [(1, "hello"), (2, "brave"), (3, "new"), (4, "world")]

The Monoid instance for String defines mempty to be the empty String "".

If we update the map to associate key 3 with value "", that association will no longer appear when encoding the map:

>>> m1 = MonoidMap.set 3 "" m0
>>> m1
fromList [(1, "hello"), (2, "brave"), (4, "world")]

However, we can still read the updated value for key 3:

>>> MonoidMap.get 3 m1
""
Constructing a map from a list that may include mempty values

Consider the following operation, which constructs a map of type MonoidMap Char (Sum Natural):

>>> m = fromList [('a', Sum 0), ('b', Sum 1), ('c', Sum 2), ('d', Sum 3)]

The Monoid instance for Sum Natural defines mempty to be Sum 0.

The original list contained a mapping from key 'a' to value Sum 0, but that association will not appear when encoding the map:

>>> m
fromList [('b', Sum 1), ('c', Sum 2), ('d', Sum 3)]

Nevertheless, we can still read the value for key 'a':

>>> MonoidMap.get 'a' m
Sum 0
Combining maps with operations that may produce mempty values

Consider the following operations, which construct two maps of type MonoidMap Char (Sum Natural):

>>> m1 = fromList [('a', Sum 1), ('b', Sum   1 )]
>>> m2 = fromList [('a', Sum 1), ('b', Sum (-1))]

The Semigroup instance for Sum Natural defines <> as equivalent to ordinary addition.

If we add both maps together with <>, then each key in the resulting map will be associated with the result of applying <> to each matching pair of values in the original maps. However, adding together the values for key 'b' with <> produces Sum 0, so this value will not appear in the encoding:

>>> m1 <> m2
fromList [('a', Sum 2)]

Nevertheless, we can still read the value for key 'b':

>>> MonoidMap.get 'b' (m1 <> m2)
Sum 0

Advantages of automatic minimisation

Consistency

Automatic exclusion of mempty values can help to ensure consistency when encoding to or decoding from other formats such as JSON, CBOR, or YAML.

For example, you may wish to ensure that:

  • When encoding a map, no mempty values appear in the encoded result.
  • When decoding a map, no mempty values appear in the decoded result.

Performance

Automatic exclusion of mempty values makes it possible to perform certain operations in constant time, rather than in linear time, as it is never necessary to traverse the entire map in order to determine which values may or may not be mempty:

Operation With
Automatic
Minimisation
Without
Automatic
Minimisation
null $O(1)$ $O(n)$
nonNull $O(1)$ $O(n)$
nonNullCount $O(1)$ $O(n)$
toMap $O(1)$ $O(n)$

Memory usage

Automatic minimisation makes it easier to reason about the memory usage of a MonoidMap, as memory is not required to encode mappings from keys to empty values.

This is a useful property for large, long-lived map structures that are subject to multiple updates over their lifetimes, where it's important to not retain large numbers of mappings from keys to empty values.

Simplicity

Some total map data types only perform minimisation when explicitly demanded to.

For example, the TMap data type provides a trim operation that traverses the map and removes any values that are equal to the default value. This approach has some advantages, such the ability to provide a lawful Functor instance.

However, this approach also has some disadvantages:

  • It might not be obvious when it's necessary to call trim. For example, consider the following operation: m1 <> m2. Could this operation produce a map where some keys map to default values? (Answer: it depends on the choice of default value and the underlying value type.)
  • Calling trim when it isn't necessary might adversely affect performance, since trim must traverse the entire data structure.
  • Not calling trim when it is necessary might affect correctness. The compiler will not help here, as trimmed and untrimmed maps share the same type.
  • Even if trim is a semantic no-op, default values can still be made visible by operations that encode maps to other types.

Since all MonoidMap operations perform automatic minimisation when appropriate, it's not necessary for users to reason about when or whether it's necessary to "trim" the map.

Furthermore, for nested maps such as MonoidMap k1 (MonoidMap k2 v), automatic minimisation of inner maps enables seamless automatic minimisation of outer maps. See the NestedMonoidMap type for an example of this.

Limitations of automatic minimisation

The MonoidMap type has no Functor instance, as the requirement to exclude mempty values means that the map operation must remove mempty values from its result. Therefore, map does not unconditionally satisfy the functor composition law:

map (f . g) == map f . map g
Example violation

Consider the following MonoidMap m:

m :: MonoidMap String String
m = singleton "k" "v"

And the following functions f and g:

f :: a -> String
f = const "z"

g :: Monoid a => b -> a
g = const mempty

By substituting the above definitions into the left-hand side of the functor composition law, we obtain:

map (f . g) m = map (const "z" . const mempty) (singleton "k" "v")
              = map (const "z"               ) (singleton "k" "v")
              =                                (singleton "k" "z")

By substituting the above definitions into the right-hand side of the functor composition law, we obtain:

map f (map g m) = map (const "z") (map (const mempty) (singleton "k" "v"))
                = map (const "z") mempty
                =                 mempty

This leads to the following inequality between the left-hand side and right-hand side:

singleton "k" "z" /= mempty

Therefore, for this example, the functor composition law is not satisfied.

However, if applying function f to mempty produces mempty, the functor composition law is satisfied:

(f mempty == mempty) ==> (∀ g. map (f . g) == map f . map g)

Monoidal operations

The MonoidMap type provides a comprehensive set of monoidal operations for transforming, combining, and comparing maps.

Instances for several subclasses of Semigroup and Monoid are provided, including classes from the following libraries:

At the root of this hierarchy of subclasses is the Semigroup class, whose instance for MonoidMap is defined in terms of the underlying value type, so that applying <> to a pair of maps is equivalent to applying <> to all pairs of values for matching keys:

∀ k. MonoidMap.get k (m1 <> m2) == MonoidMap.get k m1 <> get k m2

In general, operations for subclasses of Semigroup and Monoid are defined analogously to the Semigroup instance, so that:

  • unary operations on individual maps are defined in terms of their distributive application to all values.
  • binary operations on pairs of maps are defined in terms of their distributive application to all pairs of values for matching keys.

Unary monoidal operations typically satisfy a property similar to:

∀ k. MonoidMap.get k (f m) == f (MonoidMap.get k m)

Binary monoidal operations typically satisfy a property similar to:

∀ k. MonoidMap.get k (f m1 m2) == f (MonoidMap.get k m1) (MonoidMap.get k m2)

Defining monoidal operations in this way makes it possible to transform, combine, and compare maps in ways that are consistent with the algebraic properties of the underlying monoidal value type.

Examples of monoidal operations and their properties

MonoidMap
operation
Required
class
constraint
Equivalent
class
method
Distributive
relationship
append Semigroup (<>) ∀ k. get k (m1 <> m2) ≡ get k m1 <> get k m2
minus Group (~~) ∀ k. get k (m1 ~~ m2) ≡ get k m1 ~~ get k m2
monus Monus (<\>) ∀ k. get k (m1 <\> m2) ≡ get k m1 <\> get k m2
intersection GCDMonoid gcd ∀ k. get k (m1 `gcd` m2) ≡ get k m1 `gcd` get k m2
union LCMMonoid lcm ∀ k. get k (m1 `lcm` m2) ≡ get k m1 `lcm` get k m2

Examples of monoidal operations applied to values

Example: MonoidMap k (Set Integer)

For maps with Set-based values, MonoidMap.union and MonoidMap.intersection compute the Set.union and Set.intersection of each pair of matching values, respectively.

Consider the following maps of type MonoidMap Char (Set Integer):

>>> m1 = fromList [('a', Set.fromList [0, 1]), ('b', Set.fromList [3, 4])]
>>> m2 = fromList [('a', Set.fromList [0, 2]), ('b', Set.fromList [3, 5])]

The MonoidMap.union of maps m1 and m2 is a map that associates every key k with the Set.union of the corresponding sets for k in m1 and m2:

>>> m1 `union` m2
fromList [('a', Set.fromList [0,1,2]), ('b', Set.fromList [3,4,5])]

The MonoidMap.intersection of maps m1 and m2 is a map that associates every key k with the Set.intersection of the corresponding sets for k in m1 and m2:

>>> m1 `intersection` m2
fromList [('a', Set.fromList [0]), ('b', Set.fromList [3])]
Example: MonoidMap k (Sum Integer)

Consider the following maps of type MonoidMap Char (Sum Integer):

>>> m1 = fromList [('a', Sum 10), ('b', Sum 20), ('c, Sum 40)]
>>> m2 = fromList [('a', Sum 40), ('b', Sum 20), ('c, Sum 10)]

The MonoidMap.invert operation produces a new map where every key is associated with the negation of its value in the original map:

>>> invert m1
fromList [('a', Sum (-10)), ('b', Sum (-20)), ('c, Sum (-40))]

>>> invert m2
fromList [('a', Sum (-40)), ('b', Sum (-20)), ('c, Sum (-10))]

The MonoidMap.minus operation, when applied to maps m1 and m2, produces a new map where every key k is associated with the value of k in m1 minus the value of k in m2:

>>> m1 `minus` m2
fromList [('a', Sum (-30)), ('c', Sum 30)]

>>> m2 `minus` m1
fromList [('a', Sum 30), ('c', Sum (-30))]
Example: MonoidMap k (Sum Natural)

For maps with Sum Natural values, MonoidMap.union and MonoidMap.intersection compute the maximum and minimum of each pair of matching values, respectively:

>>> m1 = fromList [('a', Sum 10), ('b', Sum 20)]
>>> m2 = fromList [('a', Sum 20), ('b', Sum 10)]

>>> m1 `union` m2
fromList [('a', Sum 20), ('b', Sum 20)]

>>> m1 `intersection` m2
fromList [('a', Sum 10), ('b', Sum 10)]
Example: MonoidMap k (Product Natural)

For maps with Product Natural values, MonoidMap.union and MonoidMap.intersection compute the lowest common multiple (LCM) and greatest common divisor (GCD) of each pair of matching values, respectively:

>>> m1 = fromList [('a', Product  6), ('b', Product 15), ('c', Product 35)]
>>> m2 = fromList [('a', Product 15), ('b', Product 35), ('c', Product 77)]

>>> m1 `union` m2
fromList [('a', Product 30), ('b', Product 105), ('c', Product 385)]

>>> m1 `intersection` m2
fromList [('a', Product 3), ('b', Product 5), ('c', Product 7)]

General basis for more specialised map types

The MonoidMap type can be used as a general basis for building other more specialised map types.

If you have a Map-based data type with an invariant that values must not be mempty, then by expressing this type in terms of MonoidMap, MonoidMap will handle the invariant for you:

- newtype SomeMap k v = SomeMap (      Map k (SomeMonoidalContainer v))
+ newtype SomeMap k v = SomeMap (MonoidMap k (SomeMonoidalContainer v))

If you're already using a specialised non-empty container type to enforce the invariant that values must not be empty, then MonoidMap makes it possible to replace the use of the specialised non-empty container type with its ordinary equivalent:

Example transformations:

  -- Non-empty lists:
- newtype ListMap k v = ListMap (      Map k (NonEmpty v))
+ newtype ListMap k v = ListMap (MonoidMap k          [v])

  -- Non-empty sets:
- newtype SetMap k v = SetMap (      Map k (NonEmptySet v))
+ newtype SetMap k v = SetMap (MonoidMap k         (Set v))

  -- Non-empty sequences:
- newtype SeqMap k v = SeqMap (      Map k (NonEmptySeq v))
+ newtype SeqMap k v = SeqMap (MonoidMap k         (Seq v))

Using MonoidMap can simplify the implementation of such types, as special handling code for empty values can often be greatly simplified or even eliminated.

Real-world examples from the Haskell ecosystem

Example: SignedMultiSet (a signed multiset type)

The signed-multiset library provides the SignedMultiSet type, which is internally defined as a Map from elements to signed integer occurrence counts:

newtype SignedMultiset a = SMS {unSMS :: Map a Int}

All SignedMultiSet operations maintain an invariant that the internal Map must not contain any mappings to 0 (zero). This requires SignedMultiSet functions to detect and eliminate values of 0.

For example, the insertMany operation:

insertMany :: Ord a => a -> Int -> SignedMultiset a -> SignedMultiset a
insertMany x n = SMS . Map.alter f x . unSMS
  where
    f Nothing  = Just n
    f (Just m) = let k = m + n in if k == 0 then Nothing else Just k

Let's redefine SignedMultiSet in terms of MonoidMap:

- newtype SignedMultiset a = SMS {unSMS ::       Map a      Int }
+ newtype SignedMultiset a = SMS {unSMS :: MonoidMap a (Sum Int)}

Here we've used the Sum wrapper type, whose Monoid instance defines mempty as Sum 0, and <> as ordinary addition.

Now we can redefine insertMany (and similar operations) in a simpler way:

  insertMany :: Ord a => a -> Int -> SignedMultiset a -> SignedMultiset a
+ insertMany x n = SMS . MonoidMap.adjust (+ Sum n) x . unSMS
- insertMany x n = SMS . Map.alter f x . unSMS
-   where
-     f Nothing  = Just n
-     f (Just m) = let k = m + n in if k == 0 then Nothing else Just k

Since the MonoidMap.adjust operation performs automatic minimisation, values of Sum 0 are automatically excluded from the internal data structure, and there is no need to handle them differently from non-zero values.

Example: SetMultiMap (a set-based multimap type)

The multi-containers library provides the SetMultiMap type, which is internally defined as a Map from keys to (possibly-empty) sets of values, together with a Size parameter that records the total number of elements in the map (counting duplicates):

newtype SetMultimap k a = SetMultimap (Map k (Set a), Size)
type Size = Int

All SetMultiMap operations maintain an invariant that the internal Map must not contain any mappings to empty sets. This requires SetMultiMap functions to detect and eliminate values of Set.empty (indicated by the Set.null function).

For example, the alterWithKey operation detects if the updated set is empty, and if so, performs a deletion instead of an insertion:

alterWithKey :: Ord k => (k -> Set a -> Set a) -> k -> SetMultimap k a -> SetMultimap k a
alterWithKey f k mm@(SetMultimap (m, _))
    | Set.null as = fromMap (Map.delete k    m)
    | otherwise   = fromMap (Map.insert k as m)
  where
    as = f k (mm ! k)

fromMap :: Map k (Set a) -> SetMultimap k a
fromMap m = SetMultimap (m', sum (fmap Set.size m'))
  where
    m' = Map.filter (not . Set.null) m

Let's redefine SetMultiMap in terms of MonoidMap:

- newtype SetMultimap k a = SetMultimap (      Map k (Set a), Size)
+ newtype SetMultimap k a = SetMultimap (MonoidMap k (Set a), Size)

Now we can provide a simpler definition for alterWithKey (and other operations):

alterWithKey :: Ord k => (k -> Set a -> Set a) -> k -> SetMultimap k a -> SetMultimap k a
alterWithKey f k (SetMultimap (m, size)) = SetMultiMap
    (MonoidMap.set k new m, size - Set.size old + Set.size new)
  where
    old = MonoidMap.get k m
    new = f k old

Since the MonoidMap.set operation performs automatic minimisation, empty sets are automatically excluded from the internal data structure, and there is no need to handle them any differently from non-empty sets.

Example: MultiMap (a list-based multimap type)

The multi-containers library provides the MultiMap type, which is internally defined as a Map from keys to non-empty lists of values, together with a Size parameter that records the total number of elements in the map (counting duplicates):

newtype Multimap k a = Multimap (Map k (NonEmpty a), Size)
type Size = Int

All MultiMap operations maintain the invariant that the internal Map must not contain any mappings to empty lists. This invariant is handled rather nicely by the use of the NonEmpty list type, which disallows empty lists by construction. As a result, it's arguably more difficult to make a mistake in the implementation than it would be if MultiMap were defined in terms of ordinary lists.

However, certain operations still need to differentiate between the empty and non-empty case, and it's still necessary to handle each case specially.

For example, the alterWithKey operation detects if the updated list is empty, and if so, performs a deletion instead of an insertion:

alterWithKey :: Ord k => (k -> [a] -> [a]) -> k -> Multimap k a -> Multimap k a
alterWithKey f k mm@(Multimap (m, _)) = case nonEmpty (f k (mm ! k)) of
    Just as' -> fromMap (Map.insert k as' m)
    Nothing  -> fromMap (Map.delete k     m)

fromMap :: Map k (NonEmpty a) -> Multimap k a
fromMap m = Multimap (m, sum (fmap length m))

Let's redefine MultiMap in terms of MonoidMap and ordinary lists:

- newtype Multimap k a = Multimap (      Map k (NonEmpty a), Size)
+ newtype Multimap k a = Multimap (MonoidMap k          [a], Size)

Now we can provide a simpler definition for alterWithKey (and other operations):

alterWithKey :: Ord k => (k -> [a] -> [a]) -> k -> Multimap k a -> Multimap k a
alterWithKey f k (Multimap (m, size)) = MultiMap
    (MonoidMap.set k new m, size - List.length old + List.length new)
  where
    old = MonoidMap.get k m
    new = f k old

Since the MonoidMap.set operation performs automatic minimisation:

  • empty lists are automatically excluded from the internal data structure.
  • there is no need to use a specialised NonEmpty type.
  • there is no need to handle empty lists differently from non-empty lists.

Example: MultiAsset (a nested map type)

The cardano-ledger library provides the MultiAsset type, which models a nested mapping from PolicyID keys to AssetName keys to Integer values:

newtype MultiAsset c = MultiAsset (Map (PolicyID c) (Map AssetName Integer))

Each Integer value represents the value of an asset on the Cardano blockchain, where each asset is uniquely identified by the combination of a PolicyID and an AssetName. (Multiple assets can share the same PolicyID.)

All MultiAsset operations maintain a dual invariant that:

  • there must be no mappings from PolicyID keys to empty maps; and that
  • there must be no mappings from AssetName keys to Integer values of 0.

To satisfy this invariant, MultiAsset operations use a variety of helper functions to ensure that MultiAsset values are always in a canonical form.

For example, consider the Semigroup instance for MultiAsset:

instance Semigroup (MultiAsset c) where
    MultiAsset m1 <> MultiAsset m2 =
        MultiAsset (canonicalMapUnion (canonicalMapUnion (+)) m1 m2)

The above definition of <> performs pointwise addition of all pairs of values for matching assets.

For example, if:

Then:

  • MultiAsset m1 <> m2 will map asset a to a value of 30.

The definition of <> uses a function called canonicalMapUnion, which does the heavy lifting work of performing a union while ensuring that each resulting Map is in canonical form.

Let's have a look at the definition of canonicalMapUnion:

canonicalMapUnion ::
  (Ord k, CanonicalZero a) =>
  (a -> a -> a) ->
  Map k a ->
  Map k a ->
  Map k a
canonicalMapUnion _f t1 Tip                 = t1
canonicalMapUnion  f t1 (Bin _ k x Tip Tip) = canonicalInsert f k x t1
canonicalMapUnion  f (Bin _ k x Tip Tip) t2 = canonicalInsert f k x t2
canonicalMapUnion _f Tip t2                 = t2
canonicalMapUnion  f (Bin _ k1 x1 l1 r1) t2 = case Map.splitLookup k1 t2 of
  (l2, mb, r2) -> case mb of
    Nothing ->
      if x1 == zeroC
        then link2 l1l2 r1r2
        else link k1 x1 l1l2 r1r2
    Just x2 ->
      if new == zeroC
        then link2 l1l2 r1r2
        else link k1 new l1l2 r1r2
      where
        new = f x1 x2
    where
      !l1l2 = canonicalMapUnion f l1 l2
      !r1r2 = canonicalMapUnion f r1 r2

The canonicalMapUnion function in turn relies on canonicalInsert, which handles individual insertions:

canonicalInsert ::
  (Ord k, CanonicalZero a) =>
  (a -> a -> a) ->
  k ->
  a ->
  Map k a ->
  Map k a
canonicalInsert f !kx x = go
  where
    go Tip = if x == zeroC then Tip else Map.singleton kx x
    go (Bin sy ky y l r) =
      case compare kx ky of
        LT -> link ky y (go l) r
        GT -> link ky y l (go r)
        EQ -> if new == zeroC then link2 l r else Bin sy kx new l r
          where
            new = f y x

Similarly, the insertMultiAsset function, which "inserts" the value of an individual asset into a MultiAsset value, has the following definition:

insertMultiAsset ::
  (Integer -> Integer -> Integer) ->
  PolicyID c ->
  AssetName ->
  Integer ->
  MultiAsset c ->
  MultiAsset c
insertMultiAsset combine pid aid new (MultiAsset m1) =
  case Map.splitLookup pid m1 of
    (l1, Just m2, l2) ->
      case Map.splitLookup aid m2 of
        (v1, Just old, v2) ->
          if n == 0
            then
              let m3 = link2 v1 v2
               in if Map.null m3
                    then MultiAsset (link2 l1 l2)
                    else MultiAsset (link pid m3 l1 l2)
            else MultiAsset (link pid (link aid n v1 v2) l1 l2)
          where
            n = combine old new
        (_, Nothing, _) ->
          MultiAsset
            ( link
                pid
                ( if new == 0
                    then m2
                    else Map.insert aid new m2
                )
                l1
                l2
            )
    (l1, Nothing, l2) ->
      MultiAsset
        ( if new == 0
            then link2 l1 l2
            else link pid (Map.singleton aid new) l1 l2
        )

A notable feature of all the above functions is that they completely eschew the use of Map.merge. Instead, they directly manipulate constructors exported from Map.Internal. This approach was probably taken for performance reasons.

However, it's clear that maintaining the invariant in this way comes at a cost: the code is rather complex, and it were not for a comprehensive test suite, it would probably be very easy to introduce a regression.

In the spirit of demonstration, let's see what happens if we redefine the MultiAsset type in terms of MonoidMap:

- newtype MultiAsset c = MultiAsset (Map       (PolicyID c) (      Map AssetName      Integer))
+ newtype MultiAsset c = MultiAsset (MonoidMap (PolicyID c) (MonoidMap AssetName (Sum Integer))

Note that we have replaced Integer with Sum Integer, whose Monoid instance defines mempty as Sum 0, and whose Semigroup instance defines <> as equivalent to ordinary integer addition.

Recall that all MonoidMap operations automatically take care of the invariant that values cannot be mempty. For the MultiAsset type, this means that:

  • outer maps are now prevented from including any mappings from PolicyID to empty inner maps.
  • inner maps are now prevented from including any mappings from AssetName to values of Sum 0.

As a result, we can remove virtually all code that deals with canonicalisation.

For example, we can now simplify the Semigroup instance for MultiAsset, dispensing entirely with the need to call canonicalMapUnion:

  instance Semigroup (MultiAsset c) where
      MultiAsset m1 <> MultiAsset m2 =
-         MultiAsset (canonicalMapUnion (canonicalMapUnion (+)) m1 m2)
+         m1 <> m2

Given that the above instance is trivial, we can even derive the Semigroup and Monoid instances automatically:

  newtype MultiAsset c = MultiAsset (MonoidMap (PolicyID c) (MonoidMap AssetName (Sum Integer))
+     deriving newtype (Semigroup, Monoid)

We can also simplify the insertMultiAsset function:

  insertMultiAsset ::
    (Integer -> Integer -> Integer) ->
    PolicyID c ->
    AssetName ->
    Integer ->
    MultiAsset c ->
    MultiAsset c
  insertMultiAsset combine pid aid new (MultiAsset m1) =
+   MultiAsset $
+   MonoidMap.adjust
+     (MonoidMap.adjust (\(M.Sum old) -> M.Sum (combine old new)) aid) pid m1
-  ...
-  ### 27 lines deleted ###
-  ...

Finally, since MonoidMap already provides Eq and Group instances that are defined in terms of the underlying monoidal value type, we can automatically derive Eq and Group instances for MultiAsset:

  newtype MultiAsset c = MultiAsset (MonoidMap (PolicyID c) (MonoidMap AssetName (Sum Integer))
-     deriving newtype (Semigroup, Monoid)
+     deriving newtype (Eq, Semigroup, Monoid, Group)

- instance Eq (MultiAsset c) where
-   MultiAsset x == MultiAsset y = pointWise (pointWise (==)) x y
-
- instance Group (MultiAsset c) where
-   invert (MultiAsset m) =
-     MultiAsset (canonicalMap (canonicalMap ((-1 :: Integer) *)) m)

Many other simplifications are also possible. (Left as an exercise for readers!)

Comparison with other generalised map types

The Haskell ecosystem has several different types for maps with monoidal properties, and several different types that model total functions from keys to values. Each type comes with its own set of advantages and limitations.

Here's a comparison between the MonoidMap type provided by this library and types provided by other libraries:

Type


Features Class Instances
Models
total
functions
Performs
automatic
minimisation
Eq Monoid
subclasses
Group Functor Applicative
monoidmap
MonoidMap
(this library)
:heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :x: :x:
monoid‑map
MonoidMap
:x: :heavy_check_mark: :heavy_check_mark: :x: :heavy_check_mark: :heavy_check_mark: :x:
monoidal‑containers
MonoidalMap
:x: :x: :heavy_check_mark: :x: :x: :heavy_check_mark: :x:
total‑map
TMap
:heavy_check_mark: :x: :x: :x: :x: :heavy_check_mark: :heavy_check_mark:
total‑maps
TotalSparseMap
:heavy_check_mark: :x: :heavy_check_mark: :x: :x: :heavy_check_mark: :heavy_check_mark:
defaultable‑map
DefaultableMap
:x: :x: :heavy_check_mark: :x: :x: :heavy_check_mark: :heavy_check_mark:
chatter
DefaultMap
:heavy_check_mark: :x: :heavy_check_mark: :x: :x: :heavy_check_mark: :x:
stack
MonoidMap
:x: :x: :heavy_check_mark: :x: :heavy_check_mark: :heavy_check_mark: :x: