safecopy-0.8.5: Binary serialization with version control.

CopyrightPublicDomain
Maintainerlemmih@gmail.com
Portabilitynon-portable (uses GHC extensions)
Safe HaskellTrustworthy
LanguageHaskell98

Data.SafeCopy

Contents

Description

SafeCopy extends the parsing and serialization capabilities of Data.Serialize to include nested version control. Nested version control means that you can change the definition and binary format of a type nested deep within other types without problems.

Consider this scenario. You want to store your contact list on disk and so write the following code:

type Name     = String
type Address  = String
data Contacts = Contacts [(Name, Address)]
instance SafeCopy Contacts where
     putCopy (Contacts list) = contain $ safePut list
     getCopy = contain $ Contacts <$> safeGet
 

At this point, everything is fine. You get the awesome speed of Data.Serialize together with Haskell's ease of use. However, things quickly take a U-turn for the worse when you realize that you want to keep phone numbers as well as names and addresses. Being the experienced coder that you are, you see that using a 3-tuple isn't very pretty and you'd rather use a record. At first you fear that this change in structure will invalidate all your old data. Those fears are quickly quelled, though, when you remember how nifty SafeCopy is. With renewed enthusiasm, you set out and write the following code:

type Name = String
type Address = String
type Phone = String

{- We rename our old Contacts structure -}
data Contacts_v0 = Contacts_v0 [(Name, Address)]
instance SafeCopy Contacts_v0 where
     putCopy (Contacts_v0 list) = contain $ safePut list
     getCopy = contain $ Contacts_v0 <$> safeGet

data Contact = Contact { name    :: Name
                        , address :: Address
                        , phone   :: Phone }
instance SafeCopy Contact where
    putCopy Contact{..} = contain $ do safePut name; safePut address; safePut phone
    getCopy = contain $ Contact <$> safeGet <*> safeGet <*> safeGet

data Contacts = Contacts [Contact]
instance SafeCopy Contacts where
     version = 2
     kind = extension
     putCopy (Contacts contacts) = contain $ safePut contacts
     getCopy = contain $ Contacts <$> safeGet

{- Here the magic happens: -}
instance Migrate Contacts where
     type MigrateFrom Contacts = Contacts_v0
     migrate (Contacts_v0 contacts) = Contacts [ Contact{ name    = name
                                                        , address = address
                                                        , phone   = "" }
                                               | (name, address) <- contacts ]
 

With this, you reflect on your code and you are happy. You feel confident in the safety of your data and you know you can remove Contacts_v0 once you no longer wish to support that legacy format.

Synopsis

Documentation

safeGet :: SafeCopy a => Get a Source

Parse a version tagged data type and then migrate it to the desired type. Any serialized value has been extended by the return type can be parsed.

safePut :: SafeCopy a => a -> Put Source

Serialize a data type by first writing out its version tag. This is much simpler than the corresponding safeGet since previous versions don't come into play.

class SafeCopy a where Source

The centerpiece of this library. Defines a version for a data type together with how it should be serialized/parsed.

Users should define instances of SafeCopy for their types even though getCopy and putCopy can't be used directly. To serialize/parse a data type using SafeCopy, see safeGet and safePut.

Minimal complete definition

Nothing

Methods

version :: Version a Source

The version of the type.

Only used as a key so it must be unique (this is checked at run-time) but doesn't have to be sequential or continuous.

The default version is '0'.

kind :: Kind a Source

The kind specifies how versions are dealt with. By default, values are tagged with their version id and don't have any previous versions. See extension and the much less used primitive.

getCopy :: Contained (Get a) Source

This method defines how a value should be parsed without also worrying about writing out the version tag. This function cannot be used directly. One should use safeGet, instead.

putCopy :: a -> Contained Put Source

This method defines how a value should be parsed without worrying about previous versions or migrations. This function cannot be used directly. One should use safeGet, instead.

objectProfile :: Profile a Source

Version profile.

errorTypeName :: Proxy a -> String Source

The name of the type. This is only used in error message strings. Feel free to leave undefined in your instances.

Instances

SafeCopy Bool 
SafeCopy Char 
SafeCopy Double 
SafeCopy Float 
SafeCopy Int 
SafeCopy Int8 
SafeCopy Int16 
SafeCopy Int32 
SafeCopy Int64 
SafeCopy Integer 
SafeCopy Ordering 
SafeCopy Word8 
SafeCopy Word16 
SafeCopy Word32 
SafeCopy Word64 
SafeCopy () 
SafeCopy ByteString 
SafeCopy ByteString 
SafeCopy IntSet 
SafeCopy Month 
SafeCopy Day 
SafeCopy ClockTime 
SafeCopy CalendarTime 
SafeCopy TimeDiff 
SafeCopy Text 
SafeCopy Text 
SafeCopy AbsoluteTime 
SafeCopy LocalTime 
SafeCopy ZonedTime 
SafeCopy TimeOfDay 
SafeCopy TimeZone 
SafeCopy UTCTime 
SafeCopy NominalDiffTime 
SafeCopy Day 
SafeCopy UniversalTime 
SafeCopy DiffTime 
SafeCopy a => SafeCopy [a] 
(Integral a, SafeCopy a) => SafeCopy (Ratio a) 
(HasResolution a, Fractional (Fixed a)) => SafeCopy (Fixed a) 
SafeCopy a => SafeCopy (Maybe a) 
SafeCopy a => SafeCopy (IntMap a) 
(SafeCopy a, Ord a) => SafeCopy (Set a) 
SafeCopy a => SafeCopy (Tree a) 
SafeCopy a => SafeCopy (Seq a) 
SafeCopy a => SafeCopy (Vector a) 
(SafeCopy a, Unbox a) => SafeCopy (Vector a) 
(SafeCopy a, Storable a) => SafeCopy (Vector a) 
(SafeCopy a, Prim a) => SafeCopy (Vector a) 
SafeCopy a => SafeCopy (Prim a) 
(SafeCopy a, SafeCopy b) => SafeCopy (Either a b) 
(SafeCopy a, SafeCopy b) => SafeCopy (a, b) 
(IArray UArray e, Ix i, SafeCopy e, SafeCopy i) => SafeCopy (UArray i e) 
(Ix i, SafeCopy e, SafeCopy i) => SafeCopy (Array i e) 
(SafeCopy a, SafeCopy b, Ord a) => SafeCopy (Map a b) 
(SafeCopy a, SafeCopy b, SafeCopy c) => SafeCopy (a, b, c) 
(SafeCopy a, SafeCopy b, SafeCopy c, SafeCopy d) => SafeCopy (a, b, c, d) 
(SafeCopy a, SafeCopy b, SafeCopy c, SafeCopy d, SafeCopy e) => SafeCopy (a, b, c, d, e) 
(SafeCopy a, SafeCopy b, SafeCopy c, SafeCopy d, SafeCopy e, SafeCopy f) => SafeCopy (a, b, c, d, e, f) 
(SafeCopy a, SafeCopy b, SafeCopy c, SafeCopy d, SafeCopy e, SafeCopy f, SafeCopy g) => SafeCopy (a, b, c, d, e, f, g) 

newtype Prim a Source

Wrapper for data that was saved without a version tag.

Constructors

Prim 

Fields

getPrimitive :: a
 

Instances

class SafeCopy (MigrateFrom a) => Migrate a where Source

The central mechanism for dealing with version control.

This type class specifies what data migrations can happen and how they happen.

Associated Types

type MigrateFrom a Source

This is the type we're extending. Each type capable of migration can only extend one other type.

Methods

migrate :: MigrateFrom a -> a Source

This method specifies how to migrate from the older type to the newer one. It will never be necessary to use this function manually as it all taken care of internally in the library.

newtype Reverse a Source

This is a wrapper type used migrating backwards in the chain of compatible types.

Constructors

Reverse 

Fields

unReverse :: a
 

data Kind a Source

The kind of a data type determines how it is tagged (if at all).

Primitives kinds (see primitive) are not tagged with a version id and hence cannot be extended later.

Extensions (see extension) tells the system that there exists a previous version of the data type which should be migrated if needed.

There is also a default kind which is neither primitive nor is an extension of a previous type.

extension :: (SafeCopy a, Migrate a) => Kind a Source

The extension kind lets the system know that there is at least one previous version of this type. A given data type can only extend a single other data type. However, it is perfectly fine to build chains of extensions. The migrations between each step is handled automatically.

extended_extension :: (SafeCopy a, Migrate a, Migrate (Reverse a)) => Kind a Source

The extended_base kind lets the system know that there is at least one future version of this type.

extended_base :: Migrate (Reverse a) => Kind a Source

The extended_base kind lets the system know that there is at least one future version of this type.

base :: Kind a Source

The default kind. Does not extend any type.

data Contained a Source

To ensure that no-one reads or writes values without handling versions correct, it is necessary to restrict access to getCopy and putCopy. This is where Contained enters the picture. It allows you to put values in to a container but not to take them out again.

contain :: a -> Contained a Source

Place a value in an unbreakable container.

data Version a Source

A simple numeric version id.

Instances

Template haskell functions

deriveSafeCopy :: Version a -> Name -> Name -> Q [Dec] Source

Derive an instance of SafeCopy.

When serializing, we put a Word8 describing the constructor (if the data type has more than one constructor). For each type used in the constructor, we call getSafePut (which immediately serializes the version of the type). Then, for each field in the constructor, we use one of the put functions obtained in the last step.

For example, given the data type and the declaration below

data T0 b = T0 b Int
deriveSafeCopy 1 'base ''T0
   

we generate

instance (SafeCopy a, SafeCopy b) =>
         SafeCopy (T0 b) where
    putCopy (T0 arg1 arg2) = contain $ do put_b   <- getSafePut
                                          put_Int <- getSafePut
                                          put_b   arg1
                                          put_Int arg2
                                          return ()
    getCopy = contain $ do get_b   <- getSafeGet
                           get_Int <- getSafeGet
                           return T0 <*> get_b <*> get_Int
    version = 1
    kind = base
   

And, should we create another data type as a newer version of T0, such as

data T a b = C a a | D b Int
deriveSafeCopy 2 'extension ''T

instance SafeCopy b => Migrate (T a b) where
  type MigrateFrom (T a b) = T0 b
  migrate (T0 b i) = D b i
   

we generate

instance (SafeCopy a, SafeCopy b) =>
         SafeCopy (T a b) where
    putCopy (C arg1 arg2) = contain $ do putWord8 0
                                         put_a <- getSafePut
                                         put_a arg1
                                         put_a arg2
                                         return ()
    putCopy (D arg1 arg2) = contain $ do putWord8 1
                                         put_b   <- getSafePut
                                         put_Int <- getSafePut
                                         put_b   arg1
                                         put_Int arg2
                                         return ()
    getCopy = contain $ do tag <- getWord8
                           case tag of
                             0 -> do get_a <- getSafeGet
                                     return C <*> get_a <*> get_a
                             1 -> do get_b   <- getSafeGet
                                     get_Int <- getSafeGet
                                     return D <*> get_b <*> get_Int
                             _ -> fail $ "Could not identify tag \"" ++
                                         show tag ++ "\" for type Main.T " ++
                                         "that has only 2 constructors.  " ++
                                         "Maybe your data is corrupted?"
    version = 2
    kind = extension
   

Note that by using getSafePut, we saved 4 bytes in the case of the C constructor. For D and T0, we didn't save anything. The instance derived by this function always use at most the same space as those generated by deriveSafeCopySimple, but never more (as we don't call 'getSafePut'/'getSafeGet' for types that aren't needed).

Note that you may use deriveSafeCopySimple with one version of your data type and deriveSafeCopy in another version without any problems.

deriveSafeCopySimple :: Version a -> Name -> Name -> Q [Dec] Source

Derive an instance of SafeCopy. The instance derived by this function is simpler than the one derived by deriveSafeCopy in that we always use safePut and safeGet (instead of getSafePut and getSafeGet).

When serializing, we put a Word8 describing the constructor (if the data type has more than one constructor) and, for each field of the constructor, we use safePut.

For example, given the data type and the declaration below

data T a b = C a a | D b Int
deriveSafeCopySimple 1 'base ''T
   

we generate

instance (SafeCopy a, SafeCopy b) =>
         SafeCopy (T a b) where
    putCopy (C arg1 arg2) = contain $ do putWord8 0
                                         safePut arg1
                                         safePut arg2
                                         return ()
    putCopy (D arg1 arg2) = contain $ do putWord8 1
                                         safePut arg1
                                         safePut arg2
                                         return ()
    getCopy = contain $ do tag <- getWord8
                           case tag of
                             0 -> do return C <*> safeGet <*> safeGet
                             1 -> do return D <*> safeGet <*> safeGet
                             _ -> fail $ "Could not identify tag \"" ++
                                         show tag ++ "\" for type Main.T " ++
                                         "that has only 2 constructors.  " ++
                                         "Maybe your data is corrupted?"
    version = 1
    kind = base
   

Using this simpler instance means that you may spend more bytes when serializing data. On the other hand, it is more straightforward and may match any other format you used in the past.

Note that you may use deriveSafeCopy with one version of your data type and deriveSafeCopySimple in another version without any problems.

deriveSafeCopyHappstackData :: Version a -> Name -> Name -> Q [Dec] Source

Derive an instance of SafeCopy. The instance derived by this function should be compatible with the instance derived by the module Happstack.Data.SerializeTH of the happstack-data package. The instances use only safePut and safeGet (as do the instances created by deriveSafeCopySimple), but we also always write a Word8 tag, even if the data type isn't a sum type.

For example, given the data type and the declaration below

data T0 b = T0 b Int
deriveSafeCopy 1 'base ''T0
   

we generate

instance (SafeCopy a, SafeCopy b) =>
         SafeCopy (T0 b) where
    putCopy (T0 arg1 arg2) = contain $ do putWord8 0
                                          safePut arg1
                                          safePut arg2
                                          return ()
    getCopy = contain $ do tag <- getWord8
                           case tag of
                             0 -> do return T0 <*> safeGet <*> safeGet
                             _ -> fail $ "Could not identify tag \"" ++
                                         show tag ++ "\" for type Main.T0 " ++
                                         "that has only 1 constructors.  " ++
                                         "Maybe your data is corrupted?"
    version = 1
    kind = base
   

This instance always consumes at least the same space as deriveSafeCopy or deriveSafeCopySimple, but may use more because of the useless tag. So we recomend using it only if you really need to read a previous version in this format, and not for newer versions.

Note that you may use deriveSafeCopy with one version of your data type and deriveSafeCopyHappstackData in another version without any problems.

Rarely used functions

getSafeGet :: forall a. SafeCopy a => Get (Get a) Source

Parse a version tag and return the corresponding migrated parser. This is useful when you can prove that multiple values have the same version. See getSafePut.

getSafePut :: forall a. SafeCopy a => PutM (a -> Put) Source

Serialize the version tag and return the associated putter. This is useful when serializing multiple values with the same version. See getSafeGet.

primitive :: Kind a Source

Primitive kinds aren't version tagged. This kind is used for small or built-in types that won't change such as Int or Bool.