generic-persistence: Database persistence using generics

This is a package candidate release! Here you can preview how this package release will appear once published to the main package index (which can be accomplished via the 'maintain' link below). Please note that once a package has been published to the main package index it cannot be undone! Please consult the package uploading documentation for more information.

[maintain] [Publish]

Please see the README on GitHub at https://github.com/thma/generic-persistence#readme


[Skip to Readme]

Properties

Versions 0.2.0.0, 0.2.0.1, 0.3.0.0, 0.3.0.1, 0.4.0.0, 0.5.0, 0.6.0, 0.6.0, 0.7.0.0, 0.7.0.1
Change log None available
Dependencies base (>=4.7 && <5), convertible (>=1.1.1.1 && <1.2), generic-deriving (>=1.14.3 && <1.15), HDBC (>=2.4.0.4 && <2.5), raw-strings-qq (>=1.1 && <1.2), resource-pool (>=0.4.0.0 && <0.5), template-haskell (>=2.18.0.0 && <2.19) [details]
License BSD-3-Clause
Copyright 2023 Thomas Mahler
Author Thomas Mahler
Maintainer thma@apache.org
Category Database
Home page https://github.com/thma/generic-persistence#readme
Bug tracker https://github.com/thma/generic-persistence/issues
Source repo head: git clone https://github.com/thma/generic-persistence
Uploaded by thma at 2023-10-17T12:53:31Z

Modules

[Index] [Quick Jump]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees


Readme for generic-persistence-0.6.0

[back to package description]

GenericPersistence - A Haskell Persistence Layer using Generics

Actions Status codecov Available on Hackage

GP Logo

Table of Contents

Introduction

GenericPersistence is a small Haskell persistence layer for relational databases. The approach relies on GHC.Generics. The actual database access is provided by the HDBC library.

The functional goal of the persistence layer is to provide hassle-free RDBMS persistence for Haskell data types in Record notation (for simplicity I call these Entities).

It therefore provides means for inserting, updating, deleting and querying such entities into/from relational databases.

The main design goal is to minimize the boilerplate code required:

In an ideal world we would be able to take any POHO (Plain old Haskell Object) and persist it to any RDBMS without any additional effort.

Status

As of now there is full support for SQLite and PostgreSQL. Support for other databases will be implemented on demand.

new features in v0.6

new features in v0.5

Feature requests, feedback and pull requests are welcome!

Available on Hackage

https://hackage.haskell.org/package/generic-persistence

Add the following to your package.yaml file:

dependencies:
- generic-persistence

I would also recommend to add the setting language: GHC2021 to your package.yaml file:

language: GHC2021

This drastically reduces the amount of LANGUAGE extensions that need to be added to your source files.

Short demo

Here now follows a short demo that shows how the library looks and feels from the user's point of view.

{-# LANGUAGE DeriveAnyClass #-} -- allows automatic derivation from Entity type class

module Main (main) where

import           Database.GP          
import           Database.HDBC.Sqlite3 (connectSqlite3)
import           GHC.Generics

-- | An Entity data type with several fields, using record syntax.
data Person = Person
  { personID :: Int,
    name     :: String,
    age      :: Int,
    address  :: String
  }
  deriving (Generic, Entity, Show) -- deriving Entity allows us to use the GenericPersistence API


main :: IO ()
main = do
  -- connect to a database in auto commit mode
  conn <- connect AutoCommit <$> connectSqlite3 "sqlite.db"

  -- initialize Person table
  setupTableFor @Person SQLite conn

  alice <- insert conn Person {name = "Alice", age = 25, address = "Elmstreet 1"}
  print alice

  -- update a Person
  update conn alice {address = "Main Street 200"}

  -- select a Person by id
  -- The result type must be provided by the call site,
  -- as `selectById` has a polymorphic return type `IO (Maybe a)`.
  alice' <- selectById @Person conn (personID alice)
  print alice'

  -- select all Persons from a database. again, the result type must be provided.
  allPersons <- select @Person conn allEntries
  print allPersons

  -- select all Persons from a database, where age is smaller 30.
  allPersonsUnder30 <- select @Person conn (field "age" <. (30 :: Int))
  print allPersonsUnder30

  -- delete a Person from a database
  delete conn alice

  -- select all Persons from a database. Now it should be empty.
  allPersons' <- select @Person conn allEntries
  print allPersons'

  -- close connection
  disconnect conn

Deal with runtime exceptions or use total functions? Your choice!

GenericPersistence provides two different APIs for accessing the database:

Exceptions in the default API

The default API is the easiest to use, but you will have to do exception handling to catch runtime errors. To use it you'll have to import the Database.GP module:

import Database.GP 

These are the exceptions that can be thrown:

data PersistenceException =
    EntityNotFound String
  | DuplicateInsert String
  | DatabaseError String
  | NoUniqueKey String
  deriving (Show, Eq, Exception)

The EntityNotFound exception is thrown when you try to select an entity by its primary key, but no entity with the given primary key exists in the database.

The DuplicateInsert exception is thrown when you try to insert an entity into the database, but an entity with the same primary key already exists in the database.

The DatabaseError exception is thrown when the database backend returns an error.

The NoUniqueKey exception is thrown when you try to select an entity by its primary key, but multiple rows are returned by the database. This can happen if there is no primary key constraint defined on the underlying database column.

A real world example can be found in the Servant GP - UserServer module.

Total functions in the safe API

The safe API is a bit more verbose, but it does not throw exceptions. To use it you'll have to import the Database.GP.GenericPersistenceSafe module:

import Database.GP.GenericPersistenceSafe

This module provides the same function as Database.GP, but all functions return Either PersistenceException a instead of IO a or IO (Maybe a).

eitherExRes <- selectById conn "1" :: IO (Either PersistenceException Person)
case eitherExRes of
  Left (EntityNotFound _) -> print "Entity not found"
  Right person            -> print person

This may look a bit verbose, but in actual code this may work out better, as Either allows pattern matching and chaining of computations with the do notation.

A real world example can be found in the Servant GP - UserServerSafe module. The UserServerSafe module is a copy of the UserServer module, but it uses the safe API instead of the default API. As you can see, the code of UserServerSafe is actually a bit more compact than the code of UserServer. (In the default API, we have to deal with the special case of selectById returning Nothing.)

How it works

In order to store Haskell data types in a relational database, we need to define a mapping between Haskell types and database tables. This mapping is defined by the Entity type class. This type class comes with default implementations for all methods which define the standard behaviour. (The default implementations internally use GHC.Generics.)

This default mapping will work for many cases, but it can be customized by overriding the default implementations.

The Entity type class

The Entity type class specifies the following methods:

class (Generic a, HasConstructor (Rep a), HasSelectors (Rep a)) => Entity a where
  -- | Converts a database row to a value of type 'a'.
  fromRow :: Conn -> [SqlValue] -> IO a

  -- | Converts a value of type 'a' to a database row.
  toRow :: Conn -> a -> IO [SqlValue]

  -- | Returns the name of the primary key field for a type 'a'.
  idField :: String

  -- | Returns a list of tuples that map field names to column names for a type 'a'.
  fieldsToColumns :: [(String, String)]

  -- | Returns the name of the table for a type 'a'.
  tableName :: String

  -- | Returns True if the primary key field for a type 'a' is autoincremented by the database.
  autoIncrement :: Bool

Default Behaviour

idField, fieldsToColumns and tableName are used to define the mapping between Haskell types and database tables.

fromRow and toRow are used to convert between Haskell types and database rows.

The default implementations of fromRow and toRow expects that type a has a single constructor and a selector for each field. All fields are expected to have a 1:1 mapping to a column in the database table. Thus each field must have a type that can be converted to and from a SqlValue.

For example

toRow conn (Person {personID = 1234, name = "Alice", age = 27, address = "Elmstreet 1"}) 

will return

[SqlInt64 1234,SqlString "Alice",SqlInt64 27,SqlString "Elmstreet 1"]

And fromRow does the inverse:

fromRow conn [SqlInt64 1234,SqlString "Alice",SqlInt64 27,SqlString "Elmstreet 1"] :: IO Person

returns

Person {personID = 1234, name = "Alice", age = 27, address = "Elmstreet 1"}

The conversion functions toRow and fromRow both carry an additional Conn argument. This argument is not used by the default implementations, but it can be used to provide database access during the conversion process. We will cover this later.

Customizing the default behaviour

The default implementations of idField, fieldsToColumns, tableName, fromRow and toRow can be customized by overriding the default implementations. Overiding idField, fieldsToColumns and tableName will be required when your database tables do not follow the default naming conventions.

For example, if we have a database table BOOK_TBL with the following columns:

CREATE TABLE BOOK_TBL 
  ( bookId INTEGER PRIMARY KEY, 
    bookTitle TEXT, 
    bookAuthor TEXT, 
    bookYear INTEGER
  );

and we want to map this table to a Haskell data type Book:

data Book = Book
  { book_id :: Int,
    title   :: String,
    author  :: String,
    year    :: Int
  }
  deriving (Generic, Show)

Then we can customize the default implementations of idField, fieldsToColumns and tableName to achieve the desired mapping:

instance Entity Book where
  -- this is the primary key field of the Book data type (not following the default naming convention)
  idField = "book_id"

  -- this defines the mapping between the field names of the Book data type and the column names of the database table
  fieldsToColumns = [("book_id", "bookId"), ("title", "bookTitle"), ("author", "bookAuthor"), ("year", "bookYear")]

  -- this is the name of the database table
  tableName = "BOOK_TBL"

Overriding fromRow and toRow will be required when your database tables do not follow the default mapping conventions. We will see some examples in later sections.

Handling enumeration fields

Say we have a data type Book with an enumeration field of type BookCategory:

data Book = Book
  { bookID :: Int,
    title   :: String,
    author  :: String,
    year    :: Int,
    category :: BookCategory
  }
  deriving (Generic, Entity, Show)

data BookCategory = Fiction | Travel | Arts | Science | History | Biography | Other
  deriving (Generic, Show, Enum)

In this case everything works out of the box, because GenericPersistence provides Convertible instances for all Enum types. Convertible instances are used to convert between Haskell types and database types.

If you do not want to use Enum types for your enumeration fields, you have to implement Convertible instances manually:

data BookCategory = Fiction | Travel | Arts | Science | History | Biography | Other
  deriving (Generic, Show, Read)

instance Convertible BookCategory SqlValue where
  safeConvert = Right . toSql . show
  
instance Convertible SqlValue BookCategory where
  safeConvert = Right . read . fromSql  

Handling embedded Objects

Say we have a data type Article with a field of type Author:

data Article = Article
  { articleID :: Int,
    title     :: String,
    author    :: Author,
    year      :: Int
  }
  deriving (Generic, Show, Eq)

data Author = Author
  { authorID :: Int,
    name     :: String,
    address  :: String
  }
  deriving (Generic, Show, Eq)  

If we don't want to store the Author as a separate table, we can use the following approach to embed the Author into the Article table:

instance Entity Article where
  -- in the fields to column mapping we specify that all fields of the 
  -- Author type are also mapped to columns of the Article table:
  fieldsToColumns :: [(String, String)]
  fieldsToColumns = [("articleID", "articleID"),
                       ("title", "title"), 
                       ("authorID", "authorID"), 
                       ("authorName", "authorName"), 
                       ("authorAddress", "authorAddress"),
                       ("year", "year")
                    ]

  -- in fromRow we have to manually construct the Author object from the 
  -- respective columns of the Article table and insert it 
  -- into the Article object:
  fromRow _conn row = return $ Article (col 0) (col 1) author (col 5)
    where
      col i = fromSql (row !! i)
      author = Author (col 2) (col 3) (col 4)

  -- in toRow we have to manually extract the fields of the Author object
  -- and insert them into the respective columns of the Article table:
  toRow _conn a = return [toSql (articleID a), toSql (title a), toSql authID, toSql authorName, toSql authorAddress, toSql (year a)]
    where 
      authID = authorID (author a)
      authorName = name (author a)
      authorAddress = address (author a)

Handling 1:1 references

If we have the same data types as in the previous example, but we want to store the Author in a separate table, we can use the following approach:

data Article = Article
  { articleID :: Int,
    title     :: String,
    author    :: Author,
    year      :: Int
  }
  deriving (Generic, Show, Eq)

data Author = Author
  { authorID :: Int,
    name     :: String,
    address  :: String
  }
  deriving (Generic, Entity, Show, Eq)


instance Entity Article where
  fieldsToColumns :: [(String, String)]                      -- ommitting the author field,
  fieldsToColumns =                                          -- as this can not be mapped to a single column
    [ ("articleID", "articleID"),                            -- instead we invent a new column authorID         
      ("title", "title"),
      ("authorID", "authorID"),
      ("year", "year")
    ]

  fromRow :: Conn -> [SqlValue] -> IO Article
  fromRow conn row = do    
    authorById <- fromJust <$> selectById conn (row !! 2)  -- load author by foreign key
    return $ rawArticle {author = authorById}                -- add author to article
    where
      rawArticle = Article (col 0) (col 1)                   -- create article from row, 
                           (Author (col 2) "" "") (col 3)    -- using a dummy author
        where
          col i = fromSql (row !! i)

  toRow :: Conn -> Article -> IO [SqlValue]
  toRow conn a = do
    persist conn (author a)                                  -- persist author first
    return [toSql (articleID a), toSql (title a),            -- return row for article table where 
            toSql $ authorID (author a), toSql (year a)]     -- authorID is foreign key to author table 

Persisting the Authoras a side effect in toRow may sound like an interesting idea... This step is optional. But then the user has to make sure that the Author is persisted before the Article is persisted.

Handling 1:n references

Now let's change the previous example by having a list of Articles in the Author type:

data Author = Author
  { authorID :: Int,
    name     :: String,
    address  :: String,
    articles :: [Article]
  }
  deriving (Generic, Show, Eq)

data Article = Article
  { articleID :: Int,
    title     :: String,
    authorId  :: Int,
    year      :: Int
  }
  deriving (Generic, Entity, Show, Eq)

So now we have a 1:n relationship between Author and Article.

We can handle this situation by using the following instance declaration for Author:

instance Entity Author where
  fieldsToColumns :: [(String, String)]                   -- ommitting the articles field, 
  fieldsToColumns =                                       -- as this can not be mapped to a single column
    [ ("authorID", "authorID"),
      ("name", "name"),
      ("address", "address")
    ]

  fromRow :: Conn -> [SqlValue] -> IO Author
  fromRow conn row = do
    let authID = head row                                  -- authorID is the first column
    articlesBy <- select conn (field "authorId" =. authID) -- retrieve all articles by this author
    return rawAuthor {articles = articlesBy}               -- add the articles to the author
    where
      rawAuthor = Author (col 0) (col 1) (col 2) []        -- create the author from row (w/o articles)
      col i = fromSql (row !! i)                           -- helper function to convert SqlValue to Haskell type

  toRow :: Conn -> Author -> IO [SqlValue]
  toRow conn a = do
    mapM_ (persist conn) (articles a)                     -- persist all articles of this author (update or insert)
    return [toSql (authorID a),                           -- return the author as a list of SqlValues
            toSql (name a), toSql (address a)]

Persisting all articles of an author as a side effect during the conversion of the author to a row may seem special... You can ommit this step. But then you have to persist the articles manually before persisting the author.

Performing queries with the Query DSL

The library provides a simple DSL for performing SELECTqueries. The select function

select :: forall a. (Entity a) => Conn -> WhereClauseExpr -> IO [a]

This function retrieves all entities of type a that match some query criteria. The function takes an HDBC connection (wrapped in a Conn) and a WhereClauseExpr as parameters. The function returns a (possibly empty) list of all matching entities.

The WhereClauseExpr is constructed using a small set of functions and infix operators.

There are a set of infix operators (=.), (>.), (<.), (>=.), (<=.), (<>.), like, between, in', contains that define field comparisons:

field "name" =. "John"

field "age" >=. 18

field "age" `between` (18, 30)

field "name" `like` "J%"

field "name" `in'` ["John", "Jane"]

Then we have three function isNull, allEntries and byId that also define simple WHERE clauses:

isNull (field "name") -- matches all entries where the name field is NULL

byId 42               -- matches the entry where the primary key column has the value 42

allEntries            -- matches all entries of the table

It is also possible to apply SQL functions to fields:

lower = sqlFun "LOWER" -- define a function that applies the SQL function LOWER to a field

lower(field "name") =. "all lowercase"

These field-wise comparisons can be combined using the logical operators &&., ||. and not':

(field "name" `like` "J%") &&. (field "age" >=. 18)

(field "name" =. "John") ||. (field "name" =. "Jane")

not' (field "name" =. "John")

The select function will then use the WhereClauseExpr constructed from these operators and functions to generate a SQL query that retrieves all matching entities:


ageField :: Field
ageField = field "age"

thirtySomethings <- select @Person conn (ageField `between` (30, 39))

It is also possible to add ORDER BY and LIMIT clauses to the query:

sortedPersons <- select @Person conn (allEntries `orderBy` [(ageField,ASC), (nameField,DESC)])

limitedPersons <- select @Person conn (allEntries `limit` 25)

pageOfPersons <- select @Person conn (allEntries `limitOffset` (100,10))

You will find more examples in the test suite.

Integrating user defined queries

As we have seen in the previous section, the library provides two functions select and selectById to query the database for entities.

If you want to use more complex queries, you can integrate HDBC SQL queries by using the entitiesFromRows function as in the following example:

main :: IO ()
main = do
  -- connect to a database
  conn <- connect SQLite <$> connectSqlite3 ":memory:" 

  -- initialize Person table
  setupTableFor @Person conn

  let alice = Person 1 "Alice" 25 "123 Main St"
      bob = Person 2 "Bob" 30 "456 Elm St"
      charlie = Person 3 "Charlie" 35 "789 Pine St"
      dave = Person 4 "Dave" 40 "1011 Oak St"
      eve = Person 5 "Eve" 45 "1213 Maple St"
      frank = Person 6 "Frank" 50 "1415 Walnut St"
      people = [alice, bob, charlie, dave, eve, frank]
      
  -- insert all persons into the database
  insertMany conn people

  -- perform a custom query with HDBC
  stmt = "SELECT * FROM Person WHERE age >= ? ORDER BY age ASC"
  resultRows <- quickQuery conn stmt [toSql (40 :: Int)]

  -- convert the resulting rows into a list of Person objects
  fourtplussers <- entitiesFromRows @Person conn resultRows
  print fourtplussers

Of course this approach is not type safe. It is up to the user to make sure that the query returns the correct columns.

The Conn Connection Type

The Conn type is a wrapper around an IConnection obtained from an HDBC backend driver like HDBC-sqlite3 or hdbc-postgresql. It is used to pass the connection to the database to Generic-Persistence. All functions of the library that require a database connection take a Conn as an argument.

HDBC provides a very similar type called ConnectionWrapper. The main reason for such a wrapper type is to simplify the type signatures of the library functions.

In addition, the Conn type provides additional database related information that is not available in the ConnectionWrapper type. For example, the Conn type contains the name of the database driver that is used. This information can be used to generate the correct SQL statements for different database backends. Conn also carries a flag that indicates whether implicit commits should be used by the library. This flag is set to True by default. If you want to use explicit commits, you can set the flag to False by modifying the Conn value:

c <- connect SQLite <$> connectSqlite3 ":memory:"
let conn = c {implicitCommit = False}

Connection Pooling

The library provides a simple connection pool for managing database connections. This is a must in multi-threaded environments where multiple threads may need to access the database at the same time. A typical use case is a REST service that uses a database to store its data.

The connection Pool is implemented based on the resource-pool library. generic-persistence exposes a ConnectionPool type and two function createConnPool and withResource to create and use a connection pool.

The following example shows how to create a connection pool and how to use it to perform a database query:


sqlLitePool :: FilePath -> IO ConnectionPool
sqlLitePool sqlLiteFile = createConnPool SQLite sqlLiteFile connectSqlite3 10 100

main :: IO ()
main = do
  connPool <- sqlLitePool ":memory:" 
  let alice = Person 123456 "Alice" 25 "123 Main St"
  withResource connPool $ \conn -> do
    setupTableFor @Person conn
    insert conn alice
    allPersons <- select conn allEntries :: IO [Person]
    print allPersons

You'll find a more complete example in the servant-gp repo. There I have set up a sample REST service based on Servant that uses Generic-Persistence and a connection pool to manage the database connections.