Safe Haskell | None |
---|---|
Language | Haskell2010 |
Data types and functions to discover sequences of DDL commands to go from one database state to another. Used for migration generation.
For our purposes, a database state is fully specified by the set of predicates that apply to that database.
Migration generation is approached as a graph search problem over the
infinite graph of databases G. The nodes of G are database states, which
(as said above) are simply sets of predicates (see DatabaseState
for the
realization of this concept in code). For two vertices S1 and S2 in G,
there is an edge between the two if and only if there is a DDL command that
can take a database at S1 to S2.
We generate migrations by exploring this graph, starting at the source state and ending at the destination state. By default we use an optimizing solver that weights each edge by the complexity of the particular command, and we attempt to find the shortest path using Dijkstra's algorithm, although a user may override this behavior and provide a custom edge selection mechanism (or even defer this choice to the user).
In order to conduct the breadth-first search, we must know which edges lead
out of whichever vertex we're currently visiting. The solving algorithm thus
takes a set of ActionProvider
s, which are means of discovering edges that
are incident to the current database state.
Conceptually, an ActionProvider
is a function of type ActionProviderFn
,
which takes the current database state and produces a list of edges in the
form of PotentialAction
objects. For optimization purposes,
ActionProvider
s also take in the desired destination state, which it can
use to select only edges that make sense. This does not affect the result,
just the amount of time it may take to get there.
Note that because the graph of database states is infinite, a breadth-first search may easily end up continuing to explore when there is no chance of reaching our goal. This would result in non-termination and is highly undesirable. In order to prevent this, we limit ourselves to only exploring edges that take us closer to the destination state. Here, we measure distance between two states as the number of elements in the symmetric difference of two database states. Thus, every action we take must either remove a predicate that doesn't exist in the destination state, or add a predicate that does. If a potential action only adds predicates that do not exist in the final state or removes predicates that do not exist in the first, then we never explore that edge.
A note on speed
There are some issues with this approach. Namely, if there is no solution, we
can end up exploring the entire action space, which may be quite a lot. While
beam-migrate
can solve all databases that can be made up of predicates in
this module, other beam backends may not make such strict guarantees
(although in practice, all do). Nevertheless, if you're hacking on this
module and notice what seems like an infinite loop, you may have accidentally
removed code that exposed the edge that leads to a solution to the migration.
- data DatabaseStateSource
- data DatabaseState cmd = DatabaseState {}
- data PotentialAction cmd = PotentialAction {
- actionPreConditions :: !(HashSet SomeDatabasePredicate)
- actionPostConditions :: !(HashSet SomeDatabasePredicate)
- actionCommands :: !(Seq cmd)
- actionEnglish :: !Text
- actionScore :: !Int
- newtype ActionProvider cmd = ActionProvider {}
- type ActionProviderFn cmd = (forall preCondition. Typeable preCondition => [preCondition]) -> (forall postCondition. Typeable postCondition => [postCondition]) -> [PotentialAction cmd]
- ensuringNot_ :: Alternative m => [a] -> m ()
- justOne_ :: [a] -> [a]
- createTableActionProvider :: forall cmd. (Sql92SaneDdlCommandSyntax cmd, Sql92SerializableDataTypeSyntax (Sql92DdlCommandDataTypeSyntax cmd)) => ActionProvider cmd
- dropTableActionProvider :: forall cmd. (Sql92SaneDdlCommandSyntax cmd, Sql92SerializableDataTypeSyntax (Sql92DdlCommandDataTypeSyntax cmd)) => ActionProvider cmd
- addColumnProvider :: forall cmd. (Sql92SaneDdlCommandSyntax cmd, Sql92SerializableDataTypeSyntax (Sql92DdlCommandDataTypeSyntax cmd)) => ActionProvider cmd
- addColumnNullProvider :: forall cmd. Sql92SaneDdlCommandSyntax cmd => ActionProvider cmd
- dropColumnNullProvider :: forall cmd. Sql92SaneDdlCommandSyntax cmd => ActionProvider cmd
- defaultActionProvider :: (Sql92SaneDdlCommandSyntax cmd, Sql92SerializableDataTypeSyntax (Sql92DdlCommandDataTypeSyntax cmd)) => ActionProvider cmd
- data Solver cmd where
- ProvideSolution :: [cmd] -> Solver cmd
- SearchFailed :: [DatabaseState cmd] -> Solver cmd
- ChooseActions :: {..} -> Solver cmd
- data FinalSolution cmd
- = Solved [cmd]
- | Candidates [DatabaseState cmd]
- finalSolution :: Solver cmd -> FinalSolution cmd
- heuristicSolver :: ActionProvider cmd -> [SomeDatabasePredicate] -> [SomeDatabasePredicate] -> Solver cmd
Database state
data DatabaseStateSource Source #
Used to indicate whether a particular predicate is from the initial database state, or due to a sequence of actions we've committed too. Used to prevent runaway action generation based off of derived states.
DatabaseStateSourceOriginal | Predicate is from the original set given by the user |
DatabaseStateSourceDerived | Predicate is from an action we've committed to in this action chain |
data DatabaseState cmd Source #
Represents the state of a database as a migration is being generated
DatabaseState | |
|
Show cmd => Show (DatabaseState cmd) Source # | |
NFData (DatabaseState cmd) Source # | |
Action generation
data PotentialAction cmd Source #
Represents an edge (or a path) in the database graph.
Given a particular starting point, the destination database is the database
where each predicate in actionPreConditions
has been removed and each
predicate in actionPostConditions
has been added.
PotentialAction | |
|
Monoid (PotentialAction cmd) Source # |
|
newtype ActionProvider cmd Source #
Edge discovery mechanism. A newtype wrapper over ActionProviderFn
.
An ActionProviderFn
takes two arguments. The first is the set of predicates
that exist in the current database.
The function should a set of edges from the database specified in the first argument to possible destination databases. For optimization purposes, the second argument is the set of predicates that ought to exist in the destination database. This can be used to eliminate edges that will not lead to a solution.
This second argument is just an optimization and doesn't change the final result, although it can significantly impact the time it takes to get there.
Both the current database set and the destination database set are given as polymorphic lists of predicates. When you instantiate the type, the current database predicate set is queried for predicates of that type.
For example, dropTableActionProvider
provides a DROP TABLE
action edge
whenever it encounters a table that exists. In order to do this, it attempts
to find all TableExistsPredicate
that do not exist in the destination
database. Its ActionProviderFn
may be implemented like such:
dropTableActionProvider preConditions postConditions = do TableExistsPredicate srcTblNm <- preConditions ensuringNot_ $ $ do TableExistsPredicate destTblNm <- postConditions guard (srcTblNm == destTblNm)
ensuringNot_
is a function that causes the action provider to return no
results if there are any elements in the provided list. In this case, it's
used to stop DROP TABLE
action generation for tables which must be present
in the final database.
Monoid (ActionProvider cmd) Source # | |
type ActionProviderFn cmd = (forall preCondition. Typeable preCondition => [preCondition]) -> (forall postCondition. Typeable postCondition => [postCondition]) -> [PotentialAction cmd] Source #
See ActionProvider
ensuringNot_ :: Alternative m => [a] -> m () Source #
Proceeds only if no predicate matches the given pattern. See the
implementation of dropTableActionProvider
for an example of usage.
justOne_ :: [a] -> [a] Source #
Used to ensure that only one predicate matches the given pattern. See the
implementation of createTableActionProvider
for an example of usage.
createTableActionProvider :: forall cmd. (Sql92SaneDdlCommandSyntax cmd, Sql92SerializableDataTypeSyntax (Sql92DdlCommandDataTypeSyntax cmd)) => ActionProvider cmd Source #
Action provider for SQL92 CREATE TABLE
actions.
dropTableActionProvider :: forall cmd. (Sql92SaneDdlCommandSyntax cmd, Sql92SerializableDataTypeSyntax (Sql92DdlCommandDataTypeSyntax cmd)) => ActionProvider cmd Source #
Action provider for SQL92 DROP TABLE
actions
addColumnProvider :: forall cmd. (Sql92SaneDdlCommandSyntax cmd, Sql92SerializableDataTypeSyntax (Sql92DdlCommandDataTypeSyntax cmd)) => ActionProvider cmd Source #
Action provider for SQL92 ALTER TABLE ... ADD COLUMN ...
actions
addColumnNullProvider :: forall cmd. Sql92SaneDdlCommandSyntax cmd => ActionProvider cmd Source #
Action provider for SQL92 ALTER TABLE ... ALTER COLUMN ... SET NULL
dropColumnNullProvider :: forall cmd. Sql92SaneDdlCommandSyntax cmd => ActionProvider cmd Source #
Action provider for SQL92 ALTER TABLE ... ALTER COLUMN ... SET NOT NULL
defaultActionProvider :: (Sql92SaneDdlCommandSyntax cmd, Sql92SerializableDataTypeSyntax (Sql92DdlCommandDataTypeSyntax cmd)) => ActionProvider cmd Source #
Default action providers for any SQL92 compliant syntax.
In particular, this provides edges consisting of the following statements:
- CREATE TABLE
- DROP TABLE
- ALTER TABLE ... ADD COLUMN ...
- ALTER TABLE ... DROP COLUMN ...
- ALTER TABLE ... ALTER COLUMN ... SET [NOT] NULL
Solver
data Solver cmd where Source #
Represents current state of a database graph search.
If ProvideSolution
, the destination database has been reached, and the
given list of commands provides the path from the source database to the
destination.
If SearchFailed
, the search has failed. The provided DatabaseState
s
represent the closest we could make it to the destination database. By
default, only the best 10 are kept around (to avoid unbounded memory growth).
If ChooseActions
, we are still searching. The caller is provided with the
current state as well as a list of actions, provided as an opaque type f
.
The getPotentialActionChoice
function can be used to get the
PotentialAction
corresponding to any given f
. The caller is free to cull
the set of potential actions according however they'd like (for example, by
prompting the user). The selected actions to explore should be passed to the
continueSearch
function.
Use of the f
existential type may seem obtuse, but it prevents the caller
from injecting arbitrary actions. Instead the caller is limited to choosing
only valid actions as provided by the suppled ActionProvider
.
ProvideSolution :: [cmd] -> Solver cmd | |
SearchFailed :: [DatabaseState cmd] -> Solver cmd | |
ChooseActions :: {..} -> Solver cmd | |
|
data FinalSolution cmd Source #
Represents the final results of a search
Solved [cmd] | The search found a path from the source to the destination database, and has provided a set of commands that would work |
Candidates [DatabaseState cmd] | The search failed, but provided a set of |
Show cmd => Show (FinalSolution cmd) Source # | |
finalSolution :: Solver cmd -> FinalSolution cmd Source #
An exhaustive solving strategy that simply continues the search, while exploring every possible action. If there is a solution, this will find it.
:: ActionProvider cmd | Edge discovery function |
-> [SomeDatabasePredicate] | Source database state |
-> [SomeDatabasePredicate] | Destination database state |
-> Solver cmd |
Conduct a breadth-first search of the database graph to find a path from
the source database to the destination database, using the given
ActionProvider
to discovere "edges" (i.e., DDL commands) between the
databases.
See the documentation on Solver
for more information on how to consume the
result.