New approach to dialects, more rule driven, so that all the options
for a dialect can appear here.

It would be nice to make the dialects completely rule driven, but I
think this is much less maintainable than making them mainly rule
driven (primarily via the catalogs), but with some special cases
scattered in the source code.

The dialect contains:
the name of the dialect
some parsing options about what syntax is supported
some options about typechecking support

flags to control the details of the dialect (for instance, some
dialects have additional options to specify which kinds of string
literal escapes are valid, so this is like a subdialect)

stuff about types:

canonical names of types with multiple names (these are only the
built in types, user/catalog driven type aliases are not covered here)
the built in text types
the built in datetime types

default catalog for this dialect


the dialects supplied with hssqlppp are:
base dialect with minimal stuff in it
ansi2011 dialect
recent-ish postgresql dialect
recent-ish sql server dialect
recent-ish oracle dialect

the idea is that if you have one of these dialects, you can start here,
then add your own catalog entries and use it as is. If your dialect is
not here, and it is similar enough to an existing dialect, you can
take that dialect and
    a) modify some of the dialect options
    b) modify the default catalog
    and you will get something useful
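
As a rough sketch of steps a) and b), deriving a dialect from an
existing one can be done with ordinary record update. The types below
are simplified stand-ins, not the real ones: the Catalog here is
assumed to be just a list of function names, and postgresLike and
myDialect are hypothetical names used only for illustration.

```haskell
-- Simplified stand-ins for illustration only; the real Dialect and
-- Catalog types have more fields and structure.
data SyntaxFlavour = Ansi | Postgres | SqlServer | Oracle
                     deriving (Eq,Show)

-- assumption: a catalog reduced to a list of function names
type Catalog = [String]

data Dialect = Dialect
    {diName :: String
    ,diSyntaxFlavour :: SyntaxFlavour
    ,diDefaultCatalog :: Catalog
    } deriving (Eq,Show)

-- hypothetical starting point standing in for a supplied dialect
postgresLike :: Dialect
postgresLike = Dialect
    {diName = "postgres"
    ,diSyntaxFlavour = Postgres
    ,diDefaultCatalog = ["substring","extract"]}

-- a) modify some of the dialect options
-- b) modify the default catalog
myDialect :: Dialect
myDialect = postgresLike
    {diName = "my-postgres-variant"
    ,diDefaultCatalog = "my_udf" : diDefaultCatalog postgresLike}
```

Fields that are not mentioned in the update (here diSyntaxFlavour) are
carried over unchanged from the dialect you started with.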

if the dialect is too different, then you will have to edit the
hssqlppp source.

> {-# LANGUAGE DeriveDataTypeable #-}
> module Database.HsSqlPpp.Internals.Dialect
>     (Dialect(..)
>     ,SyntaxFlavour(..)
>     ,canonicalizeTypeName
>     ,ansiTypeNameToDialect) where

> import Database.HsSqlPpp.Internals.Catalog.CatalogTypes
> import Data.Data
> import Data.Text (Text)
> import qualified Data.Text as T
> import Data.List (find)
> import Data.Char (toLower)

> data Dialect = Dialect
>      {diName :: String

represent the syntax variations with a crude enum. Later, can make
this more rule driven.

>      ,diSyntaxFlavour :: SyntaxFlavour

map from alternative names to the canonical name of built in
types. This is used e.g. because in ansi the canonical name of boolean
type is 'boolean', and in postgresql the canonical name of this type
is 'bool'.
These should all be in lower case
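
A small sketch of how such a map can be consulted, assuming some
made-up entries for a postgresql-like dialect. The name
pgLikeCanonicalNames and its contents are hypothetical; canonicalize
mirrors the lookup done by canonicalizeTypeName further down, but takes
the map directly instead of a Dialect.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Char (toLower)
import Data.List (find)
import Data.Text (Text)
import qualified Data.Text as T

-- hypothetical entries: (canonical name, alternative names),
-- all in lower case
pgLikeCanonicalNames :: [(Text,[Text])]
pgLikeCanonicalNames =
    [("bool",["boolean"])
    ,("int4",["int","integer"])]

-- lower-case the supplied name, look for it in the alternative
-- names, and return the matching canonical name; if nothing
-- matches, return the supplied name unchanged
canonicalize :: [(Text,[Text])] -> Text -> Text
canonicalize m tn =
    maybe tn fst $ find (\(_,alts) -> T.map toLower tn `elem` alts) m
```

With these entries, "BOOLEAN" maps to "bool" and "integer" to "int4".
A name which is not in any list of alternatives comes back exactly as
it was supplied, including its case, so "Bool" stays "Bool".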

>      ,diCanonicalTypeNames :: [(Text,[Text])]

the names of the built in text types. This is used to help type check
built in functions like substring?

>      ,diTextTypes :: [Text] -- names of the text types (canonical names must be used)

used to typecheck things like extract

todo: create a single function which takes the ansi name of a type and
returns the dialect specific name (as a maybe) - then don't have to
have a huge number of functions here.

Also, these functions should be -TypeName not -Type.

>      ,diDatetimeTypes :: [Text]
>      ,diNumberTypes :: [Text] -- names of the number types (canonical names must be used)
>      -- namesForAnsiTypes (below) is a map from the canonical ansi
>      -- name (in hssqlppp) to the canonical name in the dialect
>      -- if there is no entry, then it means that type isn't
>      -- supported in this dialect

todo: make a list of exactly what type names are needed in the type
checker and why. This should only be used for the type checker
internally, and not anywhere else. Should do the same for the
other fields above
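
For the namesForAnsiTypes field declared just below, a minimal sketch
of the intended lookup, with made-up entries. The name
pgLikeNamesForAnsiTypes and its contents are hypothetical;
ansiToDialect does the same lookup as ansiTypeNameToDialect, but on a
bare list rather than a Dialect.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)

-- hypothetical entries: (canonical ansi name, canonical dialect name)
pgLikeNamesForAnsiTypes :: [(Text,Text)]
pgLikeNamesForAnsiTypes =
    [("boolean","bool")
    ,("integer","int4")]

-- Nothing means the ansi type has no entry, that is, the type is
-- not supported in this dialect
ansiToDialect :: [(Text,Text)] -> Text -> Maybe Text
ansiToDialect m n = lookup n m
```

Here "boolean" gives Just "bool", and a type with no entry, such as
"nclob" in this made-up list, gives Nothing.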

>      ,namesForAnsiTypes :: [(Text,Text)]

A small issue with having the default catalog like this is that we can
make a programming error where we have a function which takes the
dialect and a catalog, and we use this default catalog instead of the
supplied updated catalog.

>      ,diDefaultCatalog :: Catalog
>      } deriving (Eq,Show,Data,Typeable)

> data SyntaxFlavour = Ansi | Postgres | SqlServer | Oracle
>                      deriving (Eq,Show,Data,Typeable)

> ansiTypeNameToDialect :: Dialect -> Text -> Maybe Text
> ansiTypeNameToDialect d n = lookup n (namesForAnsiTypes d)

> canonicalizeTypeName :: Dialect -> Text -> Text
> canonicalizeTypeName d s =
>     let m = diCanonicalTypeNames d
>     in ct m s
>   where
>     hasType t p = let t' = T.map toLower t
>                   in t' `elem` snd p
>     ct m tn = maybe tn fst
>               $ find (hasType tn) m

TODO:
modify the catalog:
  when adding a type, add flags to say:
    if this type is a text or datetime or number type
    if it is built in
    if it is undroppable
    if it has a list of builtin aliases
    what the ansi equivalent type name is

this will get rid of most of the fields in the dialect and make it
much easier to keep everything consistent and maintainable
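
One possible shape for this, purely as a sketch: every name below
(TypeEntry, TypeCategory and the te- fields) is invented for
illustration and is not part of the current catalog code. It shows how
the per-dialect lists could be derived from flagged catalog entries
instead of being maintained separately.

```haskell
-- All names here are hypothetical, for illustration only.
data TypeCategory = TextType | DatetimeType | NumberType | OtherType
                    deriving (Eq,Show)

data TypeEntry = TypeEntry
    {teName :: String               -- canonical name in the dialect
    ,teCategory :: TypeCategory     -- text, datetime, number or other
    ,teBuiltIn :: Bool
    ,teUndroppable :: Bool
    ,teAliases :: [String]          -- built in alternative names
    ,teAnsiName :: Maybe String     -- the ansi equivalent, if any
    } deriving (Eq,Show)

-- would replace a hand-maintained list such as diTextTypes
textTypes :: [TypeEntry] -> [String]
textTypes es = [teName e | e <- es, teCategory e == TextType]

-- would replace a hand-maintained map such as diCanonicalTypeNames
canonicalNames :: [TypeEntry] -> [(String,[String])]
canonicalNames es =
    [(teName e, teAliases e) | e <- es, not (null (teAliases e))]

-- would replace a hand-maintained map such as namesForAnsiTypes
ansiNames :: [TypeEntry] -> [(String,String)]
ansiNames es = [(a, teName e) | e <- es, Just a <- [teAnsiName e]]
```

Because each list is computed from the same entries, adding a type in
one place keeps the text type list, the alias map and the ansi name map
consistent with each other.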

start adding flags for which bits of syntax to support instead of
using the syntax flavour thing
there will be flags like this for typechecking also

support functions to help create minimal dialects
  e.g. take ansi, and safely remove a bunch of types and functions

write the default catalogs and dialects all in one file
  (they are split into two at the moment)

export the ansi dialect from every module that exposes something with
a dialect (function or data type) - parse, pretty, lex, typecheck, etc.

then you only need to import the dialects module to get the other
dialects

consider how the catalog in the dialect can help with parsing
operators

minimal dialects:
mainly missing types then following through on implications:
no text types
only one text type covering char,varchar,nclob, etc.
no numeric or decimal
one decimal type for all integers and precise decimals
no date or time types


Types:
typeextra: needs fixing
how to represent e.g. 'varchar' without precision

invent some sort of concrete syntax (which is parseable) to
represent implicit casts (cast implicit x as varchar)?