This file contains the functions for resolving types and
function/operator resolution (which is seriously crazy). See the pg
manual chapter 10:
http://www.postgresql.org/docs/8.4/interactive/typeconv.html
sql server todo: want to match sql server implicit cast rules better
when doing tsql type checking. Follows a completely different approach, possible info here:
http://msdn.microsoft.com/en-us/library/ms187928.aspx
http://msdn.microsoft.com/en-us/library/ms190309.aspx
linked from here:
http://blogs.msdn.com/b/craigfr/archive/2010/01/20/more-on-implicit-conversions.aspx
TODO: rewrite
Wrappers to zap:
tcAppLike
matchApp
matchAppExtra
resolveResultSetType
resolveResultSetTypeExtra
findcallmatch
resolveresultsettype
checkassignments
We should have one function for each of the resolvers (3 just like
postgres) all the nullability, precision, dialect and special hacks
should all be in one place.
> {-# LANGUAGE OverloadedStrings, TupleSections, MultiWayIf,FlexibleInstances,ScopedTypeVariables #-}
> module Database.HsSqlPpp.Internals.TypeChecking.TypeConversion.TypeConversion
> (matchApp
> ,matchAppExtra
> ,tcAppLike
> ,resolveResultSetType
> ,resolveResultSetTypeExtra
> ,checkAssignmentValid
> ,MatchAppLiteralList
> ) where
>
> import Data.Maybe
> import Data.List
> import Data.Either
> import Data.Char
>
> import Database.HsSqlPpp.Internals.TypesInternal
>
> import Database.HsSqlPpp.Internals.Catalog.CatalogTypes
> import Database.HsSqlPpp.Internals.Utils
> import Control.Monad
> import Control.Applicative
> import Control.Arrow
> import Database.HsSqlPpp.Internals.TypeChecking.TypeConversion.OldTypeConversion
> import Database.HsSqlPpp.Internals.Dialect
> import qualified Database.HsSqlPpp.Internals.TypeChecking.TypeConversion.SqlTypeConversion as TSQL
> import Data.Text ()
> import qualified Data.Text as T
> import Text.Printf
>
******************************************************************
matchApp: takes the function name and argument types, and returns the
matching operator/function
This needs a lot more tests
> type MatchAppLiteralList = [Maybe Int]
> matchApp :: Dialect
> -> Catalog
> -> [NameComponent]
> -> [Type]
> -> Either [TypeError] ([Type],Type)
> matchApp d cat nmcs =
> ambiguityResolver $ matchApp' d nmcs
> where
>
>
>
>
>
>
>
> matchApp' (Dialect {diSyntaxFlavour = SqlServer})
> [Nmc dd] [_
> ,ScalarType "date"
> ,ScalarType "date"]
> | map toLower dd == "datediff" =
>
>
>
> Right ([ScalarType "int4",ScalarType "date",ScalarType "date"], ScalarType "int4")
> matchApp' (Dialect {diSyntaxFlavour = SqlServer}) [Nmc dd] [_,ScalarType "date"]
> | map toLower dd == "datepart" =
> Right ([ScalarType "int4",ScalarType "date"], ScalarType "int4")
> matchApp' (Dialect {diSyntaxFlavour = SqlServer}) [Nmc dd] [_,ScalarType "timestamp"]
> | map toLower dd == "datepart" =
> Right ([ScalarType "int4",(ScalarType "timestamp")], ScalarType "int4")
> matchApp' (Dialect {diSyntaxFlavour = SqlServer}) [Nmc dd] [_,_,ScalarType "date"]
> | map toLower dd == "dateadd" =
> Right ([ScalarType "int4",ScalarType "int4",ScalarType "date"], ScalarType "date")
> matchApp' (Dialect {diSyntaxFlavour = SqlServer}) [Nmc dd] [_,_,ScalarType "timestamp"]
> | map toLower dd == "dateadd" =
> Right ([ScalarType "int4",ScalarType "int4",ScalarType "timestamp"], ScalarType "timestamp")
double hack: support oracle decode when in tsql mode:
> matchApp' (Dialect {diSyntaxFlavour = SqlServer}) [Nmc dd] as
> | map toLower dd == "decode" =
decode is just syntax for simple case statement:
demand at least 3 arguments
get the type of the first argument: this is the test target
> case as of
> (tt:as'@(_:_:_)) -> do
for each pair of arguments following: check the first
one can be compared to the test target
collect all the second types
if there is a single trailing argument this is the else
> let checkBranches [] acc = return $ reverse acc
> checkBranches [els] acc = return $ reverse (els:acc)
> checkBranches (w:t:xs) acc = do
> _ <- matchApp' d [Nmc "="] [tt,w]
> checkBranches xs (t:acc)
> sndTypes <- checkBranches as' []
check the seconds types + the else for type compatilibility
return this type
todo: add the implicit casting where needed
> (as,) <$> resolveResultSetType cat sndTypes
> _ -> Left [NoMatchingOperator (T.pack dd) as]
> matchApp' d' nmcs' pts = do
> (_,ps,r,_) <- case d' of
> (Dialect {diSyntaxFlavour = SqlServer}) -> TSQL.findCallMatch d' cat nm pts
> _ -> findCallMatch d' cat nm pts
> return (ps,r)
> where
> nm = case last nmcs' of
> Nmc n -> T.pack $ map toLower n
> QNmc n -> T.pack n
> AntiNameComponent _ ->
> error "tried to find function matching an antinamecomponent"
hack to support literal arguments to overloaded functions
currently, the problem arises only for date/datetime arguments
the solution reflects this
for long argument lists with several literals, there can be a lot of variants generated, but
this shouldn't slow down the execution because lazyness composes well with the functions used
> ambiguityResolver:: ([Type] -> Either [TypeError] ([Type],Type))
> -> [Type] -> Either [TypeError] ([Type],Type)
> ambiguityResolver f ts = let rs :: [Either [TypeError] ([Type], Type)]
> rs = map f $ expandList variants ts
>
>
> in case rights rs of
> x:_ -> Right x
> [] -> head rs
> where
> variants t = case t of
> UnknownType -> [t, ScalarType "timestamp"]
> _ -> [t]
>
>
> expandList:: (a -> [a]) -> [a] -> [[a]]
> expandList f' = foldr (liftM2 (:) . f') [[]]
uses matchApp for inferring basic types
> matchAppExtra :: Dialect
> -> Catalog
> -> [NameComponent]
> -> MatchAppLiteralList
> -> [TypeExtra]
> -> Either [TypeError] ([TypeExtra],TypeExtra)
> matchAppExtra dialect cat nmcs lits tes = do
> (ts',t') <- matchApp dialect cat nmcs $ map teType tes
> tes' <- joinArgsExtra appName tes $ zipWith addArgExtra tes ts'
> return (tes', addResultExtra appName tes' t' lits)
> where
> addArgExtra te t = te {teType = t}
> appName = case nmcs of
> [Nmc dd] -> map toLower dd
> _ -> ""
precision and nullability of the result
> addResultExtra:: String -> [TypeExtra] -> Type -> MatchAppLiteralList -> TypeExtra
> addResultExtra appName tes t lits = checkPrecisionRelevance . checkResultPrecisionClass tesr
> $ TypeExtra t jp js jn
> where
> jp = if
> | appName == "||" -> Just $ sum $ mapMaybe tePrecision tesr
> | appName == "substring" ->
>
>
>
> let totalLen = joinPrecision $ map tePrecision tesr
> in case (map teType tesr) of
> [ScalarType "nvarchar"] -> totalLen
> _ ->
> case lits of
> (Nothing:(Just startPos):(Just len):_) -> do
> totalLen' <- totalLen
> return $ if startPos + len > totalLen' then totalLen' - startPos else len
> _ -> totalLen
>
> | appName `elem` ["replace"]
> -> Nothing
> | otherwise
> -> joinPrecision $ map tePrecision tesr
> js = joinScale $ map teScale tesr
> jn = if
> | appName `elem`
> ( ["isnotnull","isdate","isnumeric"]
>
> ++ ["isnull" | length tes == 1]
>
> ++ ["count","count_big"])
> -> False
> | appName `elem`
> ( ["coalesce","greatest","least"]
>
>
>
> ++ ["isnull" | length tes == 2]
>
>
>
> ++ ["strpos","position"])
> -> all teNullable tesr
>
>
> | appName `elem` ["case","decode","nullif","replicate"]
> -> True
>
> | appName `elem` ["substr","substring"] -> any teNullable tes
> | otherwise -> joinNullability $ map teNullable tesr
>
> tesr = case appName of
> "decode" -> caseResultTypes tes
>
> _ | appName `elem` ["nullif","substr","substring","left","right","ltrim","rtrim","replicate","translate","like","notlike","rlike"]
> -> take 1 tes
>
> | appName `elem` ["charindex"]
> -> take 2 tes
>
> | appName `elem` ["datepart","datediff"]
> -> drop 1 tes
>
> | appName `elem` ["dateadd"]
> -> drop 2 tes
>
> | otherwise -> tes
>
> caseResultTypes tes' = caseResultTypes' (tail tes') []
> where
> caseResultTypes' [] acc = acc
> caseResultTypes' [els] acc = els:acc
> caseResultTypes' (_:t':xs) acc = caseResultTypes' xs (t':acc)
Bring relevant arguments of a function to common precision and nullability.
The meaning of "relevant" is complicated:
- the list of arguments is split into partitions;
- precision and nullability is partition-wise joined;
- the results are broadcast back to the arguments that constituted partitions.
The algorithm follows this outline.
Examples:
- for binary operators, like "=", both arguments constitute a single partition;
- for some functions, like "substr", each argument belongs to its own partition;
- arguments of "case" and "decode" form two partitions which, except for the analyzed value
and the 'else' argument, are intervened with one another.
Actually, partitions for precision and partitions for nullability can be different.
Example:
"||": both arguments must be brought to common nullability, but remain with same precision.
Additionaly:
Before splitting onto partitions, check each argument for:
- precision class of the original (before matchApp) argument;
- precision relevance.
After splitting onto partitions, check each precision partition:
- all arguments must have same precision class (return an error if they don't).
What the is this function doing? What is it for?
> joinArgsExtra:: String -> [TypeExtra] -> [TypeExtra] -> Either [TypeError] [TypeExtra]
> joinArgsExtra "!odbc-left" _t0 t1 = Right t1
> joinArgsExtra "!odbc-timestampdiff" _t0 t1 = Right t1
> joinArgsExtra "!odbc-timestampadd" _t0 t1 = Right t1
> joinArgsExtra appName tes0 tes1
> = liftM (uncurry $ zipWith3 combine tes) $ uncurry (liftM2 (,))
> $ (joinDim joinPrec partitionPrec &&& joinDim joinNull partitionNull) tes
> where
>
> tes = map checkPrecisionRelevance
> $ zipWith adjust tes0
> $ zipWith checkPrecisionClass tes0 tes1
> where
> adjust te0 te1 = te1{tePrecision = head $ adjustStringCastPrec (teType te1) [te0]}
>
> checkPartition ptes = if length (nub ppcs) > 1
> then Left [InternalError $ printf "implicit cast: arguments of '%s' that belong to same partition are of different precision classes: %s -> %s" appName (show ptes) (show ppcs)]
> else return ptes
> where
> ppcs = map (precisionClass . teType) ptes
>
> joinDim:: ([TypeExtra] -> [TypeExtra])
> -> ([TypeExtra] -> Either [TypeError]
> ([[TypeExtra]] -> [TypeExtra], [[TypeExtra]]))
> -> [TypeExtra] -> Either [TypeError] [TypeExtra]
> joinDim join' partitionArgs'
> = liftM (uncurry ($) . second (map join')) . partitionArgs'
>
> combine te tePrec teNull = te {
> tePrecision = tePrecision tePrec,
> teScale = teScale tePrec,
> teNullable = teNullable teNull
> }
>
> joinPrec tes' = map promote tes'
> where
> promote (TypeExtra t p s n) = TypeExtra t (p' `mplus` p) (s' `mplus` s) n
> p' = joinPrecision $ map tePrecision tes
> s' = joinScale $ map teScale tes
> joinNull tes' = map promote tes'
> where
> promote (TypeExtra t p s n) = TypeExtra t p s (n' || n)
> n' = joinNullability $ map teNullable tes
>
>
>
>
> partitionArgs:: [a] -> ([[a]] -> [a], [[a]])
> partitionArgs as = case () of
>
>
> _ | appName `elem`
> ( ["datepart","dateadd"]
> ++ ["substr","substring","left","right","ltrim","rtrim"]
> ++ ["replicate","like","notlike","rlike"]
> ++ ["strpos","position","replace"]
>
> ++ ["nullif"]
>
> ++ ["regexp_substr","regexp_count","regexp_instr"]
> ++ ["string_agg"]
> )
> -> (const as, [])
>
> | appName `elem` ["datediff"]
> -> (concat, pairToList $ splitAt 1 as)
> | appName `elem` ["decode"]
> -> let (ws,ts) = decomposeDecodeTail (tail as) ([],[])
> in (composeDecodePartitions, [head as :ws, ts])
>
> | otherwise -> (concat, [as])
> partitionPrec as = secondM (mapM checkPartition) $ case () of
>
> _ | appName `elem` ["||","concat","translate","charindex"]
> -> (const as, [])
>
> | appName `elem`
> ( ["coalesce","greatest","least"]
>
> ++ ["isnull" | length as == 2]
> )
> -> (concat, [as])
> | otherwise -> partitionArgs as
> partitionNull as = return $ case () of
> _ | appName `elem`
> ( ["coalesce","greatest","least"]
> ++ ["isnull" | length as == 2]
> )
> -> (const as, [])
> | appName `elem` ["charindex"]
> -> (concat, pairToList $ splitAt 2 as)
> | otherwise -> partitionArgs as
>
> pairToList (x,y) = [x,y]
> decomposeDecodeTail [] acc = (reverse***reverse) acc
> decomposeDecodeTail [els] acc = (reverse.(els:)***reverse) acc
> decomposeDecodeTail (w:t:xs) acc = decomposeDecodeTail xs $ ((w:)***(t:)) acc
> composeDecodePartitions [t:ts,ws] = t : concat (transpose [ts,ws])
>
> composeDecodePartitions xs = concat xs
This is a small library for checking whether inference of precision does make sense.
It is used both in inference of precision of arguments and result of a function.
It is theoretically possible that types belong to different precision classes,
but inference of precision still makes sense (consider, for instance,
conversion between string and decimal).
Such cases must be handled specially.
> data PrecisionClass = String | Number | FlexiblePrecisionClass
> deriving (Eq,Show)
>
> precisionClass:: Type -> Maybe PrecisionClass
> precisionClass (ScalarType tn)
> | tn `elem` ["text","varchar","nvarchar","char"] = Just String
> | tn `elem` ["int1","int2","int4","int8","float4","float8","numeric"] = Just Number
> | otherwise = Nothing
> precisionClass UnknownType = Just FlexiblePrecisionClass
> precisionClass _ = Nothing
Do original and new type have compatible precision classes?
Note: this function is not commutative.
> infix 4 .~>.
> (.~>.):: TypeExtra -> TypeExtra -> Bool
> t0 .~>. t = Just FlexiblePrecisionClass `elem` [pc0,pc] || pc0 == pc
> where
> [pc0,pc] = map (precisionClass . teType) [t0,t]
retreat to default when original and new type are incompatible
> checkPrecisionClass:: TypeExtra -> TypeExtra -> TypeExtra
> checkPrecisionClass t0 t = if t0 .~>. t then t else t{tePrecision = Nothing, teScale = Nothing}
check precision class of result against precision classes of arguments
> checkResultPrecisionClass:: [TypeExtra] -> TypeExtra -> TypeExtra
> checkResultPrecisionClass tes t
> = if and $ map (.~>. t) tes then t else t{tePrecision = Nothing, teScale = Nothing}
check whether precision/scale is relevant for a type (consider "round").
> checkPrecisionRelevance:: TypeExtra -> TypeExtra
> checkPrecisionRelevance te = if
> | Just String <- pc
> -> te{teScale = Nothing}
> | Just FlexiblePrecisionClass <- pc
> -> te
> | ScalarType tn <- t, tn == "numeric"
> -> te
> | otherwise
> -> te{tePrecision = Nothing, teScale = Nothing}
> where
> t = teType te
> pc = precisionClass t
The purpose of the tcapplike wrapper is to partially fix up the
keyword 'enums' in some of the ms style functions (like
datepart). This is a horrible hack and will be fixed.
> tcAppLike :: Dialect -> Catalog -> [NameComponent]
> -> [Maybe Int] -> [Maybe TypeExtra]
> -> Either [TypeError] ([TypeExtra],TypeExtra)
> tcAppLike d cat anm@[Nmc dd] lits [_,a0,a1]
> | map toLower dd == "datediff" = do
>
> tys <- mapM (maybe (Left []) Right) [a0,a1]
> typeInt <- ScalarType <$> maybe (Left []) Right
> (ansiTypeNameToDialect d "int")
> let
> (ats,rt) <- matchAppExtra d cat anm lits (mkTypeExtraNN typeInt : tys)
> return (ats,rt)
> tcAppLike d cat anm@[Nmc dd] lits [_,a0]
> | map toLower dd == "datepart" = do
> tys <- mapM (maybe (Left []) Right) [a0]
> typeInt <- ScalarType <$> maybe (Left []) Right
> (ansiTypeNameToDialect d "int")
> (ats,rt) <- matchAppExtra d cat anm lits (mkTypeExtraNN typeInt : tys)
> return (ats,rt)
> tcAppLike d cat anm@[Nmc dd] lits [_,a0,a1]
> | map toLower dd == "dateadd" = do
> tys <- mapM (maybe (Left []) Right) [a0,a1]
> typeInt <- ScalarType <$> maybe (Left []) Right
> (ansiTypeNameToDialect d "int")
> (ats,rt) <- matchAppExtra d cat anm lits (mkTypeExtraNN typeInt : tys)
> return (ats,rt)
>
>
> tcAppLike d cat anm@[Nmc dd] _ ts
> | map toLower dd == "!odbc-convert" = do
> (tys :: [TypeExtra]) <- mapM (maybe (Left []) Right) ts
>
> (ats,rt) <- matchAppExtra d cat anm [] tys
> return (ats,rt)
> tcAppLike d cat anm@[Nmc dd] _ [_,a0,a1]
> | map toLower dd `elem` ["!odbc-timestampadd","!odbc-timestampdiff"] = do
> tys <- mapM (maybe (Left []) Right) [a0,a1]
>
> typeInt <- ScalarType <$> maybe (Left []) Right
> (ansiTypeNameToDialect d "int")
> (ats,rt) <- matchAppExtra d cat anm [] (mkTypeExtraNN typeInt : tys)
> return (ats,rt)
> tcAppLike d cat anm lits teArgs = do
>
>
> tys <- mapM (maybe (Left []) Right) teArgs
> (ats,rt) <- matchAppExtra d cat anm lits tys
> return (ats,rt)
******************************************************************
findCallMatch :: Catalog -> String -> [Type] -> Either [TypeError] OperatorPrototype
findCallMatch cat fnName' argsType =
code interspersed with text cut and pasted from postgresql manual
10.3. Functions
Function Type Resolution
Select the functions to be considered from the pg_proc system
catalog. If a non-schema-qualified function name was used, the
functions considered are those with the matching name and argument
count that are visible in the current search path (see Section
5.7.3). If a qualified function name was given, only functions in the
specified schema are considered.
[HsSqlPpp doesn't support schema stuff yet, so just get a list of all
the functions with a matching name]
>
If the search path finds multiple functions of identical argument
types, only the one appearing earliest in the path is
considered. Functions of different argument types are considered on an
equal footing regardless of search path position.
If a function is declared with a VARIADIC array parameter, and the
call does not use the VARIADIC keyword, then the function is treated
as if the array parameter were replaced by one or more occurrences of
its element type, as needed to match the call. After such expansion
the function might have effective argument types identical to some
non-variadic function. In that case the function appearing earlier in
the search path is used, or if the two functions are in the same
schema, the non-variadic one is preferred.
Functions that have default values for parameters are considered to
match any call that omits zero or more of the defaultable parameter
positions. If more than one such function matches a call, the one
appearing earliest in the search path is used. If there are two or
more such functions in the same schema with identical parameter types
in the non-defaulted positions (which is possible if they have
different sets of defaultable parameters), the system will not be able
to determine which to prefer, and so an "ambiguous function call"
error will result if no better match to the call can be found.
Check for a function accepting exactly the input argument types. If
one exists (there can be only one exact match in the set of functions
considered), use it. (Cases involving unknown will never find a match
at this step.)
If no exact match is found, see if the function call appears to be a
special type conversion request. This happens if the function call has
just one argument and the function name is the same as the (internal)
name of some data type. Furthermore, the function argument must be
either an unknown-type literal, or a type that is binary-coercible to
the named data type, or a type that could be converted to the named
data type by applying that type's I/O functions (that is, the
conversion is either to or from one of the standard string
types). When these conditions are met, the function call is treated as
a form of CAST specification. [1]
Look for the best match.
Discard candidate functions for which the input types do not match and
cannot be converted (using an implicit conversion) to match. unknown
literals are assumed to be convertible to anything for this
purpose. If only one candidate remains, use it; else continue to the
next step.
Run through all candidates and keep those with the most exact matches
on input types. (Domains are considered the same as their base type
for this purpose.) Keep all candidates if none have exact matches. If
only one candidate remains, use it; else continue to the next step.
Run through all candidates and keep those that accept preferred types
(of the input data type's type category) at the most positions where
type conversion will be required. Keep all candidates if none accept
preferred types. If only one candidate remains, use it; else continue
to the next step.
If any input arguments are unknown, check the type categories accepted
at those argument positions by the remaining candidates. At each
position, select the string category if any candidate accepts that
category. (This bias towards string is appropriate since an
unknown-type literal looks like a string.) Otherwise, if all the
remaining candidates accept the same type category, select that
category; otherwise fail because the correct choice cannot be deduced
without more clues. Now discard candidates that do not accept the
selected type category. Furthermore, if any candidate accepts a
preferred type in that category, discard candidates that accept
non-preferred types for that argument.
If only one candidate remains, use it. If no candidate or more than
one candidate remains, then fail.
exact match
binop1unknownmatch
polymorphic matches
reachable
mostexactmatches
filteredforpreferred
unknownmatchesbycat
TODO: do a log monad, which can record the tests and then return this
process along with the resolved function
todo:
>
todo:
assignmentCheck