Copyright | (c) Galois Inc 2014-2021 |
---|---|
License | BSD3 |
Maintainer | Rob Dockins <rdockins@galois.com> |
Stability | provisional |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
This module translates an LLVM Module
into a collection of Crucible
control-flow graphs, one per function. The tricky parts of this translation
are 1) mapping LLVM types onto Crucible types in a sensible way and 2)
translating the phi-instructions of LLVM's SSA form.
To handle the translation of phi-functions, we first perform a
pre-pass over all the LLVM basic blocks looking for phi-functions
and build a datastructure that tells us what assignments to make
when jumping from block l to block l'. We then emit instructions
to make these assignments in a separate Crucible basic block (see
definePhiBlock
). Thus, the translated CFG will generally have
more basic blocks than the original LLVM.
Points of note:
- Immediate (i.e., not in memory) structs and packed structs are translated the same.
- Undefined values generate special Crucible expressions (e.g., BVUndef) to represent arbitrary bitpatterns.
- The floating-point domain is interpreted by IsSymInterface as either the IEEE754 floating-point domain, the real domain, or a domain with bitvector values and uninterpreted operations.
Some notes on undefined/poison values: (outcome of discussions between JHx and RWD)
- Continue to add Crucible expressions for undefined values as required (e.g. for floating-point values). Crucible itself is currently treating undefined values as fresh symbolic inputs; it should instead invent a new category of "arbitrary" values that get passed down into the solvers in a way that is dependent on the task at hand. For example, in verification tasks, it is appropriate to treat the arbitrary values the same as symbolic inputs. However, for preimage-finding tasks, the arbitrary values must be treated as universally-quantified, unlike the symbolic inputs which are existentially-quantified.
- For poison values, our implementation strategy is to assert side conditions onto values that may create poison. As opposed to assertions (which must be satisfied because you passed through a control-flow point) side conditions are intended to mean that a condition must hold when a value is used (i.e., side conditions follow data-flow). So if a poison value is created but not examined (e.g., because we later do a test to determine if the value is safe), that should be allowed.
A (probably) partial list of things we intend to support, but do not yet:
- Various vector instructions. This includes a variety of instructions that LLVM allows to take vector arguments, but are currently only defined on scalar (nonvector) arguments. (Progress has been made on this, but may not yet be complete).
A list of things that are unsupported and may never be:
- Computed jumps and blockaddress expressions
- Multithreading primitives
- Alternate calling conventions
Synopsis
- data ModuleTranslation arch
- getTranslatedCFG :: ModuleTranslation arch -> Symbol -> IO (Maybe (Declare, AnyCFG LLVM, [LLVMTranslationWarning]))
- getTranslatedFnHandle :: ModuleTranslation arch -> Symbol -> IO (Maybe (Declare, SomeHandle))
- transContext :: Getter (ModuleTranslation arch) (LLVMContext arch)
- globalInitMap :: Getter (ModuleTranslation arch) GlobalInitializerMap
- modTransDefs :: Getter (ModuleTranslation arch) [(Declare, SomeHandle)]
- modTransModule :: Getter (ModuleTranslation arch) Module
- modTransHalloc :: Getter (ModuleTranslation arch) HandleAllocator
- data LLVMContext arch = LLVMContext {
- llvmArch :: ArchRepr arch
- llvmPtrWidth :: forall a. (16 <= ArchWidth arch => NatRepr (ArchWidth arch) -> a) -> a
- llvmMemVar :: GlobalVar Mem
- _llvmTypeCtx :: TypeContext
- llvmGlobalAliases :: Map Symbol (Set GlobalAlias)
- llvmFunctionAliases :: Map Symbol (Set GlobalAlias)
- llvmTypeCtx :: Simple Lens (LLVMContext arch) TypeContext
- translateModule :: (?transOpts :: TranslationOptions) => HandleAllocator -> GlobalVar Mem -> Module -> IO (Some ModuleTranslation)
- data LLVMTranslationWarning = LLVMTranslationWarning Symbol Position Text
- data LLVMConst where
- ZeroConst :: !MemType -> LLVMConst
- IntConst :: 1 <= w => !(NatRepr w) -> !(BV w) -> LLVMConst
- FloatConst :: !Float -> LLVMConst
- DoubleConst :: !Double -> LLVMConst
- LongDoubleConst :: !FP80Value -> LLVMConst
- StringConst :: !ByteString -> LLVMConst
- ArrayConst :: !MemType -> [LLVMConst] -> LLVMConst
- VectorConst :: !MemType -> [LLVMConst] -> LLVMConst
- StructConst :: !StructInfo -> [LLVMConst] -> LLVMConst
- SymbolConst :: !Symbol -> !Integer -> LLVMConst
- UndefConst :: !MemType -> LLVMConst
- boolConst :: Bool -> LLVMConst
- intConst :: MonadError String m => Natural -> Integer -> m LLVMConst
- transConstant :: (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => Typed Value -> m LLVMConst
- transConstantWithType :: (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => Typed Value -> m (MemType, LLVMConst)
- transConstant' :: (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => MemType -> Value -> m LLVMConst
- transConstantExpr :: forall m wptr. (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => ConstExpr -> m LLVMConst
- data GEP (n :: Nat) (expr :: Type) where
- GEP_scalar_base :: expr -> GEP 1 expr
- GEP_vector_base :: NatRepr n -> expr -> GEP n expr
- GEP_scatter :: NatRepr n -> GEP 1 expr -> GEP n expr
- GEP_field :: FieldInfo -> GEP n expr -> GEP n expr
- GEP_index_each :: MemType -> GEP n expr -> expr -> GEP n expr
- GEP_index_vector :: MemType -> GEP n expr -> expr -> GEP n expr
- data GEPResult expr where
- translateGEP :: forall wptr m. (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => Bool -> Type -> Typed Value -> [Typed Value] -> m (GEPResult (Typed Value))
- showInstr :: Instr -> String
- testBreakpointFunction :: String -> Bool
- data TranslationOptions = TranslationOptions {
- debugIntrinsics :: !Bool
- laxArith :: !Bool
- optLoopMerge :: !Bool
- defaultTranslationOptions :: TranslationOptions
- debugIntrinsicsTranslationOptions :: TranslationOptions
- type VarArgs = VectorType AnyType
- varArgsRepr :: TypeRepr VarArgs
- llvmTypesAsCtx :: forall a wptr. HasPtrWidth wptr => [MemType] -> (forall ctx. CtxRepr ctx -> a) -> a
- llvmTypeAsRepr :: forall a wptr. HasPtrWidth wptr => MemType -> (forall tp. TypeRepr tp -> a) -> a
- llvmRetTypeAsRepr :: forall a wptr. HasPtrWidth wptr => Maybe MemType -> (forall tp. TypeRepr tp -> a) -> a
- llvmDeclToFunHandleRepr :: HasPtrWidth wptr => FunDecl -> (forall args ret. CtxRepr args -> TypeRepr ret -> a) -> a
- llvmDeclToFunHandleRepr' :: (?lc :: TypeContext, HasPtrWidth wptr, MonadFail m) => Declare -> (forall args ret. CtxRepr args -> TypeRepr ret -> m a) -> m a
- type LLVMPointerType w = IntrinsicType "LLVM_pointer" (EmptyCtx ::> BVType w)
- pattern PtrWidth :: HasPtrWidth w => w ~ w' => NatRepr w'
- pattern LLVMPointerRepr :: () => (1 <= w, ty ~ LLVMPointerType w) => NatRepr w -> TypeRepr ty
- declareFromDefine :: Define -> Declare
- allModuleDeclares :: Module -> [Declare]
- liftMemType :: (?lc :: TypeContext, MonadError String m) => Type -> m MemType
- liftRetType :: (?lc :: TypeContext, MonadError String m) => Type -> m RetType
- liftDeclare :: (?lc :: TypeContext, MonadError String m) => Declare -> m FunDecl
Documentation
data ModuleTranslation arch Source #
The result of translating an LLVM module into Crucible CFGs.
Instances
TestEquality ModuleTranslation Source # | |
Defined in Lang.Crucible.LLVM.Translation testEquality :: forall (a :: k) (b :: k). ModuleTranslation a -> ModuleTranslation b -> Maybe (a :~: b) # |
getTranslatedCFG :: ModuleTranslation arch -> Symbol -> IO (Maybe (Declare, AnyCFG LLVM, [LLVMTranslationWarning])) Source #
Given a ModuleTranslation
and a function symbol corresponding to a function
defined in the module, attempt to look up the symbol name and retrieve the corresponding
Crucible CFG. This will load and translate the CFG if this is the first time
the given symbol is requested.
Will return Nothing
if the symbol does not refer to a function defined in this
module.
getTranslatedFnHandle :: ModuleTranslation arch -> Symbol -> IO (Maybe (Declare, SomeHandle)) Source #
Look up the signature and function handle for a function defined in this module translation. This does not trigger translation of the named function into Crucible if it has not already been requested.
Will return Nothing
if the symbol does not refer to a function defined in this
module.
transContext :: Getter (ModuleTranslation arch) (LLVMContext arch) Source #
modTransDefs :: Getter (ModuleTranslation arch) [(Declare, SomeHandle)] Source #
modTransModule :: Getter (ModuleTranslation arch) Module Source #
modTransHalloc :: Getter (ModuleTranslation arch) HandleAllocator Source #
data LLVMContext arch Source #
Information about the LLVM module.
LLVMContext | |
|
llvmTypeCtx :: Simple Lens (LLVMContext arch) TypeContext Source #
:: (?transOpts :: TranslationOptions) | |
=> HandleAllocator | Generator for nonces. |
-> GlobalVar Mem | Memory model to associate with this context |
-> Module | Module to translate |
-> IO (Some ModuleTranslation) |
Translate a module into Crucible control-flow graphs. Return the translated module and a list of warning messages generated during translation. Note: We may want to add a map from symbols to existing function handles if we want to support dynamic loading.
Representation of LLVM constant values
Translation-time LLVM constant values.
ZeroConst :: !MemType -> LLVMConst | A constant value consisting of all zero bits. |
IntConst :: 1 <= w => !(NatRepr w) -> !(BV w) -> LLVMConst | A constant integer value, with bit-width |
FloatConst :: !Float -> LLVMConst | A constant floating point value. |
DoubleConst :: !Double -> LLVMConst | A constant double value. |
LongDoubleConst :: !FP80Value -> LLVMConst | A constant long double value (X86_FP80) |
StringConst :: !ByteString -> LLVMConst | A constant sequence of bytes |
ArrayConst :: !MemType -> [LLVMConst] -> LLVMConst | A constant array value. |
VectorConst :: !MemType -> [LLVMConst] -> LLVMConst | A constant vector value. |
StructConst :: !StructInfo -> [LLVMConst] -> LLVMConst | A constant structure value. |
SymbolConst :: !Symbol -> !Integer -> LLVMConst | A pointer value, consisting of a concrete offset from a global symbol. |
UndefConst :: !MemType -> LLVMConst | The |
Instances
Show LLVMConst Source # | This also can't be derived, but is completely uninteresting. |
Eq LLVMConst Source # | The interesting cases here are:
* |
:: MonadError String m | |
=> Natural | width of the integer constant, |
-> Integer | value of the integer constant, |
-> m LLVMConst |
Create an LLVM constant of a given width. The resulting integer
constant value will be the unsigned integer value n mod 2^w
.
Translations from LLVM syntax to constant values
transConstant :: (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => Typed Value -> m LLVMConst Source #
transConstantWithType :: (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => Typed Value -> m (MemType, LLVMConst) Source #
Compute the constant value of an expression. Fail if the given value does not represent a constant.
transConstant' :: (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => MemType -> Value -> m LLVMConst Source #
Compute the constant value of an expression. Fail if the given value does not represent a constant.
transConstantExpr :: forall m wptr. (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) => ConstExpr -> m LLVMConst Source #
Compute the value of a constant expression. Fails if the expression does not actually represent a constant value.
Intermediate representation for GEP
data GEP (n :: Nat) (expr :: Type) where Source #
Intermediate representation of a GEP.
A GEP n expr
is a representation of a GEP with
n
parallel vector lanes with expressions represented
by expr
values.
GEP_scalar_base :: expr -> GEP 1 expr | Start a GEP with a single base pointer |
GEP_vector_base :: NatRepr n -> expr -> GEP n expr | Start a GEP with a vector of |
GEP_scatter :: NatRepr n -> GEP 1 expr -> GEP n expr | Copy a scalar base vector pointwise into a
vector of length |
GEP_field :: FieldInfo -> GEP n expr -> GEP n expr | Add the offset corresponding to the given field pointwise to each pointer |
GEP_index_each :: MemType -> GEP n expr -> expr -> GEP n expr | Add an offset corresponding to the given array index (multiplied by the given type size) pointwise to the pointers in each lane. |
GEP_index_vector :: MemType -> GEP n expr -> expr -> GEP n expr | Given a vector of offsets (whose length must match the number of lanes), multiply each one by the type size, and add the offsets to the corresponding pointers. |
Instances
Foldable (GEP n) Source # | |
Defined in Lang.Crucible.LLVM.Translation.Constant fold :: Monoid m => GEP n m -> m # foldMap :: Monoid m => (a -> m) -> GEP n a -> m # foldMap' :: Monoid m => (a -> m) -> GEP n a -> m # foldr :: (a -> b -> b) -> b -> GEP n a -> b # foldr' :: (a -> b -> b) -> b -> GEP n a -> b # foldl :: (b -> a -> b) -> b -> GEP n a -> b # foldl' :: (b -> a -> b) -> b -> GEP n a -> b # foldr1 :: (a -> a -> a) -> GEP n a -> a # foldl1 :: (a -> a -> a) -> GEP n a -> a # elem :: Eq a => a -> GEP n a -> Bool # maximum :: Ord a => GEP n a -> a # minimum :: Ord a => GEP n a -> a # | |
Traversable (GEP n) Source # | |
Functor (GEP n) Source # | |
data GEPResult expr where Source #
The result of a GEP instruction translation. It records the number of parallel vector lanes in the resulting instruction, the resulting memory type of the instruction, and the sequence of sub-operations required to compute the GEP instruction.
Instances
Foldable GEPResult Source # | |
Defined in Lang.Crucible.LLVM.Translation.Constant fold :: Monoid m => GEPResult m -> m # foldMap :: Monoid m => (a -> m) -> GEPResult a -> m # foldMap' :: Monoid m => (a -> m) -> GEPResult a -> m # foldr :: (a -> b -> b) -> b -> GEPResult a -> b # foldr' :: (a -> b -> b) -> b -> GEPResult a -> b # foldl :: (b -> a -> b) -> b -> GEPResult a -> b # foldl' :: (b -> a -> b) -> b -> GEPResult a -> b # foldr1 :: (a -> a -> a) -> GEPResult a -> a # foldl1 :: (a -> a -> a) -> GEPResult a -> a # toList :: GEPResult a -> [a] # length :: GEPResult a -> Int # elem :: Eq a => a -> GEPResult a -> Bool # maximum :: Ord a => GEPResult a -> a # minimum :: Ord a => GEPResult a -> a # | |
Traversable GEPResult Source # | |
Defined in Lang.Crucible.LLVM.Translation.Constant | |
Functor GEPResult Source # | |
:: forall wptr m. (?lc :: TypeContext, MonadError String m, HasPtrWidth wptr) | |
=> Bool | inbounds flag |
-> Type | base type for calculations |
-> Typed Value | base pointer expression |
-> [Typed Value] | index arguments |
-> m (GEPResult (Typed Value)) |
Given the data for an LLVM getelementpointer instruction,
preprocess the instruction into a GEPResult
, checking
types, computing vectorization lanes, etc.
As a concrete example, consider a call to
with the following
instruction:translateGEP
inbounds baseTy basePtr elts
getelementptr [12 x i8], ptr %aptr, i64 0, i32 1
Here:
inbounds
isFalse
, as the keyword of the same name is missing from the instruction. (Currently,crucible-llvm
ignores this information.)baseTy
is[12 x i8]
. This is the type used as the basis for subsequent calculations.basePtr
isptr %aptr
. This pointer is used as the base address to start calculations from. Note that the type ofbasePtr
is notbaseTy
, but rather a pointer type.- The
elts
are[i64 0, i32 1]
. These are the indices that indicate which of the elements of the aggregate object are indexed.
Utility functions
testBreakpointFunction :: String -> Bool Source #
data TranslationOptions Source #
This datatype encodes a variety of tweakable settings that apply during LLVM translation.
TranslationOptions | |
|
defaultTranslationOptions :: TranslationOptions Source #
The default translation options:
- Do not translate
llvm.dbg
intrinsic statements - Do not produce proof obligations for arithmetic-related assertions.
- Do not insert merge blocks in loops with early exits.
debugIntrinsicsTranslationOptions :: TranslationOptions Source #
Like defaultTranslationOptions
, but llvm.dbg
intrinsic statements are
translated.
type VarArgs = VectorType AnyType Source #
llvmTypesAsCtx :: forall a wptr. HasPtrWidth wptr => [MemType] -> (forall ctx. CtxRepr ctx -> a) -> a Source #
Translate a list of LLVM expressions into a crucible type context, which is passed into the given continuation.
llvmTypeAsRepr :: forall a wptr. HasPtrWidth wptr => MemType -> (forall tp. TypeRepr tp -> a) -> a Source #
Translate an LLVM type into a crucible type, which is passed into the given continuation
llvmRetTypeAsRepr :: forall a wptr. HasPtrWidth wptr => Maybe MemType -> (forall tp. TypeRepr tp -> a) -> a Source #
Translate an LLVM return type into a crucible type, which is passed into the given continuation
llvmDeclToFunHandleRepr :: HasPtrWidth wptr => FunDecl -> (forall args ret. CtxRepr args -> TypeRepr ret -> a) -> a Source #
Compute the function Crucible function signature that corresponds to the given LLVM function declaration.
llvmDeclToFunHandleRepr' :: (?lc :: TypeContext, HasPtrWidth wptr, MonadFail m) => Declare -> (forall args ret. CtxRepr args -> TypeRepr ret -> m a) -> m a Source #
type LLVMPointerType w = IntrinsicType "LLVM_pointer" (EmptyCtx ::> BVType w) Source #
Crucible type of pointers/bitvector values of width w
.
pattern LLVMPointerRepr :: () => (1 <= w, ty ~ LLVMPointerType w) => NatRepr w -> TypeRepr ty Source #
This pattern synonym makes it easy to build and destruct runtime
representatives of
.LLVMPointerType
w
declareFromDefine :: Define -> Declare Source #
allModuleDeclares :: Module -> [Declare] Source #
Return all declarations derived from both external symbols and internal definitions.
liftMemType :: (?lc :: TypeContext, MonadError String m) => Type -> m MemType Source #
liftRetType :: (?lc :: TypeContext, MonadError String m) => Type -> m RetType Source #
liftDeclare :: (?lc :: TypeContext, MonadError String m) => Declare -> m FunDecl Source #