Quality-aware alignments
Generally, quality data are ignored for alignment/pattern searching like Smith-Waterman, Needleman-Wunsch, or BLAST(p|n|x). I believe that accounting for quality will at the very least affect things like BLAST statistics, and e.g. is crucial for good EST annotation using Blastx.
This module performs sequences alignments, takes quality values into account.
See also http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn052v1.
- local_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> Double
- local_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)
- global_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> Double
- global_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)
- overlap_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> Double
- overlap_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)
- qualMx :: Qual -> Qual -> (Chr, Chr) -> Double
- test :: IO ()
Smith-Waterman
Locally optimal alignment with affine gaps, i.e. best infix match.
local_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> DoubleSource
Calculate local edit distance (Smith-Waterman alignment score)
local_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)Source
Calculate local alignment (Smith-Waterman) (can we replace uncurry max' with fst - a local alignment must always end on a subst, no?)
Needleman-Wunsch
Globally optimal alignment with affine gaps, the whole sequences are matched.
global_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> DoubleSource
Calculate global edit distance (Needleman-Wunsch alignment score)
global_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)Source
Calculate global alignment (Needleman-Wunsch)
Overlapping alignment.
The suffix of one sequence matches a prefix of another.
overlap_score :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> DoubleSource
Calucalte best overlap score, where gaps at the edges are free The starting point is like for local score (0 cost for initial indels), the result is the maximum anywhere in the last column or bottom row of the matrix.
overlap_align :: QualMx t Double -> (Double, Double) -> Sequence t -> Sequence t -> (Double, EditList)Source
Calucalte best overlap score, where gaps at the edges are free The starting point is like for local score (0 cost for initial indels), the result is the maximum anywhere in the last column or bottom row of the matrix.