Description
Data structures for manipulating (biological) sequences.
Generally supports both nucleotide and protein sequences, although some
functions, like revcompl, only make sense for nucleotides.
|
|
Data structure
|
|
A sequence is a header, sequence data itself, and optional quality data.
All items are lazy bytestrings. The Offset type can be used for indexing.
|
|
|
A sequence consists of a header, the sequence data itself, and optional quality data.
|
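As a rough illustration of the layout described above, a record with two mandatory fields and one optional field could look like the sketch below. The names ToySeq, ToyData, ToyQual, ToyOffset, tsHeader, tsData and tsQual are hypothetical and are not this module's actual definitions; only the use of lazy bytestrings and the optional quality field come from the text.

```haskell
import qualified Data.ByteString.Lazy as BW
import qualified Data.ByteString.Lazy.Char8 as B
import Data.Int (Int64)

-- Hypothetical stand-ins for the module's types.
type ToyData   = B.ByteString   -- header text and sequence text
type ToyQual   = BW.ByteString  -- one byte (0..255) per position
type ToyOffset = Int64          -- lazy bytestrings are indexed by Int64

-- A header, the sequence data, and optional quality data.
data ToySeq = ToySeq
  { tsHeader :: ToyData
  , tsData   :: ToyData
  , tsQual   :: Maybe ToyQual
  } deriving (Show, Eq)

-- A sequence without quality data.
example :: ToySeq
example = ToySeq (B.pack "seq1 toy example") (B.pack "ACGT") Nothing
```

Modelling quality as a Maybe value keeps "no quality data" distinct from "empty quality data".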
|
|
An offset, index, or length of a SeqData
|
|
|
The basic data type used in Sequences
|
|
Quality data is normally associated with nucleotide sequences
|
|
|
Basic type for quality data. Range 0..255. Typical Phred output is in
the range 6..50, with 20 as the line in the sand separating good from bad.
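For context, a Phred score Q is conventionally defined as Q = -10 log10(P), where P is the estimated probability that the base call is wrong. The sketch below applies that standard definition; the function name errorProbability is hypothetical and not part of this module.

```haskell
import Data.Word (Word8)

-- Error probability implied by a Phred score: P = 10^(-Q/10).
errorProbability :: Word8 -> Double
errorProbability q = 10 ** (negate (fromIntegral q) / 10)
```

Under this definition a score of 20 corresponds to a 1 in 100 chance of a wrong call, and a score of 30 to 1 in 1000.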
|
|
|
Quality data is a Qual vector, currently implemented as a ByteString.
|
|
Accessor functions
|
|
|
Read the character at the specified position in the sequence.
|
|
|
Return sequence length.
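Since the sequence data is described as a lazy bytestring, indexing and length can be sketched directly on that type. The helper names charAt and dataLength are hypothetical; they operate on raw data rather than on this module's sequence type, and the zero-based indexing shown is that of the bytestring package.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B
import Data.Int (Int64)

-- Hypothetical helper: character at a zero-based position.
-- Calls error when the position is out of range.
charAt :: B.ByteString -> Int64 -> Char
charAt = B.index

-- Hypothetical helper: number of characters in the data.
dataLength :: B.ByteString -> Int64
dataLength = B.length
```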
|
|
|
Return sequence label (first word of header)
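"First word of header" can be read as everything up to the first whitespace character. A minimal sketch of that reading on plain strings follows; the name labelOf is hypothetical, and the real function works on the module's own sequence type.

```haskell
import Data.Char (isSpace)

-- Hypothetical helper: the header up to the first whitespace character.
labelOf :: String -> String
labelOf = takeWhile (not . isSpace)
```

For a header such as "gi|12345 Homo sapiens", this sketch yields "gi|12345".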
|
|
|
Return full header.
|
|
|
Return the sequence data.
|
|
|
|
|
Check whether the sequence has associated quality data.
|
|
|
Return the quality data, or raise an error if none exists. Use hasqual if in doubt.
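The check-before-access pattern suggested here can be sketched on a toy record. Everything named below (ToySeq, toyData, toyQual, hasQuality, qualityOf, meanQuality) is hypothetical and only mirrors the behaviour described in the text: one function tests for quality data, the other fails when none is attached.

```haskell
import Data.Maybe (isJust)
import Data.Word (Word8)

-- Hypothetical record with optional quality data.
data ToySeq = ToySeq { toyData :: String, toyQual :: Maybe [Word8] }

-- True when quality data is attached.
hasQuality :: ToySeq -> Bool
hasQuality = isJust . toyQual

-- Partial: calls error when no quality data is attached.
qualityOf :: ToySeq -> [Word8]
qualityOf s = case toyQual s of
  Just q  -> q
  Nothing -> error "qualityOf: no quality data"

-- Safe use: check first, and return Nothing when there is nothing to average.
-- Laziness means qualityOf is only evaluated after hasQuality succeeds.
meanQuality :: ToySeq -> Maybe Double
meanQuality s
  | hasQuality s && not (null q) =
      Just (sum (map fromIntegral q) / fromIntegral (length q))
  | otherwise = Nothing
  where q = qualityOf s
```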
|
|
Converting to and from [Char]
|
|
|
Convert a String to SeqData
|
|
|
Convert a SeqData to a String
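Assuming SeqData is a lazy bytestring, as stated earlier on this page, the two conversions amount to packing and unpacking characters. The names stringToData and dataToString are hypothetical wrappers around the bytestring package, not this module's functions.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B

-- Hypothetical wrapper: String to lazy ByteString.
-- Characters are truncated to 8 bits, which is harmless for ASCII sequences.
stringToData :: String -> B.ByteString
stringToData = B.pack

-- Hypothetical wrapper: lazy ByteString back to String.
dataToString :: B.ByteString -> String
dataToString = B.unpack
```

Strings are convenient for small examples, but the packed form is considerably more compact for long sequences.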
|
|
Nucleotide functionality
|
|
Nucleotide sequences contain the alphabet [A,C,G,T].
IUPAC specifies an extended nucleotide alphabet with wildcards, but
it is not supported at this point.
|
|
|
Complement a single character, i.e. identify the nucleotide it
can hybridize with. Note that for a sequence of several nucleotides, you
usually want the reverse complement (see revcompl for that).
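The base pairing behind complementing is A with T and C with G. A minimal sketch follows, with the hypothetical name complementChar; passing unknown characters through unchanged and preserving case are assumptions of this sketch.

```haskell
-- Hypothetical helper: Watson-Crick complement of one nucleotide.
complementChar :: Char -> Char
complementChar 'A' = 'T'
complementChar 'T' = 'A'
complementChar 'C' = 'G'
complementChar 'G' = 'C'
complementChar 'a' = 't'
complementChar 't' = 'a'
complementChar 'c' = 'g'
complementChar 'g' = 'c'
complementChar c   = c  -- assumption: anything else is left unchanged
```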
|
|
|
Calculate the reverse complement.
This is only relevant for the nucleotide alphabet,
and it leaves other characters unmodified.
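The reverse complement is the sequence read from the opposite strand: complement every base, then reverse the order. A sketch on plain strings follows, using the hypothetical name revComp; uppercase input is assumed, and other characters are left unmodified as described above.

```haskell
-- Hypothetical helper: reverse complement of an uppercase nucleotide string.
revComp :: String -> String
revComp = reverse . map comp
  where
    comp 'A' = 'T'
    comp 'T' = 'A'
    comp 'C' = 'G'
    comp 'G' = 'C'
    comp c   = c  -- non-nucleotide characters pass through unchanged
```

For example, "AACG" complements to "TTGC", which reversed gives "CGTT". Applying the function twice returns the original string.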
|
|
Protein functionality
|
|
Proteins are chains of amino acids, represented by the IUPAC alphabet.
|
|
|
Constructors: Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Tyr, Trp, Val, STP, Asx, Glx, Xle, Xaa
|
|
|
Translate a nucleotide sequence into the corresponding protein
sequence. This works rather blindly, with no attempt to identify ORFs
or otherwise QA the result.
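"Blind" translation can be pictured as reading codons from the first base onward and looking each one up in the standard genetic code, with no search for start or stop codons. The sketch below works on plain strings and returns one-letter codes. The names codeTable, translateCodon and translateFrame0 are hypothetical, and mapping unrecognised codons to 'X' and dropping trailing bases are assumptions of this sketch, not documented behaviour.

```haskell
import Data.Char (toUpper)
import Data.List (elemIndex)

-- Standard genetic code, with codons ordered by the bases T, C, A, G.
codeTable :: String
codeTable = "FFLLSSSSYY**CC*W" ++ "LLLLPPPPHHQQRRRR"
         ++ "IIIMTTTTNNKKSSRR" ++ "VVVVAAAADDEEGGGG"

-- Translate one codon to a one-letter amino acid code ('*' is a stop).
translateCodon :: String -> Char
translateCodon [a, b, c] =
  case mapM (\x -> elemIndex (toUpper x) "TCAG") [a, b, c] of
    Just [i, j, k] -> codeTable !! (16 * i + 4 * j + k)
    _              -> 'X'  -- assumption: unknown bases give 'X'
translateCodon _ = 'X'

-- Translate from the first base, three bases at a time.
translateFrame0 :: String -> String
translateFrame0 (a:b:c:rest) = translateCodon [a, b, c] : translateFrame0 rest
translateFrame0 _            = []  -- one or two trailing bases are dropped
```

Note that the stop codon in "ATGGCCTAA" is translated like any other codon, giving "MA*", rather than ending the translation.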
|
|
|
Convert a sequence in IUPAC format to a list of amino acids.
|
|
|
Convert a list of amino acids to a sequence in IUPAC format.
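The two conversions above pair each constructor with its IUPAC one-letter code. A sketch follows; the type name AminoAcid and the function names are hypothetical, while the constructor names follow the list on this page. Treating unrecognised letters (including lowercase ones) as Xaa is an assumption of this sketch.

```haskell
import Data.Maybe (fromMaybe)

-- Constructor order matches the order of 'letters' below.
data AminoAcid
  = Ala | Arg | Asn | Asp | Cys | Gln | Glu | Gly | His | Ile
  | Leu | Lys | Met | Phe | Pro | Ser | Thr | Tyr | Trp | Val
  | STP | Asx | Glx | Xle | Xaa
  deriving (Show, Eq, Enum, Bounded)

-- IUPAC one-letter codes, in constructor order ('*' marks a stop).
letters :: String
letters = "ARNDCQEGHILKMFPSTYWV*BZJX"

toLetter :: AminoAcid -> Char
toLetter a = letters !! fromEnum a

-- Assumption: letters outside the table map to Xaa.
fromLetter :: Char -> AminoAcid
fromLetter c = fromMaybe Xaa (lookup c (zip letters [minBound ..]))

toIUPAC :: [AminoAcid] -> String
toIUPAC = map toLetter

fromIUPAC :: String -> [AminoAcid]
fromIUPAC = map fromLetter
```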
|
|
Produced by Haddock version 2.4.2