Copyright | (c) HyraxBio 2018 |
---|---|
License | BSD3 |
Maintainer | andre@hyraxbio.co.za, andre@andrevdm.com |
Safe Haskell | Safe |
Language | Haskell2010 |
Functionality for generating AB1 files from an input FASTA. These AB1s are supported by both PHRED and recall. If you are using other software, you may need to add additional required sections.
Weighted reads
The input FASTA files have "weighted" reads. The name of each read is a value between 0 and 1 that specifies the height of the peak relative to a full peak.
Single read
The simplest example is a single FASTA containing a single read with a weight of 1:
    > 1
    ACTG
The chromatogram for this AB1 shows perfect traces for the input ACTG nucleotides, each with a full-height peak.
Mixes & multiple reads
The source FASTA can have multiple reads, which results in a chromatogram with mixes:

    > 1
    ACAG
    > 0.3
    ACTG
There is an AT mix at the third nucleotide. The first read has a weight of 1 and the second a weight of 0.3.
The chromatogram shows the mix, with the T at a lower peak (30% of the A peak).
Summing weights
- The weight of a read specifies the intensity of the peak, from 0 to 1.
- Weights for each position are added, up to a maximum of 1 per nucleotide.
- You can use `_` as a "blank" nucleotide, in which case only the nucleotides from other reads are considered at that position.
E.g.
    > 1
    ACAG
    > 0.3
    _GT
    > 0.2
    _G
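The summing rules above can be illustrated with a minimal sketch. It is not the library's implementation: `sumWeights` is a hypothetical helper, it works on `String` rather than `Text`, and it assumes only plain A/C/G/T input, so `_` (and any other character) simply contributes nothing.

```haskell
-- Sketch only: 'sumWeights' is a hypothetical helper, not part of the library.
-- It handles plain A/C/G/T; '_' and any other character contribute nothing.

-- | For each position, the summed weight per nucleotide, capped at 1.
sumWeights :: [(Double, String)] -> [[(Char, Double)]]
sumWeights weighted =
  [ [ (n, min 1 w)                      -- cap the summed weight at 1
    | n <- "ACGT"
    , let w = sum [ wt | (wt, s) <- weighted, nucAt i s == Just n ]
    , w > 0                             -- drop nucleotides with no signal
    ]
  | i <- [0 .. maxLen - 1]
  ]
  where
    -- Length of the longest read; shorter reads just stop contributing.
    maxLen = maximum (0 : map (length . snd) weighted)

    -- Nucleotide at position i, if the read is long enough.
    nucAt i s = case drop i s of
      (c : _) -> Just c
      []      -> Nothing
```

For the example above, `sumWeights [(1, "ACAG"), (0.3, "_GT"), (0.2, "_G")]` reports a C/G mix at the second position (the two G weights are added together) and an A/T mix at the third.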
Reverse reads
A weighted FASTA can represent a reverse read. To do this, add an R suffix to the weight.
Enter the data as if it were a forward read. It will be complemented and reversed before being written to the ABIF.
E.g.
    > 1R
    ACAG
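As a rough illustration of what "complemented and reversed" means for a read, here is a minimal sketch. The names `complementBase` and `reverseComplement` are hypothetical and not the library's API, and the sketch assumes plain A/C/G/T input, passing any other character through unchanged.

```haskell
-- Sketch only: hypothetical helpers, not the library's implementation.

-- | Complement of a single plain nucleotide; other characters pass through.
complementBase :: Char -> Char
complementBase 'A' = 'T'
complementBase 'T' = 'A'
complementBase 'C' = 'G'
complementBase 'G' = 'C'
complementBase c   = c

-- | Complement every nucleotide, then reverse the sequence.
reverseComplement :: String -> String
reverseComplement = reverse . map complementBase
```

Under these assumptions, `reverseComplement "ACAG"` gives `"CTGT"`, and applying it twice returns the original read.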
See README.md for additional details and examples.
Synopsis
- generateAb1s :: FilePath -> FilePath -> IO ()
- generateAb1 :: (Text, [(Double, Text)]) -> ByteString
- readWeightedFasta :: ByteString -> Either Text [(Double, Text)]
- iupac :: [[Char]] -> [Char]
- unIupac :: Char -> [Char]
- complementNucleotides :: Text -> Text
Documentation
generateAb1s :: FilePath -> FilePath -> IO ()
Generate a set of AB1s, one for every FASTA found in the source directory.
generateAb1 :: (Text, [(Double, Text)]) -> ByteString
Create the ByteString data for an AB1 given the data from a weighted FASTA (see readWeightedFasta).
readWeightedFasta :: ByteString -> Either Text [(Double, Text)]
Read a weighted FASTA file. See the module documentation for details on the format of the weighted FASTA.
Reads with a weight followed by an R are reverse reads, and the AB1 generated will contain the complemented sequence.
E.g. a weighted FASTA:

    > 1
    ACAG
    > 0.3
    _GT
    > 0.2
    _G
The result data has the type

    [(Double, Text)]
       ^       ^
       |       |
       |       +---- read
       +---- weight
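To make the format concrete, the following is a minimal sketch of a parser for it. It is not `readWeightedFasta`: `parseWeighted` is a hypothetical stand-in that works on `String`, reports errors as `String`, and keeps a reverse flag in the result instead of complementing the read. It also assumes that a read may span several lines and that weights outside 0 to 1 are rejected, neither of which is stated in this documentation.

```haskell
import Data.Char (isSpace)

-- Sketch only: 'parseWeighted' is a hypothetical stand-in, not the library's
-- 'readWeightedFasta'. It keeps the reverse flag instead of complementing.

-- | Parse weighted FASTA text into (weight, isReverse, read) triples.
parseWeighted :: String -> Either String [(Double, Bool, String)]
parseWeighted = go . filter (not . null) . map trim . lines
  where
    -- Strip leading and trailing whitespace.
    trim = f . f
      where f = reverse . dropWhile isSpace

    isHeader l = take 1 l == ">"

    go [] = Right []
    go (('>' : hdr) : rest) =
      -- 'reads' parses the numeric weight and returns whatever follows it.
      case reads hdr :: [(Double, String)] of
        [(w, sfx)]
          | w >= 0 && w <= 1 && sfx `elem` ["", "R"] ->
              -- Assumption: all lines up to the next header belong to this read.
              let (seqLines, others) = break isHeader rest
              in ((w, sfx == "R", concat seqLines) :) <$> go others
        _ -> Left ("Invalid weight in header: >" ++ hdr)
    go (l : _) = Left ("Expected a header line, got: " ++ l)
```

A header such as `> 0.3R` yields a weight of 0.3 with the reverse flag set; text that does not start with a `>` header yields a `Left` error.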
unIupac :: Char -> [Char]
Convert an IUPAC ambiguity code to the set of nucleotides it represents.
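For reference, the standard IUPAC nucleotide ambiguity codes can be written as a lookup like the one sketched below. This is not the library's `unIupac`: `expandIupac` is a hypothetical name, and both the ordering of the returned nucleotides and the empty result for unrecognised characters are assumptions made for this sketch.

```haskell
-- Sketch only: 'expandIupac' is a hypothetical name. The table is the
-- standard IUPAC nucleotide ambiguity code; returning "" for unknown
-- characters is an assumption of this sketch.
expandIupac :: Char -> String
expandIupac c = case c of
  'A' -> "A"
  'C' -> "C"
  'G' -> "G"
  'T' -> "T"
  'R' -> "AG"    -- purine
  'Y' -> "CT"    -- pyrimidine
  'S' -> "CG"    -- strong
  'W' -> "AT"    -- weak
  'K' -> "GT"    -- keto
  'M' -> "AC"    -- amino
  'B' -> "CGT"   -- not A
  'D' -> "AGT"   -- not C
  'H' -> "ACT"   -- not G
  'V' -> "ACG"   -- not T
  'N' -> "ACGT"  -- any
  _   -> ""
```

For instance, `expandIupac 'W'` gives `"AT"`, the two nucleotides in an AT mix.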
complementNucleotides :: Text -> Text
Return the complement of a nucleotide string.