Copyright | © 2015–present Megaparsec contributors |
---|---|
License | FreeBSD |
Maintainer | Mark Karpov <markkarpov92@gmail.com> |
Stability | experimental |
Portability | portable |
Safe Haskell | Safe |
Language | Haskell2010 |
Megaparsec's input stream facilities.
You probably do not want to import this module directly because Text.Megaparsec re-exports it anyway.
Since: 6.0.0
Synopsis
- class (Ord (Token s), Ord (Tokens s)) => Stream s where
- type Token s :: Type
- type Tokens s :: Type
- tokenToChunk :: Proxy s -> Token s -> Tokens s
- tokensToChunk :: Proxy s -> [Token s] -> Tokens s
- chunkToTokens :: Proxy s -> Tokens s -> [Token s]
- chunkLength :: Proxy s -> Tokens s -> Int
- chunkEmpty :: Proxy s -> Tokens s -> Bool
- take1_ :: s -> Maybe (Token s, s)
- takeN_ :: Int -> s -> Maybe (Tokens s, s)
- takeWhile_ :: (Token s -> Bool) -> s -> (Tokens s, s)
- newtype ShareInput a = ShareInput {
- unShareInput :: a
- newtype NoShareInput a = NoShareInput {
- unNoShareInput :: a
- class Stream s => VisualStream s where
- showTokens :: Proxy s -> NonEmpty (Token s) -> String
- tokensLength :: Proxy s -> NonEmpty (Token s) -> Int
- class Stream s => TraversableStream s where
- reachOffset :: Int -> PosState s -> (Maybe String, PosState s)
- reachOffsetNoLine :: Int -> PosState s -> PosState s
Documentation
class (Ord (Token s), Ord (Tokens s)) => Stream s where Source #
Type class for inputs that can be consumed by the library.
Note that the Stream
instances for Text
and ByteString
(strict and
lazy) default to "input sharing" (see ShareInput
, NoShareInput
). We plan
to move away from input sharing in a future major release; if you want to
retain the current behaviour and are concerned with maximum performance you
should consider using the ShareInput
wrapper explicitly.
Note: before the version 9.0.0 the class included the methods from
VisualStream
and TraversableStream
.
tokenToChunk :: Proxy s -> Token s -> Tokens s Source #
Lift a single token to chunk of the stream. The default implementation is:
tokenToChunk pxy = tokensToChunk pxy . pure
However for some types of stream there may be a more efficient way to lift.
tokensToChunk :: Proxy s -> [Token s] -> Tokens s Source #
The first method that establishes isomorphism between list of tokens and chunk of the stream. Valid implementation should satisfy:
chunkToTokens pxy (tokensToChunk pxy ts) == ts
chunkToTokens :: Proxy s -> Tokens s -> [Token s] Source #
The second method that establishes isomorphism between list of tokens and chunk of the stream. Valid implementation should satisfy:
tokensToChunk pxy (chunkToTokens pxy chunk) == chunk
chunkLength :: Proxy s -> Tokens s -> Int Source #
Return length of a chunk of the stream.
chunkEmpty :: Proxy s -> Tokens s -> Bool Source #
Check if a chunk of the stream is empty. The default implementation
is in terms of the more general chunkLength
:
chunkEmpty pxy ts = chunkLength pxy ts <= 0
However for many streams there may be a more efficient implementation.
take1_ :: s -> Maybe (Token s, s) Source #
Extract a single token form the stream. Return Nothing
if the
stream is empty.
takeN_ :: Int -> s -> Maybe (Tokens s, s) Source #
should try to extract a chunk of length takeN_
n sn
, or if the
stream is too short, the rest of the stream. Valid implementation
should follow the rules:
- If the requested length
n
is 0 (or less),Nothing
should never be returned, instead
should be returned, whereJust
("", s)""
stands for the empty chunk, ands
is the original stream (second argument). - If the requested length is greater than 0 and the stream is
empty,
Nothing
should be returned indicating end of input. - In other cases, take chunk of length
n
(or shorter if the stream is not long enough) from the input stream and return the chunk along with the rest of the stream.
takeWhile_ :: (Token s -> Bool) -> s -> (Tokens s, s) Source #
Extract chunk of the stream taking tokens while the supplied
predicate returns True
. Return the chunk and the rest of the stream.
For many types of streams, the method allows for significant performance improvements, although it is not strictly necessary from conceptual point of view.
Instances
newtype ShareInput a Source #
This wrapper selects the input-sharing Stream
implementation for
Text
(Text
) and ByteString
(ByteString
). By input
sharing we mean that our parsers will use slices whenever possible to
avoid having to copy parts of the input. See also the documentation of
split
.
Note that using slices is in general faster than copying; on the other
hand it also has the potential for causing surprising memory leaks: if
any slice of the input survives in the output, holding on to the output
will force the entire input Text
/ByteString
to stay in memory!
Even when using lazy Text
/ByteString
we will hold on to whole
chunks at a time leading to to significantly worse memory residency in
some cases.
See NoShareInput
for a somewhat slower implementation that avoids this
memory leak scenario.
Since: 9.3.0
ShareInput | |
|
Instances
newtype NoShareInput a Source #
This wrapper selects the no-input-sharing Stream
implementation for
Text
(Text
) and ByteString
(ByteString
). This means
that our parsers will create independent copies rather than using slices
of the input. See also the documentation of copy
.
More importantly, any parser output will be independent of the input, and holding on to parts of the output will never prevent the input from being garbage collected.
For maximum performance you might consider using ShareInput
instead,
but beware of its pitfalls!
Since: 9.3.0
Instances
class Stream s => VisualStream s where Source #
Type class for inputs that can also be used for debugging.
Since: 9.0.0
showTokens :: Proxy s -> NonEmpty (Token s) -> String Source #
Pretty-print non-empty stream of tokens. This function is also used to print single tokens (represented as singleton lists).
Since: 7.0.0
tokensLength :: Proxy s -> NonEmpty (Token s) -> Int Source #
Return the number of characters that a non-empty stream of tokens spans. The default implementation is sufficient if every token spans exactly 1 character.
Since: 8.0.0
Instances
VisualStream ByteString Source # | |
Defined in Text.Megaparsec.Stream showTokens :: Proxy ByteString -> NonEmpty (Token ByteString) -> String Source # tokensLength :: Proxy ByteString -> NonEmpty (Token ByteString) -> Int Source # | |
VisualStream ByteString Source # | |
Defined in Text.Megaparsec.Stream showTokens :: Proxy ByteString -> NonEmpty (Token ByteString) -> String Source # tokensLength :: Proxy ByteString -> NonEmpty (Token ByteString) -> Int Source # | |
VisualStream Text Source # | |
VisualStream Text Source # | |
VisualStream String Source # | |
class Stream s => TraversableStream s where Source #
Type class for inputs that can also be used for error reporting.
Since: 9.0.0
:: Int | Offset to reach |
-> PosState s | Initial |
-> (Maybe String, PosState s) | See the description of the function |
Given an offset o
and initial PosState
, adjust the state in such
a way that it starts at the offset.
Return two values (in order):
Maybe
String
representing the line on which the given offseto
is located. It can be omitted (i.e.Nothing
); in that case error reporting functions will not show offending lines. If returned, the line should satisfy a number of conditions that are described below.- The updated
PosState
which can be in turn used to locate another offseto'
given thato' >= o
.
The String
representing the offending line in input stream should
satisfy the following:
- It should adequately represent location of token at the offset of
interest, that is, character at
sourceColumn
of the returnedSourcePos
should correspond to the token at the offseto
. - It should not include the newline at the end.
- It should not be empty, if the line happens to be empty, it
should be replaced with the string
"<empty line>"
. - Tab characters should be replaced by appropriate number of
spaces, which is determined by the
pstateTabWidth
field ofPosState
.
Note: type signature of the function was changed in the version 9.0.0.
Since: 7.0.0
:: Int | Offset to reach |
-> PosState s | Initial |
-> PosState s | Reached source position and updated state |
A version of reachOffset
that may be faster because it doesn't need
to fetch the line at which the given offset in located.
The default implementation is this:
reachOffsetNoLine o pst = snd (reachOffset o pst)
Note: type signature of the function was changed in the version 8.0.0.
Since: 7.0.0
Instances
TraversableStream ByteString Source # | |
Defined in Text.Megaparsec.Stream reachOffset :: Int -> PosState ByteString -> (Maybe String, PosState ByteString) Source # reachOffsetNoLine :: Int -> PosState ByteString -> PosState ByteString Source # | |
TraversableStream ByteString Source # | |
Defined in Text.Megaparsec.Stream reachOffset :: Int -> PosState ByteString -> (Maybe String, PosState ByteString) Source # reachOffsetNoLine :: Int -> PosState ByteString -> PosState ByteString Source # | |
TraversableStream Text Source # | |
TraversableStream Text Source # | |
TraversableStream String Source # | |