Portability | Haskell 2010 |
---|---|
Stability | stable |
Maintainer | Brent Yorgey <byorgey@gmail.com> |
Safe Haskell | Safe-Inferred |
Implementation module for Data.List.Split, a combinator library for splitting lists. See the Data.List.Split documentation for more description and examples.
- data Splitter a = Splitter {}
- defaultSplitter :: Splitter a
- newtype Delimiter a = Delimiter [a -> Bool]
- matchDelim :: Delimiter a -> [a] -> Maybe ([a], [a])
- data DelimPolicy
- data CondensePolicy
- data EndPolicy
- data Chunk a
- type SplitList a = [Chunk a]
- fromElem :: Chunk a -> [a]
- isDelim :: Chunk a -> Bool
- isText :: Chunk a -> Bool
- splitInternal :: Delimiter a -> [a] -> SplitList a
- postProcess :: Splitter a -> SplitList a -> SplitList a
- doDrop :: DelimPolicy -> SplitList a -> SplitList a
- doCondense :: CondensePolicy -> SplitList a -> SplitList a
- insertBlanks :: CondensePolicy -> SplitList a -> SplitList a
- insertBlanks' :: CondensePolicy -> SplitList a -> SplitList a
- doMerge :: DelimPolicy -> SplitList a -> SplitList a
- mergeLeft :: SplitList a -> SplitList a
- mergeRight :: SplitList a -> SplitList a
- dropInitial :: EndPolicy -> SplitList a -> SplitList a
- dropFinal :: EndPolicy -> SplitList a -> SplitList a
- split :: Splitter a -> [a] -> [[a]]
- oneOf :: Eq a => [a] -> Splitter a
- onSublist :: Eq a => [a] -> Splitter a
- whenElt :: (a -> Bool) -> Splitter a
- dropDelims :: Splitter a -> Splitter a
- keepDelimsL :: Splitter a -> Splitter a
- keepDelimsR :: Splitter a -> Splitter a
- condense :: Splitter a -> Splitter a
- dropInitBlank :: Splitter a -> Splitter a
- dropFinalBlank :: Splitter a -> Splitter a
- dropInnerBlanks :: Splitter a -> Splitter a
- dropBlanks :: Splitter a -> Splitter a
- startsWith :: Eq a => [a] -> Splitter a
- startsWithOneOf :: Eq a => [a] -> Splitter a
- endsWith :: Eq a => [a] -> Splitter a
- endsWithOneOf :: Eq a => [a] -> Splitter a
- splitOneOf :: Eq a => [a] -> [a] -> [[a]]
- splitOn :: Eq a => [a] -> [a] -> [[a]]
- splitWhen :: (a -> Bool) -> [a] -> [[a]]
- endBy :: Eq a => [a] -> [a] -> [[a]]
- endByOneOf :: Eq a => [a] -> [a] -> [[a]]
- wordsBy :: (a -> Bool) -> [a] -> [[a]]
- linesBy :: (a -> Bool) -> [a] -> [[a]]
- build :: ((a -> [a] -> [a]) -> [a] -> [a]) -> [a]
- chunksOf :: Int -> [e] -> [[e]]
- splitPlaces :: Integral a => [a] -> [e] -> [[e]]
- splitPlacesBlanks :: Integral a => [a] -> [e] -> [[e]]
- chop :: ([a] -> (b, [a])) -> [a] -> [b]
Types and utilities
A splitting strategy.
Splitter | |
|
defaultSplitter :: Splitter aSource
The default splitting strategy: keep delimiters in the output as separate chunks, don't condense multiple consecutive delimiters into one, keep initial and final blank chunks. Default delimiter is the constantly false predicate.
Note that defaultSplitter
should normally not be used; use
oneOf
, onSublist
, or whenElt
instead, which are the same as
the defaultSplitter
with just the delimiter overridden.
The defaultSplitter
strategy with any delimiter gives a
maximally information-preserving splitting strategy, in the sense
that (a) taking the concat
of the output yields the original
list, and (b) given only the output list, we can reconstruct a
Splitter
which would produce the same output list again given
the original input list. This default strategy can be overridden
to allow discarding various sorts of information.
A delimiter is a list of predicates on elements, matched by some contiguous subsequence of a list.
matchDelim :: Delimiter a -> [a] -> Maybe ([a], [a])Source
Try to match a delimiter at the start of a list, either failing or decomposing the list into the portion which matched the delimiter and the remainder.
data DelimPolicy Source
What to do with delimiters?
data CondensePolicy Source
What to do with multiple consecutive delimiters?
Condense | Condense into a single delimiter. |
DropBlankFields | Keep consecutive delimiters separate, but don't insert blank chunks in between them. |
KeepBlankFields | Insert blank chunks between consecutive delimiters. |
What to do with a blank chunk at either end of the list (i.e. when the list begins or ends with a delimiter).
Tag chunks as delimiters or text.
type SplitList a = [Chunk a]Source
Internal representation of a split list that tracks which pieces are delimiters and which aren't.
Implementation
splitInternal :: Delimiter a -> [a] -> SplitList aSource
Given a delimiter to use, split a list into an internal representation with chunks tagged as delimiters or text. This transformation is lossless; in particular,
concatMap
fromElem
(splitInternal
d l) == l.
postProcess :: Splitter a -> SplitList a -> SplitList aSource
Given a split list in the internal tagged representation, produce
a new internal tagged representation corresponding to the final
output, according to the strategy defined by the given
Splitter
.
doDrop :: DelimPolicy -> SplitList a -> SplitList aSource
Drop delimiters if the DelimPolicy
is Drop
.
doCondense :: CondensePolicy -> SplitList a -> SplitList aSource
Condense multiple consecutive delimiters into one if the
CondensePolicy
is Condense
.
insertBlanks :: CondensePolicy -> SplitList a -> SplitList aSource
Insert blank chunks between any remaining consecutive delimiters
(unless the condense policy is DropBlankFields
), and at the
beginning or end if the first or last element is a delimiter.
insertBlanks' :: CondensePolicy -> SplitList a -> SplitList aSource
Insert blank chunks between consecutive delimiters.
doMerge :: DelimPolicy -> SplitList a -> SplitList aSource
Merge delimiters into adjacent chunks according to the DelimPolicy
.
mergeLeft :: SplitList a -> SplitList aSource
Merge delimiters with adjacent chunks to the right (yes, that's not a typo: the delimiters should end up on the left of the chunks, so they are merged with chunks to their right).
mergeRight :: SplitList a -> SplitList aSource
Merge delimiters with adjacent chunks to the left.
dropInitial :: EndPolicy -> SplitList a -> SplitList aSource
Drop an initial blank chunk according to the given EndPolicy
.
dropFinal :: EndPolicy -> SplitList a -> SplitList aSource
Drop a final blank chunk according to the given EndPolicy
.
Combinators
split :: Splitter a -> [a] -> [[a]]Source
Split a list according to the given splitting strategy. This is
how to "run" a Splitter
that has been built using the other
combinators.
Basic strategies
All these basic strategies have the same parameters as the
defaultSplitter
except for the delimiters.
oneOf :: Eq a => [a] -> Splitter aSource
A splitting strategy that splits on any one of the given elements. For example:
split (oneOf "xyz") "aazbxyzcxd" == ["aa","z","b","x","","y","","z","c","x","d"]
onSublist :: Eq a => [a] -> Splitter aSource
A splitting strategy that splits on the given list, when it is encountered as an exact subsequence. For example:
split (onSublist "xyz") "aazbxyzcxd" == ["aazb","xyz","cxd"]
Note that splitting on the empty list is a special case, which splits just before every element of the list being split. For example:
split (onSublist "") "abc" == ["","","a","","b","","c"] split (dropDelims . dropBlanks $ onSublist "") "abc" == ["a","b","c"]
However, if you want to break a list into singleton elements like
this, you are better off using
, or better yet,
chunksOf
1
.
map
(:[])
whenElt :: (a -> Bool) -> Splitter aSource
A splitting strategy that splits on any elements that satisfy the given predicate. For example:
split (whenElt (<0)) [2,4,-3,6,-9,1] == [[2,4],[-3],[6],[-9],[1]]
Strategy transformers
dropDelims :: Splitter a -> Splitter aSource
Drop delimiters from the output (the default is to keep them). For example,
split (oneOf ":") "a:b:c" == ["a", ":", "b", ":", "c"] split (dropDelims $ oneOf ":") "a:b:c" == ["a", "b", "c"]
keepDelimsL :: Splitter a -> Splitter aSource
Keep delimiters in the output by prepending them to adjacent chunks. For example:
split (keepDelimsL $ oneOf "xyz") "aazbxyzcxd" == ["aa","zb","x","y","zc","xd"]
keepDelimsR :: Splitter a -> Splitter aSource
Keep delimiters in the output by appending them to adjacent chunks. For example:
split (keepDelimsR $ oneOf "xyz") "aazbxyzcxd" == ["aaz","bx","y","z","cx","d"]
condense :: Splitter a -> Splitter aSource
Condense multiple consecutive delimiters into one. For example:
split (condense $ oneOf "xyz") "aazbxyzcxd" == ["aa","z","b","xyz","c","x","d"] split (dropDelims $ oneOf "xyz") "aazbxyzcxd" == ["aa","b","","","c","d"] split (condense . dropDelims $ oneOf "xyz") "aazbxyzcxd" == ["aa","b","c","d"]
dropInitBlank :: Splitter a -> Splitter aSource
Don't generate a blank chunk if there is a delimiter at the beginning. For example:
split (oneOf ":") ":a:b" == ["",":","a",":","b"] split (dropInitBlank $ oneOf ":") ":a:b" == [":","a",":","b"]
dropFinalBlank :: Splitter a -> Splitter aSource
Don't generate a blank chunk if there is a delimiter at the end. For example:
split (oneOf ":") "a:b:" == ["a",":","b",":",""] split (dropFinalBlank $ oneOf ":") "a:b:" == ["a",":","b",":"]
dropInnerBlanks :: Splitter a -> Splitter aSource
Don't generate blank chunks between consecutive delimiters. For example:
split (oneOf ":") "::b:::a" == ["",":","",":","b",":","",":","",":","a"] split (dropInnerBlanks $ oneOf ":") "::b:::a" == ["", ":",":","b",":",":",":","a"]
Derived combinators
dropBlanks :: Splitter a -> Splitter aSource
Drop all blank chunks from the output, and condense consecutive
delimiters into one. Equivalent to
. For example:
dropInitBlank
. dropFinalBlank
. condense
split (oneOf ":") "::b:::a" == ["",":","",":","b",":","",":","",":","a"] split (dropBlanks $ oneOf ":") "::b:::a" == ["::","b",":::","a"]
startsWith :: Eq a => [a] -> Splitter aSource
Make a strategy that splits a list into chunks that all start
with the given subsequence (except possibly the first).
Equivalent to
.
For example:
dropInitBlank
. keepDelimsL
. onSublist
split (startsWith "app") "applyapplicativeapplaudapproachapple" == ["apply","applicative","applaud","approach","apple"]
startsWithOneOf :: Eq a => [a] -> Splitter aSource
Make a strategy that splits a list into chunks that all start
with one of the given elements (except possibly the first).
Equivalent to
. For
example:
dropInitBlank
. keepDelimsL
. oneOf
split (startsWithOneOf ['A'..'Z']) "ACamelCaseIdentifier" == ["A","Camel","Case","Identifier"]
endsWith :: Eq a => [a] -> Splitter aSource
Make a strategy that splits a list into chunks that all end with
the given subsequence, except possibly the last. Equivalent to
. For example:
dropFinalBlank
. keepDelimsR
. onSublist
split (endsWith "ly") "happilyslowlygnarlylily" == ["happily","slowly","gnarly","lily"]
endsWithOneOf :: Eq a => [a] -> Splitter aSource
Make a strategy that splits a list into chunks that all end with
one of the given elements, except possibly the last. Equivalent
to
. For example:
dropFinalBlank
. keepDelimsR
. oneOf
split (condense $ endsWithOneOf ".,?! ") "Hi, there! How are you?" == ["Hi, ","there! ","How ","are ","you?"]
Convenience functions
splitOneOf :: Eq a => [a] -> [a] -> [[a]]Source
Split on any of the given elements. Equivalent to
. For example:
split
. dropDelims
. oneOf
splitOneOf ";.," "foo,bar;baz.glurk" == ["foo","bar","baz","glurk"]
splitOn :: Eq a => [a] -> [a] -> [[a]]Source
Split on the given sublist. Equivalent to
. For example:
split
. dropDelims
. onSublist
splitOn ".." "a..b...c....d.." == ["a","b",".c","","d",""]
In some parsing combinator frameworks this is also known as
sepBy
.
Note that this is the right inverse of the intercalate
function
from Data.List, that is,
intercalate x . splitOn x === id
is the identity on
certain lists, but it is tricky to state the precise conditions
under which this holds. (For example, it is not enough to say
that splitOn
x . intercalate
xx
does not occur in any elements of the input list.
Working out why is left as an exercise for the reader.)
splitWhen :: (a -> Bool) -> [a] -> [[a]]Source
Split on elements satisfying the given predicate. Equivalent to
. For example:
split
. dropDelims
. whenElt
splitWhen (<0) [1,3,-4,5,7,-9,0,2] == [[1,3],[5,7],[0,2]]
endBy :: Eq a => [a] -> [a] -> [[a]]Source
Split into chunks terminated by the given subsequence.
Equivalent to
. For example:
split
. dropFinalBlank
. dropDelims
. onSublist
endBy ";" "foo;bar;baz;" == ["foo","bar","baz"]
Note also that the lines
function from Data.List is equivalent
to
.
endBy
"\n"
endByOneOf :: Eq a => [a] -> [a] -> [[a]]Source
Split into chunks terminated by one of the given elements.
Equivalent to
. For example:
split
. dropFinalBlank
. dropDelims
. oneOf
endByOneOf ";," "foo;bar,baz;" == ["foo","bar","baz"]
wordsBy :: (a -> Bool) -> [a] -> [[a]]Source
Split into "words", with word boundaries indicated by the given
predicate. Satisfies
; equivalent to words
=== wordsBy
isSpace
. For example:
split
. dropBlanks
. dropDelims
. whenElt
wordsBy (=='x') "dogxxxcatxbirdxx" == ["dog","cat","bird"]
linesBy :: (a -> Bool) -> [a] -> [[a]]Source
Split into "lines", with line boundaries indicated by the given
predicate. Satisfies
; equivalent to
lines
=== linesBy (=='\n')
. For example:
split
. dropFinalBlank
. dropDelims
. whenElt
linesBy (=='x') "dogxxxcatxbirdxx" == ["dog","","","cat","bird",""]
Other splitting methods
build :: ((a -> [a] -> [a]) -> [a] -> [a]) -> [a]Source
Standard build function, specialized to building lists.
Usually build is given the rank-2 type
build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
but since we only use it when (b ~ [a])
, we give it the more
restricted type signature in order to avoid needing a
non-Haskell2010 extension.
Note that the 0.1.4.3 release of this package did away with a
custom build
implementation in favor of importing one from
GHC.Exts, which was (reportedly) faster for some applications.
However, in the interest of simplicity and complete Haskell2010
compliance as split
is being included in the Haskel Platform,
version 0.2.1.0 has gone back to defining build
manually. This
is in line with split
's design philosophy of having efficiency
as a non-goal.
chunksOf :: Int -> [e] -> [[e]]Source
splits a list into length-n pieces. The last
piece will be shorter if chunksOf
nn
does not evenly divide the length of
the list. If n <= 0
,
returns an infinite list
of empty lists. For example:
chunksOf
n l
Note that
is chunksOf
n [][]
, not [[]]
. This is
intentional, and is consistent with a recursive definition of
chunksOf
; it satisfies the property that
chunksOf n xs ++ chunksOf n ys == chunksOf n (xs ++ ys)
whenever n
evenly divides the length of xs
.
splitPlaces :: Integral a => [a] -> [e] -> [[e]]Source
Split a list into chunks of the given lengths. For example:
splitPlaces [2,3,4] [1..20] == [[1,2],[3,4,5],[6,7,8,9]] splitPlaces [4,9] [1..10] == [[1,2,3,4],[5,6,7,8,9,10]] splitPlaces [4,9,3] [1..10] == [[1,2,3,4],[5,6,7,8,9,10]]
If the input list is longer than the total of the given lengths, then the remaining elements are dropped. If the list is shorter than the total of the given lengths, then the result may contain fewer chunks than requested, and the last chunk may be shorter than requested.
splitPlacesBlanks :: Integral a => [a] -> [e] -> [[e]]Source
Split a list into chunks of the given lengths. Unlike
splitPlaces
, the output list will always be the same length as
the first input argument. If the input list is longer than the
total of the given lengths, then the remaining elements are
dropped. If the list is shorter than the total of the given
lengths, then the last several chunks will be shorter than
requested or empty. For example:
splitPlacesBlanks [2,3,4] [1..20] == [[1,2],[3,4,5],[6,7,8,9]] splitPlacesBlanks [4,9] [1..10] == [[1,2,3,4],[5,6,7,8,9,10]] splitPlacesBlanks [4,9,3] [1..10] == [[1,2,3,4],[5,6,7,8,9,10],[]]
Notice the empty list in the output of the third example, which
differs from the behavior of splitPlaces
.
chop :: ([a] -> (b, [a])) -> [a] -> [b]Source
A useful recursion pattern for processing a list to produce a new list, often used for "chopping" up the input list. Typically chop is called with some function that will consume an initial prefix of the list and produce a value and the rest of the list.
For example, many common Prelude functions can be implemented in
terms of chop
:
group :: (Eq a) => [a] -> [[a]] group = chop (\ xs@(x:_) -> span (==x) xs) words :: String -> [String] words = filter (not . null) . chop (span (not . isSpace) . dropWhile isSpace)