Safe Haskell | Safe-Inferred |
---|---|
Language | Haskell2010 |
PDF document page
Synopsis
- data Page
- pageParentNode :: Page -> IO PageNode
- pageContents :: Page -> IO [Ref]
- pageMediaBox :: Page -> IO (Rectangle Double)
- pageFontDicts :: Page -> IO [(Name, FontDict)]
- pageExtractText :: Page -> IO Text
- pageExtractGlyphs :: Page -> IO [Span]
- glyphsToText :: [Span] -> Text
Documentation
pageExtractText :: Page -> IO Text
Extract text from the page.
It tries to insert spaces between characters when they are not present as actual characters in the content stream.
glyphsToText :: [Span] -> Text
Convert glyphs to text, trying to add spaces and newlines.
It takes a list of spans. Each span is a list of glyphs that are output in one shot, so spaces need to be added only between spans, never inside one.
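The span-joining idea can be illustrated with a minimal, self-contained sketch. It is not the library's implementation. The `Glyph` record, the `Span` alias, the thresholds `spaceGap` and `lineTolerance`, and the function `spansToText` are all hypothetical stand-ins invented for this example. The real `Span` type and the heuristics used by `glyphsToText` may differ.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical, simplified glyph: NOT the library's type.
data Glyph = Glyph
  { glyphText  :: Text    -- decoded text of the glyph
  , glyphLeft  :: Double  -- x coordinate of the left edge
  , glyphRight :: Double  -- x coordinate of the right edge
  , glyphBase  :: Double  -- y coordinate of the baseline
  }

-- Hypothetical span: glyphs emitted together by one text-showing operation.
type Span = [Glyph]

-- Assumed thresholds, in the same units as the glyph coordinates.
spaceGap, lineTolerance :: Double
spaceGap = 1.0        -- a horizontal gap wider than this becomes a space
lineTolerance = 2.0   -- a baseline shift larger than this becomes a newline

-- Join spans into text. Glyphs inside a span are concatenated as-is;
-- a separator is considered only at the boundary between two spans.
spansToText :: [Span] -> Text
spansToText = T.concat . go Nothing
  where
    go _ [] = []
    go prev ([] : rest) = go prev rest            -- skip empty spans
    go prev (s@(g : _) : rest) =
      separator prev g : T.concat (map glyphText s) : go (Just (last s)) rest

-- Decide what to put between the last glyph of the previous span
-- and the first glyph of the current one.
separator :: Maybe Glyph -> Glyph -> Text
separator Nothing _ = T.empty                     -- nothing before the first span
separator (Just p) g
  | abs (glyphBase g - glyphBase p) > lineTolerance = T.pack "\n"
  | glyphLeft g - glyphRight p > spaceGap           = T.pack " "
  | otherwise                                       = T.empty
```

With these assumed thresholds, a span holding the glyphs "He" and "llo", followed by a span holding "world" placed 5 units further right on the same baseline, yields "Hello world". The glyphs inside the first span are never separated, whatever their positions. Presumably `pageExtractText` behaves much like applying `glyphsToText` to the result of `pageExtractGlyphs`, but that relationship is an assumption, not something stated on this page.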