fast-tagsoup: Fast parsing and extracting information from (possibly malformed) HTML/XML documents
Fast TagSoup parser. Speeds of 20-200MB/sec were observed.
Works only with strict bytestrings.
This library is intended to be used in conjunction with the original tagsoup package:

    import Text.HTML.TagSoup hiding (parseTags, renderTags)
    import Text.HTML.TagSoup.Fast
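A minimal sketch of how the two packages might be combined: fast-tagsoup supplies `parseTags` on strict bytestrings, while the tag predicates and accessors (`isTagOpenName`, `fromAttrib`) come from the original tagsoup. The helper name `links` and the sample markup are illustrative assumptions, not part of either package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
import Text.HTML.TagSoup hiding (parseTags, renderTags)
import Text.HTML.TagSoup.Fast (parseTags)

-- | Collect the href attribute of every <a> tag.
-- Hypothetical helper for illustration; not exported by any library.
links :: B.ByteString -> [B.ByteString]
links html =
  [ fromAttrib "href" t        -- accessor from the original tagsoup
  | t <- parseTags html        -- fast parser, strict ByteString input
  , isTagOpenName "a" t        -- predicate from the original tagsoup
  ]

main :: IO ()
main = mapM_ B.putStrLn
  (links "<p><a href=\"/one\">1</a> <a href=\"/two\">2</a></p>")
```

Hiding `parseTags` and `renderTags` from `Text.HTML.TagSoup` avoids a name clash, so the unqualified names refer to the fast versions.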
Besides speed, fast-tagsoup correctly handles HTML <script> and <style> tags, converts tags to lower case, and can decode non-UTF-8 XML for you.
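The features listed above can be tried out with a short sketch like the following. It assumes that `Text.HTML.TagSoup.Fast` exports `ensureUtf8Xml` with type `ByteString -> ByteString` (re-encoding XML to UTF-8 based on its declared encoding); check the module documentation for the installed version. The helper `parseFeed` and the sample inputs are hypothetical, and the exact tags printed are deliberately not shown.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Char8 as B
import Text.HTML.TagSoup (Tag)
import Text.HTML.TagSoup.Fast (ensureUtf8Xml, parseTags)

-- | Parse a feed that may declare a non-UTF-8 encoding.
-- Hypothetical helper; assumes ensureUtf8Xml :: ByteString -> ByteString.
parseFeed :: B.ByteString -> [Tag B.ByteString]
parseFeed = parseTags . ensureUtf8Xml

main :: IO ()
main = do
  -- Upper-case tag names and a '<' inside a script body:
  -- inspect how the parser reports them.
  mapM_ print (parseTags "<SCRIPT>if (a < b) run();</SCRIPT>")
  -- An XML document that declares a non-UTF-8 encoding.
  mapM_ print
    (parseFeed "<?xml version=\"1.0\" encoding=\"windows-1251\"?><title>ok</title>")
```

Running the conversion before parsing keeps the rest of the pipeline working with UTF-8 bytestrings only.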
This parser is used in production in the BazQux Reader feed and comment crawler.
- fast-tagsoup-1.0.14.tar.gz (Cabal source package)
Versions | 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.0.9, 1.0.10, 1.0.11, 1.0.12, 1.0.13, 1.0.14 |
Dependencies | base (>=4 && <5), bytestring, containers, tagsoup (>=0.13.10), text, text-icu |
License | BSD-3-Clause |
Copyright | Vladimir Shabanov 2011-2017 |
Author | Vladimir Shabanov |
Maintainer | Vladimir Shabanov |
Category | XML |
Source repo | head: git clone |
Uploaded | by VladimirShabanov at 2017-07-04T17:36:00Z |
Distributions | NixOS:1.0.14 |
Reverse Dependencies | 3 direct, 0 indirect |
Downloads | 11742 total (25 in the last 30 days) |
Status | Docs available; last successful build reported on 2017-07-04 |