string-interpolate: Haskell string interpolation that just works

Unicode-aware string interpolation that handles all textual types.

See the README at https://gitlab.com/williamyaoh/string-interpolate.git#string-interpolate for more info.



Properties

Versions: 0.0.1.0, 0.1.0.0, 0.1.0.1, 0.2.0.0, 0.2.0.1, 0.2.0.2, 0.2.0.3, 0.2.1.0, 0.3.0.0, 0.3.0.1, 0.3.0.2, 0.3.1.0, 0.3.1.1, 0.3.1.2, 0.3.2.0, 0.3.2.1, 0.3.3.0, 0.3.4.0
Change log: CHANGELOG.md
Dependencies: base (>=4 && <5), bytestring, haskell-src-meta, template-haskell, text, text-conversions, utf8-string
License: BSD-3-Clause
Copyright: 2019 William Yao
Author: William Yao
Maintainer: williamyaoh@gmail.com
Category: Data, Text
Uploaded: by williamyaoh at 2019-03-10T20:23:17Z

Readme for string-interpolate-0.0.1.0

string-interpolate

Haskell has five textual types in common use (String, strict and lazy Text, strict and lazy ByteString), so any kind of string manipulation becomes a complicated game of type tetris with constant conversion back and forth. What if string handling were as simple and easy as it is in literally any other language?

Behold:

showWelcomeMessage :: Text -> Integer -> Text
showWelcomeMessage username visits =
  [i|Welcome to my website, #{username}! You are visitor #{visits}!|]

No more needing to mconcat, mappend, and (<>) to glue strings together. No more having to remember a gajillion different functions for converting between strict and lazy versions of Text, or having to worry about encoding between Text <=> ByteString. No more getting bitten by trying to work with Unicode ByteStrings. It just works!

string-interpolate provides a quasiquoter, i, that allows you to interpolate expressions directly into your string. It can produce anything that is an instance of IsString, and can interpolate anything which is an instance of Show.

Unicode handling

string-interpolate handles converting to/from Unicode when converting String/Text to ByteString and vice versa. Lots of libraries use ByteString to represent human-readable text, even though this is not safe. There are lots of useful libraries in the ecosystem that are unfortunately annoying to work with because of the need to generate ByteStrings containing application-specific info. Insisting on explicitly converting to/from UTF-8 in these cases and handling decoding failures adds lots of syntactic noise, when often you can reasonably assume that a given ByteString will, 95% of the time, contain Unicode text. So string-interpolate aims to provide reasonable defaults around conversion between ByteString and real textual types so that developers don't need to constantly be aware of text encodings.

When converting a String/Text to a ByteString, string-interpolate will automatically encode it as a sequence of UTF-8 bytes. When converting a ByteString to String/Text, string-interpolate will assume that the ByteString contains a UTF-8 string, and decode the bytes accordingly. Any invalid byte sequences in the ByteString will be converted to the Unicode replacement character � (U+FFFD).
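
To make the behaviour described above concrete, here is a minimal sketch using the text package directly. It is not string-interpolate's actual implementation, just an illustration of UTF-8 encoding plus lenient decoding; the names toUtf8Bytes and fromUtf8Bytes are made up for this example.

```haskell
import qualified Data.ByteString as B
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With, encodeUtf8)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical helper: Text -> ByteString, encoded as UTF-8 bytes.
toUtf8Bytes :: Text -> B.ByteString
toUtf8Bytes = encodeUtf8

-- Hypothetical helper: ByteString -> Text, assuming the bytes are UTF-8.
-- lenientDecode substitutes U+FFFD for input that is not valid UTF-8
-- instead of throwing an exception.
fromUtf8Bytes :: B.ByteString -> Text
fromUtf8Bytes = decodeUtf8With lenientDecode
```

With these definitions, toUtf8Bytes (T.pack "\xE9") gives the two bytes 0xC3 0xA9, and fromUtf8Bytes (B.pack [0x68, 0x69, 0xFF]) gives the text hi followed by U+FFFD, since 0xFF can never appear in valid UTF-8.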

Remember: string-interpolate is not designed for 100% correctness around text encodings, just for convenience in the most common case. If you absolutely need to be aware of text encodings and to handle decode failures, take a look at text-conversions.

Usage

First things first: add string-interpolate to your dependencies:

dependencies:
  - string-interpolate

Then enable -XQuasiQuotes and import the quasiquoter:

{-# LANGUAGE QuasiQuotes #-}

import Data.String.Interpolate ( i )

Wrap anything you want to be interpolated with #{}:

λ> import Data.Time
λ> now <- getCurrentTime
λ> [i|The current time is #{now}.|] :: String
>>> "The current time is 2019-03-10 18:58:40.573892546 UTC."

string-interpolate must know what concrete type it's producing; it cannot be used to generate an IsString a => a. If you're using string-interpolate from GHCi, make sure to add type signatures to top-level usages!
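
In a source file, the usual way to pin down the concrete type is a signature on the enclosing binding. A small sketch, assuming string-interpolate and text are in your dependencies (the function names here are invented for illustration):

```haskell
{-# LANGUAGE QuasiQuotes #-}

import Data.String.Interpolate ( i )
import Data.Text ( Text )

-- The signature fixes the quasiquote's result type to String.
greetingString :: Int -> String
greetingString n = [i|You have #{n} new messages.|]

-- Same template, but the signature asks for strict Text instead.
greetingText :: Int -> Text
greetingText n = [i|You have #{n} new messages.|]
```

Both functions use the same template; only the signature decides which textual type comes out.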

You can also interpolate arbitrary expressions:

λ> [i|Tomorrow's date is #{addDays 1 $ utctDay now}.|] :: String
>>> "Tomorrow's date is 2019-03-11."

Backslashes are handled exactly the same way they are in normal Haskell strings. If you need to put a literal #{ into your string, prefix the pound symbol with a backslash:

λ> [i|\#{ some inner text }#|] :: String
>>> "#{ some inner text }#"

Comparison to other interpolation libraries

Some other interpolation libraries available include Text.Printf, neat-interpolation, interpolate, and formatting.

Of these, Text.Printf isn't exception-safe, and neat-interpolation can only produce Text values. interpolate and formatting solve the same problem as string-interpolate: providing a general way of interpolating any value into any kind of text.

Features

Libraries compared: string-interpolate, interpolate, formatting
String/Text support
ByteString support
Can interpolate arbitrary Show instances
Unicode-aware

Performance

Overall: string-interpolate seems to be on par with or faster than existing interpolation libraries for all text types tested.

Testing on my local machine, creating small (<100 character) Strings shows string-interpolate as on par with interpolate and significantly (> 20x) faster than formatting:

benchmarking Small Strings Bench/string-interpolate
time                 8.548 ns   (8.543 ns .. 8.554 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 8.516 ns   (8.504 ns .. 8.527 ns)
std dev              35.90 ps   (29.23 ps .. 43.63 ps)

benchmarking Small Strings Bench/interpolate
time                 9.047 ns   (9.042 ns .. 9.052 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 9.028 ns   (9.019 ns .. 9.036 ns)
std dev              27.06 ps   (21.88 ps .. 34.20 ps)

benchmarking Small Strings Bench/formatting
time                 222.3 ns   (221.7 ns .. 223.1 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 222.1 ns   (221.6 ns .. 222.8 ns)
std dev              1.902 ns   (1.530 ns .. 2.446 ns)

I suspect the poor performance of formatting is caused by using (++) to concatenate Strings instead of ShowS.
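
For readers unfamiliar with the distinction, here is a rough sketch of the two concatenation styles using only base. It does not reproduce what formatting or string-interpolate actually do internally; concatAppend and concatShowS are hypothetical names used only to show why the choice can matter.

```haskell
-- Left-nested (++): every append re-traverses the whole string built so
-- far, so joining many pieces costs quadratic time overall.
concatAppend :: [String] -> String
concatAppend = foldl (++) ""

-- ShowS is a synonym for String -> String. showString piece prepends
-- piece to whatever comes after it, so composing the functions and
-- applying the result to "" walks each piece only once.
concatShowS :: [String] -> String
concatShowS pieces = foldr (.) id (map showString pieces) ""
```

Both functions return the same String; the difference is only in how much work is repeated as the number of pieces grows.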

Doing the same test but generating small Text and ByteString values shows string-interpolate to be on par with formatting (for Text) and significantly faster than interpolate (roughly 25x for Text and 15x for ByteString).

For Text:

benchmarking Small Text Bench/string-interpolate
time                 173.7 ns   (173.1 ns .. 174.1 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 173.1 ns   (172.8 ns .. 173.5 ns)
std dev              1.230 ns   (1.042 ns .. 1.486 ns)

benchmarking Small Text Bench/interpolate
time                 4.398 μs   (4.371 μs .. 4.425 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.403 μs   (4.389 μs .. 4.421 μs)
std dev              54.07 ns   (41.42 ns .. 80.40 ns)

benchmarking Small Text Bench/formatting
time                 243.1 ns   (242.5 ns .. 243.8 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 242.8 ns   (242.4 ns .. 243.4 ns)
std dev              1.665 ns   (1.443 ns .. 1.982 ns)

For ByteString (formatting doesn't support ByteStrings):

benchmarking Small ByteString Bench/string-interpolate
time                 297.6 ns   (295.8 ns .. 299.8 ns)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 299.8 ns   (298.0 ns .. 302.3 ns)
std dev              7.176 ns   (5.417 ns .. 9.651 ns)
variance introduced by outliers: 33% (moderately inflated)

benchmarking Small ByteString Bench/interpolate
time                 4.389 μs   (4.352 μs .. 4.424 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 4.374 μs   (4.350 μs .. 4.398 μs)
std dev              79.48 ns   (66.66 ns .. 96.14 ns)
variance introduced by outliers: 18% (moderately inflated)

Here interpolate is the poor performer. The performance loss arises because it converts all values to String before using fromString to convert them to the target type.
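
As an illustration of that difference in approach, the sketch below builds the same message two ways. It is not interpolate's or string-interpolate's real code, and viaString and direct are names invented for this example.

```haskell
import Data.String (IsString, fromString)
import Data.Text (Text)
import qualified Data.Text as T

-- Route everything through String: the Text argument is unpacked into a
-- linked list of Chars, glued together, then converted once at the end.
viaString :: IsString s => Text -> Int -> s
viaString name n =
  fromString ("Hello, " ++ T.unpack name ++ "! Visit #" ++ show n)

-- Stay in the target type: the Text argument is never unpacked; only the
-- Int passes through show.
direct :: Text -> Int -> Text
direct name n =
  T.concat [T.pack "Hello, ", name, T.pack "! Visit #", T.pack (show n)]
```

The two produce identical Text; the first simply does an extra round trip through String for every interpolated value.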