Copyright | Copyright (C) 2004-2008 John Goerzen |
---|---|
License | Either LGPL or BSD3, as specified in the COPYRIGHT file. |
Maintainer | John Goerzen <jgoerzen@complete.org> |
Stability | provisional |
Portability | portable |
Safe Haskell | None |
Language | Haskell98 |
Configuration file parsing, generation, and manipulation
Copyright (c) 2004-2008 John Goerzen, jgoerzen@complete.org
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
This module contains extensive documentation. Please scroll down to the Introduction section to continue reading.
- type SectionSpec = String
- type OptionSpec = String
- data ConfigParser = ConfigParser {
- content :: CPData
- optionxform :: OptionSpec -> OptionSpec
- defaulthandler :: ConfigParser -> SectionSpec -> OptionSpec -> Either CPError String
- usedefault :: Bool
- accessfunc :: ConfigParser -> SectionSpec -> OptionSpec -> Either CPError String
- data CPErrorData
- type CPError = (CPErrorData, String)
- emptyCP :: ConfigParser
- simpleAccess :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m String
- interpolatingAccess :: MonadError CPError m => Int -> ConfigParser -> SectionSpec -> OptionSpec -> m String
- readfile :: MonadError CPError m => ConfigParser -> FilePath -> IO (m ConfigParser)
- readhandle :: MonadError CPError m => ConfigParser -> Handle -> IO (m ConfigParser)
- readstring :: MonadError CPError m => ConfigParser -> String -> m ConfigParser
- class Get_C a where
- get :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m a
- sections :: ConfigParser -> [SectionSpec]
- has_section :: ConfigParser -> SectionSpec -> Bool
- options :: MonadError CPError m => ConfigParser -> SectionSpec -> m [OptionSpec]
- has_option :: ConfigParser -> SectionSpec -> OptionSpec -> Bool
- items :: MonadError CPError m => ConfigParser -> SectionSpec -> m [(OptionSpec, String)]
- set :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> String -> m ConfigParser
- setshow :: (Show a, MonadError CPError m) => ConfigParser -> SectionSpec -> OptionSpec -> a -> m ConfigParser
- remove_option :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m ConfigParser
- add_section :: MonadError CPError m => ConfigParser -> SectionSpec -> m ConfigParser
- remove_section :: MonadError CPError m => ConfigParser -> SectionSpec -> m ConfigParser
- merge :: ConfigParser -> ConfigParser -> ConfigParser
- to_string :: ConfigParser -> String
Introduction
Many programs need configuration files. These configuration files are typically used to configure certain runtime behaviors that need to be saved across sessions. Various different configuration file formats exist.
The ConfigParser module attempts to define a standard format that is easy for the user to edit, easy for the programmer to work with, yet remains powerful and flexible.
Features
For the programmer, this module provides:
- Simple calls to both read and write configuration files
- Call that can generate a string version of a file that is re-parsable by this module (useful for, for instance, sending the file down a network)
- Segmented configuration files that let you separate configuration into distinct sections, each with its own namespace. This can be used to configure multiple modules in one file, to configure multiple instances of a single object, etc.
- On-the-fly parsing of integer, boolean, float, multi-line string values, and anything else Haskell's read can deal with
- It is possible to make a configuration file parsable by this module, the Unix shell, and/or Unix make, though some feautres are, of course, not compatible with these other tools.
- Syntax checking with error reporting including line numbers
- Implemented in pure Haskell. No dependencies on modules outside the standard library distributed with Haskell compilers or interpreters. All calls except those that read directly from a handle are pure calls and can be used outside the IO monad.
- Comprehensive documentation
- Extensible API
- Complete compatibility with Python's ConfigParser module, or my ConfigParser module for OCaml, part of my MissingLib package.
For the user, this module provides:
- Easily human-editable configuration files with a clear, concise, and consistent format
- Configuration file format consistent with other familiar formats (/etc/passwd is a valid ConfigParser file)
- No need to understand semantics of markup languages like XML
History
This module is based on Python's ConfigParser module at http://www.python.org/doc/current/lib/module-ConfigParser.html. I had earlier developed an OCaml implementation as part of my MissingLib library at gopher://gopher.quux.org/devel/missinglib.
While the API of these three modules is similar, and the aim is to preserve all useful features of the original Python module, there are some differences in the implementation details. This module is a complete, clean re-implementation in Haskell, not a Haskell translation of a Python program. As such, the feature set is slightly different.
Configuration File Format
The basic configuration file format resembles that of an old-style Windows .INI file. Here are two samples:
debug = yes inputfile = /etc/passwd names = Peter, Paul, Mary, George, Abrahaham, John, Bill, Gerald, Richard, Franklin, Woodrow color = red
This defines a file without any explicit section, so all items will
occur within the default section DEFAULT
. The debug
option can be read
as a boolean or a string. The remaining items can be read as a string
only. The names
entry spans two lines -- any line starting with
whitespace, and containing something other than whitespace or
comments, is taken as a continuation of the previous line.
Here's another example:
# Default options [DEFAULT] hostname: localhost # Options for the first file [file1] location: /usr/local user: Fred uid: 1000 optionaltext: Hello, this entire string is included [file2] location: /opt user: Fred uid: 1001
This file defines three sections. The DEFAULT
section specifies an
entry hostname
. If you attempt to read the hostname option in any
section, and that section doesn't define hostname
, you will get the
value from DEFAULT
instead. This is a nice time-saver. You can also
note that you can use colons instead of the = character to separate
option names from option entries.
White Space
Whitespace (spaces, tabs, etc) is automatically stripped from the beginning and end of all strings. Thus, users can insert whitespace before/after the colon or equal sign if they like, and it will be automatically stripped.
Blank lines or lines consisting solely of whitespace are ignored.
A line giving an option or a section name may not begin with white space. This requirement is necessary so there is no ambiguity between such lines and continuation lines for multi-line options.
Comments
Comments are introduced with the pound sign #
or the semicolon ;
. They
cause the parser to ignore everything from that character to the end
of the line.
Comments may not occur within the definitions of options; that is, you
may not place a comment in the middle of a line such as user: Fred
.
That is because the parser considers the comment characters part
of the string; otherwise, you'd be unable to use those characters in
your strings. You can, however, "comment out" options by putting the
comment character at the start of the line.
Case Sensitivity
By default, section names are case-sensitive but option names are
not. The latter can be adjusted by adjusting optionxform
.
Interpolation
Interpolation is an optional feature, disabled by default. If you replace
the default accessfunc
(simpleAccess
) with interpolatingAccess
,
then you get interpolation support with get
and the other get
-based functions.
As an example, consider the following file:
arch = i386 project = test filename = test_%(arch)s.c dir = /usr/src/%(filename)s percent = 5%%
With interpolation, you would get these results:
get cp "DEFAULT" "filename" -> "test_i386.c" get cp "DEFAULT" "dir" -> "/usr/src/test_i386.c" get cp "DEFAULT" "percent" -> "5%"
For more details on interpolation, please see the documentation for the
interpolatingAccess
function.
Usage Examples
The basic theory of working with ConfigParser is this:
- Parse or build a
ConfigParser
object - Work with it in one of several ways
- To make changes, you discard the original object and use a new one. Changes can be "chained" through one of several monads.
The default ConfigParser
object that you always start with is emptyCP
.
From here, you load data into it (merging data into the empty object),
set up structures yourself, or adjust options.
Let's take a look at some basic use cases.
Non-Monadic Usage
You'll notice that many functions in this module return a
MonadError
over some
type. Although its definition is not this simple, you can consider this to be
the same as returning CPError
Either CPError a
.
That is, these functions will return Left error
if there's a problem
or Right result
if things are fine. The documentation for individual
functions describes the specific circumstances in which an error may occur in
more detail.
Some people find it annoying to have to deal with errors manually.
You can transform errors into exceptions in your code by using
forceEither
. Here's an example of this style of programming:
import Data.Either.Utils do val <- readfile emptyCP "/etc/foo.cfg" let cp = forceEither val putStrLn "Your setting is:" putStrLn $ forceEither $ get cp "sect1" "opt1"
In short, you can just put forceEither $
in front of every call that returns
something that is a MonadError.
This is still a pure functional call, so it can be used outside
of the IO monads. The exception, however, can only be caught in the IO
monad.
If you don't want to bother with forceEither
, you can use the error monad. It's simple and better... read on.
Error Monad Usage
The return type is actually defined in terms of the Error monad, which is itself based on the Either data type.
Here's a neat example of chaining together calls to build up a ConfigParser
object:
do let cp = emptyCP cp <- add_section cp "sect1" cp <- set cp "sect1" "opt1" "foo" cp <- set cp "sect1" "opt2" "bar" options cp "sect1"
The return value of this little snippet is Right ["opt1", "opt2"]
.
(Note to beginners: unlike the IO monad, you can escape from the Error
monad.)
Although it's not obvious, there actually was error checking there. If
any of those calls would have generated an error, processing would have
stopped immediately and a Left
value would have been returned. Consider
this example:
do let cp = emptyCP cp <- add_section cp "sect1" cp <- set cp "sect1" "opt1" "foo" cp <- set cp "sect2" "opt2" "bar" options cp "sect1"
The return value from this is Left (
. The
second call to NoSection
"sect2", "set")set
failed, so the final call was skipped, and the result
of the entire computation was considered to be an error.
You can combine this with the non-monadic style to get a final, pure value out of it:
forceEither $ do let cp = emptyCP cp <- add_section cp "sect1" cp <- set cp "sect1" "opt1" "foo" cp <- set cp "sect1" "opt2" "bar" options cp "sect1"
This returns ["opt1", "opt2"]
. A quite normal value.
Combined Error/IO Monad Usage
You've seen a nice way to use this module in the Error monad and get an Either value out. But that's the Error monad, so IO is not permitted. Using Haskell's monad transformers, you can run it in the combined Error/IO monad. That is, you will get an IO result back. Here is a full standalone example of doing that:
import Data.ConfigFile import Control.Monad.Error main = do rv <- runErrorT $ do cp <- join $ liftIO $ readfile emptyCP "/etc/passwd" let x = cp liftIO $ putStrLn "In the test" nb <- get x "DEFAULT" "nobody" liftIO $ putStrLn nb foo <- get x "DEFAULT" "foo" liftIO $ putStrLn foo return "done" print rv
On my system, this prints:
In the test x:65534:65534:nobody:/nonexistent:/bin/sh Left (NoOption "foo","get")
That is, my /etc/passwd
file contains a nobody
user but not a foo
user.
Let's look at how that works.
First, main
always runs in the IO monad only, so we take the result from
the later calls and put it in rv
. Note that the combined block
is started with runErrorT $ do
instead of just do
.
To get something out of the call to readfile
, we use
join $ liftIO $ readfile
. This will bring the result out of the IO monad
into the combined monad and process it like usual. From here on,
everything looks normal, except for IO calls. They are all executed under
liftIO
so that the result value is properly brought into the combined
monad. This finally returns "done"
. Since we are in the Error monad, that means that the literal value is Right "done"
. Since we are also in the IO
monad, this is wrapped in IO. So the final return type after applying
runErrorT
is IO (Either CPError String)
.
In this case, there was an error, and processing stopped at that point just
like the example of the pure Error monad. We print out the return value,
so you see the error displayed as a Left
value.
It all works quite easily.
Types
The code used to say this:
type CPResult a = MonadError CPError m => m a simpleAccess :: ConfigParser -> SectionSpec -> OptionSpec -> CPResult String
But Hugs did not support that type declaration. Therefore, types are now given like this:
simpleAccess :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m String
Although it looks more confusing than before, it still means the same.
The return value can still be treated as Either CPError String
if you so
desire.
type SectionSpec = String Source
Names of sections
type OptionSpec = String Source
Names of options
data ConfigParser Source
This is the main record that is used by ConfigFile
.
ConfigParser | |
|
data CPErrorData Source
Possible ConfigParser errors.
ParseError String | Parse error |
SectionAlreadyExists SectionSpec | Attempt to create an already-existing ection |
NoSection SectionSpec | The section does not exist |
NoOption OptionSpec | The option does not exist |
OtherProblem String | Miscellaneous error |
InterpolationError String | Raised by |
type CPError = (CPErrorData, String) Source
Indicates an error occurred. The String is an explanation of the location of the error.
Initialization
emptyCP :: ConfigParser Source
The default empty ConfigFile
object.
The content contains only an empty mandatory DEFAULT
section.
optionxform
is set to map toLower
.
usedefault
is set to True
.
accessfunc
is set to simpleAccess
.
Configuring the ConfigParser
You may notice that the ConfigParser
object has some configurable parameters,
such as usedefault
. In case you're not familiar with the Haskell syntax
for working with these, you can use syntax like this to set these options:
let cp2 = cp { usedefault = False }
This will create a new ConfigParser
that is the same as cp
except for
the usedefault
field, which is now always False. The new object will be
called cp2
in this example.
Access Functions
simpleAccess :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m String Source
Default (non-interpolating) access function
interpolatingAccess :: MonadError CPError m => Int -> ConfigParser -> SectionSpec -> OptionSpec -> m String Source
Interpolating access function. Please see the Interpolation section above for a background on interpolation.
Although the format string looks similar to one used by Text.Printf, it is not the same. In particular, only the %(...)s format is supported. No width specifiers are supported and no conversions other than s are supported.
To use this function, you must specify a maximum recursion depth for interpolation. This is used to prevent a stack overflow in the event that the configuration file contains an endless interpolation loop. Values of 10 or so are usually more than enough, though you could probably go into the hundreds or thousands before you have actual problems.
A value less than one will cause an instant error every time you attempt a lookup.
This access method can cause get
and friends to return a new CPError
:
InterpolationError
. This error would be returned when:
- The configuration file makes a reference to an option that does not exist
- The maximum interpolation depth is exceeded
- There is a syntax error processing a %-directive in the configuration file
An interpolation lookup name specifies an option only. There is no provision
to specify a section. Interpolation variables are looked up in the current
section, and, if usedefault
is True, in DEFAULT
according to the normal
logic.
To use a literal percent sign, you must place %%
in the configuration
file when interpolation is used.
Here is how you might enable interpolation:
let cp2 = cp {accessfunc = interpolatingAccess 10}
The cp2
object will now support interpolation with a maximum depth of 10.
Reading
You can use these functions to read data from a file.
A common idiom for loading a new object from stratch is:
cp <-readfile
emptyCP
"/etc/foo.cfg"
Note the use of emptyCP
; this will essentially cause the file's data
to be merged with the empty ConfigParser
.
readfile :: MonadError CPError m => ConfigParser -> FilePath -> IO (m ConfigParser) Source
Loads data from the specified file. It is then combined with the
given ConfigParser
using the semantics documented under merge
with the
new data taking precedence over the old. However, unlike
merge
, all the options
as set in the old object are preserved since the on-disk representation
does not convey those options.
May return an error if there is a syntax error. May raise an exception if the file could not be accessed.
readhandle :: MonadError CPError m => ConfigParser -> Handle -> IO (m ConfigParser) Source
readstring :: MonadError CPError m => ConfigParser -> String -> m ConfigParser Source
Accessing Data
The class representing the data types that can be returned by "get".
get :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m a Source
Retrieves a string from the configuration file.
When used in a context where a String is expected, returns that string verbatim.
When used in a context where a Bool is expected, parses the string to a Boolean value (see logic below).
When used in a context where anything that is an instance of Read is expected, calls read to parse the item.
An error will be returned of no such option could be found or if it could not be parsed as a boolean (when returning a Bool).
When parsing to a Bool, strings are case-insentively converted as follows:
The following will produce a True value:
- 1
- yes
- on
- enabled
- true
The following will produce a False value:
- 0
- no
- off
- disabled
- false
sections :: ConfigParser -> [SectionSpec] Source
Returns a list of sections in your configuration file. Never includes
the always-present section DEFAULT
.
has_section :: ConfigParser -> SectionSpec -> Bool Source
Indicates whether the given section exists.
No special DEFAULT
processing is done.
options :: MonadError CPError m => ConfigParser -> SectionSpec -> m [OptionSpec] Source
Returns a list of the names of all the options present in the given section.
Returns an error if the given section does not exist.
has_option :: ConfigParser -> SectionSpec -> OptionSpec -> Bool Source
Indicates whether the given option is present. Returns True
only if the given section is present AND the given option is present
in that section. No special DEFAULT
processing is done. No
exception could be raised or error returned.
items :: MonadError CPError m => ConfigParser -> SectionSpec -> m [(OptionSpec, String)] Source
Returns a list of (optionname, value)
pairs representing the content
of the given section. Returns an error the section is invalid.
Modifying Data
set :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> String -> m ConfigParser Source
Sets the option to a new value, replacing an existing one if it exists.
Returns an error if the section does not exist.
setshow :: (Show a, MonadError CPError m) => ConfigParser -> SectionSpec -> OptionSpec -> a -> m ConfigParser Source
Sets the option to a new value, replacing an existing one if it exists. It requires only a showable value as its parameter. This can be used with bool values, as well as numeric ones.
Returns an error if the section does not exist.
remove_option :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m ConfigParser Source
Removes the specified option. Returns a NoSection
error if the
section does not exist and a NoOption
error if the option does not
exist. Otherwise, returns the new ConfigParser
object.
add_section :: MonadError CPError m => ConfigParser -> SectionSpec -> m ConfigParser Source
Adds the specified section name. Returns a
SectionAlreadyExists
error if the
section was already present. Otherwise, returns the new
ConfigParser
object.
remove_section :: MonadError CPError m => ConfigParser -> SectionSpec -> m ConfigParser Source
Removes the specified section. Returns a NoSection
error if
the section does not exist; otherwise, returns the new ConfigParser
object.
This call may not be used to remove the DEFAULT
section. Attempting to do
so will always cause a NoSection
error.
merge :: ConfigParser -> ConfigParser -> ConfigParser Source
Combines two ConfigParser
s into one.
Any duplicate options are resolved to contain the value specified in the second parser.
The ConfigParser
options in the resulting object will be set as they
are in the second one passed to this function.
Output Data
to_string :: ConfigParser -> String Source
Converts the ConfigParser
to a string representation that could be
later re-parsed by this module or modified by a human.
Note that this does not necessarily re-create a file that was originally loaded. Things may occur in a different order, comments will be removed, etc. The conversion makes an effort to make the result human-editable, but it does not make an effort to make the result identical to the original input.
The result is, however, guaranteed to parse the same as the original input.