ConfigFile-1.1.4: Configuration file reading & writing

CopyrightCopyright (C) 2004-2008 John Goerzen
LicenseEither LGPL or BSD3, as specified in the COPYRIGHT file.
MaintainerJohn Goerzen <jgoerzen@complete.org>
Stabilityprovisional
Portabilityportable
Safe HaskellNone
LanguageHaskell98

Data.ConfigFile

Contents

Description

Configuration file parsing, generation, and manipulation

Copyright (c) 2004-2008 John Goerzen, jgoerzen@complete.org

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

This module contains extensive documentation. Please scroll down to the Introduction section to continue reading.

Synopsis

Introduction

Many programs need configuration files. These configuration files are typically used to configure certain runtime behaviors that need to be saved across sessions. Various different configuration file formats exist.

The ConfigParser module attempts to define a standard format that is easy for the user to edit, easy for the programmer to work with, yet remains powerful and flexible.

Features

For the programmer, this module provides:

  • Simple calls to both read and write configuration files
  • Call that can generate a string version of a file that is re-parsable by this module (useful for, for instance, sending the file down a network)
  • Segmented configuration files that let you separate configuration into distinct sections, each with its own namespace. This can be used to configure multiple modules in one file, to configure multiple instances of a single object, etc.
  • On-the-fly parsing of integer, boolean, float, multi-line string values, and anything else Haskell's read can deal with
  • It is possible to make a configuration file parsable by this module, the Unix shell, and/or Unix make, though some feautres are, of course, not compatible with these other tools.
  • Syntax checking with error reporting including line numbers
  • Implemented in pure Haskell. No dependencies on modules outside the standard library distributed with Haskell compilers or interpreters. All calls except those that read directly from a handle are pure calls and can be used outside the IO monad.
  • Comprehensive documentation
  • Extensible API
  • Complete compatibility with Python's ConfigParser module, or my ConfigParser module for OCaml, part of my MissingLib package.

For the user, this module provides:

  • Easily human-editable configuration files with a clear, concise, and consistent format
  • Configuration file format consistent with other familiar formats (/etc/passwd is a valid ConfigParser file)
  • No need to understand semantics of markup languages like XML

History

This module is based on Python's ConfigParser module at http://www.python.org/doc/current/lib/module-ConfigParser.html. I had earlier developed an OCaml implementation as part of my MissingLib library at gopher://gopher.quux.org/devel/missinglib.

While the API of these three modules is similar, and the aim is to preserve all useful features of the original Python module, there are some differences in the implementation details. This module is a complete, clean re-implementation in Haskell, not a Haskell translation of a Python program. As such, the feature set is slightly different.

Configuration File Format

The basic configuration file format resembles that of an old-style Windows .INI file. Here are two samples:

debug = yes
inputfile = /etc/passwd
names = Peter, Paul, Mary, George, Abrahaham, John, Bill, Gerald, Richard,
        Franklin, Woodrow
color = red

This defines a file without any explicit section, so all items will occur within the default section DEFAULT. The debug option can be read as a boolean or a string. The remaining items can be read as a string only. The names entry spans two lines -- any line starting with whitespace, and containing something other than whitespace or comments, is taken as a continuation of the previous line.

Here's another example:

# Default options
[DEFAULT]
hostname: localhost
# Options for the first file
[file1]
location: /usr/local
user: Fred
uid: 1000
optionaltext: Hello, this  entire string is included
[file2]
location: /opt
user: Fred
uid: 1001

This file defines three sections. The DEFAULT section specifies an entry hostname. If you attempt to read the hostname option in any section, and that section doesn't define hostname, you will get the value from DEFAULT instead. This is a nice time-saver. You can also note that you can use colons instead of the = character to separate option names from option entries.

White Space

Whitespace (spaces, tabs, etc) is automatically stripped from the beginning and end of all strings. Thus, users can insert whitespace before/after the colon or equal sign if they like, and it will be automatically stripped.

Blank lines or lines consisting solely of whitespace are ignored.

A line giving an option or a section name may not begin with white space. This requirement is necessary so there is no ambiguity between such lines and continuation lines for multi-line options.

Comments

Comments are introduced with the pound sign # or the semicolon ;. They cause the parser to ignore everything from that character to the end of the line.

Comments may not occur within the definitions of options; that is, you may not place a comment in the middle of a line such as user: Fred. That is because the parser considers the comment characters part of the string; otherwise, you'd be unable to use those characters in your strings. You can, however, "comment out" options by putting the comment character at the start of the line.

Case Sensitivity

By default, section names are case-sensitive but option names are not. The latter can be adjusted by adjusting optionxform.

Interpolation

Interpolation is an optional feature, disabled by default. If you replace the default accessfunc (simpleAccess) with interpolatingAccess, then you get interpolation support with get and the other get-based functions.

As an example, consider the following file:

arch = i386
project = test
filename = test_%(arch)s.c
dir = /usr/src/%(filename)s
percent = 5%%

With interpolation, you would get these results:

get cp "DEFAULT" "filename" -> "test_i386.c"
get cp "DEFAULT" "dir" -> "/usr/src/test_i386.c"
get cp "DEFAULT" "percent" -> "5%"

For more details on interpolation, please see the documentation for the interpolatingAccess function.

Usage Examples

The basic theory of working with ConfigParser is this:

  1. Parse or build a ConfigParser object
  2. Work with it in one of several ways
  3. To make changes, you discard the original object and use a new one. Changes can be "chained" through one of several monads.

The default ConfigParser object that you always start with is emptyCP. From here, you load data into it (merging data into the empty object), set up structures yourself, or adjust options.

Let's take a look at some basic use cases.

Non-Monadic Usage

You'll notice that many functions in this module return a MonadError CPError over some type. Although its definition is not this simple, you can consider this to be the same as returning Either CPError a.

That is, these functions will return Left error if there's a problem or Right result if things are fine. The documentation for individual functions describes the specific circumstances in which an error may occur in more detail.

Some people find it annoying to have to deal with errors manually. You can transform errors into exceptions in your code by using forceEither. Here's an example of this style of programming:

import Data.Either.Utils
do
   val <- readfile emptyCP "/etc/foo.cfg"
   let cp = forceEither val
   putStrLn "Your setting is:"
   putStrLn $ forceEither $ get cp "sect1" "opt1"

In short, you can just put forceEither $ in front of every call that returns something that is a MonadError. This is still a pure functional call, so it can be used outside of the IO monads. The exception, however, can only be caught in the IO monad.

If you don't want to bother with forceEither, you can use the error monad. It's simple and better... read on.

Error Monad Usage

The return type is actually defined in terms of the Error monad, which is itself based on the Either data type.

Here's a neat example of chaining together calls to build up a ConfigParser object:

do let cp = emptyCP
   cp <- add_section cp "sect1"
   cp <- set cp "sect1" "opt1" "foo"
   cp <- set cp "sect1" "opt2" "bar"
   options cp "sect1"

The return value of this little snippet is Right ["opt1", "opt2"]. (Note to beginners: unlike the IO monad, you can escape from the Error monad.)

Although it's not obvious, there actually was error checking there. If any of those calls would have generated an error, processing would have stopped immediately and a Left value would have been returned. Consider this example:

do let cp = emptyCP
   cp <- add_section cp "sect1"
   cp <- set cp "sect1" "opt1" "foo"
   cp <- set cp "sect2" "opt2" "bar"
   options cp "sect1"

The return value from this is Left (NoSection "sect2", "set"). The second call to set failed, so the final call was skipped, and the result of the entire computation was considered to be an error.

You can combine this with the non-monadic style to get a final, pure value out of it:

forceEither $ do let cp = emptyCP
                 cp <- add_section cp "sect1"
                 cp <- set cp "sect1" "opt1" "foo"
                 cp <- set cp "sect1" "opt2" "bar"
                 options cp "sect1"

This returns ["opt1", "opt2"]. A quite normal value.

Combined Error/IO Monad Usage

You've seen a nice way to use this module in the Error monad and get an Either value out. But that's the Error monad, so IO is not permitted. Using Haskell's monad transformers, you can run it in the combined Error/IO monad. That is, you will get an IO result back. Here is a full standalone example of doing that:

import Data.ConfigFile
import Control.Monad.Error

main = do
          rv <- runErrorT $
              do
              cp <- join $ liftIO $ readfile emptyCP "/etc/passwd"
              let x = cp
              liftIO $ putStrLn "In the test"
              nb <- get x "DEFAULT" "nobody"
              liftIO $ putStrLn nb
              foo <- get x "DEFAULT" "foo"
              liftIO $ putStrLn foo
              return "done"
          print rv

On my system, this prints:

In the test
x:65534:65534:nobody:/nonexistent:/bin/sh
Left (NoOption "foo","get")

That is, my /etc/passwd file contains a nobody user but not a foo user.

Let's look at how that works.

First, main always runs in the IO monad only, so we take the result from the later calls and put it in rv. Note that the combined block is started with runErrorT $ do instead of just do.

To get something out of the call to readfile, we use join $ liftIO $ readfile. This will bring the result out of the IO monad into the combined monad and process it like usual. From here on, everything looks normal, except for IO calls. They are all executed under liftIO so that the result value is properly brought into the combined monad. This finally returns "done". Since we are in the Error monad, that means that the literal value is Right "done". Since we are also in the IO monad, this is wrapped in IO. So the final return type after applying runErrorT is IO (Either CPError String).

In this case, there was an error, and processing stopped at that point just like the example of the pure Error monad. We print out the return value, so you see the error displayed as a Left value.

It all works quite easily.

Types

The code used to say this:

type CPResult a = MonadError CPError m => m a
simpleAccess :: ConfigParser -> SectionSpec -> OptionSpec -> CPResult String

But Hugs did not support that type declaration. Therefore, types are now given like this:

simpleAccess :: MonadError CPError m =>
                ConfigParser -> SectionSpec -> OptionSpec -> m String

Although it looks more confusing than before, it still means the same. The return value can still be treated as Either CPError String if you so desire.

type SectionSpec = String Source

Names of sections

type OptionSpec = String Source

Names of options

data ConfigParser Source

This is the main record that is used by ConfigFile.

Constructors

ConfigParser 

Fields

content :: CPData

The data itself

optionxform :: OptionSpec -> OptionSpec

How to transform an option into a standard representation

defaulthandler :: ConfigParser -> SectionSpec -> OptionSpec -> Either CPError String

Function to look up an option, considering a default value if usedefault is True; or ignoring a default value otherwise. The option specification is assumed to be already transformed.

usedefault :: Bool

Whether or not to seek out a default action when no match is found.

accessfunc :: ConfigParser -> SectionSpec -> OptionSpec -> Either CPError String

Function that is used to perform lookups, do optional interpolation, etc. It is assumed that accessfunc will internally call defaulthandler to do the underlying lookup. The option value is not assumed to be transformed.

data CPErrorData Source

Possible ConfigParser errors.

Constructors

ParseError String

Parse error

SectionAlreadyExists SectionSpec

Attempt to create an already-existing ection

NoSection SectionSpec

The section does not exist

NoOption OptionSpec

The option does not exist

OtherProblem String

Miscellaneous error

InterpolationError String

Raised by interpolatingAccess if a request was made for a non-existant option

type CPError = (CPErrorData, String) Source

Indicates an error occurred. The String is an explanation of the location of the error.

Initialization

emptyCP :: ConfigParser Source

The default empty ConfigFile object.

The content contains only an empty mandatory DEFAULT section.

optionxform is set to map toLower.

usedefault is set to True.

accessfunc is set to simpleAccess.

Configuring the ConfigParser

You may notice that the ConfigParser object has some configurable parameters, such as usedefault. In case you're not familiar with the Haskell syntax for working with these, you can use syntax like this to set these options:

let cp2 = cp { usedefault = False }

This will create a new ConfigParser that is the same as cp except for the usedefault field, which is now always False. The new object will be called cp2 in this example.

Access Functions

simpleAccess :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m String Source

Default (non-interpolating) access function

interpolatingAccess :: MonadError CPError m => Int -> ConfigParser -> SectionSpec -> OptionSpec -> m String Source

Interpolating access function. Please see the Interpolation section above for a background on interpolation.

Although the format string looks similar to one used by Text.Printf, it is not the same. In particular, only the %(...)s format is supported. No width specifiers are supported and no conversions other than s are supported.

To use this function, you must specify a maximum recursion depth for interpolation. This is used to prevent a stack overflow in the event that the configuration file contains an endless interpolation loop. Values of 10 or so are usually more than enough, though you could probably go into the hundreds or thousands before you have actual problems.

A value less than one will cause an instant error every time you attempt a lookup.

This access method can cause get and friends to return a new CPError: InterpolationError. This error would be returned when:

  • The configuration file makes a reference to an option that does not exist
  • The maximum interpolation depth is exceeded
  • There is a syntax error processing a %-directive in the configuration file

An interpolation lookup name specifies an option only. There is no provision to specify a section. Interpolation variables are looked up in the current section, and, if usedefault is True, in DEFAULT according to the normal logic.

To use a literal percent sign, you must place %% in the configuration file when interpolation is used.

Here is how you might enable interpolation:

let cp2 = cp {accessfunc = interpolatingAccess 10}

The cp2 object will now support interpolation with a maximum depth of 10.

Reading

You can use these functions to read data from a file.

A common idiom for loading a new object from stratch is:

cp <- readfile emptyCP "/etc/foo.cfg"

Note the use of emptyCP; this will essentially cause the file's data to be merged with the empty ConfigParser.

readfile :: MonadError CPError m => ConfigParser -> FilePath -> IO (m ConfigParser) Source

Loads data from the specified file. It is then combined with the given ConfigParser using the semantics documented under merge with the new data taking precedence over the old. However, unlike merge, all the options as set in the old object are preserved since the on-disk representation does not convey those options.

May return an error if there is a syntax error. May raise an exception if the file could not be accessed.

readhandle :: MonadError CPError m => ConfigParser -> Handle -> IO (m ConfigParser) Source

Like readfile, but uses an already-open handle. You should use readfile instead of this if possible, since it will be able to generate better error messages.

Errors would be returned on a syntax error.

readstring :: MonadError CPError m => ConfigParser -> String -> m ConfigParser Source

Like readfile, but uses a string. You should use readfile instead of this if you are processing a file, since it can generate better error messages.

Errors would be returned on a syntax error.

Accessing Data

class Get_C a where Source

The class representing the data types that can be returned by "get".

Methods

get :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m a Source

Retrieves a string from the configuration file.

When used in a context where a String is expected, returns that string verbatim.

When used in a context where a Bool is expected, parses the string to a Boolean value (see logic below).

When used in a context where anything that is an instance of Read is expected, calls read to parse the item.

An error will be returned of no such option could be found or if it could not be parsed as a boolean (when returning a Bool).

When parsing to a Bool, strings are case-insentively converted as follows:

The following will produce a True value:

  • 1
  • yes
  • on
  • enabled
  • true

The following will produce a False value:

  • 0
  • no
  • off
  • disabled
  • false

sections :: ConfigParser -> [SectionSpec] Source

Returns a list of sections in your configuration file. Never includes the always-present section DEFAULT.

has_section :: ConfigParser -> SectionSpec -> Bool Source

Indicates whether the given section exists.

No special DEFAULT processing is done.

options :: MonadError CPError m => ConfigParser -> SectionSpec -> m [OptionSpec] Source

Returns a list of the names of all the options present in the given section.

Returns an error if the given section does not exist.

has_option :: ConfigParser -> SectionSpec -> OptionSpec -> Bool Source

Indicates whether the given option is present. Returns True only if the given section is present AND the given option is present in that section. No special DEFAULT processing is done. No exception could be raised or error returned.

items :: MonadError CPError m => ConfigParser -> SectionSpec -> m [(OptionSpec, String)] Source

Returns a list of (optionname, value) pairs representing the content of the given section. Returns an error the section is invalid.

Modifying Data

set :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> String -> m ConfigParser Source

Sets the option to a new value, replacing an existing one if it exists.

Returns an error if the section does not exist.

setshow :: (Show a, MonadError CPError m) => ConfigParser -> SectionSpec -> OptionSpec -> a -> m ConfigParser Source

Sets the option to a new value, replacing an existing one if it exists. It requires only a showable value as its parameter. This can be used with bool values, as well as numeric ones.

Returns an error if the section does not exist.

remove_option :: MonadError CPError m => ConfigParser -> SectionSpec -> OptionSpec -> m ConfigParser Source

Removes the specified option. Returns a NoSection error if the section does not exist and a NoOption error if the option does not exist. Otherwise, returns the new ConfigParser object.

add_section :: MonadError CPError m => ConfigParser -> SectionSpec -> m ConfigParser Source

Adds the specified section name. Returns a SectionAlreadyExists error if the section was already present. Otherwise, returns the new ConfigParser object.

remove_section :: MonadError CPError m => ConfigParser -> SectionSpec -> m ConfigParser Source

Removes the specified section. Returns a NoSection error if the section does not exist; otherwise, returns the new ConfigParser object.

This call may not be used to remove the DEFAULT section. Attempting to do so will always cause a NoSection error.

merge :: ConfigParser -> ConfigParser -> ConfigParser Source

Combines two ConfigParsers into one.

Any duplicate options are resolved to contain the value specified in the second parser.

The ConfigParser options in the resulting object will be set as they are in the second one passed to this function.

Output Data

to_string :: ConfigParser -> String Source

Converts the ConfigParser to a string representation that could be later re-parsed by this module or modified by a human.

Note that this does not necessarily re-create a file that was originally loaded. Things may occur in a different order, comments will be removed, etc. The conversion makes an effort to make the result human-editable, but it does not make an effort to make the result identical to the original input.

The result is, however, guaranteed to parse the same as the original input.