BigAdmin System Administration Portal
Feature Article

Centralized Host Configuration With Cfengine, Part I

Amy Rich, May 2005

Abstract: Part 1 of a series, this article discusses Cfengine, a distributed convergent configuration management system.

Introduction

Many sites use automated installs such as JumpStart to ensure machine conformity, but after installation, machines tend to drift from the standard toward disorder. To rectify this, some sites attempt to bring machine configurations back to the standard one at a time, by hand or with a few home-grown scripts. This method is often error-prone and time-consuming. Other sites enforce conformity by reinstalling machines on a regular basis. This method works well if extra hardware or downtime is available for the reinstallation procedure, but it removes any machine-specific customizations. A third method uses software to enforce the conformity of selected configuration files while letting other parts of the system evolve naturally. This third option balances the need for configuration conformity and availability against local customization.

One software package designed to enforce conformity of selected configuration files is Cfengine, by Mark Burgess of Oslo University College. Cfengine is a distributed convergent configuration management system with a centralized server and a client process on each machine. The purpose of Cfengine is to allow the administrator to create a single centralized system configuration which defines how each host on the network should be configured. An interpreter runs on each client host and parses the centralized configuration, checking the local machine's configuration files against it. If the machine's configuration has diverged from the defined standard, the necessary changes are performed to bring it back into conformity.

Cfengine concentrates on describing the state in which a machine should be, not the steps needed to make it conform to that state. Switching to this mindset can require a bit of work for someone used to procedural scripting, but many find the approach less error-prone in the end. Because it focuses on making a system match the defined state, Cfengine is idempotent: the result of running a Cfengine process on a machine once is the same as running it two or ten times, and Cfengine only modifies a system that does not match the defined state. At the same time, some changes might require more than one pass to take effect, so Cfengine automatically executes twice for each run. For example, Cfengine might create an input file during the first pass which is then used to change the machine in the second pass.
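
As a minimal sketch of what such a state description can look like, the following program assumes a hypothetical policy in which /etc/motd should be owned by root with mode 0644. The path, owner, group, and mode are illustrative choices, not settings taken from this article, and the syntax is explained in the sections that follow:

```cfengine
control:

  actionsequence = ( files )

files:

  any::
    # Desired state of the file (illustrative values). If the file already
    # matches, cfagent changes nothing; if it does not, cfagent fixes it.
    /etc/motd mode=0644 owner=root group=root action=fixall
```

The rule says nothing about how to reach the desired state, only what that state is, which is why running it repeatedly leaves a conforming machine untouched.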


The Cfengine Suite of Tools

The Cfengine distribution comprises several programs: cfagent, cfexecd, cfservd, cfrun, cfenvd, cfenvgraph, cfkey, and cfshow. Not every program has to be installed and run on every host, but running all of them is suggested to help track Cfengine performance. Of these programs, the two most used are the server daemon, cfservd, and the client-side program, cfagent.

The cfagent program runs on each machine and interprets the Cfengine policy file(s) and the configuration of the host. If the defined policy and host configuration differ and convergence is required in the policy file, cfagent implements the necessary changes. It also performs host integrity and security checks. Cfagent accepts a number of command-line options, covered in the cfagent(8) man page.

The cfservd program is an optional daemon that provides file service and controls remote execution of Cfengine on client machines. When cfagent runs on a client host, it can contact cfservd on the server, verify that the system clocks on both machines are reasonably well synced, and pull down fresh copies of outdated files. Security between hosts is handled via IP address verification and RSA authentication, much like SSH. The RSA keys are generated on a per-host basis with the cfkey tool.

Cfrun is a tool for running remote agents. It does so by contacting the cfservd daemon on a remote host.

Cfexecd handles scheduling and reporting. Cfenvd is the optional daemon that monitors the host's environment and reports anomalies. Cfenvgraph takes the information gathered by cfenvd and plots it in graph form.

Cfshow provides the user with a command-line interface which queries the databases used to store Cfengine operational state information on each machine.


The Cfengine Configuration/Programming Language

Most of the complexity of Cfengine resides in the cfagent program. Most Cfengine configuration and customization takes place in the main configuration file, cfagent.conf, and in other files called from within it. Each configuration file can also be thought of as a self-contained program, so the terms "configuration file" and "program" can often be used interchangeably when discussing Cfengine. Each Cfengine program must include a set of declarations that specify the tasks to perform and a list of actions that dictates which sections of the program to process, when to process them, and how many times. Each program statement takes the following generic form:

# comment

action-type:

  name = ( list of values )

  class::
    actions to take
.....

As a general rule, white space is not significant within the program, but consistently using white space to improve readability is strongly encouraged. Adding white space before and after parentheses is also suggested to avoid confusing the parser, and the white space is required in many evaluation tests. Comments begin with a hash symbol (#), and any text on the line after the comment symbol is ignored by the parser.

The action-type defines the sections of the program and is always a reserved word ending in a single colon. The action-type reserved words consist of: groups, control, homeservers, binservers, mailserver, mountables, import, broadcast, resolve, defaultroute, directories, miscmounts, files, ignore, tidy, required, links, disable, shellcommands, editfiles, and processes.

The name = ( list of values ) directive assigns the values enclosed in the parentheses to the variable name, for example: 'name = ( item1:item2:item3 )'. Statements ending in double colons are class names and determine what actions Cfengine will take on a given machine. Actions underneath a class name are carried out only if the machine running the Cfengine program is in the specified class. Each action-type can contain multiple variable assignments and classes, and each class can contain multiple directives.

Each Cfengine program must contain a control section or be included from another Cfengine program that contains one. The control section contains variable assignments, including the required actionsequence setting. The actionsequence variable lists which other sections of the program will be parsed. An example of a simple Cfengine program that just reports on its execution would look like:

control:

  actionsequence = ( shellcommands )

shellcommands:

  "/bin/echo 'The cfagent.conf file has been parsed.'"

To restrict which sections are applied on different types of machines, Cfengine uses classes. Expanding on the above example, the following program would print out different information based on whether or not the machine executing it matched the class sun4 or aix:

control:
  actionsequence = ( shellcommands )

shellcommands:
  sun4::
    "/bin/echo 'The cfagent.conf file has been parsed on a Sun.'"
  aix::
    "/bin/echo 'The cfagent.conf file has been parsed on an AIX machine.'"

Class matching lies at the heart of the Cfengine configuration system and requires closer examination.

Classes

Cfengine uses a high-level language to specify policy and then applies to each machine the rules whose classes match it. Classes can be seen as the method by which Cfengine programs communicate. The state of a given class, either on or off, determines the behavior of the program. Classes can be defined from the command line of the tool running the Cfengine script, locally in the script itself, during the execution of a script, or from plug-in modules.

Since Cfengine runs independently on each machine, each knows its hostname, operating system version, architecture, and so on. Each host defines various classes for itself based on the aforementioned information. Cfengine has both hard classes and evaluated classes. Hard classes are these host-defined qualities along with other predefined data such as:

  • A time unit such as a day of the week or month, hour of the day, minute of the hour, a year, and so on
  • A user-defined group of hosts
  • A user-defined string

Also, a subset of hard classes is reserved for various operating systems. This subset consists of the following classes: ultrix, sun4, sun3, hpux, hpux10, aix, solaris, osf, irix4, irix, irix64, sco, freebsd, netbsd, openbsd, bsd4_3, newsos, solarisx86, aos, nextstep, bsdos, linux, debian, cray, unix_sv, GnU, NT.
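
The following sketch shows how a user-defined group of hosts and time classes might be combined. The group name webservers and the host names www1, www2, and www3 are hypothetical, and the example assumes cfagent happens to run during the hour in question:

```cfengine
control:

  actionsequence = ( shellcommands )

groups:

  # Hypothetical host names. On these hosts the class "webservers" is on.
  webservers = ( www1 www2 www3 )

shellcommands:

  # Carried out only on a host in the group, on a Sunday,
  # between 03:00 and 03:59.
  webservers.Sunday.Hr03::
    "/bin/echo 'Weekly maintenance window on a web server.'"
```

On any other host, or at any other time, the class expression is off and the command is skipped.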

Evaluated classes are defined by performing internal test functions on input data. These internal functions are documented in the Cfengine Reference Manual as follows:

IsNewerThan(f1,f2)
True if file 2 is modified more recently than file 1 (mtime).
AccessedBefore(f1,f2)
True if file 1 had its last access earlier than file 2 (atime).
ChangedBefore(f1,f2)
True if file 1's attributes were changed in any way before file 2's (ctime).
FileExists(file)
True if the named file object exists.
IPRange(address-range)
True if the current host lies within the specified range.
HostRange(basename,start-stop)
True if the current relative domain name begins with basename and ends with an integer between start and stop.
IsDefined(variable-id)
True if the named variable is defined. NB: IsDefined(var), not IsDefined(${var})
IsDir(f)
True if the named file object is a directory.
IsLink(f)
True if the named file object is a symbolic link.
IsPlain(f)
True if the named file object is a plain file.
PrepModule(module,arg1 arg2...)
True if the named module exists and can be executed. The module is assumed to follow the standard programming interface for modules (see "Writing plugin modules" in the Cfengine Tutorial). Unlike actionsequence modules, these modules are evaluated immediately on parsing. Note that the module should be specified relative to the authorized module directory.
Regcmp(regexp,string or list separated string)
True if the string matched the regular expression.
ReturnsZero(command)
True if the named shell command returns with exit code zero (okay).
Strcmp(s1,s2)
True if the strings match exactly.
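
As a hedged illustration of how these test functions are typically used, the sketch below defines three evaluated classes in a groups section and then acts on them. The class names (has_xpg4_ls, lab_network, compute_nodes), the address range, and the host name pattern are all assumptions made for the example:

```cfengine
control:

  actionsequence = ( shellcommands )

groups:

  # Each class on the left is switched on only when the test function
  # on the right evaluates to true on the host running cfagent.
  has_xpg4_ls   = ( FileExists(/usr/xpg4/bin/ls) )
  lab_network   = ( IPRange(192.0.2.0/24) )
  compute_nodes = ( HostRange(node,1-32) )

shellcommands:

  has_xpg4_ls::
    "/bin/echo 'This host has an XPG4 ls.'"

  lab_network.compute_nodes::
    "/bin/echo 'This host is a compute node on the lab network.'"
```

Once defined, an evaluated class behaves like any other class and can be used wherever a class name is accepted.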

Classes can be logically combined to obtain finer-grained control. A logical AND is represented by a dot (.) or an ampersand (&), a logical OR by the pipe character (|), and a logical NOT by the exclamation mark (!). As in most programming languages, AND takes precedence over OR, so parentheses are important when combining both kinds of operator. The order of operations for Cfengine classes is: (), !, . or &, and, lastly, |. For example, an expression which matches the classes red or blue but not the class green would be constructed as:

  (red|blue).!green
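
The same pattern applied to classes that appear elsewhere in this article might look like the following sketch, in which the echo command is only a placeholder:

```cfengine
control:

  actionsequence = ( shellcommands )

shellcommands:

  # On for sun4 or aix hosts, but never on a Sunday.
  # Without the parentheses, sun4|aix.!Sunday would be read as
  # "sun4, or (aix and not Sunday)", because AND binds before OR.
  (sun4|aix).!Sunday::
    "/bin/echo 'Running on a sun4 or aix host, and today is not Sunday.'"
```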

Variables

Variable substitution is supported in Cfengine, as seen above in the example of a generic program. Cfengine supports environment variables set by the shell, special internal variables, and general user-defined string substitutions. Except for environment variables, all variables are set in the control section of a Cfengine program, and they can also be set on a class-dependent basis. In the following code snippet, the variable dfprog is globally set to /usr/local/bin/df while the variable lsprog is set differently depending on whether the class is sun4 or aix:

control:

  dfprog = ( /usr/local/bin/df )

  sun4:: lsprog = ( /usr/xpg4/bin/ls )
  aix:: lsprog = ( /usr/bin/ls )
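
A variable is referenced by wrapping its name in $( ) or ${ }. The sketch below reuses the assignments above in a complete program; the arguments passed to the two commands are illustrative only:

```cfengine
control:

  actionsequence = ( shellcommands )

  dfprog = ( /usr/local/bin/df )

  sun4:: lsprog = ( /usr/xpg4/bin/ls )
  aix::  lsprog = ( /usr/bin/ls )

shellcommands:

  any::
    # dfprog is set for every host.
    "$(dfprog) -k /var"

  sun4|aix::
    # lsprog is only set on sun4 and aix hosts, so restrict its use to them.
    "$(lsprog) -l /var/cfengine"
```

Restricting the second command to sun4|aix avoids expanding lsprog on a host where no value was ever assigned to it.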

As with classes, Cfengine reserves certain case-insensitive internal variables useful for creating generalized configuration files. These internal variables are documented in the Cfengine Tutorial as follows:

AllClasses
A long string in the form CFALLCLASSES=class1:class2.... This variable is a summary of all the defined classes at any given time. It is always kept up to date so that scripts can make use of Cfengine's class data.
arch
The current detailed architecture string -- an amalgamation of the information from uname. Non-definable.
binserver
The default server for binary data. See the tutorial section on NFS resources. Non-definable.
class
The currently defined system hard-class (for example, sun4, hpux). Non-definable.
date
The current date string. Note that if you use this in a shell command it might be interpreted as a list variable, since it contains the default separator :.
domain
The currently defined domain.
faculty
The faculty or site as defined in control (see site).
fqhost
The fully qualified (DNS/BIND) hostname of the system, which includes the domain name as well.
host
The hostname of the machine running the program.
ipaddress
The numerical form of the Internet address of the host currently running Cfengine.
MaxCfengines
The maximum number of Cfengine processes which should be allowed to coexist concurrently on the system. This can prevent excessive load due to unintentional spamming in situations where several cfagent processes are started independently. The default value is unlimited.
ostype
A short form of $(arch).
OutputPrefix
This quoted string can be used to change the default cfengine: prefix on output lines to something else. You might wish to shorten the string, or have a different prefix for different hosts. The value in this variable is appended with the name of the host. The default is equivalent to OutputPrefix = ( "cfengine:$(host):" ).
RepChar
The character value of the string used by the file repository in constructing unique file names from path names. This is the character which replaces / (see the Reference Manual).
site
This variable is identical to $(faculty) and may be used interchangeably.
split
The character on which list variables are split (see the Reference Manual).
sysadm
The name or mail address of the system administrator.
timezone
The current timezone as defined in control.
UnderscoreClasses
If this is set to on, Cfengine uses hard-classes which begin with an underscore, so as to avoid name collisions. See also "Runtime Options" in the Reference Manual.
year
The current year.
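
As a small sketch of how these internal variables allow one program to serve many hosts, the following reports a few of them from whichever machine runs it. The domain value example.com is a placeholder:

```cfengine
control:

  actionsequence = ( shellcommands )

  # Placeholder domain; set this to the site's real domain.
  domain = ( example.com )

shellcommands:

  any::
    # The variables are expanded by cfagent before the command is run.
    "/bin/echo 'cfagent ran on $(fqhost), hard class $(class), address $(ipaddress)'"
```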

The following variables are also reserved and used to produce special characters in strings.

cr
Expands to the carriage-return character.
dblquote
Expands to a double quote ".
dollar
Expands to $.
lf
Expands to a line-feed character (UNIX end of line).
n
Expands to a newline character.
quote
Expands to a single quote '.
spc
Expands simply to a single space. This can be used to place spaces in file names, and so on.
tab
Expands to a single tab character.
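
For instance, assuming these variables expand inside quoted strings in the same way as any other variable, they might be used as in this sketch (the messages are placeholders):

```cfengine
control:

  actionsequence = ( shellcommands )

shellcommands:

  any::
    # $(tab) places a literal tab between the two words.
    "/bin/echo 'left$(tab)right'"

    # $(dollar) yields a literal $ that is not read as the start of a variable.
    "/bin/echo 'Cost: $(dollar)5'"
```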

Installing Cfengine

Installing Cfengine is fairly straightforward, as it requires only three other popular third-party packages. Two of those, Perl and SSL, come with the base installation of recent versions of the Solaris Operating System. OpenSSL and/or CPAN's Perl distribution can be used instead of the Sun-supplied packages if desired. Cfengine also requires version 3.2 or later of Berkeley DB. Once all of these things are installed and functional, download the Cfengine source code, unpack it, configure it, and install it. The following configure invocation assumes that both OpenSSL and Berkeley DB are installed using a GNU-style layout under /usr/local. If either library was installed somewhere other than /usr/local/lib, or its headers somewhere other than /usr/local/include, change the options below to reflect the correct installation directories:

wget ftp://ftp.iu.hio.no/pub/cfengine/cfengine-2.1.13.tar.gz

tar zxf cfengine-2.1.13.tar.gz
cd cfengine-2.1.13
./configure --prefix=/usr/local --with-berkeleydb=/usr/local \
  --with-openssl=/usr/local
make
make install

This installs the Cfengine programs into /usr/local/sbin, and a number of example configuration files into /usr/local/share/cfengine. Since it was not configured otherwise, the Cfengine work directory is set to /var/cfengine. If /usr/local/sbin is not a local file system on each machine, copy the binaries over into the work directory:

mkdir -p /var/cfengine/bin
cd /usr/local/sbin

cp cfagent cfenvgraph cfrun cfdoc cfexecd cfservd cfenvd cfkey cfshow vicf \
 /var/cfengine/bin

Cfengine also needs directories to store input and output files. When Cfengine is invoked by the scheduler, it reads only from /var/cfengine/inputs or the location specified by the environment variable CFINPUTS. The directory /var/cfengine/outputs stores the run reports which are emailed to the administrator or which can be copied to a central location. Create these two directories alongside the bin directory:

mkdir -p /var/cfengine/inputs /var/cfengine/outputs

Next Up

At this point, all the Cfengine components are installed, but no configuration files exist, so there's nothing for Cfengine to do. The next article will cover Cfengine public key encryption, running the necessary daemons and programs, and writing some useful Cfengine scripts. I'll also discuss using Cfengine to help automate machine installs via JumpStart and provide a pointer to information on how to automatically pull Cfengine scripts and files from a source code repository such as CVS (Concurrent Versions System).


Resources

Unless otherwise licensed, code in all technical manuals herein (including articles, FAQs, samples) is provided under this License.

