A Script Template and Useful Techniques for ksh Scripts

Bernd Schemmer, November 2007

Abstract: This article discusses a template for Korn shell (ksh) scripts and explains the techniques used in the template.

Note: For changes to the script since Nov. 2007, please see:
http://wikis.sun.com/display/BigAdmin/viewpage?pageId=69896634.
Introduction

This article discusses a script template for ksh scripts. I use this script template for nearly all the scripts I write for doing day-to-day work. I'm pretty sure that every system administrator who is responsible for more than a few machines running the Solaris Operating System has her own bag of scripts for maintaining the machines. Nevertheless, the script template and the programming techniques discussed in this article might be useful for them also. The script template is released under the Common Development and Distribution License, Version 1.0; a link to download the script is at the end of this article.

Why Use a Script Template?

Simple answer: Because you don't want to reinvent the wheel every other day. Writing a script template with the most frequently used functions once and reusing it for your scripts makes life easier. Doing this as a shell script instead of, for example, Perl, ensures that you can just copy the script to a new machine and have it work.

Note: This script template was written primarily for the Solaris OS but initial support for other operating systems, such as AIX and Linux, is included. The template was written and tested with ksh88, which is the default ksh version in the Solaris OS, but it should also run with ksh93.

Techniques Used in the Template

This section describes the techniques used in the script template.

Everyone knows that documentation is essential, but no one wants to write it. Because of this, I included the documentation inside the script template as comments. To distinguish the documentation parts from the normal comments, the lines with documentation are prefixed with a double hash (##). The code that handles the parameter -H prints these lines to STDERR:
echo " -----------------------------------------------" >&2
echo " ${__SCRIPTNAME} ${__SCRIPT_TEMPLATE_VERSION} ">&2
echo " Documentation" >&2
echo " -----------------------------------------------" >&2
grep "^##" $0 | cut -c3- 1>&2 ; die 0 ;;
Therefore, the following command creates the documentation for the script:

./scriptt.sh -H 2>./scriptt.txt

To distinguish between predefined global variables and script-specific variables, and to ensure that I can replace the general routines in existing scripts with updated routines from the template if necessary, the names of all global variables are prefixed with two underscores (__):
## __MUST_BE_ROOT (def.: false)
## set to ${__TRUE} for scripts that must be executed by
## root only
##
__MUST_BE_ROOT=${__FALSE}
## __REQUIRED_USERID (def.: none)
## required userid to run this script (other than root);
## use blanks to separate multiple userids
## e.g. "oracle dba sysdba"
## "" = no special userid required
##
__REQUIRED_USERID=""
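The "##" documentation convention described at the start of this section can be tried out in isolation. The following is only a minimal sketch: the sample file, the function name show_embedded_doc, and the variable names are hypothetical and not part of the template, which applies the same grep and cut pipeline to $0.

```shell
# hypothetical sample script, created only for this demonstration
SAMPLE_SCRIPT="${TMPDIR:-/tmp}/doc_demo.$$"
cat >"${SAMPLE_SCRIPT}" <<'EOT'
#!/usr/bin/ksh
## doc_demo - sample script with embedded documentation
##
## usage: doc_demo [-H]
# a normal comment that is not part of the documentation
echo "Hello"
EOT

# print all lines starting with "##" without the two hash characters
# usage: show_embedded_doc filename
function show_embedded_doc {
  typeset THIS_FILE="$1"
  grep "^##" "${THIS_FILE}" | cut -c3-
}

DOC_TEXT="$( show_embedded_doc "${SAMPLE_SCRIPT}" )"
echo "${DOC_TEXT}"
rm -f "${SAMPLE_SCRIPT}"
```

Only the three lines that start with "##" end up in the output; the "#!" line and the normal comment are skipped.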
Because it's not always clear in scripts whether a value means true or false, the template uses the variables ${__TRUE} and ${__FALSE} instead of literal values.

To avoid problems with duplicate variable names, all local variables of a function are defined as local variables using typeset.

To avoid difficult-to-find errors, all variables are used in the format ${variable}.

All parts of the script template that must be filled with "real" code are marked with three question marks (???).

Logging is essential for a script, especially if the script is called non-interactively as in cron jobs. One problem here is that you might want to change the log file name using a parameter for the script, but the script already prints messages before the parameter is processed. The solution I used for this problem is to create a temporary log file directly after the script starts. Then, after the parameter is processed, the temporary log file is either copied to the "real" log file or deleted, depending on the value of the log file parameter.

And to make life easier, I defined some additional routines to log different kinds of messages, among them LogMsg, LogInfo, LogWarning, and LogError.

To support multiple levels of info messages, the script implements a verbose level (the variable __VERBOSE_LEVEL), which is evaluated by the function LogInfo:
## --------------------------------------
## LogInfo
##
## print a message to STDERR and write it also to the log file
## only if in verbose mode
##
## usage: LogInfo [loglevel] message
##
## returns: -
##
## Notes: Output goes to STDERR, default loglevel is 0
##
function LogInfo {
typeset __FUNCTION="LogInfo"; ${__FUNCTION_INIT} ; ${__DEBUG_CODE}
typeset THISLEVEL=0
if [ ${__VERBOSE_MODE} -eq ${__TRUE} ] ; then
if [ $# -gt 1 ] ; then
isNumber $1
if [ $? -eq ${__TRUE} ] ; then
THISLEVEL=$1
shift
fi
fi
[ ${__VERBOSE_LEVEL} -gt ${THISLEVEL} ] && LogMsg "INFO: $*" >&2
fi
}
For example:

LogInfo "This message will only be printed if the parameter -v is entered one or more times"
LogInfo 1 "This message will only be printed if the parameter -v is entered two or more times"
LogInfo 2 "This message will only be printed if the parameter -v is entered three or more times"

This feature is used in the function LogRuntimeInfo:
## __RT_VERBOSE_LEVEL - level of -v for runtime messages
##
## e.g. 1 = -v -v is necessary to print info messages of
## the runtime system
## 2 = -v -v -v is necessary to print info messages
## of the runtime system
##
##
__RT_VERBOSE_LEVEL=1
# ....
function LogRuntimeInfo {
typeset __FUNCTION="LogRuntimeInfo"; ${__FUNCTION_INIT}
${__DEBUG_CODE}
LogInfo "${__RT_VERBOSE_LEVEL}" "$*"
}
All info messages of the "runtime" system (that is, the predefined code in the template) are printed via LogRuntimeInfo.
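The two logging ideas described in this section, the temporary log file and the verbose levels, can be sketched together as follows. This is a simplified illustration under stated assumptions, not the template's code: all names (log_msg, log_info, switch_to_real_logfile, VERBOSE_LEVEL, and so on) are hypothetical stand-ins, and the number check is done inline instead of calling isNumber.

```shell
TMP_LOGFILE="${TMPDIR:-/tmp}/log_demo.$$.tmplog"   # hypothetical file names
DEMO_LOGFILE="${TMPDIR:-/tmp}/log_demo.$$.log"
LOGFILE=""               # "real" log file; known only after parameter processing
VERBOSE_LEVEL=0          # incremented once for each -v

: >"${TMP_LOGFILE}"      # create the temporary log file right at script start

# write a message to STDOUT and append it to the currently active log file
function log_msg {
  typeset THIS_MSG="$*"
  echo "${THIS_MSG}"
  echo "${THIS_MSG}" >>"${LOGFILE:-${TMP_LOGFILE}}"
}

# usage: log_info [level] message
# the message is printed only if VERBOSE_LEVEL is greater than level (def.: 0)
function log_info {
  typeset THIS_LEVEL=0
  if [ $# -gt 1 ] ; then
    case "$1" in
      *[!0-9]* | "" ) ;;                 # not a number: part of the message
      * ) THIS_LEVEL=$1 ; shift ;;
    esac
  fi
  [ "${VERBOSE_LEVEL}" -gt "${THIS_LEVEL}" ] && log_msg "INFO: $*" >&2
  return 0
}

# called after the parameters are processed:
# copy the early messages to the real log file or discard them
function switch_to_real_logfile {
  typeset NEW_LOGFILE="$1"
  if [ "${NEW_LOGFILE}"x != ""x ] ; then
    cat "${TMP_LOGFILE}" >>"${NEW_LOGFILE}"
    LOGFILE="${NEW_LOGFILE}"
  else
    LOGFILE="/dev/null"
  fi
  rm -f "${TMP_LOGFILE}"
}

VERBOSE_LEVEL=2                          # as if -v was entered two times
log_info "shown with -v"
log_info 1 "shown with -v -v"
log_info 2 "shown only if -v is entered three times"
switch_to_real_logfile "${DEMO_LOGFILE}"

LOG_CONTENT="$( cat "${DEMO_LOGFILE}" )"
rm -f "${DEMO_LOGFILE}"
echo "${LOG_CONTENT}"
```

With a verbose level of 2, only the first two messages are written; both were logged before the real log file name was known and are copied over by switch_to_real_logfile.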
In addition, there are three functions to simplify the logging of binaries called by the script:
Another function to log the output of one or more commands is StartStop_LogAll_to_logfile:

# start the redirection of STDOUT and STDERR
#
StartStop_LogAll_to_logfile start
# now STDOUT and STDERR of the script and all commands executed by
# the script go to the logfile
#
# To explicitly write to STDOUT after calling this function with
# the parameter "start" use
echo "This goes to STDOUT" >&3
# To explicitly write to STDERR after calling this function with
# the parameter "start" use
echo "This goes to STDERR" >&4
# To stop the redirections use
#
StartStop_LogAll_to_logfile stop

Before calling the main code, the script template initializes the predefined variables and executes some common code.

Some scripts run only on specific hardware or software versions or they must be run by root. Therefore, I implemented the following checks:
The checks are done only if the appropriate variables are defined:
## __MUST_BE_ROOT (def.: false)
## set to ${__TRUE} for scripts that must be executed by root only
##
__MUST_BE_ROOT=${__FALSE}
## __REQUIRED_USERID (def.: none)
## required userid to run this script (other than root);
## use blanks to separate multiple userids
## e.g. "oracle dba sysdba"
## "" = no special userid required
##
__REQUIRED_USERID=""
## __REQUIRED_ZONES - required zones (either global, non-global, or
## local or the names of the valid zones)
## (def.: none)
## "" = no special zone required
##
__REQUIRED_ZONES=""
## __ONLY_ONCE (def.: false)
## set to ${__TRUE} for scripts that cannot run more than one
## instance at the same time
##
__ONLY_ONCE=${__FALSE}
## __REQUIRED_OS - required OS (uname -s) for the script (def.: none)
## use blanks to separate the OS names if the script runs under
## multiple OS, e.g. "SunOS"
##
__REQUIRED_OS=""
## __REQUIRED_OS_VERSION (def.: none)
## minimum OS version necessary, e.g. 5.10
## "" = no special version necessary
##
__REQUIRED_OS_VERSION=""
## __REQUIRED_MACHINE_PLATFORM (def.: none)
## required machine platform (uname -i),
## e.g. "i86pc"; use blanks to separate the
## machine types if more than one entry, e.g. "Sun Fire 3800 i86pc"
## "" = no special machine type necessary
##
__REQUIRED_MACHINE_PLATFORM=""
## __REQUIRED_MACHINE_CLASS (def.: none)
## required machine class (uname -m),
## e.g. "i86pc"; use blanks to separate the
## machine types if more than one entry, e.g. "sun4u i86pc"
## "" = no special machine class necessary
##
__REQUIRED_MACHINE_CLASS=""
## __REQUIRED_MACHINE_ARC (def.: none)
## required machine architecture (uname -p),
## e.g. "i386" ; use blanks to separate the
## machine types if more than one entry, e.g. "sparc i386"
## "" = no special machine architecture necessary
##
__REQUIRED_MACHINE_ARC=""
# .....
if [ ${__MUST_BE_ROOT} -eq ${__TRUE} ] ; then
UserIsRoot || die 249 "You must be root to execute this script"
fi
if [ "${__REQUIRED_USERID}"x != ""x ] ; then
pos " ${__USERID} " " ${__REQUIRED_USERID} " &&
die 242 "This script can only be executed by one of the users:
${__REQUIRED_USERID}"
fi
if [ ${__ONLY_ONCE} -eq ${__TRUE} ] ; then
CreateLockFile
if [ $? -ne 0 ] ; then
cat <<EOF
ERROR:
Either another instance of this script is already running
or the last execution of this script crashed.
In the first case, wait until the other instance ends;
in the second case, delete the lock file
${__LOCKFILE}
manually and restart the script.
EOF
#...
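Most of the checks above test whether the current value (user ID, OS name, machine class, and so on) is contained in a blank-separated list of allowed values. The following sketch shows one possible way to write such a test using only shell pattern matching. The function names is_in_list and check_requirement are hypothetical; the template itself uses its own function pos for this purpose.

```shell
# usage: is_in_list value "blank separated list"
# returns: 0 - the value is in the list
#          1 - the value is not in the list
function is_in_list {
  typeset THIS_VALUE="$1"
  typeset THIS_LIST="$2"
  # the surrounding blanks ensure that only complete words match
  case " ${THIS_LIST} " in
    *" ${THIS_VALUE} "* ) return 0 ;;
  esac
  return 1
}

# usage: check_requirement current_value "allowed values" description
# an empty list means "no restriction", as in the template
function check_requirement {
  typeset CUR_VALUE="$1"
  typeset ALLOWED_VALUES="$2"
  typeset DESCRIPTION="$3"
  [ "${ALLOWED_VALUES}"x = ""x ] && return 0
  is_in_list "${CUR_VALUE}" "${ALLOWED_VALUES}" && return 0
  echo "ERROR: ${DESCRIPTION} \"${CUR_VALUE}\" is not one of: ${ALLOWED_VALUES}" >&2
  return 1
}

check_requirement "$( uname -s )" "" "operating system" && echo "OS check passed"
check_requirement "oracle" "oracle dba sysdba" "userid" && echo "userid check passed"
check_requirement "guest" "oracle dba sysdba" "userid" || echo "userid check failed"
```

A real script would call die with the matching return code instead of just printing a message.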
Restricting a Script to Running Only Once

The code to ensure that no more than one instance of the script is running at the same time is a little bit tricky. It is important here to use a check that is known to be atomic. I use the creation of a symbolic link (ln -s) for this:
# --------------------------------------
## CreateLockFile
#
# Create the lock file (which is really a symbolic link)
# if possible
#
# usage: CreateLockFile
#
# returns: 0 - lock created
# 1 - lock already exist or error creating the lock
#
# Note: Use a symbolic link because this is always an atomic
# operation
#
function CreateLockFile {
typeset __FUNCTION="CreateLockFile"; ${__FUNCTION_INIT}
${__DEBUG_CODE}
typeset LN_RC=""
ln -s "$0" "${__LOCKFILE}" 2>/dev/null
LN_RC=$?
if [ ${LN_RC} = 0 ] ; then
__LOCKFILE_CREATED=${__TRUE}
return 0
else
return 1
fi
}
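The following sketch shows the complete life cycle of such a lock: create it, fail on a second attempt, and remove it again. It is a simplified illustration, not the template's code. The names LOCKFILE, create_lock, and remove_lock are hypothetical, and the sketch stores the process ID as the link target (the template links to $0), which is an assumption made here only because the target of the link does not matter for the locking itself.

```shell
LOCKFILE="${TMPDIR:-/tmp}/lock_demo.$$.lock"    # hypothetical lock file name
LOCKFILE_CREATED=1                              # 0 = this process owns the lock

# returns: 0 - lock created
#          1 - lock already exists or error creating the lock
function create_lock {
  # ln -s fails if the link already exists, so testing for the lock and
  # creating it happen in a single step
  if ln -s "$$" "${LOCKFILE}" 2>/dev/null ; then
    LOCKFILE_CREATED=0
    return 0
  fi
  return 1
}

# remove the lock, but only if this process created it
function remove_lock {
  if [ "${LOCKFILE_CREATED}" -eq 0 ] ; then
    rm -f "${LOCKFILE}"
    LOCKFILE_CREATED=1
  fi
  return 0
}

create_lock && echo "first call: lock created"
create_lock || echo "second call: lock already exists"
remove_lock
```

Removing the lock only when this process created it prevents a second instance from deleting the lock of the instance that is still running.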
Using RBAC in the Solaris 10 OS

To enable RBAC control for the scripts, I added the following code at the start of the script:
# ------------------------------------------------------------------
## __USE_RBAC - set this variable to ${__TRUE} to execute this script
## with RBAC
## default is ${__FALSE}
##
__USE_RBAC=${__USE_RBAC:=${__FALSE}}
# ...
# ------------------------------------------------------------------
#
# Set the variable ${__USE_RBAC} to ${__TRUE} to activate RBAC support
#
# Allow the use of RBAC to control who can access this script. Useful
# for administrators without root permissions
#
if [ "${__USE_RBAC}" = "${__TRUE}" ] ; then
# restart the script via pfexec unless it is already running under pfexec
if [ "$_" != "/usr/bin/pfexec" ] ; then
if [ -x /usr/bin/pfexec ] ; then
/usr/bin/pfexec "$0" "$@"
exit $?
else
echo "${0##*/} ERROR: /usr/bin/pfexec not found or not executable!" >&2
exit 238
fi
fi
fi
Now to enable RBAC, simply set the (environment) variable __USE_RBAC to the value of ${__TRUE}.

Note: I copied this code from the

All my scripts use config files to be as flexible as possible. But I don't want to maintain duplicate code, one for the initialization of the variables in the script and one for the processing of the configuration file. Because of this, the config file is implemented via the source-in functionality of ksh.

Details: The configuration variables are defined in the variable __CONFIG_PARAMETER:

## __CONFIG_PARAMETER
## The variable __CONFIG_PARAMETER contains the configuration
## variables
##
# The defaults for these variables are defined here. You
# can use a config file to overwrite the defaults.
#
# Use the parameter -C to create a default configuration file
#
# Note: The config file is read and interpreted via ". configfile" ->
# You can add also some code here!
#
__CONFIG_PARAMETER='
# extension for backup files
DEFAULT_BACKUP_EXTENSION=".$$.backup"
# ??? example variables for the configuration file;
# change to your need
# master server with the directories to synchronize
# The rsync daemon must run either on this host or on localhost
# If the rsync daemon runs on localhost, the master server must
# export the directories to synchronize using NFS. In this case
# the directories on the master server must be the same as on
# the rsync client
#
# overwritten by the parameter -m
DEFAULT_MASTER_SERVER="linst2.rze.de.db.com"
# server with the rsync daemon. This is either the master server
# or localhost
#
# overwritten by the parameter -s
DEFAULT_RSYNC_SERVER="localhost"
# only change the following variables if you know what you are doing
#
## sample debug code:
## __DEBUG_CODE=" eval echo Entering the subroutine $__FUNCTION ... "
## Note: Use an include script for more complicated debug code, e.g.
## __DEBUG_CODE=" eval . /var/tmp/mydebugcode"
##
# no further internal variables defined yet
'
# end of config parameters

Because there is no evaluation of variable names in strings in single quotes (' '), you don't have to think about escaping the special characters here; only a single quote itself must not appear inside the string. And you can also add code to execute in a config file if necessary.

Now, to initialize the variables in the script, the following simple line is necessary:
eval "${__CONFIG_PARAMETER}"
And the code to read and execute a configuration file is:
. "${THIS_CONFIG_FILE}"
To write a config file with default values, the code is:
cat <<EOT >"${THIS_CONFIG_FILE}"
# config file for $0
${__CONFIG_PARAMETER}
EOT
THISRC=$?
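Put together, the three steps (initialize the defaults, write a default config file, read a config file) can be tried out with the following minimal sketch. The variable name CONFIG_PARAMETER, the file name, and the host name rsync.example.com are made up for this illustration; only the technique is taken from the template.

```shell
# the defaults; the single quotes keep $$ and the double quotes literal
CONFIG_PARAMETER='
# extension for backup files
DEFAULT_BACKUP_EXTENSION=".$$.backup"

# server with the rsync daemon
DEFAULT_RSYNC_SERVER="localhost"
'

# 1. initialize the variables with the defaults
eval "${CONFIG_PARAMETER}"

# 2. write a default config file
#    (the contents of the variable are copied as-is, $$ is not expanded)
THIS_CONFIG_FILE="${TMPDIR:-/tmp}/config_demo.$$.conf"
cat <<EOT >"${THIS_CONFIG_FILE}"
# config file for the config file demo
${CONFIG_PARAMETER}
EOT

# 3. simulate an administrator who changes a value in the config file
echo 'DEFAULT_RSYNC_SERVER="rsync.example.com"' >>"${THIS_CONFIG_FILE}"

# 4. read the config file; its values overwrite the defaults
. "${THIS_CONFIG_FILE}"

echo "rsync server:     ${DEFAULT_RSYNC_SERVER}"
echo "backup extension: ${DEFAULT_BACKUP_EXTENSION}"
rm -f "${THIS_CONFIG_FILE}"
```

Because the same text is used for the defaults and for the generated config file, a new configuration variable has to be added in one place only.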
This functionality is used if the script is called with the parameter -C.

That's it: very simple and straightforward. For more in-depth information, please study the functions ReadConfigFile and WriteConfigFile.

One thing a lot of scripts are missing is the housekeeping at script end, for example, deleting temporary files, directories, and so on. For this purpose, the script template discussed here defines five variables:

## __LIST_OF_TMP_MOUNTS - list of mounts that should be unmounted
## at program end
##
__LIST_OF_TMP_MOUNTS=""
## __LIST_OF_TMP_DIRS - list of directories that should be removed
## at program end
##
__LIST_OF_TMP_DIRS=""
## __LIST_OF_TMP_FILES - list of files that should be removed
## at program end
##
__LIST_OF_TMP_FILES=""
## __EXITROUTINES - list of routines that should be executed before
## the script ends
## Note: These routines are called *before* temp files, temp
## directories, and temp mounts are removed
##
__EXITROUTINES=""
## __FINISHROUTINES - list of routines that should be executed
## before the script ends
## Note: These routines are called *after* temp files, temp
## directories, and temp mounts are removed
##
__FINISHROUTINES=""

These variables are evaluated and processed by the routine cleanup, which is called by the function die. Therefore, to use this feature, it's necessary to always exit the script using the function die. The trap handlers and the function die are defined as follows:
# alias to install the trap handler
#
alias __settrap="
LINENO=\${LINENO}
trap 'GENERAL_SIGNAL_HANDLER 1 \${LINENO} \${__FUNCTION}' 1
trap 'GENERAL_SIGNAL_HANDLER 2 \${LINENO} \${__FUNCTION}' 2
trap 'GENERAL_SIGNAL_HANDLER 3 \${LINENO} \${__FUNCTION}' 3
trap 'GENERAL_SIGNAL_HANDLER 15 \${LINENO} \${__FUNCTION}' 15
"
# install trap handler
__settrap
trap 'GENERAL_SIGNAL_HANDLER exit ${LINENO} ${__FUNCTION}' exit
# ...
## ---------------------------------------
## die
##
## print a message and end the program
##
## usage: die returncode {message}
##
## returns: -
##
## Notes:
##
## This routine
## - calls cleanup
## - prints an error message if any (if returncode is not zero)
## or the message if any (if returncode is zero)
## - prints all warning messages again if
## ${__PRINT_LIST_OF_WARNING_MSGS} is ${__TRUE}
## - prints all error messages again if
## ${__PRINT_LIST_OF_ERROR_MSGS} is ${__TRUE}
## - prints a program end message and the program return code
## - and ends the program
##
## If the variable ${__FORCE} is ${__TRUE} and returncode is NOT
## zero, die() will only print the error message and return
##
function die {
typeset __FUNCTION="die"; ${__FUNCTION_INIT} ; ${__DEBUG_CODE}
# ...
# the function cleanup handles the unmounting, and the removing of
# files and directories
cleanup
# ...
}
## ---------------------------------------
## GENERAL_SIGNAL_HANDLER
##
##
## general trap handler
##
## usage: called automatically (parameter $1 is the signal number)
##
## returns: -
##
function GENERAL_SIGNAL_HANDLER {
typeset __RC=$?
__TRAP_SIGNAL=$1
typeset __LINENO=$2
typeset INTERRUPTED_FUNCTION=$3
typeset __FUNCTION="GENERAL_SIGNAL_HANDLER"; ${__DEBUG_CODE}
if [ "${__EXIT_VIA_DIE}"x != "${__TRUE}"x -a ${__TRAP_SIGNAL} != "exit" ] ; then
LogRuntimeInfo "Trap \"${__TRAP_SIGNAL}\" caught"
[ "${__INCLUDE_SCRIPT_RUNNING}"x != ""x ] && LogMsg "Trap occurred inside of the include script \"${__INCLUDE_SCRIPT_RUNNING}\" "
LogRuntimeInfo "Signal ${__TRAP_SIGNAL} received: Line: ${__LINENO} in function: ${INTERRUPTED_FUNCTION}"
fi
case ${__TRAP_SIGNAL} in
1 )
LogWarning "HUP signal received"
InvertSwitch __VERBOSE_MODE
LogMsg "Switching verbose mode to $( ConvertToYesNo ${__VERBOSE_MODE} )"
;;
2 )
if [ ${__USER_BREAK_ALLOWED} -eq ${__TRUE} ] ; then
die 252 "Script aborted by the user via signal BREAK (CTRL-C)"
else
LogRuntimeInfo "Break signal (CTRL-C) received and ignored (Break is disabled)"
fi
;;
3 )
die 251 "QUIT signal received"
;;
15 )
die 253 "Script aborted by the external signal TERM"
;;
"ERR" )
LogMsg "A command ended with an error; the RC is ${__RC}"
;;
"exit" | 0 )
if [ "${__EXIT_VIA_DIE}"x != "${__TRUE}"x ] ; then
LogError "exit signal received; the RC is ${__RC}"
[ "${__INCLUDE_SCRIPT_RUNNING}"x != ""x ] && LogMsg "exit occurred inside of the include script \"${__INCLUDE_SCRIPT_RUNNING}\" "
LogWarning "You should use the function \"die\" to end the program"
fi
return
;;
* ) die 254 "Unknown signal caught: ${__TRAP_SIGNAL}"
;;
esac
}
Example usage:
# add mounts that should automatically be unmounted at script
# end to this variable
#
__LIST_OF_TMP_MOUNTS="${__LIST_OF_TMP_MOUNTS} /tmp/my_mountpoint.$$ "
# add directories that should be automatically removed at script
# end to this variable
#
__LIST_OF_TMP_DIRS="${__LIST_OF_TMP_DIRS} /tmp/mydir.$$ "
# add files that should be automatically removed at script end
# to this variable
__LIST_OF_TMP_FILES="${__LIST_OF_TMP_FILES} /tmp/mydir.$$/myfile.$$ /tmp/myfile.out.$$ "
# add functions that should be called automatically at program
# end to this variable
#
__EXITROUTINES="${__EXITROUTINES} my_cleanup_function"
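To illustrate how the list variables and die work together, here is a strongly simplified sketch of a cleanup routine. It is not the template's implementation: the names carry no "__" prefix to mark them as hypothetical, mounts and finish routines are left out, and entries containing blanks are not supported.

```shell
LIST_OF_TMP_FILES=""
LIST_OF_TMP_DIRS=""
EXITROUTINES=""

function cleanup {
  typeset CUR_ENTRY=""
  # 1. the exit routines run first, while the temporary files still exist
  for CUR_ENTRY in ${EXITROUTINES} ; do
    ${CUR_ENTRY}
  done
  EXITROUTINES=""
  # 2. remove the temporary files
  for CUR_ENTRY in ${LIST_OF_TMP_FILES} ; do
    [ -f "${CUR_ENTRY}" ] && rm -f "${CUR_ENTRY}"
  done
  LIST_OF_TMP_FILES=""
  # 3. remove the temporary directories
  for CUR_ENTRY in ${LIST_OF_TMP_DIRS} ; do
    [ -d "${CUR_ENTRY}" ] && rm -rf "${CUR_ENTRY}"
  done
  LIST_OF_TMP_DIRS=""
  return 0
}

# usage: die returncode [message]
function die {
  typeset THISRC="${1:-0}"
  [ $# -gt 0 ] && shift
  cleanup
  [ $# -gt 0 ] && echo "$*" >&2
  exit ${THISRC}
}

function my_cleanup_function {
  echo "my_cleanup_function called"
}

DEMO_DIR="${TMPDIR:-/tmp}/cleanup_demo.$$"
mkdir "${DEMO_DIR}" && LIST_OF_TMP_DIRS="${LIST_OF_TMP_DIRS} ${DEMO_DIR}"
touch "${DEMO_DIR}/myfile" && LIST_OF_TMP_FILES="${LIST_OF_TMP_FILES} ${DEMO_DIR}/myfile"
EXITROUTINES="${EXITROUTINES} my_cleanup_function"

# die ends the shell, so the demonstration calls it in a subshell
DEMO_RC=0
( die 10 "Error: something went wrong" ) || DEMO_RC=$?
echo "die ended the subshell with return code ${DEMO_RC}"
```

After the call, the temporary file and directory are gone, no matter at which point of the script die was called.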
Additional features of the trap handlers:

As you can see, the trap handler also implements the handling of CTRL-C via the variable __USER_BREAK_ALLOWED: if it is ${__TRUE}, CTRL-C ends the script via die; otherwise, the signal is ignored.

Another useful feature implemented by the trap handler is the possibility to enable or disable the verbose mode for a running script. This comes in very handy for long-running scripts for which you might start in verbose mode and turn the verbose mode off later (or vice versa). To use this feature, send the signal HUP (signal 1) to the running script, for example, with kill -HUP and the process ID of the script. If verbose mode is enabled, this signal will disable it. And if verbose mode is disabled, the signal will enable it.

The following line is necessary to handle traps which occur in sourced-in scripts:
[ "${__INCLUDE_SCRIPT_RUNNING}"x != ""x ] && LogMsg "Trap occurred inside of the include script \"${__INCLUDE_SCRIPT_RUNNING}\" "
To use this feature, use the function includeScript to source in other scripts:
## ---------------------------------------
## includeScript
##
## include a script via . [scriptname]
##
## usage: includeScript [scriptname]
##
## returns: -
##
## notes:
##
function includeScript {
typeset __FUNCTION="includeScript"; ${__FUNCTION_INIT}
${__DEBUG_CODE}
if [ $# -ne 0 ] ; then
# install trap handler
trap "GENERAL_SIGNAL_HANDLER ERR $LINENO" ERR
trap "GENERAL_SIGNAL_HANDLER 1 $LINENO" 1
trap "GENERAL_SIGNAL_HANDLER 2 $LINENO" 2
trap "GENERAL_SIGNAL_HANDLER 3 $LINENO" 3
trap "GENERAL_SIGNAL_HANDLER 15 $LINENO" 15
trap "GENERAL_SIGNAL_HANDLER exit $LINENO" EXIT
LogRuntimeInfo "Including the script \"$*\" ..."
# set the variable for the TRAP handlers
[ ! -f "$1" ] && die 247 "Include script \"$1\" not found"
__INCLUDE_SCRIPT_RUNNING="$1"
# include the script
. $*
# reset the variable for the TRAP handlers
__INCLUDE_SCRIPT_RUNNING=""
fi
}
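The signal handling described in this section (toggling verbose mode with HUP and optionally ignoring CTRL-C) can be reduced to the following sketch. All names are hypothetical, and the values 0 for true and 1 for false are an assumption made for this example. The sketch sends the signals to itself; in real life they would come from kill -HUP with the process ID of the script, or from the keyboard.

```shell
TRUE=0 ; FALSE=1                 # assumed values, for this sketch only
VERBOSE_MODE=${FALSE}
USER_BREAK_ALLOWED=${FALSE}      # CTRL-C is ignored in this demonstration

function signal_handler {
  typeset THIS_SIGNAL="$1"
  case "${THIS_SIGNAL}" in
    HUP )
      # toggle the verbose mode
      if [ "${VERBOSE_MODE}" -eq ${TRUE} ] ; then
        VERBOSE_MODE=${FALSE}
      else
        VERBOSE_MODE=${TRUE}
      fi
      echo "HUP signal received; verbose mode is now ${VERBOSE_MODE}" >&2
      ;;
    INT )
      if [ "${USER_BREAK_ALLOWED}" -eq ${TRUE} ] ; then
        echo "Script aborted by the user (CTRL-C)" >&2
        exit 252
      fi
      echo "Break signal (CTRL-C) received and ignored" >&2
      ;;
  esac
}

trap 'signal_handler HUP' HUP
trap 'signal_handler INT' INT

kill -HUP $$      # switches the verbose mode on
kill -INT $$      # ignored because USER_BREAK_ALLOWED is false
echo "still running; verbose mode: ${VERBOSE_MODE}"
```

Note that a shell cannot trap a signal that was already ignored when the script was started (for example, HUP under nohup).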
Some parameters are the same for all scripts and it makes sense to define them in the script template. The script template predefines a number of such parameters, among them the parameters -v, -H, -C, and -D, which are used elsewhere in this article.

In the Solaris 10 OS, the script also supports parameters with long names.
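Predefined parameters of this kind are typically processed with getopts. The following sketch shows the idea for three of the parameters mentioned in this article; it does not reproduce the template's parameter handling. The function and variable names are hypothetical, 0 is assumed to mean true, and long parameter names are not covered.

```shell
VERBOSE_LEVEL=0          # incremented for each -v
WRITE_CONFIG_FILE=1      # 0 = true, 1 = false (assumed convention)
SHOW_DOCUMENTATION=1

# usage: process_parameters "$@"
# returns: 0 - ok
#          2 - invalid parameter found
function process_parameters {
  typeset CUR_SWITCH=""
  OPTIND=1
  while getopts ":vCH" CUR_SWITCH ; do
    case "${CUR_SWITCH}" in
      v ) VERBOSE_LEVEL=$(( ${VERBOSE_LEVEL} + 1 )) ;;
      C ) WRITE_CONFIG_FILE=0 ;;
      H ) SHOW_DOCUMENTATION=0 ;;
      \? ) echo "Unknown parameter: -${OPTARG}" >&2
           return 2
           ;;
    esac
  done
  return 0
}

process_parameters -v -v -C
echo "verbose level: ${VERBOSE_LEVEL}, write config file: ${WRITE_CONFIG_FILE}"
```

Counting the occurrences of -v is what makes the verbose levels of LogInfo possible.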
The script template initializes the (IMHO) most frequently used variables:

## __SCRIPTNAME - name of the script without the path
##
## __SCRIPTDIR - path of the script (as entered by the user!)
##
## __REAL_SCRIPTDIR - path of the script (real path, maybe a link)
##
## __CONFIG_FILE - name of the config file
## (use ReadConfigFile to read the config file;
## use WriteConfigFile to write it)
##
## __HOSTNAME - hostname
##
## __NODENAME - nodename
##
## __OS - Operating system (e.g. SunOS)
##
## __OS_VERSION - Operating system version (e.g. 5.8)
##
## __ZONENAME - name of the current zone if running in Solaris 10 OS
## or newer
##
## __OS_RELEASE - Operating system release (e.g. Generic_112233-08)
##
## __MACHINE_CLASS - Machine class (e.g. sun4u)
##
## __MACHINE_PLATFORM - machine platform (e.g. SUNW,Ultra-4)
##
## __MACHINE_SUBTYPE - machine type (e.g. Sun Fire 3800)
##
## __MACHINE_ARC - machine architecture (e.g. SPARC)
##
## __START_DIR - working directory when starting the script
##
## __LOGON_USERID - ID of the user opening the session
##
## __USERID - ID of the user executing this script (e.g. xtrnaw7)
##
## __RUNLEVEL - current runlevel
##

The script template also supports some environment variables.
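A few of the variables in the list above could be initialized as follows. This is only a sketch of one possible way, not the template's code: the names have no "__" prefix to mark them as illustrations, and the sketch assumes a pwd that supports the option -P.

```shell
SCRIPTNAME="${0##*/}"                # name of the script without the path

case "$0" in
  */* ) SCRIPTDIR="${0%/*}" ;;       # path of the script as entered by the user
  *   ) SCRIPTDIR="." ;;
esac

# real path of the script directory (assumption: pwd supports -P)
REAL_SCRIPTDIR="$( cd "${SCRIPTDIR}" && pwd -P )"

NODENAME="$( uname -n )"
OS="$( uname -s )"                   # e.g. SunOS
OS_VERSION="$( uname -r )"           # e.g. 5.8
MACHINE_CLASS="$( uname -m )"        # e.g. sun4u
START_DIR="$( pwd )"                 # working directory when starting the script

echo "${SCRIPTNAME} runs on ${NODENAME} (${OS} ${OS_VERSION}, ${MACHINE_CLASS})"
```

Doing this once at the start of the script means that the rest of the code never has to call uname again.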
Another weak point of a lot of existing scripts is the error and return code handling. To simplify the error handling, the script template contains a lot of error checking and uses well-defined return codes for each error. Using the function die, a failing command can be handled in a single line:

mkdir /var/tmp/mydir || die 10 "Error creating the directory /var/tmp/mydir"

The defined return codes are:

## Predefined return codes:
##
## 1 - Show usage and exit
## 2 - Invalid parameter found
##
## 210 - 237 Reserved for the runtime system
## 238 - Unsupported operating system
## 239 - Script runs in a not-supported zone
## 240 - Internal error
## 241 - A command ended with an error (set -e is necessary to
## activate this trap)
## 242 - The current user is not allowed to execute this script
## 243 - Invalid machine architecture
## 244 - Invalid processor type
## 245 - Invalid machine platform
## 246 - Error writing the config file
## 247 - Include script not found
## 248 - Unsupported OS version
## 249 - Script not executed by root
## 250 - Script is already running
##
## 251 - QUIT signal received
## 252 - User break
## 253 - TERM signal received
## 254 - Unknown external signal received

The script template also implements some debugging facilities for scripts, but I won't talk about this feature here in more depth. Just take a look in the source code if you're interested or add some code to the main routine and call the script with the parameter -D:

./scriptt.sh -D

Functions Defined in the Script Template

Coming from a more string-oriented language like REXX, one thing that is really missing in ksh is a set of string manipulation functions. Therefore, I wrote some functions for doing this. The functions use internal ksh functions (like pattern matching, typeset, and so on) as much as possible to avoid the use of external binaries like sed or awk. All functions returning a string either return the value in a variable or print it to STDOUT; this is done with the following code:
if [ "$4"x != ""x ] ; then
eval $4=\"${resultstr}\"
else
echo "${resultstr}"
fi
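As an illustration of this calling convention, here is a small string function that uses only shell pattern matching. The function strip_path_and_suffix is a hypothetical example and not one of the template's functions. The eval is written so that the result is expanded only when the assignment is executed, which keeps it safe for results that contain blanks or quotes.

```shell
# usage: strip_path_and_suffix filename [resultvar]
# returns the file name without the directory and without the last suffix,
# either in the variable resultvar or on STDOUT
function strip_path_and_suffix {
  typeset resultstr="${1##*/}"       # remove everything up to the last slash
  resultstr="${resultstr%.*}"        # remove the last suffix, if there is one
  if [ "$2"x != ""x ] ; then
    eval "$2=\"\${resultstr}\""
  else
    echo "${resultstr}"
  fi
}

# print the result to STDOUT
strip_path_and_suffix /var/tmp/scriptt.sh

# return the result in the variable MY_NAME
strip_path_and_suffix "/var/tmp/my backup.tar.gz" MY_NAME
echo "${MY_NAME}"
```

Returning the result in a variable avoids the subshell that a command substitution would otherwise start for every call. The name of the result variable must differ from the local variable resultstr.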
The string handling functions defined are:
Also missing in standard ksh scripts are functions for converting data. The functions defined for this purpose in the script template are:
Other often used functions are those to process UIDs and user names:
Functions to Implement a LIFO Stack

There are also functions to implement a simple LIFO stack. This is a very handy feature to temporarily save the contents of variables. The functions for stack handling are:
Example usage of these functions:
# for debugging
push_and_set __VERBOSE_MODE ${__TRUE}
push_and_set __VERBOSE_LEVEL ${__RT_VERBOSE_LEVEL}
LogInfo 0 "Setting variable $P2= \"$( eval "echo \"\$$P1\"")\" "
# restore the values in reverse order (LIFO)
pop __VERBOSE_LEVEL
pop __VERBOSE_MODE
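A stack like this can be built without arrays by using eval and numbered variables. The following sketch shows the principle under that assumption; the implementation in the template may differ, and the variables STACK_SIZE and STACK_ENTRY_n are hypothetical names.

```shell
STACK_SIZE=0

# usage: push value
function push {
  STACK_SIZE=$(( ${STACK_SIZE} + 1 ))
  eval "STACK_ENTRY_${STACK_SIZE}=\"\$1\""
}

# usage: pop variable
# returns: 0 - the variable contains the last pushed value
#          1 - the stack is empty
function pop {
  [ "${STACK_SIZE}" -gt 0 ] || return 1
  eval "$1=\"\${STACK_ENTRY_${STACK_SIZE}}\""
  unset "STACK_ENTRY_${STACK_SIZE}"
  STACK_SIZE=$(( ${STACK_SIZE} - 1 ))
  return 0
}

# usage: push_and_set variable new_value
# save the current value of the variable on the stack and set the new value
function push_and_set {
  eval "push \"\${$1}\""
  eval "$1=\"\$2\""
}

VERBOSE_MODE=1
VERBOSE_LEVEL=0
push_and_set VERBOSE_MODE 0
push_and_set VERBOSE_LEVEL 2
echo "temporary values: mode=${VERBOSE_MODE} level=${VERBOSE_LEVEL}"

# the values come back in reverse order
pop VERBOSE_LEVEL
pop VERBOSE_MODE
echo "restored values:  mode=${VERBOSE_MODE} level=${VERBOSE_LEVEL}"
```

Because the stack is LIFO, the variables must be restored in the reverse order of the push_and_set calls.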
There are other functions defined in the script template that may or may not be useful. Use
Source Code

Here's the source code of the script template. Please save without the ".txt" suffix.