Optimizing SAS Systems Performance on Solaris
This paper is designed to help you to maximize the performance of the SAS
System. The following sections suggest ways to increase the efficiency of
your SAS job in terms of the three critical resources: I/O, memory, and CPU
time. While you may not be able to take advantage of every technique for
every job, you can choose the ones that are best suited for your particular
job.
NOTE: For a comprehensive list of efficient programming tips, see the SAS
Programming Tips: A Guide to Efficient SAS Processing and the SAS
Companions for UNIX Environments.
Techniques for Optimizing I/O
I/O is one of the most important factors for optimizing performance. Most
SAS jobs consist of repeated cycles of reading a particular set of data in
order to perform various data analysis and data manipulation tasks. To improve
the performance of a SAS job, you need to reduce the number of times the
SAS System accesses disk or tape devices.
You can do this in two ways:
-
Modify your SAS programs to reduce the number of times you have to process
the data. Techniques include using WHERE processing, using indexes, using
both engines and data sets efficiently, and accessing data through views.
-
Reduce the number of data accesses by processing more data each time the
device is accessed. See details on the BUFNO=, BUFSIZE=, CATCACHE=,
and COMPRESS= options in the SAS System Options portion of this
paper.
Another way to improve efficiency is to use the DROP, KEEP, and LENGTH statements
to reduce the size of any given observation and to use the OBS= and FIRSTOBS=
options to reduce the number of observations processed.
When you create a temporary data set and include only the needed variables
and observations, you can reduce the number of I/Os required to process the
data. See Chapter 4 "Starting with SAS Data Sets," in SAS Language and
Procedures: Usage, Version 6, First Edition for more information on DROP,
KEEP, LENGTH, OBS and FIRSTOBS.
One final way to improve efficiency is to balance the I/O processing by keeping
the SAS WORK directory and system swap files on separate disk drives.
Techniques for Optimizing Memory Usage
If memory is your critical resource, several techniques can reduce the need
for additional memory. However, most of them will also increase I/O processing
or CPU usage.
By increasing the value of the MEMSIZE system option, you can decrease the
processing time because the amount of time spent on paging is reduced.
You can make tradeoffs between memory and other resources. To make the most
of the I/O subsystem, you need to use more and larger buffers. These buffers
must share space with the other memory demands of your SAS session.
Techniques for Optimizing CPU Performance
Executing a single stream of code takes approximately the same amount of
CPU each time that code is executed. Optimizing CPU performance in these
instances is usually a tradeoff, and the cost is using more memory (see MEMSIZE
System Option).
Because the CPU performs all the processing needed to perform an I/O operation,
an option or technique that reduces the number of I/O operations often has
a positive effect on CPU usage.
-
Storing Compiled Code for Computation-Intensive DATA Steps - Another technique
that can improve CPU performance is to store DATA step code that is executed
repeatedly as compiled code rather than as SAS source code. This is especially
true for large DATA step jobs that are not I/O-intensive. For more information
on the Stored Program Facility, see Appendix 3 in SAS Language:
Reference.
-
Reducing Search Time for SAS Executable Files - Your default configuration
file specifies a certain order for the directories containing SAS executable
files. You can rearrange the directory specifications in the PATH option
so that the most commonly accessed directories are listed first. Place the
least commonly accessed directories last.
-
Specifying Variable Lengths - When the SAS System processes the data vector,
it typically moves the data in one large operation rather than as individual
variables. When data are properly aligned (on 8-byte boundaries), data movement
can occur in as little as 2 clock cycles (a single load followed by a single
store). Unaligned data are moved by more complex means, at worst a single
byte at a time, which would be at least eight times slower for an 8-byte variable.
Many high-performance RISC processors pay a very large penalty for movement
of unaligned data. When possible, follow these suggestions for keeping data
aligned: 1) leave numeric data at full width (8 bytes); and 2) keep
character data in multiples of 8 bytes in length. Note that the SAS System
must widen short numeric data for any arithmetic operation, although short
numeric data can save both memory and I/O. Padding character data wastes memory,
but it does keep data aligned.
These suggestions are especially important when you process a data set by
selecting only specific variables and by using WHERE clause processing. It is
important that the selected variables are properly aligned.
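As an illustration of the second suggestion, the following Python sketch rounds a character variable's length up to the next multiple of 8 bytes. The function name and the sample lengths are hypothetical and are not part of the SAS System; the sketch only shows the arithmetic behind the padding.

```python
def aligned_char_length(length: int, boundary: int = 8) -> int:
    """Round a character-variable length up to the next multiple of `boundary`.

    Hypothetical helper: it only illustrates the padding arithmetic
    described in the text, not any SAS System behavior.
    """
    if length <= 0:
        raise ValueError("length must be a positive number of bytes")
    return ((length + boundary - 1) // boundary) * boundary


if __name__ == "__main__":
    for declared in (5, 8, 13):
        padded = aligned_char_length(declared)
        print(f"declared {declared} bytes -> aligned {padded} bytes "
              f"({padded - declared} bytes of padding)")
```

The padding column shows the memory cost of alignment that the text mentions: a 13-byte variable grows to 16 bytes.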
SAS System Options:
-
BUFNO
The BUFNO= system option specifies the number of buffers to be allocated
for processing a SAS data set. The number of buffers is not a permanent attribute
of the data set, and it is valid only for the current SAS session or job.
The BUFNO= option applies to SAS data sets opened for input, output, or update.
NOTE: Using the BUFNO= system option can speed up execution time by
limiting the number of input/output operations required for a particular
SAS data set. The improvement in execution time, however, comes at the expense
of increased memory consumption.
Default: 1
-
BUFSIZE
The BUFSIZE= option specifies the size of input/output buffers for SAS data
sets. The size of the input/output buffers is permanently associated with
the SAS data set. If the number of bytes is greater than 0 when a SAS data
set is created, that number is used as the default value for the BUFSIZE=
data set option. If the BUFSIZE= data set option is not used and the number
of bytes for the BUFSIZE= system option is 0, the SAS System chooses a host
system default value that is optimal for the SAS data set.
NOTE: Using the BUFSIZE= system option can speed up execution time by limiting
the number of input/output operations required for a particular SAS data
set. The improvement in execution time, however, comes at the expense of
increased memory consumption.
Default: 0
-
CATCACHE
The CATCACHE=n system option specifies the number of SAS catalogs to keep
open. If n is greater than 0, the SAS System places up to that number of
open-file descriptors in cache memory instead of closing the catalogs. If
n is 0, no open-file descriptors are kept in cache memory. You can
use the CATCACHE= system option to tune an application by avoiding the
overhead of repeatedly opening and closing the same catalogs. Use of
the CATCACHE= option can potentially improve performance by keeping the catalogs
needed for a SAS application in memory during the entire SAS session.
NOTE: The increased performance in catalog I/O can also use considerable
memory resources; use this technique only if memory issues are not a
concern.
Default: 0
-
COMPRESS
The COMPRESS=YES|NO system option specifies whether observations in a newly
created SAS output data set are compressed (variable-length records) or
uncompressed (fixed-length records). The record type is a permanent attribute
of the SAS data set. Compressing a data set reduces the size of the data
set by reducing repeated consecutive characters to two- or three-byte
representations. To uncompress observations, you must use a DATA step to
copy the data set and specify COMPRESS=NO for the new data set. The advantages
gained by using the COMPRESS= data set option include 1) reduced storage
requirements for the data set; and 2) fewer input and output operations necessary
to read from or write to the data set during processing.
NOTE: Using the COMPRESS= system option prevents access to a SAS data set
by observation number. Also, using this option increases the CPU time for
reading a data set because of the overhead of compressing and uncompressing
the records.
-
IMPLMAC
The IMPLMAC system option controls whether macros defined as statement-style
macros can be invoked with statement-style macro calls or if the call must
be a name-style macro call.
NOTE: When you use the IMPLMAC system option, processing time is increased
because the SAS System checks every SAS statement to determine whether the
beginning word is a macro call. When you use the IMPLMAC system option in
conjunction with the MAUTOSOURCE system option, the MRECALL system option, or both,
processing time can be increased further.
-
MEMSIZE
MEMSIZE= n | nK | nM | nG | MAX. The MEMSIZE option specifies a limit on
the total amount of memory the SAS System uses at any one time. The operating
system may use additional amounts of memory. A value of 0 allows the SAS System
to use all available memory, up to the system limit. Too low a value will
result in out-of-memory conditions. When you increase the value of SORTSIZE,
you will need to increase the value of MEMSIZE. This option can take the
following values:
-
n - specifies the amount of memory in bytes.
-
nK - specifies the amount of memory in kilobytes.
-
nM - specifies the amount of memory in megabytes.
-
nG - specifies the amount of memory in gigabytes.
NOTE: The MEMSIZE option must be set during the invocation of the SAS System
by modifying the CONFIG.SAS file or passing it as a parameter to the SAS
command.
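To make the n, nK, nM, and nG forms concrete, here is a minimal Python sketch that converts such a specification to bytes. It assumes that K, M, and G denote powers of 1024; the function name is hypothetical, and the MAX keyword is deliberately not handled because its value depends on the host.

```python
import re

# Assumption: K, M, and G are binary multiples (1024-based).
_UNIT_BYTES = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def size_to_bytes(spec: str) -> int:
    """Convert a size such as '512K' or '64M' to a byte count.

    Hypothetical helper for illustration only; 'MAX' is rejected
    because its meaning is host dependent.
    """
    match = re.fullmatch(r"(\d+)([KMG]?)", spec.strip().upper())
    if match is None:
        raise ValueError(f"unsupported size specification: {spec!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_BYTES[unit]


if __name__ == "__main__":
    for spec in ("65536", "512K", "64M", "1G"):
        print(f"{spec:>6} = {size_to_bytes(spec):,} bytes")
```

A conversion like this is handy when comparing MEMSIZE and SORTSIZE settings that are written with different suffixes.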
-
MSYMTABMAX
MSYMTABMAX= n | nK | nM | nG | MAX. The MSYMTABMAX= system option specifies
the maximum amount of memory available to the macro variable symbol table(s).
Once this value is reached, additional macro variables are written out to
disk. The value you specify with the MSYMTABMAX= system option can range
from 0 to the largest non-negative integer representable on your host. The
value can be expressed as follows:
-
n - specifies the amount of memory in bytes.
-
nK - specifies the amount of memory in kilobytes.
-
nM - specifies the amount of memory in megabytes.
-
nG - specifies the amount of memory in gigabytes.
-
MAX - specifies the maximum amount of memory available.
Default: 8K
-
MVARSIZE
MVARSIZE= n | nK | nM | nG | MAX. The MVARSIZE= system option specifies the
maximum size for in-memory macro variables. If the size is larger than this
value, variables are written out to disk. The value you specify with the
MVARSIZE= system option can range from 0 to the largest non-negative integer
representable on your host. The value can be expressed as follows:
-
n - specifies the amount of memory in bytes.
-
nK - specifies the amount of memory in kilobytes.
-
nM - specifies the amount of memory in megabytes.
-
nG - specifies the amount of memory in gigabytes.
-
MAX - specifies the maximum amount of memory available.
Default: 512K
-
RESIDENT
RESIDENT=num specifies whether an SCL entry is saved in resident memory the
first time that it is executed instead of being re-read from the catalog
on subsequent calls. Ranges for <num> are:
-
< 0 to save in memory only SCL entries containing a METHOD statement
with the /RESIDENT option.
-
= 0 to save no SCL entries in memory.
-
> 0 to save <num> entries in memory. By default, the number
of SCL entries saved in memory is 64.
When an SCL entry executes, SCL searches resident memory for the entry. If
the search is successful, the entry moves to the top of the search list.
An SCL entry that is called frequently remains at or near the top of the
list and so is found more quickly. When an SCL entry is not found on the
search list, the last entry on the search list (the least recently used)
is removed, and the new entry is inserted at the top of the list.
-
SORTSIZE
The SORTSIZE= system option specifies the maximum amount of memory available
to the SORT procedure (or the sort utility specified with the SORTPGM= system
option). The 'memory-specification' can be one of the following:
-
MAX - specifies that all available memory can be used.
-
n - specifies the amount in bytes.
-
nK - specifies the amount in kilobytes.
-
nM - specifies the amount of memory in megabytes.
Specifying the SORTSIZE= option in the PROC SORT statement temporarily overrides
the setting for the SORTSIZE= system option. The value of the SORTSIZE= system
option is the default. When you increase the value of SORTSIZE, please make
sure you increase the value of MEMSIZE as well.
NOTE: Using this option can help improve sort performance by restricting
the virtual memory paging controlled by the host operating system. If the
SORT procedure needs more memory than you specify, it uses a temporary utility
file. As a general rule, the value you use for SORTSIZE= should be set to
less than the physical memory available to your process.
Default: 16M
SAS Procedures that use Extra Resources:
-
CONTENTS with FMTLEN Option
The FMTLEN option prints the default length of formats and informats if they
do not have a specified length. If you omit the FMTLEN option, the CONTENTS
statement still prints the informat or format, but it does not include the
length unless the informat or format has a specified length.
NOTE: When you use the FMTLEN option, the SAS System uses additional CPU
time, I/O time, and memory to load the format and determine its length.
Default length of format: length of the longest formatted value
Default length of informat: longest informatted value
-
FREQ with TABLES Statement - EXACT Option
TABLES requests / EXACT ; The EXACT option requests Fisher's exact test for
tables that are larger than 2X2. The computational algorithm is the network
algorithm given by Mehta and Patel (1983).
NOTE: This option is not turned on when the ALL option is specified.
WARNING: This test is very intensive in the use of memory and
CPU time. It is NOT recommended when n / ((r-1)(c-1)) > 5
or when MIN(r, c) > 5. (n is the sample size. r is the number of rows.
c is the number of columns.)
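The rule of thumb in the warning can be checked before a job is submitted. The following Python sketch is only an illustration of that arithmetic; the function name and the sample table sizes are hypothetical.

```python
def exact_test_discouraged(n: int, r: int, c: int) -> bool:
    """Apply the warning's rule of thumb for the EXACT option.

    n is the sample size, r the number of rows, c the number of columns.
    Returns True when n / ((r-1)(c-1)) > 5 or MIN(r, c) > 5.
    """
    if r < 2 or c < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return n / ((r - 1) * (c - 1)) > 5 or min(r, c) > 5


if __name__ == "__main__":
    # Hypothetical tables, given as (sample size, rows, columns).
    for n, r, c in ((20, 3, 4), (100, 3, 3), (10, 6, 6)):
        verdict = "not recommended" if exact_test_discouraged(n, r, c) else "reasonable"
        print(f"n={n}, {r}x{c} table: exact test {verdict}")
```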
-
LOGISTIC
For each BY group, define:
$K$ = number of response levels
$C$ = 1 + number of explanatory variables
$m_1 = K + C$
$m_2 = 1 + m_1$
$m_3 = 16 m_1 (m_1 + 5)$
$m_4 = 8C(C + 4) + 4 m_2 (m_2 + 3)$
The minimum working space needed to process the BY group is $m_3$ bytes.
For models with more than two response levels, a test of the parallel lines
assumption requires an additional workspace of $m_4$ bytes. However,
if this additional memory is not available, the procedure skips the test and
finishes the other computations. If sufficient space is available, the relevant
variables and observations from the input data set are also kept in memory;
otherwise, the input data set is reread for each evaluation of the likelihood
function and its derivatives, with the resulting execution time of the procedure
substantially increased.
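The working-space quantities defined above are straightforward to evaluate. The following Python sketch computes them for one BY group; the function name and the sample model (3 response levels, 4 explanatory variables) are hypothetical.

```python
def logistic_workspace(response_levels: int, explanatory_vars: int) -> tuple[int, int]:
    """Return (minimum working space, extra space for the parallel lines test), in bytes.

    Follows the definitions in the text:
      K  = number of response levels
      C  = 1 + number of explanatory variables
      m1 = K + C
      m2 = 1 + m1
      m3 = 16 * m1 * (m1 + 5)
      m4 = 8 * C * (C + 4) + 4 * m2 * (m2 + 3)
    """
    k = response_levels
    c = 1 + explanatory_vars
    m1 = k + c
    m2 = 1 + m1
    m3 = 16 * m1 * (m1 + 5)
    m4 = 8 * c * (c + 4) + 4 * m2 * (m2 + 3)
    return m3, m4


if __name__ == "__main__":
    minimum, parallel_lines = logistic_workspace(response_levels=3, explanatory_vars=4)
    print(f"minimum working space: {minimum} bytes")            # 1664
    print(f"parallel lines test adds: {parallel_lines} bytes")  # 792
```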
-
MDDB
The minimum working space and virtual memory needs for creating an MDDB are:
-
For every MDDB: 900 byte overhead
-
For every analysis variable: 676 byte overhead
-
For every class variable: 340 byte overhead + (maximum formatted length of
the variable * number of values) + (unformatted length of variable * number
of values).
-
For each hierarchy: 296 byte overhead (always at least one - NWAY)
-
For each hierarchy: (number of dimensions * 4 + number of analysis vars *
number of stats * 8) * number of crossings in hierarchy.
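The components listed above can be summed to estimate the space needed for a particular MDDB. The following Python sketch is one way to do that arithmetic; the function name, the argument layout, and the sample cube (1 analysis variable, 2 statistics, 2 class variables, 1 NWAY hierarchy with 15 crossings) are all hypothetical.

```python
def mddb_bytes(analysis_vars, stats, class_vars, hierarchies):
    """Estimate MDDB working space in bytes from the listed overheads.

    class_vars:  list of (max formatted length, unformatted length, number of values)
    hierarchies: list of (number of dimensions, number of crossings)
    """
    total = 900                      # overhead for every MDDB
    total += 676 * analysis_vars     # overhead for every analysis variable
    for formatted_len, unformatted_len, values in class_vars:
        total += 340 + formatted_len * values + unformatted_len * values
    for dimensions, crossings in hierarchies:
        total += 296                 # overhead for each hierarchy
        total += (dimensions * 4 + analysis_vars * stats * 8) * crossings
    return total


if __name__ == "__main__":
    estimate = mddb_bytes(
        analysis_vars=1,
        stats=2,
        class_vars=[(10, 8, 5), (4, 4, 3)],
        hierarchies=[(2, 15)],       # the NWAY hierarchy
    )
    print(f"estimated MDDB working space: {estimate} bytes")  # 3026
```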
-
MEANS with CLASS Statement
CLASS variable-list; The CLASS statement assigns the variables used to form
subgroups. The CLASS statement has basically the same effect on the statistics
computed as that of the BY statement. The differences are in the format of
the printed output and in the sorting requirements of the BY statement.
NOTE: Theoretically, the maximum number of combinations of CLASS levels is
200 million. Realistically, it becomes a machine-dependent estimate, limited
solely by the amount of computer memory available. The maximum number of
CLASS variables is 30.
-
MULTTEST
PROC MULTTEST keeps all of the data in memory to expedite resampling. A large
portion of the memory requirement is thus 8*NOBS*NVAR bytes, where NOBS is
the number of observations in the data set, and NVAR is the number of variables
analyzed, including CLASS, FREQ, and STRATA variables. If you specify
PERMUTATION=number (for exact permutation distributions), then PROC
MULTTEST requires additional memory. This requirement is approximately
4*NTEST*NSTRATA*CMAX*number*(number+1) bytes, where NTEST is
the number of contrasts, NSTRATA is the number of STRATA levels, and CMAX
is the maximum contrast coefficient. The execution time is linear in the
number of resamples.
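The two memory terms above can be estimated ahead of time. The following Python sketch evaluates them; the function names and the sample values are hypothetical, and the results are only the approximate portions of memory that the text describes.

```python
def multtest_data_bytes(nobs: int, nvar: int) -> int:
    """Memory for holding the data: 8 * NOBS * NVAR bytes."""
    return 8 * nobs * nvar


def multtest_permutation_bytes(ntest: int, nstrata: int, cmax: int, number: int) -> int:
    """Approximate extra memory when PERMUTATION=number is specified:
    4 * NTEST * NSTRATA * CMAX * number * (number + 1) bytes."""
    return 4 * ntest * nstrata * cmax * number * (number + 1)


if __name__ == "__main__":
    data = multtest_data_bytes(nobs=1000, nvar=5)
    extra = multtest_permutation_bytes(ntest=2, nstrata=3, cmax=2, number=10)
    print(f"data held in memory:     {data} bytes")   # 40000
    print(f"permutation requirement: {extra} bytes")  # 5280
```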
-
NPAR1WAY
Although the computational algorithm is fast, the computational time can
still be prohibitive, depending on the number of groups, the number of distinct
response variables, the total sample size, and the speed and memory available
on your computer. You can terminate exact computations and exit the NPAR1WAY
procedure at any time by pressing the system interrupt key (refer to the
SAS Companion for your system) and choosing to stop computations.
-
PHREG
The PHREG procedure performs regression analysis of survival data based on
the Cox proportional hazards model. This procedure is very compute-intensive
and will perform faster if given more memory. A simple formula for
the minimum working space (in bytes) needed to process the BY group is
$\max\{12n,\ 24p^2 + 160p\}$, where $n$ is the number of
observations in a BY group and $p$ is the number of explanatory variables.
If sufficient space is available, the input
data set is also kept in memory. Otherwise, the input data is reread from
the utility file for each evaluation of the likelihood function and its
derivatives, with the resulting execution time substantially increased.
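The minimum working space for PHREG depends on whichever term is larger: the one driven by observations or the one driven by explanatory variables. The following Python sketch evaluates the formula; the function name and the sample sizes are hypothetical.

```python
def phreg_min_workspace(n: int, p: int) -> int:
    """Minimum working space in bytes for one BY group: max(12n, 24p^2 + 160p).

    n is the number of observations in the BY group;
    p is the number of explanatory variables.
    """
    return max(12 * n, 24 * p * p + 160 * p)


if __name__ == "__main__":
    # Many observations, few variables: the 12n term dominates.
    print(phreg_min_workspace(n=10_000, p=10))  # 120000
    # Few observations, more variables: the quadratic term dominates.
    print(phreg_min_workspace(n=100, p=20))     # 12800
```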
-
REG
The REG procedure is efficient for ordinary regression; however, requests
for optional features can greatly increase the amount of time required. The
major computational expense in the regression analysis is the collection
of the crossproducts matrix. For $p$ variables and $n$
observations, the time required is proportional to $np^2$. For
each model run, REG needs time roughly proportional to
$k^3$, where $k$ is the number of regressors in the model.
Add an additional $nk^2$ for one of the R, CLM, or CLI options
and another $nk^2$ for the INFLUENCE option. Most of the memory
REG needs to solve large problems is used for crossproducts matrices. PROC
REG requires $4p^2$ bytes for the main crossproducts matrix
plus $4k^2$ bytes for the largest model. If several output
data sets are requested, memory is also needed for buffers.
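The crossproducts memory requirement is easy to estimate from the number of variables and regressors. The following Python sketch covers only the two matrix terms stated in the text; buffers for output data sets are not included, and the function name and sample sizes are hypothetical.

```python
def reg_crossproducts_bytes(p: int, k: int) -> int:
    """Bytes for the main crossproducts matrix (4p^2) plus the largest model (4k^2).

    p is the number of variables; k is the number of regressors
    in the largest model. Output data set buffers are excluded.
    """
    return 4 * p * p + 4 * k * k


if __name__ == "__main__":
    print(reg_crossproducts_bytes(p=50, k=20))  # 11600
```

Because both terms grow with the square of the variable count, doubling the number of variables roughly quadruples this part of the memory requirement.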
-
SORT
The SORT procedure sorts observations, arranging them in order by the values
of one or more variables. Sorting is one of the most common operations performed
with the SAS System. You can now specify the SORTSIZE= option when you invoke
this procedure. Specifying the SORTSIZE= option in the PROC SORT statement
temporarily overrides the setting of the SORTSIZE= system option. The value
of the SORTSIZE= system option is the default. The SORTSIZE= system option
is discussed earlier in this paper.
NOTE: When you invoke the SORT procedure, the computer system uses either
a sorting module provided by SAS Institute, a sorting utility provided with
the operating system, or a sorting utility provided by an independent
vendor.
WARNING: To sort a SAS data set, you need enough disk space
to hold the original file and at least two more files the same size as the
original one. This will depend on the number of BY variables.
-
SUMMARY with CLASS Statement
CLASS variable-list; The CLASS statement assigns the variables used to form
subgroups. The CLASS variable may be either numeric or character, but normally
each variable has a small number of discrete values or unique levels. The
CLASS statement has an effect on the statistics computed similar to that
of the BY statement.
NOTE: Theoretically, the maximum number of combinations of CLASS levels is
200 million. Realistically, it becomes a machine-dependent estimate, limited
solely by the amount of computer memory available. The maximum number of
CLASS variables is 30.
Performance Considerations of DATA Step Views
Using DATA step views can improve the efficiency of programming and
applications development. However, the requirements placed on machine resources
can increase or decrease depending on the methods of data processing that
you replace by using DATA step views. The impact on machine resources is
determined by the access pattern of the consuming task (DATA step or PROC
step). The consuming task can request the retrieval of data in two ways:
a single pass or multiple passes.
When one pass is requested, no data set is created. Compared to traditional
methods of processing, the one-pass access pattern increases performance
by decreasing the number of input/output operations and elapsed time.
When multiple passes are requested, the view must build a spill file that
contains all generated observations so that subsequent passes can read the
same data read by previous passes. Whenever the consuming task needs to access
only the data across BY groups, the SAS System optimizes multiple passes
by reusing space within the spill file whenever the BY groups change. With
this optimization, the amount of disk space required is the cumulative size
of the largest BY group generated rather than the cumulative size of all
observations generated by the view.
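The effect of the BY-group optimization on spill file size can be illustrated with simple arithmetic. The following Python sketch compares the two cases; the function name, the fixed observation length, and the sample BY-group sizes are hypothetical, and the sketch ignores any file overhead.

```python
def spill_file_bytes(by_group_obs, obs_length, by_group_optimized=True):
    """Estimate spill file size in bytes for a multiple-pass DATA step view.

    by_group_obs: number of observations generated in each BY group
    obs_length:   assumed length of one observation in bytes
    With the BY-group optimization, space is reused when the BY group
    changes, so only the largest BY group must fit.
    """
    if not by_group_obs:
        return 0
    observations = max(by_group_obs) if by_group_optimized else sum(by_group_obs)
    return observations * obs_length


if __name__ == "__main__":
    groups = [100, 250, 75]
    print(spill_file_bytes(groups, obs_length=80))                            # 20000
    print(spill_file_bytes(groups, obs_length=80, by_group_optimized=False))  # 34000
```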
Both the single-pass access pattern and the multiple-pass access pattern
incur a certain overhead in CPU time and memory requirements. As a general
rule, CPU time increases by approximately 10%. This increase is due to an
internal host supervisor requirement and will be addressed in a future release
of the SAS System.
Concerning memory utilization, when a DATA step references a DATA step view,
the overhead incurred is associated with additional storage required to execute
the DATA step view. When a PROC step references a DATA step view, the additional
memory incurred is associated with the DATA step that executes the view.
Solaris Optimization Considerations
-
Work space - SASWORK
Work space areas are temporary holding areas which are reclaimed after job
execution has completed. Thus, I/O should be optimized for this partition.
If possible, avoid using a file system built on RAID-5 for this
WORK area since RAID-5 causes several I/Os for each write operation.
Another consideration is that you might want to use a TMPFS file system.
TMPFS is a memory-resident file system. To do this, you would
add an entry similar to this in /etc/vfstab:
swap - /WORKFS tmpfs - no -
After making the directory /WORKFS, you can mount it with "mount
/WORKFS".
The potential downside to using a memory-based workspace is that if the total
memory demands of Solaris, SAS, and this work area exceed available system
memory, the resulting paging may make performance worse than if TMPFS had not
been used.
-
Memory
Memory affects performance greatly. Optimize for a large memory
configuration. This critical resource is often the most underconfigured
resource. System memory requirements can be calculated by using the
FULLSTIMER option to SAS in conjunction with modifying the MEMSIZE and SORTSIZE
parameters.
-
Tracing system calls
If you need to trace system calls, use truss(1). This command
produces a tremendous amount of output, but it lets you trace program execution
at the system call level.
-
Performance monitoring tools
Aside from bundled command line tools such as vmstat, iostat, ps and sar,
and GUI based tools such as Solstice SYMON, there are several public domain
monitoring tools:
-
proctool
(Note: versions are specific to a particular Solaris release)
Copyright (c) 2000 SAS Institute Inc. Cary,
NC, USA. All rights reserved.
Date last modified:
Fri Jan 28, 2000 18:41 UT