This document describes the basics of Solaris Containers in the Solaris 10 Operating System so that Sun's developers, system administrators, ISVs, partners, and customers know how to address common tasks related to this technology. It also explains how Solaris Containers should be used and what the limitations of this technology are. The goal is to give end users the information they need to make informed decisions about Solaris Containers when developing or deploying applications or managing a Solaris environment.
Solaris Containers
The Solaris 10 OS offers Solaris Containers, a set of technologies that allows system administrators to create separate operating environments on the same system to isolate and protect applications from each other while allowing system utilization to
increase. Solaris 10 Containers comprise two technologies: Solaris Zones partitioning technology and Resource Management. Solaris Zones provide virtualized
operating environments that have their own hostname, IP address(es),
users, file systems, and so on, while giving applications isolation and
protection from each other. Resource Management controls how
system resources are spread across specified workloads. The
combination of both Solaris Zones and Resource Management in Solaris 10
Containers can provide the following benefits:
Reduced management costs through server consolidation and a reduced number of operating system instances
Increased resource utilization with dynamic resource reallocation between containers
Increased service availability by minimizing fault propagation and security violations between applications
Increased flexibility because software-based containers can be dynamically reconfigured
Increased accuracy and flexibility of accounting, which is based on workloads rather than systems or processes
The following sections of this document describe the Solaris Zones feature and Resource
Management in detail. Each section focuses on how these technologies are used by both system administrators and application developers, so that people who want to use Containers have a better understanding of when and how they should be used.
Solaris Zones
Solaris Zones partitioning technology provides virtualized
and secure operating environments for running applications. Every
Solaris system contains a global zone, which is comparable to a normal
Solaris OS instance. Non-global zones can be created on the
system from the global zone by the global zone administrator to create
virtualized operating environments on the system that are isolated and secure from each
other.
The next section focuses on the common tasks that Sun's developers, system administrators, ISVs, partners, and customers will run into when using Solaris Zones, so that a good understanding of Zones partitioning technology can be achieved.
This section covers tasks and activities for both global and non-global zones.
The tasks are broken down into the following two categories:
Each of the areas is examined individually to highlight functional and behavioral differences that occur in both global and non-global zones, so that the end user comes away with a good understanding of how Solaris Zones technology affects the use of the Solaris OS.
Provisioning and Managing the Solaris OS Within Solaris Zones
This section highlights the most common activities that occur while
provisioning and managing the Solaris OS. It then details how these activities can be affected by the introduction of Solaris Zones in the Solaris 10 OS.
Task List for Both Global and Non-Global Zones
The most common activities that occur while provisioning and managing a
Solaris environment from a system administration and application
development perspective are:
Task Analysis for Both Global and Non-Global Zones
Solaris Installation, Updates, and
Upgrades
Installation of Solaris global and non-global zones
A global zone contains a fully functional installation of the Solaris OS that
is bootable by the system hardware. An installation of the Solaris OS
becomes the global zone when it is booted by the system hardware. There is only one global zone running on a system. The global zone is installed by default when
the Solaris 10 OS is installed on a system.
Non-global zones are configured and installed by the global
administrator using the zonecfg(1M) and zoneadm(1M) commands. Over 8000 non-global zones can be created on a system, although not
all systems have the necessary resources to support that many non-global zones. See the Solaris 10 Zones installation documentation for further information and recommendations.
Two types of zones can be installed -- either whole root or
sparse root. See the file
system section of this document for more information.
During non-global zone installation, full replication
of the current package and patch database occurs. See the patch and package management section of this document for more information.
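The configuration and installation steps described above can be sketched as a small shell helper. This is a minimal sketch under stated assumptions, not a definitive procedure: the zone name myzone and the path /zones/myzone are hypothetical, and the helper only prints a zonecfg(1M) command file so that it can be reviewed before anything is run.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Print a zonecfg(1M) command file for a zone that uses the default
# configuration. Arguments: zone path, autoboot value (default false).
zone_create_commands() {
    zonepath=$1
    autoboot=${2:-false}
    cat <<EOF
create
set zonepath=$zonepath
set autoboot=$autoboot
verify
commit
EOF
}

# Show the generated commands for a hypothetical zone path.
zone_create_commands /zones/myzone

# As the global administrator in the global zone, the file could then
# be applied and the zone installed and booted (hypothetical names):
#   zone_create_commands /zones/myzone > /tmp/myzone.cfg
#   zonecfg -z myzone -f /tmp/myzone.cfg
#   zoneadm -z myzone install
#   zoneadm -z myzone boot
```

Generating a command file instead of typing into an interactive zonecfg session keeps the configuration repeatable across several zones.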
Solaris upgrades to new Solaris 10 updates occur via standard upgrade or Live Upgrade.
On a system with only a global zone, upgrades occur as
expected.
Support for upgrading the installed software on any
non-global zone from one Solaris release to a later release is not
provided with the Solaris 10 GA (3/05) release. At this point, standard upgrade and
Live Upgrade do not know about non-global zones. The end result of an
upgrade of a Solaris instance that has non-global zones installed would
be a partially upgraded system. The global zone would be properly
upgraded, but the non-global zones would only be "partially upgraded." Inherited
file systems would have the upgraded files, but non-inherited file systems
would remain in their original state. To prevent a Solaris instance with non-global
zones installed from being damaged by an upgrade attempt, code has been
added to both standard upgrade and Live Upgrade to detect the presence
of non-global zones and to refuse to upgrade if non-global zones are
installed in Solaris 10 GA (3/05). As a result, once a system
administrator configures and installs the first non-global zone, the system cannot be upgraded to a later release of Solaris until all non-global zones have been
removed. Support for upgrades of systems with
non-global zones is expected in Solaris 10 Update 1.
Patch and Package Management
General Information
The global system administrator can administer the software
on every zone on the system.
The root file system for a non-global zone can be
administered from the global zone by using the Solaris packaging and patch tools.
Additionally, the Solaris packaging and patch tools are supported within the non-global zone for administering co-packaged (bundled), standalone (unbundled), or third-party products.
The packaging information visible from within a non-global
zone is consistent with the files that have been installed in that zone using
the Solaris packaging and patch tools. The visibility also includes packages that
have been imported from the global zone using read-only loopback mounts
(inherit-pkg-dir statements during non-global zone creation).
With a whole root non-global zone, all of the packages and
patches referenced in the global zone registry are replicated to the
non-global zone. See the file
system section of this document for more information on the whole
root model.
The package commands can add, remove, and interrogate
packages in both global and non-global zones. The
patch commands can add and remove patches in both global and non-global
zones.
The behavior of packaging in a zone environment varies
according to the factors listed below. (For more information, see the article Bringing
Your Application Into the Zone.)
Use of the -G option in pkgadd(1M), which adds
a package to the global zone only
Setting the package parameters SUNW_PKG_ALLZONES, SUNW_PKG_HOLLOW, or SUNW_PKG_THISZONE in the package's pkginfo file. (See pkginfo(4) for details.)
SUNW_PKG_ALLZONES
This parameter defines the zone scope of the package on
the system.
When it is set to true, the package will be installed
into the global zone and then distributed to each of the non-global
zones on the system.
When it is set to false, the package will only be
installed into the specific zone in which the pkgadd command is
executed unless the command was executed with the -Z option.
SUNW_PKG_HOLLOW
This parameter defines the visibility of a
package that is required to be installed on
all zones and be identical in all zones.
When this parameter is set to true, the global zone
replicates the package database entry into both current and future
non-global zones.
When this parameter is set to false, the
package is replicated into the non-global zone regardless of its type.
SUNW_PKG_THISZONE
This parameter defines whether a package must be
installed in the current zone only.
When this parameter is true and pkgadd(1M) is executed in a
non-global zone, the package is installed only in that non-global
zone.
Type of zone, global or non-global, in which pkgadd(1M)
is invoked.
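To check how a package declares these parameters before adding it, the pkginfo(4) file can be inspected directly. The following is a minimal sketch: the package name EXMPLpkg and the temporary file are hypothetical, and treating an unset parameter as false is an assumption made for illustration.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Print the value of one parameter from a pkginfo(4) file.
# Assumption: a parameter that is not set is reported as "false".
pkg_zone_param() {
    file=$1
    param=$2
    # Keep the last matching assignment and strip optional quotes.
    value=$(sed -n "s/^$param=//p" "$file" | sed -n '$p' | tr -d '"')
    echo "${value:-false}"
}

# Hypothetical pkginfo(4) fragment for a package intended for all zones.
demo=${TMPDIR:-/tmp}/pkginfo.demo.$$
cat > "$demo" <<EOF
PKG=EXMPLpkg
NAME=Example package
SUNW_PKG_ALLZONES=true
SUNW_PKG_HOLLOW=false
EOF

for p in SUNW_PKG_ALLZONES SUNW_PKG_HOLLOW SUNW_PKG_THISZONE; do
    echo "$p=$(pkg_zone_param "$demo" "$p")"
done
rm -f "$demo"
```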
Package and patch commands have been modified to accommodate
zones and each zone maintains its own patch and package database.
A package that needs to be interactive is added to the current
zone only. If the current zone is the global zone, the package is
treated as though it is being added by using the pkgadd command
with the -G option. For the rules of how packages
are added in zones, see the pkgadd(1M) section below.
pkgadd(1M)
pkgadd(1M) in the global
zone:
Packages can be added as follows:
To the global zone only, unless the package is SUNW_PKG_ALLZONES=true or the package contents affect any area of the global zone that is shared with any non-global zone.
To the global zone and to all non-global zones without
regard to the area affected by the package.
To all non-global zones only, if the package is already installed in the global zone.
To the current zone only, if SUNW_PKG_THISZONE=true.
Packages cannot be added:
If the destination is any subset of the non-global zones.
To all non-global zones, unless the package is already
installed in the global zone.
To add a package to the global zone only, execute the pkgadd(1M) utility with only the -G option, as the global administrator in the global zone.
To add a package to the global zone and to all non-global
zones, execute the pkgadd(1M) utility in the global zone as
the global administrator. Run it without the -G
or -Z options.
At this time, the option to add a package that is already
installed on the global zone to all of the non-global zones (-Z) is not yet implemented. To accomplish this, use pkgrm(1M) to remove the package from the global zone, and then add the package without using the -G
option.
pkgadd(1M) in a non-global zone:
To add a package in a specified non-global zone, execute
the pkgadd(1M) utility, without options, as the zone
administrator. The following conditions apply:
The pkgadd(1M) utility can only add packages
in the non-global zone in which the utility is used.
The package cannot affect any area of the zone that is
shared from the global zone.
The package must be set SUNW_PKG_ALLZONES=false.
Neither the -G option nor the -Z
option can be used. If either of these options is used, pkgadd(1M) outputs an error message and the attempted operation fails.
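The pkgadd(1M) rules above reduce to a choice between running the command with or without the -G option. The helper below is a minimal sketch that prints the command line for a given scope instead of running it. The scope names, the spool directory /var/spool/pkg, and the package name EXMPLpkg are hypothetical.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Print the pkgadd(1M) command line for a given scope:
#   global-only  run in the global zone, affects the global zone only
#   all-zones    run in the global zone, affects all zones
#   this-zone    run inside a non-global zone by the zone administrator
pkgadd_command() {
    scope=$1
    device=$2
    pkg=$3
    case $scope in
        global-only)         echo "pkgadd -G -d $device $pkg" ;;
        all-zones|this-zone) echo "pkgadd -d $device $pkg" ;;
        *) echo "unknown scope: $scope" >&2; return 1 ;;
    esac
}

pkgadd_command global-only /var/spool/pkg EXMPLpkg
pkgadd_command all-zones /var/spool/pkg EXMPLpkg
```

The all-zones and this-zone scopes produce the same command line; what differs is the zone in which the command is run.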
pkgrm(1M)
When the pkgrm(1M) utility is used in the global
zone, the following actions apply:
pkgrm(1M) can remove a package from the global
zone and from all non-global zones without regard to the area affected by
the package, from all non-global zones only, or from the global
zone only when the package is only installed in the global zone.
pkgrm(1M) cannot remove a package from the
global zone if the package is also installed in a non-global zone, or remove a
package from any subset of the non-global zones.
If the -G option is used, pkgrm(1M)
removes the specified package from the global zone only.
If the -Z option is used, pkgrm(1M)
removes the specified package from all non-global zones only. The package is
marked as installed in the global zone only. The package is not installed when any
non-global zone is installed. Note: This option is not yet implemented.
If neither the -G option nor the -Z
option is used, pkgrm(1M) removes the specified package from all
zones, including the global zone. This is the default action.
When the pkgrm(1M) utility is used in the
non-global zone, the following limitations apply:
pkgrm(1M) can only remove packages from the
non-global zone.
Neither the -G nor the -Z
options can be used in a non-global zone. If either of these options is used, pkgrm(1M) outputs an error message and the attempted operation fails.
The package cannot affect any area of the zone that is
shared from the global zone.
The package must be set SUNW_PKG_ALLZONES=false.
pkginfo(1)
When the pkginfo(1) utility is used in the
global zone, the following actions apply:
pkginfo(1) can query the software package
database in the global zone only, in a specified non-global zone only, or in all
non-global zones only.
pkginfo(1) is not able to query the software
package database in the global zone and in all non-global zones, or in the
global zone and in a subset of all non-global zones.
If the -z zonename option
is used (lowercase z), pkginfo(1) queries the software
package database in the specified non-global zone only. Note that the -z option is not yet implemented for pkginfo(1).
If the -Z option is used (uppercase Z), pkginfo(1) queries the software package database in all non-global zones. Note that the -Z option is not yet implemented for pkginfo(1).
If neither the -z option nor the -Z
option is used, pkginfo(1) queries the software package database
in the global zone only.
When the pkginfo(1) utility is used in the
non-global zone, the following actions apply:
pkginfo(1) can query only the software package
database in the non-global zone.
Neither the -z option nor the -Z
option can be used in a non-global zone. If either option is used, pkginfo(1) outputs an error message and the attempted operation fails.
patchadd(1M)
When patchadd(1M) is used in the global zone, the
following conditions apply:
The patchadd(1M) utility is able to add the
patch(es) to the global zone and to all non-global zones only. This is the
default action.
The patchadd(1M) utility cannot add the
patch(es) to the global zone only or to a subset of the non-global zones.
When used in a non-global zone by the zone administrator, patchadd(1M) can only be used to add patches to that zone. A patch can be added to a
non-global zone in the following cases:
The patch does not affect any area of the zone that is
shared from the global zone.
All packages in the patch are set SUNW_PKG_ALLZONES=false.
As the number of zones that a patch is applied to increases,
the amount of time required to patch the system increases as well.
patchrm(1M)
The global administrator can use the patchrm(1M)
utility in the global zone to remove patches. The patchrm(1M)
utility cannot remove patches from the global zone only or from a subset of the
non-global zones.
The zone administrator can use the patchrm(1M)
utility in a non-global zone to remove patches from that non-global zone only.
Patches cannot affect areas that are shared.
Note: -z and -Z options are scheduled to be implemented sometime after Solaris 10 Update 1.
The two ways to configure a non-global zone's root file system
are through the whole root model or the sparse root model.
The whole root model installs all of the required and any selected optional Solaris packages into the private file systems of the zone. The advantages of this model include the ability for zone administrators to customize their zone's
file system layout (for example, creating a /usr/local directory) and add
arbitrary unbundled or third-party packages. The disadvantages of this model include the loss of sharing of text segments from executables and shared libraries by the virtual memory system, and a much heavier disk footprint, approximately an additional 2 Gbyte, for each non-global zone that is configured this way.
The sparse root model optimizes the sharing of objects by only installing
a subset of the root packages (those with the pkginfo(4) parameter
SUNW_PKGTYPE set to root) and using read-only loopback
file systems to gain access to other files. In this model, only certain root packages are installed in the non-global zone. This will include a subset of the required root packages that are normally installed in the global zone as well as any additional root packages that the global administrator might have selected. Access to other
files will be via read-only loopback file systems. This is similar to the way a diskless client is configured, where /usr and other file systems are mounted over the network with NFS. By default with this
model, the directories /lib, /platform, /sbin, and /usr will be mounted in this manner. The advantages of this model are greater performance due to the efficient sharing of executables and shared
libraries, and a much smaller disk footprint for the zone itself. The sparse root model only requires approximately 100 Mbyte of file system space for the zone itself.
Configuring the non-global zone for either the whole root model
or the sparse root model can be accomplished as follows:
Upon creation of a zone, the default configuration uses
inheritance of the /usr, /lib, /platform, and /sbin directories through a loopback file system to create a sparse root zone. If other directories, like /opt, need to be inherited from the global zone, they can be added in zonecfg(1M) as an inherit-pkg-dir resource with dir set to /opt (add inherit-pkg-dir, then set dir=/opt).
To create a non-global zone with the whole root model, the
administrator must configure the zone so that it does not use the
default configuration, which is the sparse root model. To do
this, use the create sub-command of zonecfg(1M) with the -b option, which creates a blank configuration.
If you want to create a whole root zone, but shared file
systems have been added using inherit-pkg-dir, you must
remove these default resources using zonecfg(1M) before the zone is
installed. This can be done using remove inherit-pkg-dir
dir=<file system directory>.
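The two root models can be expressed as zonecfg(1M) command files. The sketch below prints the commands rather than running them. The zone paths are hypothetical, and the whole root variant assumes the default configuration inherits exactly the four directories named above.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Whole root: start from the default configuration and remove the four
# default inherit-pkg-dir resources before the zone is installed.
whole_root_commands() {
    echo "create"
    echo "set zonepath=$1"
    for dir in /lib /platform /sbin /usr; do
        echo "remove inherit-pkg-dir dir=$dir"
    done
    echo "verify"
    echo "commit"
}

# Sparse root: keep the defaults and inherit one more directory.
sparse_root_commands() {
    cat <<EOF
create
set zonepath=$1
add inherit-pkg-dir
set dir=$2
end
verify
commit
EOF
}

whole_root_commands /zones/wholezone
sparse_root_commands /zones/sparsezone /opt

# Hypothetical use in the global zone:
#   whole_root_commands /zones/wholezone > /tmp/wholezone.cfg
#   zonecfg -z wholezone -f /tmp/wholezone.cfg
```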
Each non-global zone has its own section of the file system
hierarchy, rooted at a directory known as the zone root. Processes in the zone can access only files in the part of the hierarchy that is located under the zone root.
To eliminate file system contention, it is possible to create
a separate file system using Solaris Volume Manager and soft
partitioning, which is provided as part of the Solaris 10 release. The file system
must be mounted at the root path of the zone, with directory permissions of 700.
To mount file systems in running non-global zones, do one of the following:
Import raw and block devices using zonecfg.
Mount the file system from the global zone into the non-global zone.
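Both approaches can be sketched briefly. The helper below prints zonecfg(1M) commands that import a block device and its raw counterpart into a zone. The device names, zone name, and directories are hypothetical, and the mount command shown in the comment assumes the zone root is at /zones/myzone/root.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Print zonecfg(1M) commands that make each listed device node
# available inside a zone.
import_device_commands() {
    for dev in "$@"; do
        printf 'add device\nset match=%s\nend\n' "$dev"
    done
}

# Hypothetical block and raw device for one disk slice.
import_device_commands /dev/dsk/c0t1d0s0 /dev/rdsk/c0t1d0s0

# Alternative: as the global administrator, loopback-mount a directory
# from the global zone under the zone root (hypothetical paths):
#   mount -F lofs /export/data /zones/myzone/root/data
```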
The following table highlights the various file system types that are available in a non-global zone as well as mounting options.
Table 1: File System Types in Non-Global Zones, With Mounting Options
File System   Mount via zonecfg   Mount From Global to Non-Global          Mount From Within the Zone
-----------   -----------------   --------------------------------------   --------------------------
AutoFS        No                  No                                       Yes
CacheFS       N/A                 N/A                                      N/A
FDFS          Yes                 Yes                                      Yes
HSFS          Yes                 Yes                                      Yes
LOFS          Yes                 Yes (safe but some performance issues)   Yes
MNTFS         No                  No                                       Yes
NFS           No                  No                                       Yes (V2, V3, and V4)
PCFS          Yes                 Yes                                      Yes
PROCFS        No                  No                                       Yes
TMPFS         Yes                 Yes                                      Yes
UDFS          Yes                 Yes                                      Yes
UFS           Yes                 Yes (safe, no performance overhead)      Yes
XMEMFS        Yes                 Yes                                      Yes
VERITAS Volume Manager (VxVM) and VERITAS File System (VxFS) in Zones
As of July 2005, VxVM 4.1 is only supported in a global zone.
VxFS 4.1 is supported in non-global zones, but there are known limitations, which are currently being researched.
Sections of a file system can be mounted into one or more zones
using the read-only option of the LOFS file system. This allows the
same file system data to be shared in multiple zones, while preserving
the security guarantees supplied by zones.
NFS and autofs mounts established within a zone are local to
that zone; they cannot be accessed from other zones, including the
global zone. The mounts are removed when the zone is halted or rebooted.
df(1M)
By default, df(1M) only displays mounts located within the current zone.
When run from the global zone, df(1M) with the -Z option displays mounts in all zones.
By default, commands such as dd(1M), fmthard(1M), format(1M), fdisk(1M) (x86/x64), mkfs(1M), and newfs(1M) are not enabled in a non-global zone.
In the default configuration, the following file systems are mounted in a non-global zone:
/, the zone root file system, is mounted on <zonepath>/root in the global zone.
/sbin, /usr, /lib, and /platform file systems are read-only loopback mounts from the global zone to enable text page sharing to reduce memory requirements. This also reduces the required disk footprint of the zone.
/dev for the zone is mounted on <zonepath>/dev in the global zone.
/proc, /dev/fd, /system/contract, /etc/svc/volatile, /etc/mnttab, /var/run and /tmp
Additional file systems can be mounted in a zone if required (refer to the table above).
The ability to unmount a file system depends on who does the initial mount rather than the type of file system.
When a file system is specified via zonecfg(1M), the global zone controls the mount, and the non-global zone root user cannot unmount the file system.
If the file system is mounted from within the zone via the zone's /etc/vfstab file, for example, then the non-global root user can unmount the file system.
/etc/system tunes system-wide parameters and can only be set
in the global zone. With the Solaris 10 OS, Sun is moving away from
using /etc/system tunables. Some of the parameters that had been set in /etc/system in previous Solaris
releases can be set using resource controls (rctls), which can be zone specific. A list of parameters that can be tuned using rctls can be found under Configuring Resource Controls and Attributes on docs.sun.com.
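As an illustration of a zone-specific resource control, the sketch below prints the zonecfg(1M) commands that add an rctl resource to a zone. The control name zone.cpu-shares and the limit of 20 are example values chosen here, not recommendations from this document.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Print zonecfg(1M) commands that add one resource control to a zone.
# Arguments: control name, limit value (both examples below are
# illustrative assumptions).
rctl_commands() {
    name=$1
    limit=$2
    cat <<EOF
add rctl
set name=$name
add value (priv=privileged,limit=$limit,action=none)
end
EOF
}

rctl_commands zone.cpu-shares 20
```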
ndd(1M)
ndd(1M) tunes system-wide parameters and can only be set in the global zone.
/dev/ip is not available in a non-global zone.
There is no sys_net_config privilege in a non-global zone, so ndd(1M) parameters cannot be set there.
Going forward, Sun is moving away from using ndd(1M) as a tool for configuring the system's networking settings.
Network Configuration and Management
Each non-global zone has its own logical network and loopback
interface. Bindings between upper layer streams and logical interfaces
are restricted such that a stream may only establish bindings to
logical interfaces in the same zone. Likewise, packets from a logical
interface can only be passed to upper layer streams in the same zone as
the logical interface. Bindings to the loopback address are kept within
a zone with one exception: when a stream in one zone attempts to access
the IP address of an interface in another zone.
Two zones can access each other only through IP connections (for example, with telnet(1) or rlogin(1)).
While applications within a zone can bind to privileged network
ports, they have no control over the network configuration, including
IP addresses and the routing table. The following should be
managed from the global zone:
IPQoS
IPsec
IPMP
IPFilter (also, no inter-zone filters)
Routing
NCA
The ifconfig utility has been modified in order to configure and view interfaces based on zone granularity as well, but interfaces can only be plumbed or unplumbed from the global zone.
In the event that multiple subnet routing is needed, the default route for each subnet must be defined within the /etc/defaultrouter file belonging to the global zone. Defining an /etc/defaultrouter file in non-global zones has no effect.
IKE is not yet supported in non-global zones.
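Because interfaces are plumbed only from the global zone, a zone's logical interface is normally declared in its configuration. The following is a minimal sketch that prints the zonecfg(1M) commands for a net resource. The address 192.168.1.10 and the physical interface bge0 are hypothetical.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Print zonecfg(1M) commands that give a zone one logical interface.
# Arguments: IP address, physical interface in the global zone.
net_commands() {
    address=$1
    physical=$2
    cat <<EOF
add net
set address=$address
set physical=$physical
end
EOF
}

net_commands 192.168.1.10 bge0
```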
Network Services
NFS
NFS servers work in global zones but they
do not yet work within non-global zones because NFS servers need direct
access to the kernel. This is being worked on, although there is no ETA at this
point.
NFS clients (V2, V3, and V4) can be used within non-global
zones.
A non-global zone cannot mount an NFS file system from its
own global zone.
DHCP is not yet available for non-global zones.
Snooping is not yet available for non-global zones.
System and State Management
System and state management of a global zone is similar to
what it was in previous Solaris releases.
System state management of a non-global zone is described
below:
A non-global zone can be in one of these states:
configured
incomplete
installed
ready
running
shutting_down
down
During the bring-up process for a normal non-global zone,
a zone will go through these states: configured -> installed -> ready -> running.
Booting a zone is done with the zoneadm(1M) boot
command. Booting a zone places the zone in the running state. A zone can be booted from
the ready state or from the installed state. A zone in the installed state that is
booted transparently transitions through the ready state to the running state.
Halting a zone is accomplished through the zoneadm(1M) halt command, which removes both the application
environment and the virtual platform for a zone. The zone is then brought back to the
installed state. All processes are killed, devices are unconfigured, network interfaces
are unplumbed, file systems are unmounted, and the kernel data structures are
destroyed. The halt command does not run any shutdown
scripts within the zone.
Rebooting a zone is done through the zoneadm(1M) reboot command. When this command is issued, the zone is halted and then booted again. The zone ID will change when the zone is rebooted.
If you set the autoboot resource property in a
zone's configuration to true, that zone is automatically booted when the global zone is booted. The default setting is false.
Uninstalling a zone is done with the zoneadm(1M) uninstall command. It uninstalls all of the files under the zone's root
file system. Before proceeding, the command prompts you to confirm the action, unless the -F (force) option is also used. Use the uninstall command with caution, because the action is irreversible.
System state management of a non-global zone can be done
without interfering with the state of the global zone or other
non-global zones.
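The state transitions described above can be summarized as a small lookup. This is a simplified model for illustration only: it covers the transitions named in this section, and the assumption that uninstall returns a zone to the configured state is not stated in the text.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Print the state a zone ends up in after a zoneadm(1M) sub-command.
# Arguments: current state, sub-command.
zone_next_state() {
    case "$1:$2" in
        configured:install)        echo installed ;;
        installed:ready)           echo ready ;;
        installed:boot|ready:boot) echo running ;;
        running:halt|ready:halt)   echo installed ;;
        running:reboot)            echo running ;;
        # Assumption: uninstall leaves only the configuration behind.
        installed:uninstall)       echo configured ;;
        *) echo "no transition for '$2' from state '$1'" >&2; return 1 ;;
    esac
}

zone_next_state installed boot
zone_next_state running halt

# Corresponding commands for a hypothetical zone named myzone:
#   zoneadm -z myzone boot
#   zoneadm -z myzone halt
#   zoneadm list -cv      (shows each zone and its current state)
```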
User Management
Each zone, whether global or non-global, has its own namespace
for users.
Hence, users in different zones with the same UID are in fact
distinct users, even though they share the same numerical ID. The virtualized user ID namespace also implies that passwords are unique to the zone.
Each zone has its own name service. The name services can be
completely different, such as 'files' for a non-global zone and NIS for
the global zone, or both zones can use NIS but with
different NIS domains.
Users in a non-global zone are unable to monitor other zones,
such as viewing network traffic or the activity of processes.
Memory Management and Configuration
Shared memory cannot be used between containers as that would
violate security restrictions.
The entire swap partition is treated as a single global
resource to processes running in both the non-global and global zones. With the Solaris 10
GA release, you cannot limit the amount of swap used by a non-global zone on a per-zone basis. You can, however, limit the size of the swap-based
file systems (for example, /tmp) by using the "size" mount option in the non-global zone's /etc/vfstab file, for example, size=200m. This allows you to decrease the effect of many files and/or large files created in /tmp.
Note: A future enhancement is being planned for resource pools to implement a resource control called a swap set. Swap sets would allow swap to be limited within a pool bound to a zone on a per-zone basis.
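The size option mentioned above goes in the mount options field of the /tmp entry. The helper below is a minimal sketch that prints such a vfstab(4) line. The field layout follows the usual /tmp entry, and 200m is the example value from the text.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Print a vfstab(4) entry that caps the size of /tmp.
# Fields: device to mount, device to fsck, mount point, FS type,
# fsck pass, mount at boot, mount options.
tmp_vfstab_line() {
    printf 'swap\t-\t/tmp\ttmpfs\t-\tyes\tsize=%s\n' "$1"
}

tmp_vfstab_line 200m
```

The resulting line would replace the existing /tmp entry in the non-global zone's /etc/vfstab file.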
Process Management
Processes
The global zone behaves like a "non-zoned" system.
Root-owned processes in the global zone have the same powers
across all zones.
Non-root-owned processes in the global zone can view information about processes in non-global
zones, but cannot signal them.
Processes in one non-global zone are not visible to any other
non-global zone.
The root user in a non-global zone is only omnipotent and
omniscient within its own non-global zone. It has no power or visibility into other
zones.
Process Tree
A global zone sees all processes in all zones in one process
tree.
A non-global zone sees only its own process sub-tree.
Process Isolation
Processes in one non-global zone cannot detect or interact with
processes in other non-global zones except through the network or a shared
file system.
Only processes in the same zone will be visible through
system call interfaces that take process IDs, such as kill(2) and priocntl(2). Attempts to access processes that exist in other zones (including the global zone) fail with the same error code that would be
issued if the specified process did not exist.
All processes live in one resource pool.
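From the global zone, the single process tree can be narrowed to one zone. The following sketch filters output in the layout printed by ps -eZ, keeping the header line and the rows for one zone. The sample rows are made up for illustration, and the zone name myzone is hypothetical.

```shell
# Assumes a POSIX shell and a POSIX awk (on Solaris 10, nawk or
# /usr/xpg4/bin/awk rather than /usr/bin/awk).
# Keep the header line plus the rows whose first column is the zone.
processes_in_zone() {
    awk -v zone="$1" 'NR == 1 || $1 == zone'
}

# Illustrative sample in the layout of "ps -eZ" (values are made up).
sample='    ZONE   PID TTY         TIME CMD
  global     1 ?           0:00 init
  myzone  4321 ?           0:00 init
  myzone  4400 ?           0:01 sshd'

printf '%s\n' "$sample" | processes_in_zone myzone

# From the global zone on a live system:
#   ps -eZ | processes_in_zone myzone
```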
Device Management
In a non-global zone, the set of devices is restricted to
prevent a process in one zone from interfering with processes running in other zones. By default, only certain pseudo-devices that are considered safe for
use in a zone are available. Additional devices can be made available
within specific zones by using the zonecfg(1M) utility. The pseudo-devices available to a non-global zone include:
/dev/null, /dev/zero, /dev/poll, /dev/random, /dev/tcp, and so on
Physical devices are only available if configured by a system
administrator. The administrator must ensure that the security of
the system is not compromised.
Placing a physical device into more than one zone can create
a covert channel between zones.
Global zone applications that use such a device risk the
possibility of compromised data or data corruption by a non-global zone.
If possible, mount the device within the non-global zone's
root hierarchy so it cannot be compromised by unprivileged users within
the global zone.
Most operations concerning kernel, device, and platform
management will not work inside a non-global zone because modifying platform hardware
configurations violates the zone security model. For example, the following operations
will not work:
Using facilities that affect the state of the physical
platform or expose system data (for example, eeprom(1M), prtconf(1M), prtdiag(1M), dtrace(7D), kmem(7D),
ksyms(7D), kmdb(7D), trapstat(1M),
lockstat(7D), and so on). These are only available in the
global zone or do not work as expected in a non-global zone due to restrictions on devices.
Creating new device nodes (mknod(2))
Accessing NIC device nodes that support the DLPI programming
interface
Further information on device use in non-global zones can be
found under Device Use in Non-Global Zones on docs.sun.com.
Privileges Available
Processes in a non-global zone are running with a reduced set
of privileges in their limit set. The privileges that were taken
away from zones were deemed unsafe when it came to providing a secure
and isolated application environment for a process in a zone.
Unprivileged processes, whether in a global zone or non-global
zone, share the same basic privilege set. The privileges file_link_any,
proc_info, proc_session, proc_fork,
and proc_exec make up the "basic" privilege set.
Privileged processes in a global zone have all privileges
available to them.
Privileged processes in a non-global zone have a subset of
privileges available to privileged processes in a global zone.
The functionality that these missing privileges provide
(with the exception of the DTrace privileges, which are new to the
Solaris 10 OS) was available only to the superuser in prior releases of
Solaris. The following privileges are available only in the global
zone, and are not available in a non-global zone:
dtrace_*
net_rawaccess
proc_clock_highres
proc_lock_memory
proc_priocntl
proc_zone
sys_config
sys_devices
sys_ipc_config
sys_linkdir
sys_net_config
sys_res_config
sys_suser_compat
sys_time
To display the list of privileges available within a zone, use the ppriv(1) utility.
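When porting an application, it can help to check the privileges it needs against the list above. The helper below is a minimal sketch that encodes that list as given in this document. The three privilege names tested in the loop are examples only.

```shell
# Assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris 10).
# Succeed if the named privilege is in this document's list of
# privileges that are available only in the global zone.
is_global_only_priv() {
    case $1 in
        dtrace_*|net_rawaccess|proc_clock_highres|proc_lock_memory) return 0 ;;
        proc_priocntl|proc_zone|sys_config|sys_devices) return 0 ;;
        sys_ipc_config|sys_linkdir|sys_net_config|sys_res_config) return 0 ;;
        sys_suser_compat|sys_time) return 0 ;;
        *) return 1 ;;
    esac
}

for p in sys_time proc_fork dtrace_kernel; do
    if is_global_only_priv "$p"; then
        echo "$p: global zone only"
    else
        echo "$p: not in the global-only list"
    fi
done

# On a live system, ppriv(1) reports the actual sets:
#   ppriv -l     (list all privilege names)
#   ppriv $$     (show the privilege sets of the current shell)
```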
Backup and Recovery
Basic backup and recovery utilities, like tar(1),
ufsdump(1M), and fssnap(1M), allow for backup and recovery of
user information within both global and non-global zones as long as these utilities are
executed from the global zone.
For the most part, tar(1) functions properly in a non-global zone.
There is one edge case in which it does not
behave as it does in the global zone. When running in a non-global zone, tar(1)
can create archives that preserve the sticky bit on individual
files, but it cannot write files with the sticky bit set back to
the file system. The tar(1) command fails silently in
this case, because the chmod(2) system call does not
report a failure when this occurs.
Also, tar(1) cannot recreate the /dev entries
in a non-global zone.
Miscellaneous Information
Device access limitations and changes to library and system
call usage due to privilege restrictions in non-global zones affect
Solaris commands within non-global zones. The following commands,
some of which are described above,
behave differently in
non-global zones:
add_drv(1M) / rem_drv(1M)
arp(1M)
autopush(1M)
cfgadm(1M)
cpustat(1M)
devfsadm(1M)
devlinks(1M)
dispadmin(1M)
disks(1M)
drvconfig(1M)
dtrace(7D)
intrstat(1M)
ipf(1M)
modload(1M) / modunload(1M)
plockstat(1M)
pooladm(1M)
poolcfg(1M)
poolbind(1M)
ports(1M)
prtconf(1M)
prtdiag(1M)
psrset(1M)
route(1M)
share(1M)
snoop(1M)
tapes(1M)
trapstat(1M)
date(1)
nca(1)
How to export a CD-ROM into a non-global zone:
Check if Volume Manager is running:
ps -ef | grep vold
If it is not running, start it with:
/etc/init.d/volmgt start
Insert CD.
Force Volume Manager to check for media:
volcheck
Test if CD is automounted:
ls /cdrom
cdrom cdrom1 software_cd
Make /cdrom accessible in non-global zone:
zonecfg -z myzone
zonecfg:myzone> add fs
zonecfg:myzone:fs> set dir=/mnt
zonecfg:myzone:fs> set special=/cdrom
zonecfg:myzone:fs> set type=lofs
zonecfg:myzone:fs> add options [ro,nodevices]
zonecfg:myzone:fs> end
zonecfg:myzone> commit
zonecfg:myzone> exit
Restart non-global zone from the global zone:
Check if zone is running:
zoneadm list -cv
ID NAME STATUS PATH
0 global running /
- myzone running /export/home/myzone
If yes, stop zone:
zlogin myzone init 0
Check that zone is stopped:
zoneadm list -cv
ID NAME STATUS PATH
0 global running /
- myzone installed /export/home/myzone
Start zone:
zoneadm -z myzone boot
Check if zone is booted.
Log in to non-global zone:
zlogin myzone
Check if the mounted CD can be seen from the non-global zone:
ls /mnt
cdrom cdrom1 software_cd
Start installation from the mounted CD (that is, /mnt/cdrom/...).
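The interactive zonecfg session in the steps above can also be captured in a command file and applied with zonecfg -f, which is convenient when the same mount has to be added to several zones. The sketch below only writes the file; applying it is left as a commented command. The zone name myzone and the mount paths are those used in the steps above, while the location of the command file is an assumption chosen for illustration.

```shell
# Hypothetical location for the command file; any writable path will do.
cfg="${TMPDIR:-/tmp}/myzone-cdrom.cfg"

# Same subcommands as the interactive session, one per line.
cat > "$cfg" <<'EOF'
add fs
set dir=/mnt
set special=/cdrom
set type=lofs
add options [ro,nodevices]
end
commit
EOF

echo "Wrote $cfg"
# To apply it, run the following as root in the global zone:
#   zonecfg -z myzone -f "$cfg"
```

As with the interactive procedure, the zone has to be rebooted before the new file system appears under /mnt.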
Developing, Provisioning, and Managing Third-Party Applications for Solaris Zones in the Solaris 10 OS
This section highlights the most common activities that occur while
developing, provisioning, and managing third-party applications for the
Solaris OS. It then details how these activities can be affected by the introduction of
Zones in the Solaris 10 OS.
This section covers tasks and activities for both global and non-global zones.
Task List for Both Global and Non-Global Zones
The most generic actions that occur while developing,
provisioning, and managing third-party applications within a Solaris
environment are as follows:
In a global zone in the Solaris 10 OS, third-party software installation
will behave as expected.
In a non-global zone in the Solaris 10 OS, third-party software
installation may fail due to read-only file systems or CD-ROM access.
Any software installation that places components in /usr
(or any of the other read-only loopback file systems) will fail in a
zone following the sparse root model.
Any software installation that requires CD-ROM access in a
non-global zone will fail unless the CD-ROM is available in the
non-global zone (not the default configuration).
Solaris Resource Management provides for fine-grained measurement and
dynamic control of a workload on a system. A workload is an
aggregation of all processes of an application or group of
applications. The number of workloads that can be combined onto a
Solaris system is determined by the resource requirements of the
workloads as well as the available system resources. If resource management features are not used, the Solaris OS gives all activity on the system equal access to
resources. Solaris resource management features enable you to treat and
control workloads individually. This facilitates understanding
and controlling the resource requirements of workloads so that the
consolidation of multiple workloads onto a single
system can occur.
To explain Resource Management, this section focuses on the most common
cases that system administrators will run into when using the
technology. Resource Management is used in the following three
areas:
Classifying workloads
Monitoring and measuring workloads
Controlling workloads
Each of these topics is explained and examined individually to
illustrate how Solaris Resource Management behaves in the Solaris 10 OS. Details about how resource management functions in zones are also given.
Classifying Workloads
A method to classify workloads is needed so the system knows
which processes are associated with which workload. New objects,
called Project and Task, were created to
facilitate the labeling and classification of workloads. These objects are explained below.
Project
A project provides a way to identify related work.
Information about a project is stored in a project database,
which can be stored in a local file (like the /etc/project file), in a NIS project map, or in an LDAP directory.
A project entry consists of the name of the project, the project
ID, a description of the project, a list of users who are allowed in
the project, a list of groups who are allowed in the project, and any
attributes that belong to the project, which can include resource
controls.
Updates to project entries do not affect active projects; only
newly added tasks are affected.
A user or group can belong to one or more projects, although each
user is assigned to a default project.
The project identifier can be shared across multiple machines to
better assess resource consumption.
Projects behave the same in all types of zones (both global and
non-global). Each zone maintains an independent copy of the local
project file as well as users, groups, and projects.
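To make the entry layout concrete, the sketch below splits one colon-separated entry into the fields described above. The sample entry (project web, ID 100, user webadm, group staff) is invented for illustration, and the field order name:ID:comment:users:groups:attributes is assumed from the project(4) file format. The function show_project is a hypothetical helper.

```shell
# Hypothetical entry; not taken from any real system.
entry='web:100:Web tier:webadm:staff:project.cpu-shares=(privileged,10,none)'

# show_project ENTRY: print each field of a project entry on its own line.
show_project() {
    echo "$1" | awk -F: '{
        printf "name=%s\n", $1
        printf "id=%s\n", $2
        printf "comment=%s\n", $3
        printf "users=%s\n", $4
        printf "groups=%s\n", $5
        printf "attributes=%s\n", $6
    }'
}

show_project "$entry"
```

The attributes field here carries a single resource control, project.cpu-shares, which is discussed later in this section.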
Task
A task is a group of processes that represent a workload
component.
Each successful login into a project creates a new task that
contains the login process.
Each task is automatically assigned a task ID.
Each process is a member of one task, and each task is associated
with one project.
Some of the common commands that are used to classify workloads are
detailed below:
projadd(1M)
This command adds a new project entry to the local project database only
(for example, the /etc/project file). It cannot add projects to a network naming service.
projmod(1M)
This command modifies information for a project on the local
system. It cannot modify projects supplied by a network naming service.
It can be used to edit fields in a project entry including
project attributes, which can contain resource controls.
projdel(1M)
This command deletes a project from the local system only. It
cannot delete projects supplied by a network naming service.
newtask(1)
This command creates a new task in a specified project.
Running processes can be associated with a new task as well.
These commands have been modified to show project and/or task information:
ps(1) (-o option displays project and task information)
id(1M) (-p option adds the current project ID to user ID and group ID listing)
pgrep(1) / pkill(1) (-J option allows these commands to be executed on a list of project IDs)
prstat(1M) (-J option displays information on processes and projects, -T option displays information on tasks)
It is important to note that projects and tasks are not affected by the
introduction of zones in the Solaris 10 OS. Each zone, whether global or
non-global, maintains its own projects and tasks to classify workloads
running in the isolated environment.
Monitoring and Measuring Workloads
Once workloads can be identified and separated using projects
and tasks, it is possible to monitor and measure resource consumption
of a given workload using the extended accounting system. This
facility records system and network usage on a task or process basis so
that a workload's resource consumption can be better understood. When the accounting system is enabled, specified statistics are gathered for tasks and processes and placed in files. These files can then be viewed and analyzed using the libexacct API, the Perl interface to this library, or third-party tools that support this
API.
The extended accounting facility works in a zones
environment as follows:
When it is enabled from the global zone,
statistics are gathered on a system-wide level, which includes
all of the non-global zones on the system. The global
administrator can then analyze resource consumption for the entire
system or on a per-zone basis.
Accounting records are written to the global zone's accounting
files as well as the non-global zone's accounting files.
Different account settings and files exist on a per-zone basis for process-based and task-based accounting.
Some of the common commands that are used with the extended accounting
system include:
acctadm(1M)
This is the command that is used to start and stop accounting, to
display the status of accounting, to view available accounting
resources, to modify attributes of the accounting facility, and to
select attributes to track for tasks and processes.
wracct(1M)
This command writes extended accounting records for active
processes and tasks.
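A typical way to switch on extended accounting with acctadm(1M) is sketched below. The run helper and the APPLY variable are hypothetical conveniences that make the script a dry run by default: each command is printed, and it is executed only when APPLY=yes is set (as root). The accounting file paths are example locations, not requirements.

```shell
# run CMD...: print the command; execute it only when APPLY=yes is set.
run() {
    echo "+ $*"
    if [ "${APPLY:-no}" = "yes" ]; then
        "$@"
    fi
}

enable_accounting() {
    # Track the extended set of task attributes in a task accounting file.
    run acctadm -e extended -f /var/adm/exacct/task task
    # Track extended plus microstate data for processes.
    run acctadm -e extended,mstate -f /var/adm/exacct/proc process
    # With no arguments, acctadm reports the current accounting status.
    run acctadm
}

enable_accounting
```

Because accounting settings exist per zone, the same commands would be run separately inside each non-global zone that needs its own accounting files.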
Controlling Workloads
There are three mechanisms to control workloads within a Solaris
environment:
Constraints
This technique sets bounds on the consumption of specific
resources, like CPU or memory, for a given workload using the resource
control facility. Setting bounds prevents workloads from
consuming too many resources. This mechanism does present risks,
however, because if the bounds are set incorrectly, the application may
not be able to function.
Scheduling
This procedure makes resource allocation decisions at set
intervals. If a workload is not fully utilizing its resources,
those resources are made available to other workloads. In a
situation where a workload is over-committed, this mechanism provides
controlled allocations.
Partitioning
This mechanism binds a workload to a defined subset of the
system's resources. It guarantees that a known amount of resources will
always be available to the workload, but it can hamper system-wide
utilization if these resources are under-utilized.
The most common examples of these mechanisms include resource controls,
resource pools, memory capping, and IPQoS. Each of these is
detailed below with special emphasis on how it functions within a
Solaris Zones environment.
Resource Controls
Resource controls can be applied to a project, a task, or a
process. In a zone environment, they can also be set at a
zone-wide level.
They are configured in the attribute field of
project entries in the project database, unless specified at the
zone-wide level. When specified at the zone level, resource
controls are set during zone configuration using zonecfg(1M).
Standard resource controls are available for CPUs, memory,
ports, message queues, LWPs, tasks, CPU time, semaphores, and so on. A
full list of supported resource controls can be found under Configuring Resource Controls and Attributes on docs.sun.com. The controls available at a zone-wide level, as opposed to those set on projects running within a zone, currently include only
zone.cpu-shares and zone.max-lwps.
The zone-wide zone.cpu-shares control works with the fair share scheduler (FSS), which controls the allocation of available CPU resources among workloads based on their importance. The importance is
expressed by the number of shares of CPU resources that are
assigned to each workload. Shares are defined in terms of ratios, with
the global zone getting 1 share by default. FSS CPU shares for a zone
are hierarchical. The shares for a given non-global
zone are set by the global administrator through the zone-wide resource
control zone.cpu-shares. The project.cpu-shares resource control can then be defined for each project within that zone to further subdivide the shares set through the zone-wide control.
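The hierarchical ratio can be worked through with a little arithmetic. The sketch below assumes that every zone and project is busy and competing for the same CPUs, since shares only matter under contention, and it uses invented share values: the global zone keeps its default of 1 share, zone A has 2, and zone B has 1. The helper share_pct is hypothetical.

```shell
# share_pct MINE TOTAL [PARENT_PCT]: percentage of CPU that MINE shares
# represent out of TOTAL, scaled by the parent's percentage (default 100).
share_pct() {
    echo "$1 $2 ${3:-100}" | awk '{ printf "%.1f\n", $3 * $1 / $2 }'
}

# Zone level: 1 (global) + 2 (zone A) + 1 (zone B) = 4 shares in total.
zone_a=$(share_pct 2 4)
echo "zone A entitlement: ${zone_a}%"

# Project level inside zone A: two projects with project.cpu-shares of
# 3 and 1 (4 in total) subdivide zone A's entitlement.
echo "project with 3 shares: $(share_pct 3 4 "$zone_a")%"
echo "project with 1 share: $(share_pct 1 4 "$zone_a")%"
```

Under these assumptions zone A is entitled to half of the CPU resources, and its two projects split that half three to one. If a zone or project is idle, its entitlement is available to the others.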
Either global or local actions can be
taken when a resource cap is reached. Global actions include
logging at a variety of levels, while local actions can be one of the
following:
none, which takes no action but does
indicate that the bound was exceeded
deny, which does not fulfill requests that go above the threshold
signal, which sends a specified signal to the process when the threshold is exceeded
Not all local actions can be applied to every resource control, however.
Resource controls interact with zones in the Solaris 10 OS in the following
manner:
Any of the resource controls listed under Configuring Resource Controls and Attributes can be set as attributes of projects running in the non-global zone. These projects are then controlled by these resource
limitations.
Setting the controls for a non-global zone affects only
that zone. Projects that span multiple zones can have different controls in each zone.
Controls are subject to the additional requirements regarding
pools and the zone-wide resource controls.
Here are some of the common commands that are used to control workloads
via resource controls:
rctladm(1M)
This command allows for runtime interrogations and
modifications of the
resource control facility with global scope.
In a non-global zone, this command cannot be used to modify
settings.
prctl(1)
This command allows for runtime interrogations and
modifications of the resource control facility with local scope.
Changes made with this command last only until the next system
reboot. The projmod(1M) command should
be used to make changes that persist across reboots.
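The difference between a temporary and a persistent change can be sketched as follows, using a hypothetical project named web and an invented share value of 20. As a safety measure, the hypothetical run helper only prints each command unless APPLY=yes is set, so the script is a dry run by default.

```shell
# run CMD...: print the command; execute it only when APPLY=yes is set.
run() {
    echo "+ $*"
    if [ "${APPLY:-no}" = "yes" ]; then
        "$@"
    fi
}

adjust_shares() {
    # Display the current value for the project.
    run prctl -n project.cpu-shares -i project web
    # Replace the value on the running project; this is lost at reboot.
    run prctl -n project.cpu-shares -r -v 20 -i project web
    # Record the same value in the project database so that it persists.
    run projmod -s -K "project.cpu-shares=(privileged,20,none)" web
}

adjust_shares
```

Making both changes together keeps the running project and the project database in agreement.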
Dynamic Resource Pools
Resource pools provide a way to separate workloads into pools
of CPUs so that workload requirements for CPU resources do not compete.
When a pool's resources are not fully utilized, they are
temporarily allocated to other pools as needed so resources are not
wasted.
Here are the rules for resource pools within zones:
When a non-global zone is allocated resources from a CPU
pool, the resources can be subdivided further and given to specific workloads
within the non-global zone by the zone administrator. Resource
assignment occurs through the implementation of projects.
A zone may be assigned to only one resource pool although
several zones may be assigned to the same resource pool.
Processes in the global zone, however, can be bound by a sufficiently privileged process to any pool.
These options are available within zonecfg to configure the pools:
zone.cpu-shares, which specifies the number of FSS CPU shares available to the entire zone from the resource pool
pool, which indicates the name of the resource pool the zone is bound to when booted
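Both options can be set from a zonecfg command file, as in the sketch below. The zone name myzone, the pool name web-pool, and the share value of 20 are assumptions for illustration; the pool must already exist in the pools configuration for the binding to take effect at boot. The script only writes the file.

```shell
# Hypothetical location for the command file.
cfg="${TMPDIR:-/tmp}/myzone-resources.cfg"

cat > "$cfg" <<'EOF'
set pool=web-pool
add rctl
set name=zone.cpu-shares
add value (priv=privileged,limit=20,action=none)
end
commit
EOF

echo "Wrote $cfg"
# To apply it, run the following as root in the global zone:
#   zonecfg -z myzone -f "$cfg"
```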
Some of the commands and daemons to manage pools are given below.
poold(1M)
This is the daemon that partitions resource pools.
This runs only in the global zone, where there can be more than
one pool for it to operate on.
poolstat(1M)
This command reports active pool statistics.
When run in a non-global zone, this command displays statistics about
the pool associated with that zone only.
pooladm(1M)
This is the command used to administer resource pools.
When run without arguments in a non-global zone, this command
displays only information about the pool associated with the zone.
Memory Capping
Constraints can be set within projects to limit the amount of
memory consumed by processes belonging to the project.
The memory caps are defined as attributes of a project.
rcapd(1M) is the daemon that controls memory capping, and rcapadm(1M) is the command
used to enable or disable memory capping.
When processes in a project reach the specified memory cap, pages
are paged out until usage falls back to the memory threshold. Care
should be taken when setting the memory cap because:
If it is set too high, the system's resources can be consumed
before the cap is reached.
If it is set too low, the system may experience excessive
paging.
Memory capping within a zones environment works as follows:
Both global and non-global zones have their own rcapd(1M) daemons.
Each zone must be configured separately.
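A minimal sketch of setting a cap follows, assuming a hypothetical project named web and a 512 MB cap expressed in bytes through the rcap.max-rss project attribute. The helpers mb_to_bytes and run are hypothetical; run prints each command and executes it only when APPLY=yes is set, so nothing on the system changes by default.

```shell
# mb_to_bytes N: convert N megabytes (2^20 bytes each) to bytes.
mb_to_bytes() {
    echo "$1" | awk '{ printf "%d\n", $1 * 1048576 }'
}

# run CMD...: print the command; execute it only when APPLY=yes is set.
run() {
    echo "+ $*"
    if [ "${APPLY:-no}" = "yes" ]; then
        "$@"
    fi
}

cap=$(mb_to_bytes 512)

set_memory_cap() {
    # Add the cap as an attribute of the project entry.
    run projmod -a -K "rcap.max-rss=$cap" web
    # Enable the resource capping daemon.
    run rcapadm -E
}

set_memory_cap
```

Because each zone runs its own rcapd(1M), these steps would be repeated inside every zone whose projects need a cap.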
IPQoS
IP Quality of Service (IPQoS) helps provide consistent levels of service to
network users. It can be used to control, prioritize,
and monitor network traffic.
At the zone level, it manages network traffic in and out of the
zone. An upper limit can be set, and if it is exceeded, packets
are dropped.
This feature should be implemented with care as it does have CPU
overhead. Additionally, it should be determined that the feature works
with other network components, like firewalls and routers, before implementation.
Summary
This paper has examined both Solaris Zones and Resource Management
to build a better understanding of Solaris Containers technology and how
it should be used. Solaris Zones partitioning technology provides
virtualized and secure operating environments for running applications, and Resource
Management features provide the Solaris OS with functionality that helps
control and manage resources. Through these two feature sets,
Solaris Containers help users achieve higher levels of consolidation and better
system utilization.
Jennifer Rodoni Glore, who has been with Sun for five years, is an engineer in the Market Development Engineering organization focused on system integrator adoption of Sun products and solutions. Over the past year, Jennifer has been focused on the Solaris OS and the x86 platform.