BigAdmin System Administration Portal
Feature Article

Understanding the Basics About Solaris Containers in the Solaris 10 OS

Jennifer Rodoni Glore, August 2005


Overview

This document describes the basics of Solaris Containers in the Solaris 10 Operating System so that Sun's developers, system administrators, ISVs, partners, and customers know how to address common tasks related to this technology. It also explains how Solaris Containers should be used and what the limitations of the technology are. The goal is to give end users the information they need to make informed decisions about Solaris Containers when developing or deploying applications or managing a Solaris environment.

Solaris Containers

The Solaris 10 OS offers Solaris Containers, a set of technologies that allows system administrators to create separate operating environments on the same system to isolate and protect applications from each other while allowing system utilization to increase. Solaris 10 Containers comprise two technologies: Solaris Zones partitioning technology and Resource Management. Solaris Zones provide virtualized operating environments that have their own hostname, IP address(es), users, file systems, and so on, while giving applications isolation and protection from each other. Resource Management controls how system resources are spread across specified workloads. The combination of both Solaris Zones and Resource Management in Solaris 10 Containers can provide the following benefits:

  • Reduced management costs through server consolidation and a reduced number of operating system instances
  • Increased resource utilization with dynamic resource reallocation between containers
  • Increased service availability by minimizing fault propagation and security violations between applications
  • Increased flexibility because software-based containers can be dynamically reconfigured
  • Increased accuracy and flexibility of accounting, which is based on workloads rather than systems or processes

The following sections of this document describe the Solaris Zones feature and Resource Management in detail. Each section focuses on how these technologies are used by both system administrators and application developers, so that people who want to use Containers have a better understanding of when and how they should be used.

Solaris Zones

Solaris Zones partitioning technology provides virtualized and secure operating environments for running applications. Every Solaris system contains a global zone, which is comparable to a normal Solaris OS instance. Non-global zones can be created on the system from the global zone by the global zone administrator to create virtualized operating environments on the system that are isolated and secure from each other.

The next section focuses on the common tasks that Sun's developers, system administrators, ISVs, partners, and customers will run into when using Solaris Zones, so that a good understanding of Zones partitioning technology can be achieved.

This section covers tasks and activities for both global and non-global zones.

The tasks are broken down into the following two categories:

Each of the areas is examined individually to highlight functional and behavioral differences that occur in both global and non-global zones, so that the end user comes away with a good understanding of how Solaris Zones technology affects the use of the Solaris OS.


Provisioning and Managing the Solaris OS Within Solaris Zones

This section highlights the most common activities that occur while provisioning and managing the Solaris OS. It then details how these activities can be affected by the introduction of Solaris Zones in the Solaris 10 OS.

Task List for Both Global and Non-Global Zones

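The most common activities that occur while provisioning and managing a Solaris environment from a system administration and application development perspective are:

  • Solaris installation, updates, and upgrades
  • Patch and package management
  • File system configuration and management
  • Parameter configuration and management
  • Network configuration and management
  • System and state management
  • User management
  • Memory management and configuration
  • Process management
  • Device management
  • Privileges available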

Task Analysis for Both Global and Non-Global Zones

Solaris Installation, Updates, and Upgrades

  • Installation of Solaris global and non-global zones
    • A global zone contains a fully functional installation of the Solaris OS that is bootable by the system hardware. An installation of the Solaris OS becomes the global zone when it is booted by the system hardware. There is only one global zone running on a system. The global zone is installed by default when the Solaris 10 OS is installed on a system.
    • Non-global zones are installed and configured by the global system administrator using the zonecfg(1M) and zoneadm(1M) commands. Over 8000 non-global zones can be created on a system, although not all systems have the necessary resources to support that many non-global zones. See the Solaris 10 Zones installation documentation for further information and recommendations.
      • Two types of zones can be installed -- either whole root or sparse root. See the file system section of this document for more information.
      • During non-global zone installation, full replication of the current package and patch database occurs. See the patch and package management section of this document for more information.
  • Solaris upgrades to new Solaris 10 updates occur via standard upgrade or Live Upgrade.
    • On a system with only a global zone, upgrades occur as expected.
    • Upgrading the installed software in a non-global zone from one Solaris release to a later release is not supported in the Solaris 10 GA (3/05) release, because standard upgrade and Live Upgrade are not yet aware of non-global zones. Upgrading a Solaris instance that has non-global zones installed would leave the system partially upgraded: the global zone would be properly upgraded, but in the non-global zones only the inherited file systems would contain the upgraded files, and the non-inherited file systems would remain in their original state. To prevent this damage, both standard upgrade and Live Upgrade in Solaris 10 GA (3/05) detect the presence of non-global zones and refuse to upgrade if any are installed. As a result, once a system administrator configures and installs the first non-global zone, the system cannot be upgraded to a later release of Solaris until all non-global zones are removed. Support for upgrading systems with non-global zones is expected in Solaris 10 Update 1.

Patch and Package Management

  • General Information
    • The global system administrator can administer the software on every zone on the system.
    • The root file system for a non-global zone can be administered from the global zone by using the Solaris packaging and patch tools. Additionally, the Solaris packaging and patch tools are supported within the non-global zone for administering co-packaged (bundled), standalone (unbundled), or third-party products.
    • The packaging information visible from within a non-global zone is consistent with the files that have been installed in that zone using the Solaris packaging and patch tools. The visibility also includes packages that have been imported from the global zone using read-only loopback mounts (inherit-pkg-dir statements during non-global zone creation).
      • With a whole root non-global zone, all of the packages and patches referenced in the global zone registry are replicated to the non-global zone. See the file system section of this document for more information on the whole root model.
    • The package commands can add, remove, and interrogate packages in both global and non-global zones. The patch commands can add and remove patches in both global and non-global zones.
    • The behavior of packaging in a zone environment varies according to the factors listed below. (For more information, see the article Bringing Your Application Into the Zone).
      • Use of the -G option in pkgadd(1M), which adds a package to the global zone only
      • Setting the package parameters SUNW_PKG_ALLZONES, SUNW_PKG_HOLLOW, or SUNW_PKG_THISZONE in the package's pkginfo file. (See pkginfo(4) for details.)
        • SUNW_PKG_ALLZONES
          • This parameter defines the zone scope of the package on the system.
          • When it is set to true, the package will be installed into the global zone and then distributed to each of the non-global zones on the system.
          • When it is set to false, the package will only be installed into the specific zone in which the pkgadd command is executed unless the command was executed with the -Z option.
        • SUNW_PKG_HOLLOW
          • This parameter defines whether a package should be visible in a non-global zone when that package is required to be installed on all zones and be identical in all zones.
          • When this parameter is set to true, the global zone needs to replicate the package database into both current and future local zones.
          • When this parameter is set to false, it will force the package to be replicated into the local zone regardless of its type.
        • SUNW_PKG_THISZONE
          • This parameter defines whether a package must be installed in the current zone only.
          • When this parameter is true and executed in a non-global zone, the package will only install in that non-global zone.
        • The behavior of packaging in a zone environment based on these parameters is described in detail in the article Bringing Your Application Into the Zone.
      • Type of zone, global or non-global, in which pkgadd(1M) is invoked.
    • Package and patch commands have been modified to accommodate zones and each zone maintains its own patch and package database.
    • A package that needs to be interactive is added to the current zone only. If the current zone is the global zone, the package is treated as though it is being added by using the pkgadd command with the -G option. For the rules of how packages are added in zones, see the pkgadd(1M) section below.
  • pkgadd(1M)
    • pkgadd(1M) in the global zone:
      • Packages can be added as follows:
        • To the global zone only, unless the package is SUNW_PKG_ALLZONES=true or the package contents affect any area of the global zone that is shared with any non-global zone.
        • To the global zone and to all non-global zones without regard to the area affected by the package.
        • To all non-global zones only, if the package is already installed in the global zone.
        • To the current zone only, if SUNW_PKG_THISZONE=true.
      • Packages cannot be added:
        • If the destination is any subset of the non-global zones.
        • To all non-global zones, unless the package is already installed in the global zone.
      • To add a package to the global zone only, execute the pkgadd(1M) utility with the -G option only as the global administrator in the global zone.
      • To add a package to the global zone and to all non-global zones, execute the pkgadd(1M) utility in the global zone as the global administrator. Run it without the -G or -Z options.
      • At this time, the option to add a package that is already installed on the global zone to all of the non-global zones (-Z) is not yet implemented. To accomplish this, use pkgrm(1M) to remove the package from the global zone, and then add the package without using the -G option.
    • pkgadd(1M) in a non-global zone:
      • To add a package in a specified non-global zone, execute the pkgadd(1M) utility, without options, as the zone administrator. The following conditions apply:
        • The pkgadd(1M) utility can only add packages in the non-global zone in which the utility is used.
        • The package cannot affect any area of the zone that is shared from the global zone.
        • The package must be set SUNW_PKG_ALLZONES=false.
        • Neither the -G option nor the -Z option can be used. If either of these options is used, pkgadd(1M) outputs an error message and the attempted operation fails.
  • pkgrm(1M)
    • When the pkgrm(1M) utility is used in the global zone, the following actions apply:
      • pkgrm(1M) can remove a package from the global zone and from all non-global zones without regard to the area affected by the package, from all non-global zones only, or from the global zone only when the package is only installed in the global zone.
      • pkgrm(1M) cannot remove a package from the global zone if the package is also installed in a non-global zone, or remove a package from any subset of the non-global zones.
      • If the -G option is used, pkgrm(1M) removes the specified package from the global zone only.
      • If the -Z option is used, pkgrm(1M) removes the specified package from all non-global zones only. The package is marked as installed in the global zone only. The package is not installed when any non-global zone is installed. Note: This option is not yet implemented.
      • If neither the -G option nor the -Z option is used, pkgrm(1M) removes the specified package from all zones, including the global zone. This is the default action.
    • When the pkgrm(1M) utility is used in the non-global zone, the following limitations apply:
      • pkgrm(1M) can only remove packages from the non-global zone.
      • Neither the -G nor the -Z options can be used in a non-global zone. If either of these options is used, pkgrm(1M) outputs an error message and the attempted operation fails.
      • The package cannot affect any area of the zone that is shared from the global zone.
      • The package must be set SUNW_PKG_ALLZONES=false.
  • pkginfo(1)
    • When the pkginfo(1) utility is used in the global zone, the following actions apply:
      • pkginfo(1) can query the software package database in the global zone only, in a specified non-global zone only, or in all non-global zones only.
      • pkginfo(1) is not able to query the software package database in the global zone and in all non-global zones, or in the global zone and in a subset of all non-global zones.
      • If the -z zonename option is used (lowercase z), pkginfo(1) queries the software package database in the specified non-global zone only. Note that the -z option is not yet implemented for pkginfo(1).
      • If the -Z option is used (uppercase Z), pkginfo(1) queries the software package database in all non-global zones. Note that the -Z option is not yet implemented for pkginfo(1).
      • If neither the -z option nor the -Z option is used, pkginfo(1) queries the software package database in the global zone only.
    • When the pkginfo(1) utility is used in the non-global zone, the following actions apply:
      • pkginfo(1) can query only the software package database in the non-global zone.
      • Neither the -z option nor the -Z option can be used in a non-global zone. If either option is used, pkginfo(1) outputs an error message and the attempted operation fails.
  • patchadd(1M)
    • When patchadd(1M) is used in the global zone, the following conditions apply:
      • The patchadd(1M) utility is able to add the patch(es) to the global zone and to all non-global zones only. This is the default action.
      • The patchadd(1M) utility cannot add the patch(es) to the global zone only or to a subset of the non-global zones.
    • When used in a non-global zone by the zone administrator, patchadd(1M) can only be used to add patches to that zone. A patch can be added to a non-global zone in the following cases:
      • The patch does not affect any area of the zone that is shared from the global zone.
      • All packages in the patch are set SUNW_PKG_ALLZONES=false.
    • As the number of zones that a patch is applied to increases, the amount of time required to patch the system increases as well.
  • patchrm(1M)
    • The global administrator can use the patchrm(1M) utility in the global zone to remove patches. The patchrm(1M) utility cannot remove patches from the global zone only or from a subset of the non-global zones.
    • The zone administrator can use the patchrm(1M) utility in a non-global zone to remove patches from that non-global zone only. Patches cannot affect areas that are shared.
  • Note: -z and -Z options are scheduled to be implemented sometime after Solaris 10 Update 1.
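
The pkgadd(1M) rules above can be summarized as a small decision function. The following Python sketch is only a model of the rules as this article states them, not of the real tool: the function name, its parameters, and the returned labels are hypothetical, and it ignores the interactive-package rule and the check for areas shared with the global zone.

```python
def pkgadd_targets(invoked_in, allzones=False, thiszone=False,
                   opt_G=False, opt_Z=False):
    """Return the set of zones a package would be added to.

    invoked_in -- "global" or "non-global": the zone where pkgadd runs.
    allzones   -- value of SUNW_PKG_ALLZONES in the package.
    thiszone   -- value of SUNW_PKG_THISZONE in the package.
    Raises ValueError for combinations the article describes as rejected.
    """
    if invoked_in == "non-global":
        if opt_G or opt_Z:
            raise ValueError("-G and -Z cannot be used in a non-global zone")
        if allzones:
            raise ValueError("package must be set SUNW_PKG_ALLZONES=false")
        return {"current non-global zone"}

    if invoked_in != "global":
        raise ValueError("invoked_in must be 'global' or 'non-global'")
    if opt_Z:
        raise ValueError("-Z is not yet implemented")
    if thiszone:
        # SUNW_PKG_THISZONE=true restricts the package to the current zone.
        return {"global zone"}
    if opt_G:
        if allzones:
            # Assumption of this sketch: treat the conflict as an error.
            raise ValueError("-G conflicts with SUNW_PKG_ALLZONES=true")
        return {"global zone"}
    # Default in the global zone: the global zone and every non-global zone.
    return {"global zone", "all non-global zones"}
```

For example, pkgadd_targets("global", opt_G=True) yields only the global zone, while the same call without opt_G yields the global zone and all non-global zones.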

File System Configuration and Management

  • The two ways to configure a non-global zone's root file system are through the whole root model or the sparse root model.
    • The whole root model installs all of the required and any selected optional Solaris packages into the private file systems of the zone. The advantages of this model include the ability for zone administrators to customize their zone's file system layout (for example, creating a /usr/local directory) and add arbitrary unbundled or third-party packages. The disadvantages of this model include the loss of sharing of text segments from executables and shared libraries by the virtual memory system, and a much heavier disk footprint, approximately 2 Gbyte of additional disk space for each non-global zone that is configured this way.
    • The sparse root model optimizes the sharing of objects by only installing a subset of the root packages (those with the pkginfo(4) parameter SUNW_PKGTYPE set to root) and using read-only loopback file systems to gain access to other files. In this model, only certain root packages are installed in the non-global zone. This will include a subset of the required root packages that are normally installed in the global zone as well as any additional root packages that the global administrator might have selected. Access to other files will be via read-only loopback file systems. This is similar to the way a diskless client is configured, where /usr and other file systems are mounted over the network with NFS. By default with this model, the directories /lib, /platform, /sbin, and /usr will be mounted in this manner. The advantages of this model are greater performance due to the efficient sharing of executables and shared libraries, and a much smaller disk footprint for the zone itself. The sparse root model only requires approximately 100 Mbyte of file system space for the zone itself.
  • Configuring the non-global zone for either the whole root model or the sparse root model can be accomplished as follows:
    • Upon creation of a zone, the default configuration uses inheritance of the /usr, /lib, /platform, and /sbin directories through a loopback file system to create a sparse root zone. If other directories, like /opt, need to be inherited from the global zone, they can be added using add inherit-pkg-dir dir=/opt.
    • To create a non-global zone with the whole root model, the administrator must configure the zone so that it does not use the default configuration, which is the sparse root model. To do this, use the -b option of the create subcommand in zonecfg(1M), which creates a blank configuration.
    • If you want to create a whole root zone, but shared file systems have been added using inherit-pkg-dir, you must remove these default resources using zonecfg(1M) before the zone is installed. This can be done using remove inherit-pkg-dir dir=<file system directory>.
  • Each non-global zone has its own section of the file system hierarchy, rooted at a directory known as the zone root. Processes in the zone can access only files in the part of the hierarchy that is located under the zone root.
    • To eliminate file system contention, it is possible to create a separate file system using Solaris Volume Manager and soft partitioning, which is provided as part of the Solaris 10 release. It will need to be mounted to the root path of the zone with permissions of 700.
  • To mount file systems in running non-global zones, do one of the following:
    • Import raw and block devices using zonecfg.
    • Mount the file system from the global zone into the local zone.
  • The following table highlights the various file system types that are available in a non-global zone as well as mounting options.
Table 1: File System Types in Non-Global Zones, With Mounting Options
File System   Mount via zonecfg   Mount From Global to Non-Global          Mount From Within the Zone
-----------   -----------------   --------------------------------------   --------------------------
AutoFS        No                  No                                       Yes
CacheFS       N/A                 N/A                                      N/A
FDFS          Yes                 Yes                                      Yes
HSFS          Yes                 Yes                                      Yes
LOFS          Yes                 Yes (safe but some performance issues)   Yes
MNTFS         No                  No                                       Yes
NFS           No                  No                                       Yes (V2, V3, and V4)
PCFS          Yes                 Yes                                      Yes
PROCFS        No                  No                                       Yes
TMPFS         Yes                 Yes                                      Yes
UDFS          Yes                 Yes                                      Yes
UFS           Yes                 Yes (safe, no performance overhead)      Yes
XMEMFS        Yes                 Yes                                      Yes

  • VERITAS Volume Manager (VxVM) and VERITAS File System (VxFS) in Zones
    • As of July 2005, VxVM 4.1 is only supported in a global zone.
    • VxFS 4.1 is supported in non-global zones, but there are known limitations, which are currently being researched.
  • Sections of a file system can be mounted into one or more zones using the read-only option of the LOFS file system. This allows the same file system data to be shared in multiple zones, while preserving the security guarantees supplied by zones.
  • NFS and autofs mounts established within a zone are local to that zone; they cannot be accessed from other zones, including the global zone. The mounts are removed when the zone is halted or rebooted.
  • df(1M)
    • By default, df(1M) only displays mounts located within the current zone.
    • When run from the global zone, df(1M) with the -Z option displays mounts in all zones.
  • By default, commands such as dd(1M), fmthard(1M), format(1M), fdisk(1M) (x86/x64), mkfs(1M), and newfs(1M) are not enabled in a non-global zone.
  • In the default configuration, the following file systems are mounted in a non-global zone:
    • /, the zone root file system, is mounted on <zonepath>/root in the global zone.
    • /sbin, /usr, /lib, and /platform file systems are read-only loopback mounts from the global zone to enable text page sharing to reduce memory requirements. This also reduces the required disk footprint of the zone.
    • /dev for the zone is mounted on <zonepath>/dev in the global zone.
    • /proc, /dev/fd, /system/contract, /etc/svc/volatile, /etc/mnttab, /var/run and /tmp
    • Additional file systems can be mounted in a zone if required (refer to the table above).
  • The ability to unmount a file system depends on who does the initial mount rather than the type of file system.
    • When a file system is specified via zonecfg(1M), the global zone controls the mount, and the non-global zone root user cannot unmount the file system.
    • If the file system is mounted from within the zone via the zone's /etc/vfstab file, for example, then the non-global root user can unmount the file system.
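
The choice between the sparse root and whole root models comes down to which zonecfg subcommands are issued. The Python sketch below only generates the text of a zonecfg command file and does not run zonecfg itself. The function name and the zone path /zones/web1 are hypothetical. The subcommands are written in zonecfg's block form (add inherit-pkg-dir, set dir=..., end). Rejecting extra inherited directories for a whole root zone is a choice made by this sketch.

```python
def zonecfg_commands(zonepath, whole_root=False, extra_inherit=()):
    """Build the text of a zonecfg command file for a new zone.

    whole_root=False -> plain "create": the default configuration,
                        which inherits /lib, /platform, /sbin, and /usr.
    whole_root=True  -> "create -b": a blank configuration with no
                        inherited directories.
    extra_inherit    -> additional directories to inherit, such as /opt.
    """
    if whole_root and extra_inherit:
        raise ValueError("a whole root zone inherits no directories")
    lines = ["create -b" if whole_root else "create",
             f"set zonepath={zonepath}"]
    for directory in extra_inherit:
        lines += ["add inherit-pkg-dir", f"set dir={directory}", "end"]
    lines += ["verify", "commit"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sparse root zone that also inherits /opt from the global zone.
    print(zonecfg_commands("/zones/web1", extra_inherit=["/opt"]))
```

The generated text could be saved to a file and passed to zonecfg with its -f option from the global zone.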

Parameter Configuration and Management

  • /etc/system
    • /etc/system tunes system-wide parameters and can only be set in the global zone. With the Solaris 10 OS, Sun is moving away from using /etc/system tunables. Some of the parameters that had been set in /etc/system in previous Solaris releases can be set using rctls, and this would be zone specific. A list of parameters that can be tuned using rctls can be found under Configuring Resource Controls and Attributes on docs.sun.com.
  • ndd(1M)
    • ndd(1M) tunes system-wide parameters and can only be set in the global zone.
    • /dev/ip is not available in a non-global zone.
    • There is no sys_net_config privilege in a non-global zone, so ndd(1M) parameters cannot be set there.
    • Going forward, Sun is moving away from using ndd(1M) as a tool for configuring the system's networking settings.

Network Configuration and Management

  • Each non-global zone has its own logical network and loopback interface. Bindings between upper layer streams and logical interfaces are restricted such that a stream may only establish bindings to logical interfaces in the same zone. Likewise, packets from a logical interface can only be passed to upper layer streams in the same zone as the logical interface. Bindings to the loopback address are kept within a zone with one exception: when a stream in one zone attempts to access the IP address of an interface in another zone.
    • Two zones can access each other only through IP connections (for example, telnet(1) and rlogin(1)).
  • While applications within a zone can bind to privileged network ports, they have no control over the network configuration, including IP addresses and the routing table. The following should be managed from the global zone:
    • IPQoS
    • IPsec
    • IPMP
    • IPFilter (also, no inter-zone filters)
    • Routing
    • NCA
  • The ifconfig utility has been modified in order to configure and view interfaces based on zone granularity as well, but interfaces can only be plumbed or unplumbed from the global zone.
  • In the event that multiple subnet routing is needed, the default route for each subnet must be defined within the /etc/defaultrouter file belonging to the global zone. Defining an /etc/defaultrouter file in non-global zones has no effect.
  • IKE is not yet supported in non-global zones.
  • Network Services
    • NFS
      • NFS servers work in global zones but they do not yet work within non-global zones because NFS servers need direct access to the kernel. This is being worked on, although no delivery date is available at this point.
      • NFS clients (V2, V3, and V4) can be used within non-global zones.
      • A non-global zone cannot mount an NFS file system from its own global zone.
    • DHCP is not yet available for non-global zones.
    • Snooping is not yet available for non-global zones.

System and State Management

  • System and state management of a global zone is similar to what it was in previous Solaris releases.
  • System state management of a non-global zone is described below:
    • A non-global zone can be in one of these states:
      • configured
      • incomplete
      • installed
      • ready
      • running
      • shutting_down
      • down
    • During the bring-up process for a normal non-global zone, a zone will go through these states: configured -> installed -> ready -> running.
    • Booting a zone is done with the zoneadm(1M) boot command. Booting a zone places the zone in the running state. A zone can be booted from the ready state or from the installed state. A zone in the installed state that is booted transparently transitions through the ready state to the running state.
    • Halting a zone is accomplished through the zoneadm(1M) halt command, which removes both the application environment and the virtual platform for a zone. The zone is then brought back to the installed state. All processes are killed, devices are unconfigured, network interfaces are unplumbed, file systems are unmounted, and the kernel data structures are destroyed. The halt command does not run any shutdown scripts within the zone.
    • Rebooting a zone is done through the zoneadm(1M) reboot command. When this command is issued, the zone is halted and then booted again. The zone ID will change when the zone is rebooted.
    • If you set the autoboot resource property in a zone's configuration to true, that zone is automatically booted when the global zone is booted. The default setting is false.
    • Uninstalling a zone is done with the zoneadm(1M) uninstall command. It uninstalls all of the files under the zone's root file system. Before proceeding, the command prompts you to confirm the action, unless the -F (force) option is also used. Use the uninstall command with caution, because the action is irreversible.
    • System state management of a non-global zone can be done without interfering with the state of the global zone or other non-global zones.
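
The zone states and zoneadm(1M) commands described above form a small state machine. The Python sketch below models only the transitions named in this section; it does not call zoneadm. The class, the transition table, and the ID counter are hypothetical. The sketch also assumes that uninstalling returns a zone to the configured state and that a zone has an ID only while it is ready or running.

```python
import itertools

# (current state, zoneadm subcommand) -> resulting state
TRANSITIONS = {
    ("configured", "install"): "installed",
    ("installed", "ready"): "ready",
    ("installed", "boot"): "running",   # passes through ready transparently
    ("ready", "boot"): "running",
    ("ready", "halt"): "installed",
    ("running", "halt"): "installed",
    ("running", "reboot"): "running",   # halt followed by boot
    ("installed", "uninstall"): "configured",  # assumption of this sketch
}


class Zone:
    _ids = itertools.count(1)  # 0 is reserved for the global zone

    def __init__(self, name):
        self.name = name
        self.state = "configured"
        self.zone_id = None

    def apply(self, command):
        key = (self.state, command)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {command} a zone in state {self.state}")
        previous = self.state
        self.state = TRANSITIONS[key]
        if command == "reboot" or previous == "installed":
            if self.state in ("ready", "running"):
                # A fresh ID each time the zone is brought up or rebooted.
                self.zone_id = next(Zone._ids)
        if self.state in ("installed", "configured"):
            self.zone_id = None
        return self.state
```

Applying install and then boot takes a zone from configured to running. A subsequent reboot leaves it running but with a different zone ID, and halt returns it to the installed state.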

User Management

  • Each zone, whether global or non-global, has its own namespace for users.
    • Hence, users in different zones with the same UID are in fact distinct users, even though they share the same numerical ID. The virtualized user ID namespace also implies that passwords are unique to the zone.
  • Each zone has its own name service. The name services can be completely different (for example, 'files' for the local zone and NIS for the global zone), or both zones can use NIS but with different NIS domains.
  • Users in a non-global zone are unable to monitor other zones, such as viewing network traffic or the activity of processes.

Memory Management and Configuration

  • Shared memory cannot be used between containers as that would violate security restrictions.
  • The entire swap partition is treated as a single global resource shared by processes running in both the local and global zones. With the Solaris 10 GA release, you cannot limit the amount of swap used by a local zone on a per-zone basis. You can, however, limit the size of swap-based file systems (for example, /tmp) by using the "size" mount option in the local zone's /etc/vfstab file, for example, size=200m. This reduces the effect of many files or large files created in /tmp.
  • Note: A future enhancement is being planned for resource pools to implement a resource control called a swap set. Swap sets would allow swap to be limited within a pool bound to a zone on a per-zone basis.
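
As an illustration of the size mount option, the Python sketch below formats an /etc/vfstab entry for a size-limited tmpfs file system. The helper name is hypothetical, and the sketch only builds the line of text. The seven fields follow the usual vfstab order (device to mount, device to fsck, mount point, file system type, fsck pass, mount at boot, mount options).

```python
def tmpfs_vfstab_entry(mount_point, size):
    """Return a tab-separated vfstab line for a size-limited tmpfs mount.

    size is passed through unchanged, for example "200m".
    """
    fields = ["swap", "-", mount_point, "tmpfs", "-", "yes", f"size={size}"]
    return "\t".join(fields)


if __name__ == "__main__":
    # Entry for the non-global zone's own /etc/vfstab file.
    print(tmpfs_vfstab_entry("/tmp", "200m"))
```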

Process Management

  • Processes
    • Global zones behave like a "non-zoned" system.
      • Root-owned global zone processes have the same powers across all zones.
      • Non-root-owned processes can view information about non-global processes, but cannot signal them.
    • Processes in one non-global zone are not visible to any other non-global zone.
      • The root user in a non-global zone is only omnipotent and omniscient within its own non-global zone. It has no power or visibility into other zones.
  • Process Tree
    • A global zone sees all processes in all zones in one process tree.
    • A non-global zone sees only its own process sub-tree.
  • Process Isolation
    • Processes in one local zone cannot detect or interact with processes in other local zones except through the network or a shared file system.
    • Only processes in the same zone will be visible through system call interfaces that take process IDs, such as the kill(1) and priocntl(1) commands. Attempts to access processes that exist in other zones (including the global zone) fail with the same error code that would be issued if the specified process did not exist.
  • All processes live in one resource pool.
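
The visibility and signaling rules above can be illustrated with a toy model. The Python sketch below is not how the kernel implements zones; the function names, zone names, and process table are hypothetical, and the sender is assumed to be the root user of its zone. It uses ESRCH ("no such process") as the error a caller sees for a process in another zone, matching the statement that such attempts fail as if the process did not exist.

```python
import errno

GLOBAL = "global"


def visible_pids(processes, viewer_zone):
    """Return the sorted PIDs a zone can see; processes maps pid -> zone."""
    if viewer_zone == GLOBAL:
        return sorted(processes)  # the global zone sees every process
    return sorted(pid for pid, zone in processes.items()
                  if zone == viewer_zone)


def signal_result(processes, sender_zone, pid):
    """Return 0 if root in sender_zone can signal pid, else errno.ESRCH."""
    target_zone = processes.get(pid)
    if target_zone is None:
        return errno.ESRCH  # the process really does not exist
    if sender_zone != GLOBAL and target_zone != sender_zone:
        return errno.ESRCH  # indistinguishable from a missing process
    return 0
```

With a table such as {100: "global", 200: "web1", 300: "db1"}, the zone web1 sees only PID 200, and an attempt from web1 to signal PID 300 returns the same error as signaling a PID that does not exist.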

Device Management

  • In a non-global zone, the set of devices is restricted to prevent a process in one zone from interfering with processes running in other zones. By default, only certain pseudo-devices that are considered safe for use in a zone are available. Additional devices can be made available within specific zones by using the zonecfg(1M) utility. The pseudo-devices available to a non-global zone include the following:
    • /dev/null, /dev/zero, /dev/poll, /dev/random, /dev/tcp, and so on
  • Physical devices are only available if configured by a system administrator. The administrator must ensure that the security of the system is not compromised.
    • Placing a physical device into more than one zone can create a covert channel between zones.
    • Global zone applications that use such a device risk the possibility of compromised data or data corruption by a non-global zone.
    • If possible, mount the device within the non-global zone's root hierarchy so it cannot be compromised by unprivileged users within the global zone.
  • Most operations concerning kernel, device, and platform management will not work inside a non-global zone because modifying platform hardware configurations violates the zone security model. For example, the following operations will not work:
    • Adding and removing drivers
    • Explicitly loading and unloading kernel modules
    • Initiating dynamic reconfiguration (DR) operations
    • Using facilities that affect the state of the physical platform or expose system data (for example, eeprom(1M), prtconf(1M), prtdiag(1M), dtrace(7D), kmem(7D), ksyms(7D), kmdb(7D), trapstat(1M), lockstat(7D), and so on). These are either available only in the global zone or do not work as expected in a non-global zone because of restrictions on devices.
    • Creating new device nodes (mknod(2))
    • Accessing NIC device nodes that support the DLPI programming interface
  • Further information on device use in non-global zones can be found under Device Use in Non-Global Zones on docs.sun.com.
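
As a rough illustration of the zonecfg(1M) step mentioned above, the following sketch adds one device to a zone's configuration. The zone name myzone and the device path /dev/dsk/c0t0d0s6 are placeholders rather than values from this article, and the run helper is a hypothetical convenience that only prints the command unless DRY_RUN=0 is set. It assumes a POSIX-style shell such as ksh or bash.

```shell
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

ZONE=myzone                   # placeholder zone name
DEVICE=/dev/dsk/c0t0d0s6      # placeholder device path

# zonecfg accepts subcommands on the command line, separated by semicolons.
# A commit is attempted automatically when the zonecfg session ends.
run zonecfg -z "$ZONE" "add device; set match=$DEVICE; end"
```

As with the CD-ROM procedure later in this section, the zone will most likely need to be rebooted from the global zone before the new device is visible inside it.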

Privileges Available

  • Processes in a non-global zone run with a reduced set of privileges in their limit set. The privileges that were removed from zones were deemed unsafe for providing a secure and isolated application environment for processes in a zone.
  • Unprivileged processes, whether in a global zone or non-global zone, share the same basic privilege set. The privileges file_link_any, proc_info, proc_session, proc_fork, and proc_exec make up the "basic" privilege set.
  • Privileged processes in a global zone have all privileges available to them.
  • Privileged processes in a non-global zone have a subset of the privileges available to privileged processes in a global zone. The functionality that these missing privileges provide (with the exception of the DTrace privileges, which are new to the Solaris 10 OS) was available only to the superuser in prior releases of Solaris. The following privileges are available only in a global zone, and not in a non-global zone:
    • dtrace_*
    • net_rawaccess
    • proc_clock_highres
    • proc_lock_memory
    • proc_priocntl
    • proc_zone
    • sys_config
    • sys_devices
    • sys_ipc_config
    • sys_linkdir
    • sys_net_config
    • sys_res_config
    • sys_suser_compat
    • sys_time
  • To display the list of privileges available within a zone, use the ppriv(1) utility.
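
One possible use of the list above is a pre-deployment check: given the privileges an application is believed to need, flag the ones that this article lists as unavailable in a non-global zone. The sketch below is only that, a sketch. The function name withheld_in_zone and the NEEDED list are hypothetical, and the privilege names are copied from the list above rather than queried from a live system.

```shell
# Returns 0 if the named privilege is on this article's list of privileges
# that are not available in a non-global zone, 1 otherwise.
withheld_in_zone() {
    case "$1" in
        dtrace_*|net_rawaccess|proc_clock_highres|proc_lock_memory) return 0 ;;
        proc_priocntl|proc_zone|sys_config|sys_devices|sys_ipc_config) return 0 ;;
        sys_linkdir|sys_net_config|sys_res_config|sys_suser_compat|sys_time) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical list of privileges that an application requires.
NEEDED="proc_fork net_rawaccess file_link_any sys_time"

for p in $NEEDED; do
    if withheld_in_zone "$p"; then
        echo "$p: not available in a non-global zone"
    else
        echo "$p: not on the withheld list"
    fi
done
```

On a running system, the ppriv(1) utility remains the authoritative source; for example, running ppriv $$ inside the zone shows the privilege sets of the current shell.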

Backup and Recovery

  • Basic backup and recovery utilities, like tar(1), ufsdump(1M), and fssnap(1M), allow for backup and recovery of user information within both global and non-global zones as long as these utilities are executed from the global zone.
  • For the most part, tar(1) functions properly.
    • There is an edge case in which tar(1) does not behave in a non-global zone exactly as it does in the global zone. When running in a non-global zone, tar(1) can create archives that preserve the sticky bit on individual files, but it cannot write files back to the file system with the sticky bit set. The tar(1) command fails silently in this case, because the chmod(2) system call does not report a failure when this occurs.
    • Also, tar(1) cannot recreate the /dev entries in a non-global zone.

Miscellaneous Information

  • Device access limitations and changes to library and system call usage due to privilege restrictions in non-global zones affect Solaris commands within non-global zones. The following commands, some of which are described above, behave differently in non-global zones:
    • add_drv(1M) / rem_drv(1M)
    • arp(1M)
    • autopush(1M)
    • cfgadm(1M)
    • cpustat(1M)
    • devfsadm(1M)
    • devlinks(1M)
    • dispadmin(1M)
    • disks(1M)
    • drvconfig(1M)
    • dtrace(7D)
    • intrstat(1M)
    • ipf
    • modload(1M) / modunload(1M)
    • plockstat(1M)
    • pooladm(1M)
    • poolcfg(1M)
    • poolbind(1M)
    • ports(1M)
    • prtconf(1M)
    • prtdiag(1M)
    • psrset(1M)
    • route(1M)
    • share(1M)
    • snoop(1M)
    • tapes(1M)
    • trapstat(1M)
    • date(1)
    • nca(1)
  • How to export a CD-ROM into a non-global zone:
    1. Check if Volume Manager is running:
      ps -ef | grep vold
      
      If it is not running, start it with:
      /etc/init.d/volmgt start
      
    2. Insert the CD.
    3. Force Volume Manager to check for media:
      volcheck
      
    4. Test if the CD is automounted:
      ls /cdrom
      cdrom   cdrom1   software_cd
      
    5. Make /cdrom accessible in the non-global zone:
      zonecfg -z myzone
      zonecfg:myzone> add fs
      zonecfg:myzone:fs> set dir=/mnt
      zonecfg:myzone:fs> set special=/cdrom
      zonecfg:myzone:fs> set type=lofs
      zonecfg:myzone:fs> add options [ro,nodevices]
      zonecfg:myzone:fs> end
      zonecfg:myzone> commit
      zonecfg:myzone> exit
      
    6. Restart the non-global zone from the global zone:
      1. Check if the zone is running:
        zoneadm list -cv
        
        ID NAME           STATUS        PATH
         0 global         running       /
         - myzone         running       /export/home/myzone
        
      2. If yes, stop the zone:
        zlogin myzone init 0
        
      3. Check that the zone is stopped:
        zoneadm list -cv
        
        ID NAME           STATUS        PATH
         0 global         running       /
         - myzone         installed     /export/home/myzone
        
      4. Start the zone:
        zoneadm -z myzone boot
        
      5. Check if the zone is booted.
    7. Log in to the non-global zone:
      zlogin myzone
      
    8. Check if the mounted CD can be seen from the non-global zone:
      ls /mnt
      cdrom cdrom1 software_cd
      
    9. Start installation from the mounted CD (that is, /mnt/cdrom/...).

Developing, Provisioning, and Managing Third-Party Applications for Solaris Zones in the Solaris 10 OS

This section highlights the most common activities that occur while developing, provisioning, and managing third-party applications for the Solaris OS. It then details how these activities can be affected by the introduction of Zones in the Solaris 10 OS.

This section covers tasks and activities for both global and non-global zones.

Task List for Both Global and Non-Global Zones

The most generic actions that occur while developing, provisioning, and managing third-party applications within a Solaris environment are analyzed in the use cases below.

Use Case Analysis for Both Global and Non-Global Zones

Using System Calls

  • The following system calls behave differently in a non-global zone due to privilege restrictions:
    • adjtime
    • chmod [S_ISVTX]                  
    • ioctl [I_POP, STREAMS]
    • link / unlink [directory]     
    • memcntl [MC_LOCK, et al.]
    • mknod        
    • msgctl [IPC_SET && msg_qbytes]
    • ntp_adjtime
    • p_online
    • priocntl                       
    • priocntlset
    • pset_*                          
    • shmctl [SHM_{UN}LOCK]
    • socket [SOCK_RAW]           
    • stime
    • swapctl [SC_ADD, SC_REMOVE]
    • uadmin [HALT, REBOOT, ...]
  • The article Bringing Your Application into the Zone describes each of these in detail.

Using Libraries

  • The following is a list of libraries that behave differently in a non-global zone due to privilege restrictions:
    • libdevinfo(3LIB)
    • libcfgadm(3LIB)
    • libpool(3LIB)
    • libkvm(3LIB)
    • libtnfctl(3LIB)
    • libsysevent(3LIB)
  • The following are library calls that function differently in a non-global zone due to privilege restrictions:
    • clock_settime
    • cpc_bind_cpu
    • mlock / munlock / mlockall / munlockall / plock
    • timer_create
    • t_open [/dev/rawip]
    • settimeofday
  • The article Bringing Your Application into the Zone provides further information on these libraries and calls.

Installing Third-Party Software

  • In a global zone in the Solaris 10 OS, third-party software installation will behave as expected.
  • In a non-global zone in the Solaris 10 OS, third-party software installation may fail due to read-only file systems or CD-ROM access.
    • Any software installation that places components in /usr (or any of the other read-only loopback file systems) will fail in a zone following the sparse root model.
    • Any software installation that requires CD-ROM access in a non-global zone will fail unless the CD-ROM is available in the non-global zone (not the default configuration).
  • The article Bringing Your Application into the Zone provides further information on software installation within Solaris Zones.
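
Because an installation into a read-only loopback file system fails, an installer can check its target directory before copying anything. The sketch below is one way to do that, assuming a POSIX-style shell such as ksh or bash; the function name can_install_into and the TARGET variable are hypothetical. It probes by actually creating and removing a small file, since that is the operation that fails on a read-only mount.

```shell
# Succeeds only if a file can actually be created in the given directory.
can_install_into() {
    probe="$1/.install_probe.$$"
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

TARGET=${TARGET:-/usr}    # /usr is read-only in a sparse root zone

# zonename(1) prints the name of the current zone on the Solaris 10 OS.
if command -v zonename >/dev/null 2>&1; then
    ZONE=$(zonename)
else
    ZONE=unknown
fi

if can_install_into "$TARGET"; then
    echo "zone=$ZONE: $TARGET is writable"
else
    echo "zone=$ZONE: $TARGET is not writable; choose a different install location"
fi
```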

Executing Third-Party Software


Resource Management

Solaris Resource Management provides for fine-grained measurement and dynamic control of a workload on a system. A workload is an aggregation of all processes of an application or group of applications. The number of workloads that can be combined onto a Solaris system is determined by the resource requirements of the workloads as well as the available system resources. If resource management features are not used, the Solaris OS gives all activity on the system equal access to resources. Solaris resource management features enable you to treat and control workloads individually. This facilitates understanding and controlling the resource requirements of workloads so that the consolidation of multiple workloads onto a single system can occur.

To explain Resource Management, this section focuses on the most common cases that system administrators will run into when using the technology. Resource Management is used in the following three areas:

  • Classifying workloads
  • Monitoring and measuring workloads
  • Controlling workloads

Each of these topics is explained and examined individually to illustrate how Solaris Resource Management behaves in the Solaris 10 OS. Details about how resource management functions in zones are also given.

Classifying Workloads

A method to classify workloads is needed so the system knows which processes are associated with which workload. New objects, called Project and Task, were created to facilitate the labeling and classification of workloads. These objects are explained below.

Project

  • A project provides a way to identify related work.
  • Information about a project is stored in a project database, which can be stored in a local file (like the /etc/project file), in a NIS project map, or in an LDAP directory.
  • A project entry consists of the name of the project, the project ID, a description of the project, a list of users who are allowed in the project, a list of groups who are allowed in the project, and any attributes that belong to the project, which can include resource controls.
  • Updates to project entries do not affect active projects; only newly added tasks are affected.
  • A user or group can belong to one or more projects, although each user is assigned to a default project.
  • The project identifier can be shared across multiple machines to better assess resource consumption.
  • Projects behave the same in all types of zones (both global and non-global). Each zone maintains an independent copy of the local project file as well as users, groups, and projects.
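
To make the fields of a project entry concrete, the sketch below splits one entry into the six colon-separated fields described above (name, project ID, description, users, groups, attributes), in the order used by the project(4) file format. The entry itself is invented for illustration, and the helper name project_field is hypothetical.

```shell
# A made-up /etc/project entry; none of these names or values come from a real system.
ENTRY='webproj:100:Web tier:webadm:staff:project.cpu-shares=(privileged,20,none)'

# Prints field number $2 (1 to 6) of the project entry given in $1.
project_field() {
    echo "$1" | cut -d: -f"$2"
}

echo "name:        $(project_field "$ENTRY" 1)"
echo "project ID:  $(project_field "$ENTRY" 2)"
echo "description: $(project_field "$ENTRY" 3)"
echo "users:       $(project_field "$ENTRY" 4)"
echo "groups:      $(project_field "$ENTRY" 5)"
echo "attributes:  $(project_field "$ENTRY" 6)"
```

The attributes field is where resource controls such as project.cpu-shares are stored, which is why changes to it take effect only for tasks that are created afterward.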

Task

  • A task is a group of processes that represent a workload component.
  • Each successful login into a project creates a new task that contains the login process.
  • Each task is automatically assigned a task ID.
  • Each process is a member of one task, and each task is associated with one project.

Some of the common commands that are used to classify workloads are detailed below:

  • projadd(1M)
    • This command adds a new project entry on the local system only (that is, to the /etc/project file). It cannot add projects to a network naming service.
  • projmod(1M)
    • This command modifies information for a project on the local system only. It cannot modify projects in a network naming service.
    • It can be used to edit fields in a project entry including project attributes, which can contain resource controls.
  • projdel(1M)
    • This command deletes a project from the local system only. It cannot delete projects from a network naming service.
  • newtask(1)
    • This command creates a new task in a specified project.
    • Running processes can be associated with a new task as well.

These commands have been modified to show project and/or task information:

  • ps(1) (-o option displays project and task information)
  • id(1M) (-p option adds the current project ID to user ID and group ID listing)
  • pgrep(1) / pkill(1) (-J option allows these commands to be executed on a list of project IDs)
  • prstat(1M) (-J option displays information on processes and projects, -T option displays information on tasks)
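
The commands above might be combined as in the following sketch. The project name webproj, the user webadm, and the sleep command standing in for a real workload are all hypothetical. The run helper is also a hypothetical convenience: it only prints each command unless DRY_RUN=0 is set, so the sequence can be reviewed before it is run as a privileged user on a Solaris 10 system.

```shell
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

PROJECT=webproj       # hypothetical project name
PROJ_USER=webadm      # hypothetical user who may join the project

run projadd -c webtier -U "$PROJ_USER" "$PROJECT"   # add the entry to the local project file
run newtask -p "$PROJECT" sleep 60                  # start a command in a new task in that project
run id -p                                           # show the current project with user and group IDs
run ps -e -o user,pid,project,taskid,args           # list the project and task ID of each process
run pgrep -J "$PROJECT"                             # find processes that belong to the project
run prstat -J 5 1                                   # one 5-second sample with a per-project summary
```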

It is important to note that projects and tasks are not affected by the introduction of zones in the Solaris 10 OS. Each zone, whether global or non-global, maintains its own projects and tasks to classify workloads running in the isolated environment.

Monitoring and Measuring Workloads

Once workloads can be identified and separated using projects and tasks, it is possible to monitor and measure resource consumption of a given workload using the extended accounting system. This facility records system and network usage on a task or process basis so that a workload's resource consumption can be better understood. When the accounting system is enabled, specified statistics are gathered for tasks and processes and placed in files. These files can then be viewed and analyzed using the libexacct API, the Perl interface to this library, or third-party tools that support this API.

The extended accounting facility works in a zones environment as follows:

  • When it is enabled from the global zone, statistics are gathered on a system-wide level, which includes all of the non-global zones on the system. The global administrator can then analyze resource consumption for the entire system or on a per-zone basis.
  • Accounting records are written to the global zone's accounting files as well as the non-global zone's accounting files.
  • Different account settings and files exist on a per-zone basis for process-based and task-based accounting.

Some of the common commands that are used with the extended accounting system include:

  • acctadm(1M)
    • This is the command that is used to start and stop accounting, to display the status of accounting, to view available accounting resources, to modify attributes of the accounting facility, and to select attributes to track for tasks and processes.
  • wracct(1M)
    • This command writes extended accounting records for active processes and tasks.
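
A typical sequence with these two commands might look like the sketch below. The accounting file path and the task ID are illustrative assumptions, not values from this article, and the hypothetical run helper prints the commands instead of executing them unless DRY_RUN=0 is set.

```shell
# Dry-run helper (hypothetical): prints the command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

ACCT_FILE=/var/adm/exacct/task    # example location for the task accounting file
TASK_ID=42                        # hypothetical ID of an active task

run acctadm -e extended -f "$ACCT_FILE" task   # enable task accounting with the extended resource group
run acctadm                                    # display the current accounting status
run acctadm -r                                 # list the available accounting resource groups
run wracct -i "$TASK_ID" -t interval task      # write an interval record for one active task
```

Because accounting settings exist on a per-zone basis, the same commands would need to be run inside a non-global zone to configure accounting there.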

Controlling Workloads

There are three mechanisms to control workloads within a Solaris environment:

  • Constraints
    • This technique sets bounds on the consumption of specific resources, like CPU or memory, for a given workload using the resource control facility. Setting bounds prevents workloads from consuming too many resources. This mechanism does present risks, however, because if the bounds are set incorrectly, the application may not be able to function.
  • Scheduling
    • This procedure makes resource allocation decisions at set intervals. If a workload is not fully utilizing its resources, those resources are made available to other workloads. In a situation where a workload is over-committed, this mechanism provides controlled allocations.
  • Partitioning
    • This mechanism binds a workload to a defined subset of the system's resources. It guarantees that a known amount of resources will always be available to the workload, but it can hamper system-wide utilization if these resources are under-utilized.

The most common examples of these mechanisms include resource controls, resource pools, memory capping, and IPQoS. Each of these is detailed below with special emphasis on how it functions within a Solaris Zones environment.

Resource Controls

  • Resource controls can be applied to a project, a task, or a process. In a zone environment, they can also be set at a zone-wide level.
  • They are configured in the attribute field of project entries in the project database, unless specified at the zone-wide level. When specified at the zone level, resource controls are set during zone configuration using zonecfg(1M).
  • Standard resource controls are available for CPUs, memory, ports, message queues, LWPs, tasks, CPU time, semaphores, and so on. A full list of supported resource controls can be found under Configuring Resource Controls and Attributes on docs.sun.com. The set of controls available at a zone-wide level, as opposed to those for projects running within a zone, includes only zone.cpu-shares and zone.max-lwps at this point.
    • The zone-wide zone.cpu-shares control works with the fair share scheduler (FSS), which controls the allocation of available CPU resources among workloads based on their importance. The importance is expressed by the number of shares of CPU resources that are assigned to each workload. Shares are defined in terms of ratios, with the global zone getting 1 share by default. FSS CPU shares for a zone are hierarchical. The shares for a given non-global zone are set by the global administrator through the zone-wide resource control zone.cpu-shares. The project.cpu-shares resource control can then be defined for each project within that zone to further subdivide the shares set through the zone-wide control.
  • Either global or local actions can be taken when a resource cap is reached. Global actions include logging at a variety of levels, while local actions can be one of the following:
    • none, which takes no action but does indicate that the bound was exceeded
    • deny, which does not fulfill requests that go above the threshold
    • signal, which sends a specified signal to the process when the threshold is exceeded
  • Not all local actions can be applied to every resource control, however.
  • Resource controls interact with zones in the Solaris 10 OS in the following manner:
    • Any of the resource controls listed under Configuring Resource Controls and Attributes can be set as attributes of projects running in the non-global zone. These projects are then controlled by these resource limitations.
    • Setting the controls for a non-global zone affects only that zone. Projects that span multiple zones can have different controls in each zone.
    • Controls are subject to the additional requirements regarding pools and the zone-wide resource controls.
  • Here are some of the common commands that are used to control workloads via resource controls:
    • rctladm(1M)
      • This command allows for runtime interrogations and modifications of the resource control facility with global scope.
      • In a non-global zone, this command cannot be used to modify settings.
    • prctl(1)
      • This command allows for runtime interrogations and modifications of the resource control facility with local scope.
      • Changes made with this command last only until the system reboots. The projmod(1M) command should be used to make changes that persist across reboots.
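
The hierarchical nature of FSS shares described above can be worked through with simple arithmetic. The sketch below assumes that every zone and project is competing for CPU at the same time, in which case a project's portion of the machine is its zone's fraction of all zone shares multiplied by the project's fraction of the shares inside that zone. The function name fss_entitlement and all share values are hypothetical.

```shell
# Prints the percentage of total CPU a project is entitled to when all
# zones and projects are busy.
#   $1 = zone.cpu-shares of the zone      $2 = sum of zone.cpu-shares over all zones
#   $3 = project.cpu-shares of project    $4 = sum of project.cpu-shares in that zone
fss_entitlement() {
    bp=$(( $1 * $3 * 10000 / ($2 * $4) ))    # hundredths of a percent, truncated
    printf '%d.%02d%%\n' $(( bp / 100 )) $(( bp % 100 ))
}

# Hypothetical system: global zone 1 share, zone A 2 shares, zone B 1 share (4 in total).
# Inside zone A, one project holds 10 of the 40 project shares.
fss_entitlement 2 4 10 40    # zone A gets 2/4, the project gets 10/40 of that

# A single project in the global zone receives the global zone's whole portion.
fss_entitlement 1 4 1 1
```

With these numbers the first call prints 12.50% and the second prints 25.00%. Shares are ratios rather than fixed reservations, so a workload can use more than its entitlement whenever other workloads are idle.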

Dynamic Resource Pools

  • Resource pools provide a way to separate workloads into pools of CPUs so that workload requirements for CPU resources do not compete.
  • When a pool's resources are not fully utilized, they are temporarily allocated to other pools as needed so resources are not wasted.
  • Here are the rules for resource pools within zones:
    • When a non-global zone is allocated resources from a CPU pool, the resources can be subdivided further and given to specific workloads within the non-global zone by the zone administrator. Resource assignment occurs through the implementation of projects.
    • A zone may be assigned to only one resource pool although several zones may be assigned to the same resource pool.  Processes in the global zone, however, can be bound by a sufficiently privileged process to any pool.
    • These options are available within zonecfg to configure the pools:
      • zone.cpu-shares, which specifies the number of FSS CPU shares available to the entire zone from the resource pool
      • pool, which indicates the name of the resource pool the zone is bound to when booted
  • Some of the commands and daemons to manage pools are given below.
    • poold(1M)
      • This is the daemon that partitions resource pools.
      • This runs only in the global zone, where there can be more than one pool for it to operate on.
    • poolstat(1M)
      • This command reports active pool statistics.
      • When run in a non-global zone, this command displays statistics about the pool associated with that zone only.
    • pooladm(1M)
      • This is the command used to administer resource pools.
      • When run without arguments in a non-global zone, this command displays only information about the pool associated with the zone.

Memory Capping

  • Constraints can be set within projects to limit the amount of memory consumed by processes belonging to the project.
  • The memory caps are defined as attributes of a project.
  • rcapd(1M) is the daemon that controls memory capping, and rcapadm(1M) is the command used to enable or disable memory capping.
  • When the processes in a project exceed the specified memory cap, pages are paged out until the project's memory use returns to the cap. Take care when setting the memory cap because:
    • If it is set too high, the system's resources can be consumed before the cap is reached.
    • If it is set too low, the system may experience excessive paging.
  • Memory capping within a zones environment works as follows:
    • Both global and non-global zones have their own rcapd(1M) daemons.
    • Each zone must be configured separately.

IPQoS

  • This feature provides consistent levels of service to network users. It allows network traffic to be controlled, prioritized, and monitored.
  • At the zone level, it manages network traffic in and out of the zone. An upper limit can be set, and if it is exceeded, packets are dropped.
  • This feature should be implemented with care as it does have CPU overhead. Additionally, it should be determined that the feature works with other network components, like firewalls and routers, before implementation.

Summary

This paper has examined both Solaris Zones and Resource Management to build a better understanding of Solaris Containers technology and how it should be used. Solaris Zones partitioning technology provides virtualized and secure operating environments for running applications, and Resource Management features provide the Solaris OS with functionality that helps control and manage resources. Through these two feature sets, Solaris Containers help users achieve higher levels of consolidation and better system utilization.


Resources

About the Author

Jennifer Rodoni Glore, who has been with Sun for five years, is an engineer in the Market Development Engineering organization focused on system integrator adoption of Sun products and solutions. Over the past year, Jennifer has been focused on the Solaris OS and the x86 platform.


Unless otherwise licensed, code in all technical manuals herein (including articles, FAQs, samples) is provided under this License.

