Many organizations are deploying data warehouses because their corporate data ultimately has value that will help them maintain competitiveness. Data warehouses enable decision support systems that facilitate extracting business intelligence from enormous amounts of data. They can allow businesses to cope with narrowing market windows, reduced product cycles, demands for greater profitability, and the need to differentiate their customer service from the competition. Significant benefits accrue to corporate Information Technology organizations as well because they can consolidate data from diverse operational systems, off-load analysis from overburdened mainframes, and promote the use of powerful, commercially available analysis tools.
In 1997, the Palo Alto Management Group showed that the average size of data warehouses maintained by Fortune 500 companies is expected to expand by a factor of 36 between 1997 and 2001 -- for example, a 100 GB data warehouse would swell to an astonishing 3.6 terabytes in only four years! Because of the increasing amounts of data stored in data warehouses, it has long been thought that massively parallel processing (MPP) systems would be necessary to achieve reasonable performance in accessing and analyzing these multiple terabyte databases. Although MPP architectures have promised an unlimited growth path, the realities of MPP performance and scalability have not shown this to be the case -- most MPP systems have been deployed with a few dozen processors, hardly configurations that the term ``massively parallel processing'' implies.
The weakness of massively parallel processors is their extreme sensitivity to the underlying data partitioning:
- When database records needed in a query are evenly distributed across all of the processor nodes in an MPP system, all nodes can participate in parallel data analysis and excellent scalability is achieved.
- When the data distribution is skewed across nodes, or when the query accesses only a portion of the data, some nodes can efficiently participate in query operations and some cannot, leading to poor scalability.
The responsibility for data partitioning falls squarely on the shoulders of database administrators -- and since the cost-effectiveness of an entire MPP system depends on partitioning, these choices can put careers on the line. There is only one way to physically partition each database object (tables, indices, etc.). Because decision support systems by their very nature are intended to respond to previously unforeseen queries, there will never be a partitioning choice that is optimal for all queries.
In order to make the partitioning issue seem less critical for MPP systems, vendors have now found ways to reduce inter-node communication and improve their benchmark results through the use of highly-specialized techniques. If the use of these short-cuts is representative of real-world use of data warehouses -- including true ad hoc queries -- they will truly benefit the end user. In reality, data warehouses are created for the purpose of answering questions that have not previously been formulated -- and specialized indices that are based on advance knowledge of queries are not likely to provide a benefit.
Key in comparing architectures is the fact that SMP systems are more efficient and do not require specialized techniques to deliver performance that is much more predictable than MPP systems. While others have been designing larger and more complex massively parallel processors, Sun Microsystems has been persistently solving the engineering challenges of symmetric multiprocessing. These architectures allow all processors to share memory and I/O devices equally, dramatically reducing the impact of the database partitioning issue that haunts MPP systems. With SMP architectures from Sun, data warehouses can be deployed, populated, and expanded without having to optimize for every imaginable type of query -- resulting in performance that is more predictable and has less risk.
With the Sun Enterprise 6500 and the Sun Enterprise 10000 (Starfire) servers, the largest data warehouse installations can be handled with the same binary compatibility as the smallest database problems. The Starfire server is architecturally superior to MPP -- its Gigaplane-XB system interconnect supports 12.8 GB/sec throughput, more than two orders of magnitude faster than the processor interconnection mechanism of the IBM RS/6000 SP. The Enterprise 10000 server supports up to sixty-four 336 MHz UltraSPARC processors, 64 GB of main memory, and more than 20 terabytes of disk storage. And if even larger data warehouses are required, servers from Sun can be configured in clusters, delivering the benefits of SMP with even greater numbers of processors -- for example, one cluster discussed in this paper consists of four Sun Enterprise 6000 servers with a total of 96 powerful UltraSPARC processors.
The database server market is highly competitive, and thus sees a rapid release of performance data for new and different hardware platforms and database servers. In 1997 this paper was able to objectively show the inherently superior performance of SMP systems by comparing SMP and MPP servers configured with the same number of processors, roughly-equivalent table scanning performance, and the exact same versions of database software. Today, published TPC-D results encompass a wide range of servers and database software, and a more detailed look at individual TPC-D results is necessary in order to highlight the new short-cuts taken by database server vendors. This paper surveys the issues in data warehouse performance between SMP, MPP, and cluster architectures, and takes a detailed look at the current TPC-D benchmark results to reveal how short-cuts can make performance appear better than will be seen in actual real-world use.
Organizations deploy data warehouses so that they can extract meaningful, yet non-obvious, information from large amounts of data using techniques such as relational intra-query parallelization, on-line analytical processing (OLAP), data mining, and multidimensional databases. Powerful systems to perform these analyses require access to many times the amount of data that is stored in any one of a company's operational systems -- a proposition that has become more feasible given the plummeting cost of disk storage.
The most common way for organizations to deploy data warehouses is to periodically migrate data from on-line transaction processing (OLTP) databases into data warehouses. Given that the schemas used in data warehouses are usually different from the operational OLTP systems, the migration process itself can be as resource-intensive as the extraction and scrubbing of the data.
The amount of storage needed is staggering as well -- with a company's payroll, sales, shipping, and warehouse management databases containing tens of gigabytes of data, data warehouses that coalesce all of this information fall in the range of one to ten terabytes. Because the usefulness of this data cannot be determined a priori, all of a company's data is usually stored in a data warehouse -- not just a statistical sample that may or may not contain the information needed to make successful strategic decisions.
Data warehouses present a constant challenge of rapid application deployment -- unlike OLTP systems where the workload is predictable and can be managed with careful tuning, data warehouses are constantly changing as new applications are created, new user modalities evolve with their maturity, and access to new forms of data is made possible. Because of their constantly-changing nature, all data warehouses require custom configuration. There are three main areas to consider when deploying data warehouses:
- Query Complexity
The complexity of queries that decision support systems must support ranges from simple canned queries to data mining using artificial intelligence techniques. Canned queries may utilize pre-compiled, optimized SQL that is used to answer simple, often repeated, questions such as: ``what is the margin on a particular product for the month?'' Ad hoc queries are written in SQL and are used to perform more complex analysis of data. Finally, queries that support data mining operations are complex, have usually never been considered before, and are typically quite difficult to optimize. These queries are often not SQL-based, and make use of other computationally intensive methods such as genetic programs, tree inducers, neural nets, and rule discovery algorithms.
- System Architecture
Decision support systems lend themselves well to parallel processing technology. The parallel computing architectures that can support large data warehouses vary primarily in the degree to which memory is hierarchical.
Symmetric multiprocessors provide uniform access to all memory using high-speed buses or crossbar switching technologies that support point-to-point interconnection between processors.
Clustered approaches use sets of SMP systems linked with slower-speed interconnection mechanisms.
Massively parallel processor systems utilize nodes with local memory accessed through a local high-speed bus, with intercommunication between nodes accomplished through slower speed message-based interconnects.
Most companies deploying data warehouses start small and grow their infrastructure as the amount of data and the demands on the decision support systems increase. Data warehouses are often initially deployed with 16 or fewer processors, and require a growth path that can support many times the initial processing capability. One of the reasons for the popularity of Sun's database servers is that, with binary compatibility across their entire product line, data warehouses can begin with very small desktop servers and grow to configurations that cluster multiple SMP systems from Sun.
As processors are added to an SMP, or nodes are added to an MPP, it is important for the system to scale. Ideally, a system will demonstrate a property called speed-up, in which a job that requires one unit of time to complete with one processor will require 1/N of the time to complete with N processors. For example, if a job that requires ten hours to complete with one processor requires only one hour to complete with ten processors, the system scales well.
Another desirable characteristic of scalability is called scale-up. A system with excellent scale-up offers the same level of performance as the size of the data warehouse increases through the addition of processors or nodes. For example, a batch job that takes ten hours to run when the database is one terabyte in size will take the same length of time at two terabytes, simply by doubling the number of processors. Note that on an MPP, the data must also be re-partitioned across the nodes in order to maintain scalability, usually a time-consuming and risky venture with terabyte-sized databases. This step is not required on an SMP.
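The two definitions above can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes perfectly linear scaling, which real systems only approximate; the function names and the 16-processor starting configuration are illustrative and do not come from any benchmark.

```python
def ideal_runtime(base_hours, base_processors, processors, data_scale=1.0):
    """Runtime under ideal (linear) scaling.

    Assumes the work grows in proportion to data_scale and is divided
    perfectly evenly across processors. This is an idealization.
    """
    return base_hours * data_scale * base_processors / processors


def speed_up(time_one_processor, time_n_processors):
    """Ratio of single-processor runtime to N-processor runtime."""
    return time_one_processor / time_n_processors


# Speed-up: a ten-hour job on one processor, rerun on ten processors.
hours_on_ten = ideal_runtime(10.0, 1, 10)
print(hours_on_ten)                  # 1.0
print(speed_up(10.0, hours_on_ten))  # 10.0

# Scale-up: the data doubles and so does the (hypothetical) processor count.
hours_doubled = ideal_runtime(10.0, 16, 32, data_scale=2.0)
print(hours_doubled)                 # 10.0
```

A measured speed-up below the processor count, or a runtime that grows as data and processors are doubled together, indicates how far a real configuration falls short of this ideal.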
Many database administrators look at scalability from the standpoint of whether the system has predictable behavior as the intensity of the workload increases. A system that scales well is one that holds no surprises as both the system and the workload grow.
A brief technical note about a detail of database architecture: threads and processes. All UNIX-based database systems make use of UNIX processes, but some also take advantage of a capability called threading. Threads are loci of control within one process address space that allow multiple simultaneous sets of activity. Some databases such as those by Informix and Sybase are built using a small number of processes and multiple threads; other databases such as Oracle and DB2 are built using multiple single-threaded processes to achieve concurrency. Oracle is the database used as an example in this paper, and thus the concept of multiple processes will be used throughout.
Massively parallel processor systems use a large number of nodes each accessed using an interconnect mechanism that supports message-based data transfers typically on the order of 13 to 38 MB/sec. Figure 1 illustrates that each node is a self-sufficient processor complex consisting of CPU, memory, and disk subsystems. In the vernacular, an MPP is considered a ``shared nothing'' system because memory and I/O resources are independent, with each node even supporting its own copy of the operating system. MPP systems promise unlimited scalability, with a growth path that allows as many processors as necessary to be added to a system.
MPP architectures provide excellent performance in situations where a problem can be partitioned so that all nodes can run in parallel with little or no inter-node communication. In reality, the true ad hoc queries typical of data warehouses can only rarely be so well-partitioned, and thus limit the performance that MPP architectures actually deliver. When either data skew or high inter-node communication requirements prevail, the scalability of MPP architectures is severely limited. Reliability is a concern with MPP systems because the loss of a node does not merely reduce the processing power available to the whole system; it can make any database objects that are wholly or partially located on the failed node unavailable.
An interesting industry trend is for MPP vendors to augment single-processor ``thin'' nodes with multiprocessor ``fat'' nodes using many processors in an SMP configuration within each node. If the trend continues, MPP systems will have fewer nodes with increasing numbers of processors in each, and the architecture will begin to resemble clusters of SMP systems, discussed below.
Symmetric multiprocessors fall on the opposite end of the spectrum from MPP systems. These systems consist of anywhere from two to 64 processors that share memory and disk I/O resources equally under the control of one copy of the operating system, as shown in Figure 2. Because system resources are equally shared between processors, they can be managed more effectively. SMP systems make use of extremely high-speed interconnections to allow each processor to share memory on an equal basis. These interconnections are two orders of magnitude faster than those found in MPP systems, and range from the 2.6 GB/sec interconnect on Sun Enterprise 4000, 5000, and 6000 systems to the 12.8 GB/sec aggregate throughput of the Sun Enterprise 10000 server with the Gigaplane-XB architecture.
In addition to high bandwidth, low communication latency is also important if the system is to show good scalability. This is because common data warehouse database operations such as index lookups and joins involve communication of small data packets. Sun provides very low latency interconnects. At only 400 nsec. for local access, the latency of the Starfire server's Gigaplane-XB interconnect is 200 times less than the 80,000 nsec. latency of the IBM SP2 interconnect. Of all the systems discussed in this paper, the lowest latency is on the Sun Enterprise 6000, with a uniform latency of only 300 nsec. When the amount of data contained in each message is small, the importance of low latencies is paramount.
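To see why latency dominates for small messages, consider a simple cost model in which the time to deliver a message is the fixed latency plus the message size divided by the bandwidth. This model, the 128-byte message size, and the choice of 38 MB/sec (the upper end of the MPP range quoted earlier) are assumptions made for illustration; the latency and Gigaplane-XB bandwidth figures are the ones cited in this paper.

```python
def transfer_time_ns(message_bytes, latency_ns, bandwidth_bytes_per_sec):
    """Simplified model: fixed latency plus time on the wire, in nanoseconds."""
    return latency_ns + message_bytes / bandwidth_bytes_per_sec * 1e9


MESSAGE_BYTES = 128  # hypothetical small packet, e.g. one index entry

# Gigaplane-XB: 400 nsec latency, 12.8 GB/sec (1 GB taken as 10**9 bytes).
smp_ns = transfer_time_ns(MESSAGE_BYTES, 400, 12.8e9)

# MPP interconnect: 80,000 nsec latency, 38 MB/sec assumed for bandwidth.
mpp_ns = transfer_time_ns(MESSAGE_BYTES, 80_000, 38e6)

print(round(smp_ns))           # about 410 nsec
print(round(mpp_ns))           # about 83,368 nsec
print(round(mpp_ns / smp_ns))  # roughly 203 times slower

# Share of each transfer that is pure latency.
print(round(400 / smp_ns, 2), round(80_000 / mpp_ns, 2))
```

Under these assumptions, more than 95 percent of each transfer is latency on both systems, so for small packets the ratio between the two closely tracks the 200-to-1 latency ratio rather than the bandwidth ratio.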
Sun has long held that SMP is the architecture yielding the best reliability and performance, and has worked through the difficult engineering problems to develop servers that can scale nearly linearly with the incremental addition of processors. Ironically, the special shared-nothing versions of merchant database products that run on MPP systems also run with excellent performance on SMP architectures, although the reverse is not true.
When data warehouses must be scaled beyond the number of processors available in current SMP configurations, or when the high availability (HA) characteristics of a multiple-system complex are desirable, clusters provide an excellent growth path. High-availability software can enable the other nodes in a cluster to take over the functions of a failed node, ensuring around-the-clock availability of enterprise-critical data warehouses. A cluster of four SMP servers is illustrated in Figure 3. The same database management software that exploits multiple processors in an SMP architecture and distinct processing nodes in an MPP architecture can create query execution plans that utilize all servers in the cluster.

With their inherently superior scalability, SMP systems provide the best building blocks for clustered systems. These systems are configured with multi-ported disk arrays so that the nodes which have direct disk access enjoy the same disk I/O rates as stand-alone SMP systems. Nodes not having direct access to disk data must use the high-speed cluster interconnect mechanism.
In data warehouses deployed with clusters, the database management system or load-balancing software is responsible for distributing the various threads of the DSS queries across the multiple nodes for maximum efficiency. As with MPP systems, the more effectively the query can be partitioned across the nodes in the cluster -- and the less inter-node communication that is necessary -- the more scalable the solution will be. This leads to the conclusion that clusters should be configured with as few nodes as possible, with each SMP node scaled up as much as possible before additional nodes are added to the cluster.
Sun currently supports clustering of 30-processor Enterprise 6000 platforms, with the ability to cluster 64-processor Enterprise 10000 systems available in the future.
Beware that the larger the number of nodes in a cluster, the more the cluster looks like an MPP system, and the database administrator may need to deal with the issues of large numbers of nodes much sooner than with clusters of more powerful SMP systems from Sun. These are the familiar issues of non-uniformity of data access, partitioning of database tables across nodes, and the performance limits imposed by the high-bandwidth interconnects. A new direction that Sun is taking with clustered systems is to develop software that allows a single system image to be executed across a clustered architecture, increasing the ease of management far beyond that of today's clusters.
The range of architectural choices from MPP to SMP offers a complex decision space for organizations deploying data warehouses. Given that companies tend to make architectural choices early, and then invest up to hundreds of millions of dollars as they grow, the result of choosing an architecture that presents increasingly intractable partitioning problems and the likelihood of idle nodes can have consequences that measure in the millions of dollars. A system that scales well exploits all of its processors and makes best use of the investment in computing infrastructure.
Regardless of environment, the single largest factor influencing the scalability of a decision support system is how the data is partitioned across the disk subsystem. Systems that do not scale well may have entire batch queries waiting for a single node to complete an operation. On the other hand, for throughput environments with multiple concurrent jobs running, these underutilized nodes may be exploited to accomplish other work.
There are typically three approaches to partitioning database records:
- Range Partitioning
Range partitioning places specific ranges of table entries on different disks. For example, records having ``name'' as a key may have names beginning with A-B in one partition, C-D in the next, and so on. Likewise, a DSS managing monthly operations might partition each month onto a different set of disks. In cases where only a portion of the data is used in a query -- the C-D range, for example -- the database can avoid examining the other sets of data in what is known as partition elimination. This can dramatically reduce the time to complete a query.
The difficulty with range partitioning is that the quantity of data may vary significantly from one partition to another, and the frequency of data access may vary as well. For example, as the data accumulates, it may turn out that a larger number of customer names fall into the M-N range than the A-B range. Likewise, mail-order catalogs find their December sales to far outweigh the sales in any other month.
- Round-Robin Partitioning
Round-robin partitioning evenly distributes records across all disks that compose a logical space for the table, without regard to the data values being stored. This permits even workload distribution for subsequent table scans. Disk striping accomplishes the same result -- spreading read operations across multiple spindles -- but with the logical volume manager, not the DBMS, managing the striping. One difficulty with round-robin partitioning is that performance cannot be enhanced with partition elimination, even for queries where that technique would be appropriate.
- Hash Partitioning
Hash partitioning is a third method of distributing DBMS data evenly across the set of disk spindles. A hash function is applied to one or more database keys, and the records are distributed across the disk subsystem accordingly. Again, a drawback of hash partitioning is that partition elimination may not be possible for those queries whose performance could be improved with this technique.
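The three placement rules above, and the way partition elimination applies only to range partitioning, can be sketched in a few lines. This is an illustration rather than any vendor's implementation: the 13-partition A-B, C-D, ... layout follows the ``name'' example in the text, while the function names and the use of CRC-32 as the hash function are assumptions.

```python
from bisect import bisect_left
import zlib

# Hypothetical layout: 13 partitions covering A-B, C-D, ..., Y-Z.
# Each entry is the last initial letter stored in that partition.
UPPER_BOUNDS = [chr(c) for c in range(ord("B"), ord("Z") + 1, 2)]
NUM_PARTITIONS = len(UPPER_BOUNDS)


def range_partition(name):
    """Partition chosen by the first letter (assumes an ASCII letter)."""
    return bisect_left(UPPER_BOUNDS, name[0].upper())


def round_robin_partition(row_number):
    """Partition chosen by arrival order, ignoring the data values."""
    return row_number % NUM_PARTITIONS


def hash_partition(name):
    """Partition chosen by a hash of the key (CRC-32 is an arbitrary choice)."""
    return zlib.crc32(name.encode("utf-8")) % NUM_PARTITIONS


def partitions_to_scan(first_letter, last_letter, scheme):
    """Partitions a query on names in first_letter..last_letter must read."""
    if scheme == "range":
        low = range_partition(first_letter)
        high = range_partition(last_letter)
        return list(range(low, high + 1))
    # Round-robin and hash placement ignore key order, so a range
    # predicate cannot rule out any partition.
    return list(range(NUM_PARTITIONS))


print(partitions_to_scan("C", "D", "range"))      # [1]
print(len(partitions_to_scan("C", "D", "hash")))  # 13
```

With range partitioning the C-D query reads one partition out of 13; with the other two schemes it reads all of them, although each of those partitions holds a more even share of the rows.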
For symmetric multiprocessors, the main reason for data partitioning is to avoid ``hot spots'' on the disks, where records on one spindle may be frequently accessed, causing the disk to become a bottleneck. These problems can usually be avoided by combinations of database partitioning and the use of disk arrays with striping. Because all processors have equal access to memory and disks, the layout of data does not significantly affect processor utilization.
For massively parallel processors, improper data partitioning can degrade performance by an order of magnitude or more. Because all processors do not share memory and disk resources equally, the choice of on which node to place which data has a significant impact on query performance.
The choice of partition key is a critical, fixed decision that has extreme consequences over the life of an MPP-based data warehouse. Each database object can be partitioned once and only once without re-distributing the data for that object. This decision determines long into the future whether MPP processors are evenly utilized, or whether many nodes sit idle while only a few are able to efficiently process database records. Unfortunately, because the very purpose of data warehouses is to answer ad hoc queries that have never been foreseen, a correct choice of partition key is one that is, by its very definition, impossible to make. This is a key reason why database administrators who wish to minimize risks tend to recommend SMP architectures where the choice of partition strategy and keys has significantly less impact.
Three fundamental operations composing the steps of query execution plans are table scans, joins, and index lookup operations. Since decision support system performance depends on how well each of these operations is executed, it's important to consider how their performance varies between SMP and MPP architectures.
On MPP systems where the database records happen to be uniformly partitioned across nodes, good performance on single-user batch jobs can be achieved because each node/memory/disk combination can be fully utilized in parallel table scans, and each node has an equivalent amount of work to complete the scan. When the data is not evenly distributed, or less than the full set of data is accessed, load skew can occur, causing some nodes to finish their scanning quickly and remain idle until the processor having the largest number of records to process is finished. Because the database is statically partitioned, and the cost of eliminating data skew by moving parts of tables across the interconnect is prohibitively high, table scans may or may not equally utilize all processors depending on the uniformity of the data layout. Thus the impact of database partitioning on an MPP can allow it to perform as well as an SMP, or significantly less well, depending on the nature of the query.
On an SMP system, all processors have equal access to the database tables, so consistent performance is achieved regardless of the database partitioning. The database query coordinator simply allocates a set of processes to the table scan based on the number of processors available and the current load on the DBMS. Table scans can be parallelized by dividing up the table's records between processes and having each processor examine an equal number of records -- avoiding the problems of load skew that can cripple MPP architectures.
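The effect of load skew on a parallel table scan can be illustrated with a toy model. It assumes that every processor scans at the same rate, that an MPP scan finishes only when its most heavily loaded node finishes, and that an SMP can divide the rows evenly at query time; the row counts and scan rate are made-up figures, and interconnect, caching, and disk effects are ignored.

```python
def mpp_scan_seconds(rows_per_node, rows_per_second):
    """The scan completes when the node holding the most rows completes."""
    return max(rows_per_node) / rows_per_second


def mpp_utilization(rows_per_node):
    """Average fraction of the scan during which a node is busy."""
    return sum(rows_per_node) / (len(rows_per_node) * max(rows_per_node))


def smp_scan_seconds(total_rows, processors, rows_per_second):
    """Shared memory and disks let the rows be split evenly per query."""
    return total_rows / processors / rows_per_second


RATE = 100_000                        # hypothetical rows/sec per processor
even = [1_000_000] * 8                # 8,000,000 rows spread uniformly
skewed = [4_500_000] + [500_000] * 7  # the same 8,000,000 rows, skewed

print(mpp_scan_seconds(even, RATE))              # 10.0
print(mpp_scan_seconds(skewed, RATE))            # 45.0
print(round(mpp_utilization(skewed), 2))         # 0.22
print(smp_scan_seconds(sum(skewed), 8, RATE))    # 10.0
```

In this model the skewed MPP layout takes 4.5 times as long as the uniform one while seven of the eight nodes sit idle for most of the scan, whereas the SMP time depends only on the total row count.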
Consider a database join operation in which the tables to be joined are equally distributed across the nodes in an MPP architecture. A join of this data may have very good or very poor performance depending on the relationship between the partition key and the join key:
- If the partition key is equal to the join key, a process on each of the MPP nodes can perform the join operation on its local data, most effectively utilizing the processor complex.
- If the partition key is not equal to the join key, each record on each node has the potential to join with matching records on all of the other nodes. When the MPP hosts N nodes, the join operation requires each of the N nodes to transmit each record to the remaining N-1 nodes, increasing communication overhead and reducing join performance. The problem gets even more complex when real-world data having an uneven distribution is analyzed. Unfortunately, with ad hoc queries predominating in decision support systems, the case of partition key not equal to the join key can be quite common.
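The communication cost in the second case can be estimated by counting record transmissions. The sketch below follows the worst-case model described above, in which every node sends each of its records to the other N-1 nodes; it is an assumption-laden illustration, and a real optimizer may choose a cheaper redistribution strategy. The function name and row counts are hypothetical.

```python
def records_sent(rows_per_node, partition_key_is_join_key):
    """Record transmissions between nodes under the broadcast model."""
    if partition_key_is_join_key:
        # Matching rows are already co-located; each node joins locally.
        return 0
    nodes = len(rows_per_node)
    # Every record travels to each of the other N-1 nodes.
    return sum(rows_per_node) * (nodes - 1)


four_nodes = [1_000] * 4
sixteen_nodes = [1_000] * 16

print(records_sent(four_nodes, True))      # 0
print(records_sent(four_nodes, False))     # 12000
print(records_sent(sixteen_nodes, False))  # 240000
```

With the same number of rows per node, quadrupling the node count from 4 to 16 raises the traffic twentyfold in this model, because both the total row count and the number of destinations grow.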
To make matters worse, MPP partitioning decisions become more complicated when joins among multiple tables are required. For example, consider Figure 4, where the DBA must decide how to physically partition three tables: Supplier, PartSupp, and Part. It is likely that queries will involve joins between Supplier and PartSupp, as well as between PartSupp and Part. If the DBA decides to partition PartSupp across MPP nodes on the Supplier key, then joins to Supplier will proceed optimally and with minimum inter-node traffic. But then joins between Part and PartSupp could require high inter-node communication, as explained above. The situation is similar if instead the DBA partitions PartSupp on the Part key.

For an SMP, the records selected for a join operation are communicated through the shared memory area. Each process that the query coordinator allocates to the join operation has equal access to the database records, and when communication is required between processes it is accomplished at memory speeds that are two orders of magnitude faster than MPP interconnect speeds. Again, an SMP has consistently good performance independent of database partitioning decisions.
The query optimizer chooses index lookups when the number of records to retrieve is a small (significantly less than one percent) portion of the table size. During an index lookup, the table is accessed through the relevant index, thus avoiding a full table scan. In cases where the desired attributes can be found in the index itself, the query optimizer will access the index alone, perhaps through a parallel full-index scan, not needing to examine the base table at all. For example, assume that the index is partitioned evenly across all nodes of an MPP, using the same partition key as used for the data table. All nodes can be equally involved in satisfying the query to the extent that matching data rows are evenly distributed across all nodes. If a global index -- one not partitioned across nodes -- is used, then the workload distribution is likely to be uneven and scalability low.
On SMP architectures, performance is consistent regardless of the placement of the index. Index lookups are easily parallelized on SMPs because each processor can be assigned to access its portion of the index in a large shared memory area. All processors can be involved in every index lookup, and the higher interconnect bandwidth can cause SMPs to outperform MPPs even in the case where data is also evenly partitioned across the MPP architecture.
The lesson here is clear: for the basic building blocks of database queries -- table scans, joins, and index lookups -- the scalability of an MPP architecture depends on the partitioning of data across the processor complex. Any one choice of partition key may cause some queries to be as fast as on an SMP, and it will cause other queries to fall far short of the performance available from an SMP. This puts database administrators into a bind that has no easy solution. The choice of partition key is critical, and even if the best choice is made, it will still be the wrong choice for some queries demanded of a decision support system.
An SMP gives consistent performance because all processors have equal access to disk resources, and when communication between processes in the DBMS is required, it is accomplished at memory speeds which are two orders of magnitude faster than MPP message-based interconnects. With the consequences of disk layout being so minor compared to database partitioning on an MPP, the choice of SMP architectures is one that brings consistency in performance, as well as scalability.
A skeptical database administrator, however, will not accept these qualitative arguments without real performance data to back them up, and fortunately the TPC-D benchmarks provide a means for all vendors to put forth their best performance measurements.
In April 1995, the 42-member Transaction Processing Performance Council (TPC) approved the TPC-D benchmark. TPC-D is representative of a wide range of decision support system workloads that require complex, long-running queries against large and complex databases. The 17 queries and 2 updates in TPC-D are designed to give answers to real-world business questions which are far more complex than those seen in OLTP operations. Because these are wide-ranging, real-world queries, it is difficult to optimize the database in such a way that performance on all queries is evenly improved; however, because of the time that vendors have been able to study the benchmark in detail, quite a few short-cuts have been found which can yield performance improvements for a few specific queries.
The queries generate intense activity on the system under test, and include a rich set of operators and selection constraints. The benchmark is executed against a database that complies with specific requirements as to how it is populated, and how it is scaled. The smallest database (scale factor of 1) holds business intelligence amounting to approximately one gigabyte of data.
The largest database servers in use today are measured at scale factor 1000, utilizing one terabyte of data. The actual storage that this data consumes is considerably greater once indices, temporary space, and mirrors are factored in. A scale factor of 1000 requires from three to five terabytes of disk space.
In order to make the benchmark more representative of real decision support systems, one of the requirements for running the benchmark is the execution of update processes that simulate the periodic transfer of data from the operations side of an enterprise (the OLTP systems) into the data warehouse. Some of the short-cuts that can be used to improve performance have an adverse effect on the update functions, making them an important component of the benchmark. (More information on TPC-D is available from TPC through the contact information provided in the References).
This paper evaluates the impact of data warehouse architectures on query performance by closely examining the TPC-D queries and comparing how SMP and clustered system performance compares with MPP performance. The results from different vendors are publicly available, and each vendor puts forth a significant amount of effort to ensure that the best possible results are presented. The systems discussed in this paper are all measured at scale factor 1000, and include the following:
- Symmetric Multiprocessor. The SMP system used in the comparison is the Sun Enterprise 10000 with 64 UltraSPARC processors running at 336 MHz and 64 GB of main memory. The TPC-D benchmark was executed using Oracle Version 8.0.4.2. This system and configuration were available beginning June 1, 1998.
- Cluster. The cluster system is a four-way cluster of Sun Enterprise 6000 servers hosting a total of 96 UltraSPARC processors running at 336 MHz and using a total of 64 GB of main memory. The benchmark was executed using Informix IDS AD/XP Version 8.21. This configuration was available beginning June 15, 1998.
- IBM Massively Parallel Processor. The MPP system from IBM used in this comparison is the IBM RS/6000 SP model 550 with 192 604e PowerPC processors running at 332 MHz and a total of 144 GB of main memory. IBM used DB2 UDB Version 5.2 for their measurements of the TPC-D benchmark. This configuration is to be available beginning October 31, 1998.
- Teradata Massively Parallel Processor. The MPP system from Teradata is the NCR Teradata Worldmark Server with 128 Intel Pentium Pro processors running at 200 MHz and a total of 32 GB of main memory. The benchmark was executed using Teradata DBS V2R2.1. This system and configuration were available beginning June 1, 1998.
The Transaction Processing Performance Council requires a standard set of metrics to be disclosed with every published result, and requires each set of results to be formally audited by an independent, TPC-certified auditor. The summary presented in Table 1 shows the total system cost, the power metric, the throughput metric, and a price/performance metric:
- Total System Cost
The total system cost reflects the price of the exact configuration for which the results are presented -- including five years of maintenance. The fact that total system cost must be disclosed prevents vendors from adding huge amounts of hardware to avoid performance bottlenecks. Vendors must carefully consider the cost-effectiveness of each investment in hardware.
| System | Type | DB Mgmt System | Scale Factor | Total System Cost | TPC-D Power | TPC-D Throughput | Price/Performance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sun Enterprise 10000, 64 processors, single node, single stream | SMP | Oracle V8.0.4.2 | 1000 | $8,536,522 | 8,870.6 | 3,612.1 | $1,508 |
| Sun Enterprise 6000, 96 processors, four nodes, single stream | Cluster | Informix IDS AD/XP V8.21 | 1000 | $11,766,932 | 12,931.9 | 5,850.3 | $1,363 |
| IBM RS/6000 SP/550, 192 processors, 48 nodes, 7 streams | MPP | DB2 UDB V5.2 | 1000 | $12,491,647 | 19,137.5 | 10,661.5 | $875 |
| Teradata Worldmark Server, 128 processors, 32 nodes, 5 streams | MPP | Teradata DBS V2R2.1 | 1000 | $14,495,886 | 12,149.2 | 3,912.3 | $2,103 |
- Power Metric
The power metric is calculated from the geometric mean of all of the 17 queries and the two update processes. It is a unit-less measure where a larger number represents better performance.
- Throughput Metric
The throughput metric has units of queries per hour times the scale factor; the number of streams used for this test is provided in the table, and the streams are executed according to the TPC specifications.
- Price/Performance
The TPC-D price/performance metric measures cost per query per hour times the scale factor.
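As a rough cross-check of how the published metrics relate, the sketch below assumes that the price/performance figure is the total system cost divided by a composite queries-per-hour rating, and that the composite is the geometric mean of the power and throughput metrics. That composite formula is not stated in this paper and is an assumption here; the function names are illustrative only.

```python
import math


def composite_qph(power: float, throughput: float) -> float:
    """Assumed composite rating: geometric mean of power and throughput."""
    return math.sqrt(power * throughput)


def price_performance(total_cost: float, power: float, throughput: float) -> float:
    """Cost per composite query-per-hour, under the assumption above."""
    return total_cost / composite_qph(power, throughput)


# Figures for the Sun Enterprise 10000 as published in Table 1.
sun_smp = price_performance(8_536_522, 8_870.6, 3_612.1)
print(f"Sun SMP price/performance: ${sun_smp:,.0f}")
```

Under this assumption the Sun SMP figures reproduce the published $1,508 to within rounding.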
A quick look at the TPC-D power metric -- the most often-quoted of the TPC-D measurements -- shows the surprising result that the MPP system from IBM has the highest result. At first glance, this performance is contrary to the prediction that symmetric multiprocessing is architecturally superior to massively parallel processing. On closer inspection, the fact that IBM used three times as many processors as Sun's SMP system obviously has a significant impact on the TPC-D power metric.
In 1997, it was possible to make absolute and direct comparisons between the Sun Enterprise 10000 and the IBM RS/6000 SP2 systems because the TPC-D results available at the time were made at the same scaling factor, using the same database software, and using servers that were roughly-equivalent in their raw table scanning power. In 1998, there are several factors at work that make it more difficult to compare the performance of database systems at face value:
- Number of processors. The number of processors in the systems tested at scale factor 1000 ranges from 64 (Sun Enterprise 10000) to 192 (IBM RS/6000 Model 550). Clearly a significant factor in the difference in power metrics is IBM's use of three times as many processors as Sun.
- Processor performance. The performance of the individual processors -- roughly equivalent in the 1998 scale factor 300 measurements -- now varies widely, with SPECint95 values of 8.67 (Teradata 5150), 14.4 (IBM RS/6000), and 14.9 (both servers from Sun). Clearly these differences must be considered when evaluating performance differences due to architectural factors.
- Database software. In 1997, it was easy to compare SMP and MPP architectures because the exact same database software was used in measurements of systems from both Sun and IBM. Today each server measured uses a different database management system, and some of the systems enable the creation of exquisitely-designed indices which can speed the execution of TPC-D queries -- or entirely eliminate the work involved in some joins. All of these short-cuts may or may not reflect real-life use.
- Amount of work performed. Because the database management systems differ, and because indices can be configured to reduce the amount of work actually done to perform a query, some systems can be pre-configured to do less work per query than others, which skews the TPC-D results. Fortunately, the Transaction Processing Performance Council requires full disclosure of the data definition language statements used to perform the TPC-D queries, which allows in-depth analysis of exactly what work is being done.
If the architectural differences between servers are to be fairly evaluated, it is important to eliminate the variation in results due to the above factors. The large differences from one query to another, illustrated by the raw query data in Figure 5, are testimony to this variation.
The publicly-available TPC-D measurements at scale factor 1000 represent systems with SMP, cluster, and MPP architectures. There is wide variation between the four servers being compared in terms of number of processors, processor speed, database software, and the amount of work required to execute the individual queries. In order to fairly evaluate the architectural differences between systems, these factors must be minimized, and the individual queries must be considered rather than averages that obscure important performance differences. This is especially relevant given that, once a decision is made on a data warehouse architecture, organizations often spend millions of dollars maintaining, scaling, and upgrading their systems as their workloads vary -- making the architectural differences even more important to consider.

The goal of benchmarking is to predict how a system will perform in real use by extrapolating from benchmark performance. In order for benchmarks to be useful predictors of real-world performance, they must execute workloads that are representative of the work to be done by the end user. For example, if a data warehouse is to be used predominantly for table scans, then the TPC-D queries which stress the use of table scans should be utilized in evaluating potential real-world performance. When a benchmark -- such as TPC-D -- has been in use for some time, vendors have time to implement short-cuts that speed the performance of the benchmarks in ways that are not necessarily representative of real-world use.
Because short-cuts are now available that affect many of the TPC-D queries, it is especially important to compare performance not only on a query-by-query basis, but also on the basis of the amount of work performed by each query. If a particular short-cut is representative of real-world use of the data warehouse, then it should be considered in the evaluation of the TPC-D results. If the short-cut does not represent how the data warehouse will be put to use, then it is important to factor this into any evaluation of results.
When the variation between query results is due to so many potential factors, caution is especially warranted in using averages such as the TPC-D power metric and throughput values. These averages obscure the vital details that can help end users evaluate the differences between systems. It is only through examining the individual queries that users can truly understand how the architectural differences of the four systems come into play.
This chapter discusses the short-cuts that can be used with the TPC-D benchmark to obscure the architectural differences between SMP, cluster, and MPP architectures. In order to evaluate the impact of these short-cuts, as many as possible of the variables discussed in the previous chapter must be minimized so that the short-cuts can be separated from the results which truly reflect relative data warehousing performance.
The most interesting comparisons between systems are gleaned from results for which a level playing field is established, and this is done by normalizing the number and speed of processors, and factoring in the amount of work done per query. Once these two variables are considered, the impact of the various techniques for improving benchmark performance can be observed.
The fact that four different platforms have provided published, audited TPC-D benchmark results provides a rare opportunity to cross-compare performance on three different hardware architectures: SMP, cluster, and MPP. The queries, data volume, and data schemas are identical, yet the underlying hardware used to perform the queries varies tremendously -- up to a factor of three in the number of processors used.
In order to discern the impact of the system and the database architectures, the results must be normalized to account for the different number of processors and for the different processor speeds. Experience has shown that there is a strong correlation between SPECint95 and per-query, raw TPC-D performance. As a result, the TPC-D query times can be normalized using the number of processors and their SPECint95 performance.
SPECint95 performance for the Sun SMP, Sun cluster, IBM, and Teradata systems is illustrated in Figure 6. The SPECint95 value is multiplied by the number of processors in each platform from Table 1 to arrive at the raw power for each platform that is illustrated in Figure 7. The raw power can be used to normalize query times and the power metric, which is illustrated in Figure 8.
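The normalization described above can be sketched in a few lines. The processor counts come from the system descriptions, the SPECint95 values are those quoted earlier in this paper, and the power metrics are the published figures in Table 1; the dictionary layout and function names are illustrative assumptions, not part of the original analysis.

```python
# Processor counts, per-processor SPECint95 values, and published TPC-D
# power metrics, as quoted in this paper.
systems = {
    "Sun SMP": {"processors": 64, "specint95": 14.9, "power": 8870.6},
    "Sun cluster": {"processors": 96, "specint95": 14.9, "power": 12931.9},
    "IBM MPP": {"processors": 192, "specint95": 14.4, "power": 19137.5},
    "Teradata MPP": {"processors": 128, "specint95": 8.67, "power": 12149.2},
}


def raw_power(system: dict) -> float:
    """Raw processing power: SPECint95 times the number of processors."""
    return system["specint95"] * system["processors"]


def normalized_power(system: dict) -> float:
    """TPC-D power metric per unit of raw processing power."""
    return system["power"] / raw_power(system)


normalized = {name: normalized_power(s) for name, s in systems.items()}
for name, value in normalized.items():
    print(f"{name:13s} raw={raw_power(systems[name]):8.1f} normalized={value:5.2f}")

# Relative efficiency of the Sun SMP system compared with the IBM MPP system.
smp_vs_ibm = normalized["Sun SMP"] / normalized["IBM MPP"]
print(f"Sun SMP / IBM MPP = {smp_vs_ibm:.2f}")
```

With these inputs the ratio of the Sun SMP to the IBM MPP normalized power metric comes out at roughly 1.34.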


Although the IBM MPP configuration is powerful as shown by its raw processing power, it is relatively inefficient in its use of resources, as shown by the normalized TPC-D power metric. The Sun SMP system produces a relatively low power metric because only 64 processors were used; however, the normalized power metric shows it to be 34 percent more efficient than the IBM MPP system, which uses 192 processors.
The Sun SMP system is nearly as efficient as the Sun cluster system, mainly due to the fact that both are based on platforms having very high-speed inter-processor communication mechanisms -- in fact, this is the key to the more predictable performance inherent in SMP systems.
The Teradata results are the best shown; however, they owe much to a reliance on highly-specialized join indices that allow joins to be computed at load time, so that the joins do not contribute to the amount of work done in processing the queries. This short-cutting is the topic of the next section, where the amount of work per query is analyzed.

All data warehouse queries perform some combination of scan and join activity, and the TPC-D queries can be characterized on the basis of how much of each of these activities each query undertakes. Sun defines the concept of query work as the sum of the scan volume S and join volume J in bytes, where 1 GB of scan or join volume yields a unit-less query work of 1:
query work = ( S + J ) / 1 GB
The larger the amount of query work, the more likely it is that a large system will be needed to quickly process a query. Understanding the TPC-D queries in terms of the query work that they perform will help to determine which of the TPC-D query results are due to architectural differences and which are due to differences in query work. Some examples of query work are:
- If a query performs a simple table scan on a 1 TB table, the query work is 1024, or 1 TB / 1 GB.
- If the same table is partitioned on a query predicate that results in only one percent of the 1 TB table being scanned, the query work is 10.24.
- If 1 percent of the data in the 1 TB table is read and then joined to 10 percent of the data from a full table scan of a second 100 GB table, then the query work is 130.5 (a scan volume of 10.24 + 100 plus a join volume of 10.24 + 10). Joins can be expensive and significantly add to query work.
Using indices can significantly reduce query work because only the bytes read from the index contribute to the scan volume. Likewise, pre-calculated joins can eliminate the join volume entirely, dramatically decreasing query work. Of course, pre-calculated joins only benefit the very limited cases which the database administrator spends the time and effort to identify -- and even then, the more indices that are added, the more likely it is that the optimizer will choose a sub-optimal query plan. Updates, deletions, and insertions also take significantly more time as more specialized indices are added, because the indices must be updated as well.
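The three examples above can be reproduced with a short sketch of the query work formula. The function name and the breakdown of the third example into scan and join volumes are assumptions: they are one reading of the text that happens to reproduce the 130.5 figure.

```python
GB = 1024 ** 3
TB = 1024 ** 4


def query_work(scan_bytes: float, join_bytes: float = 0.0) -> float:
    """Query work = (S + J) / 1 GB, a unit-less number."""
    return (scan_bytes + join_bytes) / GB


# Example 1: simple scan of a 1 TB table.
full_scan = query_work(1 * TB)

# Example 2: partitioning lets the query scan only 1 percent of the table.
partitioned_scan = query_work(0.01 * TB)

# Example 3 (assumed breakdown): read 1 percent of the 1 TB table, fully
# scan a second 100 GB table, and join the 1 percent to 10 percent of the
# second table's data.
first_read = 0.01 * TB
second_scan = 100 * GB
second_joined = 0.10 * second_scan
scan_and_join = query_work(first_read + second_scan, first_read + second_joined)

print(full_scan, round(partitioned_scan, 2), round(scan_and_join, 1))
```

Separating the scan and join arguments makes it easy to see how an index (smaller S) or a pre-calculated join (J of zero) would shrink the result.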
There are two ways in which the query work can be reduced in data warehouse operations. First, smart optimizers can find ways to reduce query work by exploiting well thought-out data layouts and indices. Second, smart database administrators may rewrite queries in ways that reduce the amount of query work.
Query work provides a valuable way in which to compare TPC-D query performance between different systems. It allows some of the performance variation to be attributed to architectural factors, and some to be attributed to reduction of query work through the use of specialized techniques. When query work is reduced through pre-computed indices, for example, it is important to evaluate whether use of these indices is representative of real-world optimization, or whether it artificially reduces the work of what would otherwise be performed through an ad hoc query.
If the use of specialized indices is truly representative of real-world use, then they can be used to significantly reduce query work and improve query performance. Real data warehouses, however, are typically used to answer questions which have not been thought of beforehand, and therefore it is unlikely that the reduction of query work would be possible. In this case, using benchmarks that utilize specialized indices as predictors of real-world performance will be highly inaccurate.
The query work for the seventeen queries composing the TPC-D query set is illustrated in Figure 9. Each point represents one of the TPC-D queries, where the abscissa illustrates join volume and the ordinate illustrates scan volume. The placement of the queries on this graph is approximate and relative, and is based on a detailed analysis of the TPC-D query execution plans, data schemas, and query selectivities (absent any short-cuts). Absolute values are not used because it is difficult to pinpoint the exact scan and join volumes for each query.
The seventeen queries are grouped according to the relative amount of scan and join activity presented to a system. Queries close to the origin are the lightest of the TPC-D queries. Queries that fall along the ordinate do little or no join activity; those along the abscissa do very little table scanning.
Query 9 is the heaviest query on both dimensions, scanning 900 GB of data from six tables and joining together approximately 30 GB (at scale factor 1000). Query 5 is also a complex join; depending on the database administrator and optimizer techniques, it scans less than query 9 does, but the intermediate join results can be quite large. Query 1, a straight table scan, reads through approximately 600 GB of data, but does no joins, placing it near the ordinate. In contrast to query 9, query 13 does hardly any work because it is highly selective, selecting a single clerk. Each of the other queries maps onto the graph according to its scan and join volume; many queries lie close to the origin and are therefore fairly light in query work.

Database performance does not scale well when the database optimizer produces a poor query execution plan, or when work must be transported to a separate node where local data resides. In comparison to SMP architectures, scalability is a potentially significant problem on cluster and MPP systems because each of them pays a latency penalty for remote access.
In order to produce correct TPC-D benchmark results, vendors that publish results must exactly follow the requirements of the TPC-D specification. Even so, the Transaction Processing Performance Council provides considerable leeway for vendor-selected optimizations that are not excludable by the TPC auditors, yet have dubious value in real-world data warehouses.
These short-cuts complicate the analysis of the relative scalability of SMP, cluster, and MPP architectures for decision support system workloads, and the relative scalability of shared nothing vs. shared disk architectures. The use of short-cuts obfuscates the results because they exploit the known nature of the TPC-D query set to reduce -- or even eliminate -- MPP inter-node traffic and potentially the volume of data scanned from disk. All published results make use of short-cuts to one extent or another, with varying impact on overall performance. As this section will show, some short-cuts have a substantial benefit on performance of some queries with a significant detriment on other queries.
When interpreting the results of these short-cuts, it is important to ask -- just as any smart database administrator would ask -- whether the short-cut is actually useful for real applications, or whether it is specifically designed to improve specific query performance. If the short-cut is deemed to be useful, its impact on other queries and update functions should be evaluated, since specialized indices often cause the cost of maintaining the data warehouse to skyrocket.
All short-cuts -- whether representative or not -- seek to reduce query work by moving the point representing the query closer to the origin of the query work graph. The five known short-cuts are summarized in Table 2, and are the topic of the following several sections.

Every data warehouse makes use of secondary as well as concatenated indices. A straightforward secondary index simply provides an ordered means of accessing the base table on an attribute different than its primary key. For example, most vendors index the TPC-D lineitem table on l_orderkey. Focused indices go beyond this familiar type of indexing because they involve careful design based on specific knowledge of likely queries. IBM, for example, adds a focused index in DB2 that puts specific lineitem attributes into the index. This helps to avoid the need to access the lineitem base table, which reduces the volume of data that must be scanned and moves the query closer to the origin of the query work graph.
A good example of the use of this short-cut is query 9, which, without short-cuts, has a high scan and join volume. The attributes that IBM added -- l_discount, l_quantity, and l_extendedprice -- just happen to be exactly the ones needed to reduce the query work for query 9. If the analyst using the data warehouse were instead interested in including other attributes in the query -- l_tax or l_linestatus, for example -- the index would be significantly less useful. Figure 10 illustrates how a focused index significantly benefits query 9 performance on the IBM MPP architecture; when the focused index does not apply (query 7), the resulting performance is poor -- in fact, the IBM system is the worst of the lot in query 7.
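The sensitivity of a focused index to the exact attributes a query touches can be sketched as a simple set comparison. This is a minimal illustration, not a model of DB2: the index key, the two query column sets, and the function name are hypothetical, and only the three added attributes are taken from the text.

```python
# Hypothetical focused index: a key column plus the three attributes the
# text says were added to the index.
focused_index = {"l_orderkey", "l_discount", "l_quantity", "l_extendedprice"}


def answered_from_index(query_columns: set, index_columns: set) -> bool:
    """True when every lineitem column the query needs is in the index,
    so the base table does not have to be read."""
    return query_columns <= index_columns


# Hypothetical query touching only attributes held in the index.
tuned_query = {"l_orderkey", "l_discount", "l_quantity", "l_extendedprice"}

# Hypothetical ad hoc variation that also asks for l_tax.
ad_hoc_query = tuned_query | {"l_tax"}

print(answered_from_index(tuned_query, focused_index))   # index alone suffices
print(answered_from_index(ad_hoc_query, focused_index))  # base table needed
```

A single extra attribute is enough to send the query back to the base table, which is why the benefit depends on knowing the queries in advance.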

A database administrator might be tempted to put all base table attributes into a focused index; however, this increases overall storage requirements as well as the time needed to process table updates, insertions, and deletions. Real-world data warehouse shops can afford the use of focused indices only when they have previous, detailed knowledge of frequently-executed, known queries. If the data warehouse supports true ad hoc queries, it is unlikely that the database administrator would have this advance knowledge.
A join index computes key joins in advance, and is a feature currently specific to Teradata. By computing a join at database load time, the join index turns heavyweight joins into simple table lookups -- effectively moving a query over to the ordinate with other queries having zero join volume (Figure 11). The method is effective only when an ad hoc data warehouse query can be satisfied by the join index and when the pre-join is ideally ordered so that the query is perfectly parallelized across MPP nodes.
An example of the effectiveness of this short-cut is illustrated by Teradata's outstanding performance on query 4, as illustrated in Figure 12. When ad hoc queries can be constrained to benefit from such a short-cut, the benefits are great. When the query cannot utilize the join index, however, performance suffers greatly from inter-node communication, as shown by Teradata's poor performance on query 9.
Setting up the join index takes extra time at database load time, and it can benefit specific queries. If a data warehouse executes mostly well-known queries, pre-joining can be a useful technique. This places a burden on the database administrator to carefully analyze query predicates, table attributes, and query frequencies, and to properly design a pre-join index. This is not straightforward, and testimony to this fact is Teradata's extremely variable performance between queries 4 and 9.
True ad hoc queries are not likely to benefit from join indices. As with focused indices, compute time and disk space are needed to create and store the index. Database updates, table insertions, and deletions take longer than if the join index did not exist, since consistency must be maintained between the index and the underlying database tables.
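The trade-off described above can be sketched with a toy pre-computed join. This is not Teradata's implementation: the table contents, function names, and the in-memory list used as a "join index" are all hypothetical, chosen only to show that the query avoids the join while every insert must maintain two structures.

```python
# Toy tables; the column names follow TPC-D, the values are made up.
orders = {
    1: {"o_orderpriority": "1-URGENT"},
    2: {"o_orderpriority": "3-MEDIUM"},
}
lineitem = [
    {"l_orderkey": 1, "l_quantity": 5},
    {"l_orderkey": 1, "l_quantity": 7},
    {"l_orderkey": 2, "l_quantity": 3},
]


def build_join_index(orders: dict, lineitem: list) -> list:
    """Pre-compute the lineitem-to-orders join once, at load time."""
    return [{**row, **orders[row["l_orderkey"]]} for row in lineitem]


def quantity_by_priority(join_index: list) -> dict:
    """Query answered by scanning the join index; no join at query time."""
    totals: dict = {}
    for row in join_index:
        priority = row["o_orderpriority"]
        totals[priority] = totals.get(priority, 0) + row["l_quantity"]
    return totals


def insert_lineitem(row: dict, orders: dict, lineitem: list, join_index: list) -> None:
    """An insert must update the base table AND the join index."""
    lineitem.append(row)
    join_index.append({**row, **orders[row["l_orderkey"]]})


join_index = build_join_index(orders, lineitem)
before = quantity_by_priority(join_index)
insert_lineitem({"l_orderkey": 2, "l_quantity": 4}, orders, lineitem, join_index)
after = quantity_by_priority(join_index)
print(before, after)
```

A query that needs a different join, for example to a table not folded into the index, gains nothing from this structure but still pays for its upkeep on every update.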
Whereas focused indices and join indices are strongly dependent on the nature and use frequency of DSS queries, the various forms of partitioning tend to have a positive effect on a wider range of queries. The partitioning criterion is often the primary key of a table or a key that is indicative of an obvious business characteristic; these tend to be fairly evident to a database administrator, making it more likely that the data warehouse can deliver good ad hoc query performance.
Hash Partitioning
Both of the MPP systems -- using DB2 and Teradata -- use hash partitioning to distribute base table rows across nodes. In doing so, they exploit the fact that more than half of the TPC-D queries involve a join on orderkey between the lineitem and orders tables. This permits joins between l_orderkey and o_orderkey to be kept local to an MPP node, dramatically reducing the inter-node traffic. For ad hoc queries which join on other lineitem column values, join data might need to be transferred between nodes, limiting scalability to the extent that interconnect bandwidth is an issue.
Query 12 is a good example that illustrates the benefits of hash partitioning. Query 12 accesses approximately 775 GB of data, but less than 0.5 percent of the lineitem rows match the restrictive predicates. The intermediate join volume is relatively small (a few gigabytes), so join indices are not highly significant for this query. IBM's use of hash partitioning avoids all inter-node traffic because the join between the lineitem and orders tables is made on the same key that partitions the tables. As Figure 13 illustrates, IBM's MPP performance is superior. The Teradata system, also an MPP, shows the worst performance, illustrating the degree to which query short-cuts can obscure the architectural component of performance.
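The co-location effect of hash partitioning can be sketched as follows. The node count, the modulo stand-in for a hash function, and the sample rows are hypothetical; the sketch assumes that both tables in a join are partitioned on their own key columns, as the text describes for lineitem and orders.

```python
NODES = 4


def node_for(key: int) -> int:
    """Stand-in for a hash function mapping a key to an MPP node."""
    return key % NODES


# Hypothetical lineitem rows as (l_orderkey, l_partkey); the table is
# hash partitioned on l_orderkey, so each row lives on node_for(l_orderkey).
lineitem = [(1, 10), (2, 11), (3, 13), (4, 12), (5, 14)]

ORDERKEY, PARTKEY = 0, 1


def rows_needing_transfer(rows: list, join_column: int) -> int:
    """Count rows whose join partner lives on a different node.

    The partner table is assumed to be hash partitioned on the join key,
    so the partner sits on node_for(value of the join column)."""
    return sum(
        1 for row in rows if node_for(row[ORDERKEY]) != node_for(row[join_column])
    )


local_join = rows_needing_transfer(lineitem, ORDERKEY)   # join on orderkey
remote_join = rows_needing_transfer(lineitem, PARTKEY)   # join on another column
print(local_join, remote_join)
```

Joining on the partitioning key moves no rows, while the same data joined on another column sends most rows across the interconnect.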
Range Partitioning
Range partitioning physically partitions data onto nodes and disk spindles by a carefully-chosen table column value. The Oracle and Informix systems partition the lineitem table by l_shipdate, helping to move the query towards the abscissa of the query work graph by requiring less data to be scanned from disk. It is important to recall that a specific table can be physically partitioned in only one way. For that partitioning to be the most effective for a given query, the partitioning key must align with a predicate in the query. These assumptions may not prove to be true for ad hoc queries.
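Partition elimination under range partitioning can be sketched as below. The yearly partition boundaries and the function name are hypothetical; the sketch only shows that a predicate aligned with the partitioning column limits the scan, while any other predicate leaves every partition to be read.

```python
from datetime import date

# Hypothetical yearly partitions on l_shipdate, each covering [start, end).
PARTITIONS = [(date(year, 1, 1), date(year + 1, 1, 1)) for year in range(1992, 1999)]


def partitions_to_scan(lo=None, hi=None) -> list:
    """Partitions that may hold rows with lo <= l_shipdate < hi.

    Without an l_shipdate predicate no partition can be eliminated."""
    if lo is None or hi is None:
        return list(PARTITIONS)
    return [(start, end) for start, end in PARTITIONS if start < hi and end > lo]


# Predicate aligned with the partitioning key: one year of ship dates.
aligned = partitions_to_scan(date(1994, 1, 1), date(1995, 1, 1))

# Predicate on some other column: nothing can be skipped.
unaligned = partitions_to_scan()

print(len(aligned), "of", len(PARTITIONS), "partitions scanned when aligned")
print(len(unaligned), "of", len(PARTITIONS), "partitions scanned otherwise")
```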
Query 10 illustrates a large scan followed by a join through index. Queries such as this are complex enough that they scan large quantities of data, but are selective enough that the database query optimizer can make use of ordinary indices for the subsequent join. The superior performance of Sun's SMP and cluster architectures is illustrated in Figure 14 -- and in fact Sun did not make use of any specialized indices to achieve this level of performance.

Composite Partitioning
The two-step composite partition is currently supported only by Informix. In creating the lineitem table, the Sun/Informix system hashed first on l_orderkey, and then range partitioned on l_shipdate. The hash partition results in an even work distribution across Informix co-servers, which are in turn evenly distributed across multiple nodes in the Sun Enterprise 6000 cluster. The range partition then provides the benefits described in the previous section on range partitioning.
The normalized results for query 3 are illustrated in Figure 15, and show the superior performance of the Sun Enterprise 6000 cluster due to the use of composite partitioning. The IBM MPP system shows the worst performance. The first operation, a join of the lineitem and orders tables on the hash partition key, is the best case for MPP systems. IBM's initial performance gain is lost during the subsequent join to the customer table, where inter-node traffic is necessary.

The cost of using specialized short-cuts is revealed in the TPC-D update functions, which were added by TPC to reveal the cost and overhead of maintaining any indices that vendors might add to improve query performance. This cost is considerable when join indices and focused indices are used -- in fact, Teradata's raw performance is six times slower than the Sun Enterprise 6000 performance, and even the normalized performance shown in Figure 16 illustrates the fact that systems that did not utilize specialized short-cuts benefited from excellent update efficiency. Join indices -- indeed, excessive use of indices in general -- result in poor data warehouse performance in any update or insert-intensive application.

In theory, massively parallel processors offer unlimited scalability because of the ability to add processing power as needed. The data partitioning problem, however, limits the ability of MPP architectures to profitably utilize their processors because it is impossible to have data evenly distributed for all decision support system queries -- this is why the short-cuts used to improve performance on the TPC-D benchmarks all aim to reduce inter-node traffic. The important principles to keep in mind when considering data warehousing performance on MPP architectures are:
- When the database records are evenly distributed across all of the MPP nodes, and no partition elimination is possible, all processors on all nodes can participate in parallel data analysis and good scalability is realized.
- When partition elimination is possible on MPP systems, fewer than the total number of nodes may be involved in executing the query, and scalability can be poor.
- When the data is skewed across the nodes in MPP systems, the nodes with the greatest number of rows will have more processing to do compared to those for which skew is not so high -- and again scalability can be poor.
- Ad hoc queries may not align with previously made data partitioning choices, making the workload uneven across the MPP nodes. Scalability can be poor.
- Real-world data warehouses grow over time, and need to scale through the addition of more processing power. On MPP systems, this often results in the use of additional nodes, requiring reorganization of the database tables in order to maintain performance. Dynamic data reorganization, even if automatic, is expensive in terms of time and computing resources. SMP systems, in comparison, do not require data reorganization in order to exploit the addition of more processing power.
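The effect of skew and partition elimination on scalability can be sketched with a deliberately simple model. It assumes that elapsed time is proportional to the number of rows on the busiest node and ignores interconnect and coordination costs; the row counts and the function name are hypothetical.

```python
def parallel_speedup(rows_per_node: list) -> float:
    """Speedup over a single node, assuming elapsed time is set by the
    node that has the most rows to process."""
    return sum(rows_per_node) / max(rows_per_node)


even = parallel_speedup([250, 250, 250, 250])    # rows spread evenly
skewed = parallel_speedup([700, 100, 100, 100])  # one node holds most rows
eliminated = parallel_speedup([1000, 0, 0, 0])   # query touches one node only

print(f"even: {even:.2f}x  skewed: {skewed:.2f}x  eliminated: {eliminated:.2f}x")
```

Under this model, four nodes deliver a fourfold speedup only in the evenly distributed case; the skewed and single-node cases leave most of the added processors idle.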
Each of these problems either does not exist, or exists to a significantly more limited degree, with SMP systems. The tuning challenge on SMP systems is to avoid disk hot spots. The challenge on MPP systems is to avoid both disk and node hot spots.
The choice between symmetric multiprocessors and massively parallel processors for decision support systems is one of the most critical decisions to be faced by today's Information Technology organizations. Because most enterprises start small, and enlarge their data warehouses as their data and their processing needs grow, a large financial investment will ultimately be made in computing infrastructures. The most important factors determining whether the benefit of this financial outlay will be realized in decision support system performance are whether the database servers can provide reliable performance without the use of highly-specialized short-cuts, and whether the database can scale with rapidly-increasing use and data storage (Table 3):
- Massively parallel processors are highly sensitive to whether database administrators are able to partition the database uniformly across all MPP nodes. This is a risky proposition where the ultimate result is that some queries will perform very well, and some will perform poorly. Both the distribution of data and the communication costs for inter-node transfers are difficult to optimize, and have a tremendous influence on the cost-effectiveness of an enormous investment. Data warehouses based on MPP architectures work best when queries are highly predictable, have little skew, and where the update activity is minimal. MPP performance depends on highly-skilled database administrators who dynamically monitor queries, add and subtract indices as needed, and who are patient enough to monitor systems having a large number of nodes.
- Symmetric multiprocessors give each processor equal access to memory and I/O resources, giving consistently superior performance over MPP architectures. Given that no database administrator can foresee all DSS queries that a server is to process, symmetric multiprocessing represents a choice that can scale and grow with the demands of today's data warehouses, while minimizing risks to the IT organizations supporting them. Because symmetric multiprocessors share data, I/O, and processing resources, SMP performance is most efficient, and provides the most reliable performance for true ad hoc queries.
- Clustered architectures provide a growth path when a greater number of processors is needed than is available on SMP systems, and when the business needs require the high availability that comes through the use of multiple independent servers. For workloads that require fewer than 64 processors, a single SMP system will always provide superior performance. When more processors are needed, use a cluster based on a small number of SMP systems. It is always easier to manage a small number of powerful nodes in a cluster than to manage a large number of nodes in an MPP configuration.
As this paper has demonstrated, TPC-D results have limited value for predicting real-world data warehouse performance because many vendors use specialized techniques to exploit known characteristics of the query set. These techniques improve TPC-D performance by artificially reducing the work performed for specific queries -- a tactic that is seldom useful in real-world data warehouses. Although data partitioning is a requirement of all MPP and cluster systems, the use of focused indices and join indices has a dramatic effect on TPC-D performance but generally has little impact on the true ad hoc queries for which data warehouses are deployed. This is because data warehouses do not give database administrators the opportunity to tune for every eventuality -- true data warehouses are constantly changing, with new applications coming on-line, new user modalities surfacing, and new forms of external data being incorporated. True ad hoc queries are a fact of life in data warehouses, and it is short-sighted to believe that queries can be optimized in the ways that are common in transactional (OLTP) databases.
For organizations risking their future on whether their data warehouses will help them maintain competitiveness, Sun is a choice that makes data warehouses far easier to implement. And with consistent performance that does not depend on specialized techniques, Sun offers more predictable performance in actual use.
All data warehouse implementations are different, and Sun understands that there are difficult points in every system's growth. The same experienced Sun database engineering organization that prepared this detailed analysis continues to improve system performance on a wide range of platforms and database management systems. When this alliance between Sun and the industry-leading database management system vendors is added to superior decision support system price and performance, the clear choice for data warehouses is symmetric multiprocessor systems and clusters from Sun.
Sun Microsystems posts product information in the form of data sheets, specifications, and white papers on its Internet World Wide Web page at: http://www.sun.com.
Information on the Transaction Processing Performance Council is available on their Web page at http://www.tpc.org, via electronic mail at admin@tpc.ip.portal.com, or at: 777 North First Street, Suite 600, San Jose, CA 95112-6311 USA. Telephone (408) 295-8894.
Sun Database Engineering Group Third Edition August, 1998
© 1997, 1998 Sun Microsystems, Inc.--Printed in the United States of America. 2550 Garcia Avenue, Mountain View, California 94043-1100 U.S.A.
All rights reserved. This product and related documentation is protected by copyright and distributed under licenses restricting its use, copying, distribution and decompilation. No part of this product or related documentation may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.
Portions of this product may be derived from the UNIX® and Berkeley 4.3 BSD systems, licensed from UNIX Systems Laboratories, Inc. and the University of California, respectively. Third party font software in this product is protected by copyright and licensed from Sun's Font Suppliers.
RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 and FAR 52.227-19.
The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.
TRADEMARKS Sun, Sun Microsystems, the Sun logo, Sun Enterprise, Starfire, and Gigaplane-XB are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and/or other countries. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd.
All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
THIS PUBLICATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
THIS PUBLICATION COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION HEREIN; THESE CHANGES WILL BE INCORPORATED IN NEW EDITIONS OF THE PUBLICATION. SUN MICROSYSTEMS, INC. MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THIS PUBLICATION AT ANY TIME.