CoolThreads - CMT Tuning and Resources

Active Tab UltraSPARC T2 / T2 Plus Servers
Begin Product Tab Sub Links At a Glance Active Sub Link Applications


MySQL

MySQL InnoDB Performance Tuning for the Solaris 10 OS
You can maximize the performance of MySQL on the Solaris platform through configuration and tuning of the database server, along with optimizing the Solaris OS for MySQL. This paper is intended to help you define tuning parameters and tune them in your environment. Learn more

IBM DB2

DB2 v9.1 runs well on UltraSPARC T1 and T2 based systems. No special system tuning is required.

DB2 v9.5 could hit 'memory allocation error'. The temporary workaround is to turn off the 256M/32M large pages on UltraSPARC T2-based systems by setting the following in /etc/system:

* Disable 32M/256M pages
#set disable_large_pages = 0x30
#set disable_ism_large_pages = 0x34
#set mmu_ism_pagesize = 4194304

Please ensure you are running firmware release 7.0.3.c or above

Building MySQL with Sun Studio 11 Software

Learn more

Download an Optimized MySQL Implementation

MySQL 5.0.30 for Solaris 10 running on UltraSPARC and x64 systems.

Download in Coolstack

Back to top


Oracle Databases

Oracle Configuration Notes

Generally speaking Oracle will scale well on Sun Fire CoolThreads servers. Standard database tuning needs to be applied for Oracle. Use statspack to determine if there are contended resources or I/O bottlenecks.

In /etc/system add

set consistent_coloring=2
or ensure that patch 118833-03 or later has been applied.

For write-intensive applications placing the database, especially the log, on external storage other than Sun Fire CoolThreads server is recommended. Use iostat (1M) to ensure there are no disk I/O bottlenecks in the system.

Long running single threaded Oracle batch jobs may run longer on Sun Fire CoolThreads server.

If your Oracle installation involves the following, please implement the instructions below:

  • installing Oracle 10g Release 2 on T1000/T2000
  • using Oracle 10g Release 2 JVM on T1000/T2000

Description:

Use of the Oracle JVM involves an uncommon software trap which was not implemented correctly for sun4v platforms (T1000/T2000) and will cause the calling process/thread to enter an infinite loop and appear to be hung. Spinning processes/threads respond normally to UNIX signals and can be killed.

If the Oracle JVM is selected for installation, the JVM will be called and installation will appear to hang at that point.

If Oracle is installed on another SPARC system and that installation is copied to a T1000/T2000, processes will spin when they first attempt to call the JVM.

Workaround:

  1. If your application makes no use of the Oracle JVM, when installing Oracle, de-select the option to have the Oracle JVM installed.
  2. If your application uses the Oracle JVM, download and install patch 118833-18. To install this patch and update your system follow the process documented here: http://www.sun.com/servers/coolthreads/tnb/updates.jsp

Oracle DSS Workloads

Testing Oracle Database10g R2 on Sun Fire T2000 Server with a DSS workload demonstrates the importance of ensuring the following parameters are met in order to deliver high levels of performance:

  • Application parallelism
  • I/O bandwidth
  • configuring for large pages and consistent_coloring
Tuning parameters are as follows:
  1. Implement "set consistent_coloring=2" in /etc/system (default setting in Solaris 10 6/06 and later). For recommended /etc/system parameters, see http://www.sun.com/servers/coolthreads/tnb/parameters.jsp
  2. Setting parallelism to "DOP=32" ensures good scaling of queries.
  3. DSS queries often require high I/O bandwidth. Set db_file_multiblock_read_count appropriately to enable 1MB reads and set maxphys=4194304 in /etc/system
  4. By default Oracle 10gR2 does not use 4M pages for anon. This can be explicitly set by the following init.ora parameter.
    _realfree_heap_pagesize_hint = 4194304
  5. I/O bandwidth is an important requirement for single query execution. When multiple, complex queries are executed in parallel, it is observed that the system becomes CPU bound and hence I/O bandwidth becomes less of an issue.

Back to top


PostgreSQL

PostgreSQL Tuning Tips for Sun Fire T2000 Systems Running Solaris

Latest binaries

Download the latest PostreSQL packages, compiled to achieve optimal performance on Solaris 10 (both SPARC and x86).

Tuning /etc/system

Recommend: Solaris 10 11/06 as the OS to use.

Apply the tuning parameters listed at:
http://www.sun.com/servers/coolthreads/tnb/parameters.jsp

For Data Warehousing type of workloads also use:

set maxphys = 1048576

Only set the following parameter if you have plenty of available physical memory and UFS is used instead of ZFS:

set segmap_percent=50

For ZFS please use the following patches (recommended for Solaris 10 11/06):

  • 118833-36 SunOS 5.10: kernel patch
  • 124204-03 SunOS 5.10: zfs patch
  • 122660-07 SunOS 5.10: zones jumbo patch
  • 120986-08 SunOS 5.10: mkfs and newfs patch
  • 123839-04 SunOS 5.10: Fault Manager patch

Reboot the system for the above changes to take effect.

Important notes about the /etc/system settings above:

  • Setting maxphys as shown can help performance if your workload is doing I/O to and from a UFS filesystem in greater than 64K size chunks. Also consider creating filesystems with the following command: "newfs -C 128 /dev/rawdevicename"
  • The default segmap_percent setting is 12% of physical memory. If your system has enough memory, increasing this value can improve filesystem performance. If your system does not have enough physical memory, increasing this value could cause paging, reducing performance.
  • The appropriate values for System V IPC tunables depend on your setting in postgresql.conf such as shared_buffer, max_connection, etc.
  • For Solaris 10, changes to System V IPC tunables are not usually necessary - default values are large enough for most typical environments. Should changes to the defaults be required, though, use the prctl command to set resource controls (rctls) for affected projects. For the sake of backward compatability, values may still be set in /etc/system, although those values take effect only if they are larger than the Solaris 10 defaults.

Tuning /etc/project

For bufferpools bigger than 25% of RAM and high concurrent users (Replace string "postgres" with the username of PostgreSQL database user) :

# projadd -U postgres user.postgres
# projmod -a -K "project.max-sem-ids=(priv,4k,deny)" user.postgres
# projmod -a -K "project.max-shm-memory=(priv,4G,deny)" user.postgres

I/O Tunning

Putting the log into a different filesystem with the forcedirectio option can improve performance for OLTP workloads. Here's a sample entry from /etc/vfstab showing a filesystem with forcedirectio enabled:

/dev/dsk/c1t1d0s0 /dev/rdsk/c1t1d0s0 /pglog ufs 1 yes forcedirectio

Forcedirectio can also be enabled dynamically (applies only until the next reboot):

mount -o remount,forcedirectio /pglog

After running initdb, move $PGDATA/pg_log to your log partition mount point.

The Sun Fire T2000's internal disks are shipped with write cache disabled. To maximize write performance, it is possible to enable the disk write cache using the format -e command, but at the risk of losing data in the event of a power failure. For the best disk I/O performance combined with full data protection, use an external storage array in a RAID 1 + 0 configuration.

PostgreSQL Tuning

Performance tests have been carried out on a Sun Fire T2000 system with 8 cores and 32 GB memory using an internal OLTP workload located on the internal drives. The test database was less than 2GB in size, though, and could fit into memory. Our main interest was CPU scaleability, and indeed PostgreSQL scales quite well on the T2000 when increasing the number of users.

The testing showed that a number of parameters in postgresql.conf can be tuned to improve OLTP performance. The following settings provided the best throughput:

wal_sync_method = fdatasync
wal_buffers = 256
checkpoint_segments = 96
effective_cache_size = 128000
shared_buffer = 430000
max_fsm_pages = 208000
max_fsm_relations = 10000
commit_delay = 5
max_connections = 1000

The above parameters were tuned for our OLTP test workload and the results may vary for your application.

The following links provide some guidelines for determining the optimal settings for your specific workload:

Server Configuration (PostgreSQL 8.1.3 Manual)
PostgreSQL 8.0 Performance Checklist (from Power PostgreSQL, by Josh Berkus and Joe Conway)

Back to top


Sybase ASE / Server Configuration

Easy Tuning for Sybase ASE on the Sun Fire T2000 Server

A recent exercise of running a lightweight OLTP workload using Sybase ASE on the SF T2000 has shown that a few simple up front tunings and setup can provide a potential for a very large gain in performance. Though not "out-of-the-box", these tunings do not require deep or extensive setup of the system CPUs.

Configure ASE to run 24 engines

The T1 processor (1.2 GHz, 8 core, 32 strand) in concert with Solaris 10 runs OLTP workloads best when all the strands are active and most importantly not blocking each other. The ASE 'engine', though it is a multi-thread process appears to run well where there is an engine per strand - or nearly an engine per strand. The recommendation here is that 24 engines be run, leaving 6 engines to the system.

In the Sybase configuration file the following will direct ASE to startup 24 engine processes (that are multi-thread).

     [Processors]
           max online engines = 24
           number of engines at startup = 24
           statement cache size = DEFAULT

Locking Scheme - try row level locking.

Depending on the workload the locking scheme for shared tables and other objects can have a large effect on performance. For example, unless specified, Sybase ASE will typically default to shared or exclusive table or page locking. In either case (shared or exclusive, table or page) the locking used needs to be examined closely on the T1 as a potential bottleneck. This is a large topic and is dependent on the SQL application logic, workload etc. but what is recommended here is that row-level locking at least be tried.

To determine what kind of locking is being used you can issue the following command during live execution:

    $ isql -Usa -P
     1> use igen
     2> go
     3> sp_lock

If you don't see "Ex_row" then you could be able to benefit by building or altering your tables to use row-level locking.

Sybase ASE allows the DBA to either create a table with row locking or to alter a tables locking scheme after it has been built and loaded. An example of directing ASE to use row level locking on the 'customer' table from the interactive sql utility:

    $ isql -Usa -P
     1> use igen
     2> go
     3> alter table customer lock datarows
     4> go
     Non-clustered index (index id = 2) is being rebuilt.
     1>

Note that the associated index for the table is also adjusted.

The other approach is to actually create the table with row level locking - the "lock datarow" directive was added to the DDL create table script.

     if exists ( select name from sysobjects where name = 'customer' )
         drop table customer
     go
     create table customer (
        custid           numeric (10) not null,
        name             char (30) not null,
        address          char (30) not null,
        zipcode          numeric (5) not null,
        industry_id      numeric (2) not null,
        total_sales      numeric (12,2) not null,
        comments          char (256) not null,
        products         char (679) not null
     ) lock datarows
     on SDATA01
     go

Back to top


Sybase IQ

Sybase IQ 12.6 Tuning for UltraSPARC T1 Platforms

Sybase IQ has a tuning parameter called -iqnumbercpus. It defaults to the number of CPU's on the server. For an 8-core T1000/T2000 platform (which supports 4 threads per core - totaling 32 threads), that means it defaults to 32.

Testing has shown that values this high can cause certain types of queries to perform sub-optimally, resulting in very poor execution times. The symptoms of this are a handful of threads using 100% of a hardware strand (user time, as observed via 'prstat -mL') for long periods of time.

This issue can be resolved by specifying a -iqnumbercpus parameter to something less than the default.

For example, a parallel, complex table insert that took over 6 hours with the default parameter dropped to 40 minutes by setting -iqnumbercpus to 8. Note that setting this to 8 does not limit Sybase IQ from using more than 8 CPU's (hardware strands), so you'll still benefit from the 32 strands. It's application dependent on what the optimal value for -iqnumbercpus is. Consider starting with this parameter set to the number of strands/2.

Back to top