Fast Track to Solaris 10 Adoption: Predictive Self-Healing
Compatibility Issues
- Will the agent dev tool for integration be another d-type of language, or scripting capable?
- How will or can PSH interact with Sun or Veritas Cluster services?
- Are trap messages the only SNMP "article" produced? Is there a history/log that can be polled at the MIB level?
- Will PSH data be available through kstat?
- How can third-party applications use PSH features? Is there any way of checking third-party software logs for errors?
- Why isn't Sun back porting this to the Solaris 9 and 8 operating systems? Not everyone will be moving to the Solaris 10 OS on [Sun's] timeline, but could still benefit from this technology in their existing Data Centers.
- How does PSH respond to Oracle in a clustered environment? Example: Oracle runs out of swap space.
- I see in the documentation that fmdump will tell you the part number that needs to be replaced. Is this true for third-party components, or strictly Sun supplied ones?
- Will API calls be available to tie into NMS tools such as OpenView?
- How is PSH integrating with monitoring tools/agents such as Sun Cluster, HP OpenView, and Tivoli? Is there an SNMP agent to send alerts to a control station?
- Please compare aggressive page retirement (Solaris 8 and 9 operating systems) and PSH (Solaris 10 OS) with regard to memory error management.
- Will the Solaris 10 OS also be ported on Fujitsu hardware?
- Will the Solaris 10 OS run on a SPARCstation 10?
- What about UltraSPARC II chips? I would like to leverage this technology in my existing midrange servers where I experience many faults. What about SBus?
Q: Will the agent dev tool for integration be another d-type of language, or scripting capable?
A: We are working on a PSH module that will allow administrators to run a custom script when an automated diagnosis occurs.
Q: How will or can PSH interact with Sun or Veritas Cluster services?
A: Sun Cluster and Veritas can register to receive detailed fault information from cluster nodes to make decisions about resource failover and management, and use SMF to restart node-based services. Additionally, the PSH architecture can be leveraged to create a consistent administrative model and experience for a single node or a cluster of nodes.
Q: Are trap messages the only SNMP "article" produced? Is there a history/log that can be polled at the MIB level?
A: We're in the process of developing the PSH-to-SNMP connections, so this is a great topic on which to give us more feedback so we can meet your requirements. If you have time, go to sun.com/bigadmin/content/selfheal and describe what you would like to see.
Q: Will PSH data be available through kstat?
A: Kstats are somewhat orthogonal in that they are bean counters measuring numbers of certain kinds of events in the kernel. We update these in addition to sending telemetry for PSH when errors occur. And then PSH itself has its own bean counters for statistical purposes: fmstat(1M) lets you view these.
Q: How can third-party applications use PSH features? Is there any way of checking third-party software logs for errors?
A: Yes. The major way application software vendors can plug into PSH is using the Service Management Facility (SMF), coming in the next Solaris Express release. By defining an SMF manifest, a service will be automatically restarted upon failure (be it software bug, administrator error, or hardware failure), and it will be given an individual service log (in addition to any logging features that the application developer provides).
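As a rough illustration of what "defining an SMF manifest" can look like, here is a minimal sketch in the manifest format that shipped with the Solaris 10 OS. The service name site/myapp, the path /opt/myapp/bin/myapp, the dependency choice, and the timeouts are all hypothetical placeholders, not part of the original answer; a real application would substitute its own values and may need additional elements.

```xml
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<!-- Hypothetical example: "myapp" and all paths below are placeholders. -->
<service_bundle type="manifest" name="myapp">
  <service name="site/myapp" type="service" version="1">
    <!-- Create a default instance, left disabled until the administrator enables it. -->
    <create_default_instance enabled="false"/>
    <single_instance/>

    <!-- Assumed dependency: do not start until the multi-user milestone is reached. -->
    <dependency name="multi-user" grouping="require_all"
                restart_on="none" type="service">
      <service_fmri value="svc:/milestone/multi-user"/>
    </dependency>

    <!-- Start method: SMF runs this command and tracks the processes it creates. -->
    <exec_method type="method" name="start"
                 exec="/opt/myapp/bin/myapp start" timeout_seconds="60"/>

    <!-- Stop method: ":kill" asks SMF to signal the processes of the service. -->
    <exec_method type="method" name="stop"
                 exec=":kill" timeout_seconds="60"/>
  </service>
</service_bundle>
```

Assuming the file is saved as myapp.xml, it would typically be loaded with svccfg import myapp.xml and enabled with svcadm enable site/myapp; svcs -x site/myapp then reports the service state and the location of its individual log file.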
Q: Why isn't Sun back porting this to the Solaris 9 and 8 operating systems? Not everyone will be moving to the Solaris 10 OS on [Sun's] timeline, but could still benefit from this technology in their existing Data Centers.
A: The intention is to have Sun's customers take advantage not only of this key technology in the Solaris 10 OS, but also of other groundbreaking technologies in that release. These technologies demanded a radical change in the kernel; hence, it would take a big effort to backport them to the Solaris 8 and 9 operating systems.
Q: How does PSH respond to Oracle in a clustered environment? Example: Oracle runs out of swap space.
A: Oracle, like other application or middleware software, may register as an SMF service and/or receive detailed fault information. If a fault should occur on any of the resources it uses, Oracle may respond by releasing its hold on the affected resource and either failing over to a duplicate or restarting its services.
Q: I see in the documentation that fmdump will tell you the part number that needs to be replaced. Is this true for third-party components, or strictly Sun supplied ones?
A: We will tell you the location path for the FRU, and that will work regardless of whether, for example, the PCI card there is from Sun or another vendor. Some other information, like the part number and serial number, requires that the part have that information encoded in it in a standard form (e.g. Sun FRUID) that we can read. You'll need to check with the vendor to see if they support that.
Q: Will API calls be available to tie into NMS tools such as OpenView?
A: Our plan is to leverage customized agents to integrate those functions. Please check back in the near future for up-to-date information.
Q: How is PSH integrating with monitoring tools/agents such as Sun Cluster, HP OpenView, and Tivoli? Is there an SNMP agent to send alerts to a control station?
A: We're working with all our ISVs to take advantage of the new APIs and technology offered by PSH. We will be providing a module in the Solaris 10 OS timeframe that permits SNMP traps to be sent when an automated diagnosis occurs.
Q: Please compare aggressive page retirement (Solaris 8 and 9 operating systems) and PSH (Solaris 10 OS) with regard to memory error management.
A: The Solaris 10 OS includes all of the underlying VM-system technology found in previous releases, such as page retirement. So with PSH you still get aggressive page retirement, but more sophisticated diagnosis algorithms determine when to apply it, and you also get the rest of PSH, such as an improved administrative model.
Q: Will the Solaris 10 OS also be ported on Fujitsu hardware?
A: Yes.
Q: Will the Solaris 10 OS run on a SPARCstation 10?
A: It is not a tested configuration and won't be supported.
Q: What about UltraSPARC II chips? I would like to leverage this technology in my existing midrange servers where I experience many faults. What about SBus?
A: We have implemented PSH capabilities for the US-II PCI subsystem. Future instrumentation of the US-II CPU/memory subsystem is planned but not available in the Solaris 10 OS. There are no plans for SBus instrumentation.