BigAdmin System Administration Portal
BigAdmin XPerts

Last Updated August 08, 2007

XPert: Solaris Fibre Channel Connectivity, Configuration and Tuning

Sumit Gupta

Page 1 (1-12 of 22 questions)
  1. What are the different components that make up the total FC solution?
  2. What is the difference between Solaris 8/9 and Solaris 10 releases with regard to Fibre Channel SAN support?
  3. What are the main advantages of the Leadville driver in the Solaris 10 OS? Does the new framework negate the requirement for lpfc.conf and sd.conf configuration files?
  4. Is there a good tool to determine how utilized your FC link(s) are?
  5. How can we ensure that the Emulex driver for LP10000DC-x running on the Solaris 8 OS is totally removed before upgrading from the Solaris 8 release to Solaris 10?
  6. What is your opinion about fixed settings for Fibre Channel topology and speed in emlxs.conf and qlc.conf?
  7. What is the relationship of MPxIO and ZFS - and then ZFS and SAN device configuration?
  8. On the Solaris 8, 9, and 10 OS, what tools are available to monitor HBA performance and statistics?
  9. Why does Sun use so many different management interfaces with its storage devices?
  10. Sun has recently implemented TPGS into STMS. Is there any formal documentation available that describes the details?
  11. Is there any documentation that describes all the sd/ssd parameters?
  12. With the Solaris 10 OS, (s)sd_retry_count does not seem to exist anymore. Is this correct, and if so, why?

Q: What are the different components that make up the total FC solution?

A: Users connect to Fibre Channel SANs through a PCI card commonly known as a Host Bus Adapter (HBA). HBAs are provided by two major Sun OEM vendors, QLogic (www.qlogic.com) and Emulex (www.emulex.com). Each vendor offers a variety of HBAs with different speeds and port counts; check their web sites for more information. The device driver for all QLogic HBAs is 'qlc', which resides under /kernel/drv/, /kernel/drv/sparcv9, or /kernel/drv/amd64, depending on the Solaris host. Similarly, the device driver for all Emulex HBAs is 'emlxs', which resides in the same locations. These device drivers are primarily responsible for driving the HBA hardware; they do little of the protocol work themselves.

The majority of the Fibre Channel protocol is handled by a kernel software layer known as the Leadville stack or Sun StorEdge SAN Foundation Software. It is composed of two kernel modules, 'fp' and 'fctl'. fp resides in the /kernel/drv tree and fctl resides in the /kernel/misc tree. But the Fibre Channel protocol by itself cannot talk to storage devices. The protocol used to communicate with storage devices is SCSI, which in this case runs on top of the Fibre Channel protocol (hence the term "protocol stack"). The kernel software that encapsulates SCSI on top of FC is 'fcp', found under the /kernel/drv tree. SCSI can communicate with different types of storage devices, such as disks, tapes, and medium changers. Each type of device uses a different set of commands and operations, so each has its own kernel module, called a target driver, that helps the rest of the system communicate with it. These modules include 'sd' (x86 platforms) and 'ssd' (SPARC platforms) for disks and 'st' for tapes, to name a few. Each of these modules has a man page. Type man <module name> for more information.

How target drivers communicate with 'fcp' (or 'iscsi' in the case of an iSCSI SAN) is another story. It depends on whether the Solaris multipathing software, known as 'MPxIO', is enabled. In the absence of MPxIO, 'fcp' registers every HBA port with 'scsa' (which stands for Sun Common SCSI Architecture), which resides under the /kernel/misc tree, and 'scsa' then connects all the target drivers to the SCSI transport, 'fcp' in this case. If MPxIO is enabled, another device driver, called 'scsi_vhci' (under the /kernel/drv tree), comes into play. All the MPxIO-aware SCSI transports, such as 'fcp' and 'iscsi', register all their HBA ports with 'scsi_vhci'. 'scsi_vhci' then examines all the devices behind those ports and determines whether a single SCSI device (also known as a SCSI LUN) is visible to the host through more than one port. It does that based on a globally unique identifier (GUID) presented by the device. It then registers only one port with 'scsa', which is a kind of virtual port. That port exports all the 'real' devices behind all these HBA ports. 'scsi_vhci' automatically load balances the storage traffic and dynamically maintains the connections to a device, also known as paths.
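To make the GUID-based matching concrete, here is a small conceptual sketch. It is not how 'scsi_vhci' is implemented; it only illustrates the idea that two entries reporting the same GUID are two paths to one LUN. The input format (one "port GUID" pair per line) and the names port0, port1, guid-A, and guid-B are made up for the example.

```shell
#!/bin/sh
# Count how many paths lead to each LUN, keyed by the GUID the LUN reports.
# Input on stdin: one "<port> <guid>" pair per line (a made-up format).
count_paths_per_guid() {
    awk '{ paths[$2]++ } END { for (g in paths) print g, paths[g] }' | sort
}

# Hypothetical view of a host with two HBA ports: the LUN reporting guid-A
# is seen through both ports, the LUN reporting guid-B through only one.
sample='port0 guid-A
port1 guid-A
port0 guid-B'

printf '%s\n' "$sample" | count_paths_per_guid
```

A GUID with a count above one is a device that multipathing software would present as a single node with several paths. On a live Solaris 10 system, the mpathadm list lu command reports comparable per-LUN path counts, assuming that utility is installed on your release.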

Various applications talk to the target drivers directly by opening the /dev/rdsk (or /dev/rmt) nodes. File systems talk to target drivers by mounting nodes under /dev/dsk. Third-party software uses kernel facilities such as the Layered Driver Interface (LDI) to communicate directly with target drivers and use the Fibre Channel stack.

More information on the structure of the stack can be found in the SDN article Updating Fibre Channel Drivers to Use Sun StorEdge SAN Foundation Software.

May 14, 2007


Q: What is the difference between Solaris 8/9 and Solaris 10 releases with regard to Fibre Channel SAN support?

A: In the Solaris 8 and 9 releases, the support for Fibre Channel SAN is divided into two parts. One part is integrated into the Solaris OS and is installed along with the rest of the system. This part provides only bare-minimum support for FC connectivity and works only with direct-attached storage. The second part is the complete and current FC support, which is distributed as SAN patches. These patches can be obtained from http://www.sun.com/storagetek/networking.jsp. Move down to the section named "SAN x.x release Software".

In the Solaris 10 release, the FC SAN support is fully integrated into the OS and is installed as part of the standard Solaris installation. As updates for Solaris 10 are released, they contain newer features and bug fixes. Between updates, customers can download the SAN patch (patch ID 119130 for SPARC and 119131 for x86), whose contents are planned to be part of the next update.

The other difference is in the area of booting. Because support for 2G and 4G HBAs is not integrated into Solaris 8 and 9, customers cannot use the standard Solaris installation method to install onto and boot from an FC device connected through 2G and 4G HBAs. However, it is still possible to boot from these devices using the "dump/restore method" described in the Sun StorageTek PCI-X Enterprise 2 Gb FC Single Port Host Bus Adapter Installation Guide (see the reference section).

Another big difference is that in Solaris 8/9 users have to configure devices manually: after a device is connected to the FC SAN, the user runs the command cfgadm -c configure <dev> so that the rest of the system can see the device. This is no longer the case in the Solaris 10 OS, where everything is configured automatically upon connection to the SAN (give or take a few seconds for the configuration process itself). However, if the old behavior is desired, that is, not configuring devices automatically, a configuration parameter can be set in /kernel/drv/fp.conf to make that happen. See Appendix A of the Solaris Fibre Channel and Storage Multipathing Administration Guide (link in the Useful Links section) for details.
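As a rough illustration, the sketch below checks whether an fp.conf-style file has manual configuration switched on. The answer above does not name the parameter; the spelling manual_configuration_only=1; used here is recalled from the administration guide and should be treated as an assumption to verify against Appendix A. The function name and the scratch file are made up for the example.

```shell
#!/bin/sh
# Report whether an fp.conf-style file enables manual device configuration.
# ASSUMPTION: the property is spelled manual_configuration_only=1; (verify it
# in Appendix A of the administration guide before relying on it).
manual_config_enabled() {
    conf_file="$1"
    # Drop blanks and tabs, skip comment lines, then look for the active setting.
    tr -d ' \t' < "$conf_file" | grep -v '^#' |
        grep '^manual_configuration_only=1;' > /dev/null
}

# Demonstration on a scratch file in which the setting is commented out.
demo_conf=$(mktemp)
printf '%s\n' '# manual_configuration_only=1;' > "$demo_conf"
if manual_config_enabled "$demo_conf"; then
    echo "manual configuration"
else
    echo "automatic configuration"
fi
```

On a real host you would point the function at /kernel/drv/fp.conf. When manual configuration is in effect, each newly connected device still has to be brought in with cfgadm -c configure <dev>, as described above.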

May 14, 2007


Q: What are the main advantages of the Leadville driver in the Solaris 10 OS? Does the new framework negate the requirement for lpfc.conf and sd.conf configuration files?

A: With monolithic drivers like lpfc, you have to use sd.conf to tell the OS which devices can be enumerated. Then after connecting the device to the SAN, you have to trigger the discovery process by running devfsadm (or a script wrapper around that). In certain situations you have to reboot for the new settings to take effect.

On the other hand, the Leadville framework in the Solaris 10 OS uses dynamic discovery. This means that whenever a device is connected to the SAN, it is detected by the framework and automatically enumerated (creation of nodes under /dev and /devices). Similarly when a device is disconnected from the SAN, the framework automatically offlines the device after a few seconds. This process does not depend upon the sd.conf file entries at all, which means you do not need to touch sd.conf or add any additional entries in it.

lpfc.conf is used for a different purpose. It contains hardware-specific settings for the HBA card. It also contains settings commonly known as "persistent bindings". Hardware-specific settings are also present in emlxs.conf, which is the conf file for Emulex's Leadville driver (or qlc.conf for QLogic). These are independent of the framework (Leadville or monolithic). Persistent bindings are a different story: in most cases there is no need for them with Leadville.

For more information, please see Chapter 8 of the Solaris Fibre Channel Storage Configuration and Multipathing Administration Guide (in the "Useful Links" section of this session).

Also, dynamic SAN configuration is not the only feature available with Leadville. There are many additional advantages, e.g. you also get Solaris Multipathing support (MPxIO) for free. For more information see the article Updating Fibre Channel Drivers to Use Sun StorEdge SAN Foundation Software in the "Useful Links" section.

Alan Anderson - May 22, 2007


Q: Is there a good tool to determine how utilized your FC link(s) are? (We have a single Sun Fire v490 server connected to a single Sun StorEdge 3510 FC Array with one 2Gig adapter. We are trying to determine if we need to migrate/upgrade to a dual connection for added performance.)

A: If you have an FC switch connected between the HBA port and the storage, you can launch the switch management GUI and monitor the port activity from there. Most switches have very nice graph displays to show port activity and port utilization. You can also use the iostat tool on the host side to monitor disk-level activity. iostat does not give you a total port-level view, but you can add up the activity numbers of the individual disks to estimate the port utilization. The command I mostly use with iostat is this:

iostat -xcn 5 |grep c5

Replace c5 with the controller number you have on your system or use any other grep pattern to help eliminate all the unwanted data.
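The adding-up step can be scripted. The sketch below sums the kr/s and kw/s columns for every device on one controller; it assumes the usual iostat -xn column layout (r/s, w/s, kr/s, kw/s, wait, actv, wsvc_t, asvc_t, %w, %b, device). The function name and the sample figures are made up for illustration. The awk -v option is not supported by the legacy /usr/bin/awk on Solaris, so use nawk or /usr/xpg4/bin/awk there.

```shell
#!/bin/sh
# Sum read and write throughput (kr/s + kw/s) for every device on one
# controller, reading `iostat -xn` style output on standard input.
# Assumed columns: r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
port_kbytes_per_sec() {
    ctrl="$1"   # controller prefix, for example c5
    awk -v ctrl="$ctrl" '
        # Match on the device column only, which is always the last field.
        index($NF, ctrl "t") == 1 { total += $3 + $4 }
        END { printf "%.1f\n", total }
    '
}

# Made-up sample of one iostat interval; on a live system, pipe in the
# output of a single interval instead.
sample='                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   10.0    5.0 1024.0  512.0  0.0  0.1    0.0    6.5   0   4 c5t0d0
   20.0    0.0 2048.0    0.0  0.0  0.2    0.0    8.1   0   9 c5t1d0
    1.0    1.0    8.0    8.0  0.0  0.0    0.0    1.0   0   0 c1t0d0'

printf '%s\n' "$sample" | port_kbytes_per_sec c5
```

The result is in kilobytes per second. As a rough yardstick, a 2 Gb Fibre Channel link nominally carries about 200 MB per second in each direction, so a sustained total far below that suggests the link is not the bottleneck. This only holds if all devices on the controller sit behind the same HBA port.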

Tim Z - May 25, 2007


Q: How can we ensure that the Emulex driver for LP10000DC-x running on the Solaris 8 OS is totally removed before upgrading from the Solaris 8 release to Solaris 10? The goal is to make installing the latest driver for Emulex go smoothly.

A: Emulex has two drivers for the Solaris 8 OS. One is their old monolithic driver 'lpfc' and the second one is the new Leadville driver 'emlxs'. Neither of these drivers is part of the Solaris 8 distribution. Both of them are available as patches. So depending upon which patch you are using, doing a patchrm on the respective patch should remove the driver completely. To make sure that the driver is completely removed, do the following:

  • After running patchrm, check the /kernel/drv tree for the driver binary and conf file. For example, if you have 'lpfc' on a SPARC machine, these are the following three files: /kernel/drv/lpfc, /kernel/drv/sparcv9/lpfc, and /kernel/drv/lpfc.conf. After patchrm, these files should no longer be there.
  • Do a rem_drv for the driver. For example, if you had emlxs, issue the command rem_drv emlxs. This command might fail if the driver was already removed by patchrm.
  • After the above two steps, reboot the system, and after the reboot, run devfsadm -C. This cleans up stale device entries.
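The file check in the first step can be sketched as a small function. The three paths follow the SPARC example above; the function name and the root_dir parameter (which lets the check run against a scratch directory or an alternate root) are illustrative and not part of any Solaris tool.

```shell
#!/bin/sh
# Print any driver files still present after patchrm. The three paths follow
# the SPARC example in the steps above; root_dir is an illustrative parameter
# so the check can run against an alternate root.
leftover_driver_files() {
    root_dir="$1"
    driver="$2"
    for f in "kernel/drv/$driver" "kernel/drv/sparcv9/$driver" "kernel/drv/$driver.conf"; do
        if [ -f "$root_dir/$f" ]; then
            echo "/$f"
        fi
    done
}

# Demonstration against a scratch directory in which only the conf file
# was left behind.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/kernel/drv/sparcv9"
touch "$demo_root/kernel/drv/lpfc.conf"
leftover_driver_files "$demo_root" lpfc
```

On a real host, calling leftover_driver_files "" lpfc checks the live /kernel/drv tree. Empty output means the files are gone, after which the remaining steps (rem_drv, reboot, devfsadm -C) apply as listed.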

If you were running 'lpfc' before (on the Solaris 8 release) and now install Solaris 10, you will get the emlxs driver by default. Unlike Solaris 8, Solaris 10 has the emlxs driver integrated; that is, it comes with the default Solaris 10 distribution. Moving forward, that is the driver customers should be using. However, you can get 'lpfc' for the Solaris 10 OS from Emulex, and it comes with documentation on how to revert to 'lpfc' on Solaris 10. Note also that the device naming scheme is different with the 'emlxs' driver, so if you were using the 'lpfc' driver, you will have to migrate the device paths. For more information see Updating Fibre Channel Drivers to Use Sun StorEdge SAN Foundation Software in the Useful Links section of this session.

Here's another thing I have noticed (this is specific to SPARC platforms). If you are running old FCode on the card, the device path for the HBA (shown by the luxadm -e port command) shows 'lpfc' as one of the components of the path even if you don't have the lpfc driver installed. This is a little confusing, because the 'lpfc' in this case is not the driver. It is just a name exported by the FCode. You can get the driver name by running the prtconf -D <device path> command, and you will notice that the driver name is 'emlxs' in this case. This is not really a problem, except that it is confusing when you use a management application like luxadm to view the paths. To fix this issue, make sure you have the latest FCode, and then use the set-sfs-boot command to switch the boot mode (see Boot Code User Manual for Emulex HBAs at http://www.emulex.com/support/hardware/bootcode.pdf for more information).

Ajay Parandkar - May 25, 2007


Q: What is your opinion about fixed settings for Fibre Channel topology and speed in emlxs.conf and qlc.conf? In the past, we saw some problems with autonegotiation of these parameters and therefore we recommended using fixed settings.

A: There are two problems with fixed settings:

  1. If you ever switch an HBA from a fabric to a loop (or vice versa), you had better remember that you have fixed settings in the .conf file. The same goes for moving between different switches with different link speeds.
  2. This might mask a real problem (see below for more on that).

Before you decide to go for fixed settings, devote some attention to the nature of the problems seen. If the problems were related to port speed, for example a 4G link that did not work well as a 4G link but worked well when forced to be a 2G link, then I think the problem is with the GBIC/SFP, the cable, or the HBA hardware. Otherwise the problem is related to topology. The only topology-related issue I am aware of is an HBA port coming up in loop topology even when it is connected directly to the switch. Mostly that can be solved by just setting the switch port to G_port (or F_port) instead of GL_port.

mon - May 25, 2007


Q: What is the relationship of MPxIO and ZFS - and then ZFS and SAN device configuration?

A: MPxIO and ZFS are two different technologies that are independent of each other. We'll look at MPxIO first.

When a Fibre Channel HBA inside a Solaris host is connected to a SAN, the host sees the devices on the SAN through that HBA port. This view of devices through an HBA port is called a 'path'. If another HBA port in the same host is connected to the same SAN, the host will see the same set of devices again but this time through a different 'path'. Now you have two paths to the same device. In the absence of MPxIO (or any other multipathing software), the host does not know that it is seeing the same set of devices again. The applications running on the host also see multiple /dev/[r]dsk nodes for the same device. Unless these applications know which two paths belong to the same device, they will end up thinking that these are two different devices. This is a potentially very dangerous configuration, or at least not a very useful one. Ideally you would want the host to be aware of the multiple 'paths' to the same device and then load balance the traffic, do failovers in case one path goes bad, etc. This is what MPxIO provides.

When MPxIO is used, applications only see one node in the /dev/[r]dsk/ tree. And this node is the same no matter how many connections you have to the device. Also, the MPxIO software provides for load balancing and automatic failover functionality.

ZFS, on the other hand, is a file system like UFS but of course much more advanced and sophisticated. You use the devices under the /dev/[r]dsk tree to create ZFS file system(s) or volume(s). Unlike MPxIO, ZFS does not care about the 'paths' to a device but it cares about the device itself. ZFS allows you to combine multiple devices to create larger volumes or file systems. For more information on ZFS, check out the OpenSolaris ZFS community page at http://www.opensolaris.org/os/community/zfs/.

david - May 27, 2007


Q: On the Solaris 8, 9, and 10 OS, what tools are available to monitor HBA performance and statistics? I'm using the Sun-branded Emulex cards with the emlxs driver. I'm interested in knowing how many requests are queued at the card, the number of reads and writes, the I/O wait (how long is a request in the queue), etc.

A: So far there are no tools to provide that information at the HBA level. The main tool on the host side to get performance data is iostat, which operates at the target driver (e.g. ssd) level.

Edward Farrar - May 29, 2007


Q: Why does Sun use so many different management interfaces with its storage devices?

I have Sun StorEdge T3, 3511 and 6130 arrays, and each is totally different. The 6130... is hopefully not the direction Sun continues to go in -- it requires software on the Solaris box to provide a web interface to manage the machine.

IMHO the best solution would be a local web interface similar to the text menu interface on the Sun StorEdge 3511 array, with SNMP agent capability for status.

I also like the command-line interface on the Sun StorEdge T3 arrays. We wrote a script that telnets into a T3 and parses fru stat to report status in "Big Brother". I can see folks liking a menu similar to the 3511. I don't see any advantage for the slow and klunky 6130 interface.

A: Sun is definitely aware of the differences in management interfaces among our storage array products. There is an effort underway to ensure more commonality among these interfaces. Sun StorageTek Common Array Manager software was a first step toward that goal.

Robert Eden - May 29, 2007


Q: Sun has recently implemented TPGS into STMS. Is there any formal documentation available that describes the details?

A: Work on that documentation is in progress. The document is scheduled to be posted as an Infodoc.

Hans-Paul Drumm - May 29, 2007


Q: In the context of STMS/Leadville, the ssd stack is employed, otherwise it is the sd stack. There are some ssd and sd parameters that can be employed in order to deviate from the defaults. Is there any documentation that describes all the sd/ssd parameters? (I am aware of (s)sd_max_throttle, (s)sd_min_throttle, (s)sd_retry_count, and (s)sd_io_time -- what about the others?)

A: There are different retry counts for sd/ssd which can be set by the ssd-config-list parameter in the sd.conf file. We are working on a tunables document which will describe both sd/ssd and Leadville tunables and we plan to post it on the BigAdmin site.

Hans-Paul Drumm - May 29, 2007


Q: With the Solaris 10 OS, (s)sd_retry_count does not seem to exist anymore. Is this correct, and if so, why?

A: This has already been identified as a bug and work is in progress to fix this issue. See the bug report at http://bugs.opensolaris.org/view_bug.do?bug_id=6518995.

Hans-Paul Drumm - May 29, 2007


Copyright 1994-2008 Sun Microsystems, Inc.