Managing I/O Topology Changes Due to CPU Chip Failures in Multi-Node sun4v Platforms

Sree Vemuri, February 2009

The Sun SPARC Enterprise T5240 Server is a dual-socket server with up to two UltraSPARC T2 Plus processors. The Sun SPARC Enterprise T5440 Server is a quad-socket server with up to four UltraSPARC T2 Plus processors.

Both servers have an integrated Peripheral Component Interconnect (PCI) topology that causes the I/O device paths to change when a Chip Multiprocessing (CMP) node fails. The system can be booted to OpenBoot PROM (OBP) with the remaining CMP nodes, but the operating system might fail to boot, because the PCI Express (PCIe) device paths have changed. When the same device is found at a different I/O path after the power-cycle it is treated as a new device. A change to the boot path requires that the OS be reinstalled.

The In a CPU node failover condition, the script combines the current Solaris device paths and the

Usage

    /usr/platform/sun4v/sbin/device_remap [-v | -R <dir>]

The following options are supported:

Procedure

After adding CPU nodes or removing CPU nodes, boot the system to the OBP prompt, and use the following steps:

1. Boot either the failsafe miniroot using
2. Mount the root disk as
3. Change to the mounted root disk directory: cd /mnt
4. Run the /mnt/usr/platform/sun4v/sbin/device_remap
5. Boot the system from disk.

All the error messages are self-explanatory except for the error message

Examples

Example 1: When CMP node 1 fails, the

    # /a/usr/platform/sun4v/sbin/device_remap -v
    replacing /pci@500 with /pci@700/pci@0/pci@1 in /etc/path_to_inst
    path_to_inst changes:
    45,52c45,52
    < "/pci@500/pci@0" 5 "pxb_plx"
    < "/pci@500/pci@0/pci@9" 6 "pxb_plx"
    < "/pci@500/pci@0/pci@c" 7 "pxb_plx"
    < "/pci@500/pci@0/pci@c/network@0" 0 "nxge"
    < "/pci@500/pci@0/pci@c/network@0,1" 1 "nxge"
    < "/pci@500/pci@0/pci@c/network@0,2" 2 "nxge"
    < "/pci@500/pci@0/pci@c/network@0,3" 3 "nxge"
    < "/pci@500/pci@0/pci@d" 8 "pxb_plx"
    ---
    > "/pci@700/pci@0/pci@1/pci@0" 5 "pxb_plx"
    > "/pci@700/pci@0/pci@1/pci@0/pci@9" 6 "pxb_plx"
    > "/pci@700/pci@0/pci@1/pci@0/pci@c" 7 "pxb_plx"
    > "/pci@700/pci@0/pci@1/pci@0/pci@c/network@0" 0 "nxge"
    > "/pci@700/pci@0/pci@1/pci@0/pci@c/network@0,1" 1 "nxge"
    > "/pci@700/pci@0/pci@1/pci@0/pci@c/network@0,2" 2 "nxge"
    > "/pci@700/pci@0/pci@1/pci@0/pci@c/network@0,3" 3 "nxge"
    > "/pci@700/pci@0/pci@1/pci@0/pci@d" 8 "pxb_plx"
    updating /dev symlinks
    #

Example 2: When CMP node 1 is added, the

    # /a/usr/platform/sun4v/sbin/device_remap -v
    replacing /pci@700/pci@0/pci@1 with /pci@500 in /etc/path_to_inst
    path_to_inst changes:
    45,52c45,52
    < "/pci@700/pci@0/pci@1/pci@0" 5 "pxb_plx"
    < "/pci@700/pci@0/pci@1/pci@0/pci@9" 6 "pxb_plx"
    < "/pci@700/pci@0/pci@1/pci@0/pci@c" 7 "pxb_plx"
    < "/pci@700/pci@0/pci@1/pci@0/pci@c/network@0" 0 "nxge"
    < "/pci@700/pci@0/pci@1/pci@0/pci@c/network@0,1" 1 "nxge"
    < "/pci@700/pci@0/pci@1/pci@0/pci@c/network@0,2" 2 "nxge"
    < "/pci@700/pci@0/pci@1/pci@0/pci@c/network@0,3" 3 "nxge"
    < "/pci@700/pci@0/pci@1/pci@0/pci@d" 8 "pxb_plx"
    ---
    > "/pci@500/pci@0" 5 "pxb_plx"
    > "/pci@500/pci@0/pci@9" 6 "pxb_plx"
    > "/pci@500/pci@0/pci@c" 7 "pxb_plx"
    > "/pci@500/pci@0/pci@c/network@0" 0 "nxge"
    > "/pci@500/pci@0/pci@c/network@0,1" 1 "nxge"
    > "/pci@500/pci@0/pci@c/network@0,2" 2 "nxge"
    > "/pci@500/pci@0/pci@c/network@0,3" 3 "nxge"
    > "/pci@500/pci@0/pci@d" 8 "pxb_plx"
    updating /dev symlinks
    #
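
The path_to_inst changes reported in Examples 1 and 2 amount to swapping one leading device-path prefix for another. The following is a minimal sketch of that kind of substitution, not the actual implementation of device_remap. The function name remap_prefix is hypothetical, the third sample line (/pci@400) is invented for illustration, and the sketch assumes the prefixes contain no characters that are special to sed. It only rewrites text on standard input; it does not touch /etc/path_to_inst or the /dev symlinks.

```shell
#!/bin/sh
# Hypothetical helper (not part of device_remap): rewrite a leading
# device-path prefix in path_to_inst-style lines read from standard input.
# Usage: remap_prefix OLD_PREFIX NEW_PREFIX < file
remap_prefix() {
    old=$1
    new=$2
    # Match the prefix only at the start of the quoted path, and only when
    # it is followed by "/" or the closing quote, so that /pci@500 does not
    # also match a longer name such as /pci@5000.
    sed -e "s|^\"${old}\([/\"]\)|\"${new}\1|"
}

# Sample input: two entries from Example 1 plus one invented, unrelated
# entry (/pci@400) that should pass through unchanged.
sample='"/pci@500/pci@0" 5 "pxb_plx"
"/pci@500/pci@0/pci@c/network@0,1" 1 "nxge"
"/pci@400/pci@0" 2 "pxb_plx"'

printf '%s\n' "$sample" | remap_prefix /pci@500 /pci@700/pci@0/pci@1
```

Swapping the two arguments gives the reverse mapping shown in Example 2. Anchoring the match at the start of the quoted path keeps entries under other root complexes untouched.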