[OmniOS-discuss] SSD rpool degraded

Johan Kragsterman johan.kragsterman at capvert.se
Thu Jul 2 13:55:07 UTC 2015


Hi!

I have a degraded rpool consisting of a mirror of two SSDs. I am not sure that it really is the SSD that has failed, since it is enterprise grade and hasn't been running that long.

I would like to know if there is a way to figure out whether it is the SATA port or the SSD that has failed.

The zpool status looks like this:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c2t0d0s0  ONLINE       0     0     0
            c2t1d0s0  FAULTED      1   191     0  too many errors



dmesg contains this:

Jun 29 00:39:47 omni2 genunix: [ID 517647 kern.warning] WARNING: ahci0: watchdog port 1 satapkt 0xffffff065eb76860 timed out
Jun 29 00:39:58 omni2 genunix: [ID 860969 kern.warning] WARNING: ahci0: ahci_port_reset port 1 the device hardware has been initialized and the power-up diagnostics failed
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2 genunix: [ID 801845 kern.info] /pci@0,0/pci1028,26e@1f,2:
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:40:14 omni2 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
Jun 29 00:40:14 omni2 EVENT-TIME: Mon Jun 29 00:40:14 CEST 2015
Jun 29 00:40:14 omni2 PLATFORM: Precision-WorkStation-T5500, CSN: BCLJ55J, HOSTNAME: omni2
Jun 29 00:40:14 omni2 SOURCE: zfs-diagnosis, REV: 1.0
Jun 29 00:40:14 omni2 EVENT-ID: e44ba921-004f-61f8-cbdf-8f1ebf0d57c0
Jun 29 00:40:14 omni2 DESC: The number of I/O errors associated with a ZFS device exceeded
Jun 29 00:40:14 omni2        acceptable levels.  Refer to http://illumos.org/msg/ZFS-8000-FD for more information.
Jun 29 00:40:14 omni2 AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
Jun 29 00:40:14 omni2        will be made to activate a hot spare if available. 
Jun 29 00:40:14 omni2 IMPACT: Fault tolerance of the pool may be compromised.
Jun 29 00:40:14 omni2 REC-ACTION: Run 'zpool status -x' and replace the bad device.



From that, it looks like ZFS is hinting that it is the device, not the port...
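
To see at a glance which port the kernel messages refer to, a log like the one above can be tallied with a short script. This is only a minimal sketch: the function name count_port_messages and the "port N" pattern are assumptions made for illustration, and the sample is a few lines copied from the log above. The tally shows where the errors are concentrated; by itself it cannot tell a bad SSD from a bad port or cable.

```python
import re
from collections import Counter

# A few sample lines copied from the dmesg output above.
LOG = """\
Jun 29 00:39:47 omni2 genunix: [ID 517647 kern.warning] WARNING: ahci0: watchdog port 1 satapkt 0xffffff065eb76860 timed out
Jun 29 00:39:58 omni2 genunix: [ID 860969 kern.warning] WARNING: ahci0: ahci_port_reset port 1 the device hardware has been initialized and the power-up diagnostics failed
Jun 29 00:39:59 omni2  SATA port 1 error
Jun 29 00:39:59 omni2  SATA port 1 error
"""

# Assumption: the port is always written as the word "port" followed by a number.
PORT_RE = re.compile(r"\bport (\d+)\b")


def count_port_messages(text):
    """Count log lines that mention each port number."""
    counts = Counter()
    for line in text.splitlines():
        match = PORT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


port_counts = count_port_messages(LOG)
print(dict(port_counts))
```
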




cfgadm -al:

root@omni2:/root# cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c3                             scsi-sas     connected    unconfigured unknown
c5                             scsi-sas     connected    configured   unknown
c5::w5000c50078e5135e,0        disk-path    connected    configured   unknown
c8                             scsi-sas     connected    configured   unknown
c8::w5000c5007ffee30b,0        disk-path    connected    configured   unknown
c9                             scsi-sas     connected    configured   unknown
c9::w500a0751034af6dc,0        disk-path    connected    configured   unknown
sata1/0::dsk/c2t0d0            disk         connected    configured   ok
sata1/1                        sata-port    disconnected unconfigured failed
sata1/2                        sata-port    empty        unconfigured ok
sata1/3                        sata-port    empty        unconfigured ok
sata1/4                        sata-port    empty        unconfigured ok
sata1/5                        sata-port    empty        unconfigured ok


But the cfgadm output makes me unsure again, since it reports sata1/1 as disconnected, unconfigured and failed...
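
For a longer listing, the attachment points that cfgadm marks as failed can be picked out with a few lines of script. Again this is just a sketch under stated assumptions: the helper name failed_attachment_points is hypothetical, the sample is a subset of the rows above, and the parsing assumes the five whitespace-separated columns shown in the header.

```python
# A subset of the "cfgadm -al" rows shown above.
CFGADM_OUTPUT = """\
Ap_Id                          Type         Receptacle   Occupant     Condition
c3                             scsi-sas     connected    unconfigured unknown
sata1/0::dsk/c2t0d0            disk         connected    configured   ok
sata1/1                        sata-port    disconnected unconfigured failed
sata1/2                        sata-port    empty        unconfigured ok
"""


def failed_attachment_points(text):
    """Return (Ap_Id, receptacle, condition) for rows whose condition is 'failed'."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or unexpected lines
        ap_id, _type, receptacle, _occupant, condition = fields
        if condition == "failed":
            rows.append((ap_id, receptacle, condition))
    return rows


failed = failed_attachment_points(CFGADM_OUTPUT)
print(failed)
```
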

Does anyone know...?


Best regards from/Med vänliga hälsningar från

Johan Kragsterman

Capvert


