<div dir="ltr"><div><div><div>Hi,<br>
<br>
</div>I have a Dell MD1200 connected to two heads (Dell R710). Each head
has a PERC H800 card, and the drives are configured as RAID 0 virtual disks
in the RAID controller.<br>
<br>
One of the drives crashed and was replaced by a spare. Resilvering
was triggered but fails to complete because drives go offline. I have
to reboot the head (R710) before the drives come back online. This has
happened repeatedly: the resilver hung at 4% done, I rebooted, it hung
again at 27% done, and so on.<br>
<br>
</div><div>The issue happens with both Solaris 11.1 and OmniOS.<br>
</div><div>It is a 100 TB pool with 69 TB used. The data is critical and I cannot afford to lose it.<br>
</div>Can I recover the data in any way (at least partially)?<br>
<br>
</div>I have verified that there is no hardware issue with the H800 and have
also upgraded its firmware. The issue happens with both heads.<br>
<div><br>
Current OS: Solaris 11.1<br>
<div><br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@12,0 (sd26):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@c,0 (sd20):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@18,0 (sd32):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1c,0 (sd36):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1b,0 (sd35):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1e,0 (sd38):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@19,0 (sd33):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1d,0 (sd37):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@27,0 (sd47):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@26,0 (sd46):<br>
Mar 22 21:47:55 solaris Command failed to complete...Device is gone<br>
<br>
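<div>To see at a glance which disks dropped out, the warnings above can be reduced to a list of targets. This is only a minimal sketch: the function name <code>gone_devices</code> is hypothetical, and it assumes each WARNING line is followed directly by its "Device is gone" line, as in the excerpt. The unit address after <code>sd@</code> is treated as hexadecimal (the excerpt contains <code>sd@c,0</code>); whether that target number matches the N in c8tNd0 is an assumption I have not confirmed.<br></div>

```python
import re

# Matches the device path and sd instance in lines such as:
#   ... WARNING: /pci@0,0/.../sd@12,0 (sd26):
WARNING_RE = re.compile(r"WARNING: (\S+/sd@([0-9a-f]+),(\d+)) \((sd\d+)\):")


def gone_devices(log_text):
    """Return sorted (target, sd_instance) pairs for 'Device is gone' warnings."""
    lines = log_text.splitlines()
    found = set()
    for i, line in enumerate(lines[:-1]):
        m = WARNING_RE.search(line)
        # Only count a warning when the next line reports the device as gone.
        if m and "Device is gone" in lines[i + 1]:
            target = int(m.group(2), 16)  # unit address is hexadecimal
            found.add((target, m.group(4)))
    return sorted(found)


# Two warnings copied from the log excerpt in this post.
sample = """\
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@12,0 (sd26):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@c,0 (sd20):
Mar 22 21:47:55 solaris Command failed to complete...Device is gone
"""

for target, sd in gone_devices(sample):
    print(f"target {target} ({sd})")
```

<div>Run over the full /var/adm/messages excerpt, this would show whether the same set of targets disappears each time or a different set on every hang.<br></div>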
</div><div># zpool status -v<br>
</div><div><br>
pool: test<br>
state: DEGRADED<br>
status: One or more devices is currently being resilvered. The pool will<br>
continue to function in a degraded state.<br>
action: Wait for the resilver to complete.<br>
scan: resilver in progress since Wed Mar 20 19:13:40 2013<br>
27.4T scanned out of 69.6T at 183M/s, 67h11m to go<br>
2.43T resilvered, 39.32% done<br>
config: <br>
<br>
NAME STATE READ WRITE CKSUM<br>
test DEGRADED 0 0 0<br>
raidz1-0 DEGRADED 0 0 0<br>
c8t0d0 ONLINE 0 0 0<br>
c8t1d0 DEGRADED 0 0 0<br>
c8t2d0 DEGRADED 0 0 0<br>
c8t3d0 ONLINE 0 0 0<br>
spare-4 DEGRADED 0 0 0<br>
12459181442598970150 UNAVAIL 0 0 0<br>
c8t45d0 DEGRADED 0 0 0 (resilvering)<br>
raidz1-1 ONLINE 0 0 0<br>
c8t5d0 ONLINE 0 0 0<br>
c8t6d0 ONLINE 0 0 0<br>
c8t7d0 ONLINE 0 0 0<br>
c8t8d0 ONLINE 0 0 0<br>
c8t9d0 ONLINE 0 0 0<br>
raidz1-3 DEGRADED 0 0 0<br>
c8t12d0 ONLINE 0 0 0<br>
c8t13d0 ONLINE 0 0 0<br>
c8t14d0 ONLINE 0 0 0<br>
c8t15d0 DEGRADED 0 0 0<br>
c8t16d0 ONLINE 0 0 0<br>
c8t17d0 ONLINE 0 0 0<br>
c8t18d0 ONLINE 0 0 0<br>
c8t19d0 ONLINE 0 0 0<br>
c8t20d0 DEGRADED 0 0 0<br>
c8t21d0 DEGRADED 0 0 0<br>
spare-10 DEGRADED 0 0 0<br>
c8t22d0 DEGRADED 0 0 0<br>
c8t47d0 DEGRADED 0 0 0 (resilvering)<br>
c8t23d0 ONLINE 0 0 0<br>
raidz1-4 DEGRADED 0 0 0<br>
c8t24d0 DEGRADED 0 0 0<br>
c8t25d0 ONLINE 0 0 0<br>
c8t26d0 ONLINE 0 0 0<br>
c8t27d0 ONLINE 0 0 0<br>
c8t28d0 ONLINE 0 0 0<br>
c8t29d0 DEGRADED 0 0 0<br>
c8t30d0 ONLINE 0 0 0<br>
raidz1-5 DEGRADED 0 0 0<br>
spare-0 DEGRADED 0 0 5<br>
c8t31d0 DEGRADED 0 0 0<br>
c8t46d0 DEGRADED 0 0 0 (resilvering)<br>
c8t32d0 ONLINE 0 0 0<br>
c8t33d0 ONLINE 0 0 0<br>
c8t34d0 ONLINE 0 0 0<br>
c8t35d0 DEGRADED 0 0 0<br>
c8t36d0 DEGRADED 0 0 0<br>
c8t37d0 ONLINE 0 0 0<br>
raidz1-6 DEGRADED 0 0 0<br>
c8t38d0 DEGRADED 0 0 0<br>
c8t39d0 ONLINE 0 0 0<br>
c8t40d0 DEGRADED 0 0 0<br>
c8t41d0 DEGRADED 0 0 0<br>
c8t42d0 ONLINE 0 0 0<br>
c8t43d0 ONLINE 0 0 0<br>
c8t44d0 ONLINE 0 0 0<br>
spares<br>
c8t45d0 INUSE<br>
c8t46d0 INUSE<br>
c8t47d0 INUSE<br>
<br>
device details:<br>
<br>
c8t1d0 DEGRADED scrub/resilver needed<br>
status: ZFS detected errors on this device.<br>
The device is missing some data that is recoverable.<br>
<br>
c8t2d0 DEGRADED scrub/resilver needed<br>
status: ZFS detected errors on this device.<br>
The device is missing some data that is recoverable.<br>
<br>
12459181442598970150 UNAVAIL was /dev/dsk/c2t4d0s0<br>
status: ZFS detected errors on this device.<br>
The device was missing.<br>
<br>
c8t45d0 DEGRADED scrub/resilver needed<br>
status: ZFS detected errors on this device.<br>
The device is missing some data that is recoverable.<br>
<br>
c8t15d0 DEGRADED scrub/resilver needed<br>
status: ZFS detected errors on this device.<br>
The device is missing some data that is recoverable.<br>
<br>
c8t20d0 DEGRADED scrub/resilver needed<br>
status: ZFS detected errors on this device.<br>
The device is missing some data that is recoverable.<br>
<br>
c8t21d0 DEGRADED scrub/resilver needed<br>
status: ZFS detected errors on this device.<br>
The device is missing some data that is recoverable.<br>
<br>
c8t22d0 DEGRADED scrub/resilver needed<br>
status: ZFS detected errors on this device.<br>
The device is missing some data that is recoverable.<br>
<br>
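<div>Since every vdev here is raidz1, it may help to tally the device states per vdev from the output above. The following is a rough sketch, not a diagnosis: <code>tally_by_vdev</code> is a hypothetical helper, it assumes the flattened (unindented) layout shown in this post, and it counts both members of a spare-N group as separate devices. It does not try to judge whether a DEGRADED device ("scrub/resilver needed") still holds usable data.<br></div>

```python
from collections import Counter


def tally_by_vdev(config_text):
    """Count device states under each raidz vdev in flattened 'zpool status' output."""
    tallies = {}
    current = None
    for line in config_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        if name == "spares":
            current = None  # hot-spare list, not part of any vdev
        elif name.startswith("raidz"):
            current = name
            tallies[current] = Counter()
        elif current and len(fields) >= 2 and not name.startswith("spare-"):
            # Leaf device row: second column is its state.
            tallies[current][fields[1]] += 1
    return tallies


# The first two vdevs and the start of the spares list, copied from this post.
sample = """\
NAME STATE READ WRITE CKSUM
test DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
c8t0d0 ONLINE 0 0 0
c8t1d0 DEGRADED 0 0 0
c8t2d0 DEGRADED 0 0 0
c8t3d0 ONLINE 0 0 0
spare-4 DEGRADED 0 0 0
12459181442598970150 UNAVAIL 0 0 0
c8t45d0 DEGRADED 0 0 0 (resilvering)
raidz1-1 ONLINE 0 0 0
c8t5d0 ONLINE 0 0 0
c8t6d0 ONLINE 0 0 0
c8t7d0 ONLINE 0 0 0
c8t8d0 ONLINE 0 0 0
c8t9d0 ONLINE 0 0 0
spares
c8t45d0 INUSE
"""

for vdev, states in tally_by_vdev(sample).items():
    print(vdev, dict(states))
```

<div>Any vdev showing an UNAVAIL device alongside several DEGRADED ones is where I would expect the resilver to be most at risk if another drive drops offline.<br></div>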
<br></div></div></div>