[OmniOS-discuss] issue importing zpool on S11.1 from omniOS LUNs

Stephan Budach stephan.budach at jvm.de
Wed Jan 25 22:41:19 UTC 2017


Hi Richard,

On 25.01.17 at 20:27, Richard Elling wrote:
> Hi Stephan,
>
>> On Jan 25, 2017, at 5:54 AM, Stephan Budach <stephan.budach at jvm.de> wrote:
>>
>> Hi guys,
>>
>> I have been trying to import a zpool, based on a 3-way mirror provided 
>> by three omniOS boxes via iSCSI. This zpool had been working 
>> flawlessly until some random reboot of the S11.1 host. Since then, 
>> S11.1 has been unable to import this zpool.
>>
>> This zpool consists of three 108TB LUNs, each based on a raidz-2 zvol… 
>> yeah I know, we shouldn't have done that in the first place, but 
>> performance was not the primary goal for that, as this one is a 
>> backup/archive pool.
>>
>> When issuing a zpool import, it says this:
>>
>> root at solaris11atest2:~# zpool import
>>   pool: vsmPool10
>>     id: 12653649504720395171
>>  state: DEGRADED
>> status: The pool was last accessed by another system.
>> action: The pool can be imported despite missing or damaged devices.  The
>>         fault tolerance of the pool may be compromised if imported.
>>    see: http://support.oracle.com/msg/ZFS-8000-EY
>> config:
>>
>>         vsmPool10                                  DEGRADED
>>           mirror-0                                 DEGRADED
>>             c0t600144F07A3506580000569398F60001d0  DEGRADED  corrupted data
>>             c0t600144F07A35066C00005693A0D90001d0  DEGRADED  corrupted data
>>             c0t600144F07A35001A00005693A2810001d0  DEGRADED  corrupted data
>>
>> device details:
>>
>> c0t600144F07A3506580000569398F60001d0 DEGRADED         scrub/resilver needed
>>         status: ZFS detected errors on this device.
>>                 The device is missing some data that is recoverable.
>>
>> c0t600144F07A35066C00005693A0D90001d0 DEGRADED         scrub/resilver needed
>>         status: ZFS detected errors on this device.
>>                 The device is missing some data that is recoverable.
>>
>> c0t600144F07A35001A00005693A2810001d0 DEGRADED         scrub/resilver needed
>>         status: ZFS detected errors on this device.
>>                 The device is missing some data that is recoverable.
>>
>> However, when actually running zpool import -f vsmPool10, the system 
>> starts to perform a lot of writes on the LUNs and iostat reports an 
>> alarming increase in h/w errors:
>>
>> root at solaris11atest2:~# iostat -xeM 5
>>                          extended device statistics         ---- errors ---
>> device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b s/w h/w trn tot
>> sd0       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0   0   0   0
>> sd1       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0   0   0   0
>> sd2       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0  71   0  71
>> sd3       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0   0   0   0
>> sd4       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0   0   0   0
>> sd5       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0   0   0   0
>>                          extended device statistics         ---- errors ---
>> device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b s/w h/w trn tot
>> sd0      14.2  147.3    0.7    0.4 0.2  0.1    2.0   6   9   0   0   0   0
>> sd1      14.2    8.4    0.4    0.0 0.0  0.0    0.3   0   0   0   0   0   0
>> sd2       0.0    4.2    0.0    0.0 0.0  0.0    0.0   0   0   0  92   0  92
>> sd3     157.3   46.2    2.1    0.2 0.0  0.7    3.7   0  14   0  30   0  30
>> sd4     123.9   29.4    1.6    0.1 0.0  1.7   10.9   0  36   0  40   0  40
>> sd5     142.5   43.0    2.0    0.1 0.0  1.9   10.2   0  45   0  88   0  88
>>                          extended device statistics         ---- errors ---
>> device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b s/w h/w trn tot
>> sd0       0.0  234.5    0.0    0.6 0.2  0.1    1.4   6  10   0   0   0   0
>> sd1       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0   0   0   0
>> sd2       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0  92   0  92
>> sd3       3.6   64.0    0.0    0.5 0.0  4.3   63.2   0  63   0 235   0 235
>> sd4       3.0   67.0    0.0    0.6 0.0  4.2   60.5   0  68   0 298   0 298
>> sd5       4.2   59.6    0.0    0.4 0.0  5.2   81.0   0  72   0 406   0 406
>>                          extended device statistics         ---- errors ---
>> device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b s/w h/w trn tot
>> sd0       0.0  234.8    0.0    0.7 0.4  0.1    2.2  11  10   0   0   0   0
>> sd1       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0   0   0   0
>> sd2       0.0    0.0    0.0    0.0 0.0  0.0    0.0   0   0   0  92   0  92
>> sd3       5.4   54.4    0.0    0.3 0.0  2.9   48.5   0  67   0 384   0 384
>> sd4       6.0   53.4    0.0    0.3 0.0  4.6   77.7   0  87   0 519   0 519
>> sd5       6.0   60.8    0.0    0.3 0.0  4.8   72.5   0  87   0 727   0 727
>
> h/w errors are a classification of other errors. The full error list 
> is available from "iostat -E" and will be important to tracking this 
> down.
>
> A better, more detailed analysis can be gleaned from the "fmdump -e" 
> ereports that should be associated with each h/w error. However, there 
> are dozens of causes of these so we don't have enough info here to 
> fully understand.
>  — richard
>
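As an aside, the iostat samples quoted above can be reduced to just the 
devices that report hardware errors. The following is only a minimal 
sketch: the function name hw_errors is made up for illustration, and it 
assumes the unwrapped "iostat -xe" layout in which the last four columns 
of each device line are s/w, h/w, trn and tot.

```shell
#!/bin/sh
# hw_errors (hypothetical helper): read `iostat -xe`-style output on stdin
# and print "<device> <h/w count>" for every device line whose h/w error
# column is non-zero.
# Assumption: the last four fields of a device line are s/w h/w trn tot,
# so the h/w count is the third field from the end.
hw_errors() {
    awk 'NF >= 5 && $(NF-2) ~ /^[0-9]+$/ && $(NF-2) > 0 { print $1, $(NF-2) }'
}
```

It would be used as "iostat -xeM 5 | hw_errors"; header lines are skipped 
because their third-from-last field is not a number. The per-device 
breakdown Richard mentions would still have to come from "iostat -E" and 
"fmdump -e".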
Well… I can't provide you with the output of fmdump -e, since I am 
currently unable to type the '-' at the console due to some fancy 
keyboard layout issue, and I am not able to log in via ssh either (I 
can authenticate, but I don't get a shell, which may be due to the 
running zpool import). I can confirm, however, that fmdump shows nothing 
at all. I could just reset the S11.1 host after removing the 
zpool.cache file, so that the system will not try to import the 
zpool right away upon restart…

…plus I might get the option to set the keyboard right, after reboot, 
but that's another issue…
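
The zpool.cache step could look roughly like the sketch below. It 
assumes the cache file lives at /etc/zfs/zpool.cache (the usual location, 
but not verified on this host); the function name park_cache and the 
".disabled" suffix are made up for illustration. The file is renamed 
rather than deleted so that it can be put back later.

```shell
#!/bin/sh
# Sketch only: park the ZFS cache file so the pool is not imported
# automatically at the next boot.
# Assumption: the cache file is /etc/zfs/zpool.cache; override CACHE if not.
CACHE=${CACHE:-/etc/zfs/zpool.cache}

# park_cache (hypothetical helper): rename the cache file instead of
# deleting it, and report what was done.
park_cache() {
    if [ -f "$CACHE" ]; then
        mv "$CACHE" "$CACHE.disabled" && echo "parked $CACHE"
    else
        echo "no cache file at $CACHE"
    fi
}

# Usage (as root, before resetting the host):
#   park_cache
```

Any other non-root pool recorded in that cache file would presumably 
need a manual "zpool import" after the reboot as well.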

Thanks,
Stephan