[OmniOS-discuss] iscsi timeouts

Saso Kiselkov skiselkov.ml at gmail.com
Tue Jan 21 22:16:11 UTC 2014


On 1/21/14, 10:09 PM, Saso Kiselkov wrote:
> On 1/21/14, 10:01 PM, Tobias Oetiker wrote:
>> Hi Nld,
>>
>> Today Narayan Desai wrote:
>>
>>> Sorry, I should have given the requisite "yes, I know that this is a recipe
>>> for sadness, for I too have experienced said sadness".
>>>
>>> That said, we've seen this kind of problem when there was a device in a
>>> vdev that was dying a slow death. There wouldn't necessarily be any sign,
>>> aside from insanely high service times on an individual device in the pool.
>>> From this, I assume that ZFS is still sensitive to variation in underlying
>>> drive performance.
>>>
>>> Tobi, what do your drive service times look like?
>>>  -nld
>>
>> the drives seem fine, smart is not reporting anything out of the
>> ordinary and also iostat -En shows 0 on all counts
>>
>> I don't think it is a disk issue, but rather something connected
>> with the network ...
>>
>> At times the machine becomes unreachable for some time, and then it
>> is possible to login via console and all seems well internally.
>> setting the network interface offline and then online again using
>> the dladm tool brings the connectivity back immediately. waiting
>> helps as well ... since the problem sorts itself out after a few
>> seconds to minutes ...
>>
>> we just had another 'off the net' period for 30 minutes
>>
>> unfortunately omnios itself does not seem to realize that something
>> is off, at least dmesg does not show any kernel messages about this
>> problem ...
>>
>> we have several systems running on the S2600CP MB ... this is the
>> only one showing problems ...
>>
>> the next thing I intend to do is to upgrade the MB firmware since I
>> found that this box has an older version than the other ones ...
>>
>> System Configuration: Intel Corporation S2600CP
>> BIOS Configuration: Intel Corp. SE5C600.86B.01.06.0002.110120121539 11/01/2012
>>
>> other ideas, most welcome !
> 
> You mentioned a couple of e-mails back that you're using Intel I350s.
> Can you verify that your kernel has:
> 
> commit 43ae55058ad99c869a9ae39d039490e8a3680520
> Author: Dan McDonald <danmcd at nexenta.com>
> Date:   Thu Feb 7 19:27:18 2013 -0500
> 
>     3534 Disable EEE support in igb for I350
>     Reviewed by: Robert Mustacchi <rm at joyent.com>
>     Reviewed by: Jason King <jason.brian.king at gmail.com>
>     Reviewed by: Marcel Telka <marcel at telka.sk>
>     Reviewed by: Sebastien Roy <sebastien.roy at delphix.com>
>     Approved by: Richard Lowe <richlowe at richlowe.net>
> 
> I guess you can check for this string at runtime:
> $ strings /kernel/drv/amd64/igb | grep _eee_support
> 
> If it is missing, then it could be the buggy EEE support that's throwing
> your link out of whack here.

Never mind, I missed your description of the KVM guests being reachable
while only the host goes offline... Did snoop show anything arriving at
the host while it is offline?
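
If not, a capture taken from the console during the next outage might
help. The following is only a sketch: the helper name, igb0, the output
path and the packet count are all assumptions on my part, so substitute
the link name that `dladm show-link` reports on your box. It just prints
the snoop command line so you can review it before running it as root.

```shell
#!/bin/sh
# Sketch only: print a snoop invocation that records what reaches the host NIC.
# The function name and all argument values are placeholders, not from this thread.
snoop_capture_cmd() {
    link="$1"    # datalink name, as listed by `dladm show-link`
    out="$2"     # capture file, readable later with `snoop -i <file>`
    count="$3"   # stop after this many packets
    # -r: do not resolve addresses to names (the network is down anyway)
    # -d: device to capture on, -c: packet count limit, -o: write to file
    printf 'snoop -r -d %s -c %s -o %s\n' "$link" "$count" "$out"
}

# Example with assumed values; igb0 is a guess at the affected interface.
snoop_capture_cmd igb0 /tmp/host-offline.snoop 500
```

If the capture file stays empty while the guests keep answering, nothing
is reaching the host's own interface; if packets do show up, the problem
is more likely further up the stack.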

Cheers,
-- 
Saso

