[OmniOS-discuss] [discuss] COMSTAR hanging

John Barfield john.barfield at bissinc.com
Wed Jan 13 04:52:14 UTC 2016


Oh, I didn't catch that detail.

Okay, well, never mind :)


On Tue, Jan 12, 2016 at 8:21 PM -0800, "Brian Hechinger" <wonko at 4amlunch.net> wrote:

In my case the SATA disks aren’t on the 1068E.

-brian

On Jan 12, 2016, at 11:19 PM, John Barfield <john.barfield at bissinc.com> wrote:

BTW, I left out that it has the same LSI controller chipset.

_____________________________
From: John Barfield <john.barfield at bissinc.com>
Sent: Tuesday, January 12, 2016 10:17 PM
Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
To: <discuss at lists.illumos.org>, omnios-discuss <omnios-discuss at lists.omniti.com>


My input may or may not be valid, but I'm going to throw it out there anyway :)

Do you have any mpt disconnect errors in /var/adm/messages?

Also do you have smartmontools installed?
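
A rough sketch of the log check asked about above, in case it helps. The exact wording of the driver messages is an assumption here, so the patterns are deliberately loose (any line mentioning both "mpt" and "disconnect", case-insensitive), and count_mpt_disconnects is just a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: count log lines that mention both "mpt" and
# "disconnect" (case-insensitive). The real driver message text may
# differ, so treat both patterns as assumptions and adjust as needed.
count_mpt_disconnects() {
    # Default to the log file named in this thread.
    log="${1:-/var/adm/messages}"
    # First grep keeps mpt-related lines, second counts the disconnects.
    grep -i 'mpt' "$log" | grep -ci 'disconnect'
}
```

Running count_mpt_disconnects with no argument reads /var/adm/messages; pass a rotated log as the first argument to check older entries. A non-zero count is only a hint to look at the matching lines, not a diagnosis. If smartmontools is installed, smartctl -H against the suspect device reports its overall health status; the device path (and possibly a -d device type) depends on the system.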

I ran into similar issues recently just booting a Sun Fire X4540 from the OmniOS live image; I/O would just hang while probing device nodes.

I found the drive that was acting up and pulled it.

All of a sudden, everything worked perfectly.

I compiled smartmontools after I got it to boot and found 10 drives out of 48 with bad sectors in a pre-fail state.

I don't know if this happens with SAS drives or not, but I'm using SATA and saw that this was a common issue in old OpenSolaris threads.

-barfield

On Tue, Jan 12, 2016 at 8:08 PM -0800, "Brian Hechinger" <wonko at 4amlunch.net> wrote:

In the meantime I’ve removed the SLOG and L2ARC just in case. I don’t think that’s it though. At least I’ll have some sort of data point to work with here. :)

-brian

> On Jan 12, 2016, at 10:55 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>
> Ok, it has happened.
>
> Checking this here, the pool seems to be fine. I can read and write files.
>
> Except ‘zpool status’ is now hanging. I can still read/write from the pool, however.
>
> I can telnet to port 3260, but restarting target services has hung.
>
> root at basket1:/tank/Share# svcs -a | grep stmf
> online         Jan_05   svc:/system/stmf:default
> root at basket1:/tank/Share# svcs -a | grep target
> disabled       Jan_05   svc:/system/fcoe_target:default
> online         Jan_05   svc:/network/iscsi/target:default
> online         Jan_05   svc:/system/ibsrp/target:default
> root at basket1:/tank/Share# svcadm restart /system/ibsrp/target
> root at basket1:/tank/Share# svcadm restart /network/iscsi/target
> root at basket1:/tank/Share# svcadm restart /system/stmf
> root at basket1:/tank/Share# svcs -a | grep target
> disabled       Jan_05   svc:/system/fcoe_target:default
> online*        22:43:03 svc:/system/ibsrp/target:default
> online*        22:43:13 svc:/network/iscsi/target:default
> root at basket1:/tank/Share# svcs -a | grep stmf
> online*        22:43:18 svc:/system/stmf:default
> root at basket1:/tank/Share#
>
> I’m doing a crash dump reboot. I’ll post the output somewhere.
>
> The output of echo '$<threadlist' | mdb -k is attached.
>
> <threadlist.out>
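
A small sketch for the svcs check in the transcript above. In SMF an asterisk after the state marks an instance that is still in transition, so filtering on it lists the services a restart has left hanging. stuck_services is a hypothetical helper name, and it assumes the usual three-column STATE / STIME / FMRI layout shown above:

```shell
#!/bin/sh
# Hypothetical helper: read `svcs -a` style output on stdin and print
# the FMRI of every instance whose state ends in "*" (in transition).
# Assumes the three-column STATE / STIME / FMRI layout.
stuck_services() {
    awk '$1 ~ /\*$/ { print $3 }'
}
```

Typical use would be svcs -a | stuck_services. An instance that shows up briefly right after a restart is normal; one that stays listed for minutes, as stmf does here, is the thing to report.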
>
>> On Jan 8, 2016, at 3:11 PM, Matej Zerovnik <matej at zunaj.si> wrote:
>>
>> Is the pool usable during the COMSTAR hang?
>> Can you write to and read from the pool? Test both: in my case, when the pool froze, I wasn’t able to write to the pool, but I could read.
>>
>> Again, this might not be connected with COMSTAR, but in my case, COMSTAR hangs and pool hangs were alternating.
>>
>> Matej
>>
>>> On 08 Jan 2016, at 20:11, Brian Hechinger <wonko at 4amlunch.net> wrote:
>>>
>>> Yeah, I’m using the 1068E to boot from (this has been supported since before Illumos) but that doesn’t have anything accessed by COMSTAR.
>>>
>>> It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space from.
>>>
>>> -brian
>>>
>>>> On Jan 8, 2016, at 1:31 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
>>>>
>>>> First off, love SuperMicro; good choice IMHO.
>>>>
>>>> This board has two on board controllers.
>>>>
>>>> LSI SAS1068E (not 100% sure there are working illumos drivers for this one)
>>>>
>>>> And
>>>>
>>>> Intel ICH10R SATA (So I'm guessing you're using this one.)
>>>>
>>>> -----Original Message-----
>>>> From: OmniOS-discuss [ mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Brian Hechinger
>>>> Sent: Friday, January 08, 2016 12:16 PM
>>>> To: Matej Zerovnik <matej at zunaj.si>
>>>> Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
>>>> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
>>>>
>>>>
>>>>> Which controller exactly do you have?
>>>>
>>>> Whatever AHCI stuff is built into the motherboard. Motherboard is X8DTL-3F.
>>>>
>>>>> Do you know firmware version?
>>>>
>>>> I’m assuming this is linked to the BIOS version?
>>>>
>>>>> Which hard drives?
>>>>
>>>> Hitachi-HUA723030ALA640-MKAOAA50-2.73TB
>>>>
>>>>> It might not tell much, but it’s good to have as much information as possible.
>>>>>
>>>>> When comstar hangs, can you telnet to the iSCSI port?
>>>>> What does svcs say, is the service running?
>>>>> What happens if you try to restart it?
>>>>> How do you restart it?
>>>>
>>>> I’ll try all these things next time.
>>>>
>>>>> In my case, svcs reported the service as running, but when I tried to telnet, there was no connection, and there was no listening port open when checking with 'netstat -an'. I tried to restart the target and stmf services, but the stmf service got stuck in the online* state and would not start. A reboot was the only solution in my case, but as I said, the latest 014 release is working OK (but then again, the load got reduced).
>>>>
>>>> All good info. Thanks!
>>>>
>>>> -brian
>>>>
>>>>>
>>>>> Matej
>>>>>
>>>>>> On 08 Jan 2016, at 17:50, Dave Pooser <dave-oo at pooserville.com> wrote:
>>>>>>
>>>>>>>> On Jan 8, 2016, at 11:22 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>>>>>>>>
>>>>>>>> No, ZFS raid10
>>>>>>>
>>>>>>> Saw the HW-RAID term, and got concerned.  That's what, raidz2 in ZFS-ese?
>>>>>>
>>>>>> It's a zpool with multiple mirror vdevs.
>>>>>>
>>>>>> --
>>>>>> Dave Pooser
>>>>>> Cat-Herder-in-Chief, Pooserville.com
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> OmniOS-discuss mailing list
>>>>>> OmniOS-discuss at lists.omniti.com
>>>>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>>>>
>>>>
>>>
>>
>





