[OmniOS-discuss] [discuss] COMSTAR hanging

Brian Hechinger wonko at 4amlunch.net
Wed Jan 13 03:55:00 UTC 2016


Ok, it has happened.

The pool itself seems to be fine: I can still read and write files.

However, ‘zpool status’ is now hanging.

I can telnet to port 3260, but restarting target services has hung.

root@basket1:/tank/Share# svcs -a | grep stmf
online         Jan_05   svc:/system/stmf:default
root@basket1:/tank/Share# svcs -a | grep target
disabled       Jan_05   svc:/system/fcoe_target:default
online         Jan_05   svc:/network/iscsi/target:default
online         Jan_05   svc:/system/ibsrp/target:default
root@basket1:/tank/Share# svcadm restart /system/ibsrp/target
root@basket1:/tank/Share# svcadm restart /network/iscsi/target
root@basket1:/tank/Share# svcadm restart /system/stmf
root@basket1:/tank/Share# svcs -a | grep target
disabled       Jan_05   svc:/system/fcoe_target:default
online*        22:43:03 svc:/system/ibsrp/target:default
online*        22:43:13 svc:/network/iscsi/target:default
root@basket1:/tank/Share# svcs -a | grep stmf
online*        22:43:18 svc:/system/stmf:default
root@basket1:/tank/Share#
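
In svcs output, an asterisk after the state marks an instance that is in transition, so all three services above have begun the restart but not completed it. A minimal sketch for picking such instances out of `svcs` output follows; the name `in_transition` is hypothetical (it is not an SMF command), and it reads text on stdin so it also works on a pasted transcript.

```shell
# in_transition: hypothetical helper, not part of SMF.
# Reads `svcs` output on stdin and prints the FMRI (last column) of every
# instance whose STATE column ends in '*', SMF's "in transition" marker.
in_transition() {
    awk '$1 ~ /\*$/ { print $NF }'
}
```

On a live system this could be run as `svcs -a | in_transition`; an empty result would mean no instance is mid-transition.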

I’m doing a crash dump reboot. I’ll post the output somewhere.

The output of echo '$<threadlist' | mdb -k is attached.
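
A full thread list is large, so a narrower view may be easier to read. The sketch below only builds an mdb command string using the `::stacks` dcmd, which groups kernel threads that share an identical stack; its `-m` option limits the summary to stacks with a frame in the named module. The helper name `stacks_dcmd` is hypothetical, and using `stmf` as the module name is an assumption based on the service name.

```shell
# stacks_dcmd: hypothetical helper; it prints an mdb dcmd and runs nothing.
# ::stacks summarizes kernel threads by unique stack trace, and -m keeps
# only stacks that contain a frame from the given kernel module.
stacks_dcmd() {
    printf '::stacks -m %s\n' "$1"
}
```

For example, `stacks_dcmd stmf | mdb -k` would run it against the live kernel, in the same way as the threadlist command above.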

-------------- next part --------------
A non-text attachment was scrubbed...
Name: threadlist.out
Type: application/octet-stream
Size: 501272 bytes
Desc: not available
URL: <https://omniosce.org/ml-archive/attachments/20160112/b86a57c5/attachment-0001.obj>
-------------- next part --------------


> On Jan 8, 2016, at 3:11 PM, Matej Zerovnik <matej at zunaj.si> wrote:
> 
> Is the pool usable during the COMSTAR hang?
> Can you write to and read from the pool? Test both: in my case, when the pool froze, I wasn’t able to write to the pool, but I could read.
> 
> Again, this might not be connected with COMSTAR, but in my case, COMSTAR hangs and pool hangs alternated.
> 
> Matej
> 
>> On 08 Jan 2016, at 20:11, Brian Hechinger <wonko at 4amlunch.net> wrote:
>> 
>> Yeah, I’m using the 1068E to boot from (this has been supported since before Illumos) but that doesn’t have anything accessed by COMSTAR.
>> 
>> It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space from.
>> 
>> -brian
>> 
>>> On Jan 8, 2016, at 1:31 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
>>> 
>>> First off, love SuperMicro; good choice IMHO.
>>> 
>>> This board has two on board controllers.
>>> 
>>> LSI SAS1068E (not 100% sure there are working illumos drivers for this one)
>>> 
>>> And
>>> 
>>> Intel ICH10R SATA (So I'm guessing you're using this one.)
>>> 
>>> -----Original Message-----
>>> From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com] On Behalf Of Brian Hechinger
>>> Sent: Friday, January 08, 2016 12:16 PM
>>> To: Matej Zerovnik <matej at zunaj.si>
>>> Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
>>> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
>>> 
>>> 
>>>> Which controller exactly do you have?
>>> 
>>> Whatever AHCI stuff is built into the motherboard. Motherboard is X8DTL-3F.
>>> 
>>>> Do you know firmware version?
>>> 
>>> I’m assuming this is linked to the BIOS version?
>>> 
>>>> Which hard drives?
>>> 
>>> Hitachi-HUA723030ALA640-MKAOAA50-2.73TB
>>> 
>>>> It might not tell much, but it’s good to have as much information as possible.
>>>> 
>>>> When comstar hangs, can you telnet to the iSCSI port?
>>>> What does svcs say, is the service running?
>>>> What happens if you try to restart it?
>>>> How do you restart it?
>>> 
>>> I’ll try all these things next time.
>>> 
>>>> In my case, svcs reported the service as running, but when I tried to telnet there was no connection, and 'netstat -an' showed no listening port open. I tried to restart the target and stmf services, but the stmf service got stuck in the online* state and would not start. A reboot was the only solution in my case, but as I said, the latest 014 release is working OK (but then again, the load got reduced).
>>> 
>>> All good info. Thanks!
>>> 
>>> -brian
>>> 
>>>> 
>>>> Matej
>>>> 
>>>>> On 08 Jan 2016, at 17:50, Dave Pooser <dave-oo at pooserville.com> wrote:
>>>>> 
>>>>>>> On Jan 8, 2016, at 11:22 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
>>>>>>> 
>>>>>>> No, ZFS raid10
>>>>>> 
>>>>>> Saw the HW-RAID term, and got concerned.  That's what, raidz2 in ZFS-ese?
>>>>> 
>>>>> It's a zpool with multiple mirror vdevs.
>>>>> 
>>>>> -- 
>>>>> Dave Pooser
>>>>> Cat-Herder-in-Chief, Pooserville.com
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> OmniOS-discuss mailing list
>>>>> OmniOS-discuss at lists.omniti.com
>>>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>>>> 
>>> 
>> 
> 
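
The listener check mentioned in the quoted text (looking in 'netstat -an' for an open listening port) can be sketched as a small filter. This is an illustration only: the name `has_listener` is hypothetical, and it assumes the illumos `netstat -an` layout, where the local address is the first column in the form `address.port` and the state is the last column.

```shell
# has_listener: hypothetical helper, not a system command.
# Reads `netstat -an` output on stdin. Exits 0 if some line is in the
# LISTEN state with a first column (local address) ending in ".PORT",
# and exits 1 otherwise. Assumes the illumos column layout.
has_listener() {
    awk -v port="$1" '
        $NF == "LISTEN" && $1 ~ ("[.]" port "$") { found = 1 }
        END { exit !found }
    '
}
```

For the iSCSI target this could be used as `netstat -an | has_listener 3260 && echo listening`.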


