[OmniOS-discuss] [discuss] COMSTAR hanging

Brian Hechinger wonko at 4amlunch.net
Wed Jan 13 04:20:34 UTC 2016


I will look into doing this. The shared storage is on SATA disks, so maybe? They are new, though, so I hope they are fine. :)

I don’t see anything about mpt in /var/adm/messages, no.
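For reference, the check can be sketched like this. The log excerpt below is a fabricated sample standing in for /var/adm/messages (the mpt warning format is illustrative); on a live system you would grep the real file.

```shell
# Build a small stand-in for /var/adm/messages (contents are illustrative).
cat > /tmp/messages.sample <<'EOF'
Jan 12 22:40:01 basket1 scsi: [ID 107833 kern.warning] WARNING: mpt0:
Jan 12 22:40:01 basket1     Disconnected command timeout for Target 3
Jan 12 22:41:15 basket1 genunix: [ID 408114 kern.info] ahci0 online
EOF
# On a live system, point this at /var/adm/messages itself.
grep -ciE 'mpt|disconnect' /tmp/messages.sample
```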

-brian

> On Jan 12, 2016, at 11:16 PM, John Barfield <john.barfield at bissinc.com> wrote:
> 
> My input may or may not be valid, but I'm going to throw it out there anyway. :)
> 
> Do you have any mpt disconnect errors in /var/adm/messages?
> 
> Also, do you have smartmontools installed?
> 
> I ran into similar issues recently just booting a Sun Fire X4540 off of OmniOS live; I/O would just hang while probing device nodes.
> 
> I found the drive that was acting up and pulled it.
> 
> All of a sudden, everything miraculously worked.
> 
> I compiled smartmontools after I got it to boot and found 10 drives out of 48 with bad sectors in a pre-fail state.
> 
> I don't know if this happens with SAS drives or not, but I'm using SATA and saw this was a common issue in old OpenSolaris threads.
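As a sketch of what that smartmontools check looks like: on a live system you would run `smartctl -A` against each /dev/rdsk device (the exact device path and `-d` option vary by controller). Here a two-line sample of attribute output with made-up values stands in, and awk flags any sector-health attribute with a nonzero raw value:

```shell
# Sample smartctl -A attribute lines (values are illustrative).
cat > /tmp/smart.sample <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   095   095   005    Pre-fail  Always       -       120
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       8
EOF
# Flag attributes whose raw value (last column) is nonzero.
# On real hardware: smartctl -A /dev/rdsk/c0t0d0s0 (device path varies).
awk '$2 ~ /Reallocated|Pending/ && $NF+0 > 0 {print $2 "=" $NF}' /tmp/smart.sample
```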
> 
> -barfield
> 
> 
> 
> 
> On Tue, Jan 12, 2016 at 8:08 PM -0800, "Brian Hechinger" <wonko at 4amlunch.net <mailto:wonko at 4amlunch.net>> wrote:
> 
> In the meantime I’ve removed the SLOG and L2ARC just in case. I don’t think that’s it though. At least will have some sort of data point to work with here. :)
> 
> -brian
> 
> > On Jan 12, 2016, at 10:55 PM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> > 
> > Ok, it has happened.
> > 
> > Checking this here, the pool seems to be fine. I can read and write files.
> > 
> > Except that ‘zpool status’ is now hanging. I can still read/write from the pool, however.
> > 
> > I can telnet to port 3260, but restarting target services has hung.
> > 
> > root at basket1:/tank/Share# svcs -a | grep stmf
> > online         Jan_05   svc:/system/stmf:default
> > root at basket1:/tank/Share# svcs -a | grep target
> > disabled       Jan_05   svc:/system/fcoe_target:default
> > online         Jan_05   svc:/network/iscsi/target:default
> > online         Jan_05   svc:/system/ibsrp/target:default
> > root at basket1:/tank/Share# svcadm restart /system/ibsrp/target
> > root at basket1:/tank/Share# svcadm restart /network/iscsi/target
> > root at basket1:/tank/Share# svcadm restart /system/stmf
> > root at basket1:/tank/Share# svcs -a | grep target
> > disabled       Jan_05   svc:/system/fcoe_target:default
> > online*        22:43:03 svc:/system/ibsrp/target:default
> > online*        22:43:13 svc:/network/iscsi/target:default
> > root at basket1:/tank/Share# svcs -a | grep stmf
> > online*        22:43:18 svc:/system/stmf:default
> > root at basket1:/tank/Share#
> > 
> > I’m doing a crash dump reboot. I’ll post the output somewhere.
> > 
> > The output of echo '$<threadlist' | mdb -k is attached.
> > 
> > <threadlist.out>
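As an aside, the crash-dump procedure being described goes roughly like this on illumos (the crash directory path is the usual default, but check your dumpadm configuration; this is an ops sketch, not runnable outside an illumos box):

```shell
# Confirm the dump device and savecore directory are configured.
dumpadm
# Reboot and force a kernel crash dump on the way down.
reboot -d
# After the box comes back up, extract the dump (directory is the
# dumpadm default; yours may differ):
savecore -v /var/crash/basket1
# Examine the dump, or inspect the live kernel's thread list:
mdb -k unix.0 vmcore.0
echo '$<threadlist' | mdb -k > threadlist.out
```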
> > 
> >> On Jan 8, 2016, at 3:11 PM, Matej Zerovnik <matej at zunaj.si> wrote:
> >> 
> >> Is the pool usable during the COMSTAR hang?
> >> Can you read from and write to the pool? (Test both; in my case, when the pool froze, I wasn't able to write to it, but I could read.)
> >> 
> >> Again, this might not be connected to COMSTAR, but in my case, COMSTAR hangs and pool hangs alternated.
> >> 
> >> Matej
> >> 
> >>> On 08 Jan 2016, at 20:11, Brian Hechinger <wonko at 4amlunch.net> wrote:
> >>> 
> >>> Yeah, I’m using the 1068E to boot from (this has been supported since before Illumos) but that doesn’t have anything accessed by COMSTAR.
> >>> 
> >>> It’s the ICH10R SATA that hosts the disks that COMSTAR shares out space from.
> >>> 
> >>> -brian
> >>> 
> >>>> On Jan 8, 2016, at 1:31 PM, Richard Jahnel <rjahnel at ellipseinc.com> wrote:
> >>>> 
> >>>> First off, I love Supermicro; good choice IMHO.
> >>>> 
> >>>> This board has two on board controllers.
> >>>> 
> >>>> LSI SAS1068E (not 100% sure there are working illumos drivers for this one)
> >>>> 
> >>>> And
> >>>> 
> >>>> Intel ICH10R SATA (so I'm guessing you're using this one).
> >>>> 
> >>>> -----Original Message-----
> >>>> From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com <mailto:omnios-discuss-bounces at lists.omniti.com>] On Behalf Of Brian Hechinger
> >>>> Sent: Friday, January 08, 2016 12:16 PM
> >>>> To: Matej Zerovnik <matej at zunaj.si>
> >>>> Cc: omnios-discuss <omnios-discuss at lists.omniti.com>
> >>>> Subject: Re: [OmniOS-discuss] [discuss] COMSTAR hanging
> >>>> 
> >>>> 
> >>>>> Which controller exactly do you have?
> >>>> 
> >>>> Whatever AHCI stuff is built into the motherboard. The motherboard is an X8DTL-3F.
> >>>> 
> >>>>> Do you know firmware version?
> >>>> 
> >>>> I’m assuming this is linked to the BIOS version?
> >>>> 
> >>>>> Which hard drives?
> >>>> 
> >>>> Hitachi-HUA723030ALA640-MKAOAA50-2.73TB
> >>>> 
> >>>>> It might not tell much, but it’s good to have as much information as possible.
> >>>>> 
> >>>>> When comstar hangs, can you telnet to the iSCSI port?
> >>>>> What does svcs says, is the service running?
> >>>>> What happens if you try to restart it?
> >>>>> How do you restart it?
> >>>> 
> >>>> I’ll try all these things next time.
> >>>> 
> >>>>> In my case, svcs reported the service as running, but when I tried to telnet there was no connection, and 'netstat -an' showed no listening port. When I tried to restart the target and stmf services, the stmf service got stuck in the online* state and would not start. A reboot was the only solution in my case, but as I said, the latest 014 release is working OK (then again, the load got reduced).
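The listening-port check described above can be sketched as follows. The netstat excerpt is a fabricated sample; on a live box you would pipe the output of `netstat -an` itself.

```shell
# Fabricated excerpt of 'netstat -an' output for illustration.
cat > /tmp/netstat.sample <<'EOF'
      *.3260               *.*                0      0 128000      0 LISTEN
      *.22                 *.*                0      0 128000      0 LISTEN
EOF
# A healthy iSCSI target should show a LISTEN entry on port 3260.
grep -c '\.3260.*LISTEN' /tmp/netstat.sample
```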
> >>>> 
> >>>> All good info. Thanks!
> >>>> 
> >>>> -brian
> >>>> 
> >>>>> 
> >>>>> Matej
> >>>>> 
> >>>>>> On 08 Jan 2016, at 17:50, Dave Pooser <dave-oo at pooserville.com> wrote:
> >>>>>> 
> >>>>>>>> On Jan 8, 2016, at 11:22 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:
> >>>>>>>> 
> >>>>>>>> No, ZFS raid10
> >>>>>>> 
> >>>>>>> Saw the HW-RAID term, and got concerned.  That's what, raidz2 in ZFS-ese?
> >>>>>> 
> >>>>>> It's a zpool with multiple mirror vdevs.
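For clarity, a "ZFS raid10" layout means data striped across mirror vdevs rather than raidz2. A minimal sketch, with hypothetical device names:

```shell
# Hypothetical pool: data is striped across two 2-way mirror vdevs,
# i.e. the ZFS analogue of RAID10 (device names are illustrative).
zpool create tank \
  mirror c1t0d0 c1t1d0 \
  mirror c1t2d0 c1t3d0
zpool status tank   # would list mirror-0 and mirror-1 vdevs
```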
> >>>>>> 
> >>>>>> -- 
> >>>>>> Dave Pooser
> >>>>>> Cat-Herder-in-Chief, Pooserville.com
> >>>>>> 
> >>>>>> 
> >>>>>> _______________________________________________
> >>>>>> OmniOS-discuss mailing list
> >>>>>> OmniOS-discuss at lists.omniti.com
> >>>>>> http://lists.omniti.com/mailman/listinfo/omnios-discuss <http://lists.omniti.com/mailman/listinfo/omnios-discuss>
> >>>>> 
> >>>> 
> >>> 
> >> 
> > 
> 
> 
> 
> -------------------------------------------
> illumos-discuss
> Archives: https://www.listbox.com/member/archive/182180/=now <https://www.listbox.com/member/archive/182180/=now>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://omniosce.org/ml-archive/attachments/20160112/f096ae2b/attachment-0001.html>

