[OmniOS-discuss] Slow Drive Detection and boot-archive

Schweiss, Chip chip at innovates.com
Wed Jul 29 14:51:03 UTC 2015


On Fri, Jul 24, 2015 at 5:03 PM, Michael Talbott <mtalbott at lji.org> wrote:

> Hi,
>
> I've downgraded the cards (LSI 9211-8e) to v.19 and disabled their boot
> bios. But I'm still getting the 8 second per drive delay after the kernel
> loads. Any other ideas?
>
>
8 seconds per drive is way too long.  What JBODs and disks are you using?
Could it be that they are powered off and the delay is in waiting for the
power-on command to complete?  This could be sped up by using lsiutil to
send them all power-on commands first.
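[Editor's note: a minimal sketch of that pre-spin-up idea. Chip names lsiutil; as an alternative, sg3_utils' sg_start issues the same SCSI START UNIT command. The target IDs below are made up for illustration, and this is shown as a dry run that only prints the commands.]

```shell
# Print (rather than execute) a spin-up command for each disk so device
# discovery does not stall waiting on powered-down drives.
# sg_start is from sg3_utils; these device names are illustrative only.
for disk in c0t5000C500A1B2C3D4d0 c0t5000C500A1B2C3D5d0; do
  echo sg_start --start "/dev/rdsk/${disk}s0"
done
```

Dropping the `echo` would actually issue the commands; on a real system the disk list would come from /dev/rdsk or the format utility rather than being hard-coded.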

While I still consider it slow, my OmniOS systems with LSI HBAs discover
about 2 disks per second.  On systems with lots of disks, all multipathed,
that still adds up to a long time to discover them all.

-Chip


>
> ________________________
> Michael Talbott
> Systems Administrator
> La Jolla Institute
>
> > On Jul 20, 2015, at 11:27 PM, Floris van Essen ..:: House of Ancients
> Amstafs ::.. <info at houseofancients.nl> wrote:
> >
> > Michael,
> >
> > I know v20 does cause lots of issues.
> > V19, to the best of my knowledge, doesn't contain any, so I would
> downgrade to v19.
> >
> >
> > Kr,
> >
> >
> > Floris
> > -----Original message-----
> > From: OmniOS-discuss [mailto:omnios-discuss-bounces at lists.omniti.com]
> On behalf of Michael Talbott
> > Sent: Tuesday, 21 July 2015 4:57
> > To: Marion Hakanson <hakansom at ohsu.edu>
> > CC: omnios-discuss <omnios-discuss at lists.omniti.com>
> > Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
> >
> > Thanks for the reply. The bios for the card is disabled already. The 8
> second per drive scan happens after the kernel has already loaded and it is
> scanning for devices. I wonder if it's due to running newer firmware. I did
> update the cards to fw v.20.something before I moved to omnios. Is there a
> particular firmware version on the cards I should run to match OmniOS's
> drivers?
> >
> >
> > ________________________
> > Michael Talbott
> > Systems Administrator
> > La Jolla Institute
> >
> >> On Jul 20, 2015, at 6:06 PM, Marion Hakanson <hakansom at ohsu.edu> wrote:
> >>
> >> Michael,
> >>
> >> I've not seen this;  I do have one system with 120 drives and it
> >> definitely does not have this problem.  A couple with 80+ drives are
> >> also free of this issue, though they are still running OpenIndiana.
> >>
> >> One thing I pretty much always do here is to disable the boot option
> >> in the LSI HBA's config utility (accessible during boot, after the
> >> BIOS has started up).  I do this because I don't want the BIOS
> >> thinking it can boot from any of the external JBOD disks;  And also
> >> because I've had some system BIOS crashes when they tried to enumerate
> >> too many drives.  But, this all happens at the BIOS level, before the
> >> OS has even started up, so in theory it should not affect what you are
> >> seeing.
> >>
> >> Regards,
> >>
> >> Marion
> >>
> >>
> >> ================================================================
> >> Subject: Re: [OmniOS-discuss] Slow Drive Detection and boot-archive
> >> From: Michael Talbott <mtalbott at lji.org>
> >> Date: Fri, 17 Jul 2015 16:15:47 -0700
> >> To: omnios-discuss <omnios-discuss at lists.omniti.com>
> >>
> >> Just realized my typo. I'm using this on my 90 and 180 drive systems:
> >>
> >> # svccfg -s boot-archive setprop start/timeout_seconds=720
> >> # svccfg -s boot-archive setprop start/timeout_seconds=1440
> >>
> >> Seems like 8 seconds to detect each drive is pretty excessive.
> >>
> >> Any ideas on how to speed that up?
> >>
> >>
> >> ________________________
> >> Michael Talbott
> >> Systems Administrator
> >> La Jolla Institute
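[Editor's note: the arithmetic behind the timeouts above can be sketched as a small shell calculation. The 60-second safety margin here is an assumption, not a value from the thread; the svccfg/svcadm lines are shown as comments since they need root on a live system.]

```shell
# Derive a boot-archive timeout from drive count and the ~8 s/drive
# discovery delay seen in the dmesg output later in this thread.
drives=90
per_drive=8     # seconds per disk, as observed
margin=60       # assumed safety buffer, not from the thread
timeout=$((drives * per_drive + margin))
echo "$timeout"
# Apply and verify (root required; shown as comments only):
#   svccfg -s boot-archive setprop start/timeout_seconds=$timeout
#   svcadm refresh svc:/system/boot-archive:default
#   svcprop -p start/timeout_seconds boot-archive
```

Note that after `setprop`, an `svcadm refresh` is needed before the running service instance picks up the new timeout.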
> >>
> >>> On Jul 17, 2015, at 4:07 PM, Michael Talbott <mtalbott at lji.org> wrote:
> >>>
> >>> I have multiple NAS servers I've moved to OmniOS, and each of them has
> 90-180 4T disks. Everything has worked out pretty well for the most part,
> but I've run into an issue where, when I reboot any of them, I get
> boot-archive service timeouts. I found a workaround of increasing the
> timeout value, which brings me to the following. As you can see below in
> the dmesg output, the kernel takes about 8 seconds to detect each of the
> drives. They're connected via a couple of SAS2008-based LSI cards.
> >>>
> >>> Is this normal?
> >>> Is there a way to speed that up?
> >>>
> >>> I've fixed my frustrating boot-archive timeout problem by adjusting
> the timeout value from the default of 60 seconds (I guess that'll work ok
> on systems with fewer than 8 drives?) to 8 seconds * 90 drives + a little
> extra time = 280 seconds (for the 90 drive systems). That means it takes
> between 12 and 24 minutes to boot those machines up.
> >>>
> >>> # svccfg -s boot-archive setprop start/timeout_seconds=280
> >>>
> >>> I figure I can't be the only one. A little googling also revealed:
> >>> https://www.illumos.org/issues/4614
> >>>
> >>> Jul 17 15:40:15 store2 genunix: [ID 583861 kern.info] sd29 at mpt_sas3: unit-address w50000c0f0401bd43,0: w50000c0f0401bd43,0
> >>> Jul 17 15:40:15 store2 genunix: [ID 936769 kern.info] sd29 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bd43,0
> >>> Jul 17 15:40:16 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bd43,0 (sd29) online
> >>> Jul 17 15:40:24 store2 genunix: [ID 583861 kern.info] sd30 at mpt_sas3: unit-address w50000c0f045679c3,0: w50000c0f045679c3,0
> >>> Jul 17 15:40:24 store2 genunix: [ID 936769 kern.info] sd30 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045679c3,0
> >>> Jul 17 15:40:24 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045679c3,0 (sd30) online
> >>> Jul 17 15:40:33 store2 genunix: [ID 583861 kern.info] sd31 at mpt_sas3: unit-address w50000c0f045712b3,0: w50000c0f045712b3,0
> >>> Jul 17 15:40:33 store2 genunix: [ID 936769 kern.info] sd31 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045712b3,0
> >>> Jul 17 15:40:33 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f045712b3,0 (sd31) online
> >>> Jul 17 15:40:42 store2 genunix: [ID 583861 kern.info] sd32 at mpt_sas3: unit-address w50000c0f04571497,0: w50000c0f04571497,0
> >>> Jul 17 15:40:42 store2 genunix: [ID 936769 kern.info] sd32 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571497,0
> >>> Jul 17 15:40:42 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571497,0 (sd32) online
> >>> Jul 17 15:40:50 store2 genunix: [ID 583861 kern.info] sd33 at mpt_sas3: unit-address w50000c0f042ac8eb,0: w50000c0f042ac8eb,0
> >>> Jul 17 15:40:50 store2 genunix: [ID 936769 kern.info] sd33 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042ac8eb,0
> >>> Jul 17 15:40:50 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042ac8eb,0 (sd33) online
> >>> Jul 17 15:40:59 store2 genunix: [ID 583861 kern.info] sd34 at mpt_sas3: unit-address w50000c0f04571473,0: w50000c0f04571473,0
> >>> Jul 17 15:40:59 store2 genunix: [ID 936769 kern.info] sd34 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571473,0
> >>> Jul 17 15:40:59 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f04571473,0 (sd34) online
> >>> Jul 17 15:41:08 store2 genunix: [ID 583861 kern.info] sd35 at mpt_sas3: unit-address w50000c0f042c636f,0: w50000c0f042c636f,0
> >>> Jul 17 15:41:08 store2 genunix: [ID 936769 kern.info] sd35 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042c636f,0
> >>> Jul 17 15:41:08 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f042c636f,0 (sd35) online
> >>> Jul 17 15:41:17 store2 genunix: [ID 583861 kern.info] sd36 at mpt_sas3: unit-address w50000c0f0401bf2f,0: w50000c0f0401bf2f,0
> >>> Jul 17 15:41:17 store2 genunix: [ID 936769 kern.info] sd36 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bf2f,0
> >>> Jul 17 15:41:17 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bf2f,0 (sd36) online
> >>> Jul 17 15:41:25 store2 genunix: [ID 583861 kern.info] sd38 at mpt_sas3: unit-address w50000c0f0401bc1f,0: w50000c0f0401bc1f,0
> >>> Jul 17 15:41:25 store2 genunix: [ID 936769 kern.info] sd38 is /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bc1f,0
> >>> Jul 17 15:41:26 store2 genunix: [ID 408114 kern.info] /pci@0,0/pci8086,e06@2,2/pci1000,3080@0/iport@f/disk@w50000c0f0401bc1f,0 (sd38) online
> >>>
> >>>
> >>> ________________________
> >>> Michael Talbott
> >>> Systems Administrator
> >>> La Jolla Institute
> >>>
> >>
> >> _______________________________________________
> >> OmniOS-discuss mailing list
> >> OmniOS-discuss at lists.omniti.com
> >> http://lists.omniti.com/mailman/listinfo/omnios-discuss
> >>
> >>
> >
> > ...:: House of Ancients ::...
> > American Staffordshire Terriers
> >
> > +31-628-161-350
> > +31-614-198-389
> > Het Perk 48
> > 4903 RB
> > Oosterhout
> > Netherlands
> > www.houseofancients.nl
>
>
