[OmniOS-discuss] zfs pool 100% busy, disks less than 10%

Rune Tipsmark rt at steait.net
Fri Oct 31 19:38:01 UTC 2014


Ok, makes sense.
What other kinds of indicators can I look at?

I get decent results from dd, but it still feels a bit slow...

lz4 compression should not slow it down, right? The CPU is not doing much when copying data over, maybe 15% busy or so...

Sync=always, block size 1M
204800000000 bytes (205 GB) copied, 296.379 s, 691 MB/s
real    4m56.382s
user    0m0.461s
sys     3m12.662s

Sync=disabled, block size 1M
204800000000 bytes (205 GB) copied, 117.774 s, 1.7 GB/s
real    1m57.777s
user    0m0.237s
sys     1m57.466s
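
As a sanity check on the two runs above, here is a minimal Python sketch that recomputes the throughput dd reported and the share of wall-clock time spent in the kernel. The byte count and timings are copied from the output above; the helper names are made up for illustration, and the kernel-time ratio is plain arithmetic on the `time` figures, not a diagnosis.

```python
# Figures copied from the dd/time output above.
BYTES_COPIED = 204_800_000_000

def throughput_mb_s(nbytes, seconds):
    """Decimal megabytes per second, the unit dd prints."""
    return nbytes / seconds / 1e6

def to_seconds(minutes, seconds):
    """Convert a 'XmY.ZZZs' figure from time(1) into seconds."""
    return minutes * 60 + seconds

sync_always = throughput_mb_s(BYTES_COPIED, 296.379)
sync_disabled = throughput_mb_s(BYTES_COPIED, 117.774)
speedup = sync_disabled / sync_always

# Fraction of wall-clock time that dd spent in system (kernel) time.
sys_share_always = to_seconds(3, 12.662) / to_seconds(4, 56.382)
sys_share_disabled = to_seconds(1, 57.466) / to_seconds(1, 57.777)

print(f"sync=always:   {sync_always:.0f} MB/s, sys/real = {sys_share_always:.0%}")
print(f"sync=disabled: {sync_disabled:.0f} MB/s, sys/real = {sys_share_disabled:.0%}")
print(f"speedup with sync disabled: {speedup:.2f}x")
```

The recomputed rates agree with what dd printed (691 MB/s and roughly 1.7 GB/s).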

... while doing this I was looking at my FIO cards. I think the reason is that the SLCs need more power to deliver higher performance: they are supposed to deliver 1.5 GB/sec but only deliver around 350 MB/sec each.

Now looking for aux power cables and will retest...

Br,
Rune

-----Original Message-----
From: Richard Elling [mailto:richard.elling at richardelling.com] 
Sent: Friday, October 31, 2014 9:03 AM
To: Eric Sproul
Cc: Rune Tipsmark; omnios-discuss at lists.omniti.com
Subject: Re: [OmniOS-discuss] zfs pool 100% busy, disks less than 10%


On Oct 31, 2014, at 7:14 AM, Eric Sproul <eric.sproul at circonus.com> wrote:

> On Fri, Oct 31, 2014 at 2:33 AM, Rune Tipsmark <rt at steait.net> wrote:
> 
>> Why is this pool showing near 100% busy when the underlying disks are 
>> doing nothing at all....
> 
> Simply put, it's just how the accounting works in iostat.  It treats 
> the pool like any other device, so if there is even one outstanding 
> request to the pool, it counts towards the busy%.  Keith W. from 
> Joyent explained this recently on the illumos-zfs list:
> http://www.listbox.com/member/archive/182191/2014/10/sort/time_rev/page/3/entry/18:93/20141017161955:F3E11AB2-563A-11E4-8EDC-D0C677981E2F/
> 
> The TL;DR is: if your pool has more than one disk in it, the pool-wide 
> busy% is useless.
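
The accounting described above can be illustrated with a toy model in Python. This is only a sketch under stated assumptions, not the actual iostat or kstat code: it assumes busy% is the fraction of the sampling window during which at least one request is outstanding, and the disk names and timings are invented. Ten disks that are each busy for a different 10% of the window make the pool look 100% busy.

```python
def busy_fraction(intervals, window):
    """Fraction of the window covered by the union of (start, end) intervals.

    Toy model: a device counts as busy whenever at least one request
    is outstanding, regardless of how many.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged run.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / window

WINDOW_MS = 1000   # hypothetical 1-second sampling window
N_DISKS = 10       # hypothetical pool of ten disks

# Each disk has one request outstanding for its own 100 ms slice.
per_disk = {
    f"disk{i}": [(i * 100, (i + 1) * 100)] for i in range(N_DISKS)
}
# The pool sees every request issued to any of its disks.
pool = [iv for ivs in per_disk.values() for iv in ivs]

disk_busy = {name: busy_fraction(ivs, WINDOW_MS) for name, ivs in per_disk.items()}
pool_busy = busy_fraction(pool, WINDOW_MS)

for name, frac in disk_busy.items():
    print(f"{name}: {frac:.0%} busy")
print(f"pool:  {pool_busy:.0%} busy")
```

In this model every disk reports 10% busy while the pool reports 100%, which matches the symptom in the subject line.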

FWIW, we use %busy as an indicator that we can ignore a device/subsystem when looking for performance problems. We don't use it as an indicator of problems. In other words, if the device isn't > 10% busy, forgetabouddit. If it is more busy, look in more detail at the meaningful performance indicators.
 -- richard


