<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><br><div><div>On Dec 2, 2014, at 10:11 PM, wuffers <<a href="mailto:moo@wuffers.net">moo@wuffers.net</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">Just wanted to update this thread with my results. I had the scrub started since late July, and checked on it at the end of Oct:<div><br></div><div><div> pool: tank</div><div> state: ONLINE</div><div> scan: scrub in progress since Tue Jul 29 15:41:27 2014</div><div> 7.65T scanned out of 24.5T at 1024K/s, (scan is slow, no estimated time)</div><div> 0 repaired, 31.16% done</div></div><div><br></div><div>So it was scrubbing away for ~90 days and got a bit over 31% with vdev_scrub_max set to 5.</div><div><br></div><div>In anticipation of having to power off the head nodes for the building's power maintenance and not wanting to waste the scrub so far, I pushed the vdev_scrub_max to 10. The scrub then proceeded to accelerate to about 8% per day, and ended:</div><div><br></div><div><div> pool: tank</div><div> state: ONLINE</div><div> scan: scrub repaired 0 in 2372h54m with 0 errors on Wed Nov 5 11:36:00 2014</div></div><div><br></div><div>My pool wasn't busy so I was okay with it having so much time slice, but this obviously isn't the best thing to do. Is this really normal, or is there something wrong with my config?</div></div></blockquote><div><br></div><div>The default settings for zfs_vdev_scrub_max_active are (mostly) a guess. It is expected</div><div>that you can tune. It would not be surprising for the default to change as we get more,</div><div>varied field experience. Thanks for sharing your results!</div><div> -- richard</div><br><blockquote type="cite"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jul 31, 2014 at 3:44 PM, wuffers <span dir="ltr"><<a href="mailto:moo@wuffers.net" target="_blank">moo@wuffers.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra">This is going to be long winded as well (apologies!).. lots of pasted data.</div><div class="gmail_extra"><br><div class="gmail_quote"><span class="">On Thu, Jul 31, 2014 at 1:37 AM, Richard Elling <span dir="ltr"><<a href="mailto:richard.elling@richardelling.com" target="_blank">richard.elling@richardelling.com</a>></span> wrote:<br> <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div> <div><br></div></div><div>The %busy for controllers is a sum of the %busy for all disks on the controller, so</div><div>is can be large, but overall isn't interesting. With HDDs, there is no way you can </div><div> saturate the controller, so we don't really care what the %busy shows.</div><div><br></div><div>The more important item is that the number of read ops is fairly low for all but 4 disks.</div><div>Since you didn't post the pool configuration, we can only guess that they might be a</div> <div>souce of the bottleneck. </div><div><br></div><div>You're seeing a lot of reads from the cache devices. How much RAM does this system</div><div>have?</div><div><br></div></div></blockquote><div><br> </div></span><div>I realized that the busy % was a sum after looking through some of that data, but good to know that it isn't very relevant. </div><div><br></div><div>The pool configuration was in the original post, but here it is again (after re-attaching the mirror log device). Just saw your edit, but this has been updated from the original post anyways.</div> <div><br></div><div><span class=""><div> pool: tank</div><div> state: ONLINE</div><div> scan: scrub in progress since Tue Jul 29 15:41:27 2014</div></span><div> 82.5G scanned out of 24.5T at 555K/s, (scan is slow, no estimated time)</div> <div> 0 repaired, 0.33% done</div><div><div class="h5"><div>config:</div><div><br></div><div> NAME STATE READ WRITE CKSUM</div><div> tank ONLINE 0 0 0</div><div> mirror-0 ONLINE 0 0 0</div><div> c1t5000C50055F9F637d0 ONLINE 0 0 0</div><div> c1t5000C50055F9EF2Fd0 ONLINE 0 0 0</div><div> mirror-1 ONLINE 0 0 0</div> <div> c1t5000C50055F87D97d0 ONLINE 0 0 0</div><div> c1t5000C50055F9D3B3d0 ONLINE 0 0 0</div><div> mirror-2 ONLINE 0 0 0</div><div> c1t5000C50055E6606Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9F92Bd0 ONLINE 0 0 0</div><div> mirror-3 ONLINE 0 0 0</div><div> c1t5000C50055F856CFd0 ONLINE 0 0 0</div> <div> c1t5000C50055F9FE87d0 ONLINE 0 0 0</div><div> mirror-4 ONLINE 0 0 0</div><div> c1t5000C50055F84A97d0 ONLINE 0 0 0</div><div> c1t5000C50055FA0AF7d0 ONLINE 0 0 0</div><div> mirror-5 ONLINE 0 0 0</div><div> c1t5000C50055F9D3E3d0 ONLINE 0 0 0</div><div> c1t5000C50055F9F0B3d0 ONLINE 0 0 0</div> <div> mirror-6 ONLINE 0 0 0</div><div> c1t5000C50055F8A46Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9FB8Bd0 ONLINE 0 0 0</div><div> mirror-7 ONLINE 0 0 0</div><div> c1t5000C50055F8B21Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9F89Fd0 ONLINE 0 0 0</div><div> mirror-8 ONLINE 0 0 0</div> <div> c1t5000C50055F8BE3Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9E123d0 ONLINE 0 0 0</div><div> mirror-9 ONLINE 0 0 0</div><div> c1t5000C50055F9379Bd0 ONLINE 0 0 0</div><div> c1t5000C50055F9E7D7d0 ONLINE 0 0 0</div><div> mirror-10 ONLINE 0 0 0</div><div> c1t5000C50055E65F0Fd0 ONLINE 0 0 0</div> <div> c1t5000C50055F9F80Bd0 ONLINE 0 0 0</div><div> mirror-11 ONLINE 0 0 0</div><div> c1t5000C50055F8A22Bd0 ONLINE 0 0 0</div><div> c1t5000C50055F8D48Fd0 ONLINE 0 0 0</div><div> mirror-12 ONLINE 0 0 0</div><div> c1t5000C50055E65807d0 ONLINE 0 0 0</div><div> c1t5000C50055F8BFA3d0 ONLINE 0 0 0</div> <div> mirror-13 ONLINE 0 0 0</div><div> c1t5000C50055E579F7d0 ONLINE 0 0 0</div><div> c1t5000C50055E65877d0 ONLINE 0 0 0</div><div> mirror-14 ONLINE 0 0 0</div><div> c1t5000C50055F9FA1Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F8CDA7d0 ONLINE 0 0 0</div><div> mirror-15 ONLINE 0 0 0</div> <div> c1t5000C50055F8BF9Bd0 ONLINE 0 0 0</div><div> c1t5000C50055F9A607d0 ONLINE 0 0 0</div><div> mirror-16 ONLINE 0 0 0</div><div> c1t5000C50055E66503d0 ONLINE 0 0 0</div><div> c1t5000C50055E4FDE7d0 ONLINE 0 0 0</div><div> mirror-17 ONLINE 0 0 0</div><div> c1t5000C50055F8E017d0 ONLINE 0 0 0</div> <div> c1t5000C50055F9F3EBd0 ONLINE 0 0 0</div><div> mirror-18 ONLINE 0 0 0</div><div> c1t5000C50055F8B80Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9F63Bd0 ONLINE 0 0 0</div><div> mirror-19 ONLINE 0 0 0</div><div> c1t5000C50055F84FB7d0 ONLINE 0 0 0</div><div> c1t5000C50055F9FEABd0 ONLINE 0 0 0</div> <div> mirror-20 ONLINE 0 0 0</div><div> c1t5000C50055F8CCAFd0 ONLINE 0 0 0</div><div> c1t5000C50055F9F91Bd0 ONLINE 0 0 0</div><div> mirror-21 ONLINE 0 0 0</div><div> c1t5000C50055E65ABBd0 ONLINE 0 0 0</div><div> c1t5000C50055F8905Fd0 ONLINE 0 0 0</div><div> mirror-22 ONLINE 0 0 0</div> <div> c1t5000C50055E57A5Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F87E73d0 ONLINE 0 0 0</div><div> mirror-23 ONLINE 0 0 0</div><div> c1t5000C50055E66053d0 ONLINE 0 0 0</div><div> c1t5000C50055E66B63d0 ONLINE 0 0 0</div><div> mirror-24 ONLINE 0 0 0</div><div> c1t5000C50055F8723Bd0 ONLINE 0 0 0</div> <div> c1t5000C50055F8C3ABd0 ONLINE 0 0 0</div><div> logs</div></div></div><div> mirror-25 ONLINE 0 0 0</div><span class=""><div> c2t5000A72A3007811Dd0 ONLINE 0 0 0</div> </span><div> c12t5000A72B300780FFd0 ONLINE 0 0 0</div><span class=""><div> cache</div><div> c2t500117310015D579d0 ONLINE 0 0 0</div><div> c2t50011731001631FDd0 ONLINE 0 0 0</div> <div> c12t500117310015D59Ed0 ONLINE 0 0 0</div><div> c12t500117310015D54Ed0 ONLINE 0 0 0</div><div> spares</div><div> c1t5000C50055FA2AEFd0 AVAIL</div> <div> c1t5000C50055E595B7d0 AVAIL</div></span></div><div><br></div><div>Basically, this is 2 head nodes (Supermicro 826BE26) connected to a Supermicro 847E26 JBOD, using LSI 9207s. There are 52 Seagate ST4000NM0023s (4TB SAS drives) in 25 mirror pairs plus 2 which are spares. There are 4 Smart Optimus 400GB SSDs as cache drives, and 2 Stec ZeusRAMs for slogs. They're wired in such a way that both nodes can see all the drives (data, cache and log), and the data drives are on separate controllers than the cache/slog devices. RSF-1 was also specced in here but not in use at the moment. All the SAN traffic is through InfiniBand (SRP). Each head unit has 256GB of RAM. Dedupe is not in use and all the latest feature flags are enabled.</div> <div> </div><div>An arc_summary output:</div><div><br></div><div><div>System Memory:</div><div> Physical RAM: 262103 MB</div><div> Free Memory : 10273 MB</div><div> LotsFree: 4095 MB</div><div> <br></div><div>ZFS Tunables (/etc/system):</div><div> set zfs:zfs_arc_shrink_shift = 10</div><div><br></div><div>ARC Size:</div><div> Current Size: 225626 MB (arcsize)</div><div> Target Size (Adaptive): 225626 MB (c)</div> <div> Min Size (Hard Limit): 8190 MB (zfs_arc_min)</div><div> Max Size (Hard Limit): 261079 MB (zfs_arc_max)</div><div><br></div><div>ARC Size Breakdown:</div><div> Most Recently Used Cache Size: 10% 23290 MB (p)</div> <div> Most Frequently Used Cache Size: 89% 202335 MB (c-p)</div><div><br></div><div>ARC Efficency:</div><div> Cache Access Total: 27377320465</div><div> Cache Hit Ratio: 93% 25532510784 [Defined State for buffer]</div> <div> Cache Miss Ratio: 6% 1844809681 [Undefined State for Buffer]</div><div> REAL Hit Ratio: 92% 25243933796 [MRU/MFU Hits Only]</div><div><br></div><div> Data Demand Efficiency: 95%</div> <div> Data Prefetch Efficiency: 40%</div><div><br></div><div> CACHE HITS BY CACHE LIST:</div><div> Anon: --% Counter Rolled.</div><div> Most Recently Used: 18% 4663226393 (mru) [ Return Customer ]</div> <div> Most Frequently Used: 80% 20580707403 (mfu) [ Frequent Customer ]</div><div> Most Recently Used Ghost: 0% 176686906 (mru_ghost) [ Return Customer Evicted, Now Back ]</div> <div> Most Frequently Used Ghost: 0% 126286869 (mfu_ghost) [ Frequent Customer Evicted, Now Back ]</div><div> CACHE HITS BY DATA TYPE:</div><div> Demand Data: 95% 24413671342</div> <div> Prefetch Data: 1% 358419705</div><div> Demand Metadata: 2% 698314899</div><div> Prefetch Metadata: 0% 62104838</div><div> CACHE MISSES BY DATA TYPE:</div> <div> Demand Data: 69% 1277347273</div><div> Prefetch Data: 28% 519579788</div><div> Demand Metadata: 2% 39512363</div><div> Prefetch Metadata: 0% 8370257</div> </div><div><br></div><div>And a sample of arcstat (deleted first line of output):</div><div><br></div><div><div># arcstat -f read,hits,miss,hit%,l2read,l2hits,l2miss,l2hit%,arcsz,l2size,l2asize 1</div><div><br></div><div> read hits miss hit% l2read l2hits l2miss l2hit% arcsz l2size l2asize</div><div>5.9K 4.6K 1.3K 78 1.3K 1.2K 80 93 220G 1.6T 901G<br></div><div>6.7K 5.2K 1.5K 76 1.5K 1.3K 250 83 220G 1.6T 901G</div> <div>7.0K 5.3K 1.7K 76 1.7K 1.4K 316 81 220G 1.6T 901G</div><div>6.5K 5.3K 1.2K 80 1.2K 1.1K 111 91 220G 1.6T 901G</div><div>6.4K 5.2K 1.2K 81 1.2K 1.1K 100 91 220G 1.6T 901G</div> <div>7.2K 5.6K 1.6K 78 1.6K 1.3K 289 81 220G 1.6T 901G</div><div>8.5K 6.8K 1.7K 80 1.7K 1.3K 351 79 220G 1.6T 901G</div><div>7.5K 5.9K 1.6K 78 1.6K 1.3K 282 82 220G 1.6T 901G</div> <div>6.7K 5.6K 1.1K 83 1.1K 991 123 88 220G 1.6T 901G</div><div>6.8K 5.5K 1.3K 80 1.3K 1.1K 234 82 220G 1.6T 901G</div></div><div><br></div><div>Interesting to see only an l2asize of 901G even though I should have more.. 373G x 4 is just under 1.5TB of raw storage. The compressed l2arc size is 1.6TB, while actual used space is 901G. I expect more to be used. Perhaps Saso can comment on this portion, if he's following this thread (snipped from "zpool iostat -v"):</div> <div><br></div><div><div>cache - - - - - -</div><div> c2t500117310015D579d0 373G 8M 193 16 2.81M 394K</div><div> c2t50011731001631FDd0 373G 5.29M 194 15 2.85M 360K</div> <div> c12t500117310015D59Ed0 373G 5.50M 191 17 2.74M 368K</div><div> c12t500117310015D54Ed0 373G 5.57M 200 14 2.89M 300K</div></div><div><br></div><div>(from this discussion here: <a href="http://lists.omniti.com/pipermail/omnios-discuss/2014-February/002287.html" target="_blank">http://lists.omniti.com/pipermail/omnios-discuss/2014-February/002287.html</a>), and the uptime on this is currently around ~58 days, so it should have had enough time to rotate through the l2arc "rotor".</div><span class=""> <div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div> <div><br></div></div><div>methinks the scrub I/Os are getting starved and since they are low priority, they </div><div>could get very starved. In general, I wouldn't worry about it, but I understand </div><div>why you might be nervous. Keep in mind that in ZFS scrubs are intended to find </div> <div>errors on idle data, not frequently accessed data.</div><div><br></div><div>more far below...</div><div><br></div></div></blockquote><div><br></div></span><div>I'm worried because there's no way the scrub will ever complete before the next reboot. Regular scrubs are important, right?</div><span class=""> <div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="word-wrap:break-word"><div> <div><br></div></div><div>ok, so the pool is issuing 720 read iops, including resilver workload, vs 1298 write iops.</div><div>There is plenty of I/O capacity left on the table here, as you can see by the %busy being</div> <div>so low.</div><div><br></div><div>So I think the pool is not scheduling scrub I/Os very well. You can increase the number of</div><div>scrub I/Os in the scheduler by adjusting the zfs_vdev_scrub_max_active tunable. The</div> <div>default is 2, but you'll have to consider that a share (in the stock market sense) where</div><div>the active sync reads and writes are getting 10 each. You can try bumping up the value</div><div>and see what happens over some time, perhaps 10 minutes or so -- too short of a time</div> <div>and you won't get a good feeling for the impact (try this in off-peak time).</div><div><span style="white-space:pre-wrap"> </span>echo zfs_vdev_scrub_max_active/W0t5 | mdb -kw</div><div>will change the value from 2 to 5, increasing its share of the total I/O workload.</div> <div><br></div><div>You can see the progress of scan (scrubs do scan) workload by looking at the ZFS</div><div>debug messages.</div><div><span style="white-space:pre-wrap"> </span>echo ::zfs_dbgmsg | mdb -k</div><div>These will look mysterious... they are. But the interesting bits are about how many blocks</div> <div>are visited in some amount of time (txg sync interval). Ideally, this will change as you </div><div>adjust zfs_vdev_scrub_max_active.</div><span><font color="#888888"> -- richard</font></span><div> <div><br></div></div></div></blockquote></span></div><br></div><div class="gmail_extra">Actually, you used the data from before the resilver. During resilver this was the activity on the pool:</div><span class=""><div class="gmail_extra"> <br></div><div class="gmail_extra"><span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">3883.5 1357.7 40141.6 60739.5 22.8 38.6 4.4 7.4 54 100 tank</span><br></div><div class="gmail_extra"> <span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"><br></span></div></span><div class="gmail_extra"><span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">Are you looking at an individual drive's busy % or the pool's busy % to determine whether it's "busy"? During the resilver this was the activity on the drives (a sample - between 38-45%, whereas during the scrub the individual drives were 2-5% busy):</span></div> <div class="gmail_extra"><span style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"><br></span></div><div class="gmail_extra"><span class=""><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> 59.5 30.0 553.2 741.8 0.0 0.9 0.0 9.8 0 45 c1t5000C50055F8A46Fd0</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> 57.4 22.5 504.0 724.5 0.0 0.8 0.0 9.6 0 41 c1t5000C50055F856CFd0</div> <div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> 58.4 24.6 531.4 786.9 0.0 0.7 0.0 8.4 0 38 c1t5000C50055E6606Fd0</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> <br></div></span><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">But yes, without the resilver the busy % was much less (during the scrub each individual drive was 2-4% busy). I've pasted the current iostat output further below.</div> <div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"><br></div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">With the zfs_vdev_scrub_max_active at the default of 2, it was doing an average of 162 blocks:</div> <div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"><br></div><div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">doing scan sync txg 26678243; bm=897/1/0/15785978<br> </div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">scanned dataset 897 (tank/vmware-64k-5tb-1) with min=1 max=26652167; pausing=1</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> visited 162 blocks in 6090ms</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">doing scan sync txg 26678244; bm=897/1/0/15786126</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> scanned dataset 897 (tank/vmware-64k-5tb-1) with min=1 max=26652167; pausing=1</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">visited 162 blocks in 6094ms</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> <br></div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px">After changing it to 5, and waiting about 20 mins, I'm not seeing anything significantly different:</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> <br></div><div><div><span style="color:rgb(80,0,80);font-family:arial,sans-serif">doing scan sync txg 26678816; bm=897/1/0/37082043</span><br></div><div><font color="#500050" face="arial, sans-serif">scanned dataset 897 (tank/vmware-64k-5tb-1) with min=1 max=26652167; pausing=1</font></div> <div><font color="#500050" face="arial, sans-serif">visited 163 blocks in 6154ms</font></div><div><font color="#500050" face="arial, sans-serif">doing scan sync txg 26678817; bm=897/1/0/37082193</font></div><div> <font color="#500050" face="arial, sans-serif">scanned dataset 897 (tank/vmware-64k-5tb-1) with min=1 max=26652167; pausing=1</font></div><div><font color="#500050" face="arial, sans-serif">visited 162 blocks in 6128ms</font></div> </div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"><br></div><div><span class=""><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> pool: tank</div> <div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> state: ONLINE</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> scan: scrub in progress since Tue Jul 29 15:41:27 2014</div> </span><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> 97.0G scanned out of 24.5T at 599K/s, (scan is slow, no estimated time)</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> 0 repaired, 0.39% done</div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"><br></div><div><font color="#500050" face="arial, sans-serif">I'll keep the </font><span style="color:rgb(34,34,34);font-family:arial;font-size:small">zfs_vdev_scrub_max_active tunable to 5, as it doesn't appear to be impacting too much, and monitor for changes. What's strange to me is that it was "humming" along at 5.5MB/s at the 2 week mark but is now 10x slower (compared to before reattaching the mirror log device). It *seems* marginally faster, from 541K/s to almost 600K/s..</span></div> <div><span style="color:rgb(34,34,34);font-family:arial;font-size:small"><br></span></div><div><span style="color:rgb(34,34,34);font-family:arial;font-size:small">This is the current activity from "</span>iostat -xnCz 60 2":</div> <div><br></div><div><span class=""><div> extended device statistics</div><div> r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device</div></span><div> 158.8 1219.2 3717.8 39969.8 0.0 1.6 0.0 1.1 0 139 c1</div> <div> 3.6 35.1 86.2 730.9 0.0 0.0 0.0 0.9 0 3 c1t5000C50055F8723Bd0</div><div> 3.7 19.9 83.7 789.9 0.0 0.0 0.0 1.4 0 3 c1t5000C50055E66B63d0</div><div> 2.7 22.5 60.8 870.9 0.0 0.0 0.0 1.1 0 2 c1t5000C50055F87E73d0</div> <div> 2.4 27.9 66.0 765.8 0.0 0.0 0.0 0.8 0 2 c1t5000C50055F8BFA3d0</div><div> 2.8 17.9 64.9 767.0 0.0 0.0 0.0 0.8 0 1 c1t5000C50055F9E123d0</div><div> 3.1 26.1 73.8 813.3 0.0 0.0 0.0 0.9 0 2 c1t5000C50055F9F0B3d0</div> <div> 3.1 15.5 79.4 783.4 0.0 0.0 0.0 1.3 0 2 c1t5000C50055F9D3B3d0</div><div> 3.8 38.6 86.2 826.8 0.0 0.1 0.0 1.2 0 4 c1t5000C50055E4FDE7d0</div><div> 3.8 15.4 93.0 822.3 0.0 0.0 0.0 1.5 0 3 c1t5000C50055F9A607d0</div> <div> 3.0 25.7 79.4 719.7 0.0 0.0 0.0 0.9 0 2 c1t5000C50055F8CDA7d0</div><div> 3.2 26.5 69.0 824.3 0.0 0.0 0.0 1.1 0 3 c1t5000C50055E65877d0</div><div> 3.7 42.6 79.2 834.1 0.0 0.1 0.0 1.3 0 5 c1t5000C50055F9E7D7d0</div> <div> 3.3 23.2 79.5 778.0 0.0 0.0 0.0 1.2 0 3 c1t5000C50055FA0AF7d0</div><div> 3.4 30.2 77.0 805.9 0.0 0.0 0.0 0.9 0 3 c1t5000C50055F9FE87d0</div><div> 3.0 15.4 72.6 795.0 0.0 0.0 0.0 1.6 0 3 c1t5000C50055F9F91Bd0</div> <div> 2.5 38.1 61.1 859.4 0.0 0.1 0.0 1.6 0 5 c1t5000C50055F9FEABd0</div><div> 2.1 13.2 42.7 801.6 0.0 0.0 0.0 1.6 0 2 c1t5000C50055F9F63Bd0</div><div> 3.0 20.0 62.6 766.6 0.0 0.0 0.0 1.1 0 2 c1t5000C50055F9F3EBd0</div> <div> 3.7 24.3 80.2 807.9 0.0 0.0 0.0 1.0 0 2 c1t5000C50055F9F80Bd0</div><div> 3.2 35.2 66.1 852.4 0.0 0.0 0.0 1.2 0 4 c1t5000C50055F9FB8Bd0</div><div> 3.9 30.6 84.7 845.7 0.0 0.0 0.0 0.8 0 3 c1t5000C50055F9F92Bd0</div> <div> 2.7 18.1 68.8 831.4 0.0 0.0 0.0 1.4 0 2 c1t5000C50055F8905Fd0</div><div> 2.7 17.7 61.4 762.1 0.0 0.0 0.0 1.0 0 2 c1t5000C50055F8D48Fd0</div><div> 3.5 17.5 87.8 749.7 0.0 0.0 0.0 1.7 0 3 c1t5000C50055F9F89Fd0</div> <div> 2.6 13.7 58.6 780.9 0.0 0.0 0.0 1.7 0 3 c1t5000C50055F9EF2Fd0</div><div> 3.3 34.9 74.5 730.9 0.0 0.0 0.0 0.8 0 3 c1t5000C50055F8C3ABd0</div><div> 3.1 19.3 64.7 789.9 0.0 0.0 0.0 1.0 0 2 c1t5000C50055E66053d0</div> <div> 3.8 38.5 82.9 826.8 0.0 0.1 0.0 1.3 0 4 c1t5000C50055E66503d0</div><div> 3.7 25.8 91.4 813.3 0.0 0.0 0.0 0.8 0 2 c1t5000C50055F9D3E3d0</div><div> 2.2 37.9 52.5 859.4 0.0 0.0 0.0 1.1 0 4 c1t5000C50055F84FB7d0</div> <div> 2.8 20.0 62.8 766.6 0.0 0.0 0.0 1.0 0 2 c1t5000C50055F8E017d0</div><div> 3.9 26.1 86.5 824.3 0.0 0.0 0.0 1.1 0 3 c1t5000C50055E579F7d0</div><div> 3.1 27.7 79.9 765.8 0.0 0.0 0.0 1.2 0 3 c1t5000C50055E65807d0</div> <div> 2.9 22.8 76.3 778.0 0.0 0.0 0.0 1.1 0 3 c1t5000C50055F84A97d0</div><div> 3.6 15.3 89.0 783.4 0.0 0.0 0.0 1.7 0 3 c1t5000C50055F87D97d0</div><div> 2.8 13.8 77.9 780.9 0.0 0.0 0.0 1.5 0 2 c1t5000C50055F9F637d0</div> <div> 2.1 18.3 51.4 831.4 0.0 0.0 0.0 1.1 0 2 c1t5000C50055E65ABBd0</div><div> 3.1 15.4 70.9 822.3 0.0 0.0 0.0 1.2 0 2 c1t5000C50055F8BF9Bd0</div><div> 3.2 17.9 75.5 762.1 0.0 0.0 0.0 1.2 0 2 c1t5000C50055F8A22Bd0</div> <div> 3.7 42.4 83.3 834.1 0.0 0.1 0.0 1.1 0 5 c1t5000C50055F9379Bd0</div><div> 4.0 22.7 86.8 870.9 0.0 0.0 0.0 1.0 0 2 c1t5000C50055E57A5Fd0</div><div> 2.6 15.5 67.5 795.0 0.0 0.0 0.0 1.4 0 2 c1t5000C50055F8CCAFd0</div> <div> 2.9 13.2 65.4 801.6 0.0 0.0 0.0 1.9 0 3 c1t5000C50055F8B80Fd0</div><div> 3.3 25.7 82.7 719.7 0.0 0.0 0.0 1.1 0 3 c1t5000C50055F9FA1Fd0</div><div> 4.0 24.0 84.9 807.9 0.0 0.0 0.0 1.1 0 3 c1t5000C50055E65F0Fd0</div> <div> 2.8 18.4 69.5 767.0 0.0 0.0 0.0 1.0 0 2 c1t5000C50055F8BE3Fd0</div><div> 3.3 17.6 81.6 749.7 0.0 0.0 0.0 1.4 0 3 c1t5000C50055F8B21Fd0</div><div> 3.3 35.1 64.2 852.4 0.0 0.0 0.0 1.1 0 4 c1t5000C50055F8A46Fd0</div> <div> 3.5 30.0 82.1 805.9 0.0 0.0 0.0 0.9 0 3 c1t5000C50055F856CFd0</div><div> 3.9 30.4 89.5 845.7 0.0 0.0 0.0 0.9 0 3 c1t5000C50055E6606Fd0</div><div> 429.4 133.6 5933.0 8163.0 0.0 0.2 0.0 0.3 0 12 c2</div> <div> 215.8 28.4 2960.4 677.7 0.0 0.1 0.0 0.2 0 5 c2t500117310015D579d0</div><div> 213.7 27.4 2972.6 654.1 0.0 0.1 0.0 0.2 0 5 c2t50011731001631FDd0</div><div> 0.0 77.8 0.0 6831.2 0.0 0.1 0.0 0.8 0 2 c2t5000A72A3007811Dd0</div> <div> 0.0 12.3 0.0 46.8 0.0 0.0 0.0 0.0 0 0 c4</div><div> 0.0 6.2 0.0 23.4 0.0 0.0 0.0 0.0 0 0 c4t0d0</div><div> 0.0 6.1 0.0 23.4 0.0 0.0 0.0 0.0 0 0 c4t1d0</div> <div> 418.4 134.8 5663.1 8197.8 0.0 0.2 0.0 0.3 0 11 c12</div><div> 0.0 77.8 0.0 6831.2 0.0 0.1 0.0 0.8 0 2 c12t5000A72B300780FFd0</div><div> 203.5 29.7 2738.0 715.8 0.0 0.1 0.0 0.2 0 5 c12t500117310015D59Ed0</div> <div> 214.9 27.2 2925.2 650.8 0.0 0.1 0.0 0.2 0 5 c12t500117310015D54Ed0</div><div> 0.0 11.3 0.0 46.8 0.0 0.0 0.7 0.1 0 0 rpool</div><div> 1006.7 1478.2 15313.9 56330.7 30.4 2.0 12.2 0.8 6 64 tank</div> <div><br></div><div>Seems the pool is busy at 64% but the individual drives are not taxed at all (this load is virtually identical to when the scrub was running before the resilver was triggered). Still not too sure how to interpret this data. Is the system over-stressed? Is there really a bottleneck somewhere, or just need to fine tune some settings?</div> <div><br></div><div>Going to try some iometer results via the VM I/O Analyzer.</div></div></div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"><br></div><div style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13.333333969116211px"> <br></div></div></div></div> </blockquote></div><br></div> </blockquote></div><br></body></html>