<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">On Jul 28, 2014, at 5:11 PM, wuffers &lt;<a href="mailto:moo@wuffers.net">moo@wuffers.net</a>&gt; wrote:<br><div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">Does this look normal?</div></blockquote><div><br></div><div>Maybe, maybe not.</div><br><blockquote type="cite"><div dir="ltr"><div><br></div><div><div> pool: rpool</div><div> state: ONLINE</div><div> scan: scrub repaired 0 in 0h3m with 0 errors on Tue Jul 15 09:36:17 2014</div><div>config:</div><div><br></div>
<div> NAME STATE READ WRITE CKSUM</div><div> rpool ONLINE 0 0 0</div><div> mirror-0 ONLINE 0 0 0</div><div> c4t0d0s0 ONLINE 0 0 0</div>
<div> c4t1d0s0 ONLINE 0 0 0</div><div><br></div><div>errors: No known data errors</div><div><br></div><div> pool: tank</div><div> state: ONLINE</div><div> scan: scrub in progress since Mon Jul 14 17:54:42 2014</div>
<div> 6.59T scanned out of 24.2T at 5.71M/s, (scan is slow, no estimated time)</div></div></div></blockquote><div><br></div><div>This is slower than most, and surely slower than desired. At 5.71M/s, the remaining ~17.6T would take roughly another five weeks.</div><br><blockquote type="cite"><div dir="ltr"><div><div> 0 repaired, 27.25% done</div><div>config:</div><div><br></div><div> NAME STATE READ WRITE CKSUM</div>
<div> tank ONLINE 0 0 0</div><div> mirror-0 ONLINE 0 0 0</div><div> c1t5000C50055F9F637d0 ONLINE 0 0 0</div><div>
c1t5000C50055F9EF2Fd0 ONLINE 0 0 0</div><div> mirror-1 ONLINE 0 0 0</div><div> c1t5000C50055F87D97d0 ONLINE 0 0 0</div><div> c1t5000C50055F9D3B3d0 ONLINE 0 0 0</div>
<div> mirror-2 ONLINE 0 0 0</div><div> c1t5000C50055E6606Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9F92Bd0 ONLINE 0 0 0</div><div>
mirror-3 ONLINE 0 0 0</div><div> c1t5000C50055F856CFd0 ONLINE 0 0 0</div><div> c1t5000C50055F9FE87d0 ONLINE 0 0 0</div><div> mirror-4 ONLINE 0 0 0</div>
<div> c1t5000C50055F84A97d0 ONLINE 0 0 0</div><div> c1t5000C50055FA0AF7d0 ONLINE 0 0 0</div><div> mirror-5 ONLINE 0 0 0</div><div>
c1t5000C50055F9D3E3d0 ONLINE 0 0 0</div><div> c1t5000C50055F9F0B3d0 ONLINE 0 0 0</div><div> mirror-6 ONLINE 0 0 0</div><div> c1t5000C50055F8A46Fd0 ONLINE 0 0 0</div>
<div> c1t5000C50055F9FB8Bd0 ONLINE 0 0 0</div><div> mirror-7 ONLINE 0 0 0</div><div> c1t5000C50055F8B21Fd0 ONLINE 0 0 0</div><div>
c1t5000C50055F9F89Fd0 ONLINE 0 0 0</div><div> mirror-8 ONLINE 0 0 0</div><div> c1t5000C50055F8BE3Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9E123d0 ONLINE 0 0 0</div>
<div> mirror-9 ONLINE 0 0 0</div><div> c1t5000C50055F9379Bd0 ONLINE 0 0 0</div><div> c1t5000C50055F9E7D7d0 ONLINE 0 0 0</div><div>
mirror-10 ONLINE 0 0 0</div><div> c1t5000C50055E65F0Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9F80Bd0 ONLINE 0 0 0</div><div> mirror-11 ONLINE 0 0 0</div>
<div> c1t5000C50055F8A22Bd0 ONLINE 0 0 0</div><div> c1t5000C50055F8D48Fd0 ONLINE 0 0 0</div><div> mirror-12 ONLINE 0 0 0</div><div>
c1t5000C50055E65807d0 ONLINE 0 0 0</div><div> c1t5000C50055F8BFA3d0 ONLINE 0 0 0</div><div> mirror-13 ONLINE 0 0 0</div><div> c1t5000C50055E579F7d0 ONLINE 0 0 0</div>
<div> c1t5000C50055E65877d0 ONLINE 0 0 0</div><div> mirror-14 ONLINE 0 0 0</div><div> c1t5000C50055F9FA1Fd0 ONLINE 0 0 0</div><div>
c1t5000C50055F8CDA7d0 ONLINE 0 0 0</div><div> mirror-15 ONLINE 0 0 0</div><div> c1t5000C50055F8BF9Bd0 ONLINE 0 0 0</div><div> c1t5000C50055F9A607d0 ONLINE 0 0 0</div>
<div> mirror-16 ONLINE 0 0 0</div><div> c1t5000C50055E66503d0 ONLINE 0 0 0</div><div> c1t5000C50055E4FDE7d0 ONLINE 0 0 0</div><div>
mirror-17 ONLINE 0 0 0</div><div> c1t5000C50055F8E017d0 ONLINE 0 0 0</div><div> c1t5000C50055F9F3EBd0 ONLINE 0 0 0</div><div> mirror-18 ONLINE 0 0 0</div>
<div> c1t5000C50055F8B80Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F9F63Bd0 ONLINE 0 0 0</div><div> mirror-19 ONLINE 0 0 0</div><div>
c1t5000C50055F84FB7d0 ONLINE 0 0 0</div><div> c1t5000C50055F9FEABd0 ONLINE 0 0 0</div><div> mirror-20 ONLINE 0 0 0</div><div> c1t5000C50055F8CCAFd0 ONLINE 0 0 0</div>
<div> c1t5000C50055F9F91Bd0 ONLINE 0 0 0</div><div> mirror-21 ONLINE 0 0 0</div><div> c1t5000C50055E65ABBd0 ONLINE 0 0 0</div><div>
c1t5000C50055F8905Fd0 ONLINE 0 0 0</div><div> mirror-22 ONLINE 0 0 0</div><div> c1t5000C50055E57A5Fd0 ONLINE 0 0 0</div><div> c1t5000C50055F87E73d0 ONLINE 0 0 0</div>
<div> mirror-23 ONLINE 0 0 0</div><div> c1t5000C50055E66053d0 ONLINE 0 0 0</div><div> c1t5000C50055E66B63d0 ONLINE 0 0 0</div><div>
mirror-24 ONLINE 0 0 0</div><div> c1t5000C50055F8723Bd0 ONLINE 0 0 0</div><div> c1t5000C50055F8C3ABd0 ONLINE 0 0 0</div><div> logs</div>
<div> c2t5000A72A3007811Dd0 ONLINE 0 0 0</div><div> cache</div><div> c2t500117310015D579d0 ONLINE 0 0 0</div><div> c2t50011731001631FDd0 ONLINE 0 0 0</div>
<div> c12t500117310015D59Ed0 ONLINE 0 0 0</div><div> c12t500117310015D54Ed0 ONLINE 0 0 0</div><div> spares</div><div> c1t5000C50055FA2AEFd0 AVAIL</div><div>
c1t5000C50055E595B7d0 AVAIL</div><div><br></div><div>errors: No known data errors</div></div><div><br></div><div>---</div><div>This is a ~90TB SAN on r151008, with 25 pairs of 4TB mirror drives. The last scrub I ran was about 3 months ago, which took (from my recollection) ~250 hours or so. I've only run about 4 scrubs so far on this installation.</div>
<div><br></div><div>The current scrub has been running for 2 weeks, with no end in sight. The last time I saw an estimate, it said ~650 hours remaining. </div></div></blockquote><div><br></div><div>The estimate is often very wrong, especially for busy systems.</div><div>If this is an older ZFS implementation, this pool is likely getting pounded by the</div><div>ZFS write throttle. There are some tunings that can be applied, but the old write</div><div>throttle is not a stable control system, so it will always be a little bit unpredictable.</div><br><blockquote type="cite"><div dir="ltr"><div><br></div><div>This thread <a href="http://comments.gmane.org/gmane.os.solaris.opensolaris.zfs/46021">http://comments.gmane.org/gmane.os.solaris.opensolaris.zfs/46021</a> from over 3 years ago mentions the <span style="line-height:24px;text-align:justify">metaslab_min_alloc_size as a way to improve this (reducing it to 4K from 10MB). Further reading into this property got me this Illumos bug: </span><span style="line-height:23.999998092651367px;text-align:justify"><a href="https://www.illumos.org/issues/54">https://www.illumos.org/issues/54</a>, which states </span><span style="line-height:23.999998092651367px;text-align:justify">"</span><span style="line-height:23.999998092651367px">Turns out this tunable is made irrelevant as a result of a change to use the metaslab_df_ops allocator. We don't need to change it. I'm closing this bug."</span><span style="line-height:23.999998092651367px;text-align:justify">. </span>So that seems like a dead end to me. </div></div></blockquote><div><br></div><div>Dead end.</div><br><blockquote type="cite"><div dir="ltr">
<div><br></div><div>This is the current load with scrub running (~350 VMs between Hyper-V and VMware environments):</div><div><br></div><div><div># iostat -xnze</div></div></div></blockquote><div><br></div><div>Unfortunately, without an interval argument iostat reports averages since boot, which are not suitable for performance</div><div>analysis unless the system has been rebooted in the past 10 minutes or so. You'll need</div><div>to post the second batch from "iostat -zxCn 60 2", which covers a 60-second sample of the current load.</div><br><blockquote type="cite"><div dir="ltr"><div><div> extended device statistics ---- errors ---</div>
<div> r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device</div><div> 0.4 12.5 39.7 78.8 0.1 0.0 5.0 0.1 0 0 0 0 0 0 rpool</div><div> 0.2 6.9 19.9 39.4 0.0 0.0 0.0 0.1 0 0 0 0 0 0 c4t0d0</div>
<div> 0.2 6.8 19.9 39.4 0.0 0.0 0.0 0.1 0 0 0 0 0 0 c4t1d0</div><div> 4.4 29.3 209.7 962.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8723Bd0</div><div> 4.7 25.1 209.4 962.3 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055E66B63d0</div>
<div> 4.7 27.6 208.3 952.7 0.0 0.0 0.0 1.3 0 3 0 0 0 0 c1t5000C50055F87E73d0</div><div> 4.4 28.6 209.1 974.3 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8BFA3d0</div><div>
4.4 28.9 208.3 964.5 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9E123d0</div><div> 4.4 25.7 208.7 955.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F0B3d0</div><div> 4.4 26.5 209.1 960.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9D3B3d0</div>
<div> 4.3 25.2 206.6 936.1 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055E4FDE7d0</div><div> 4.4 26.9 208.1 982.6 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9A607d0</div><div>
4.4 24.5 208.7 955.4 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F8CDA7d0</div><div> 4.3 26.5 207.8 943.8 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E65877d0</div><div> 4.4 27.7 208.0 961.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9E7D7d0</div>
<div> 4.3 26.0 208.0 953.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055FA0AF7d0</div><div> 4.3 26.1 208.0 966.2 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9FE87d0</div><div>
4.4 28.5 208.6 965.3 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F91Bd0</div><div> 4.3 26.7 207.2 945.0 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9FEABd0</div><div> 4.4 26.5 209.3 980.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F63Bd0</div>
<div> 4.3 26.1 207.6 944.3 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F9F3EBd0</div><div> 4.3 26.5 208.1 954.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F80Bd0</div><div>
32.5 14.7 1005.6 751.2 0.0 0.0 0.0 0.3 0 1 0 0 0 0 c2t500117310015D579d0</div><div> 32.5 14.7 1004.1 751.2 0.0 0.0 0.0 0.3 0 1 0 0 0 0 c2t50011731001631FDd0</div><div> 0.0 180.8 0.0 16434.5 0.0 0.3 0.0 1.6 0 4 0 0 0 0 c2t5000A72A3007811Dd0</div>
<div> 4.4 25.3 208.7 966.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9FB8Bd0</div><div> 4.4 26.3 208.5 949.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F92Bd0</div><div>
4.4 29.7 208.6 975.1 0.0 0.0 0.0 1.3 0 3 0 0 0 0 c1t5000C50055F8905Fd0</div><div> 4.4 25.7 207.9 954.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8D48Fd0</div><div> 4.4 26.8 208.4 967.4 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F89Fd0</div>
<div> 4.4 28.5 208.1 964.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9EF2Fd0</div><div> 4.4 29.4 209.5 962.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8C3ABd0</div><div>
4.7 25.0 208.9 962.3 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055E66053d0</div><div> 4.3 25.1 207.5 936.1 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055E66503d0</div><div> 4.4 25.6 209.1 955.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9D3E3d0</div>
<div> 4.3 26.6 207.4 945.0 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F84FB7d0</div><div> 4.3 26.0 207.5 944.3 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F8E017d0</div><div>
4.3 26.4 207.1 943.8 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E579F7d0</div><div> 4.4 28.5 208.8 974.3 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E65807d0</div><div> 4.4 25.9 208.5 953.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F84A97d0</div>
<div> 4.4 26.4 209.2 960.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F87D97d0</div><div> 4.4 28.5 208.8 964.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9F637d0</div><div>
4.4 29.6 208.9 975.1 0.0 0.0 0.0 1.3 0 3 0 0 0 0 c1t5000C50055E65ABBd0</div><div> 4.4 26.7 208.5 982.6 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8BF9Bd0</div><div> 4.3 25.6 207.6 954.1 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F8A22Bd0</div>
<div> 4.4 27.6 208.2 961.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F9379Bd0</div><div> 4.7 27.6 208.3 952.8 0.0 0.0 0.0 1.3 0 3 0 0 0 0 c1t5000C50055E57A5Fd0</div><div>
4.4 28.4 208.4 965.3 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8CCAFd0</div><div> 4.4 26.4 208.9 980.1 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F8B80Fd0</div><div> 4.4 24.4 208.9 955.4 0.0 0.0 0.0 1.5 0 3 0 0 0 0 c1t5000C50055F9FA1Fd0</div>
<div> 4.3 26.4 207.6 954.9 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E65F0Fd0</div><div> 4.4 28.8 208.3 964.5 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8BE3Fd0</div><div>
4.3 26.7 207.4 967.4 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8B21Fd0</div><div> 4.4 25.1 208.9 966.7 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F8A46Fd0</div><div> 4.4 26.0 209.7 966.2 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055F856CFd0</div>
<div> 4.4 26.2 209.0 949.1 0.0 0.0 0.0 1.4 0 3 0 0 0 0 c1t5000C50055E6606Fd0</div><div> 32.5 14.7 1004.3 750.9 0.0 0.0 0.0 0.3 0 1 0 0 0 0 c12t500117310015D59Ed0</div><div>
32.5 14.7 1004.4 751.3 0.0 0.0 0.0 0.3 0 1 0 0 0 0 c12t500117310015D54Ed0</div><div> 349.1 646.9 14437.7 67437.3 52.7 2.6 52.9 2.6 12 37 0 0 0 0 tank</div></div><div><br></div><div>
What should I be checking for? Is a scrub supposed to take that long (and I thought over 10 days for the last one was long..)? There don't seem to be any hardware errors. Is the load too high (12% wait, 37% busy with asvc_t of 2.6ms)?<br></div></div></blockquote><div><br></div><div>There are many variables here, the biggest of which is the current non-scrub load.</div><div> -- richard</div></div><br></body></html>