<div dir="ltr"><div><div>Hello All,<br><br></div>I am mildly confused by something <span style="font-family:monospace,monospace">iostat</span> does when displaying statistics for a zpool. Before I begin rooting through the <span style="font-family:monospace,monospace">iostat</span> source, does anyone have an idea of why I am seeing high "<span style="font-family:monospace,monospace">wait</span>" and "<span style="font-family:monospace,monospace">wsvc_t</span>" values for "<span style="font-family:monospace,monospace">ppool</span>" when my devices apparently are not busy? I would have assumed that the stats for the pool would be the sum of the stats for the zdevs....<br><br><span style="font-family:monospace,monospace">                    extended device statistics<br>    r/s    w/s   kr/s     kw/s  wait actv wsvc_t asvc_t  %w  %b device<br>   10.0 9183.0   40.5 344942.0   0.0  1.8    0.0    0.2   0 178 c4<br>    1.0  187.0    4.0  19684.0   0.0  0.1    0.0    0.5   0   8 c4t5000C5006A597B93d0<br>    2.0  199.0   12.0  20908.0   0.0  0.1    0.0    0.6   0  12 c4t5000C500653DE049d0<br>    2.0  197.0    8.0  20788.0   0.0  0.2    0.0    0.8   0  15 c4t5000C5003607D87Bd0<br>    0.0  202.0    0.0  20908.0   0.0  0.1    0.0    0.6   0  11 c4t5000C5006A5903A2d0<br>    0.0  189.0    0.0  19684.0   0.0  0.1    0.0    0.5   0  10 c4t5000C500653DEE58d0<br>    5.0  957.0   16.5   1966.5   0.0  0.1    0.0    0.1   0   7 c4t50026B723A07AC78d0<br>    0.0  201.0    0.0  20787.9   0.0  0.1    0.0    0.7   0  14 c4t5000C5003604ED37d0<br>    0.0    0.0    0.0      0.0   0.0  0.0    0.0    0.0   0   0 c4t5000C500653E447Ad0<br>    0.0 3525.0    0.0 110107.7   0.0  0.5    0.0    0.2   0  51 c4t500253887000690Dd0<br>    0.0 3526.0    0.0 110107.7   0.0  0.5    0.0    0.1   1  50 c4t5002538870006917d0<br>   10.0 6046.0   40.5 344941.5 837.4  1.9  138.3    0.3  23  67 ppool<br><br></span><br></div><div>For those following the VAAI thread, this is the system I will be using as my testbed.<br></div><div><br></div>Here is the structure of <span style="font-family:monospace,monospace">ppool</span> (taken at a different time than above):<br><span style="font-family:monospace,monospace"><br>root@sanbox:/root# zpool iostat -v ppool<br>                              capacity     operations    bandwidth<br>pool                       alloc   free   read  write   read  write<br>-------------------------  -----  -----  -----  -----  -----  -----<br>ppool                       191G  7.97T     23    637   140K  15.0M<br>  mirror                   63.5G  2.66T      7    133  46.3K   840K<br>    c4t5000C5006A597B93d0      -      -      1     13  24.3K   844K<br>    c4t5000C500653DEE58d0      -      -      1     13  24.1K   844K<br>  mirror                   63.6G  2.66T      7    133  46.5K   839K<br>    c4t5000C5006A5903A2d0      -      -      1     13  24.0K   844K<br>    c4t5000C500653DE049d0      -      -      1     13  24.6K   844K<br>  mirror                   63.5G  2.66T      7    133  46.8K   839K<br>    c4t5000C5003607D87Bd0      -      -      1     13  24.5K   843K<br>    c4t5000C5003604ED37d0      -      -      1     13  24.4K   843K<br>logs                           -      -      -      -      -      -<br>  mirror                    301M   222G      0    236      0  12.5M<br>    c4t5002538870006917d0      -      -      0    236      5  12.5M<br>    c4t500253887000690Dd0      -      -      0    236      5  12.5M<br>cache                          -      -      -      -      -      -<br>  c4t50026B723A07AC78d0    62.3G  11.4G     19    113  83.0K  1.07M<br>-------------------------  -----  -----  -----  -----  -----  -----</span><br><div><br><span style="font-family:monospace,monospace">root@sanbox:/root# zfs get all ppool<br>NAME   PROPERTY              VALUE                  SOURCE<br>ppool  type                  filesystem             -<br>ppool  creation              Sat Jan 24 18:37 2015  -<br>ppool  used                  5.16T                  -<br>ppool  available             2.74T                  -<br>ppool  referenced            96K                    -<br>ppool  compressratio         1.51x                  -<br>ppool  mounted               yes                    -<br>ppool  quota                 none                   default<br>ppool  reservation           none                   default<br>ppool  recordsize            128K                   default<br>ppool  mountpoint            /ppool                 default<br>ppool  sharenfs              off                    default<br>ppool  checksum              on                     default<br>ppool  compression           lz4                    local<br>ppool  atime                 on                     default<br>ppool  devices               on                     default<br>ppool  exec                  on                     default<br>ppool  setuid                on                     default<br>ppool  readonly              off                    default<br>ppool  zoned                 off                    default<br>ppool  snapdir               hidden                 default<br>ppool  aclmode               discard                default<br>ppool  aclinherit            restricted             default<br>ppool  canmount              on                     default<br>ppool  xattr                 on                     default<br>ppool  copies                1                      default<br>ppool  version               5                      -<br>ppool  utf8only              off                    -<br>ppool  normalization         none                   -<br>ppool  casesensitivity       sensitive              -<br>ppool  vscan                 off                    default<br>ppool  nbmand                off                    default<br>ppool  sharesmb              off                    default<br>ppool  refquota              none                   default<br>ppool  refreservation        none                   default<br>ppool  primarycache          all                    default<br>ppool  secondarycache        all                    default<br>ppool  usedbysnapshots       0                      -<br>ppool  usedbydataset         96K                    -<br>ppool  usedbychildren        5.16T                  -<br>ppool  usedbyrefreservation  0                      -<br>ppool  logbias               latency                default<br>ppool  dedup                 off                    default<br>ppool  mlslabel              none                   default<br>ppool  sync                  standard               local<br>ppool  refcompressratio      1.00x                  -<br>ppool  written               96K                    -<br>ppool  logicalused           445G                   -<br>ppool  logicalreferenced     9.50K                  -<br>ppool  filesystem_limit      none                   default<br>ppool  snapshot_limit        none                   default<br>ppool  filesystem_count      none                   default<br>ppool  snapshot_count        none                   default<br>ppool  redundant_metadata    all                    default</span><br><br></div><div>Currently, <span style="font-family:monospace,monospace">ppool</span> contains a single 5TB zvol that I am hosting as an iSCSI block device. At the zdev level, I have ensured that the ashift is 12 for all devices, all physical devices are 4k-native SATA, and the cache/log SSDs are also set for 4k. The block sizes are manually set in <span style="font-family:monospace,monospace">sd.conf</span>, and confirmed with "<span style="font-family:monospace,monospace">echo ::sd_state | mdb -k | egrep '(^un|_blocksize)'</span>". The zvol blocksize is 4k, and the iSCSI block transfer size is 512B (not that it matters).<br><br></div><div>All drives contain a single Solaris2 partition with an EFI label, and are properly aligned:<br><span style="font-family:monospace,monospace">format> verify<br><br>Volume name = <        ><br>ascii name  = <ATA-ST3000DM001-1CH1-CC27-2.73TB><br>bytes/sector    =  512<br>sectors = 5860533167<br>accessible sectors = 5860533134<br>Part      Tag    Flag     First Sector          Size          Last Sector<br>  0        usr    wm               256         2.73TB           5860516750   <br>  1 unassigned    wm                 0            0                0<br>  2 unassigned    wm                 0            0                0<br>  3 unassigned    wm                 0            0                0<br>  4 unassigned    wm                 0            0                0<br>  5 unassigned    wm                 0            0                0<br>  6 unassigned    wm                 0            0                0<br>  8   reserved    wm        5860516751         8.00MB           5860533134 </span><br></div><div><br></div><div>I scrubbed the pool last night, which completed without error. From "<span style="font-family:monospace,monospace">zdb ppool</span>", I have extracted (with minor formatting):<br></div><div><span style="font-family:monospace,monospace"><br>                             capacity  operations   bandwidth  ---- errors ----<br>description                used avail  read write  read write  read write cksum<br>ppool                      339G 7.82T 26.6K     0  175M     0     0     0     5<br>  mirror                   113G 2.61T 8.87K     0 58.5M     0     0     0     2<br>    /dev/dsk/c4t5000C5006A597B93d0s0  3.15K     0 48.8M     0     0     0     2<br>    /dev/dsk/c4t5000C500653DEE58d0s0  3.10K     0 49.0M     0     0     0     2<br>  <br>  mirror                   113G 2.61T 8.86K     0 58.5M     0     0     0     8<br>    /dev/dsk/c4t5000C5006A5903A2d0s0  3.12K     0 48.7M     0     0     0     8<br>    /dev/dsk/c4t5000C500653DE049d0s0  3.08K     0 48.9M     0     0     0     8<br>  <br>  mirror                   113G 2.61T 8.86K     0 58.5M     0     0     0    10<br>    /dev/dsk/c4t5000C5003607D87Bd0s0  2.48K     0 48.8M     0     0     0    10<br>    /dev/dsk/c4t5000C5003604ED37d0s0  2.47K     0 48.9M     0     0     0    10<br>  <br>  log mirror              44.0K  222G     0     0    37     0     0     0     0<br>    /dev/dsk/c4t5002538870006917d0s0      0     0   290     0     0     0     0<br>    /dev/dsk/c4t500253887000690Dd0s0      0     0   290     0     0     0     0<br>  Cache<br>  /dev/dsk/c4t50026B723A07AC78d0s0<br>                              0 73.8G     0     0    35     0     0     0     0<br>  Spare<br>  /dev/dsk/c4t5000C500653E447Ad0s0        4     0  136K     0     0     0     0</span><br><br></div><div>This shows a few checksum errors, which is not consistent with the output of "<span style="font-family:monospace,monospace">zfs status -v</span>", and "<span style="font-family:monospace,monospace">iostat -eE</span>" shows no physical error count. I again see the discrepancy between the "<span style="font-family:monospace,monospace">ppool</span>" value and what I would expect, which would be a sum of the <span style="font-family:monospace,monospace">cksum</span> errors for each vdev.<br><br></div><div>I also observed a ton of leaked space, which I expect from a live pool, as well as a single:<br><span style="font-family:monospace,monospace">db_blkptr_cb: Got error 50 reading <96, 1, 2, 3fc8> DVA[0]=<1:1dc4962000:1000> DVA[1]=<2:1dc4654000:1000> [L2 zvol object] fletcher4 lz4 LE contiguous unique double size=4000L/a00P birth=52386L/52386P fill=4825 cksum=c70e8a7765:f2a                                         </span><br><span style="font-family:monospace,monospace">dce34f59c:c8a289b51fe11d:7e0af40fe154aab4 -- skipping</span><br></div><div><br><br></div><div>By the way, I also found:<br><span style="font-family:monospace,monospace"><br>Uberblock:<br>        magic = 000000000<b>0bab10c</b></span><br><br></div><div>Wow. Just wow.<br></div><div><br><br></div><div>-Warren V<br></div><div><br></div></div>