<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 07.01.15 at 21:48, Richard
Elling wrote:<br>
</div>
<blockquote
cite="mid:3010EE58-59DE-408D-8BFA-28571F9B1A2B@richardelling.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On Jan 7, 2015, at 12:11 PM, Stephan Budach &lt;<a
moz-do-not-send="true" href="mailto:stephan.budach@jvm.de"
class="">stephan.budach@jvm.de</a>&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class="">
<div class="moz-cite-prefix">On 07.01.15 at 18:00,
Richard Elling wrote:<br class="">
</div>
<blockquote
cite="mid:ACE15BA6-97B4-4C94-B758-654AF147FE27@richardelling.com"
type="cite" class=""> <br class="">
<div class="">
<blockquote type="cite" class="">
<div class="">On Jan 7, 2015, at 2:28 AM, Stephan
Budach &lt;<a moz-do-not-send="true"
href="mailto:stephan.budach@JVM.DE" class="">stephan.budach@JVM.DE</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class=""> <font
class="" face="Helvetica, Arial, sans-serif">Hello
everyone,<br class="">
<br class="">
I am sharing my ZFS via NFS with a couple of OVM
nodes. I noticed really bad NFS read
performance when rsize goes beyond 128k,
whereas performance is just fine at 32k.
The issue is that the ovs-agent, which
performs the actual mount, doesn't accept or
pass any NFS mount options to the NFS server.
</font></div>
</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">The other issue is that on illumos/Solaris
x86, tuning the server-side size settings does</div>
<div class="">not work because the compiler optimizes
away the tunables. There is a trivial fix, but it</div>
<div class="">requires a rebuild.</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class=""><font
class="" face="Helvetica, Arial, sans-serif">To
give some numbers, an rsize of 1mb results in a
read throughput of approx. 2Mb/s, whereas an
rsize of 32k gives me 110Mb/s. Mounting an NFS
export from an OEL 6u4 box has no issues with
this, as the read speeds from this export are
108+MB/s regardless of the rsize of the NFS
mount.<br class="">
</font></div>
</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">Brendan wrote about a similar issue as a
case study in the DTrace book. See the chapter 5</div>
<div class="">case study on ZFS 8KB mirror reads.</div>
<br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class=""><font
class="" face="Helvetica, Arial, sans-serif">
<br class="">
The OmniOS box is currently connected to a
10GbE port on our core 6509, but the NFS
client is connected through a 1GbE port only.
The MTU is 1500 and cannot currently be raised.<br
class="">
Does anyone have a tip as to why an rsize of 64k+
results in such a performance drop?<br class="">
</font></div>
</div>
</blockquote>
<div class=""><br class="">
</div>
<div class="">It is entirely due to optimizations for
small I/O going way back to the 1980s.</div>
<div class=""> -- richard</div>
</div>
</blockquote>
But doesn't that mean that Oracle Solaris will have the
same issue, or has Oracle addressed that in recent Solaris
versions? Not that I intend to switch over, but
that would be something I'd like to give my SR engineer to
chew on…<br class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
<div>Look for yourself :-)</div>
<div>In "broken" systems, such as this Solaris 11.1 system:</div>
<div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class=""># echo nfs3_tsize::dis | mdb -k</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize: pushq %rbp</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+1: movq
%rsp,%rbp</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+4: subq
$0x8,%rsp</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+8: movq
%rdi,-0x8(%rbp)</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0xc: movl
(%rdi),%eax</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0xe: leal
-0x2(%rax),%ecx</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x11: cmpl
$0x1,%ecx</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x14: jbe +0x12
&lt;nfs3_tsize+0x28&gt;</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x16: cmpl
$0x5,%eax</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x19: movl
$0x100000,%eax</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x1e: movl
$0x8000,%ecx</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x23: cmovl.ne
%ecx,%eax</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x26: jmp +0x5
&lt;nfs3_tsize+0x2d&gt;</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x28: movl
$0x100000,%eax</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x2d: leave </div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(215, 201, 167); background-color: rgb(142, 53,
40);" class="">nfs3_tsize+0x2e: ret </div>
<div class=""><br class="">
</div>
<div class="">At +0x19 you'll notice the hardwired 1MB (0x100000).</div>
</div>
</div>
</blockquote>
Ouch! Is that from an NFS client or server? Or rather, I know that
the NFS server negotiates the options with the client, and if no
options are passed from the client to the server, the server sets up
the connection with its defaults. So, this S11.1 output: is that
from the NFS server? If so, it would mean that the NFS server would
go with the 1mb rsize/wsize, since the OracleVM Server has not
provided any options to it.<br>
<blockquote
cite="mid:3010EE58-59DE-408D-8BFA-28571F9B1A2B@richardelling.com"
type="cite">
<div>
<div>
<div class=""><br class="">
</div>
<div class="">By contrast, on a proper system:</div>
<div class="">
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">#
echo nfs3_tsize::dis | mdb -k</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize:
pushq %rbp</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+1:
movq %rsp,%rbp</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+4:
subq $0x10,%rsp</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+8:
movq %rdi,-0x8(%rbp)</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0xc:
movl (%rdi),%edx</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0xe:
leal -0x2(%rdx),%eax</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x11:
cmpl $0x1,%eax</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x14:
jbe +0x12 &lt;nfs3_tsize+0x28&gt;</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x16:
movl -0x37f8ea60(%rip),%eax
&lt;nfs3_max_transfer_size_rdma&gt;</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x1c:
cmpl $0x5,%edx</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x1f:
cmovl.ne -0x37f8ea72(%rip),%eax &lt;nfs3_max_transfer_size_clts&gt;</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x26:
leave </div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x27:
ret </div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x28:
movl -0x37f8ea76(%rip),%eax
&lt;nfs3_max_transfer_size_cots&gt;</div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x2e:
leave </div>
<div style="margin: 0px; font-family: Menlo; color: rgb(76,
47, 45); background-color: rgb(223, 219, 196);" class="">nfs3_tsize+0x2f:
ret </div>
</div>
<div class=""><br class="">
</div>
<div class="">where you can actually tune it according to the
Solaris Tunable Parameters guide.</div>
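<div class=""><br class="">
</div>
<div class="">As a rough illustration of the difference between the two
listings, here is a minimal userland C sketch reconstructed from the
disassembly. It is not the kernel source: the SEM_* names, the function
names and the initial values of the globals are assumptions. Only the
compared values (2, 3 and 5) and the immediates 0x100000 and 0x8000 are
taken from the listings.</div>
<pre>
```c
/*
 * Userland sketch reconstructed from the two disassembly listings.
 * NOT the kernel source: all names below are hypothetical.
 * 2, 3 and 5 are the transport-semantics values the listings compare against.
 */
enum { SEM_COTS = 2, SEM_COTS_ORD = 3, SEM_RDMA = 5 };

/*
 * Tunable variant: the sizes live in writable globals that are loaded
 * from memory on every call. Initial values are assumed, not read from
 * the listing of the tunable system.
 */
unsigned int max_transfer_size_cots = 0x100000;
unsigned int max_transfer_size_rdma = 0x100000;
unsigned int max_transfer_size_clts = 0x8000;

unsigned int tsize_tunable(int semantics)
{
    if (semantics == SEM_COTS || semantics == SEM_COTS_ORD)
        return max_transfer_size_cots;
    if (semantics == SEM_RDMA)
        return max_transfer_size_rdma;
    return max_transfer_size_clts;
}

/*
 * Hardwired variant: same branches, but the sizes are immediates in the
 * instruction stream, so there is no variable left to patch or tune.
 */
unsigned int tsize_hardwired(int semantics)
{
    if (semantics == SEM_COTS || semantics == SEM_COTS_ORD)
        return 0x100000;
    return (semantics == SEM_RDMA) ? 0x100000 : 0x8000;
}
```
</pre>
<div class="">With the first variant, changing max_transfer_size_cots at
run time changes what the function returns. With the second, there is
nothing to change, which is what the hardwired listing shows.</div>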
<div class=""><br class="">
</div>
<div class="">NB, we fixed this years ago at Nexenta and I'm
certain it has not been upstreamed. There are</div>
<div class="">a number of other related fixes, all of the same
nature. If someone is inclined to upstream them, </div>
<div class="">contact me directly.</div>
<div class=""><br class="">
</div>
<div class="">Once fixed, you'll be able to change the
server's settings for negotiating the rsize/wsize with</div>
<div class="">the clients. Many NAS vendors use smaller
limits, and IMHO it is a good idea anyway. For </div>
<div class="">example, see <a moz-do-not-send="true"
href="http://blog.richardelling.com/2012/04/latency-and-io-size-cars-vs-trains.html"
class="">http://blog.richardelling.com/2012/04/latency-and-io-size-cars-vs-trains.html</a></div>
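<div class=""><br class="">
</div>
<div class="">To put rough numbers on the relationship between latency and
I/O size, a small C sketch can estimate how long a single read reply
occupies the wire. The helper names are hypothetical, and the 1460-byte
payload per frame assumes a 1500-byte MTU with 20-byte IP and 20-byte TCP
headers and no TCP options. Protocol overhead, retransmits and server
latency are ignored.</div>
<pre>
```c
/*
 * Back-of-the-envelope sketch with hypothetical helper names.
 * It only models serialization time on the link, nothing else.
 */

/* Milliseconds needed to push `bytes` through a link of the given bit rate. */
double wire_time_ms(unsigned long bytes, double link_bits_per_sec)
{
    return (double)bytes * 8.0 / link_bits_per_sec * 1000.0;
}

/* Number of frames needed for one reply, rounding the last partial frame up. */
unsigned long frames_needed(unsigned long bytes, unsigned long payload_per_frame)
{
    return (bytes + payload_per_frame - 1) / payload_per_frame;
}
```
</pre>
<div class="">Under these assumptions, a 1 MiB reply (1048576 bytes) takes
about 8.4 ms and 719 frames on a 1 Gb/s link, while a 32 KiB reply takes
about 0.26 ms and 23 frames.</div>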
<div class=""> -- richard</div>
</div>
<div><br class="">
</div>
</div>
</blockquote>
I am mostly satisfied with a transfer size of 32k. This NFS export is
used as the storage repository for the vdisk images, and approx. 80 guests
are accessing those, so the I/O is random and smaller I/Os
are preferred anyway. However, the NFS export from the OEL box just
doesn't have this massive performance hit, even with an rsize/wsize
of 1mb.<br>
<blockquote
cite="mid:3010EE58-59DE-408D-8BFA-28571F9B1A2B@richardelling.com"
type="cite">
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div text="#000000" bgcolor="#FFFFFF" class=""> <br
class="">
In any case, the first bummer is that Oracle chose not to
make its ovs-agent capable of accepting and passing
the NFS mount options…<br class="">
<br class="">
Cheers,<br class="">
budy<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</blockquote>
Thanks,<br>
budy<br>
</body>
</html>