[OmniOS-discuss] OmniOS / Nappit slow iscsi / ZFS performance with Proxmox

Steffen Wagner mail at steffenwagner.com
Thu Sep 3 11:18:14 UTC 2015


Hi Michael,

- I am running several VLANs to split the traffic: one VLAN for 
cluster communication (all nodes are connected through an LACP channel) and 
one VLAN (also an LACP channel) for VM traffic.
The storage and app servers are directly attached with 10GbE, so all 
networks are definitely separated.

- I have set the MTU to 9000 and enabled jumbo frames on my HP ProCurve 
switch (a dladm sketch for the host side follows below).
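
On an OmniOS/illumos box, the equivalent aggregation, VLAN and MTU 
settings look roughly like this with dladm; the link names, VLAN IDs 
and the aggregation name below are placeholders, not my actual 
configuration:

# LACP aggregation over two 10 GBit links
dladm create-aggr -L active -l ixgbe0 -l ixgbe1 aggr0

# tagged VLANs on top of the aggregation
dladm create-vlan -l aggr0 -v 10 cluster10
dladm create-vlan -l aggr0 -v 20 vmtraffic20

# jumbo frames on the aggregation
dladm set-linkprop -p mtu=9000 aggr0

# verify
dladm show-aggr -x
dladm show-linkprop -p mtu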

Meanwhile I got some performance improvements by setting the recordsize 
of the pool to 64k and enabling the write-back cache for all LUs.
I now get about 250 MB/s in random tests (the load on tank is then around 
40-50%), which is quite good and okay for me.
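
For anyone searching the archive later, the relevant knobs on the 
command line are roughly these (the LU GUID below is a placeholder):

# recordsize only affects data written after the change
zfs set recordsize=64K tank

# list logical units and their GUIDs / cache settings
stmfadm list-lu -v

# enable the write-back cache (wcd = write cache disabled)
stmfadm modify-lu -p wcd=false 600144f0aabbccdd0000000000000001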

If anyone has more helpful advice on tuning COMSTAR / ZFS / ... 
parameters, I would highly appreciate it!

Thank you very much,
Steffen

-- 
Steffen Wagner
August-Bebel-Straße 61
D-68199 Mannheim

M +49 (0) 1523 3544688
E mail at steffenwagner.com
I http://wagnst.de

Get my public GnuPG key:
mail <at> steffenwagner <dot> com
http://http-keys.gnupg.net/pks/lookup?op=get&search=0x8A3406FB4688EE99

Am 2015-08-31 00:45, schrieb Michael Talbott:
> This may be a given, but since you didn't mention it in your
> network topology: make sure the 1G LAN link is on a different subnet
> than the 20G iSCSI link. Otherwise iSCSI traffic might be flowing
> through the 1G link. Also, jumbo frames can help with iSCSI.
> 
> Additionally, dd speed tests from /dev/zero to a ZFS-backed disk are highly
> misleading if you have any compression enabled on it (since
> only 512 bytes are actually written to disk for nearly any amount of
> consecutive zeros).
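> 
> A less misleading variant (just a sketch; paths and sizes are examples)
> is to check compression and write pre-generated incompressible data
> instead of zeros:
> 
> # check whether compression is enabled on the pool
> zfs get compression tank
> 
> # generate an incompressible test file once, then copy it with dd
> dd if=/dev/urandom of=/var/tmp/random.bin bs=1024k count=2048
> dd if=/var/tmp/random.bin of=/tank/test bs=1024k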
> 
> Michael
> Sent from my iPhone
> 
> On Aug 30, 2015, at 7:17 AM, Steffen Wagner <mail at steffenwagner.com>
> wrote:
> 
>> Hi everyone!
>> 
>> I just setup a small network with 2 nodes:
>> 
>> * 1 proxmox host on Debian Wheezy hosting KVM VMs
>> 
>> * 1 napp-it host on OmniOS stable
>> 
>> The systems are currently connected through a 1 GBit link for
>> general WAN and LAN communication and a 20 GBit link (two 10 GBit
>> links aggregated) for the iSCSI communication.
>> 
>> The bandwidth of both connections was confirmed using iperf.
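>> 
>> (The check was along these lines; the address and the number of
>> parallel streams are just examples:)
>> 
>> # on the napp-it host
>> iperf -s
>> 
>> # on the Proxmox host, towards the storage address on the 20 GBit link
>> iperf -c 10.10.10.2 -P 4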
>> 
>> The napp-it system currently has one pool (tank) consisting of 2
>> mirror vdevs. The 4 disks are SAS3 disks connected to a SAS2
>> backplane and directly attached (no expander) to the LSI SAS3008
>> (9300-8i) HBA.
>> 
>> Comstar is running on that machine with 1 target (vm-storage) in 1
>> target group (vm-storage-group).
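>> 
>> For reference, the equivalent command-line setup looks roughly like
>> this (the target IQN is whatever itadm prints; it is shortened to
>> <target-iqn> here):
>> 
>> # make sure the iSCSI target service is running
>> svcadm enable -r svc:/network/iscsi/target:default
>> 
>> # create the target (prints the auto-generated IQN) and the target group
>> itadm create-target
>> stmfadm create-tg vm-storage-group
>> 
>> # a target must be offline while it is added to a target group
>> stmfadm offline-target <target-iqn>
>> stmfadm add-tg-member -g vm-storage-group <target-iqn>
>> stmfadm online-target <target-iqn>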
>> 
>> Proxmox has this iSCSI target configured as a "ZFS over iSCSI"
>> storage using a block size of 8k and the "Write cache" option
>> enabled.
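>> 
>> For reference, the corresponding entry in /etc/pve/storage.cfg looks
>> roughly like this (portal address and target IQN are placeholders):
>> 
>> zfs: vm-storage
>>     pool tank
>>     blocksize 8k
>>     iscsiprovider comstar
>>     portal 10.10.10.2
>>     target iqn.2010-08.org.illumos:02:example:vm-storage
>>     content images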
>> 
>> This is where the problem starts:
>> 
>> dd if=/dev/zero of=/tank/test bs=1G count=20 conv=fdatasync
>> 
>> This dd test yields around 300 MB/s directly on the napp-it system.
>> 
>> dd if=/dev/zero of=/home/test bs=1G count=20 conv=fdatasync
>> 
>> This dd test yields around 100 MB/s on a VM with its disk on the
>> napp-it system, connected via iSCSI.
>> 
>> The problem here is not the absolute numbers, as these tests do not
>> provide accurate figures; the problem is the difference between the
>> two values. I expected at least something around 80% of the local
>> bandwidth, but it is usually around 30% or less.
>> 
>> What I noticed during the tests: When running the test locally on
>> the napp-it system, all disks will be fully utilized (read using
>> iostat -x 1). When running the test inside a VM, the disk
>> utilization barely reaches 30% (which seems to reflect the results
>> of the bandwidth displayed by dd).
>> 
>> These 30% are only reached if the logical unit of the VM disk has
>> the write-back cache enabled. Disabling it results in 20-30 MB/s with
>> the dd test mentioned above. Enabling it also increases the disk
>> utilization.
>> 
>> These values are also seen during disk migration. Migrating one
>> disk results in slow speed and low disk utilization. Migrating
>> several disks in parallel will eventually cause 100% disk
>> utilization.
>> 
>> I also tested an NFS share as VM storage in Proxmox. Running the same
>> test inside a VM on the NFS share yields results around 200-220
>> MB/s. This is better (and shows that the traffic is going over the
>> fast link between the servers), but still not great, as I lose a
>> third of the throughput.
>> 
>> I am fairly new to the Solaris and ZFS world, so any help is greatly
>> appreciated.
>> 
>> Thanks in advance!
>> 
>> Steffen
> 
>> _______________________________________________
>> OmniOS-discuss mailing list
>> OmniOS-discuss at lists.omniti.com
>> http://lists.omniti.com/mailman/listinfo/omnios-discuss
> 

