[OmniOS-discuss] Status of TRIM support?

Saso Kiselkov skiselkov.ml at gmail.com
Wed May 28 09:36:23 UTC 2014


On 5/28/14, 3:11 AM, Dan Swartzendruber wrote:
> 
> So I've been running with sync=disabled on my vSphere NFS datastore.  I've
> been willing to do so because I have a big-ass UPS, and do hourly backups.
>  But I'm thinking of going to an active/passive connection to my JBOD,
> following Saso's ZFS blog post at zfs-create.blogspot.com.  Here's why I
> think I can't keep using sync=disabled (I would love to have my logic
> sanity-checked).  If you switch manually from host A to B, all is well, since
> before host A exports the pool, any pending writes will be completed (so
> even though we lied to vSphere, it's okay).  On the other hand, if host A
> crashes/hangs and host B takes over, forcibly importing the pool, you
> could end up with the following scenario: vSphere issues writes for blocks
> A, B, C, D and E.  A and B have been written.  C and D were sent to host
> A, and ACKed, so vSphere thinks all is well.  Host A has not yet committed
> blocks C and D to disk.  Host B imports the pool, assumes the virtual IP
> for the NFS share and vSphere reconnects to the datastore.  Since it
> thinks it has written blocks A-D, it then issues a write for block E.
> Host B commits that to disk.  vSphere thinks blocks A-E were written to
> disk, when in fact, blocks C and D were not.  Silent data corruption, and
> as far as I can tell, no way to know this happened, so if I ever did have
> a forced failover, I would have to roll back every single VM to the last
> known-good snapshot.  Anyway, I decided to see what would happen
> write-wise with an SLOG SSD.  I took a Samsung 840 Pro used for L2ARC and
> made that a log device.  I ran CrystalDiskMark before and after.  Prior to
> the SLOG, I was getting about 90MB/sec (gigabit Ethernet), which is pretty
> good.  Afterward, it went down to 8MB/sec!  I pulled the SSD and plugged
> it into my Windows 7 workstation, formatted it and deleted the partition,
> which should have TRIM'ed it.  I reinserted it as SLOG and re-ran the
> test.  50MB/sec.  Still not great, but this is after all an MLC device,
> not SLC, and that's probably 'good enough'.  Looking at open-zfs.org, it
> looks like out of illumos, FreeBSD and ZoL, only FreeBSD has TRIM now.  I
> don't want to have to re-TRIM the thing every few weeks (or however long
> it takes).  Does over-provisioning help?
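
The forced-failover scenario described above can be made concrete with a
small model. This is only an illustrative sketch, not ZFS or NFS code: the
function name, its parameters and the assumption that a write ACKed with
sync honored is always durable are simplifications introduced here.

```python
def forced_failover(blocks, crash_after, flushed_on_a, sync_disabled):
    """Toy model of host A crashing and host B taking over (hypothetical).

    blocks        -- write order issued by the client, e.g. "ABCDE"
    crash_after   -- number of writes host A received and ACKed before crashing
    flushed_on_a  -- number of those writes host A had actually put on disk
    sync_disabled -- True if host A ACKs before the data is on stable storage
    Returns (acked, on_disk) as two sets of block names.
    """
    acked, on_disk = set(), set()
    for i, blk in enumerate(blocks):
        if i < crash_after:
            # Host A ACKs every write it receives.
            acked.add(blk)
            # With sync honored we assume an ACK implies durability;
            # with sync=disabled only the already-flushed writes survive.
            if not sync_disabled or i < flushed_on_a:
                on_disk.add(blk)
        else:
            # Host B stays up, so its writes do reach disk.
            acked.add(blk)
            on_disk.add(blk)
    return acked, on_disk


# Blocks A and B flushed, C and D ACKed but still in memory when A dies.
acked, on_disk = forced_failover("ABCDE", crash_after=4, flushed_on_a=2,
                                 sync_disabled=True)
lost = sorted(acked - on_disk)
print("ACKed but missing:", lost)
```

In this model the client holds ACKs for all five blocks while C and D are
absent from disk, and nothing on the storage side records that they were
ever promised.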

Hi Dan,

First off, the Samsung 840 Pro apparently doesn't have power loss
protection, so DON'T use it for slog (ZIL). Use some enterprise-class
SSD that has proper protection of its DRAM contents. Even better, if you
have the cash to spend, get a ZeusRAM - these are true NVRAM devices
with extremely low latency.

If you use an SSD for slog, do a secure erase on it and then partition
it so that you leave something like 1/3 of it unused and untouched by
the OS. Evidence suggests that this might dramatically improve write
IOPS consistency:
http://www.anandtech.com/show/6489/playing-with-op
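
As a rough worked example of the "leave about 1/3 unused" suggestion, the
sketch below computes how large the slog partition would be for a given
drive. The function name, the 1 MiB alignment and the 256 GB example
capacity are assumptions made for illustration; they do not come from the
thread.

```python
from fractions import Fraction

MIB = 1024 * 1024


def slog_partition_bytes(capacity_bytes, unused=Fraction(1, 3), align=MIB):
    """Size of the partition to create, leaving `unused` of the drive untouched.

    The result is rounded down to a multiple of `align` (an assumed 1 MiB).
    """
    if not 0 <= unused < 1:
        raise ValueError("unused must be in [0, 1)")
    usable = int(capacity_bytes * (1 - unused))  # exact, then truncated
    return usable - (usable % align)


# Hypothetical 256 GB (decimal) drive.
capacity = 256 * 10**9
part = slog_partition_bytes(capacity)
print(f"partition: {part // MIB} MiB, left unused: {(capacity - part) // MIB} MiB")
```

The remainder of the drive is simply never partitioned, so after a secure
erase the controller can treat it as spare area.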

Cheers,
-- 
Saso

