[OmniOS-discuss] multithreaded gzip (or equivalent) and moving some files while preserving file trees

Valrhona valrhona at gmail.com
Sat Aug 17 23:01:30 UTC 2013


> Just in case, you might also want to consider 7zip - I think it is
> parallel out of the box, and might offer best compression of them
> all (if your backups happen to be more constrained by space than IO)
> though not all versions support stdin|stdout compression.
Thanks. I use this for default compression on individual files, so it
makes sense to try it here.


> Also note that the common mailing-list wisdom argues against relying
> on ZFS-send streams in files as a backup method. Unlike ZFS itself,
> the streams offer little to no protection against bitrot, except that
> if a stream is corrupt - this is detectable and you can no longer
> import it. When the streams are used "live", piped to zfs receive
> to save into a target dataset, the transport error can be detected
> (at least by the admin or scripts that do the operation) and retried.
> When the stream file image is your only medium during restoration
> (especially if it is stored not on ZFS which would protect the file's
> backend storage) - by Murphy's Law you can bet to find the needed data
> un-importable.
Thanks for the concern. I didn't elaborate on what I actually do
further downstream.

I do a zfs send > file.zstream

To check integrity, I do a zstreamdump < file.zstream

Then before committing to tape, checksum again:

md5sum file.zstream > file.md5

I then tar it to tape. Then I destroy the original filesystem
which had the zstream file, create a new zpool, restore the
files from the tape, and rerun the md5sum. This way, if
a single bit changes anywhere in the backup and restore process, I will
know. And the checksum is stored on the tape in the file.md5 file, so
at any point in the future I can restore and know that the data is
good.
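
The checksum round trip above can be sketched as a pair of small shell
functions. This is only a rough sketch under assumptions: the function
names record_checksum and verify_checksum, the /staging and /restore
directories, the dataset names, and the tape device path are all made up
for illustration and are not taken from the steps above. It uses
"md5sum -c" to do the before/after comparison instead of comparing the
two sums by eye.

```shell
#!/bin/sh
# Sketch of the checksum round trip; helper names are hypothetical.

# record_checksum FILE
# Write FILE's MD5 sum next to it (file.zstream -> file.md5). The sum is
# taken from inside the file's directory so the .md5 stores a relative
# name and still verifies after both files are restored elsewhere.
record_checksum() {
    dir=$(dirname "$1")
    name=$(basename "$1")
    (cd "$dir" && md5sum "$name" > "${name%.*}.md5")
}

# verify_checksum MD5FILE
# Recompute the sum of the file listed in MD5FILE and compare.
# Returns non-zero if the data no longer matches the recorded sum.
verify_checksum() {
    dir=$(dirname "$1")
    name=$(basename "$1")
    (cd "$dir" && md5sum -c "$name" >/dev/null 2>&1)
}

# Hypothetical usage (dataset names, paths and tape device are placeholders):
#   zfs send tank/data@snap > /staging/file.zstream
#   zstreamdump < /staging/file.zstream
#   record_checksum /staging/file.zstream
#   (cd /staging && tar cf /dev/rmt/0 file.zstream file.md5)
#   ... later, after restoring both files from tape into /restore ...
#   verify_checksum /restore/file.md5 && echo "stream matches recorded sum"
```

Keeping the .md5 file relative to its own directory is a design choice
here, so the same check works wherever the tape contents end up.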

This is somewhat tedious and takes a while, but fortunately it's just
a few commands and I can let it run in the background. But I don't see
any better backup solution, because ZFS internally checksums, and I do
a manual backup and restore, and checksum before and after. I think
this is more intense in terms of verification than most of the other
solutions I know.

It is also a bit more future-proof, since it relies only on ZFS, tar,
and md5sum, all of which are on live-CDs and standard illumos (without
any libraries). So with a tape, tape drive, enough space on a hard
disk subsystem, I can restore my filesystems without any additional
software (don't even need network connectivity), and have end-to-end
checksumming.

But maybe there are things I could be doing better?

