[OmniOS-discuss] multithreaded gzip (or equivalent) and moving some files while preserving file trees

Jim Klimov jimklimov at cos.ru
Sun Aug 18 08:44:33 UTC 2013


On 2013-08-18 01:01, Valrhona wrote:
>> Also note that the common mailing-list wisdom argues against relying
>> on ZFS-send streams in files as a backup method. Unlike ZFS itself,
>> the streams offer little to no protection against bitrot, except that
>> if a stream is corrupt - this is detectable and you can no longer
>> import it. When the streams are used "live", piped to zfs receive
>> to save into a target dataset, the transport error can be detected
>> (at least by the admin or scripts that do the operation) and retried.
>> When the stream file image is your only medium during restoration
>> (especially if it is stored not on ZFS which would protect the file's
>> backend storage) - by Murphy's Law you can bet to find the needed data
>> un-importable.
> Thanks for the concern. So I didn't elaborate on what I actually do
> further downstream.
>
> I do a zfs send > file.zstream
>
> To check integrity, I do a zstreamdump < file.zstream
>
> Then before committing to tape, checksum again:
>
> md5sum file.zstream > file.md5
>
> I then tar it to tape. Then I destroy the original filesystem
> which had the zstream file and create a new zpool, restore the
> contents of the tape to the files, then rerun the md5sum. This way, if
> a single bit changes throughout the backup and restore process, I will
> know. And the checksum is stored on the tape in the file.md5 file, so
> at any point in the future I can restore and know that the data is
> good.
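
A rough sketch of the checksum round trip quoted above, as a pair of
shell helpers. This is an illustration and not the poster's actual script:
the function names, the "FILE.md5" naming, and the pool, snapshot and tape
device names in the comments are placeholders, and "md5sum -c" assumes a
GNU-style md5sum.

```shell
#!/bin/sh
# Sketch of the checksum round trip; the zfs and tar steps appear as comments.

# record_checksum FILE: write FILE.md5 next to FILE.
record_checksum() {
    ( cd "$(dirname "$1")" && md5sum "$(basename "$1")" > "$(basename "$1").md5" )
}

# verify_checksum FILE: succeed only if FILE still matches FILE.md5.
verify_checksum() {
    ( cd "$(dirname "$1")" && md5sum -c "$(basename "$1").md5" >/dev/null 2>&1 )
}

# Hypothetical usage (pool, snapshot and tape device names are placeholders):
#   zfs send tank/data@backup > file.zstream
#   zstreamdump < file.zstream
#   record_checksum file.zstream
#   tar cf /dev/rmt/0 file.zstream file.zstream.md5
#   ... later, after restoring both files from tape ...
#   verify_checksum file.zstream || echo "stream file changed since backup"
```

The checksum file stores only the base name, so the check works from
whatever directory the pair of files is restored into.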

Luckily for me, ZFS stream files (stored on ZFS) have not failed me yet.
I believe some of the illumos distros use such images for quick
installation of zones and systems, so the approach is not "taboo" -
but my experience ends there, and concerns from the internet step in.

Namely, "zfs receive" would also detect a single bit-flip, and would
refuse to use the stream to create a dataset at all. That is, the
error would not be limited to some unlucky file in the received
dataset; the whole dataset would be missing.

On a side note, you can easily stage an experiment: take a copy of a
ZFS stream file, corrupt it with "dd" by replacing a byte or more with
different data (you can also forge a byte to be one bit off from the
originally stored value) and see what happens today upon receive,
and in particular whether the error diagnostics offer any practical remedy.
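
A minimal sketch of such an experiment, assuming a POSIX shell with "dd"
and "od" available. The helper name flip_bit, the offset, and the file,
pool and dataset names in the comments are made up for illustration; what
"zfs receive" actually reports is exactly the open question, so no output
is shown.

```shell
#!/bin/sh
# Sketch: flip one bit in a COPY of a stream file, then try to receive it.
# Never run this against your only copy of a stream.

# flip_bit FILE OFFSET: invert the lowest bit of the byte at OFFSET (0-based).
flip_bit() {
    file=$1
    offset=$2
    # Read the original byte as an unsigned decimal number.
    orig=$(dd if="$file" bs=1 skip="$offset" count=1 2>/dev/null |
           od -An -tu1 | tr -dc '0-9')
    [ -n "$orig" ] || return 1          # offset is past the end of the file
    new=$(( orig ^ 1 ))
    # Write the changed byte back in place; conv=notrunc keeps the rest intact.
    printf '%b' "\\0$(printf '%03o' "$new")" |
        dd of="$file" bs=1 seek="$offset" count=1 conv=notrunc 2>/dev/null
}

# Hypothetical usage (file, pool and dataset names are placeholders):
#   cp file.zstream corrupt.zstream
#   flip_bit corrupt.zstream 1000000
#   cmp file.zstream corrupt.zstream      # reports the first differing byte
#   zfs receive scratchpool/test < corrupt.zstream
```

Receiving into a scratch pool keeps the experiment away from real data.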

Maybe this is FUD by now and the "problem" was fixed, e.g. by allowing
an admin to force receipt of a stream, or some ECC/CRC was added to
the stream format, or... - I just don't know, and hope that people who
do will step in and tell us the current state of things in this area.

Also, there was some work on ZFS backups via integration with the
NDMP protocol (format?). Here, again, "I heard a tune but don't know
the words", so search the internet for these keywords to see if this
"something" fits you better :)

HTH,
//Jim


