[OmniOS-discuss] multithreaded gzip (or equivalent) and moving some files while preserving file trees

Jim Klimov jimklimov at cos.ru
Fri Aug 16 08:20:10 UTC 2013


On 2013-08-15 21:44, Valrhona wrote:
> Thanks to OmniTI for making a fantastic product for the community!
>
> I am doing a bunch of backups, and trying to organize data, and have two
> questions:
>
> 1. Is there a better alternative, perhaps in the new package
> repositories, for gzip-style compression that is multithreaded? I am
> doing the usual zfs send to a file, which I then back up to tape. Using
> gzip makes the process like 100x slower than if I just dump the zfs
> stream to an uncompressed file, and so it's not practical from a time
> standpoint.

I am not sure about the repositories, but there are projects such as
pigz and pbzip2, which are parallelized front-ends to the same
compression libraries and are easy to compile.

They split the incoming data into chunks of a size you specify
(something like 900 KB for best results) and hand them off to
different cores; a dispatcher thread then collects the results and
writes them out in the correct order. The resulting archives/streams
are format-compatible with the ordinary single-threaded
implementations.
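
For example, something along these lines should work for a zfs send
stream (the dataset names, thread count and target paths below are
only placeholders, so adjust to taste):

  # compress the send stream on 8 cores; the output is a plain gzip
  # file that any single-threaded gzip/gunzip can read back later
  zfs send tank/data@backup | pigz -p 8 -9 > /backup/tank-data.zfs.gz

  # and the reverse, feeding the stream back into zfs receive
  unpigz -c /backup/tank-data.zfs.gz | zfs receive tank/restored

(pbzip2 can sit in the same place in the pipeline, with its own flags.)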

In some versions of pigz there was a problem with compressing
multiple filename arguments (some state was not cleared between
files, so they were processed as if concatenated), so for predictable
results it is safer to script a loop and call pigz once per file, as
in the sketch below :)
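
Something like this (the directory and pattern are just placeholders):

  # one pigz invocation per file, to sidestep the multi-argument issue
  for f in /some/dir/*.dat; do
      pigz -9 "$f"
  done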

> 2. This is a more general UNIX question: I have a lot of directories
> with mixed files, but I want to extract all of the files with a certain
> extension, say .xyz, and move only those files to a different zfs
> filesystem. In the destination, I want to recreate the directory
> structure of the original tree, but only have the .xyz files in them
> (these are large, uncompressed raw data). So the source and destination
> should have the same directory structure; the source would have none of
> the .xyz files, and the destination would have all of the .xyz files.
>
> Is there a simple way to do this with mv, or is another command
> recommended? I am sure this is obvious to many of the unix gurus around,
> so any help would be appreciated. Thanks!

You can look into the examples in the cpio manpage - it takes the
output of the find command as input, so you can "find" files matching
the name pattern(s) you need, and since their paths include the
directories which contain them, cpio recreates that structure at the
destination. If you need the whole directory tree (including
directories with no matching files), also feed it "find -type d" -
cpio archives only the exact paths it is given, so the directories
come across as empty entries rather than being recursed into. Or if
you're doing this locally, just create the structure under the new
root with "mkdir -p". For remote sending, you might do the same and
then archive this empty structure ;)
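
To illustrate with an untested sketch (/tank/src and /tank/xyzdata
stand in for the source and destination filesystem mountpoints):

  cd /tank/src
  # recreate the tree and copy just the *.xyz files into it:
  # -p is cpio's pass-through mode, -d creates leading directories,
  # -m preserves modification times
  find . -name '*.xyz' -print | cpio -pdm /tank/xyzdata

  # optionally recreate directories that hold no *.xyz files at all
  find . -type d -print | cpio -pdm /tank/xyzdata

  # once the copies are verified, drop the originals from the source
  find . -name '*.xyz' -exec rm {} +

There is no single mv invocation that filters by extension and
rebuilds the tree on another filesystem, so a copy-then-remove like
the above is the usual approach.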

HTH,
//Jim Klimov



