[OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?

Mark Harrison mark at omniti.com
Tue Mar 4 23:29:42 UTC 2014


You mention 'directories' being empty. Does /fs3-test-02 contain empty
directories before being mounted? If so, that would be why zfs thinks
it isn't empty and then fails to mount it. However, the child
filesystems might still mount because their own directories are empty,
giving the appearance of everything being mounted OK. I'm not sure why
truss isn't showing zfs trying to mount non-rpool filesystems, but it
should be doing so. My wild guess right now is that zfs checks whether
the directory is empty first, and only shows up as doing something in
truss if the directory isn't empty.
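
A quick way to check is to compare what ZFS thinks is mounted with what
is actually sitting in the mountpoint directory. Something like this
(dataset names are taken from your error output; adjust as needed, and
run the ls while the dataset is unmounted):

    zfs get -r mounted,mountpoint fs3-test-02   # what ZFS thinks is mounted where
    ls -lA /fs3-test-02                         # anything listed here will block the mount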

We've had this happen before when someone runs mv on a directory that
is actually the root of a filesystem. When zfs remounts it on reboot,
it gets remounted at the old location, which may or may not have other
data in it by then (this comes up a lot when doing something like
'mv foo foo.old; mkdir foo; do_stuff_with foo'). I've not tracked down
the exact pathology of this when it happens, but our fix has basically
been to unmount all affected filesystems, rmdir all the blank
directories, move any non-blank directories aside (keep them in case
they hold data that needs to be kept), and then run 'zfs mount -a' to
let it clean things up; a rough sketch of that follows.
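
This is only a sketch using the dataset names from your error output;
adjust it to whatever is actually affected on your machine:

    zfs umount fs3-test-02/h/999
    zfs umount fs3-test-02
    rmdir /fs3-test-02/h/999 /fs3-test-02/h     # only succeeds on genuinely empty dirs
    mv /fs3-test-02 /fs3-test-02.old            # set aside anything that still has data
    zfs mount -a                                # let zfs recreate the mountpoints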



On Tue, Mar 4, 2014 at 6:03 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
>  I will ask my question to start with and then explain the background.
> As far as I can tell from running truss on the 'zfs mount -a' in
> /lib/svc/method/fs-local, this *does not* mount filesystems from pools
> other than rpool. However, the mounts are absent immediately before it
> runs and present immediately afterwards. So: does anyone understand
> how this works? I assume 'zfs mount -a' is doing some ZFS action that
> activates non-rpool pools and causes them to magically mount their
> filesystems?
>
>  Thanks in advance if anyone knows this.
>
> Background:
>  I am having an extremely weird heisenbug problem where on boot[*] our
> test OmniOS machine fails out at the ZFS mount stage with errors about:
>
>         Reading ZFS config: done.
>         Mounting ZFS filesystems: cannot mount 'fs3-test-01': mountpoint or dataset is busy
>         cannot mount '/fs3-test-02': directory is not empty
>         cannot mount 'fs3-test-02/h/999': mountpoint or dataset is busy
>         (20/20)
>         svc:/system/filesystem/local:default: WARNING: /usr/sbin/zfs mount -a failed: exit status 1
> [failures go on]
>
> The direct problem here is that as far as I can tell this is incorrect.
> If I log in to the console after this failure, the pools and their
> filesystems are present. If I hack up /lib/svc/method/fs-local to add
> debugging stuff, all of the directories involved are empty (and unmounted)
> before 'zfs mount -a' runs and magically present afterwards, even as 'zfs
> mount -a' complains and errors out. That was when I started truss'ing
> the 'zfs mount -a' itself and discovered that it normally doesn't mount
> non-rpool filesystems. In fact, based on a truss trace I have during an
> incident it appears that the problem happens exactly when 'zfs mount -a'
> thinks that it *does* need to mount such a filesystem but finds that
> the target directory already has things in it because the filesystem is
> actually mounted already.
>
>  Running truss on the 'zfs mount -a' seems to make this happen much less
> frequently, especially a relatively verbose truss that is tracing calls
> in libzfs as well as system calls. This makes me wonder if there is some
> sort of a race involved.
>
>         - cks
> [*: the other problem is that the test OmniOS machine has stopped actually
>     rebooting when I run 'reboot'; it hangs during shutdown and must be
>     power cycled (and I have the magic fastboot settings turned off).
>     Neither this nor the mount problem used to happen; both appeared this
>     morning. No packages have been updated.
> ]
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss



-- 
Mark Harrison
Lead Site Reliability Engineer
OmniTI

