[OmniOS-discuss] How do non-rpool ZFS filesystems get mounted?

Chris Siebenmann cks at cs.toronto.edu
Tue Mar 4 23:03:13 UTC 2014


 I will ask my question to start with and then explain the background.
As far as I can tell from running truss on the 'zfs mount -a' in
/lib/svc/method/fs-local, this *does not* mount filesystems from pools
other than rpool. However the mounts are absent immediately before it
runs and present immediately afterwards. So: does anyone understand
how this works? I assume 'zfs mount -a' is doing some ZFS action that
activates non-rpool pools and causes them to magically mount their
filesystems?

 Thanks in advance if anyone knows this.

Background:
 I am having an extremely weird heisenbug problem where on boot[*] our
test OmniOS machine fails out at the ZFS mount stage with errors about:

	Reading ZFS config: done.
	Mounting ZFS filesystems: cannot mount 'fs3-test-01': mountpoint or dataset is busy
	cannot mount '/fs3-test-02': directory is not empty
	cannot mount 'fs3-test-02/h/999': mountpoint or dataset is busy
	(20/20)
	svc:/system/filesystem/local:default: WARNING: /usr/sbin/zfs mount -a failed: exit status 1
[failures go on]

The direct problem here is that, as far as I can tell, these errors are wrong.
If I log in to the console after this failure, the pools and their
filesystems are present. If I hack up /lib/svc/method/fs-local to add
debugging stuff, all of the directories involved are empty (and unmounted)
before 'zfs mount -a' runs and magically present afterwards, even as 'zfs
mount -a' complains and errors out. That was when I started truss'ing
the 'zfs mount -a' itself and discovered that it normally doesn't mount
non-rpool filesystems. In fact, based on a truss trace I have during an
incident it appears that the problem happens exactly when 'zfs mount -a'
thinks that it *does* need to mount such a filesystem but finds that
the target directory already has things in it because the filesystem is
actually mounted already.
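
The kind of before-and-after check described above can be sketched roughly
as follows. This is not the debugging code that was actually added to
fs-local; it is a minimal Python illustration, and the function name
describe_mountpoint and its state strings are made up for this example. It
assumes that a mounted ZFS filesystem shows up as a mount point to
os.path.ismount(), which compares the device of a directory with that of
its parent.

```python
import os
import sys


def describe_mountpoint(path):
    """Classify a would-be mountpoint directory (hypothetical helper).

    Returns one of four strings so that output captured before and
    after 'zfs mount -a' can be compared line by line.
    """
    if not os.path.isdir(path):
        return "missing"
    # A directory that is already a mount point explains a
    # 'directory is not empty' complaint from a later mount attempt.
    if os.path.ismount(path):
        return "already mounted"
    if os.listdir(path):
        return "not empty, not a mountpoint"
    return "empty, unmounted"


if __name__ == "__main__":
    # Usage: pass the mountpoint directories to inspect as arguments.
    for p in sys.argv[1:]:
        print(p, describe_mountpoint(p), sep=": ")
```

Running this against the affected directories immediately before and after
the 'zfs mount -a' would show whether a directory went from "empty,
unmounted" to "already mounted" even though the command reported a failure.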

 Running truss on the 'zfs mount -a' seems to make this happen much less
frequently, especially a relatively verbose truss that is tracing calls
in libzfs as well as system calls. This makes me wonder if there is some
sort of a race involved.

	- cks
[*: the other problem is that the test OmniOS machine has stopped actually
    rebooting when I run 'reboot'; it hangs during shutdown and must be
    power cycled (and I have the magic fastboot settings turned off).
    Neither this nor the mount problem used to happen; both appeared this
    morning. No packages have been updated.
]

