[OmniOS-discuss] Clues for tracking down a drastic ZFS fs space difference?

Chris Siebenmann cks at cs.toronto.edu
Wed Apr 29 20:00:34 UTC 2015


> > On Apr 29, 2015, at 3:21 PM, Chris Siebenmann <cks at cs.toronto.edu> wrote:
> > 
> > We have a filesystem/dataset with no snapshots,
> 
> You're sure about no snapshots? "zfs list -t snapshot" has surprised
> me once or twice in the past. :-/

 Completely sure. 'zfs list -t snapshot' has nothing and all of the
usedby* figures for anything except 'usedbydataset' are 0; no space
used by snapshots, children, or refreservation.
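
 As a rough sketch of that check, the script below reads the usedby*
properties in scripted form ('zfs get -Hp') and reports any of them other
than 'usedbydataset' that is nonzero. The dataset name 'tank/fs' and the
helper names are hypothetical, and the parsing assumes the tab-separated
"property<TAB>value" output that '-H -o property,value' produces.

```python
import subprocess

# The four usedby* properties that together make up a dataset's 'used' figure.
USEDBY_PROPS = (
    "usedbydataset",
    "usedbysnapshots",
    "usedbychildren",
    "usedbyrefreservation",
)


def parse_zfs_get(output):
    """Parse 'zfs get -Hp -o property,value ...' output into {property: bytes}."""
    values = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        prop, value = line.split("\t")
        values[prop] = int(value)
    return values


def nonzero_extras(values):
    """Return the usedby* properties, other than usedbydataset, that are nonzero."""
    return {
        prop: value
        for prop, value in values.items()
        if prop in USEDBY_PROPS and prop != "usedbydataset" and value != 0
    }


def space_breakdown(dataset):
    """Run zfs get for one dataset (hypothetical name) and parse the result."""
    cmd = ["zfs", "get", "-Hp", "-o", "property,value",
           ",".join(USEDBY_PROPS), dataset]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_zfs_get(out)


if __name__ == "__main__":
    # 'tank/fs' is a placeholder; substitute the real dataset name.
    print(nonzero_extras(space_breakdown("tank/fs")))
```

 An empty result means all of the space is charged to the dataset itself,
which is the situation described above.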

 I forgot to mention: this is an NFS server with no local processes. So
if any deleted files are being held open by user processes on client
machines, they should show up as NFS '.nfs*' silly-rename files. fuser
reports that nothing on the fileserver is using anything (which is what
I'd expect).
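
 One way to look for such silly-rename files is to walk the filesystem and
total up anything whose name starts with '.nfs'. This is only a sketch:
the function names and the mount point in the usage line are hypothetical,
and it reports logical file sizes (st_size), which can differ from the
space ZFS actually charges to the dataset.

```python
import os


def find_silly_renames(root):
    """Yield (path, size_in_bytes) for every '.nfs*' file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(".nfs"):
                path = os.path.join(dirpath, name)
                try:
                    size = os.lstat(path).st_size
                except OSError:
                    # The file may vanish if the client closes it mid-walk.
                    continue
                yield path, size


def total_silly_rename_bytes(root):
    """Sum the logical sizes of all '.nfs*' files under root."""
    return sum(size for _path, size in find_silly_renames(root))


if __name__ == "__main__":
    # '/tank/fs' is a placeholder mount point.
    for path, size in find_silly_renames("/tank/fs"):
        print(size, path)
```

 If the total is small compared to the unexplained space, open-but-deleted
files on NFS clients are unlikely to be the cause.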

> > What sort of things should I be looking at to try to figure out why
> > this is happening, including with eg zdb? Are there any obvious
> > reasons why this would be happening? Is there any easy way to fix
> > this short of 'copy all data to a new dataset, destroy old dataset,
> > put new dataset in the place of the old?'
>
> I take it that a reboot of this machine (which would kill any
> processes with an open-but-deleted file) has already been done?

 A reboot of the machine hasn't been done because it would have a very
high user impact (this fileserver holds several of our most crucial core
filesystems) and so far there's no sign that a reboot would fix things
(eg by forcing some process to die and relinquish an open file).

 Here I need to cough, shuffle my feet, and admit that this machine is
running r151010, not r151014, and may not be running the very latest
r151010 kernel at that. Are there any potentially relevant bug fixes
in ZFS (or NFS) between r151010 and r151014?

	- cks

