On Fri, Jul 22, 2016 at 12:40:26PM +1000, NeilBrown wrote:
> On Fri, Jul 22 2016, J. Bruce Fields wrote:
>
> > On Fri, Jul 22, 2016 at 11:08:17AM +1000, NeilBrown wrote:
> >> On Fri, Jun 10 2016, fdmanana@xxxxxxxxxx wrote:
> >>
> >> > From: Filipe Manana <fdmanana@xxxxxxxx>
> >> >
> >> > When we attempt to read an inode from disk, we end up always returning an
> >> > -ESTALE error to the caller regardless of the actual failure reason, which
> >> > can be an out of memory problem (when allocating a path), some error found
> >> > when reading from the fs/subvolume btree (like a genuine IO error) or the
> >> > inode does not exists. So lets start returning the real error code to the
> >> > callers so that they don't treat all -ESTALE errors as meaning that the
> >> > inode does not exists (such as during orphan cleanup). This will also be
> >> > needed for a subsequent patch in the same series dealing with a special
> >> > fsync case.
> >> >
> >> > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
> >>
> >> SNIP
> >>
> >> > @@ -5594,7 +5602,8 @@ struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
> >> > } else {
> >> > unlock_new_inode(inode);
> >> > iput(inode);
> >> > - inode = ERR_PTR(-ESTALE);
> >> > + ASSERT(ret < 0);
> >> > + inode = ERR_PTR(ret < 0 ? ret : -ESTALE);
> >> > }
> >>
> >> Just a heads-up. This change breaks NFS :-(
> >>
> >> The change in error code percolates up the call chain:
> >>
> >> nfs4_pufh->fh_verify->nfsd_set_fh_dentry->exportfs_decode_fh
> >> ->btrfs_fh_to_dentry->ntrfs_get_dentry->btrfs_iget
> >>
> >> and nfsd returns NFS4ERR_NOENT to the client instead of NFS4ERR_STALE,
> >> and the client doesn't handle that quite the same way.
> >>
> >> This doesn't mean that the change is wrong, but it could mean we need to
> >> fix something else in the path to sanitize the error code.
> >>
> >> nfsd_set_fh_dentry already has
> >>
> >> error = nfserr_stale;
> >> if (PTR_ERR(exp) == -ENOENT)
> >> return error;
> >>
> >> if (IS_ERR(exp))
> >> return nfserrno(PTR_ERR(exp));
> >>
> >> for a different error case, so duplicating that would work, but I doubt
> >> it is best. At the very least we should check for valid errors, not
> >> specific invalid ones.
> >>
> >> Bruce: do you have an opinion where we should make sure that PUTFH (and
> >> various other requests) returns a valid error code?
> >
> > Uh, I guess not. Maybe exportfs_decode_fh?
> >
> > Though my kneejerk reaction is to be cranky and wonder why btrfs
> > suddenly needs a different convention for decode_fh().
> >
>
> I can certainly agree with that perspective, though it would be
> appropriate in that case to make sure we document the requirements for
> fh_to_dentry (the current spelling for 'decode_fh'). So I went looking
> for documentation and found:
>
> * fh_to_dentry:
> * @fh_to_dentry is given a &struct super_block (@sb) and a file handle
> * fragment (@fh, @fh_len). It should return a &struct dentry which refers
> * to the same file that the file handle fragment refers to. If it cannot,
> * it should return a %NULL pointer if the file was found but no acceptable
> * &dentries were available, or an %ERR_PTR error code indicating why it
> * couldn't be found (e.g. %ENOENT or %ENOMEM). Any suitable dentry can be
> * returned including, if necessary, a new dentry created with d_alloc_root.
> * The caller can then find any other extant dentries by following the
> * d_alias links.
> *
>
> So the new btrfs code is actually conformant!!
> That documentation dates back to 2002 when I wrote it....
> And it looks like ENOENT wasn't handled correctly then :-(
>
> I suspect anything that isn't ENOMEM should be converted to ESTALE.
> ENOMEM causes the client to be asked to retry the request later.
>
> Does this look reasonable to you?
> (Adding Christof as he as contributed a lot to exportfs)
>
> If there is agreement I'll test and post a proper patch.
I can live with it. It bothers me that we're losing potentially useful
information about what went wrong in the filesystem. Maybe this is a
place a dprintk could be handy?
--b.
>
> Thanks,
> NeilBrown
>
>
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index 207ba8d627ca..3527b58cd5bc 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -428,10 +428,10 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
> if (!nop || !nop->fh_to_dentry)
> return ERR_PTR(-ESTALE);
> result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type);
> - if (!result)
> - result = ERR_PTR(-ESTALE);
> - if (IS_ERR(result))
> - return result;
> + if (PTR_ERR(result) == -ENOMEM)
> + return ERR_CAST(result)
> + if (IS_ERR_OR_NULL(result))
> + return ERR_PTR(-ESTALE);
>
> if (d_is_dir(result)) {
> /*
> @@ -541,6 +541,8 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
>
> err_result:
> dput(result);
> + if (err != -ENOMEM)
> + err = -ESTALE;
> return ERR_PTR(err);
> }
> EXPORT_SYMBOL_GPL(exportfs_decode_fh);
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html