mirror of
https://github.com/brain-hackers/linux-brain.git
synced 2024-06-09 15:26:21 +09:00
a528d35e8b
Add a system call to make extended file information available, including
file creation and some attribute flags where available through the
underlying filesystem.

The getattr inode operation is altered to take two additional arguments: a
u32 request_mask and an unsigned int flags that indicate the
synchronisation mode.  This change is propagated to the vfs_getattr*()
functions.

Functions like vfs_stat() are now inline wrappers around new functions
vfs_statx() and vfs_statx_fd() to reduce stack usage.

========
OVERVIEW
========

The idea was initially proposed as a set of xattrs that could be retrieved
with getxattr(), but the general preference proved to be for a new syscall
with an extended stat structure.

A number of requests were gathered for features to be included.  The
following have been included:

 (1) Make the fields a consistent size on all arches and make them large.

 (2) Spare space, request flags and information flags are provided for
     future expansion.

 (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an
     __s64).

 (4) Creation time: The SMB protocol carries the creation time, which could
     be exported by Samba, which will in turn help CIFS make use of
     FS-Cache as that can be used for coherency data (stx_btime).

     This is also specified in NFSv4 as a recommended attribute and could
     be exported by NFSD [Steve French].

 (5) Lightweight stat: Ask for just those details of interest, and allow a
     netfs (such as NFS) to approximate anything not of interest, possibly
     without going to the server [Trond Myklebust, Ulrich Drepper, Andreas
     Dilger] (AT_STATX_DONT_SYNC).

 (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks
     its cached attributes are up to date [Trond Myklebust]
     (AT_STATX_FORCE_SYNC).

And the following have been left out for future extension:

 (7) Data version number: Could be used by userspace NFS servers [Aneesh
     Kumar].

     Can also be used to modify fill_post_wcc() in NFSD which retrieves
     i_version directly, but has just called vfs_getattr().  It could get
     it from the kstat struct if it used vfs_xgetattr() instead.

     (There's disagreement on the exact semantics of a single field, since
     not all filesystems do this the same way).

 (8) BSD stat compatibility: Including more fields from the BSD stat such
     as creation time (st_btime) and inode generation number (st_gen)
     [Jeremy Allison, Bernd Schubert].

 (9) Inode generation number: Useful for FUSE and userspace NFS servers
     [Bernd Schubert].

     (This was asked for but later deemed unnecessary with the
     open-by-handle capability available and caused disagreement as to
     whether it's a security hole or not).

(10) Extra coherency data may be useful in making backups [Andreas Dilger].

     (No particular data were offered, but things like last backup
     timestamp, the data version number and the DOS archive bit would come
     into this category).

(11) Allow the filesystem to indicate what it can/cannot provide: A
     filesystem can now say it doesn't support a standard stat feature if
     that isn't available, so if, for instance, inode numbers or UIDs don't
     exist or are fabricated locally...

     (This requires a separate system call - I have an fsinfo() call idea
     for this).

(12) Store a 16-byte volume ID in the superblock that can be returned in
     struct xstat [Steve French].

     (Deferred to fsinfo).

(13) Include granularity fields in the time data to indicate the
     granularity of each of the times (NFSv4 time_delta) [Steve French].

     (Deferred to fsinfo).

(14) FS_IOC_GETFLAGS value.  These could be translated to BSD's st_flags.
     Note that the Linux IOC flags are a mess and filesystems such as Ext4
     define flags that aren't in linux/fs.h, so translation in the kernel
     may be a necessity (or, possibly, we provide the filesystem type too).

     (Some attributes are made available in stx_attributes, but the general
     feeling was that the IOC flags were too ext[234]-specific and
     shouldn't be exposed through statx this way).

(15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer,
     Michael Kerrisk].

     (Deferred, probably to fsinfo.  Finding out if there's an ACL or
     seclabel might require extra filesystem operations).

(16) Femtosecond-resolution timestamps [Dave Chinner].

     (A __reserved field has been left in the statx_timestamp struct for
     this - if there proves to be a need).

(17) A set multiple attributes syscall to go with this.

===============
NEW SYSTEM CALL
===============

The new system call is:

	int ret = statx(int dfd,
			const char *filename,
			unsigned int flags,
			unsigned int mask,
			struct statx *buffer);

The dfd, filename and flags parameters indicate the file to query, in a
similar way to fstatat().  There is no equivalent of lstat() as that can be
emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags.  There is
also no equivalent of fstat() as that can be emulated by passing a NULL
filename to statx() with the fd of interest in dfd.

Whether or not statx() synchronises the attributes with the backing store
can be controlled by OR'ing a value into the flags argument (this typically
only affects network filesystems):

 (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this
     respect.

 (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise
     its attributes with the server - which might require data writeback to
     occur to get the timestamps correct.

 (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a
     network filesystem.  The resulting values should be considered
     approximate.

mask is a bitmask indicating the fields in struct statx that are of
interest to the caller.  The user should set this to STATX_BASIC_STATS to
get the basic set returned by stat().  It should be noted that asking for
more information may entail extra I/O operations.

buffer points to the destination for the data.  This must be 256 bytes in
size.

======================
MAIN ATTRIBUTES RECORD
======================

The following structures are defined in which to return the main attribute
set:

	struct statx_timestamp {
		__s64	tv_sec;
		__s32	tv_nsec;
		__s32	__reserved;
	};

	struct statx {
		__u32	stx_mask;
		__u32	stx_blksize;
		__u64	stx_attributes;
		__u32	stx_nlink;
		__u32	stx_uid;
		__u32	stx_gid;
		__u16	stx_mode;
		__u16	__spare0[1];
		__u64	stx_ino;
		__u64	stx_size;
		__u64	stx_blocks;
		__u64	__spare1[1];
		struct statx_timestamp	stx_atime;
		struct statx_timestamp	stx_btime;
		struct statx_timestamp	stx_ctime;
		struct statx_timestamp	stx_mtime;
		__u32	stx_rdev_major;
		__u32	stx_rdev_minor;
		__u32	stx_dev_major;
		__u32	stx_dev_minor;
		__u64	__spare2[14];
	};

The defined bits in request_mask and stx_mask are:

	STATX_TYPE		Want/got stx_mode & S_IFMT
	STATX_MODE		Want/got stx_mode & ~S_IFMT
	STATX_NLINK		Want/got stx_nlink
	STATX_UID		Want/got stx_uid
	STATX_GID		Want/got stx_gid
	STATX_ATIME		Want/got stx_atime{,_ns}
	STATX_MTIME		Want/got stx_mtime{,_ns}
	STATX_CTIME		Want/got stx_ctime{,_ns}
	STATX_INO		Want/got stx_ino
	STATX_SIZE		Want/got stx_size
	STATX_BLOCKS		Want/got stx_blocks
	STATX_BASIC_STATS	[The stuff in the normal stat struct]
	STATX_BTIME		Want/got stx_btime{,_ns}
	STATX_ALL		[All currently available stuff]

stx_btime is the file creation time, stx_mask is a bitmask indicating the
data provided and __spare*[] are where as-yet undefined fields can be
placed.

Time fields are structures with separate seconds and nanoseconds fields
plus a reserved field in case we want to add even finer resolution.  Note
that times will be negative if before 1970; in such a case, the nanosecond
fields will also be negative if not zero.

The bits defined in the stx_attributes field convey information about a
file, how it is accessed, where it is and what it does.  The following
attributes map to FS_*_FL flags and are the same numerical value:

	STATX_ATTR_COMPRESSED		File is compressed by the fs
	STATX_ATTR_IMMUTABLE		File is marked immutable
	STATX_ATTR_APPEND		File is append-only
	STATX_ATTR_NODUMP		File is not to be dumped
	STATX_ATTR_ENCRYPTED		File requires key to decrypt in fs

Within the kernel, the supported flags are listed by:

	KSTAT_ATTR_FS_IOC_FLAGS

[Are any other IOC flags of sufficient general interest to be exposed
through this interface?]

New flags include:

	STATX_ATTR_AUTOMOUNT		Object is an automount trigger

These are for the use of GUI tools that might want to mark files specially,
depending on what they are.

Fields in struct statx come in a number of classes:

 (0) stx_dev_*, stx_blksize.

     These are local system information and are always available.

 (1) stx_mode, stx_nlink, stx_uid, stx_gid, stx_[amc]time, stx_ino,
     stx_size, stx_blocks.

     These will be returned whether the caller asks for them or not.  The
     corresponding bits in stx_mask will be set to indicate whether they
     actually have valid values.

     If the caller didn't ask for them, then they may be approximated.  For
     example, NFS won't waste any time updating them from the server,
     unless as a byproduct of updating something requested.

     If the values don't actually exist for the underlying object (such as
     UID or GID on a DOS file), then the bit won't be set in the stx_mask,
     even if the caller asked for the value.  In such a case, the returned
     value will be a fabrication.

     Note that there are instances where the type might not be valid, for
     instance Windows reparse points.

 (2) stx_rdev_*.

     This will be set only if stx_mode indicates we're looking at a
     blockdev or a chardev, otherwise will be 0.

 (3) stx_btime.

     Similar to (1), except this will be set to 0 if it doesn't exist.

=======
TESTING
=======

The following test program can be used to test the statx system call:

	samples/statx/test-statx.c

Just compile and run, passing it paths to the files you want to examine.
The file is built automatically if CONFIG_SAMPLES is enabled.

Here's some example output.  Firstly, an NFS directory that crosses to
another FSID.  Note that the AUTOMOUNT attribute is set because transiting
this directory will cause d_automount to be invoked by the VFS.

	[root@andromeda ~]# /tmp/test-statx -A /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:26           Inode: 1703937     Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000
	Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------)

Secondly, the result of automounting on that directory.

	[root@andromeda ~]# /tmp/test-statx /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:27           Inode: 2           Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
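As a rough userspace illustration of the call described above, the following sketch asks for the file type, size and creation time, then checks stx_mask to see which of them the filesystem actually supplied.  It assumes the glibc wrapper statx() (glibc 2.28 or later) rather than a raw syscall; struct file_info and query_file_info() are hypothetical names made up for this example and are not part of the kernel or glibc.

```c
#define _GNU_SOURCE		/* exposes statx() and struct statx in glibc >= 2.28 */
#include <assert.h>
#include <errno.h>		/* errno is set by statx() on failure */
#include <fcntl.h>		/* AT_FDCWD, AT_SYMLINK_NOFOLLOW, AT_STATX_* */
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>		/* statx(), struct statx, STATX_*, S_ISDIR */

/* Hypothetical result type for this sketch; not a kernel or glibc type. */
struct file_info {
	uint64_t size;		/* meaningful only if have_size is set */
	int64_t btime_sec;	/* meaningful only if have_btime is set */
	int have_size;
	int have_btime;
	int is_dir;
};

/*
 * Query type, size and creation time of 'path' without following a
 * trailing symlink (the lstat()-like behaviour mentioned above).
 * Returns 0 on success, -1 with errno set on failure.
 */
static int query_file_info(const char *path, struct file_info *out)
{
	struct statx stx;
	unsigned int mask = STATX_TYPE | STATX_SIZE | STATX_BTIME;
	int flags = AT_SYMLINK_NOFOLLOW | AT_STATX_SYNC_AS_STAT;

	assert(path != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	if (statx(AT_FDCWD, path, flags, mask, &stx) != 0)
		return -1;

	/* stx_mask says which requested fields hold valid values. */
	if (stx.stx_mask & STATX_SIZE) {
		out->size = stx.stx_size;
		out->have_size = 1;
	}
	if (stx.stx_mask & STATX_BTIME) {
		out->btime_sec = stx.stx_btime.tv_sec;
		out->have_btime = 1;
	}
	if (stx.stx_mask & STATX_TYPE)
		out->is_dir = S_ISDIR(stx.stx_mode);

	return 0;
}
```

A caller would typically test have_btime before using btime_sec, since many filesystems do not record a creation time.  Swapping AT_STATX_SYNC_AS_STAT for AT_STATX_DONT_SYNC in flags should give the lightweight variant on a network filesystem, at the cost of approximate values.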
1853 lines
45 KiB
C
/*
  FUSE: Filesystem in Userspace
  Copyright (C) 2001-2008  Miklos Szeredi <miklos@szeredi.hu>

  This program can be distributed under the terms of the GNU GPL.
  See the file COPYING.
*/

#include "fuse_i.h"

#include <linux/pagemap.h>
#include <linux/file.h>
#include <linux/sched.h>
#include <linux/namei.h>
#include <linux/slab.h>
#include <linux/xattr.h>
#include <linux/posix_acl.h>

static bool fuse_use_readdirplus(struct inode *dir, struct dir_context *ctx)
{
	struct fuse_conn *fc = get_fuse_conn(dir);
	struct fuse_inode *fi = get_fuse_inode(dir);

	if (!fc->do_readdirplus)
		return false;
	if (!fc->readdirplus_auto)
		return true;
	if (test_and_clear_bit(FUSE_I_ADVISE_RDPLUS, &fi->state))
		return true;
	if (ctx->pos == 0)
		return true;
	return false;
}

static void fuse_advise_use_readdirplus(struct inode *dir)
{
	struct fuse_inode *fi = get_fuse_inode(dir);

	set_bit(FUSE_I_ADVISE_RDPLUS, &fi->state);
}

union fuse_dentry {
	u64 time;
	struct rcu_head rcu;
};

static inline void fuse_dentry_settime(struct dentry *entry, u64 time)
{
	((union fuse_dentry *) entry->d_fsdata)->time = time;
}

static inline u64 fuse_dentry_time(struct dentry *entry)
{
	return ((union fuse_dentry *) entry->d_fsdata)->time;
}

/*
 * FUSE caches dentries and attributes with separate timeout.  The
 * time in jiffies until the dentry/attributes are valid is stored in
 * dentry->d_fsdata and fuse_inode->i_time respectively.
 */

/*
 * Calculate the time in jiffies until a dentry/attributes are valid
 */
static u64 time_to_jiffies(u64 sec, u32 nsec)
{
	if (sec || nsec) {
		struct timespec64 ts = {
			sec,
			min_t(u32, nsec, NSEC_PER_SEC - 1)
		};

		return get_jiffies_64() + timespec64_to_jiffies(&ts);
	} else
		return 0;
}

/*
 * Set dentry and possibly attribute timeouts from the lookup/mk*
 * replies
 */
static void fuse_change_entry_timeout(struct dentry *entry,
				      struct fuse_entry_out *o)
{
	fuse_dentry_settime(entry,
		time_to_jiffies(o->entry_valid, o->entry_valid_nsec));
}

static u64 attr_timeout(struct fuse_attr_out *o)
{
	return time_to_jiffies(o->attr_valid, o->attr_valid_nsec);
}

static u64 entry_attr_timeout(struct fuse_entry_out *o)
{
	return time_to_jiffies(o->attr_valid, o->attr_valid_nsec);
}

/*
 * Mark the attributes as stale, so that at the next call to
 * ->getattr() they will be fetched from userspace
 */
void fuse_invalidate_attr(struct inode *inode)
{
	get_fuse_inode(inode)->i_time = 0;
}

/**
 * Mark the attributes as stale due to an atime change.  Avoid the invalidate if
 * atime is not used.
 */
void fuse_invalidate_atime(struct inode *inode)
{
	if (!IS_RDONLY(inode))
		fuse_invalidate_attr(inode);
}

/*
 * Just mark the entry as stale, so that a next attempt to look it up
 * will result in a new lookup call to userspace
 *
 * This is called when a dentry is about to become negative and the
 * timeout is unknown (unlink, rmdir, rename and in some cases
 * lookup)
 */
void fuse_invalidate_entry_cache(struct dentry *entry)
{
	fuse_dentry_settime(entry, 0);
}

/*
 * Same as fuse_invalidate_entry_cache(), but also try to remove the
 * dentry from the hash
 */
static void fuse_invalidate_entry(struct dentry *entry)
{
	d_invalidate(entry);
	fuse_invalidate_entry_cache(entry);
}

static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args,
			     u64 nodeid, const struct qstr *name,
			     struct fuse_entry_out *outarg)
{
	memset(outarg, 0, sizeof(struct fuse_entry_out));
	args->in.h.opcode = FUSE_LOOKUP;
	args->in.h.nodeid = nodeid;
	args->in.numargs = 1;
	args->in.args[0].size = name->len + 1;
	args->in.args[0].value = name->name;
	args->out.numargs = 1;
	args->out.args[0].size = sizeof(struct fuse_entry_out);
	args->out.args[0].value = outarg;
}

u64 fuse_get_attr_version(struct fuse_conn *fc)
{
	u64 curr_version;

	/*
	 * The spin lock isn't actually needed on 64bit archs, but we
	 * don't yet care too much about such optimizations.
	 */
	spin_lock(&fc->lock);
	curr_version = fc->attr_version;
	spin_unlock(&fc->lock);

	return curr_version;
}

/*
 * Check whether the dentry is still valid
 *
 * If the entry validity timeout has expired and the dentry is
 * positive, try to redo the lookup.  If the lookup results in a
 * different inode, then let the VFS invalidate the dentry and redo
 * the lookup once more.  If the lookup results in the same inode,
 * then refresh the attributes, timeouts and mark the dentry valid.
 */
static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
{
	struct inode *inode;
	struct dentry *parent;
	struct fuse_conn *fc;
	struct fuse_inode *fi;
	int ret;

	inode = d_inode_rcu(entry);
	if (inode && is_bad_inode(inode))
		goto invalid;
	else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) ||
		 (flags & LOOKUP_REVAL)) {
		struct fuse_entry_out outarg;
		FUSE_ARGS(args);
		struct fuse_forget_link *forget;
		u64 attr_version;

		/* For negative dentries, always do a fresh lookup */
		if (!inode)
			goto invalid;

		ret = -ECHILD;
		if (flags & LOOKUP_RCU)
			goto out;

		fc = get_fuse_conn(inode);

		forget = fuse_alloc_forget();
		ret = -ENOMEM;
		if (!forget)
			goto out;

		attr_version = fuse_get_attr_version(fc);

		parent = dget_parent(entry);
		fuse_lookup_init(fc, &args, get_node_id(d_inode(parent)),
				 &entry->d_name, &outarg);
		ret = fuse_simple_request(fc, &args);
		dput(parent);
		/* Zero nodeid is same as -ENOENT */
		if (!ret && !outarg.nodeid)
			ret = -ENOENT;
		if (!ret) {
			fi = get_fuse_inode(inode);
			if (outarg.nodeid != get_node_id(inode)) {
				fuse_queue_forget(fc, forget, outarg.nodeid, 1);
				goto invalid;
			}
			spin_lock(&fc->lock);
			fi->nlookup++;
			spin_unlock(&fc->lock);
		}
		kfree(forget);
		if (ret == -ENOMEM)
			goto out;
		if (ret || (outarg.attr.mode ^ inode->i_mode) & S_IFMT)
			goto invalid;

		forget_all_cached_acls(inode);
		fuse_change_attributes(inode, &outarg.attr,
				       entry_attr_timeout(&outarg),
				       attr_version);
		fuse_change_entry_timeout(entry, &outarg);
	} else if (inode) {
		fi = get_fuse_inode(inode);
		if (flags & LOOKUP_RCU) {
			if (test_bit(FUSE_I_INIT_RDPLUS, &fi->state))
				return -ECHILD;
		} else if (test_and_clear_bit(FUSE_I_INIT_RDPLUS, &fi->state)) {
			parent = dget_parent(entry);
			fuse_advise_use_readdirplus(d_inode(parent));
			dput(parent);
		}
	}
	ret = 1;
out:
	return ret;

invalid:
	ret = 0;
	goto out;
}

static int invalid_nodeid(u64 nodeid)
{
	return !nodeid || nodeid == FUSE_ROOT_ID;
}

static int fuse_dentry_init(struct dentry *dentry)
{
	dentry->d_fsdata = kzalloc(sizeof(union fuse_dentry), GFP_KERNEL);

	return dentry->d_fsdata ? 0 : -ENOMEM;
}
static void fuse_dentry_release(struct dentry *dentry)
{
	union fuse_dentry *fd = dentry->d_fsdata;

	kfree_rcu(fd, rcu);
}

const struct dentry_operations fuse_dentry_operations = {
	.d_revalidate	= fuse_dentry_revalidate,
	.d_init		= fuse_dentry_init,
	.d_release	= fuse_dentry_release,
};

const struct dentry_operations fuse_root_dentry_operations = {
	.d_init		= fuse_dentry_init,
	.d_release	= fuse_dentry_release,
};

int fuse_valid_type(int m)
{
	return S_ISREG(m) || S_ISDIR(m) || S_ISLNK(m) || S_ISCHR(m) ||
		S_ISBLK(m) || S_ISFIFO(m) || S_ISSOCK(m);
}

int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name,
		     struct fuse_entry_out *outarg, struct inode **inode)
{
	struct fuse_conn *fc = get_fuse_conn_super(sb);
	FUSE_ARGS(args);
	struct fuse_forget_link *forget;
	u64 attr_version;
	int err;

	*inode = NULL;
	err = -ENAMETOOLONG;
	if (name->len > FUSE_NAME_MAX)
		goto out;

	forget = fuse_alloc_forget();
	err = -ENOMEM;
	if (!forget)
		goto out;

	attr_version = fuse_get_attr_version(fc);

	fuse_lookup_init(fc, &args, nodeid, name, outarg);
	err = fuse_simple_request(fc, &args);
	/* Zero nodeid is same as -ENOENT, but with valid timeout */
	if (err || !outarg->nodeid)
		goto out_put_forget;

	err = -EIO;
	if (!outarg->nodeid)
		goto out_put_forget;
	if (!fuse_valid_type(outarg->attr.mode))
		goto out_put_forget;

	*inode = fuse_iget(sb, outarg->nodeid, outarg->generation,
			   &outarg->attr, entry_attr_timeout(outarg),
			   attr_version);
	err = -ENOMEM;
	if (!*inode) {
		fuse_queue_forget(fc, forget, outarg->nodeid, 1);
		goto out;
	}
	err = 0;

 out_put_forget:
	kfree(forget);
 out:
	return err;
}

static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
				  unsigned int flags)
{
	int err;
	struct fuse_entry_out outarg;
	struct inode *inode;
	struct dentry *newent;
	bool outarg_valid = true;

	fuse_lock_inode(dir);
	err = fuse_lookup_name(dir->i_sb, get_node_id(dir), &entry->d_name,
			       &outarg, &inode);
	fuse_unlock_inode(dir);
	if (err == -ENOENT) {
		outarg_valid = false;
		err = 0;
	}
	if (err)
		goto out_err;

	err = -EIO;
	if (inode && get_node_id(inode) == FUSE_ROOT_ID)
		goto out_iput;

	newent = d_splice_alias(inode, entry);
	err = PTR_ERR(newent);
	if (IS_ERR(newent))
		goto out_err;

	entry = newent ? newent : entry;
	if (outarg_valid)
		fuse_change_entry_timeout(entry, &outarg);
	else
		fuse_invalidate_entry_cache(entry);

	fuse_advise_use_readdirplus(dir);
	return newent;

 out_iput:
	iput(inode);
 out_err:
	return ERR_PTR(err);
}

/*
 * Atomic create+open operation
 *
 * If the filesystem doesn't support this, then fall back to separate
 * 'mknod' + 'open' requests.
 */
static int fuse_create_open(struct inode *dir, struct dentry *entry,
			    struct file *file, unsigned flags,
			    umode_t mode, int *opened)
{
	int err;
	struct inode *inode;
	struct fuse_conn *fc = get_fuse_conn(dir);
	FUSE_ARGS(args);
	struct fuse_forget_link *forget;
	struct fuse_create_in inarg;
	struct fuse_open_out outopen;
	struct fuse_entry_out outentry;
	struct fuse_file *ff;

	/* Userspace expects S_IFREG in create mode */
	BUG_ON((mode & S_IFMT) != S_IFREG);

	forget = fuse_alloc_forget();
	err = -ENOMEM;
	if (!forget)
		goto out_err;

	err = -ENOMEM;
	ff = fuse_file_alloc(fc);
	if (!ff)
		goto out_put_forget_req;

	if (!fc->dont_mask)
		mode &= ~current_umask();

	flags &= ~O_NOCTTY;
	memset(&inarg, 0, sizeof(inarg));
	memset(&outentry, 0, sizeof(outentry));
	inarg.flags = flags;
	inarg.mode = mode;
	inarg.umask = current_umask();
	args.in.h.opcode = FUSE_CREATE;
	args.in.h.nodeid = get_node_id(dir);
	args.in.numargs = 2;
	args.in.args[0].size = sizeof(inarg);
	args.in.args[0].value = &inarg;
	args.in.args[1].size = entry->d_name.len + 1;
	args.in.args[1].value = entry->d_name.name;
	args.out.numargs = 2;
	args.out.args[0].size = sizeof(outentry);
	args.out.args[0].value = &outentry;
	args.out.args[1].size = sizeof(outopen);
	args.out.args[1].value = &outopen;
	err = fuse_simple_request(fc, &args);
	if (err)
		goto out_free_ff;

	err = -EIO;
	if (!S_ISREG(outentry.attr.mode) || invalid_nodeid(outentry.nodeid))
		goto out_free_ff;

	ff->fh = outopen.fh;
	ff->nodeid = outentry.nodeid;
	ff->open_flags = outopen.open_flags;
	inode = fuse_iget(dir->i_sb, outentry.nodeid, outentry.generation,
			  &outentry.attr, entry_attr_timeout(&outentry), 0);
	if (!inode) {
		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
		fuse_sync_release(ff, flags);
		fuse_queue_forget(fc, forget, outentry.nodeid, 1);
		err = -ENOMEM;
		goto out_err;
	}
	kfree(forget);
	d_instantiate(entry, inode);
	fuse_change_entry_timeout(entry, &outentry);
	fuse_invalidate_attr(dir);
	err = finish_open(file, entry, generic_file_open, opened);
	if (err) {
		fuse_sync_release(ff, flags);
	} else {
		file->private_data = fuse_file_get(ff);
		fuse_finish_open(inode, file);
	}
	return err;

out_free_ff:
	fuse_file_free(ff);
out_put_forget_req:
	kfree(forget);
out_err:
	return err;
}

static int fuse_mknod(struct inode *, struct dentry *, umode_t, dev_t);
static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
			    struct file *file, unsigned flags,
			    umode_t mode, int *opened)
{
	int err;
	struct fuse_conn *fc = get_fuse_conn(dir);
	struct dentry *res = NULL;

	if (d_in_lookup(entry)) {
		res = fuse_lookup(dir, entry, 0);
		if (IS_ERR(res))
			return PTR_ERR(res);

		if (res)
			entry = res;
	}

	if (!(flags & O_CREAT) || d_really_is_positive(entry))
		goto no_open;

	/* Only creates */
	*opened |= FILE_CREATED;

	if (fc->no_create)
		goto mknod;

	err = fuse_create_open(dir, entry, file, flags, mode, opened);
	if (err == -ENOSYS) {
		fc->no_create = 1;
		goto mknod;
	}
out_dput:
	dput(res);
	return err;

mknod:
	err = fuse_mknod(dir, entry, mode, 0);
	if (err)
		goto out_dput;
no_open:
	return finish_no_open(file, res);
}

/*
 * Code shared between mknod, mkdir, symlink and link
 */
static int create_new_entry(struct fuse_conn *fc, struct fuse_args *args,
			    struct inode *dir, struct dentry *entry,
			    umode_t mode)
{
	struct fuse_entry_out outarg;
	struct inode *inode;
	int err;
	struct fuse_forget_link *forget;

	forget = fuse_alloc_forget();
	if (!forget)
		return -ENOMEM;

	memset(&outarg, 0, sizeof(outarg));
	args->in.h.nodeid = get_node_id(dir);
	args->out.numargs = 1;
	args->out.args[0].size = sizeof(outarg);
	args->out.args[0].value = &outarg;
	err = fuse_simple_request(fc, args);
	if (err)
		goto out_put_forget_req;

	err = -EIO;
	if (invalid_nodeid(outarg.nodeid))
		goto out_put_forget_req;

	if ((outarg.attr.mode ^ mode) & S_IFMT)
		goto out_put_forget_req;

	inode = fuse_iget(dir->i_sb, outarg.nodeid, outarg.generation,
			  &outarg.attr, entry_attr_timeout(&outarg), 0);
	if (!inode) {
		fuse_queue_forget(fc, forget, outarg.nodeid, 1);
		return -ENOMEM;
	}
	kfree(forget);

	err = d_instantiate_no_diralias(entry, inode);
	if (err)
		return err;

	fuse_change_entry_timeout(entry, &outarg);
	fuse_invalidate_attr(dir);
	return 0;

 out_put_forget_req:
	kfree(forget);
	return err;
}

static int fuse_mknod(struct inode *dir, struct dentry *entry, umode_t mode,
		      dev_t rdev)
{
	struct fuse_mknod_in inarg;
	struct fuse_conn *fc = get_fuse_conn(dir);
	FUSE_ARGS(args);

	if (!fc->dont_mask)
		mode &= ~current_umask();

	memset(&inarg, 0, sizeof(inarg));
	inarg.mode = mode;
	inarg.rdev = new_encode_dev(rdev);
	inarg.umask = current_umask();
	args.in.h.opcode = FUSE_MKNOD;
	args.in.numargs = 2;
	args.in.args[0].size = sizeof(inarg);
	args.in.args[0].value = &inarg;
	args.in.args[1].size = entry->d_name.len + 1;
	args.in.args[1].value = entry->d_name.name;
	return create_new_entry(fc, &args, dir, entry, mode);
}

static int fuse_create(struct inode *dir, struct dentry *entry, umode_t mode,
		       bool excl)
{
	return fuse_mknod(dir, entry, mode, 0);
}

static int fuse_mkdir(struct inode *dir, struct dentry *entry, umode_t mode)
{
	struct fuse_mkdir_in inarg;
	struct fuse_conn *fc = get_fuse_conn(dir);
	FUSE_ARGS(args);

	if (!fc->dont_mask)
		mode &= ~current_umask();

	memset(&inarg, 0, sizeof(inarg));
	inarg.mode = mode;
	inarg.umask = current_umask();
	args.in.h.opcode = FUSE_MKDIR;
	args.in.numargs = 2;
	args.in.args[0].size = sizeof(inarg);
	args.in.args[0].value = &inarg;
	args.in.args[1].size = entry->d_name.len + 1;
	args.in.args[1].value = entry->d_name.name;
	return create_new_entry(fc, &args, dir, entry, S_IFDIR);
}

static int fuse_symlink(struct inode *dir, struct dentry *entry,
			const char *link)
{
	struct fuse_conn *fc = get_fuse_conn(dir);
	unsigned len = strlen(link) + 1;
	FUSE_ARGS(args);

	args.in.h.opcode = FUSE_SYMLINK;
	args.in.numargs = 2;
	args.in.args[0].size = entry->d_name.len + 1;
	args.in.args[0].value = entry->d_name.name;
	args.in.args[1].size = len;
	args.in.args[1].value = link;
	return create_new_entry(fc, &args, dir, entry, S_IFLNK);
}

void fuse_update_ctime(struct inode *inode)
{
	if (!IS_NOCMTIME(inode)) {
		inode->i_ctime = current_time(inode);
		mark_inode_dirty_sync(inode);
	}
}

static int fuse_unlink(struct inode *dir, struct dentry *entry)
{
	int err;
	struct fuse_conn *fc = get_fuse_conn(dir);
	FUSE_ARGS(args);

	args.in.h.opcode = FUSE_UNLINK;
	args.in.h.nodeid = get_node_id(dir);
	args.in.numargs = 1;
	args.in.args[0].size = entry->d_name.len + 1;
	args.in.args[0].value = entry->d_name.name;
	err = fuse_simple_request(fc, &args);
	if (!err) {
		struct inode *inode = d_inode(entry);
		struct fuse_inode *fi = get_fuse_inode(inode);

		spin_lock(&fc->lock);
		fi->attr_version = ++fc->attr_version;
		/*
		 * If i_nlink == 0 then unlink doesn't make sense, yet this can
		 * happen if userspace filesystem is careless.  It would be
		 * difficult to enforce correct nlink usage so just ignore this
		 * condition here
		 */
		if (inode->i_nlink > 0)
			drop_nlink(inode);
		spin_unlock(&fc->lock);
		fuse_invalidate_attr(inode);
		fuse_invalidate_attr(dir);
		fuse_invalidate_entry_cache(entry);
		fuse_update_ctime(inode);
	} else if (err == -EINTR)
		fuse_invalidate_entry(entry);
	return err;
}

static int fuse_rmdir(struct inode *dir, struct dentry *entry)
{
	int err;
	struct fuse_conn *fc = get_fuse_conn(dir);
	FUSE_ARGS(args);

	args.in.h.opcode = FUSE_RMDIR;
	args.in.h.nodeid = get_node_id(dir);
	args.in.numargs = 1;
	args.in.args[0].size = entry->d_name.len + 1;
	args.in.args[0].value = entry->d_name.name;
	err = fuse_simple_request(fc, &args);
	if (!err) {
		clear_nlink(d_inode(entry));
		fuse_invalidate_attr(dir);
		fuse_invalidate_entry_cache(entry);
	} else if (err == -EINTR)
		fuse_invalidate_entry(entry);
	return err;
}

static int fuse_rename_common(struct inode *olddir, struct dentry *oldent,
			      struct inode *newdir, struct dentry *newent,
			      unsigned int flags, int opcode, size_t argsize)
{
	int err;
	struct fuse_rename2_in inarg;
	struct fuse_conn *fc = get_fuse_conn(olddir);
	FUSE_ARGS(args);

	memset(&inarg, 0, argsize);
	inarg.newdir = get_node_id(newdir);
	inarg.flags = flags;
	args.in.h.opcode = opcode;
	args.in.h.nodeid = get_node_id(olddir);
	args.in.numargs = 3;
	args.in.args[0].size = argsize;
	args.in.args[0].value = &inarg;
	args.in.args[1].size = oldent->d_name.len + 1;
	args.in.args[1].value = oldent->d_name.name;
	args.in.args[2].size = newent->d_name.len + 1;
	args.in.args[2].value = newent->d_name.name;
	err = fuse_simple_request(fc, &args);
	if (!err) {
		/* ctime changes */
		fuse_invalidate_attr(d_inode(oldent));
		fuse_update_ctime(d_inode(oldent));

		if (flags & RENAME_EXCHANGE) {
			fuse_invalidate_attr(d_inode(newent));
			fuse_update_ctime(d_inode(newent));
		}

		fuse_invalidate_attr(olddir);
		if (olddir != newdir)
			fuse_invalidate_attr(newdir);

		/* newent will end up negative */
		if (!(flags & RENAME_EXCHANGE) && d_really_is_positive(newent)) {
			fuse_invalidate_attr(d_inode(newent));
			fuse_invalidate_entry_cache(newent);
			fuse_update_ctime(d_inode(newent));
		}
	} else if (err == -EINTR) {
		/* If request was interrupted, DEITY only knows if the
		   rename actually took place.  If the invalidation
		   fails (e.g. some process has CWD under the renamed
		   directory), then there can be inconsistency between
		   the dcache and the real filesystem.  Tough luck. */
		fuse_invalidate_entry(oldent);
		if (d_really_is_positive(newent))
			fuse_invalidate_entry(newent);
	}

	return err;
}

static int fuse_rename2(struct inode *olddir, struct dentry *oldent,
			struct inode *newdir, struct dentry *newent,
			unsigned int flags)
{
	struct fuse_conn *fc = get_fuse_conn(olddir);
	int err;

	if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE))
		return -EINVAL;

	if (flags) {
		if (fc->no_rename2 || fc->minor < 23)
			return -EINVAL;

		err = fuse_rename_common(olddir, oldent, newdir, newent, flags,
					 FUSE_RENAME2,
					 sizeof(struct fuse_rename2_in));
		if (err == -ENOSYS) {
			fc->no_rename2 = 1;
			err = -EINVAL;
		}
	} else {
		err = fuse_rename_common(olddir, oldent, newdir, newent, 0,
					 FUSE_RENAME,
					 sizeof(struct fuse_rename_in));
	}

	return err;
}

static int fuse_link(struct dentry *entry, struct inode *newdir,
		     struct dentry *newent)
{
	int err;
	struct fuse_link_in inarg;
	struct inode *inode = d_inode(entry);
	struct fuse_conn *fc = get_fuse_conn(inode);
	FUSE_ARGS(args);

	memset(&inarg, 0, sizeof(inarg));
	inarg.oldnodeid = get_node_id(inode);
	args.in.h.opcode = FUSE_LINK;
	args.in.numargs = 2;
	args.in.args[0].size = sizeof(inarg);
	args.in.args[0].value = &inarg;
	args.in.args[1].size = newent->d_name.len + 1;
	args.in.args[1].value = newent->d_name.name;
	err = create_new_entry(fc, &args, newdir, newent, inode->i_mode);
	/* Contrary to "normal" filesystems it can happen that link
	   makes two "logical" inodes point to the same "physical"
	   inode.  We invalidate the attributes of the old one, so it
	   will reflect changes in the backing inode (link count,
	   etc.)
	*/
	if (!err) {
		struct fuse_inode *fi = get_fuse_inode(inode);

		spin_lock(&fc->lock);
		fi->attr_version = ++fc->attr_version;
		inc_nlink(inode);
		spin_unlock(&fc->lock);
		fuse_invalidate_attr(inode);
		fuse_update_ctime(inode);
	} else if (err == -EINTR) {
		fuse_invalidate_attr(inode);
	}
	return err;
}

static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
			  struct kstat *stat)
{
	unsigned int blkbits;
	struct fuse_conn *fc = get_fuse_conn(inode);

	/* see the comment in fuse_change_attributes() */
	if (fc->writeback_cache && S_ISREG(inode->i_mode)) {
		attr->size = i_size_read(inode);
		attr->mtime = inode->i_mtime.tv_sec;
		attr->mtimensec = inode->i_mtime.tv_nsec;
		attr->ctime = inode->i_ctime.tv_sec;
		attr->ctimensec = inode->i_ctime.tv_nsec;
	}

	stat->dev = inode->i_sb->s_dev;
	stat->ino = attr->ino;
	stat->mode = (inode->i_mode & S_IFMT) | (attr->mode & 07777);
	stat->nlink = attr->nlink;
	stat->uid = make_kuid(&init_user_ns, attr->uid);
	stat->gid = make_kgid(&init_user_ns, attr->gid);
	stat->rdev = inode->i_rdev;
	stat->atime.tv_sec = attr->atime;
	stat->atime.tv_nsec = attr->atimensec;
	stat->mtime.tv_sec = attr->mtime;
	stat->mtime.tv_nsec = attr->mtimensec;
	stat->ctime.tv_sec = attr->ctime;
	stat->ctime.tv_nsec = attr->ctimensec;
	stat->size = attr->size;
	stat->blocks = attr->blocks;

	if (attr->blksize != 0)
		blkbits = ilog2(attr->blksize);
	else
		blkbits = inode->i_sb->s_blocksize_bits;

	stat->blksize = 1 << blkbits;
}

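/*
 * Illustration only, not part of fs/fuse/dir.c: a minimal userspace sketch
 * of the block-size selection at the end of fuse_fillattr().  It shows that
 * a non-power-of-two blksize from the server is rounded down to a power of
 * two, and that zero falls back to the superblock default.  The names
 * floor_log2() and reported_blksize() are hypothetical; floor_log2() merely
 * stands in for the kernel's ilog2().
 */
```c
#include <stdint.h>

/* Floor of log2(v) for v > 0; a userspace stand-in for ilog2(). */
static unsigned int floor_log2(uint32_t v)
{
	unsigned int bits = 0;

	while (v >>= 1)
		bits++;
	return bits;
}

/*
 * Hypothetical helper: choose the block size to report.  A zero value from
 * the server selects the superblock default, given as a bit count.
 */
static uint32_t reported_blksize(uint32_t server_blksize,
				 unsigned int sb_blocksize_bits)
{
	unsigned int blkbits;

	if (server_blksize != 0)
		blkbits = floor_log2(server_blksize);
	else
		blkbits = sb_blocksize_bits;

	return (uint32_t)1 << blkbits;
}
```
/*
 * Under these assumptions, reported_blksize(5000, 12) yields 4096 because
 * floor_log2(5000) is 12.
 */
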
static int fuse_do_getattr(struct inode *inode, struct kstat *stat,
			   struct file *file)
{
	int err;
	struct fuse_getattr_in inarg;
	struct fuse_attr_out outarg;
	struct fuse_conn *fc = get_fuse_conn(inode);
	FUSE_ARGS(args);
	u64 attr_version;

	attr_version = fuse_get_attr_version(fc);

	memset(&inarg, 0, sizeof(inarg));
	memset(&outarg, 0, sizeof(outarg));
	/* Directories have separate file-handle space */
	if (file && S_ISREG(inode->i_mode)) {
		struct fuse_file *ff = file->private_data;

		inarg.getattr_flags |= FUSE_GETATTR_FH;
		inarg.fh = ff->fh;
	}
	args.in.h.opcode = FUSE_GETATTR;
	args.in.h.nodeid = get_node_id(inode);
	args.in.numargs = 1;
	args.in.args[0].size = sizeof(inarg);
	args.in.args[0].value = &inarg;
	args.out.numargs = 1;
	args.out.args[0].size = sizeof(outarg);
	args.out.args[0].value = &outarg;
	err = fuse_simple_request(fc, &args);
	if (!err) {
		if ((inode->i_mode ^ outarg.attr.mode) & S_IFMT) {
			make_bad_inode(inode);
			err = -EIO;
		} else {
			fuse_change_attributes(inode, &outarg.attr,
					       attr_timeout(&outarg),
					       attr_version);
			if (stat)
				fuse_fillattr(inode, &outarg.attr, stat);
		}
	}
	return err;
}

int fuse_update_attributes(struct inode *inode, struct kstat *stat,
			   struct file *file, bool *refreshed)
{
	struct fuse_inode *fi = get_fuse_inode(inode);
	int err;
	bool r;

	if (time_before64(fi->i_time, get_jiffies_64())) {
		r = true;
		forget_all_cached_acls(inode);
		err = fuse_do_getattr(inode, stat, file);
	} else {
		r = false;
		err = 0;
		if (stat) {
			generic_fillattr(inode, stat);
			stat->mode = fi->orig_i_mode;
			stat->ino = fi->orig_ino;
		}
	}

	if (refreshed != NULL)
		*refreshed = r;

	return err;
}

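/*
 * Illustration only, not part of fs/fuse/dir.c: a minimal userspace sketch
 * of the expiry test that fuse_update_attributes() applies before deciding
 * whether to ask the server again.  struct attr_cache, tick_before(),
 * attr_store() and attr_needs_refresh() are hypothetical names; the tick
 * counter stands in for jiffies and tick_before() for time_before64().
 * The signed cast assumes two's-complement conversion, as on common
 * compilers.
 */
```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-tolerant "a is earlier than b" on a 64-bit tick counter. */
static bool tick_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* Hypothetical cache record: the copy is trusted until valid_until. */
struct attr_cache {
	uint64_t valid_until;	/* tick at which the cached copy expires */
	uint64_t size;		/* one cached attribute, for illustration */
};

/* Record a fresh value together with its expiry time. */
static void attr_store(struct attr_cache *c, uint64_t size,
		       uint64_t now, uint64_t timeout)
{
	c->size = size;
	c->valid_until = now + timeout;
}

/* True when the cached copy has expired and must be refetched. */
static bool attr_needs_refresh(const struct attr_cache *c, uint64_t now)
{
	return tick_before(c->valid_until, now);
}
```
/*
 * In this sketch a copy stored at tick 100 with a timeout of 10 is still
 * served from the cache at tick 110 and refetched from tick 111 onwards.
 */
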
int fuse_reverse_inval_entry(struct super_block *sb, u64 parent_nodeid,
			     u64 child_nodeid, struct qstr *name)
{
	int err = -ENOTDIR;
	struct inode *parent;
	struct dentry *dir;
	struct dentry *entry;

	parent = ilookup5(sb, parent_nodeid, fuse_inode_eq, &parent_nodeid);
	if (!parent)
		return -ENOENT;

	inode_lock(parent);
	if (!S_ISDIR(parent->i_mode))
		goto unlock;

	err = -ENOENT;
	dir = d_find_alias(parent);
	if (!dir)
		goto unlock;

	name->hash = full_name_hash(dir, name->name, name->len);
	entry = d_lookup(dir, name);
	dput(dir);
	if (!entry)
		goto unlock;

	fuse_invalidate_attr(parent);
	fuse_invalidate_entry(entry);

	if (child_nodeid != 0 && d_really_is_positive(entry)) {
		inode_lock(d_inode(entry));
		if (get_node_id(d_inode(entry)) != child_nodeid) {
			err = -ENOENT;
			goto badentry;
		}
		if (d_mountpoint(entry)) {
			err = -EBUSY;
			goto badentry;
		}
		if (d_is_dir(entry)) {
			shrink_dcache_parent(entry);
			if (!simple_empty(entry)) {
				err = -ENOTEMPTY;
				goto badentry;
			}
			d_inode(entry)->i_flags |= S_DEAD;
		}
		dont_mount(entry);
		clear_nlink(d_inode(entry));
		err = 0;
 badentry:
		inode_unlock(d_inode(entry));
		if (!err)
			d_delete(entry);
	} else {
		err = 0;
	}
	dput(entry);

 unlock:
	inode_unlock(parent);
	iput(parent);
	return err;
}

/*
 * Calling into a user-controlled filesystem gives the filesystem
 * daemon ptrace-like capabilities over the current process.  This
 * means that the filesystem daemon is able to record the exact
 * filesystem operations performed, and can also control the behavior
 * of the requester process in otherwise impossible ways.  For example,
 * it can delay the operation for an arbitrary length of time, allowing
 * DoS against the requester.
 *
 * For this reason, only processes over which the owner of the mount
 * has ptrace privilege can call into the filesystem.  This excludes
 * processes started by other users, and suid or sgid processes.
 */
int fuse_allow_current_process(struct fuse_conn *fc)
{
	const struct cred *cred;

	if (fc->allow_other)
		return 1;

	cred = current_cred();
	if (uid_eq(cred->euid, fc->user_id) &&
	    uid_eq(cred->suid, fc->user_id) &&
	    uid_eq(cred->uid,  fc->user_id) &&
	    gid_eq(cred->egid, fc->group_id) &&
	    gid_eq(cred->sgid, fc->group_id) &&
	    gid_eq(cred->gid,  fc->group_id))
		return 1;

	return 0;
}

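/*
 * Illustration only, not part of fs/fuse/dir.c: a minimal userspace sketch
 * of the rule in fuse_allow_current_process().  It shows why a set-uid
 * process is rejected: its real, effective and saved IDs do not all match
 * the mount owner.  struct ids and may_call_into_fs() are hypothetical
 * names, and plain integers stand in for kuid_t/kgid_t.
 */
```c
#include <stdbool.h>

/* Hypothetical mirror of the six IDs the check compares. */
struct ids {
	unsigned int uid, euid, suid;
	unsigned int gid, egid, sgid;
};

/* Allow the caller only if every ID matches the mount owner's. */
static bool may_call_into_fs(const struct ids *c, bool allow_other,
			     unsigned int owner_uid, unsigned int owner_gid)
{
	if (allow_other)
		return true;	/* the mount opted out of the restriction */

	return c->euid == owner_uid && c->suid == owner_uid &&
	       c->uid == owner_uid &&
	       c->egid == owner_gid && c->sgid == owner_gid &&
	       c->gid == owner_gid;
}
```
/*
 * Requiring all six IDs to match, rather than only the effective ones, is
 * what keeps suid and sgid processes out in this sketch.
 */
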
static int fuse_access(struct inode *inode, int mask)
{
	struct fuse_conn *fc = get_fuse_conn(inode);
	FUSE_ARGS(args);
	struct fuse_access_in inarg;
	int err;

	BUG_ON(mask & MAY_NOT_BLOCK);

	if (fc->no_access)
		return 0;

	memset(&inarg, 0, sizeof(inarg));
	inarg.mask = mask & (MAY_READ | MAY_WRITE | MAY_EXEC);
	args.in.h.opcode = FUSE_ACCESS;
	args.in.h.nodeid = get_node_id(inode);
	args.in.numargs = 1;
	args.in.args[0].size = sizeof(inarg);
	args.in.args[0].value = &inarg;
	err = fuse_simple_request(fc, &args);
	if (err == -ENOSYS) {
		fc->no_access = 1;
		err = 0;
	}
	return err;
}

static int fuse_perm_getattr(struct inode *inode, int mask)
{
	if (mask & MAY_NOT_BLOCK)
		return -ECHILD;

	forget_all_cached_acls(inode);
	return fuse_do_getattr(inode, NULL, NULL);
}

/*
 * Check permission.  The two basic access models of FUSE are:
 *
 * 1) Local access checking ('default_permissions' mount option) based
 * on file mode.  This is the plain old disk filesystem permission
 * model.
 *
 * 2) "Remote" access checking, where server is responsible for
 * checking permission in each inode operation.  An exception to this
 * is if ->permission() was invoked from sys_access() in which case an
 * access request is sent.  Execute permission is still checked
 * locally based on file mode.
 */
static int fuse_permission(struct inode *inode, int mask)
{
	struct fuse_conn *fc = get_fuse_conn(inode);
	bool refreshed = false;
	int err = 0;

	if (!fuse_allow_current_process(fc))
		return -EACCES;

	/*
	 * If attributes are needed, refresh them before proceeding
	 */
	if (fc->default_permissions ||
	    ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))) {
		struct fuse_inode *fi = get_fuse_inode(inode);

		if (time_before64(fi->i_time, get_jiffies_64())) {
			refreshed = true;

			err = fuse_perm_getattr(inode, mask);
			if (err)
				return err;
		}
	}

	if (fc->default_permissions) {
		err = generic_permission(inode, mask);

		/* If permission is denied, try to refresh file
		   attributes.  This is also needed, because the root
		   node will at first have no permissions */
		if (err == -EACCES && !refreshed) {
			err = fuse_perm_getattr(inode, mask);
			if (!err)
				err = generic_permission(inode, mask);
		}

		/* Note: the opposite of the above test does not
		   exist.  So if permissions are revoked this won't be
		   noticed immediately, only after the attribute
		   timeout has expired */
	} else if (mask & (MAY_ACCESS | MAY_CHDIR)) {
		err = fuse_access(inode, mask);
	} else if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode)) {
		if (!(inode->i_mode & S_IXUGO)) {
			if (refreshed)
				return -EACCES;

			err = fuse_perm_getattr(inode, mask);
			if (!err && !(inode->i_mode & S_IXUGO))
				return -EACCES;
		}
	}
	return err;
}

static int parse_dirfile(char *buf, size_t nbytes, struct file *file,
			 struct dir_context *ctx)
{
	while (nbytes >= FUSE_NAME_OFFSET) {
		struct fuse_dirent *dirent = (struct fuse_dirent *) buf;
		size_t reclen = FUSE_DIRENT_SIZE(dirent);
		if (!dirent->namelen || dirent->namelen > FUSE_NAME_MAX)
			return -EIO;
		if (reclen > nbytes)
			break;
		if (memchr(dirent->name, '/', dirent->namelen) != NULL)
			return -EIO;

		if (!dir_emit(ctx, dirent->name, dirent->namelen,
			       dirent->ino, dirent->type))
			break;

		buf += reclen;
		nbytes -= reclen;
		ctx->pos = dirent->off;
	}

	return 0;
}

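/*
 * Illustration only, not part of fs/fuse/dir.c: a minimal userspace sketch
 * of the record walk in parse_dirfile().  It shows the same three checks
 * (name length, record fits in the buffer, no '/' in the name) on a
 * hypothetical record layout.  struct rec_hdr, REC_NAME_MAX, REC_SIZE(),
 * count_records() and put_record() are assumptions of this sketch, as is
 * the 8-byte record padding; they are not the FUSE wire format.
 */
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed header that precedes each name in the buffer. */
struct rec_hdr {
	uint64_t ino;
	uint64_t off;
	uint32_t namelen;
	uint32_t type;
};

#define REC_NAME_MAX 1024u
/* Header plus name, padded up to an 8-byte boundary. */
#define REC_SIZE(namelen) \
	((sizeof(struct rec_hdr) + (namelen) + 7u) & ~(size_t)7u)

/*
 * Count the complete, well-formed records in buf.  Returns -1 for a
 * malformed record; a truncated trailing record simply ends the walk.
 */
static int count_records(const unsigned char *buf, size_t nbytes)
{
	int count = 0;

	while (nbytes >= sizeof(struct rec_hdr)) {
		struct rec_hdr hdr;
		size_t reclen;

		memcpy(&hdr, buf, sizeof(hdr));	/* avoid unaligned access */
		if (hdr.namelen == 0 || hdr.namelen > REC_NAME_MAX)
			return -1;
		reclen = REC_SIZE(hdr.namelen);
		if (reclen > nbytes)
			break;
		if (memchr(buf + sizeof(hdr), '/', hdr.namelen) != NULL)
			return -1;

		buf += reclen;
		nbytes -= reclen;
		count++;
	}
	return count;
}

/* Append one record at *used; the caller guarantees there is room. */
static void put_record(unsigned char *buf, size_t *used, const char *name)
{
	struct rec_hdr hdr = { 1, 0, (uint32_t)strlen(name), 0 };
	size_t reclen = REC_SIZE(hdr.namelen);

	memset(buf + *used, 0, reclen);
	memcpy(buf + *used, &hdr, sizeof(hdr));
	memcpy(buf + *used + sizeof(hdr), name, hdr.namelen);
	*used += reclen;
}
```
/*
 * As in the function above, the length check comes before the name is
 * scanned, so a short buffer is never read past its end.
 */
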
static int fuse_direntplus_link(struct file *file,
				struct fuse_direntplus *direntplus,
				u64 attr_version)
{
	struct fuse_entry_out *o = &direntplus->entry_out;
	struct fuse_dirent *dirent = &direntplus->dirent;
	struct dentry *parent = file->f_path.dentry;
	struct qstr name = QSTR_INIT(dirent->name, dirent->namelen);
	struct dentry *dentry;
	struct dentry *alias;
	struct inode *dir = d_inode(parent);
	struct fuse_conn *fc;
	struct inode *inode;
	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);

	if (!o->nodeid) {
		/*
		 * Unlike in the case of fuse_lookup, zero nodeid does not mean
		 * ENOENT. Instead, it only means the userspace filesystem did
		 * not want to return attributes/handle for this entry.
		 *
		 * So do nothing.
		 */
		return 0;
	}

	if (name.name[0] == '.') {
		/*
		 * We could potentially refresh the attributes of the directory
		 * and its parent?
		 */
		if (name.len == 1)
			return 0;
		if (name.name[1] == '.' && name.len == 2)
			return 0;
	}

	if (invalid_nodeid(o->nodeid))
		return -EIO;
	if (!fuse_valid_type(o->attr.mode))
		return -EIO;

	fc = get_fuse_conn(dir);

	name.hash = full_name_hash(parent, name.name, name.len);
	dentry = d_lookup(parent, &name);
	if (!dentry) {
retry:
		dentry = d_alloc_parallel(parent, &name, &wq);
		if (IS_ERR(dentry))
			return PTR_ERR(dentry);
	}
	if (!d_in_lookup(dentry)) {
		struct fuse_inode *fi;
		inode = d_inode(dentry);
		if (!inode ||
		    get_node_id(inode) != o->nodeid ||
		    ((o->attr.mode ^ inode->i_mode) & S_IFMT)) {
			d_invalidate(dentry);
			dput(dentry);
			goto retry;
		}
		if (is_bad_inode(inode)) {
			dput(dentry);
			return -EIO;
		}

		fi = get_fuse_inode(inode);
		spin_lock(&fc->lock);
		fi->nlookup++;
		spin_unlock(&fc->lock);

		forget_all_cached_acls(inode);
		fuse_change_attributes(inode, &o->attr,
				       entry_attr_timeout(o),
				       attr_version);
		/*
		 * The other branch comes via fuse_iget()
		 * which bumps nlookup inside
		 */
	} else {
		inode = fuse_iget(dir->i_sb, o->nodeid, o->generation,
				  &o->attr, entry_attr_timeout(o),
				  attr_version);
		if (!inode)
			inode = ERR_PTR(-ENOMEM);

		alias = d_splice_alias(inode, dentry);
		d_lookup_done(dentry);
		if (alias) {
			dput(dentry);
			dentry = alias;
		}
		if (IS_ERR(dentry))
			return PTR_ERR(dentry);
	}
	if (fc->readdirplus_auto)
		set_bit(FUSE_I_INIT_RDPLUS, &get_fuse_inode(inode)->state);
	fuse_change_entry_timeout(dentry, o);

	dput(dentry);
	return 0;
}

static int parse_dirplusfile(char *buf, size_t nbytes, struct file *file,
			     struct dir_context *ctx, u64 attr_version)
{
	struct fuse_direntplus *direntplus;
	struct fuse_dirent *dirent;
	size_t reclen;
	int over = 0;
	int ret;

	while (nbytes >= FUSE_NAME_OFFSET_DIRENTPLUS) {
		direntplus = (struct fuse_direntplus *) buf;
		dirent = &direntplus->dirent;
		reclen = FUSE_DIRENTPLUS_SIZE(direntplus);

		if (!dirent->namelen || dirent->namelen > FUSE_NAME_MAX)
			return -EIO;
		if (reclen > nbytes)
			break;
		if (memchr(dirent->name, '/', dirent->namelen) != NULL)
			return -EIO;

		if (!over) {
			/* We fill entries into dstbuf only as much as
			   it can hold. But we still continue iterating
			   over remaining entries to link them. If not,
			   we need to send a FORGET for each of those
			   which we did not link.
			*/
			over = !dir_emit(ctx, dirent->name, dirent->namelen,
				       dirent->ino, dirent->type);
			ctx->pos = dirent->off;
		}

		buf += reclen;
		nbytes -= reclen;

		ret = fuse_direntplus_link(file, direntplus, attr_version);
		if (ret)
			fuse_force_forget(file, direntplus->entry_out.nodeid);
	}

	return 0;
}

static int fuse_readdir(struct file *file, struct dir_context *ctx)
{
	int plus, err;
	size_t nbytes;
	struct page *page;
	struct inode *inode = file_inode(file);
	struct fuse_conn *fc = get_fuse_conn(inode);
	struct fuse_req *req;
	u64 attr_version = 0;

	if (is_bad_inode(inode))
		return -EIO;

	req = fuse_get_req(fc, 1);
	if (IS_ERR(req))
		return PTR_ERR(req);

	page = alloc_page(GFP_KERNEL);
	if (!page) {
		fuse_put_request(fc, req);
		return -ENOMEM;
	}

	plus = fuse_use_readdirplus(inode, ctx);
	req->out.argpages = 1;
	req->num_pages = 1;
	req->pages[0] = page;
	req->page_descs[0].length = PAGE_SIZE;
	if (plus) {
		attr_version = fuse_get_attr_version(fc);
		fuse_read_fill(req, file, ctx->pos, PAGE_SIZE,
			       FUSE_READDIRPLUS);
	} else {
		fuse_read_fill(req, file, ctx->pos, PAGE_SIZE,
			       FUSE_READDIR);
	}
	fuse_lock_inode(inode);
	fuse_request_send(fc, req);
	fuse_unlock_inode(inode);
	nbytes = req->out.args[0].size;
	err = req->out.h.error;
	fuse_put_request(fc, req);
	if (!err) {
		if (plus) {
			err = parse_dirplusfile(page_address(page), nbytes,
						file, ctx,
						attr_version);
		} else {
			err = parse_dirfile(page_address(page), nbytes, file,
					    ctx);
		}
	}

	__free_page(page);
	fuse_invalidate_atime(inode);
	return err;
}

static const char *fuse_get_link(struct dentry *dentry,
				 struct inode *inode,
				 struct delayed_call *done)
{
	struct fuse_conn *fc = get_fuse_conn(inode);
	FUSE_ARGS(args);
	char *link;
	ssize_t ret;

	if (!dentry)
		return ERR_PTR(-ECHILD);

	link = kmalloc(PAGE_SIZE, GFP_KERNEL);
	if (!link)
		return ERR_PTR(-ENOMEM);

	args.in.h.opcode = FUSE_READLINK;
	args.in.h.nodeid = get_node_id(inode);
	args.out.argvar = 1;
	args.out.numargs = 1;
	args.out.args[0].size = PAGE_SIZE - 1;
	args.out.args[0].value = link;
	ret = fuse_simple_request(fc, &args);
	if (ret < 0) {
		kfree(link);
		link = ERR_PTR(ret);
	} else {
		link[ret] = '\0';
		set_delayed_call(done, kfree_link, link);
	}
	fuse_invalidate_atime(inode);
	return link;
}

static int fuse_dir_open(struct inode *inode, struct file *file)
{
	return fuse_open_common(inode, file, true);
}

static int fuse_dir_release(struct inode *inode, struct file *file)
{
	fuse_release_common(file, FUSE_RELEASEDIR);

	return 0;
}

static int fuse_dir_fsync(struct file *file, loff_t start, loff_t end,
			  int datasync)
{
	return fuse_fsync_common(file, start, end, datasync, 1);
}

static long fuse_dir_ioctl(struct file *file, unsigned int cmd,
			    unsigned long arg)
{
	struct fuse_conn *fc = get_fuse_conn(file->f_mapping->host);

	/* FUSE_IOCTL_DIR only supported for API version >= 7.18 */
	if (fc->minor < 18)
		return -ENOTTY;

	return fuse_ioctl_common(file, cmd, arg, FUSE_IOCTL_DIR);
}

static long fuse_dir_compat_ioctl(struct file *file, unsigned int cmd,
				   unsigned long arg)
{
	struct fuse_conn *fc = get_fuse_conn(file->f_mapping->host);

	if (fc->minor < 18)
		return -ENOTTY;

	return fuse_ioctl_common(file, cmd, arg,
				 FUSE_IOCTL_COMPAT | FUSE_IOCTL_DIR);
}

static bool update_mtime(unsigned ivalid, bool trust_local_mtime)
{
	/* Always update if mtime is explicitly set */
	if (ivalid & ATTR_MTIME_SET)
		return true;

	/* Or if kernel i_mtime is the official one */
	if (trust_local_mtime)
		return true;

	/* If it's an open(O_TRUNC) or an ftruncate(), don't update */
	if ((ivalid & ATTR_SIZE) && (ivalid & (ATTR_OPEN | ATTR_FILE)))
		return false;

	/* In all other cases update */
	return true;
}

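/*
 * Illustration only, not part of fs/fuse/dir.c: a minimal userspace sketch
 * of the decision order in update_mtime().  It shows that an explicit
 * mtime and a locally trusted mtime both take precedence over the
 * truncate-through-open exception.  The X_* flag values and
 * should_send_mtime() are hypothetical stand-ins for the kernel's ATTR_*
 * bits and are not their real values.
 */
```c
#include <stdbool.h>

/* Hypothetical flag bits, one per condition the function looks at. */
enum {
	X_SIZE      = 1 << 0,	/* the size is being changed */
	X_MTIME_SET = 1 << 1,	/* the caller supplied an explicit mtime */
	X_OPEN      = 1 << 2,	/* the change comes from open(O_TRUNC) */
	X_FILE      = 1 << 3,	/* the change comes through an open file */
};

/* Decide whether mtime should be included in the request. */
static bool should_send_mtime(unsigned int valid, bool trust_local_mtime)
{
	if (valid & X_MTIME_SET)
		return true;
	if (trust_local_mtime)
		return true;
	if ((valid & X_SIZE) && (valid & (X_OPEN | X_FILE)))
		return false;
	return true;
}
```
/*
 * In this sketch a plain truncate by path (X_SIZE alone) still sends
 * mtime; only the open(O_TRUNC) and ftruncate() cases leave it out.
 */
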
static void iattr_to_fattr(struct iattr *iattr, struct fuse_setattr_in *arg,
			   bool trust_local_cmtime)
{
	unsigned ivalid = iattr->ia_valid;

	if (ivalid & ATTR_MODE)
		arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
	if (ivalid & ATTR_UID)
		arg->valid |= FATTR_UID, arg->uid = from_kuid(&init_user_ns, iattr->ia_uid);
	if (ivalid & ATTR_GID)
		arg->valid |= FATTR_GID, arg->gid = from_kgid(&init_user_ns, iattr->ia_gid);
	if (ivalid & ATTR_SIZE)
		arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
	if (ivalid & ATTR_ATIME) {
		arg->valid |= FATTR_ATIME;
		arg->atime = iattr->ia_atime.tv_sec;
		arg->atimensec = iattr->ia_atime.tv_nsec;
		if (!(ivalid & ATTR_ATIME_SET))
			arg->valid |= FATTR_ATIME_NOW;
	}
	if ((ivalid & ATTR_MTIME) && update_mtime(ivalid, trust_local_cmtime)) {
		arg->valid |= FATTR_MTIME;
		arg->mtime = iattr->ia_mtime.tv_sec;
		arg->mtimensec = iattr->ia_mtime.tv_nsec;
		if (!(ivalid & ATTR_MTIME_SET) && !trust_local_cmtime)
			arg->valid |= FATTR_MTIME_NOW;
	}
	if ((ivalid & ATTR_CTIME) && trust_local_cmtime) {
		arg->valid |= FATTR_CTIME;
		arg->ctime = iattr->ia_ctime.tv_sec;
		arg->ctimensec = iattr->ia_ctime.tv_nsec;
	}
}

/*
 * Prevent concurrent writepages on inode
 *
 * This is done by adding a negative bias to the inode write counter
 * and waiting for all pending writes to finish.
 */
void fuse_set_nowrite(struct inode *inode)
{
	struct fuse_conn *fc = get_fuse_conn(inode);
	struct fuse_inode *fi = get_fuse_inode(inode);

	BUG_ON(!inode_is_locked(inode));

	spin_lock(&fc->lock);
	BUG_ON(fi->writectr < 0);
	fi->writectr += FUSE_NOWRITE;
	spin_unlock(&fc->lock);
	wait_event(fi->page_waitq, fi->writectr == FUSE_NOWRITE);
}

/*
 * Allow writepages on inode
 *
 * Remove the bias from the write counter and send any queued
 * writepages.
 */
static void __fuse_release_nowrite(struct inode *inode)
{
	struct fuse_inode *fi = get_fuse_inode(inode);

	BUG_ON(fi->writectr != FUSE_NOWRITE);
	fi->writectr = 0;
	fuse_flush_writepages(inode);
}

void fuse_release_nowrite(struct inode *inode)
{
	struct fuse_conn *fc = get_fuse_conn(inode);

	spin_lock(&fc->lock);
	__fuse_release_nowrite(inode);
	spin_unlock(&fc->lock);
}

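/*
 * Illustration only, not part of fs/fuse/dir.c: a minimal single-threaded
 * sketch of the "negative bias" idea described for fuse_set_nowrite() and
 * fuse_release_nowrite().  One counter tracks in-flight writes; adding a
 * large negative bias makes it negative, which turns new writes away, and
 * the counter equals the bias exactly once the in-flight writes are done.
 * struct write_gate, the gate_*() helpers and NOWRITE_BIAS are
 * hypothetical names, the bias value is an assumption, and the locking and
 * waiting of the real code are left out.
 */
```c
#include <limits.h>
#include <stdbool.h>

#define NOWRITE_BIAS INT_MIN	/* assumed bias value for this sketch */

struct write_gate {
	int writectr;	/* in-flight writes, plus the bias while blocked */
};

/* Start a write unless the gate is biased; false means "queue it". */
static bool gate_start_write(struct write_gate *g)
{
	if (g->writectr < 0)
		return false;
	g->writectr++;
	return true;
}

/* Mark one in-flight write as finished. */
static void gate_end_write(struct write_gate *g)
{
	g->writectr--;
}

/* Block new writes by adding the bias to the counter. */
static void gate_block(struct write_gate *g)
{
	g->writectr += NOWRITE_BIAS;
}

/* True once every write in flight at block time has finished. */
static bool gate_drained(const struct write_gate *g)
{
	return g->writectr == NOWRITE_BIAS;
}

/* Remove the bias; only valid once the gate has drained. */
static void gate_unblock(struct write_gate *g)
{
	g->writectr = 0;
}
```
/*
 * A single counter carries both facts, "blocked" and "how many writes are
 * still pending", so one comparison against the bias answers both.
 */
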
static void fuse_setattr_fill(struct fuse_conn *fc, struct fuse_args *args,
			      struct inode *inode,
			      struct fuse_setattr_in *inarg_p,
			      struct fuse_attr_out *outarg_p)
{
	args->in.h.opcode = FUSE_SETATTR;
	args->in.h.nodeid = get_node_id(inode);
	args->in.numargs = 1;
	args->in.args[0].size = sizeof(*inarg_p);
	args->in.args[0].value = inarg_p;
	args->out.numargs = 1;
	args->out.args[0].size = sizeof(*outarg_p);
	args->out.args[0].value = outarg_p;
}

/*
 * Flush inode->i_mtime to the server
 */
int fuse_flush_times(struct inode *inode, struct fuse_file *ff)
{
	struct fuse_conn *fc = get_fuse_conn(inode);
	FUSE_ARGS(args);
	struct fuse_setattr_in inarg;
	struct fuse_attr_out outarg;

	memset(&inarg, 0, sizeof(inarg));
	memset(&outarg, 0, sizeof(outarg));

	inarg.valid = FATTR_MTIME;
	inarg.mtime = inode->i_mtime.tv_sec;
	inarg.mtimensec = inode->i_mtime.tv_nsec;
	if (fc->minor >= 23) {
		inarg.valid |= FATTR_CTIME;
		inarg.ctime = inode->i_ctime.tv_sec;
		inarg.ctimensec = inode->i_ctime.tv_nsec;
	}
	if (ff) {
		inarg.valid |= FATTR_FH;
		inarg.fh = ff->fh;
	}
	fuse_setattr_fill(fc, &args, inode, &inarg, &outarg);

	return fuse_simple_request(fc, &args);
}

/*
 * Set attributes, and at the same time refresh them.
 *
 * Truncation is slightly complicated, because the 'truncate' request
 * may fail, in which case we don't want to touch the mapping.
 * vmtruncate() doesn't allow for this case, so do the rlimit checking
 * and the actual truncation by hand.
 */
int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
		    struct file *file)
{
	struct inode *inode = d_inode(dentry);
	struct fuse_conn *fc = get_fuse_conn(inode);
	struct fuse_inode *fi = get_fuse_inode(inode);
	FUSE_ARGS(args);
	struct fuse_setattr_in inarg;
	struct fuse_attr_out outarg;
	bool is_truncate = false;
	bool is_wb = fc->writeback_cache;
	loff_t oldsize;
	int err;
	bool trust_local_cmtime = is_wb && S_ISREG(inode->i_mode);

	if (!fc->default_permissions)
		attr->ia_valid |= ATTR_FORCE;

	err = setattr_prepare(dentry, attr);
	if (err)
		return err;

	if (attr->ia_valid & ATTR_OPEN) {
		if (fc->atomic_o_trunc)
			return 0;
		file = NULL;
	}

	if (attr->ia_valid & ATTR_SIZE)
		is_truncate = true;

	if (is_truncate) {
		fuse_set_nowrite(inode);
		set_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
		if (trust_local_cmtime && attr->ia_size != inode->i_size)
			attr->ia_valid |= ATTR_MTIME | ATTR_CTIME;
	}

	memset(&inarg, 0, sizeof(inarg));
	memset(&outarg, 0, sizeof(outarg));
	iattr_to_fattr(attr, &inarg, trust_local_cmtime);
	if (file) {
		struct fuse_file *ff = file->private_data;
		inarg.valid |= FATTR_FH;
		inarg.fh = ff->fh;
	}
	if (attr->ia_valid & ATTR_SIZE) {
		/* For mandatory locking in truncate */
		inarg.valid |= FATTR_LOCKOWNER;
		inarg.lock_owner = fuse_lock_owner_id(fc, current->files);
	}
	fuse_setattr_fill(fc, &args, inode, &inarg, &outarg);
	err = fuse_simple_request(fc, &args);
	if (err) {
		if (err == -EINTR)
			fuse_invalidate_attr(inode);
		goto error;
	}

	if ((inode->i_mode ^ outarg.attr.mode) & S_IFMT) {
		make_bad_inode(inode);
		err = -EIO;
		goto error;
	}

	spin_lock(&fc->lock);
	/* the kernel maintains i_mtime locally */
	if (trust_local_cmtime) {
		if (attr->ia_valid & ATTR_MTIME)
			inode->i_mtime = attr->ia_mtime;
		if (attr->ia_valid & ATTR_CTIME)
			inode->i_ctime = attr->ia_ctime;
		/* FIXME: clear I_DIRTY_SYNC? */
	}

	fuse_change_attributes_common(inode, &outarg.attr,
				      attr_timeout(&outarg));
	oldsize = inode->i_size;
	/* see the comment in fuse_change_attributes() */
	if (!is_wb || is_truncate || !S_ISREG(inode->i_mode))
		i_size_write(inode, outarg.attr.size);

	if (is_truncate) {
		/* NOTE: this may release/reacquire fc->lock */
		__fuse_release_nowrite(inode);
	}
	spin_unlock(&fc->lock);

	/*
	 * Only call invalidate_inode_pages2() after removing
	 * FUSE_NOWRITE, otherwise fuse_launder_page() would deadlock.
	 */
	if ((is_truncate || !is_wb) &&
	    S_ISREG(inode->i_mode) && oldsize != outarg.attr.size) {
		truncate_pagecache(inode, outarg.attr.size);
		invalidate_inode_pages2(inode->i_mapping);
	}

	clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
	return 0;

error:
	if (is_truncate)
		fuse_release_nowrite(inode);

	clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
	return err;
}

static int fuse_setattr(struct dentry *entry, struct iattr *attr)
{
	struct inode *inode = d_inode(entry);
	struct fuse_conn *fc = get_fuse_conn(inode);
	struct file *file = (attr->ia_valid & ATTR_FILE) ? attr->ia_file : NULL;
	int ret;

	if (!fuse_allow_current_process(get_fuse_conn(inode)))
		return -EACCES;

	if (attr->ia_valid & (ATTR_KILL_SUID | ATTR_KILL_SGID)) {
		attr->ia_valid &= ~(ATTR_KILL_SUID | ATTR_KILL_SGID |
				    ATTR_MODE);

		/*
		 * The only sane way to reliably kill suid/sgid is to do it in
		 * the userspace filesystem
		 *
		 * This should be done on write(), truncate() and chown().
		 */
		if (!fc->handle_killpriv) {
			/*
			 * ia_mode calculation may have used stale i_mode.
			 * Refresh and recalculate.
			 */
			ret = fuse_do_getattr(inode, NULL, file);
			if (ret)
				return ret;

			attr->ia_mode = inode->i_mode;
			if (inode->i_mode & S_ISUID) {
				attr->ia_valid |= ATTR_MODE;
				attr->ia_mode &= ~S_ISUID;
			}
			if ((inode->i_mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
				attr->ia_valid |= ATTR_MODE;
				attr->ia_mode &= ~S_ISGID;
			}
		}
	}
	if (!attr->ia_valid)
		return 0;

	ret = fuse_do_setattr(entry, attr, file);
	if (!ret) {
		/*
		 * If filesystem supports acls it may have updated acl xattrs in
		 * the filesystem, so forget cached acls for the inode.
		 */
		if (fc->posix_acl)
			forget_all_cached_acls(inode);

		/* Directory mode changed, may need to revalidate access */
		if (d_is_dir(entry) && (attr->ia_valid & ATTR_MODE))
			fuse_invalidate_entry_cache(entry);
	}
	return ret;
}

static int fuse_getattr(const struct path *path, struct kstat *stat,
			u32 request_mask, unsigned int flags)
{
	struct inode *inode = d_inode(path->dentry);
	struct fuse_conn *fc = get_fuse_conn(inode);

	if (!fuse_allow_current_process(fc))
		return -EACCES;

	return fuse_update_attributes(inode, stat, NULL, NULL);
}

static const struct inode_operations fuse_dir_inode_operations = {
	.lookup		= fuse_lookup,
	.mkdir		= fuse_mkdir,
	.symlink	= fuse_symlink,
	.unlink		= fuse_unlink,
	.rmdir		= fuse_rmdir,
	.rename		= fuse_rename2,
	.link		= fuse_link,
	.setattr	= fuse_setattr,
	.create		= fuse_create,
	.atomic_open	= fuse_atomic_open,
	.mknod		= fuse_mknod,
	.permission	= fuse_permission,
	.getattr	= fuse_getattr,
	.listxattr	= fuse_listxattr,
	.get_acl	= fuse_get_acl,
	.set_acl	= fuse_set_acl,
};

static const struct file_operations fuse_dir_operations = {
	.llseek		= generic_file_llseek,
	.read		= generic_read_dir,
	.iterate_shared	= fuse_readdir,
	.open		= fuse_dir_open,
	.release	= fuse_dir_release,
	.fsync		= fuse_dir_fsync,
	.unlocked_ioctl	= fuse_dir_ioctl,
	.compat_ioctl	= fuse_dir_compat_ioctl,
};

static const struct inode_operations fuse_common_inode_operations = {
	.setattr	= fuse_setattr,
	.permission	= fuse_permission,
	.getattr	= fuse_getattr,
	.listxattr	= fuse_listxattr,
	.get_acl	= fuse_get_acl,
	.set_acl	= fuse_set_acl,
};

static const struct inode_operations fuse_symlink_inode_operations = {
	.setattr	= fuse_setattr,
	.get_link	= fuse_get_link,
	.getattr	= fuse_getattr,
	.listxattr	= fuse_listxattr,
};

void fuse_init_common(struct inode *inode)
{
	inode->i_op = &fuse_common_inode_operations;
}

void fuse_init_dir(struct inode *inode)
{
	inode->i_op = &fuse_dir_inode_operations;
	inode->i_fop = &fuse_dir_operations;
}

void fuse_init_symlink(struct inode *inode)
{
	inode->i_op = &fuse_symlink_inode_operations;
}