aboutsummaryrefslogtreecommitdiffstats
path: root/fs/fuse/fuse_i.h
diff options
context:
space:
mode:
authorVivek Goyal <vgoyal@redhat.com>2020-08-19 18:19:54 -0400
committerMiklos Szeredi <mszeredi@redhat.com>2020-09-10 11:39:23 +0200
commit6ae330cad6ef22ab8347ea9e0707dc56a7c7363f (patch)
tree200aaccefa5b50b7155fea5a0e97d1805c9c634a /fs/fuse/fuse_i.h
parent9483e7d5809ab41890298a6a1f5d23c4e10a2cfd (diff)
downloadkernel_replicant_linux-6ae330cad6ef22ab8347ea9e0707dc56a7c7363f.tar.gz
kernel_replicant_linux-6ae330cad6ef22ab8347ea9e0707dc56a7c7363f.tar.bz2
kernel_replicant_linux-6ae330cad6ef22ab8347ea9e0707dc56a7c7363f.zip
virtiofs: serialize truncate/punch_hole and dax fault path
Currently in fuse we don't seem have any lock which can serialize fault path with truncate/punch_hole path. With dax support I need one for following reasons. 1. Dax requirement DAX fault code relies on inode size being stable for the duration of fault and want to serialize with truncate/punch_hole and they explicitly mention it. static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, const struct iomap_ops *ops) /* * Check whether offset isn't beyond end of file now. Caller is * supposed to hold locks serializing us with truncate / punch hole so * this is a reliable test. */ max_pgoff = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); 2. Make sure there are no users of pages being truncated/punch_hole get_user_pages() might take references to page and then do some DMA to said pages. Filesystem might truncate those pages without knowing that a DMA is in progress or some I/O is in progress. So use dax_layout_busy_page() to make sure there are no such references and I/O is not in progress on said pages before moving ahead with truncation. 3. Limitation of kvm page fault error reporting If we are truncating file on host first and then removing mappings in guest lateter (truncate page cache etc), then this could lead to a problem with KVM. Say a mapping is in place in guest and truncation happens on host. Now if guest accesses that mapping, then host will take a fault and kvm will either exit to qemu or spin infinitely. IOW, before we do truncation on host, we need to make sure that guest inode does not have any mapping in that region or whole file. 4. virtiofs memory range reclaim Soon I will introduce the notion of being able to reclaim dax memory ranges from a fuse dax inode. There also I need to make sure that no I/O or fault is going on in the reclaimed range and nobody is using it so that range can be reclaimed without issues. Currently if we take inode lock, that serializes read/write. But it does not do anything for faults. So I add another semaphore fuse_inode->i_mmap_sem for this purpose. It can be used to serialize with faults. As of now, I am adding taking this semaphore only in dax fault path and not regular fault path because existing code does not have one. May be existing code can benefit from it as well to take care of some races, but that we can fix later if need be. For now, I am just focussing only on DAX path which is new path. Also added logic to take fuse_inode->i_mmap_sem in truncate/punch_hole/open(O_TRUNC) path to make sure file truncation and fuse dax fault are mutually exlusive and avoid all the above problems. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Diffstat (limited to 'fs/fuse/fuse_i.h')
-rw-r--r--fs/fuse/fuse_i.h8
1 files changed, 8 insertions, 0 deletions
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 2d2bdd596194..32c935f1e1b0 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -149,6 +149,13 @@ struct fuse_inode {
/** Lock to protect write related fields */
spinlock_t lock;
+ /**
+ * Can't take inode lock in fault path (leads to circular dependency).
+ * Introduce another semaphore which can be taken in fault path and
+ * then other filesystem paths can take this to block faults.
+ */
+ struct rw_semaphore i_mmap_sem;
+
#ifdef CONFIG_FUSE_DAX
/*
* Dax specific inode data
@@ -1116,6 +1123,7 @@ void fuse_free_conn(struct fuse_conn *fc);
ssize_t fuse_dax_read_iter(struct kiocb *iocb, struct iov_iter *to);
ssize_t fuse_dax_write_iter(struct kiocb *iocb, struct iov_iter *from);
int fuse_dax_mmap(struct file *file, struct vm_area_struct *vma);
+int fuse_dax_break_layouts(struct inode *inode, u64 dmap_start, u64 dmap_end);
int fuse_dax_conn_alloc(struct fuse_conn *fc, struct dax_device *dax_dev);
void fuse_dax_conn_free(struct fuse_conn *fc);
bool fuse_dax_inode_alloc(struct super_block *sb, struct fuse_inode *fi);