| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit cb13eea3b49055bd78e6ddf39defd6340f7379fc ]
If we remount a filesystem in RO mode while the qgroup rescan worker is
running, we can end up having it still running after the remount is done,
and at unmount time we may end up with an open transaction that ends up
never getting committed. If that happens we end up with several memory
leaks and can crash when hardware acceleration is unavailable for crc32c.
Possibly it can lead to other nasty surprises too, due to use-after-free
issues.
The following steps explain how the problem happens.
1) We have a filesystem mounted in RW mode and the qgroup rescan worker is
running;
2) We remount the filesystem in RO mode, and never stop/pause the rescan
worker, so after the remount the rescan worker is still running. The
important detail here is that the rescan task is still running after
the remount operation committed any ongoing transaction through its
call to btrfs_commit_super();
3) The rescan is still running, and after the remount completed, the
rescan worker started a transaction, after it finished iterating all
leaves of the extent tree, to update the qgroup status item in the
quotas tree. It does not commit the transaction, it only releases its
handle on the transaction;
4) A filesystem unmount operation starts shortly after;
5) The unmount task, at close_ctree(), stops the transaction kthread,
which had not had a chance to commit the open transaction since it was
sleeping and the commit interval (default of 30 seconds) has not yet
elapsed since the last time it committed a transaction;
6) So after stopping the transaction kthread we still have the transaction
used to update the qgroup status item open. At close_ctree(), when the
filesystem is in RO mode and no transaction abort happened (or the
filesystem is in error mode), we do not expect to have any transaction
open, so we do not call btrfs_commit_super();
7) We then proceed to destroy the work queues, free the roots and block
groups, etc. After that we drop the last reference on the btree inode
by calling iput() on it. Since there are dirty pages for the btree
inode, corresponding to the COWed extent buffer for the quotas btree,
btree_write_cache_pages() is invoked to flush those dirty pages. This
results in creating a bio and submitting it, which makes us end up at
btrfs_submit_metadata_bio();
8) At btrfs_submit_metadata_bio() we end up at the if-then-else branch
that calls btrfs_wq_submit_bio(), because check_async_write() returned
a value of 1. This value of 1 is because we did not have hardware
acceleration available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not
set in fs_info->flags;
9) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
workqueue at fs_info->workers, which was already freed before by the
call to btrfs_stop_all_workers() at close_ctree(). This results in an
invalid memory access due to a use-after-free, leading to a crash.
When this happens, before the crash there are several warnings triggered,
since we have reserved metadata space in a block group, the delayed refs
reservation, etc:
------------[ cut here ]------------
WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 4 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
Code: f0 01 00 00 48 39 c2 75 (...)
RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 01 48 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c6 ]---
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 2 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
Code: 48 83 bb b0 03 00 00 00 (...)
RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 01 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c7 ]---
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 5 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
Code: ad de 49 be 22 01 00 (...)
RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c8 ]---
BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0
And the crash, which only happens when we do not have crc32c hardware
acceleration, produces the following trace immediately after those
warnings:
stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
CPU: 2 PID: 1749129 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
Code: 54 55 53 48 89 f3 (...)
RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
FS: 00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
submit_one_bio+0x61/0x70 [btrfs]
btree_write_cache_pages+0x414/0x450 [btrfs]
? kobject_put+0x9a/0x1d0
? trace_hardirqs_on+0x1b/0xf0
? _raw_spin_unlock_irqrestore+0x3c/0x60
? free_debug_processing+0x1e1/0x2b0
do_writepages+0x43/0xe0
? lock_acquired+0x199/0x490
__writeback_single_inode+0x59/0x650
writeback_single_inode+0xaf/0x120
write_inode_now+0x94/0xd0
iput+0x187/0x2b0
close_ctree+0x2c6/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3cfebabee7
Code: ff 0b 00 f7 d8 64 89 01 (...)
RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
---[ end trace dd74718fef1ed5cc ]---
Finally when we remove the btrfs module (rmmod btrfs), there are several
warnings about objects that were allocated from our slabs but were never
freed, consequence of the transaction that was never committed and got
leaked:
=============================================================================
BUG btrfs_delayed_ref_head (Tainted: G B W ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? lock_release+0x20e/0x4c0
kmem_cache_destroy+0x55/0x120
btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x0000000050cbdd61 @offset=12104
INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
btrfs_free_tree_block+0x128/0x360 [btrfs]
__btrfs_cow_block+0x489/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
sync_filesystem+0x74/0x90
generic_shutdown_super+0x22/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Object 0x0000000086e9b0ff @offset=12776
INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
commit_cowonly_roots+0x248/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
=============================================================================
BUG btrfs_delayed_tree_ref (Tainted: G B W ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? lock_release+0x20e/0x4c0
kmem_cache_destroy+0x55/0x120
btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x000000001a340018 @offset=4408
INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
btrfs_free_tree_block+0x128/0x360 [btrfs]
__btrfs_cow_block+0x489/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
btrfs_commit_transaction+0x60/0xc40 [btrfs]
create_subvol+0x56a/0x990 [btrfs]
btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
__btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
btrfs_ioctl+0x1a92/0x36f0 [btrfs]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Object 0x000000002b46292a @offset=13648
INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
=============================================================================
BUG btrfs_delayed_extent_op (Tainted: G B W ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? __mutex_unlock_slowpath+0x45/0x2a0
kmem_cache_destroy+0x55/0x120
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x000000004cf95ea8 @offset=6264
INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1
Fix this issue by having the remount path stop the qgroup rescan worker
when we are remounting RO and teach the rescan worker to stop when a
remount is in progress. If later a remount in RW mode happens, we are
already resuming the qgroup rescan worker through the call to
btrfs_qgroup_rescan_resume(), so we do not need to worry about that.
Tested-by: Fabian Vogt <fvogt@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have this thing wrapped in an RCU lock, but it's really not needed.
We create all the space_info's on mount, and we destroy them on unmount.
The list never changes and we're protected from messing with it by the
normal mount/umount path, so kill the RCU stuff around it.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have the data ticketing stuff in place, move normal data
reservations to use an async reclaim helper to satisfy tickets. Before
we could have multiple tasks race in and both allocate chunks, resulting
in more data chunks than we would necessarily need. Serializing these
allocations and making a single thread responsible for flushing will
only allocate chunks as needed, as well as cut down on transaction
commits and other flush related activities.
Priority reservations will still work as they have before, simply
trying to allocate a chunk until they can make their reservation.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently a user can set mount "-o compress" which will set the
compression algorithm to zlib, and use the default compress level for
zlib (3):
relatime,compress=zlib:3,space_cache
If the user remounts the fs using "-o compress=lzo", then the old
compress_level is used:
relatime,compress=lzo:3,space_cache
But lzo does not expose any tunable compression level. The same happens
if we set any compress argument with different level, also with zstd.
Fix this by resetting the compress_level when compress=lzo is
specified. With the fix applied, lzo is shown without compress level:
relatime,compress=lzo,space_cache
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's some inconsistency around SB_I_VERSION handling with mount and
remount. Since we don't really want it to be off ever just work around
this by making sure we don't get the flag cleared on remount.
There's a tiny cpu cost of setting the bit, otherwise all changes to
i_version also change some of the times (ctime/mtime) so the inode needs
to be synced. We wouldn't save anything by disabling it.
Reported-by: Eric Sandeen <sandeen@redhat.com>
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add perf impact analysis ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Chris Murphy reported a problem where rpm ostree will bind mount a bunch
of things for whatever voodoo it's doing. But when it does this
/proc/mounts shows something like
/dev/sda /mnt/test btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
/dev/sda /mnt/test/baz btrfs rw,relatime,subvolid=256,subvol=/foo/bar 0 0
Despite subvolid=256 being subvol=/foo. This is because we're just
spitting out the dentry of the mount point, which in the case of bind
mounts is the source path for the mountpoint. Instead we should spit
out the path to the actual subvol. Fix this by looking up the name for
the subvolid we have mounted. With this fix the same test looks like
this
/dev/sda /mnt/test btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
/dev/sda /mnt/test/baz btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
Reported-by: Chris Murphy <chris@colorremedies.com>
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reported by Forza on IRC that remounting with compression options does
not reflect the change in level, or at least it does not appear to do so
according to the messages:
mount -o compress=zstd:1 /dev/sda /mnt
mount -o remount,compress=zstd:15 /mnt
does not print the change to the level to syslog:
[ 41.366060] BTRFS info (device vda): use zstd compression, level 1
[ 41.368254] BTRFS info (device vda): disk space caching is enabled
[ 41.390429] BTRFS info (device vda): disk space caching is enabled
What really happens is that the message is lost but the level is actualy
changed.
There's another weird output, if compression is reset to 'no':
[ 45.413776] BTRFS info (device vda): use no compression, level 4
To fix that, save the previous compression level and print the message
in that case too and use separate message for 'no' compression.
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we're (re)mounting a btrfs filesystem we set the
BTRFS_FS_STATE_REMOUNTING state in fs_info to serialize against async
reclaim or defrags.
This flag is set in btrfs_remount_prepare() called by btrfs_remount().
As btrfs_remount_prepare() does nothing but setting this flag and
doesn't have a second caller, we can just open-code the flag setting in
btrfs_remount().
Similarly do for so clearing of the flag by moving it out of
btrfs_remount_cleanup() into btrfs_remount() to be symmetrical.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
We've had some discussions about what to do in certain scenarios for
error codes, specifically EUCLEAN and EROFS. Document these near the
error handling code so its clear what their intentions are.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
->show_devname currently shows the lowest devid in the list. As the seed
devices have the lowest devid in the sprouted filesystem, the userland
tool such as findmnt end up seeing seed device instead of the device from
the read-writable sprouted filesystem. As shown below.
mount /dev/sda /btrfs
mount: /btrfs: WARNING: device write-protected, mounted read-only.
findmnt --output SOURCE,TARGET,UUID /btrfs
SOURCE TARGET UUID
/dev/sda /btrfs 899f7027-3e46-4626-93e7-7d4c9ad19111
btrfs dev add -f /dev/sdb /btrfs
umount /btrfs
mount /dev/sdb /btrfs
findmnt --output SOURCE,TARGET,UUID /btrfs
SOURCE TARGET UUID
/dev/sda /btrfs 899f7027-3e46-4626-93e7-7d4c9ad19111
All sprouts from a single seed will show the same seed device and the
same fsid. That's confusing.
This is causing problems in our prototype as there isn't any reference
to the sprout file-system(s) which is being used for actual read and
write.
This was added in the patch which implemented the show_devname in btrfs
commit 9c5085c14798 ("Btrfs: implement ->show_devname").
I tried to look for any particular reason that we need to show the seed
device, there isn't any.
So instead, do not traverse through the seed devices, just show the
lowest devid in the sprouted fsid.
After the patch:
mount /dev/sda /btrfs
mount: /btrfs: WARNING: device write-protected, mounted read-only.
findmnt --output SOURCE,TARGET,UUID /btrfs
SOURCE TARGET UUID
/dev/sda /btrfs 899f7027-3e46-4626-93e7-7d4c9ad19111
btrfs dev add -f /dev/sdb /btrfs
mount -o rw,remount /dev/sdb /btrfs
findmnt --output SOURCE,TARGET,UUID /btrfs
SOURCE TARGET UUID
/dev/sdb /btrfs 595ca0e6-b82e-46b5-b9e2-c72a6928be48
mount /dev/sda /btrfs1
mount: /btrfs1: WARNING: device write-protected, mounted read-only.
btrfs dev add -f /dev/sdc /btrfs1
findmnt --output SOURCE,TARGET,UUID /btrfs1
SOURCE TARGET UUID
/dev/sdc /btrfs1 ca1dbb7a-8446-4f95-853c-a20f3f82bdbb
cat /proc/self/mounts | grep btrfs
/dev/sdb /btrfs btrfs rw,relatime,noacl,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdc /btrfs1 btrfs ro,relatime,noacl,space_cache,subvolid=5,subvol=/ 0 0
Reported-by: Martin K. Petersen <martin.petersen@oracle.com>
CC: stable@vger.kernel.org # 4.19+
Tested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
The option subvolrootid used to be a workaround for mounting subvolumes
and ineffective since 5e2a4b25da23 ("btrfs: deprecate subvolrootid mount
option"). We have subvol= that works and we don't need to keep the
cruft, let's remove it.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
| |
The mount option alloc_start has no effect since 0d0c71b31720 ("btrfs:
obsolete and remove mount option alloc_start") which has details why
it's been deprecated. We can remove it.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Estimated time of removal of the functionality is 5.11, the option will
be still parsed but will have no effect.
Reasons for deprecation and removal:
- very poor naming choice of the mount option, it's supposed to cache
and reuse the inode _numbers_, but it sounds a some generic cache for
inodes
- the only known usecase where this option would make sense is on a
32bit architecture where inode numbers in one subvolume would be
exhausted due to 32bit inode::i_ino
- the cache is stored on disk, consumes space, needs to be loaded and
written back
- new inode number allocation is slower due to lookups into the cache
(compared to a simple increment which is the default)
- uses the free-space-cache code that is going to be deprecated as well
in the future
Known problems:
- since 2011, returning EEXIST when there's not enough space in a page
to store all checksums, see commit 4b9465cb9e38 ("Btrfs: add mount -o
inode_cache")
Remaining issues:
- if the option was enabled, new inodes created, the option disabled
again, the cache is still stored on the devices and there's currently
no way to remove it
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces a new "rescue=" mount option group for all mount
options for data recovery.
Different rescue sub options are seperated by ':'. E.g
"ro,rescue=nologreplay:usebackuproot".
The original plan was to use ';', but ';' needs to be escaped/quoted,
or it will be interpreted by bash, similar to '|'.
And obviously, user can specify rescue options one by one like:
"ro,rescue=nologreplay,rescue=usebackuproot".
The following mount options are converted to "rescue=", old mount
options are deprecated but still available for compatibility purpose:
- usebackuproot
Now it's "rescue=usebackuproot"
- nologreplay
Now it's "rescue=nologreplay"
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
Convert fall through comments to the pseudo-keyword which is now the
preferred way.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The inode lookup starting at btrfs_iget takes the full location key,
while only the objectid is used to match the inode, because the lookup
happens inside the given root thus the inode number is unique.
The entire location key is properly set up in btrfs_init_locked_inode.
Simplify the helpers and pass only inode number, renaming it to 'ino'
instead of 'objectid'. This allows to remove temporary variables key,
saving some stack space.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main function to lookup a root by its id btrfs_get_fs_root takes the
whole key, while only using the objectid. The value of offset is preset
to (u64)-1 but not actually used until btrfs_find_root that does the
actual search.
Switch btrfs_get_fs_root to use only objectid and remove all local
variables that existed just for the lookup. The actual key for search is
set up in btrfs_get_fs_root, reusing another key variable.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
I've grepped logs for 'errno=.*unknown' and found -95, -117 and -122,
now added to the table. The wording is adjusted so it makes sense in
context of filesystem.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
| |
Add the raw errnos and sort them accordingly.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
An unrecognized option is a failure that should get user/administrator
attention, the info level is often below what gets logged, so make it
error.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The functions will be used outside of export.c and super.c to allow
resolving subvolume name from a given id, eg. for subvolume deletion by
id ioctl.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ split from the next patch ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
The status of aborted transaction can change between calls and it needs
to be accessed by READ_ONCE. Add a helper that also wraps the unlikely
hint.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
We are now using these for all roots, rename them to btrfs_put_root()
and btrfs_grab_root();
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In adding things like eb leak checking and root leak checking there were
a lot of weird corner cases that come from the fact that
1) We do not init the fs_info until we get to open_ctree time in the
normal case and
2) The test infrastructure half-init's the fs_info for things that it
needs.
This makes it really annoying to make changes because you have to add
init in two different places, have special cases for testing fs_info's
that may not have certain things initialized, and cases for fs_info's
that didn't make it to open_ctree and thus are not fully set up.
Fix this by extracting out the non-allocating init of the fs info into
it's own public function and use that to make sure we're all getting
consistent views of an allocated fs_info.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Now that all callers of btrfs_get_fs_root are subsequently calling
btrfs_grab_fs_root and handling dropping the ref when they are done
appropriately, go ahead and push btrfs_grab_fs_root up into
btrfs_get_fs_root.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
We're going to start freeing roots and doing other complicated things in
free_fs_info, so we need to move it to disk-io.c and export it in order
to use things lik btrfs_put_fs_root().
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
We lookup the name of a subvol which means we'll cross into different
roots. Hold a ref while we're doing the look ups in the fs_root we're
searching.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
| |
All this does is call btrfs_get_fs_root() with check_ref == true. Just
use btrfs_get_fs_root() so we don't have a bunch of different helpers
that do the same thing.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Btrfsctl was removed in 2012, now the function btrfs_control_ioctl()
is only used for devices ioctls. So update the comment.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Su Yue <Damenly_Su@gmx.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A remount to a read-write filesystem is not safe when there's tree-log
to be replayed. Files that could be opened until now might be affected
by the changes in the tree-log.
A regular mount is needed to replay the log so the filesystem presents
the consistent view with the pending changes included.
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was some logic added a while ago to clear out f_bavail in statfs()
if we did not have enough free metadata space to satisfy our global
reserve. This was incorrect at the time, however didn't really pose a
problem for normal file systems because we would often allocate chunks
if we got this low on free metadata space, and thus wouldn't really hit
this case unless we were actually full.
Fast forward to today and now we are much better about not allocating
metadata chunks all of the time. Couple this with d792b0f19711 ("btrfs:
always reserve our entire size for the global reserve") which now means
we'll easily have a larger global reserve than our free space, we are
now more likely to trip over this while still having plenty of space.
Fix this by skipping this logic if the global rsv's space_info is not
full. space_info->full is 0 unless we've attempted to allocate a chunk
for that space_info and that has failed. If this happens then the space
for the global reserve is definitely sacred and we need to report
b_avail == 0, but before then we can just use our calculated b_avail.
Reported-by: Martin Steigerwald <martin@lichtvoll.de>
Fixes: ca8a51b3a979 ("btrfs: statfs: report zero available if metadata are exhausted")
CC: stable@vger.kernel.org # 4.5+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Tested-By: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When discard is enabled, everytime a pinned extent is released back to
the block_group's free space cache, a discard is issued for the extent.
This is an overeager approach when it comes to discarding and helping
the SSD maintain enough free space to prevent severe garbage collection
situations.
This adds the beginning of async discard. Instead of issuing a discard
prior to returning it to the free space, it is just marked as untrimmed.
The block_group is then added to a LRU which then feeds into a workqueue
to issue discards at a much slower rate. Full discarding of unused block
groups is still done and will be addressed in a future patch of the
series.
For now, we don't persist the discard state of extents and bitmaps.
Therefore, our failure recovery mode will be to consider extents
untrimmed. This lets us handle failure and unmounting as one in the
same.
On a number of Facebook webservers, I collected data every minute
accounting the time we spent in btrfs_finish_extent_commit() (col. 1)
and in btrfs_commit_transaction() (col. 2). btrfs_finish_extent_commit()
is where we discard extents synchronously before returning them to the
free space cache.
discard=sync:
p99 total per minute p99 total per minute
Drive | extent_commit() (ms) | commit_trans() (ms)
---------------------------------------------------------------
Drive A | 434 | 1170
Drive B | 880 | 2330
Drive C | 2943 | 3920
Drive D | 4763 | 5701
discard=async:
p99 total per minute p99 total per minute
Drive | extent_commit() (ms) | commit_trans() (ms)
--------------------------------------------------------------
Drive A | 134 | 956
Drive B | 64 | 1972
Drive C | 59 | 1032
Drive D | 62 | 1200
While it's not great that the stats are cumulative over 1m, all of these
servers are running the same workload and and the delta between the two
are substantial. We are spending significantly less time in
btrfs_finish_extent_commit() which is responsible for discarding.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This series introduces async discard which will use the flag
DISCARD_ASYNC, so rename the original flag to DISCARD_SYNC as it is
synchronously done in transaction commit.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground
Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann:
"As part of the cleanup of some remaining y2038 issues, I came to
fs/compat_ioctl.c, which still has a couple of commands that need
support for time64_t.
In completely unrelated work, I spent time on cleaning up parts of
this file in the past, moving things out into drivers instead.
After Al Viro reviewed an earlier version of this series and did a lot
more of that cleanup, I decided to try to completely eliminate the
rest of it and move it all into drivers.
This series incorporates some of Al's work and many patches of my own,
but in the end stops short of actually removing the last part, which
is the scsi ioctl handlers. I have patches for those as well, but they
need more testing or possibly a rewrite"
* tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits)
scsi: sd: enable compat ioctls for sed-opal
pktcdvd: add compat_ioctl handler
compat_ioctl: move SG_GET_REQUEST_TABLE handling
compat_ioctl: ppp: move simple commands into ppp_generic.c
compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t
compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic
compat_ioctl: unify copy-in of ppp filters
tty: handle compat PPP ioctls
compat_ioctl: move SIOCOUTQ out of compat_ioctl.c
compat_ioctl: handle SIOCOUTQNSD
af_unix: add compat_ioctl support
compat_ioctl: reimplement SG_IO handling
compat_ioctl: move WDIOC handling into wdt drivers
fs: compat_ioctl: move FITRIM emulation into file systems
gfs2: add compat_ioctl support
compat_ioctl: remove unused convert_in_user macro
compat_ioctl: remove last RAID handling code
compat_ioctl: remove /dev/raw ioctl translation
compat_ioctl: remove PCI ioctl translation
compat_ioctl: remove joystick ioctl translation
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The .ioctl and .compat_ioctl file operations have the same prototype so
they can both point to the same function, which works great almost all
the time when all the commands are compatible.
One exception is the s390 architecture, where a compat pointer is only
31 bit wide, and converting it into a 64-bit pointer requires calling
compat_ptr(). Most drivers here will never run in s390, but since we now
have a generic helper for it, it's easy enough to use it consistently.
I double-checked all these drivers to ensure that all ioctl arguments
are used as pointers or are ignored, but are not interpreted as integer
values.
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add new block group profile to store 4 copies in a simliar way that
current RAID1 does. The profile attributes and constraints are defined
in the raid table and used by the same code that already handles the 2-
and 3-copy RAID1.
The minimum number of devices is 4, the maximum number of devices/chunks
that can be lost/damaged is 3. There is no comparable traditional RAID
level, the profile is added for future needs to accompany triple-parity
and beyond.
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add new block group profile to store 3 copies in a simliar way that
current RAID1 does. The profile attributes and constraints are defined
in the raid table and used by the same code that already handles the
2-copy RAID1.
The minimum number of devices is 3, the maximum number of devices/chunks
that can be lost/damaged is 2. Like RAID6 but with 33% space
utilization.
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In btrfs_statfs() we cache fs_info::space_info in a local variable only
to use it once in a list_for_each_rcu() statement.
Not only is the local variable unnecessary it even makes the code harder
to follow as it's not clear which list it is iterating.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| | |
Add blake2b (with 256 bit digest) to the list of possible checksumming
algorithms used by BTRFS.
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Add sha256 to the list of possible checksumming algorithms used by BTRFS.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add xxhash64 to the list of possible checksumming algorithms used by
BTRFS.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now that we're not using btrfs_schedule_bio() anymore, delete all the
code that supported it.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For some reason the attribute is called __attribute_const__ and not
__const, marks functions that have no observable effects on program
state, IOW not reading pointers, just the arguments and calculating a
value. Allows the compiler to do some optimizations, based on
-Wsuggest-attribute=const . The effects are rather small, though, about
60 bytes decrese of btrfs.ko.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The attribute can mark functions supposed to be called rarely if at all
and the text can be moved to sections far from the other code. The
attribute has been added to several functions already, this patch is
based on hints given by gcc -Wsuggest-attribute=cold.
The net effect of this patch is decrease of btrfs.ko by 1000-1300,
depending on the config options.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The parameter is now always set to NULL and could be dropped. The last
user was get_default_root but that got reworked in 05dbe6837b60 ("Btrfs:
unify subvol= and subvolid= mounting") and the parameter became unused.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|/
|
|
|
|
|
|
|
| |
We are moving extent_io_tree into it's on file, so separate out the
extent_state init stuff from extent_io_tree_init().
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
| |
As the header for sysfs code already exists, use it to clean up ctree.h.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
This is prep work for moving all of the block group cache code into its
own file.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ minor comment updates ]
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
|
| |
Simplify the code by removing variables that don't bring any real value
as well as simplifying the checks when buidling the candidate list of
devices. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
|
|
|
|
|
|
|
| |
Migrate the struct definition and the one helper that's in ctree.h into
space-info.h
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|