| Sažetak | In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix tree mod log mishandling of reallocated nodes
We have been seeing the following panic in production
  kernel BUG at fs/btrfs/tree-mod-log.c:677!
  invalid opcode: 0000 [#1] SMP
  RIP: 0010:tree_mod_log_rewind+0x1b4/0x200
  RSP: 0000:ffffc9002c02f890 EFLAGS: 00010293
  RAX: 0000000000000003 RBX: ffff8882b448c700 RCX: 0000000000000000
  RDX: 0000000000008000 RSI: 00000000000000a7 RDI: ffff88877d831c00
  RBP: 0000000000000002 R08: 000000000000009f R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000100c40 R12: 0000000000000001
  R13: ffff8886c26d6a00 R14: ffff88829f5424f8 R15: ffff88877d831a00
  FS:  00007fee1d80c780(0000) GS:ffff8890400c0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fee1963a020 CR3: 0000000434f33002 CR4: 00000000007706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   btrfs_get_old_root+0x12b/0x420
   btrfs_search_old_slot+0x64/0x2f0
   ? tree_mod_log_oldest_root+0x3d/0xf0
   resolve_indirect_ref+0xfd/0x660
   ? ulist_alloc+0x31/0x60
   ? kmem_cache_alloc_trace+0x114/0x2c0
   find_parent_nodes+0x97a/0x17e0
   ? ulist_alloc+0x30/0x60
   btrfs_find_all_roots_safe+0x97/0x150
   iterate_extent_inodes+0x154/0x370
   ? btrfs_search_path_in_tree+0x240/0x240
   iterate_inodes_from_logical+0x98/0xd0
   ? btrfs_search_path_in_tree+0x240/0x240
   btrfs_ioctl_logical_to_ino+0xd9/0x180
   btrfs_ioctl+0xe2/0x2ec0
   ? __mod_memcg_lruvec_state+0x3d/0x280
   ? do_sys_openat2+0x6d/0x140
   ? kretprobe_dispatcher+0x47/0x70
   ? kretprobe_rethook_handler+0x38/0x50
   ? rethook_trampoline_handler+0x82/0x140
   ? arch_rethook_trampoline_callback+0x3b/0x50
   ? kmem_cache_free+0xfb/0x270
   ? do_sys_openat2+0xd5/0x140
   __x64_sys_ioctl+0x71/0xb0
   do_syscall_64+0x2d/0x40
Which is this code in tree_mod_log_rewind()
	switch (tm->op) {
        case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING:
		BUG_ON(tm->slot < n);
This occurs because we replay the nodes in order that they happened, and
when we do a REPLACE we will log a REMOVE_WHILE_FREEING for every slot,
starting at 0.  'n' here is the number of items in this block, which in
this case was 1, but we had 2 REMOVE_WHILE_FREEING operations.
The actual root cause of this was that we were replaying operations for
a block that shouldn't have been replayed.  Consider the following
sequence of events
1. We have an already modified root, and we do a btrfs_get_tree_mod_seq().
2. We begin removing items from this root, triggering KEY_REPLACE for
   it's child slots.
3. We remove one of the 2 children this root node points to, thus triggering
   the root node promotion of the remaining child, and freeing this node.
4. We modify a new root, and re-allocate the above node to the root node of
   this other root.
The tree mod log looks something like this
	logical 0	op KEY_REPLACE (slot 1)			seq 2
	logical 0	op KEY_REMOVE (slot 1)			seq 3
	logical 0	op KEY_REMOVE_WHILE_FREEING (slot 0)	seq 4
	logical 4096	op LOG_ROOT_REPLACE (old logical 0)	seq 5
	logical 8192	op KEY_REMOVE_WHILE_FREEING (slot 1)	seq 6
	logical 8192	op KEY_REMOVE_WHILE_FREEING (slot 0)	seq 7
	logical 0	op LOG_ROOT_REPLACE (old logical 8192)	seq 8
>From here the bug is triggered by the following steps
1.  Call btrfs_get_old_root() on the new_root.
2.  We call tree_mod_log_oldest_root(btrfs_root_node(new_root)), which is
    currently logical 0.
3.  tree_mod_log_oldest_root() calls tree_mod_log_search_oldest(), which
    gives us the KEY_REPLACE seq 2, and since that's not a
    LOG_ROOT_REPLACE we incorrectly believe that we don't have an old
    root, because we expect that the most recent change should be a
    LOG_ROOT_REPLACE.
4.  Back in tree_mod_log_oldest_root() we don't have a LOG_ROOT_REPLACE,
    so we don't set old_root, we simply use our e
---truncated--- |