android_kernel_samsung_univ.../fs/btrfs
Zygo Blaxell f9b4a2e04c btrfs: add missing memset while reading compressed inline extents
[ Upstream commit e1699d2d7bf6e6cce3e1baff19f9dd4595a58664 ]

This is a story about 4 distinct (and very old) btrfs bugs.

Commit c8b978188c ("Btrfs: Add zlib compression support") added
three data corruption bugs for inline extents (bugs #1-3).

Commit 93c82d5750 ("Btrfs: zero page past end of inline file items")
fixed bug #1:  uncompressed inline extents followed by a hole and more
extents could get non-zero data in the hole as they were read.  The fix
was to add a memset in btrfs_get_extent to zero out the hole.

Commit 166ae5a418 ("btrfs: fix inline compressed read err corruption")
fixed bug #2:  compressed inline extents which contained non-zero bytes
might be replaced with zero bytes in some cases.  This patch removed an
unhelpful memset from uncompress_inline, but the case where memset is
required was missed.

There is also a memset in the decompression code, but this only covers
decompressed data that is shorter than the ram_bytes from the extent
ref record.  This memset doesn't cover the region between the end of the
decompressed data and the end of the page.  It has also moved around a
few times over the years, so there's no single patch to refer to.

This patch fixes bug #3:  compressed inline extents followed by a hole
and more extents could get non-zero data in the hole as they were read
(i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/).
The fix is the same:  zero out the hole in the compressed case too,
by putting a memset back in uncompress_inline, but this time with
correct parameters.

The last and oldest bug, bug #0, is the cause of the offending inline
extent/hole/extent pattern.  Bug #0 is a subtle and mostly-harmless quirk
of behavior somewhere in the btrfs write code.  In a few special cases,
an inline extent and hole are allowed to persist where they normally
would be combined with later extents in the file.

A fast reproducer for bug #0 is presented below.  A few offending extents
are also created in the wild during large rsync transfers with the -S
flag.  A Linux kernel build (git checkout; make allyesconfig; make -j8)
will produce a handful of offending files as well.  Once an offending
file is created, it can present different content to userspace each
time it is read.

Bug #0 is at least 4 and possibly 8 years old.  I verified every vX.Y
kernel back to v3.5 has this behavior.  There are fossil records of this
bug's effects in commits all the way back to v2.6.32.  I have no reason
to believe bug #0 wasn't present at the beginning of btrfs compression
support in v2.6.29, but I can't easily test kernels that old to be sure.

It is not clear whether bug #0 is worth fixing.  A fix would likely
require injecting extra reads into currently write-only paths, and most
of the exceptional cases caused by bug #0 are already handled now.

Whether we like them or not, bug #0's inline extents followed by holes
are part of the btrfs de-facto disk format now, and we need to be able
to read them without data corruption or an infoleak.  So enough about
bug #0, let's get back to bug #3 (this patch).

An example of on-disk structure leading to data corruption found in
the wild:

        item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160
                inode generation 50 transid 50 size 47424 nbytes 49141
                block group 0 mode 100644 links 1 uid 0 gid 0
                rdev 0 flags 0x0(none)
        item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20
                inode ref index 3 namelen 10 name: DB_File.so
        item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362
                inline extent data size 1341 ram 4085 compress(zlib)
        item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53
                extent data disk byte 5367308288 nr 20480
                extent data offset 0 nr 45056 ram 45056
                extent compression(zlib)

Different data appears in userspace during each read of the 11 bytes
between 4085 and 4096.  The extent in item 63 is not long enough to
fill the first page of the file, so a memset is required to fill the
space between item 63 (ending at 4085) and item 64 (beginning at 4096)
with zero.

Here is a reproducer from Liu Bo, which demonstrates another method
of creating the same inline extent and hole pattern:

Using 'page_poison=on' kernel command line (or enable
CONFIG_PAGE_POISONING) run the following:

	# touch foo
	# chattr +c foo
	# xfs_io -f -c "pwrite -W 0 1000" foo
	# xfs_io -f -c "falloc 4 8188" foo
	# od -x foo
	# echo 3 >/proc/sys/vm/drop_caches
	# od -x foo

This produce the following on my box:

Correct output:  file contains 1000 data bytes followed
by zeros:

	0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
	*
	0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000
	0001760 0000 0000 0000 0000 0000 0000 0000 0000
	*
	0020000

Actual output:  the data after the first 1000 bytes
will be different each run:

	0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
	*
	0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d
	0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400
	0002000 435f 0056 5f74 6164 7400 645f 0062 5f74
	(...)

Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20 10:04:57 +01:00
..
tests Btrfs: tests: checking for NULL instead of IS_ERR() 2015-11-25 05:19:50 -08:00
acl.c posix_acl: Clear SGID bit when setting file permissions 2016-10-31 04:13:58 -06:00
async-thread.c btrfs: limit async_work allocation and worker func duration 2017-01-06 11:16:06 +01:00
async-thread.h btrfs: limit async_work allocation and worker func duration 2017-01-06 11:16:06 +01:00
backref.c Btrfs: fix hang on extent buffer lock caused by the inode_paths ioctl 2016-02-25 12:01:15 -08:00
backref.h btrfs: cleanup, remove inode_item_info helper 2015-01-14 19:23:47 +01:00
btrfs_inode.h Btrfs: Direct I/O: Fix space accounting 2015-09-21 13:47:55 -07:00
check-integrity.c Merge branch 'cleanups/for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-21 18:21:40 -07:00
check-integrity.h block: submit_bio_wait() conversions 2013-11-24 16:33:41 -07:00
compression.c btrfs: assign error values to the correct bio structs 2016-10-22 12:26:54 +02:00
compression.h btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
ctree.c btrfs: account for non-CoW'd blocks in btrfs_abort_transaction 2016-07-27 09:47:33 -07:00
ctree.h btrfs: store and load values of stripes_min/stripes_max in balance status item 2017-01-06 11:16:06 +01:00
delayed-inode.c btrfs: limit async_work allocation and worker func duration 2017-01-06 11:16:06 +01:00
delayed-inode.h btrfs: properly set the termination value of ctx->pos in readdir 2016-02-25 12:01:15 -08:00
delayed-ref.c btrfs: qgroup: Fix a race in delayed_ref which leads to abort trans 2015-10-26 19:44:39 -07:00
delayed-ref.h btrfs: qgroup: Fix a race in delayed_ref which leads to abort trans 2015-10-26 19:44:39 -07:00
dev-replace.c Merge branch 'fix/waitqueue-barriers' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4 2015-10-12 16:24:40 -07:00
dev-replace.h Btrfs: add new sources for device replace code 2012-12-12 17:15:41 -05:00
dir-item.c Btrfs: make xattr replace operations atomic 2014-11-20 17:20:07 -08:00
disk-io.c btrfs: properly track when rescan worker is running 2016-09-07 08:32:43 +02:00
disk-io.h Btrfs: add btrfs_read_dev_one_super() to read one specific SB 2015-10-01 17:29:38 +02:00
export.c BTRFS: support NFSv2 export 2015-10-06 06:55:23 -07:00
export.h
extent_io.c Btrfs: fix memory leak in reading btree blocks 2017-01-06 11:16:10 +01:00
extent_io.h btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function 2015-10-21 18:37:45 -07:00
extent_map.c Btrfs: do not move em to modified list when unpinning 2014-11-21 11:59:54 -08:00
extent_map.h Btrfs: fix NULL pointer crash when running balance and scrub concurrently 2014-06-19 14:20:55 -07:00
extent-tree.c btrfs: clear space cache inode generation always 2017-12-05 11:22:50 +01:00
extent-tree.h btrfs: qgroup: Add new qgroup calculation function 2015-06-10 09:25:49 -07:00
file-item.c Merge branch 'cleanups-post-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.1 2015-03-25 10:52:48 -07:00
file.c fs: add i_blocksize() 2017-06-14 13:16:24 +02:00
free-space-cache.c Merge branch 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2015-12-18 15:35:08 -08:00
free-space-cache.h Btrfs: keep track of largest extent in bitmaps 2015-10-21 18:55:40 -07:00
hash.c btrfs: LLVMLinux: Remove VLAIS 2014-10-14 10:51:22 +02:00
hash.h Btrfs: fix btrfs boot when compiled as built-in 2014-01-28 13:20:31 -08:00
inode-item.c Btrfs: consolidate btrfs_error() to btrfs_std_error() 2015-09-29 16:30:00 +02:00
inode-map.c Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-03-03 15:07:12 -08:00
inode-map.h Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-03-03 15:07:12 -08:00
inode.c btrfs: add missing memset while reading compressed inline extents 2017-12-20 10:04:57 +01:00
ioctl.c btrfs: prevent to set invalid default subvolid 2017-10-05 09:41:47 +02:00
Kconfig rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
locking.c btrfs: comment the rest of implicit barriers before waitqueue_active 2015-10-10 18:42:00 +02:00
locking.h btrfs: fix lockups from btrfs_clear_path_blocking 2014-11-19 10:34:35 -08:00
lzo.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
Makefile Btrfs: add sanity tests for new qgroup accounting code 2014-06-09 17:20:49 -07:00
math.h btrfs: cleanup 64bit/32bit divs, compile time constants 2015-03-03 17:23:57 +01:00
ordered-data.c Btrfs: change how we wait for pending ordered extents 2015-10-21 18:51:40 -07:00
ordered-data.h Btrfs: change how we wait for pending ordered extents 2015-10-21 18:51:40 -07:00
orphan.c btrfs: kill the key type accessor helpers 2014-09-17 13:37:12 -07:00
print-tree.c btrfs: remove parameter blocksize from read_tree_block 2014-10-02 17:14:50 +02:00
print-tree.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
props.c btrfs: cleanup iterating over prop_handlers array 2015-10-21 18:28:48 +02:00
props.h Btrfs: add support for inode properties 2014-01-28 13:20:24 -08:00
qgroup.c Btrfs: fix qgroup rescan worker initialization 2017-01-06 11:16:06 +01:00
qgroup.h btrfs: waiting on qgroup rescan should not always be interruptible 2016-09-07 08:32:43 +02:00
raid56.c btrfs: comment waitqueue_active implied by locks 2015-10-10 18:35:10 +02:00
raid56.h Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
rcu-string.h
reada.c btrfs: reada: Fix returned errno code 2015-10-21 18:29:50 +02:00
relocation.c btrfs: fix NULL pointer dereference from free_reloc_roots() 2017-10-05 09:41:46 +02:00
root-tree.c Btrfs: fix loading of orphan roots leading to BUG_ON 2016-03-09 15:34:53 -08:00
scrub.c Btrfs: fix scrub preventing unused block groups from being deleted 2015-11-25 05:22:08 -08:00
send.c Btrfs: send, fix failure to rename top level inode due to name collision 2017-10-21 17:09:04 +02:00
send.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
struct-funcs.c
super.c btrfs: resume qgroup rescan on rw remount 2017-09-13 14:09:46 -07:00
sysfs.c Btrfs: rename super_kobj to fsid_kobj 2015-09-29 16:29:59 +02:00
sysfs.h Btrfs: rename btrfs_kobj_rm_device to btrfs_sysfs_rm_device_link 2015-09-29 16:29:59 +02:00
transaction.c Btrfs: fix unprotected list move from unused_bgs to deleted_bgs list 2015-12-10 11:22:38 +00:00
transaction.h btrfs: account for non-CoW'd blocks in btrfs_abort_transaction 2016-07-27 09:47:33 -07:00
tree-defrag.c Btrfs: cleanup: remove unnecessary check before btrfs_free_path is called 2015-08-31 11:46:41 -07:00
tree-log.c Btrfs: fix tree search logic when replaying directory entry deletes 2017-01-06 11:16:06 +01:00
tree-log.h Btrfs: fix metadata inconsistencies after directory fsync 2015-03-26 17:56:23 -07:00
ulist.c btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
ulist.h btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
uuid-tree.c btrfs: return the actual error value from from btrfs_uuid_tree_iterate 2017-11-30 08:37:28 +00:00
volumes.c btrfs: remove duplicate const specifier 2017-09-02 07:06:50 +02:00
volumes.h btrfs: fix clashing number of the enhanced balance usage filter 2015-11-25 05:19:50 -08:00
xattr.c Btrfs: fix race when listing an inode's xattrs 2015-11-09 18:34:40 +00:00
xattr.h btrfs: use generic posix ACL infrastructure 2014-01-25 23:58:18 -05:00
zlib.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00