android_kernel_samsung_univ.../kernel
Tejun Heo fce67b31c7 workqueue: replace pool->manager_arb mutex with a flag
commit 692b48258dda7c302e777d7d5f4217244478f1f6 upstream.

Josef reported a HARDIRQ-safe -> HARDIRQ-unsafe lock order detected by
lockdep:

 [ 1270.472259] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
 [ 1270.472783] 4.14.0-rc1-xfstests-12888-g76833e8 #110 Not tainted
 [ 1270.473240] -----------------------------------------------------
 [ 1270.473710] kworker/u5:2/5157 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 [ 1270.474239]  (&(&lock->wait_lock)->rlock){+.+.}, at: [<ffffffff8da253d2>] __mutex_unlock_slowpath+0xa2/0x280
 [ 1270.474994]
 [ 1270.474994] and this task is already holding:
 [ 1270.475440]  (&pool->lock/1){-.-.}, at: [<ffffffff8d2992f6>] worker_thread+0x366/0x3c0
 [ 1270.476046] which would create a new lock dependency:
 [ 1270.476436]  (&pool->lock/1){-.-.} -> (&(&lock->wait_lock)->rlock){+.+.}
 [ 1270.476949]
 [ 1270.476949] but this new dependency connects a HARDIRQ-irq-safe lock:
 [ 1270.477553]  (&pool->lock/1){-.-.}
 ...
 [ 1270.488900] to a HARDIRQ-irq-unsafe lock:
 [ 1270.489327]  (&(&lock->wait_lock)->rlock){+.+.}
 ...
 [ 1270.494735]  Possible interrupt unsafe locking scenario:
 [ 1270.494735]
 [ 1270.495250]        CPU0                    CPU1
 [ 1270.495600]        ----                    ----
 [ 1270.495947]   lock(&(&lock->wait_lock)->rlock);
 [ 1270.496295]                                local_irq_disable();
 [ 1270.496753]                                lock(&pool->lock/1);
 [ 1270.497205]                                lock(&(&lock->wait_lock)->rlock);
 [ 1270.497744]   <Interrupt>
 [ 1270.497948]     lock(&pool->lock/1);

, which will cause a irq inversion deadlock if the above lock scenario
happens.

The root cause of this safe -> unsafe lock order is the
mutex_unlock(pool->manager_arb) in manage_workers() with pool->lock
held.

Unlocking mutex while holding an irq spinlock was never safe and this
problem has been around forever but it never got noticed because the
only time the mutex is usually trylocked while holding irqlock making
actual failures very unlikely and lockdep annotation missed the
condition until the recent b9c16a0e1f73 ("locking/mutex: Fix
lockdep_assert_held() fail").

Using mutex for pool->manager_arb has always been a bit of stretch.
It primarily is an mechanism to arbitrate managership between workers
which can easily be done with a pool flag.  The only reason it became
a mutex is that pool destruction path wants to exclude parallel
managing operations.

This patch replaces the mutex with a new pool flag POOL_MANAGER_ACTIVE
and make the destruction path wait for the current manager on a wait
queue.

v2: Drop unnecessary flag clearing before pool destruction as
    suggested by Boqun.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 09:40:48 +01:00
..
bpf bpf/verifier: reject BPF_ALU64|BPF_END 2017-10-21 17:09:02 +02:00
configs kconfig: tinyconfig: provide whole choice blocks to avoid warnings 2016-09-24 10:07:42 +02:00
debug kernel/debug/debug_core.c: more properly delay for secondary CPUs 2017-01-06 11:16:16 +01:00
events bpf: one perf event close won't free bpf program attached by another perf event 2017-10-21 17:09:02 +02:00
gcov gcov: support GCC 7.1 2017-09-02 07:06:51 +02:00
irq genirq: Release resources in __setup_irq() error path 2017-06-26 07:13:10 +02:00
livepatch livepatch: x86: fix relocation computation with kASLR 2015-11-11 17:36:04 +01:00
locking locking/lockdep: Add nest_lock integrity test 2017-10-21 17:09:03 +02:00
power sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs 2017-10-12 11:27:35 +02:00
printk printk: use rcuidle console tracepoint 2017-02-23 17:43:10 +01:00
rcu rcu: Allow for page faults in NMI handlers 2017-10-18 09:20:41 +02:00
sched sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task() 2017-10-27 10:23:17 +02:00
time timer/sysclt: Restrict timer migration sysctl values to 0 and 1 2017-10-05 09:41:47 +02:00
trace ftrace: Fix kmemleak in unregister_ftrace_graph 2017-10-12 11:27:33 +02:00
.gitignore
acct.c
async.c
audit_fsnotify.c
audit_tree.c
audit_watch.c audit: Fix use after free in audit_remove_watch_rule() 2017-08-24 17:02:35 -07:00
audit.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
audit.h
auditfilter.c
auditsc.c audit: fix a double fetch in audit_log_single_execve_arg() 2016-08-20 18:09:22 +02:00
backtracetest.c
bounds.c
capability.c exec: Ensure mm->user_ns contains the execed files 2017-01-06 11:16:14 +01:00
cgroup_freezer.c cgroup: fix handling of multi-destination migration from subtree_control enabling 2015-12-03 10:18:21 -05:00
cgroup_pids.c cgroup_pids: don't account for the root cgroup 2015-12-03 10:18:21 -05:00
cgroup.c cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups 2017-04-21 09:30:04 +02:00
compat.c
configs.c
context_tracking.c context_tracking: avoid irq_save/irq_restore on guest entry and exit 2015-11-10 12:06:23 +01:00
cpu_pm.c
cpu.c stable-fixup: hotplug: fix unused function warning 2017-01-12 11:22:48 +01:00
cpuset.c sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs 2017-10-12 11:27:35 +02:00
crash_dump.c
cred.c cred: Reject inodes with invalid ids in set_create_file_as() 2016-09-15 08:27:49 +02:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c wait/ptrace: assume __WALL if the child is traced 2016-06-07 18:14:35 -07:00
extable.c kernel/extable.c: mark core_kernel_text notrace 2017-07-21 07:44:56 +02:00
fork.c stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms 2017-06-14 13:16:23 +02:00
freezer.c
futex_compat.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 12:01:16 -08:00
futex.c futex: Add missing error handling to FUTEX_REQUEUE_PI 2017-03-22 12:04:19 +01:00
groups.c
hung_task.c
irq_work.c treewide: Remove old email address 2015-11-23 09:44:58 +01:00
jump_label.c jump_labels: API for flushing deferred jump label updates 2017-01-19 20:17:19 +01:00
kallsyms.c
kcmp.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 12:01:16 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec_core.c kexec: use file name as the output message prefix 2015-11-06 17:50:42 -08:00
kexec_file.c kexec: fix double-free when failing to relocate the purgatory 2016-09-24 10:07:36 +02:00
kexec_internal.h
kexec.c kexec: use file name as the output message prefix 2015-11-06 17:50:42 -08:00
kmod.c
kprobes.c tracing/kprobes: Enforce kprobes teardown after testing 2017-05-25 14:30:17 +02:00
ksysfs.c
kthread.c cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups 2017-04-21 09:30:04 +02:00
latencytop.c
Makefile
membarrier.c Fix: Disable sys_membarrier when nohz_full is enabled 2017-03-12 06:37:26 +01:00
memremap.c mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done} 2017-01-19 20:17:18 +01:00
module_signing.c
module-internal.h
module.c module: Invalidate signatures on force-loaded modules 2016-08-20 18:09:27 +02:00
notifier.c
nsproxy.c
padata.c padata: free correct variable 2017-05-20 14:27:02 +02:00
panic.c kernel/panic.c: add missing \n 2017-07-05 14:37:19 +02:00
params.c Nothing exciting, minor tweaks and cleanups. 2015-11-09 15:53:39 -08:00
pid_namespace.c pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes 2017-05-25 14:30:11 +02:00
pid.c pids: make task_tgid_nr_ns() safe 2017-08-24 17:02:36 -07:00
profile.c
ptrace.c ptrace: Properly initialize ptracer_cred on fork 2017-06-14 13:16:20 +02:00
range.c
reboot.c
relay.c
resource.c /proc/iomem: only expose physical resource addresses to privileged users 2017-08-06 19:19:42 -07:00
seccomp.c seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter() 2017-10-05 09:41:46 +02:00
signal.c signal: protect SIGNAL_UNKILLABLE from unintentional clearing. 2017-08-11 09:08:59 -07:00
smp.c mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
smpboot.c
smpboot.h
softirq.c
stacktrace.c
stop_machine.c kernel: remove stop_machine() Kconfig dependency 2015-12-12 10:15:34 -08:00
sys_ni.c mm: mlock: add new mlock system call 2015-11-05 19:34:48 -08:00
sys.c prctl: take mmap sem for writing to protect against others 2016-02-25 12:01:25 -08:00
sysctl_binary.c fs/coredump: prevent fsuid=0 dumps into user-controlled directories 2016-04-12 09:08:58 -07:00
sysctl.c timer/sysclt: Restrict timer migration sysctl values to 0 and 1 2017-10-05 09:41:47 +02:00
task_work.c
taskstats.c
test_kprobes.c
torture.c
tracepoint.c
tsacct.c
uid16.c
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog.c kernel/watchdog: use nmi registers snapshot in hardlockup handler 2017-01-06 11:16:16 +01:00
workqueue_internal.h
workqueue.c workqueue: replace pool->manager_arb mutex with a flag 2017-11-02 09:40:48 +01:00