commit 3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8 upstream.
Now that mb() is an instruction barrier, it will slow performance if we issue
unnecessary barriers.
The spinlock defines have a number of unnecessary barriers. The __ldcw()
define is both a hardware and compiler barrier. The mb() barriers in the
routines using __ldcw() serve no purpose.
The only barrier needed is the one in arch_spin_unlock(). We need to ensure
all accesses are complete prior to releasing the lock.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fedb8da96355f5f64353625bf96dc69423ad1826 upstream.
For years I thought all parisc machines executed loads and stores in
order. However, Jeff Law recently indicated on gcc-patches that this is
not correct. There are various degrees of out-of-order execution all the
way back to the PA7xxx processor series (hit-under-miss). The PA8xxx
series has full out-of-order execution for both integer operations, and
loads and stores.
This is described in the following article:
http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml
For this reason, we need to define mb() and to insert a memory barrier
before the store unlocking spinlocks. This ensures that all memory
accesses are complete prior to unlocking. The ldcw instruction performs
the same function on entry.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30d6e0a4190d37740e9447e4e4815f06992dd8c3 upstream.
There is code duplicated over all architecture's headers for
futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
and comparison of the result.
Remove this duplication and leave up to the arches only the needed
assembly which is now in arch_futex_atomic_op_inuser.
This effectively distributes the Will Deacon's arm64 fix for undefined
behaviour reported by UBSAN to all architectures. The fix was done in
commit 5f16a046f8e1 (arm64: futex: Fix undefined behaviour with
FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
And as suggested by Thomas, check for negative oparg too, because it was
also reported to cause undefined behaviour report.
Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
remove pointless access_ok() checks") as access_ok there returns true.
We introduce it back to the helper for the sake of simplicity (it gets
optimized away anyway).
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org>
Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
Cc: linux-mips@linux-mips.org
Cc: Rich Felker <dalias@libc.org>
Cc: linux-ia64@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: peterz@infradead.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: sparclinux@vger.kernel.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: linux-s390@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-hexagon@vger.kernel.org
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-xtensa@linux-xtensa.org
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: openrisc@lists.librecores.org
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Stafford Horne <shorne@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-parisc@vger.kernel.org
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: linux-alpha@vger.kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: "David S. Miller" <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88776c0e70be0290f8357019d844aae15edaa967 upstream.
Qemu for PARISC reported on a 32bit SMP parisc kernel strange failures
about "Not-handled unaligned insn 0x0e8011d6 and 0x0c2011c9."
Those opcodes evaluate to the ldcw() assembly instruction which requires
(on 32bit) an alignment of 16 bytes to ensure atomicity.
As it turns out, qemu is correct and in our assembly code in entry.S and
pacache.S we don't pay attention to the required alignment.
This patch fixes the problem by aligning the lock offset in assembly
code in the same manner as we do in our C-code.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 649aa24254e85bf6bd7807dd372d083707852b1f upstream.
This is because of commit f98db6013c55 ("sched/core: Add switch_mm_irqs_off()
and use it in the scheduler") in which switch_mm_irqs_off() is called by the
scheduler, vs switch_mm() which is used by use_mm().
This patch lets the parisc code mirror the x86 and powerpc code, ie. it
disables interrupts in switch_mm(), and optimises the scheduler case by
defining switch_mm_irqs_off().
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ad5d52d42810bed95100a3d912679d8864421ec upstream.
In swab.h the "#if BITS_PER_LONG > 32" breaks compiling userspace programs if
BITS_PER_LONG is #defined by userspace with the sizeof() compiler builtin.
Solve this problem by using __BITS_PER_LONG instead. Since we now
#include asm/bitsperlong.h avoid further potential userspace pollution
by moving the #define of SHIFT_PER_LONG to bitops.h which is not
exported to userspace.
This patch unbreaks compiling qemu on hppa/parisc.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c78e710c1c9fbeff43dddc0aa3d0ff458e70b0cc upstream.
The attached change interchanges the order of purging the TLB and
setting the corresponding page table entry. TLB purges are strongly
ordered. It occurred to me one night that setting the PTE first might
have subtle ordering issues on SMP machines and cause random memory
corruption.
A TLB lock guards the insertion of user TLB entries. So after the TLB
is purged, a new entry can't be inserted until the lock is released.
This ensures that the new PTE value is used when the lock is released.
Since making this change, no random segmentation faults have been
observed on the Debian hppa buildd servers.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65bf34f59594c11f13d371c5334a6a0a385cd7ae upstream.
Increase the initial kernel default page mapping size for 64-bit kernels to
64 MB and for 32-bit kernels to 32 MB.
Due to the additional support of ftrace, tracepoint and huge pages the kernel
size can exceed the sizes we used up to now.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 690d097c00c88fa9d93d198591e184164b1d8c20 upstream.
Increase the initial kernel default page mapping size for SMP kernels to 32MB
and add a runtime check which panics early if the kernel is bigger than the
initial mapping size.
This fixes boot crashes of 32bit SMP kernels. Due to the introduction of huge
page support in kernel 4.4 and it's required initial kernel layout in memory, a
32bit SMP kernel usually got bigger (in layout, not size) than 16MB.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3eb53b20d7bd1374598cfb1feaa081fcac0e76cd upstream.
When building gccgo in userspace, errno.h gets parsed and the go include file
sysinfo.go is generated.
Since EREFUSED is defined to the same value as ECONNREFUSED, and ECONNREFUSED
is defined later on in errno.h, this leads to go complaining that EREFUSED
isn't defined yet.
Fix this trivial problem by moving the define of EREFUSED down after
ECONNREFUSED in errno.h (and clean up the indenting while touching this line).
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ef4dfd9d9f288943e249b78365a69e3ea3ec072 upstream.
Handling exceptions from modules never worked on parisc.
It was just masked by the fact that exceptions from modules
don't happen during normal use.
When a module triggers an exception in get_user() we need to load the
main kernel dp value before accessing the exception_data structure, and
afterwards restore the original dp value of the module on exit.
Noticed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e60fc5aa608eb38b47ba4ee058f306f739eb70a0 upstream.
On a 64bit kernel build the compiler aligns the _sifields union in the
struct siginfo_t on a 64bit address. The __ARCH_SI_PREAMBLE_SIZE define
compensates for this alignment and thus fixes the wait testcase of the
strace package.
The symptoms of a wrong __ARCH_SI_PREAMBLE_SIZE value is that
_sigchld.si_stime variable is missed to be copied and thus after a
copy_siginfo() will have uninitialized values.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b0e551313ebde17764f3a5ed273df524d1e7e690 upstream.
PA-RISC doesn't have atomic instructions to modify page table entries, so it
takes spinlock in the TLB handler and modifies the page table entry
non-atomically. If you modify the page table entry without the spinlock, you
may race with TLB handler on another CPU and your modification may be lost.
Protect against that with usage of purge_tlb_start() and purge_tlb_end() which
handles the TLB spinlock.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull parisc update from Helge Deller:
"This patchset adds Huge Page and HUGETLBFS support for parisc"
Honestly, the hugepage support should have gone through in the merge
window, and is not really an rc-time fix. But it only touches
arch/parisc, and I cannot find it in myself to care. If one of the
three parisc users notices a breakage, I will point at Helge and make
rude farting noises.
* 'parisc-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Map kernel text and data on huge pages
parisc: Add Huge Page and HUGETLBFS support
parisc: Use long branch to do_syscall_trace_exit
parisc: Increase initial kernel mapping to 32MB on 64bit kernel
parisc: Initialize the fault vector earlier in the boot process.
parisc: Add defines for Huge page support
parisc: Drop unused MADV_xxxK_PAGES flags from asm/mman.h
parisc: Drop definition of start_thread_som for HP-UX SOM binaries
parisc: Fix wrong comment regarding first pmd entry flags
This patch adds huge page support to allow userspace to allocate huge
pages and to use hugetlbfs filesystem on 32- and 64-bit Linux kernels.
A later patch will add kernel support to map kernel text and data on
huge pages.
The only requirement is, that the kernel needs to be compiled for a
PA8X00 CPU (PA2.0 architecture). Older PA1.X CPUs do not support
variable page sizes. 64bit Kernels are compiled for PA2.0 by default.
Technically on parisc multiple physical huge pages may be needed to
emulate standard 2MB huge pages.
Signed-off-by: Helge Deller <deller@gmx.de>
For the 64bit kernel the initially 16 MB kernel memory might become too
small if you build a kernel with many modules built-in and with kernel
text and data areas mapped on huge pages.
This patch increases the initial mapping to 32MB for 64bit kernels and
keeps 16MB for 32bit kernels.
Signed-off-by: Helge Deller <deller@gmx.de>
Huge pages on parisc will have the same size as one pmd table, which
is on a 64bit kernel 2MB on a kernel with 4K kernel page sizes, and
on a 32bit kernel 4MB when used with 4K kernel pages.
Since parisc does not physically supports 2MB huge page sizes, emulate
it with two consecutive 1MB page sizes instead. Keeping the same huge
page size as one pmd will allow us to add transparent huge page support
later on.
Bit 21 in the pte flags was unused and will now be used to mark a page
as huge page (_PAGE_HPAGE_BIT).
Signed-off-by: Helge Deller <deller@gmx.de>
Drop the MADV_xxK_PAGES flags, which were never used and were from a proposed
API which was never integrated into the generic Linux kernel code.
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
The definition of start_thread_som was planned to be used to execute
HP-UX SOM binaries. Since HP-UX compatibility was dropped with kernel 4.0
there is no need to carry it further.
Signed-off-by: Helge Deller <deller@gmx.de>
Pull parisc updates from Helge Deller:
"We have two patches in here:
- The parisc uapi headers have been screwed up since quite some time.
This patch fixes some bugs (e.g. endianess not respected in
compat_semid64_ds) and cleans them up (e.g. uid_t was used instead
of __kernel_uid_t) so that they can be used by userspace again.
This patch has been reviewed by Arnd Bergmann and is scheduled for
stable kernel series.
- Drop the hpux_stat64 struct from stat.h, we do not support HP-UX
binaries since kernel 4.0"
* 'parisc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fixes and cleanups in kernel uapi header files
parisc: Drop hpux_stat64 struct from stat.h header file
Removal started in commit 5bbeed12bd ("sparc32: drop unused
kmap_atomic_to_page"). Let's do it across the whole tree.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes some bugs and partly cleans up the parisc uapi header
files to what glibc defined:
- compat_semid64_ds was wrong and did not take the endianess into
account
- ipc64_perm exported userspace types which broke building userspace
packages on debian (e.g. trinity)
- ipc64_perm needs to use a 32bit mode_t on 64bit kernel
- msqid64_ds and semid64_ds needs unsigned longs for various struct members
- shmid64_ds exported size_t instead of __kernel_size_t
And finally add some compile-time checks for the sizes of those structs
to avoid future breakage.
Runtime-tested with the Linux Test Project (LTP) testsuite.
Cc: <stable@vger.kernel.org> # 3.18+
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Helge Deller <deller@gmx.de>
The struct hpux_stat64 is not needed any longer since we dropped HP-UX
support in commit 04c1614 ("parisc: hpux - Drop support for HP-UX
binaries").
Signed-off-by: Helge Deller <deller@gmx.de>
The previous patch introduced a flag that specified pages in a VMA should
be placed on the unevictable LRU, but they should not be made present when
the area is created. This patch adds the ability to set this state via
the new mlock system calls.
We add MLOCK_ONFAULT for mlock2 and MCL_ONFAULT for mlockall.
MLOCK_ONFAULT will set the VM_LOCKONFAULT modifier for VM_LOCKED.
MCL_ONFAULT should be used as a modifier to the two other mlockall flags.
When used with MCL_CURRENT, all current mappings will be marked with
VM_LOCKED | VM_LOCKONFAULT. When used with MCL_FUTURE, the mm->def_flags
will be marked with VM_LOCKED | VM_LOCKONFAULT. When used with both
MCL_CURRENT and MCL_FUTURE, all current mappings and mm->def_flags will be
marked with VM_LOCKED | VM_LOCKONFAULT.
Prior to this patch, mlockall() will unconditionally clear the
mm->def_flags any time it is called without MCL_FUTURE. This behavior is
maintained after adding MCL_ONFAULT. If a call to mlockall(MCL_FUTURE) is
followed by mlockall(MCL_CURRENT), the mm->def_flags will be cleared and
new VMAs will be unlocked. This remains true with or without MCL_ONFAULT
in either mlockall() invocation.
munlock() will unconditionally clear both vma flags. munlockall()
unconditionally clears for VMA flags on all VMAs and in the mm->def_flags
field.
Signed-off-by: Eric B Munson <emunson@akamai.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull parisc updates from Helge Deller:
"The most important change is that we reduce L1_CACHE_BYTES to 16
bytes, for which a trivial patch for XPS in the network layer was
needed. Then we wire up the sys_membarrier and userfaultfd syscalls
and added two other small cleanups"
* 'parisc-4.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Change L1_CACHE_BYTES to 16
net/xps: Fix calculation of initial number of xps queues
parisc: reduce syslog debug output
parisc: serial/mux: Convert to uart_console_device instead of open-coded
parisc: Wire up userfaultfd syscall
parisc: allocate sys_membarrier system call number
Change L1_CACHE_BYTES to 16 bytes.
Tested for 16 days on rp3440.
Additional remarks from Helge Deller:
Saves ~17 kb of kernel code/data and gives a slight performance improvement in
various test cases.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
This patch makes sure that atomic_{read,set}() are at least
{READ,WRITE}_ONCE().
We already had the 'requirement' that atomic_read() should use
ACCESS_ONCE(), and most archs had this, but a few were lacking.
All are now converted to use READ_ONCE().
And, by a symmetry and general paranoia argument, upgrade atomic_set()
to use WRITE_ONCE().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: james.hogan@imgtec.com
Cc: linux-kernel@vger.kernel.org
Cc: oleg@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 3cc2dac5be ("drivers/video/fbdev/atyfb: Replace MTRR UC hole
with strong UC") introduces calls to ioremap_wc and ioremap_uc. This
causes build failures with parisc:allmodconfig. Map the missing
functions to ioremap_nocache.
Fixes: 3cc2dac5be ("drivers/video/fbdev/atyfb:
Replace MTRR UC hole with strong UC")
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Implement atomic logic ops -- atomic_{or,xor,and}.
These will replace the atomic_{set,clear}_mask functions that are
available on some archs.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Implement atomic logic ops -- atomic_{or,xor,and}.
These will replace the atomic_{set,clear}_mask functions that are
available on some archs.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Commit 0e0da48dee ("parisc: mm: don't count preallocated pmds")
introduced a memory leak.
After this commit, the 'return' statement in pmd_free is executed in all
cases. Even for pmd that are not attached to the pgd. So 'free_pages'
can never be called anymore, leading to a memory leak.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Commit 2ae416b142 ("mm: new mm hook framework") introduced an empty
header file (mm-arch-hooks.h) for every architecture, even those which
doesn't need to define mm hooks.
As suggested by Geert Uytterhoeven, this could be cleaned through the use
of a generic header file included via each per architecture
asm/include/Kbuild file.
The PowerPC architecture is not impacted here since this architecture has
to defined the arch_remap MM hook.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The increased use of pdtlb/pitlb instructions seemed to increase the
frequency of random segmentation faults building packages. Further, we
had a number of cases where TLB inserts would repeatedly fail and all
forward progress would stop. The Haskell ghc package caused a lot of
trouble in this area. The final indication of a race in pte handling was
this syslog entry on sibaris (C8000):
swap_free: Unused swap offset entry 00000004
BUG: Bad page map in process mysqld pte:00000100 pmd:019bbec5
addr:00000000ec464000 vm_flags:00100073 anon_vma:0000000221023828 mapping: (null) index:ec464
CPU: 1 PID: 9176 Comm: mysqld Not tainted 4.0.0-2-parisc64-smp #1 Debian 4.0.5-1
Backtrace:
[<0000000040173eb0>] show_stack+0x20/0x38
[<0000000040444424>] dump_stack+0x9c/0x110
[<00000000402a0d38>] print_bad_pte+0x1a8/0x278
[<00000000402a28b8>] unmap_single_vma+0x3d8/0x770
[<00000000402a4090>] zap_page_range+0xf0/0x198
[<00000000402ba2a4>] SyS_madvise+0x404/0x8c0
Note that the pte value is 0 except for the accessed bit 0x100. This bit
shouldn't be set without the present bit.
It should be noted that the madvise system call is probably a trigger for many
of the random segmentation faults.
In looking at the kernel code, I found the following problems:
1) The pte_clear define didn't take TLB lock when clearing a pte.
2) We didn't test pte present bit inside lock in exception support.
3) The pte and tlb locks needed to merged in order to ensure consistency
between page table and TLB. This also has the effect of serializing TLB
broadcasts on SMP systems.
The attached change implements the above and a few other tweaks to try
to improve performance. Based on the timing code, TLB purges are very
slow (e.g., ~ 209 cycles per page on rp3440). Thus, I think it
beneficial to test the split_tlb variable to avoid duplicate purges.
Probably, all PA 2.0 machines have combined TLBs.
I dropped using __flush_tlb_range in flush_tlb_mm as I realized all
applications and most threads have a stack size that is too large to
make this useful. I added some comments to this effect.
Since implementing 1 through 3, I haven't had any random segmentation
faults on mx3210 (rp3440) in about one week of building code and running
as a Debian buildd.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Helge Deller <deller@gmx.de>
Pull asm/scatterlist.h removal from Jens Axboe:
"We don't have any specific arch scatterlist anymore, since parisc
finally switched over. Kill the include"
* 'for-4.2/sg' of git://git.kernel.dk/linux-block:
remove scatterlist.h generation from arch Kbuild files
remove <asm/scatterlist.h>
CRIU is recreating the process memory layout by remapping the checkpointee
memory area on top of the current process (criu). This includes remapping
the vDSO to the place it has at checkpoint time.
However some architectures like powerpc are keeping a reference to the
vDSO base address to build the signal return stack frame by calling the
vDSO sigreturn service. So once the vDSO has been moved, this reference
is no more valid and the signal frame built later are not usable.
This patch serie is introducing a new mm hook framework, and a new
arch_remap hook which is called when mremap is done and the mm lock still
hold. The next patch is adding the vDSO remap and unmap tracking to the
powerpc architecture.
This patch (of 3):
This patch introduces a new set of header file to manage mm hooks:
- per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
- a generic header (include/linux/mm-arch-hooks.h)
The architecture which need to overwrite a hook as to redefine it in its
header file, while architecture which doesn't need have nothing to do.
The default hooks are defined in the generic header and are used in the
case the architecture is not defining it.
In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
be moved here.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull scheduler updates from Ingo Molnar:
"The main changes are:
- lockless wakeup support for futexes and IPC message queues
(Davidlohr Bueso, Peter Zijlstra)
- Replace spinlocks with atomics in thread_group_cputimer(), to
improve scalability (Jason Low)
- NUMA balancing improvements (Rik van Riel)
- SCHED_DEADLINE improvements (Wanpeng Li)
- clean up and reorganize preemption helpers (Frederic Weisbecker)
- decouple page fault disabling machinery from the preemption
counter, to improve debuggability and robustness (David
Hildenbrand)
- SCHED_DEADLINE documentation updates (Luca Abeni)
- topology CPU masks cleanups (Bartosz Golaszewski)
- /proc/sched_debug improvements (Srikar Dronamraju)"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits)
sched/deadline: Remove needless parameter in dl_runtime_exceeded()
sched: Remove superfluous resetting of the p->dl_throttled flag
sched/deadline: Drop duplicate init_sched_dl_class() declaration
sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target
sched/deadline: Make init_sched_dl_class() __init
sched/deadline: Optimize pull_dl_task()
sched/preempt: Add static_key() to preempt_notifiers
sched/preempt: Fix preempt notifiers documentation about hlist_del() within unsafe iteration
sched/stop_machine: Fix deadlock between multiple stop_two_cpus()
sched/debug: Add sum_sleep_runtime to /proc/<pid>/sched
sched/debug: Replace vruntime with wait_sum in /proc/sched_debug
sched/debug: Properly format runnable tasks in /proc/sched_debug
sched/numa: Only consider less busy nodes as numa balancing destinations
Revert 095bebf61a ("sched/numa: Do not move past the balance point if unbalanced")
sched/fair: Prevent throttling in early pick_next_task_fair()
preempt: Reorganize the notrace definitions a bit
preempt: Use preempt_schedule_context() as the official tracing preemption point
sched: Make preempt_schedule_context() function-tracing safe
x86: Remove cpu_sibling_mask() and cpu_core_mask()
x86: Replace cpu_**_mask() with topology_**_cpumask()
...
Pull locking updates from Ingo Molnar:
"The main changes are:
- 'qspinlock' support, enabled on x86: queued spinlocks - these are
now the spinlock variant used by x86 as they outperform ticket
spinlocks in every category. (Waiman Long)
- 'pvqspinlock' support on x86: paravirtualized variant of queued
spinlocks. (Waiman Long, Peter Zijlstra)
- 'qrwlock' support, enabled on x86: queued rwlocks. Similar to
queued spinlocks, they are now the variant used by x86:
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
- various lockdep fixlets
- various locking primitives cleanups, further WRITE_ONCE()
propagation"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
locking/lockdep: Remove hard coded array size dependency
locking/qrwlock: Don't contend with readers when setting _QW_WAITING
lockdep: Do not break user-visible string
locking/arch: Rename set_mb() to smp_store_mb()
locking/arch: Add WRITE_ONCE() to set_mb()
rtmutex: Warn if trylock is called from hard/softirq context
arch: Remove __ARCH_HAVE_CMPXCHG
locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG
locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS
locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS
locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()
locking/pvqspinlock, x86: Enable PV qspinlock for Xen
locking/pvqspinlock, x86: Enable PV qspinlock for KVM
locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
locking/pvqspinlock: Implement simple paravirt support for the qspinlock
locking/qspinlock: Revert to test-and-set on hypervisors
locking/qspinlock: Use a simple write to grab the lock
locking/qspinlock: Optimize for smaller NR_CPUS
locking/qspinlock: Extract out code snippets for the next patch
locking/qspinlock: Add pending bit
...
pci_dma_burst_advice() was added by e24c2d963a ("[PATCH] PCI: DMA
bursting advice") but apparently never used. Remove it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michal Simek <monstr@monstr.eu> # microblaze
CC: David S. Miller <davem@davemloft.net>
We removed the only user of this define in the rtmutex code. Get rid
of it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>