Commit message | Author | Age | Files | Lines
* net/l2tp: don't fall back on UDP [get|set]sockopt | Sasha Levin | 2016-03-11 | 1 | -2/+2

    The l2tp [get|set]sockopt() code has fallen back to the UDP functions for socket option levels != SOL_PPPOL2TP since day one, but that has never actually worked, since the l2tp socket isn't an inet socket.

    As David Miller points out:

      "If we wanted this to work, it'd have to look up the tunnel and then use tunnel->sk, but I wonder how useful that would be"

    Since this can never have worked, nobody could possibly have depended on that functionality, so just remove the broken code and return -EINVAL.

    Reported-by: Sasha Levin <sasha.levin@oracle.com>
    Acked-by: James Chapman <jchapman@katalix.com>
    Acked-by: David Miller <davem@davemloft.net>
    Cc: Phil Turnbull <phil.turnbull@oracle.com>
    Cc: Vegard Nossum <vegard.nossum@oracle.com>
    Cc: Willy Tarreau <w@1wt.eu>
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Change-Id: I63a8c6572971c62370442c7e74c53e66b43e4b1b
* n_tty: Fix n_tty_write crash when echoing in raw mode | Peter Hurley | 2016-03-11 | 1 | -0/+2

    The tty atomic_write_lock does not provide an exclusion guarantee for the tty driver if the termios settings are LECHO & !OPOST. And since it is unexpected and not allowed to call TTY buffer helpers like tty_insert_flip_string concurrently, this may lead to crashes when concurrent writers call pty_write. In that case the following two writers:
    * the ECHOing from a workqueue and
    * pty_write from the process
    race and can overflow the corresponding TTY buffer as follows.

    If we look into tty_insert_flip_string_fixed_flag, there is:

      int space = __tty_buffer_request_room(port, goal, flags);
      struct tty_buffer *tb = port->buf.tail;
      ...
      memcpy(char_buf_ptr(tb, tb->used), chars, space);
      ...
      tb->used += space;

    so the race of the two can result in something like this:

      A                                B
      __tty_buffer_request_room
                                       __tty_buffer_request_room
      memcpy(buf(tb->used), ...)
      tb->used += space;
                                       memcpy(buf(tb->used), ...) ->BOOM

    B's memcpy is past the tty_buffer due to the previous A's tb->used increment.

    Since the N_TTY line discipline input processing can output concurrently with a tty write, obtain the N_TTY ldisc output_lock to serialize echo output with normal tty writes. This ensures the tty buffer helper tty_insert_flip_string is not called concurrently and everything is fine.

    Note that this is nicely reproducible by an ordinary user using forkpty and some setup around that (raw termios + ECHO). And it is present in kernels at least after commit d945cb9cce20ac7143c2de8d88b187f62db99bdc (pty: Rework the pty layer to use the normal buffering logic) in 2.6.31-rc3.

    js: add more info to the commit log
    js: switch to bool
    js: lock unconditionally
    js: lock only the tty->ops->write call

    Change-Id: I63e622f7303d46a784116b26f5937ae65036b23b
    References: CVE-2014-0196
    Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

    Cherry-picked from git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
      branch: master
      commit: 4291086b1f081b869c6d79e5b7441633dc3ace00
    Needed some tweaks because there is no
      bddc715 2012-10-18 jslaby@suse.cz TTY: move ldisc data from tty_struct: locks

    Signed-off-by: JP Abgrall <jpa@google.com>
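Note: the overflow described in the n_tty commit above can be reproduced deterministically in a small user-space model. The sketch below is not the kernel code; `struct tbuf`, `request_room()`, `commit_room()` and the two scenario functions are hypothetical names chosen for illustration. It only shows why "reserve room" and "advance tb->used" must happen as one unit per writer, which is what taking output_lock around the write is meant to achieve.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct tty_buffer: capacity plus fill level. */
struct tbuf {
    size_t size;   /* capacity in bytes */
    size_t used;   /* bytes already accounted for */
};

/* Step 1 of an insert: how many of `goal` bytes still fit right now. */
size_t request_room(const struct tbuf *tb, size_t goal)
{
    size_t left = tb->size - tb->used;
    return goal < left ? goal : left;
}

/* Step 2 of an insert: account for the bytes that were copied. */
void commit_room(struct tbuf *tb, size_t space)
{
    tb->used += space;
}

/* Both writers reserve before either commits (the A/B interleaving).
 * Returns 1 if the accounting ends up past the end of the buffer. */
int racy_overflows(size_t size, size_t goal_a, size_t goal_b)
{
    struct tbuf tb = { size, 0 };
    size_t a = request_room(&tb, goal_a);
    size_t b = request_room(&tb, goal_b);   /* stale view of tb.used */

    commit_room(&tb, a);
    commit_room(&tb, b);
    return tb.used > tb.size;
}

/* Each writer reserves and commits as one unit, as if a lock were held. */
int serialized_overflows(size_t size, size_t goal_a, size_t goal_b)
{
    struct tbuf tb = { size, 0 };

    commit_room(&tb, request_room(&tb, goal_a));
    commit_room(&tb, request_room(&tb, goal_b));
    return tb.used > tb.size;
}
```

With an 8-byte buffer and two 6-byte writes, the racy ordering accounts for 12 bytes, while the serialized ordering truncates the second write to the 2 bytes that remain.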
* net: ipv4: current group_info should be put after using. | JP Abgrall | 2016-03-11 | 1 | -4/+11

    Plug a group_info refcount leak in ping_init. group_info is only needed during initialization and the code failed to release the reference on exit. While here move grabbing the reference to a place where it is actually needed.

    Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
    Signed-off-by: Zhang Dongxing <dongxing.zhang@intel.com>
    Signed-off-by: xiaoming wang <xiaoming.wang@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

    Conflicts:
      net/ipv4/ping.c

    Change-Id: I51931e439cce7f19b4179f5828d7355bb3b1dcfa
* net: Fix "ip rule delete table 256" | Andreas Henriksson | 2016-03-11 | 1 | -1/+2

    [ Upstream commit 13eb2ab2d33c57ebddc57437a7d341995fc9138c ]

    When trying to delete a table >= 256 using iproute2 the local table will be deleted. The table id is specified as a netlink attribute when it needs more than 8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0). The preconditions for matching the table id in the rule delete code don't seem to take the "table id in netlink attribute" case into account, so the frh_get_table helper function never gets to do its job when matching against the current rule. Use the helper function twice instead of peeking at the table value directly.

    Originally reported at: http://bugs.debian.org/724783

    Reported-by: Nicolas HICHER <nhicher@avencall.com>
    Signed-off-by: Andreas Henriksson <andreas@fatal.se>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
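Note: the matching problem in the "ip rule delete" commit above can be modelled in a few lines of user-space C. This is a sketch reconstructed from the commit description, not the kernel source: `struct rule_hdr`, `get_table()`, `skip_rule_old()` and `skip_rule_new()` are hypothetical names, and a plain pointer stands in for the optional netlink table attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the request header: the table field is 8 bits wide. */
struct rule_hdr {
    uint8_t table;
};

/* Model of the helper: prefer the 32-bit attribute when it is present,
 * otherwise fall back to the 8-bit header field. */
uint32_t get_table(const struct rule_hdr *hdr, const uint32_t *table_attr)
{
    if (table_attr != NULL)
        return *table_attr;
    return hdr->table;
}

/* Old precondition: peeks at hdr->table directly. For tables >= 256 the
 * header field is 0, so the table comparison is never performed. */
int skip_rule_old(const struct rule_hdr *hdr, const uint32_t *table_attr,
                  uint32_t rule_table)
{
    return hdr->table && get_table(hdr, table_attr) != rule_table;
}

/* Fixed precondition: uses the helper for both the "is a table given"
 * test and the comparison. */
int skip_rule_new(const struct rule_hdr *hdr, const uint32_t *table_attr,
                  uint32_t rule_table)
{
    uint32_t t = get_table(hdr, table_attr);

    return t && t != rule_table;
}
```

For a delete request naming table 256 (header field 0, attribute 256), the old check does not skip a rule that belongs to table 255, so that rule is treated as a match; the fixed check skips it.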
* net: ipv6: autoconf routes into per-device tables | Lorenzo Colitti | 2016-03-11 | 4 | -44/+103

    Currently, IPv6 router discovery always puts routes into RT6_TABLE_MAIN. This causes problems for connection managers that want to support multiple simultaneous network connections and want control over which one is used by default (e.g., wifi and wired).

    To work around this connection managers typically take the routes they prefer and copy them to static routes with low metrics in the main table. This puts the burden on the connection manager to watch netlink to see if the routes have changed, delete the routes when their lifetime expires, etc.

    Instead, this patch adds a per-interface sysctl to have the kernel put autoconf routes into different tables. This allows each interface to have its own autoconf table, and choosing the default interface (or using different interfaces at the same time for different types of traffic) can be done using appropriate ip rules.

    The sysctl behaves as follows:

    - = 0: default. Put routes into RT6_TABLE_MAIN as before.
    - > 0: manual. Put routes into the specified table.
    - < 0: automatic. Add the absolute value of the sysctl to the device's ifindex, and use that table.

    The automatic mode is most useful in conjunction with net.ipv6.conf.default.accept_ra_rt_table. A connection manager or distribution could set it to, say, -100 on boot, and thereafter just use IP rules.

    Change-Id: I093d39fb06ec413905dc0d0d5792c1bc5d5c73a9
    Signed-off-by: Lorenzo Colitti <lorenzo@google.com>

    Conflicts:
      net/ipv6/addrconf.c
      net/ipv6/route.c
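Note: the three sysctl modes listed in the autoconf commit above reduce to a small selection function. The following is a user-space sketch of that rule only; `ra_table()` is a hypothetical name, and the value 254 for RT6_TABLE_MAIN is stated here as an assumption about the usual main-table id rather than taken from the patch.

```c
#include <stdint.h>

/* Assumed id of the main routing table. */
#define RT6_TABLE_MAIN 254

/* Pick the routing table for autoconf routes on one interface.
 *   sysctl == 0: main table, as before
 *   sysctl  > 0: the table named by the sysctl
 *   sysctl  < 0: ifindex + |sysctl|
 */
uint32_t ra_table(int sysctl, int ifindex)
{
    if (sysctl == 0)
        return RT6_TABLE_MAIN;
    if (sysctl > 0)
        return (uint32_t)sysctl;
    /* Widen before negating so the most negative int cannot overflow. */
    return (uint32_t)((int64_t)ifindex - (int64_t)sysctl);
}
```

With the sysctl set to -100, an interface whose ifindex is 3 would use table 103, so an ip rule can select per-interface tables without the connection manager copying routes.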
* add extra free kbytes tunable | Rik van Riel | 2016-03-11 | 3 | -7/+50

    Add a userspace visible knob to tell the VM to keep an extra amount of memory free, by increasing the gap between each zone's min and low watermarks.

    This is useful for realtime applications that call system calls and have a bound on the number of allocations that happen in any short time period. In this application, extra_free_kbytes would be left at an amount equal to or larger than the maximum number of allocations that happen in any burst.

    It may also be useful to reduce the memory use of virtual machines (temporarily?), in a way that does not cause memory fragmentation like ballooning does.

    [ccross] Revived for use on old kernels where no other solution exists. The tunable will be removed on kernels that do better at avoiding direct reclaim.

    Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e
    Signed-off-by: Rik van Riel <riel@redhat.com>
    Signed-off-by: Colin Cross <ccross@android.com>
* oom: do not kill tasks with oom_score_adj OOM_SCORE_ADJ_MIN | Michal Hocko | 2016-03-11 | 1 | -0/+5

    Commit c9f01245 ("oom: remove oom_disable_count") has removed the oom_disable_count counter which has been used for early break out from oom_badness so we could never select a task with oom_score_adj set to OOM_SCORE_ADJ_MIN (oom disabled).

    Now that the counter is gone we are always going through heuristics calculation and we always return a non zero positive value. This means that we can end up killing a task with OOM disabled because it is indistinguishable from regular tasks with 1% (respectively CAP_SYS_ADMIN tasks with 3%) usage of memory, or from tasks with oom_score_adj set but OOM enabled.

    Let's break out early if the task should have OOM disabled.

    Signed-off-by: Michal Hocko <mhocko@suse.cz>
    Acked-by: David Rientjes <rientjes@google.com>
    Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Ying Han <yinghan@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
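Note: the early break-out described in the oom commit above can be illustrated with a deliberately simplified scoring function. This is a sketch under stated assumptions, not the kernel's oom_badness(): `badness()` is a hypothetical name, the score is reduced to "memory share plus adjustment", and -1000 is assumed as the value of OOM_SCORE_ADJ_MIN.

```c
/* Assumed value of the "OOM disabled" adjustment. */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Simplified badness score.
 * mem_permille: share of memory used by the task, in units of 0.1%.
 * Returns 0 only for tasks that must never be selected; every other
 * task gets at least 1, mirroring "always return a non zero positive
 * value" from the commit message. */
long badness(long mem_permille, int oom_score_adj)
{
    long points;

    /* The early break-out: OOM-disabled tasks are never candidates. */
    if (oom_score_adj == OOM_SCORE_ADJ_MIN)
        return 0;

    points = mem_permille + oom_score_adj;
    return points > 0 ? points : 1;
}
```

Without the first check, a task with the minimum adjustment would fall through to the clamp and score 1, the same as a small killable task, which is the ambiguity the commit removes.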
* oom: remove oom_disable_count | David Rientjes | 2016-03-11 | 6 | -49/+6

    This removes mm->oom_disable_count entirely since it's unnecessary and currently buggy. The counter was intended to be per-process but it's currently decremented in the exit path for each thread that exits, causing it to underflow.

    The count was originally intended to prevent oom killing threads that share memory with threads that cannot be killed since it doesn't lead to future memory freeing. The counter could be fixed to represent all threads sharing the same mm, but it's better to remove the count since:

    - it is possible that the OOM_DISABLE thread sharing memory with the victim is waiting on that thread to exit and will actually cause future memory freeing, and
    - there is no guarantee that a thread is disabled from oom killing just because another thread sharing its mm is oom disabled.

    Signed-off-by: David Rientjes <rientjes@google.com>
    Reported-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Oleg Nesterov <oleg@redhat.com>
    Cc: Ying Han <yinghan@google.com>
    Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* oom: avoid killing kthreads if they assume the oom killed thread's mm | David Rientjes | 2016-03-11 | 1 | -2/+3

    After selecting a task to kill, the oom killer iterates all processes and kills all other threads that share the same mm_struct in different thread groups. It would not otherwise be helpful to kill a thread if its memory would not be subsequently freed.

    A kernel thread, however, may assume a user thread's mm by using use_mm(). This is only temporary and should not result in sending a SIGKILL to that kthread.

    This patch ensures that only user threads and not kthreads are sent a SIGKILL if they share the same mm_struct as the oom killed task.

    Signed-off-by: David Rientjes <rientjes@google.com>
    Reviewed-by: Michal Hocko <mhocko@suse.cz>
    Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ARM: 8120/1: crypto: sha512: add ARM NEON implementation | Jussi Kivilinna | 2016-03-11 | 4 | -0/+777

    This patch adds ARM NEON assembly implementation of SHA-512 and SHA-384 algorithms.

    tcrypt benchmark results on Cortex-A8, sha512-generic vs sha512-neon-asm:

    block-size      bytes/update    old-vs-new
    16              16              2.99x
    64              16              2.67x
    64              64              3.00x
    256             16              2.64x
    256             64              3.06x
    256             256             3.33x
    1024            16              2.53x
    1024            256             3.39x
    1024            1024            3.52x
    2048            16              2.50x
    2048            256             3.41x
    2048            1024            3.54x
    2048            2048            3.57x
    4096            16              2.49x
    4096            256             3.42x
    4096            1024            3.56x
    4096            4096            3.59x
    8192            16              2.48x
    8192            256             3.42x
    8192            1024            3.56x
    8192            4096            3.60x
    8192            8192            3.60x

    Change-Id: Ibc318f8c9136507f57e2bb8d8f51b4714d8ed70b
    Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    Signed-off-by: Iliyan Malchev <malchev@google.com>
* ARM: 8119/1: crypto: sha1: add ARM NEON implementation | Jussi Kivilinna | 2016-03-11 | 6 | -3/+859

    This patch adds ARM NEON assembly implementation of SHA-1 algorithm.

    tcrypt benchmark results on Cortex-A8, sha1-arm-asm vs sha1-neon-asm:

    block-size      bytes/update    old-vs-new
    16              16              1.04x
    64              16              1.02x
    64              64              1.05x
    256             16              1.03x
    256             64              1.04x
    256             256             1.30x
    1024            16              1.03x
    1024            256             1.36x
    1024            1024            1.52x
    2048            16              1.03x
    2048            256             1.39x
    2048            1024            1.55x
    2048            2048            1.59x
    4096            16              1.03x
    4096            256             1.40x
    4096            1024            1.57x
    4096            4096            1.62x
    8192            16              1.03x
    8192            256             1.40x
    8192            1024            1.58x
    8192            4096            1.63x
    8192            8192            1.63x

    Change-Id: I6df3c0a9ba8d450d034cf78785b6ce80a72bef4a
    Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    Signed-off-by: Iliyan Malchev <malchev@google.com>
* ARM: 8118/1: crypto: sha1/make use of common SHA-1 structures | Jussi Kivilinna | 2016-03-11 | 1 | -28/+22

    Common SHA-1 structures are defined in <crypto/sha.h> for code sharing.

    This patch changes SHA-1/ARM glue code to use these structures.

    Change-Id: I5b82530706fa7c6f5ec08926992b86d26fa1c24d
    Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* crypto: arm-aes - fix encryption of unaligned data | Mikulas Patocka | 2016-03-11 | 1 | -5/+5

    Fix the same alignment bug as in arm64: we need to pass the residue of unprocessed bytes as the last argument to blkcipher_walk_done.

    Change-Id: Ia4d3cacb006269aa5b9c0c542256eff5822e84ac
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Cc: stable@vger.kernel.org # 3.13+
    Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* CRYPTO: Fix more AES build errors | Russell King | 2016-03-11 | 2 | -2/+2

    Building a multi-arch kernel results in:

    arch/arm/crypto/built-in.o: In function `aesbs_xts_decrypt':
    sha1_glue.c:(.text+0x15c8): undefined reference to `bsaes_xts_decrypt'
    arch/arm/crypto/built-in.o: In function `aesbs_xts_encrypt':
    sha1_glue.c:(.text+0x1664): undefined reference to `bsaes_xts_encrypt'
    arch/arm/crypto/built-in.o: In function `aesbs_ctr_encrypt':
    sha1_glue.c:(.text+0x184c): undefined reference to `bsaes_ctr32_encrypt_blocks'
    arch/arm/crypto/built-in.o: In function `aesbs_cbc_decrypt':
    sha1_glue.c:(.text+0x19b4): undefined reference to `bsaes_cbc_encrypt'

    This code is already runtime-conditional on NEON being supported, so there's no point compiling it out depending on the minimum build architecture.

    Change-Id: Iff65acec7d30c508bf72132acad67332ea56bd3b
    Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: add .gitignore entry for aesbs-core.S | Russell King | 2016-03-11 | 1 | -0/+1

    This avoids this file being incorrectly added to git.

    Change-Id: If8d1d669d8565b1f1cf3751b202bae052d26b53b
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: add support for bit sliced AES using NEON instructions | Ard Biesheuvel | 2016-03-11 | 5 | -2/+5473

    Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption and around 25% for decryption. This implementation of the AES algorithm does not rely on any lookup tables so it is believed to be invulnerable to cache timing attacks.

    This algorithm processes up to 8 blocks in parallel in constant time. This means that it is not usable by chaining modes that are strictly sequential in nature, such as CBC encryption. CBC decryption, however, can benefit from this implementation and runs about 25% faster. The other chaining modes implemented in this module, XTS and CTR, can execute fully in parallel in both directions.

    The core code has been adopted from the OpenSSL project (in collaboration with the original author, on cc). For ease of maintenance, this version is identical to the upstream OpenSSL code, i.e., all modifications that were required to make it suitable for inclusion into the kernel have been made upstream. The original can be found here:

      http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130

    Note to integrators: While this implementation is significantly faster than the existing table based ones (generic or ARM asm), especially in CTR mode, the effects on power efficiency are unclear as of yet. This code does fundamentally more work, by calculating values that the table based code obtains by a simple lookup; only by doing all of that work in a SIMD fashion, it manages to perform better.

    Change-Id: Ife4f79ce9e8994e248d6fc01fcb23b0534265418
    Cc: Andy Polyakov <appro@openssl.org>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
* ARM: move AES typedefs and function prototypes to separate header | Ard Biesheuvel | 2016-03-11 | 2 | -16/+25

    Put the struct definitions for AES keys and the asm function prototypes in a separate header and export the asm functions from the module. This allows other drivers to use them directly.

    Change-Id: Ic79a7da83232d4e3658f3fc64de4761c88ae73f3
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
* ARM: 7837/3: fix Thumb-2 bug in AES assembler code | Ard Biesheuvel | 2016-03-11 | 1 | -3/+3

    commit 40190c85f427dcfdbab5dbef4ffd2510d649da1f upstream.

    Patch 638591c enabled building the AES assembler code in Thumb2 mode. However, this code used arithmetic involving PC rather than adr{l} instructions to generate PC-relative references to the lookup tables, and this needs to take into account the different PC offset when running in Thumb mode.

    Change-Id: I7358a145be3f37420c8ce5b8fc83a761b0d863ac
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ARM: 7723/1: crypto: sha1-armv4-large.S: fix SP handling | Ard Biesheuvel | 2016-03-11 | 1 | -1/+1

    Make the SHA1 asm code ABI conformant by making sure all stack accesses occur above the stack pointer.

    Origin: http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=1a9d60d2

    Change-Id: I89fe44b5021ee3d37ac924f04a82e9631e31843e
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Cc: stable@vger.kernel.org
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* ARM: 7626/1: arm/crypto: Make asm SHA-1 and AES code Thumb-2 compatible | Dave Martin | 2016-03-11 | 2 | -59/+29

    This patch fixes aes-armv4.S and sha1-armv4-large.S to work natively in Thumb. This allows ARM/Thumb interworking workarounds to be removed.

    I also take the opportunity to convert some explicit assembler directives for exported functions to the standard ENTRY()/ENDPROC().

    For the code itself:

    * In sha1_block_data_order, use of TEQ with sp is deprecated in ARMv7 and not supported in Thumb. For the branches back to .L_00_15 and .L_40_59, the TEQ is converted to a CMP, under the assumption that clobbering the C flag here will not cause incorrect behaviour.

      For the first branch back to .L_20_39_or_60_79 the C flag is important, so sp is moved temporarily into another register so that TEQ can be used for the comparison.

    * In the AES code, most forms of register-indexed addressing with shifts and rotates are not permitted for loads and stores in Thumb, so the address calculation is done using a separate instruction for the Thumb case.

    The resulting code is unlikely to be optimally scheduled, but it should not have a large impact given the overall size of the code. I haven't run any benchmarks.

    Change-Id: Ic80ff883d90ee1f83b775e0bb447672d81dff54b
    Signed-off-by: Dave Martin <dave.martin@linaro.org>
    Tested-by: David McCullough <ucdevel@gmail.com> (ARM only)
    Acked-by: David McCullough <ucdevel@gmail.com>
    Acked-by: Nicolas Pitre <nico@linaro.org>
    Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* arm/crypto: Add optimized AES and SHA1 routines | David McCullough | 2016-03-11 | 7 | -0/+1945

    Add assembler versions of AES and SHA1 for ARM platforms. This has provided up to a 50% improvement in IPsec/TCP throughput for tunnels using AES128/SHA1.

    Platform   CPU Speed   Endian   Before (bps)   After (bps)   Improvement
    IXP425     533 MHz     big      11217042       15566294      ~38%
    KS8695     166 MHz     little   3828549        5795373       ~51%

    Change-Id: I5b77e7aa89c8b1d54aef75065827325e90305638
    Signed-off-by: David McCullough <ucdevel@gmail.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* net: ipv6: allow choosing optimistic addresses with use_optimistic | Erik Kline | 2016-03-11 | 1 | -1/+3

    The use_optimistic sysctl makes optimistic IPv6 addresses equivalent to preferred addresses for source address selection (e.g., when calling connect()), but it does not allow an application to bind to optimistic addresses. This behaviour is inconsistent: for example, it doesn't make sense for bind() to an optimistic address to fail with EADDRNOTAVAIL, but for connect() to choose that address as the outgoing address on the same socket.

    Bug: 17769720
    Bug: 18609055
    Change-Id: I9de0d6c92ac45e29d28e318ac626c71806666f13
    Signed-off-by: Erik Kline <ek@google.com>
    Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
* cgroup: remove synchronize_rcu() from cgroup_attach_{task|proc}() | Devin Kim | 2016-03-11 | 1 | -1/+0

    These 2 synchronize_rcu() calls make attaching a task to a cgroup quite slow, and it can't be ignored in some situations.

    A real case from Colin Cross: Android uses cgroups heavily to manage thread priorities, putting threads in a background group with reduced cpu.shares when they are not visible to the user, and in a foreground group when they are. Some RPCs from foreground threads to background threads will temporarily move the background thread into the foreground group for the duration of the RPC. This results in many calls to cgroup_attach_task.

    In cgroup_attach_task() it's task->cgroups that is protected by RCU, and put_css_set() calls kfree_rcu() to free it.

    If we remove this synchronize_rcu(), there can be threads in RCU-read sections accessing their old cgroup via current->cgroups with concurrent rmdir operation, but this is safe.

     # time for ((i=0; i<50; i++)) { echo $$ > /mnt/sub/tasks; echo $$ > /mnt/tasks; }

    real    0m2.524s
    user    0m0.008s
    sys     0m0.004s

    With this patch:

    real    0m0.004s
    user    0m0.004s
    sys     0m0.000s

    tj: These synchronize_rcu()s are utterly confused. synchronize_rcu() necessarily has to come between two operations to guarantee that the changes made by the former operation are visible to all rcu readers before proceeding to the latter operation. Here, the synchronize_rcu() calls are at the end of attach operations with nothing beyond them. Their only effect would be delaying completion of write(2) to sysfs tasks/procs files until all rcu readers see the change, which doesn't mean anything.

    cherry-picked from: https://android.googlesource.com/kernel/common/+/5d65bc0ca1bceb73204dab943922ba3c83276a8c

    Bug: 17709419
    Change-Id: I98dacd6c13da27cb3496fe4a24a24084e46bdd9c
    Signed-off-by: Li Zefan <lizefan@huawei.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Colin Cross <ccross@google.com>
    Signed-off-by: Devin Kim <dojip.kim@lge.com>
* mm: reorder can_do_mlock to fix audit denial | Jeff Vander Stoep | 2016-03-11 | 1 | -2/+2

    A userspace call to mmap(MAP_LOCKED) may result in the successful locking of memory while also producing a confusing audit log denial. can_do_mlock checks capable and rlimit. If either of these returns positive, can_do_mlock returns true. The capable check leads to an LSM hook used by AppArmor and SELinux which produces the audit denial. Reordering so rlimit is checked first eliminates the denial on success, only recording a denial when the lock is unsuccessful as a result of the denial.

    Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
    Acked-by: Nick Kralevich <nnk@google.com>
    Cc: Jeff Vander Stoep <jeffv@google.com>
    Cc: Sasha Levin <sasha.levin@oracle.com>
    Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Paul Cassella <cassella@cray.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
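Note: the effect of the reordering in the can_do_mlock commit above can be shown with a small user-space model. This is a sketch, not the kernel function: `capable_model()`, `can_do_mlock_old()`, `can_do_mlock_new()` and the `audit_denials` counter are hypothetical, and the capability and rlimit are passed in as plain arguments instead of being read from the current task.

```c
/* Counts how many times the modelled capability check logged a denial. */
int audit_denials;

/* Stand-in for the capability check: a failed check has the side effect
 * of recording an audit denial, as the LSM hook would. */
int capable_model(int has_cap)
{
    if (!has_cap)
        audit_denials++;
    return has_cap;
}

/* Old order: capability first, so an unprivileged caller with a nonzero
 * rlimit succeeds but still leaves a denial in the log. */
int can_do_mlock_old(int has_cap, unsigned long rlim_memlock)
{
    if (capable_model(has_cap))
        return 1;
    if (rlim_memlock != 0)
        return 1;
    return 0;
}

/* New order: rlimit first, so the capability check (and its denial)
 * only happens when the rlimit alone does not allow locking. */
int can_do_mlock_new(int has_cap, unsigned long rlim_memlock)
{
    if (rlim_memlock != 0)
        return 1;
    if (capable_model(has_cap))
        return 1;
    return 0;
}
```

Both versions return the same answer for every input; only the number of recorded denials differs, which matches the commit's claim that the reordering changes logging and not policy.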
* SELinux: Fix possible NULL pointer dereference in selinux_inode_permission() | Steven Rostedt | 2015-10-25 | 2 | -3/+22

    While running stress tests on adding and deleting ftrace instances I hit this bug:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
    IP: selinux_inode_permission+0x85/0x160
    PGD 63681067 PUD 7ddbe067 PMD 0
    Oops: 0000 [#1] PREEMPT
    CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
    Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
    task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
    RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
    RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
    RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
    RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
    R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
    R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
    FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
    Call Trace:
     security_inode_permission+0x1c/0x30
     __inode_permission+0x41/0xa0
     inode_permission+0x18/0x50
     link_path_walk+0x66/0x920
     path_openat+0xa6/0x6c0
     do_filp_open+0x43/0xa0
     do_sys_open+0x146/0x240
     SyS_open+0x1e/0x20
     system_call_fastpath+0x16/0x1b
    Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
    RIP selinux_inode_permission+0x85/0x160
    CR2: 0000000000000020

    Investigating, I found that the inode->i_security was NULL, and the dereference of it caused the oops.

    in selinux_inode_permission():

      isec = inode->i_security;

      rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);

    Note, the crash came from stressing the deletion and reading of debugfs files. I was not able to recreate this via normal files. But I'm not sure they are safe. It may just be that the race window is much harder to hit.

    What seems to have happened (and what I have traced), is the file is being opened at the same time the file or directory is being deleted. As the dentry and inode locks are not held during the path walk, nor are the inode's ref counts being incremented, there is nothing saving these structures from being discarded except for an rcu_read_lock().

    The rcu_read_lock() protects against freeing of the inode, but it does not protect freeing of the inode_security_struct. Now if the freeing of the i_security happens with a call_rcu(), and the i_security field of the inode is not changed (it gets freed as the inode gets freed) then there will be no issue here. (Linus Torvalds suggested not setting the field to NULL such that we do not need to check if it is NULL in the permission check).

    Note, this is a hack, but it fixes the problem at hand. A real fix is to restructure the destroy_inode() to call all the destructor handlers from the RCU callback. But that is a major job to do, and requires a lot of work. For now, we just band-aid this bug with this fix (it works), and work on a more maintainable solution in the future.

    Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
    Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home

    Cc: stable@vger.kernel.org
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Add permission checking for binder IPC. | Stephen Smalley | 2015-10-25 | 6 | -0/+159

    Do not apply permission checks to private files.

    Fix security_binder_transfer_binder hook. Drop the owning task argument to security_binder_transfer_binder since ref->node->proc can be NULL (dead owner?). Revise the SELinux checking to apply a single transfer check between the source and destination tasks. Owning task is no longer relevant. Drop the receive permission definition as it is no longer used.

    This makes the transfer permission similar to the call permission; it is only useful if you want to allow a binder IPC between two tasks (call permission) but deny passing of binder references between them (transfer permission).

    Change-Id: I51e7a9a6662e826073b35e4f70a57f9ec73e472e
    Signed-off-by: William Roberts <w.roberts@sta.samsung.com>
* android: binder: Change binder mutex to rtmutex. | Riley Andrews | 2015-10-25 | 1 | -3/+4

    Surfaceflinger uses binder heavily to receive/send frames from applications while compositing the screen. Change the binder mutex to an rt mutex to minimize instances where high priority surfaceflinger binder work is blocked by lower priority binder ipc.

    Change-Id: If7429040641d6e463f20301ec14f02ecf6b0da36
    Signed-off-by: Riley Andrews <riandrews@google.com>

    Conflicts:
      drivers/android/binder.c
* Staging: android: binder: Allow using highmem for binder buffers | Arve Hjønnevåg | 2015-10-25 | 1 | -1/+1

    commit 585650dcec88e704a19bb226a34b6a7166111623 upstream.

    The default kernel mapping for the pages allocated for the binder buffers is never used. Set the __GFP_HIGHMEM flag when allocating these pages so we don't needlessly use low memory pages that may be required elsewhere.

    Signed-off-by: Arve Hjønnevåg <arve@android.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Staging: android: binder: Fix memory leak on thread/process exit | Arve Hjønnevåg | 2015-10-25 | 1 | -1/+27

    commit 675d66b0ed5fd170d6a44cf8dbb3fa56a5347bdb upstream.

    If a thread or process exited while a reply, one-way transaction or death notification was pending, the struct holding the pending work was leaked.

    Signed-off-by: Arve Hjønnevåg <arve@android.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Staging: android: binder: Add some tracepoints | Arve Hjønnevåg | 2015-10-25 | 3 | -20/+400

    Add tracepoints:
    - ioctl entry and exit
    - Main binder lock: lock, locked and unlock
    - Command and return buffer opcodes
    - Transaction: create and receive
    - Transaction buffer: create and free
    - Object and file descriptor transfer
    - binder_update_page_range

    Change-Id: Ib09ae78b0b8b75062325318e2307afd71b7c4458
    Signed-off-by: Arve Hjønnevåg <arve@android.com>
* Staging: android: binder: Add some missing binder_stat_br calls | Arve Hjønnevåg | 2015-10-25 | 1 | -0/+4

    Cached thread return errors, death notifications and new looper requests were not included in the stats.

    Change-Id: Iabe14b351b662d3f63009ecb3900f92fc3d72cc4
    Signed-off-by: Arve Hjønnevåg <arve@android.com>
* Staging: android: binder: Fix use-after-free bug | Arve Hjønnevåg | 2015-10-25 | 1 | -1/+4

    binder_update_page_range could read freed memory if the vma of the selected process was freed right before the check that the vma belongs to the mm struct it just locked.

    If the vm_mm pointer in that freed vma struct had also been rewritten with a value that matched the locked mm struct, then the code would proceed and possibly modify the freed vma.

    Signed-off-by: Arve Hjønnevåg <arve@android.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Staging:android: Change type for binder_debug_no_lock switch to boolZhengwang Ruan2015-10-251-1/+1
| | | | | | | | | GCC warns that module_param_named() indirectly returns a bool type value, which differs from the 'int' type that binder_debug_no_lock is declared with. Change it to bool because it is an internal switch for debugging. Signed-off-by: Zhengwang Ruan <ruan.zhengwang@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Staging: android: binder: Fix crashes when sharing a binder file between processesArve Hjønnevåg2015-10-251-1/+11
| Opening the binder driver and sharing the file returned with other processes (e.g. by calling fork) can crash the kernel. Prevent these crashes with the following changes:
| - Add a mutex to protect against two processes mmapping the same binder_proc.
| - After locking mmap_sem, check that the vma we want to access (still) points to the same mm_struct.
| - Use proc->tsk instead of current to get the files struct since this is where we get the rlimit from.
|
| Signed-off-by: Arve Hjønnevåg <arve@android.com>
| Cc: stable <stable@vger.kernel.org>
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Staging: android: binder: Don't call dump_stack in binder_vma_openArve Hjønnevåg2015-10-251-1/+0
| | | | | | | | | | | If user-space partially unmaps the driver, binder_vma_open would dump the kernel stack. This is not a kernel bug however and will be treated as if the whole area was unmapped once binder_vma_close gets called. Signed-off-by: Arve Hjønnevåg <arve@android.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Staging: android: fixed a space warning in binder.hMarco Navarra2015-10-251-1/+1
| | | | | | | This patch fixes a simple tab-space warning in binder.h found by the checkpatch tool. Signed-off-by: Marco Navarra <fromenglish@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* staging: binder: add vm_fault handlerVinayak Menon2015-10-251-0/+6
| An issue was observed when a userspace task exits. The page which hits error here is the zero page. In binder mmap, the whole of vma is not mapped. On a task crash, when debuggerd reads the binder regions, the unmapped areas fall to do_anonymous_page in handle_pte_fault, due to the absence of a vm_fault handler. This results in zero page being mapped. Later in zap_pte_range, vm_normal_page returns zero page in the case of VM_MIXEDMAP and it results in the error.
|
| BUG: Bad page map in process mediaserver pte:9dff379f pmd:9bfbd831
| page:c0ed8e60 count:1 mapcount:-1 mapping: (null) index:0x0
| page flags: 0x404(referenced|reserved)
| addr:40c3f000 vm_flags:10220051 anon_vma: (null) mapping:d9fe0764 index:fd
| vma->vm_ops->fault: (null)
| vma->vm_file->f_op->mmap: binder_mmap+0x0/0x274
| CPU: 0 PID: 1463 Comm: mediaserver Tainted: G W 3.10.17+ #1
| [<c001549c>] (unwind_backtrace+0x0/0x11c) from [<c001200c>] (show_stack+0x10/0x14)
| [<c001200c>] (show_stack+0x10/0x14) from [<c0103d78>] (print_bad_pte+0x158/0x190)
| [<c0103d78>] (print_bad_pte+0x158/0x190) from [<c01055f0>] (unmap_single_vma+0x2e4/0x598)
| [<c01055f0>] (unmap_single_vma+0x2e4/0x598) from [<c010618c>] (unmap_vmas+0x34/0x50)
| [<c010618c>] (unmap_vmas+0x34/0x50) from [<c010a9e4>] (exit_mmap+0xc8/0x1e8)
| [<c010a9e4>] (exit_mmap+0xc8/0x1e8) from [<c00520f0>] (mmput+0x54/0xd0)
| [<c00520f0>] (mmput+0x54/0xd0) from [<c005972c>] (do_exit+0x360/0x990)
| [<c005972c>] (do_exit+0x360/0x990) from [<c0059ef0>] (do_group_exit+0x84/0xc0)
| [<c0059ef0>] (do_group_exit+0x84/0xc0) from [<c0066de0>] (get_signal_to_deliver+0x4d4/0x548)
| [<c0066de0>] (get_signal_to_deliver+0x4d4/0x548) from [<c0011500>] (do_signal+0xa8/0x3b8)
|
| Add a vm_fault handler which returns VM_FAULT_SIGBUS, and prevents the wrong fallback to do_anonymous_page.
Change-Id: I43c227e489f74f4907f199caf99f571b61883064 Signed-off-by: Vinayak Menon <vinayakm.list@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
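To illustrate why a missing fault handler matters, here is a minimal userspace model in C. It is not kernel code: `toy_vma`, `toy_vm_ops`, `toy_handle_fault` and `toy_binder_fault` are hypothetical names invented for this sketch, and the two result values only stand in for "the zero page gets mapped" and the VM_FAULT_SIGBUS return that the commit message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy outcomes of a page fault; stand-ins, not kernel constants. */
enum toy_fault_result {
    TOY_MAPPED_ZERO_PAGE, /* models the do_anonymous_page fallback */
    TOY_FAULT_SIGBUS      /* models a handler returning VM_FAULT_SIGBUS */
};

struct toy_vma;
typedef enum toy_fault_result (*toy_fault_fn)(struct toy_vma *vma,
                                              unsigned long addr);

/* Hypothetical, heavily reduced analogue of a vm_ops table. */
struct toy_vm_ops {
    toy_fault_fn fault;
};

struct toy_vma {
    unsigned long start;
    unsigned long end;
    const struct toy_vm_ops *ops;
};

/* Models the dispatch: without a fault handler the access falls back
 * to the anonymous-memory path and a zero page ends up mapped. */
enum toy_fault_result toy_handle_fault(struct toy_vma *vma, unsigned long addr)
{
    assert(vma != NULL && addr >= vma->start && addr < vma->end);
    if (vma->ops == NULL || vma->ops->fault == NULL)
        return TOY_MAPPED_ZERO_PAGE;
    return vma->ops->fault(vma, addr);
}

/* Models the fix: refuse every fault on the unmapped part of the area. */
enum toy_fault_result toy_binder_fault(struct toy_vma *vma, unsigned long addr)
{
    (void)vma;
    (void)addr;
    return TOY_FAULT_SIGBUS;
}
```

With no handler installed, `toy_handle_fault` reports the zero-page fallback; once `toy_binder_fault` is installed in the ops table, the same access is refused instead.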
* staging: Remove the Android logger driverJohn Stultz2015-10-253-830/+0
| With the release of Lollipop, Android no longer requires the logger driver.
|
| There are three patches which the Android devs still need before they drop logger on all their devices:
| [PATCH v4 1/5] pstores: use scnprintf
| [PATCH v2 2/5] pstore: remove superfluous memory size check
| [PATCH 3/5] pstore: handle zero-sized prz in series
| [PATCH v4 4/5] pstore: add pmsg
| [PATCH 5/5] pstore: selinux: add security in-core xattr support for pstore and debugfs
|
| But these seem to have been acked and are hopefully queued for upstream. So this patch removes the logger driver from staging.
|
| Cc: Rom Lemarchand <romlem@google.com>
| Cc: Mark Salyzyn <salyzyn@google.com>
| Cc: Kees Cook <keescook@chromium.org>
| Cc: Android Kernel Team <kernel-team@android.com>
| Signed-off-by: John Stultz <john.stultz@linaro.org>
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| Bug: 13505761
| Change-Id: I21b6897f01871851e05b6eb53c7c08a1cb597e3d
|
| Conflicts:
|   drivers/staging/android/Kconfig
|   drivers/staging/android/logger.c
|   drivers/staging/android/logger.h
|
| Conflicts:
|   arch/arm/configs/cyanogenmod_tuna_defconfig
|   arch/arm/configs/tuna_defconfig
|   drivers/staging/android/logger.c
* staging: android: lowmemorykiller: fix build breakage on kernel 3.0Ziyann2015-10-251-1/+1
|
* lowmemorykiller: make default lowmemorykiller debug message usefulColin Cross2015-10-251-8/+20
| lowmemorykiller debug messages are inscrutable and mostly useful for debugging the lowmemorykiller, not explaining why a process was killed. Make the messages more useful by prefixing them with "lowmemorykiller: " and explaining in more readable terms what was killed, who it was killed for, and why it was killed.
|
| The messages now look like:
| [ 76.997631] lowmemorykiller: Killing 'droid.gallery3d' (2172), adj 1000,
| [ 76.997635] to free 27436kB on behalf of 'kswapd0' (29) because
| [ 76.997638] cache 122624kB is below limit 122880kB for oom_score_adj 1000
| [ 76.997641] Free memory is -53356kB above reserved
|
| A negative number for free memory above reserved means some of the reserved memory has been used and is being regenerated by kswapd, which is likely what called the shrinkers.
|
| Change-Id: I1fe983381e73e124b90aa5d91cb66e55eaca390f
| Signed-off-by: Colin Cross <ccross@android.com>
* staging: android: lowmemorykiller: Change default debug_level to 1Arve Hjønnevåg2015-10-251-1/+1
| | | | | | | | | | The select...to kill messages are not very useful when not debugging the lowmemorykiller itself. After the change to check TIF_MEMDIE instead of using a task notifier this message can also get very noisy. Change-Id: Ice171c25801d6faa454b885a23b24b002423b754 Signed-off-by: Arve Hjønnevåg <arve@android.com>
* staging: android: lowmemorykiller: Don't count reserved free memoryArve Hjønnevåg2015-10-251-1/+2
| | | | | | | | | The amount of reserved memory varies between devices. Subtract it here to reduce the amount of device-specific tuning needed for the minfree values. Change-Id: I466ae8b18f5972f6f6d8b5a7d8c4ae69660de53a Signed-off-by: Arve Hjønnevåg <arve@android.com>
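A rough sketch of the arithmetic, assuming all quantities are counted in pages and that the minfree table is consulted level by level. The helper names `free_above_reserved` and `pick_min_score_adj` and the table values used in the note below are hypothetical, chosen so the result lines up with the figures in the debug-message example quoted earlier in this log (4 kB pages assumed). The driver itself may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: free pages left once the reserve is excluded.
 * The result goes negative when part of the reserve is already in use. */
long free_above_reserved(long nr_free_pages, long totalreserve_pages)
{
    return nr_free_pages - totalreserve_pages;
}

/* Hypothetical helper: return the score threshold of the first level
 * whose minfree limit is above both the free and the cached page
 * counts, or no_kill when no level is under pressure. */
int pick_min_score_adj(long other_free, long other_file,
                       const long *minfree, const int *adj, size_t n,
                       int no_kill)
{
    size_t i;

    assert(minfree != NULL && adj != NULL);
    for (i = 0; i < n; i++) {
        if (other_free < minfree[i] && other_file < minfree[i])
            return adj[i];
    }
    return no_kill;
}
```

With an assumed table of minfree = {1536, 2048, 4096, 30720} pages and adj = {0, 58, 352, 1000}, a free count of -13339 pages (-53356 kB) and a cache of 30656 pages (122624 kB) only fall below the last limit of 30720 pages (122880 kB), so the selected threshold is 1000. Because the reserve is subtracted first, the same table can be reused on devices whose reserve sizes differ.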
* staging: android: lowmemorykiller: Add config option to support oom_adj valuesArve Hjønnevåg2015-10-252-0/+94
| | | | | | | | | | | | | The conversion to use oom_score_adj instead of the deprecated oom_adj values breaks existing user-space code. Add a config option to convert oom_adj values written to oom_score_adj values if they appear to be valid oom_adj values. Change-Id: I68308125059b802ee2991feefb07e9703bc48549 Signed-off-by: Arve Hjønnevåg <arve@android.com> Conflicts: drivers/staging/android/Kconfig
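The conversion can be pictured with a small userspace sketch. The constants are assumed to match the kernel's oom headers of this era (OOM_DISABLE = -17, OOM_ADJUST_MAX = 15, OOM_SCORE_ADJ_MAX = 1000). The function names and the "every value fits the legacy range" test are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>

/* Assumed to match the kernel's oom headers of this era. */
#define OOM_DISABLE       (-17)
#define OOM_ADJUST_MAX    15
#define OOM_SCORE_ADJ_MAX 1000

/* Hypothetical helper: scale a legacy oom_adj value (-17..15) onto
 * the oom_score_adj range (-1000..1000). */
int oom_adj_to_oom_score_adj(int oom_adj)
{
    assert(oom_adj >= OOM_DISABLE && oom_adj <= OOM_ADJUST_MAX);
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX; /* 15 maps exactly to 1000 */
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}

/* Hypothetical heuristic: treat the table as legacy oom_adj values
 * only if every entry fits the legacy range, then convert in place.
 * Returns 1 if the table was converted, 0 if it was left alone. */
int convert_if_oom_adj(int *adj, int n)
{
    int i;

    if (n <= 0)
        return 0;
    for (i = 0; i < n; i++) {
        if (adj[i] < OOM_DISABLE || adj[i] > OOM_ADJUST_MAX)
            return 0; /* looks like oom_score_adj already */
    }
    for (i = 0; i < n; i++)
        adj[i] = oom_adj_to_oom_score_adj(adj[i]);
    return 1;
}
```

Under these assumptions a legacy table of {0, 1, 6, 12} becomes {0, 58, 352, 705}, while a table that already contains a value above 15 is left untouched. A table whose oom_score_adj values all happen to fall within -17..15 is ambiguous, which is why the commit message speaks of values that "appear to be" valid.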
* android, lowmemorykiller: remove task handoff notifierDavid Rientjes2015-10-251-33/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The task handoff notifier leaks task_struct since it never gets freed after the callback returns NOTIFY_OK, which means it is responsible for doing so. It turns out the lowmemorykiller actually doesn't need this notifier at all. It's used to prevent unnecessary killing by waiting for a thread to exit as a result of lowmem_shrink(), however, it's possible to do this in the same way the kernel oom killer works by setting TIF_MEMDIE and avoid killing if we're still waiting for it to exit. The kernel oom killer will already automatically set TIF_MEMDIE for threads that are attempting to allocate memory that have a fatal signal. The thread selected by lowmem_shrink() will have such a signal after the lowmemorykiller sends it a SIGKILL, so this won't result in an unnecessary use of memory reserves for the thread to exit. This has the added benefit that we don't have to rely on CONFIG_PROFILING to prevent needlessly killing tasks. Reported-by: Werner Landgraf <w.landgraf@ru.ru> Cc: stable@vger.kernel.org Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Colin Cross <ccross@android.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Conflicts: drivers/staging/android/lowmemorykiller.c
* Staging: android: lowmemorykiller.cGreg Kroah-Hartman2015-10-251-1/+1
| | | | | | | Fix compiler warning about the type of the module parameter. Cc: San Mehat <san@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* staging: android, lowmemorykiller: convert to use oom_score_adjDavid Rientjes2015-10-251-21/+22
| | | | | | | | | | | | | /proc/pid/oom_adj is deprecated and will be removed in August 2012 according to Documentation/feature-removal-schedule.txt. Convert its usage in the lowmemorykiller to use the new interface, oom_score_adj, instead. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: drivers/staging/android/lowmemorykiller.c
* staging: android/lowmemorykiller: Do not kill kernel threadsAnton Vorontsov2015-10-251-0/+3
| | | | | | | | | | LMK should not try killing kernel threads. Suggested-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* staging: android/lowmemorykiller: No need for task->signal checkAnton Vorontsov2015-10-251-7/+1
| | | | | | | | | | task->signal == NULL is not possible, so no need for these checks. Suggested-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* staging: android/lowmemorykiller: Better mm handlingAnton Vorontsov2015-10-251-7/+9
| | | | | | | | | | | | | | LMK should not directly check for task->mm. The reason is that the process' threads may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To catch this we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with lock held). Suggested-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* staging: android/lowmemorykiller: Don't grab tasklist_lockAnton Vorontsov2015-10-251-3/+4
| | | | | | | | | | | | | | | | | | | | | | Grabbing tasklist_lock has its disadvantages, i.e. it blocks process creation and destruction. If there are lots of processes, blocking doesn't sound like a great idea. For LMK, it is sufficient to surround the task list traversal with rcu_read_{,un}lock(). From now on using force_sig() is not safe, as it can race with an already exiting task, so we use send_sig() now. As a downside, it won't kill PID namespace init processes, but that's not what we want anyway. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: drivers/staging/android/lowmemorykiller.c