| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The l2tp [get|set]sockopt() code has fallen back to the UDP functions
for socket option levels != SOL_PPPOL2TP since day one, but that has
never actually worked, since the l2tp socket isn't an inet socket.
As David Miller points out:
"If we wanted this to work, it'd have to look up the tunnel and then
use tunnel->sk, but I wonder how useful that would be"
Since this has never worked, nobody could possibly have depended on that
functionality, so just remove the broken code and return -EINVAL.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: James Chapman <jchapman@katalix.com>
Acked-by: David Miller <davem@davemloft.net>
Cc: Phil Turnbull <phil.turnbull@oracle.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change-Id: I63a8c6572971c62370442c7e74c53e66b43e4b1b
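A minimal sketch of the resulting guard in net/l2tp/l2tp_ppp.c (the surrounding
option handling is elided; this is not the verbatim patch):

    static int pppol2tp_setsockopt(struct socket *sock, int level, int optname,
                                   char __user *optval, unsigned int optlen)
    {
            if (level != SOL_PPPOL2TP)
                    return -EINVAL;  /* previously fell back to udp_prot.setsockopt() */

            /* ... handle the PPPOL2TP_SO_* options as before ... */
            return 0;
    }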
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The tty atomic_write_lock does not provide an exclusion guarantee for
the tty driver if the termios settings are LECHO & !OPOST. And since
it is unexpected and not allowed to call TTY buffer helpers like
tty_insert_flip_string concurrently, this may lead to crashes when
concurrent writers call pty_write. In that case the following two
writers:
* the ECHOing from a workqueue and
* pty_write from the process
race and can overflow the corresponding TTY buffer as follows.
If we look into tty_insert_flip_string_fixed_flag, there is:
int space = __tty_buffer_request_room(port, goal, flags);
struct tty_buffer *tb = port->buf.tail;
...
memcpy(char_buf_ptr(tb, tb->used), chars, space);
...
tb->used += space;
so the race of the two can result in something like this:
      A                                  B
__tty_buffer_request_room
                                  __tty_buffer_request_room
memcpy(buf(tb->used), ...)
tb->used += space;
                                  memcpy(buf(tb->used), ...) ->BOOM
B's memcpy is past the tty_buffer due to the previous A's tb->used
increment.
Since the N_TTY line discipline input processing can output
concurrently with a tty write, obtain the N_TTY ldisc output_lock to
serialize echo output with normal tty writes. This ensures the tty
buffer helper tty_insert_flip_string is not called concurrently and
everything is fine.
Note that this is nicely reproducible by an ordinary user using
forkpty and some setup around that (raw termios + ECHO). And it is
present in kernels at least after commit
d945cb9cce20ac7143c2de8d88b187f62db99bdc (pty: Rework the pty layer to
use the normal buffering logic) in 2.6.31-rc3.
js: add more info to the commit log
js: switch to bool
js: lock unconditionally
js: lock only the tty->ops->write call
Change-Id: I63e622f7303d46a784116b26f5937ae65036b23b
References: CVE-2014-0196
Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cherry-picked from git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
branch: master
commit: 4291086b1f081b869c6d79e5b7441633dc3ace00
Needed some tweaks because there is no
bddc715 2012-10-18 jslaby@suse.cz TTY: move ldisc data from tty_struct: locks
Signed-off-by: JP Abgrall <jpa@google.com>
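A sketch of the "lock only the tty->ops->write call" approach noted above;
because this tree predates the move of ldisc data out of tty_struct, the lock
is shown as tty->output_lock (the field placement here is an assumption for
this backport, not the upstream form):

    /* in n_tty_write(), on the non-OPOST path */
    while (nr > 0) {
            mutex_lock(&tty->output_lock);  /* serializes with echo output */
            c = tty->ops->write(tty, b, nr);
            mutex_unlock(&tty->output_lock);
            if (c < 0) {
                    retval = c;
                    goto break_out;
            }
            if (!c)
                    break;
            b += c;
            nr -= c;
    }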
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Plug a group_info refcount leak in ping_init.
group_info is only needed during initialization and
the code failed to release the reference on exit.
While here move grabbing the reference to a place
where it is actually needed.
Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Signed-off-by: Zhang Dongxing <dongxing.zhang@intel.com>
Signed-off-by: xiaoming wang <xiaoming.wang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
net/ipv4/ping.c
Change-Id: I51931e439cce7f19b4179f5828d7355bb3b1dcfa
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 13eb2ab2d33c57ebddc57437a7d341995fc9138c ]
When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more than
8 bits, and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
The preconditions for matching the table id in the rule delete code
don't take the "table id in netlink attribute" case into account,
so the frh_get_table helper function never gets to do its job when
matching against the current rule.
Use the helper function twice instead of peeking at the table value directly.
Originally reported at: http://bugs.debian.org/724783
Reported-by: Nicolas HICHER <nhicher@avencall.com>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, IPv6 router discovery always puts routes into
RT6_TABLE_MAIN. This causes problems for connection managers
that want to support multiple simultaneous network connections
and want control over which one is used by default (e.g., wifi
and wired).
To work around this, connection managers typically take the routes
they prefer and copy them to static routes with low metrics in
the main table. This puts the burden on the connection manager
to watch netlink to see if the routes have changed, delete the
routes when their lifetime expires, etc.
Instead, this patch adds a per-interface sysctl to have the
kernel put autoconf routes into different tables. This allows
each interface to have its own autoconf table, and choosing the
default interface (or using different interfaces at the same
time for different types of traffic) can be done using
appropriate ip rules.
The sysctl behaves as follows:
- = 0: default. Put routes into RT6_TABLE_MAIN as before.
- > 0: manual. Put routes into the specified table.
- < 0: automatic. Add the absolute value of the sysctl to the
device's ifindex, and use that table.
The automatic mode is most useful in conjunction with
net.ipv6.conf.default.accept_ra_rt_table. A connection manager
or distribution could set it to, say, -100 on boot, and
thereafter just use IP rules.
Change-Id: I093d39fb06ec413905dc0d0d5792c1bc5d5c73a9
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Conflicts:
net/ipv6/addrconf.c
net/ipv6/route.c
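The table-selection rule can be modelled by a small stand-alone C function
(illustrative only; the in-kernel helper added by the patch is not shown here):

    #include <stdio.h>

    /* RT_TABLE_MAIN is 254 in the kernel's routing headers. */
    static int autoconf_route_table(int sysctl_val, int ifindex)
    {
            if (sysctl_val == 0)
                    return 254;                  /* default: RT6_TABLE_MAIN */
            if (sysctl_val > 0)
                    return sysctl_val;           /* manual: fixed table id */
            return -sysctl_val + ifindex;        /* automatic: offset by ifindex */
    }

    int main(void)
    {
            /* e.g. accept_ra_rt_table = -100 on an interface with ifindex 3 */
            printf("table %d\n", autoconf_route_table(-100, 3));  /* prints 103 */
            return 0;
    }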
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a userspace visible knob to tell the VM to keep an extra amount
of memory free, by increasing the gap between each zone's min and
low watermarks.
This is useful for realtime applications that call system
calls and have a bound on the number of allocations that happen
in any short time period. In this application, extra_free_kbytes
would be left at an amount equal to or larger than the
maximum number of allocations that happen in any burst.
It may also be useful to reduce the memory use of virtual
machines (temporarily?), in a way that does not cause memory
fragmentation like ballooning does.
[ccross]
Revived for use on old kernels where no other solution exists.
The tunable will be removed on kernels that do better at avoiding
direct reclaim.
Change-Id: I765a42be8e964bfd3e2886d1ca85a29d60c3bb3e
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Colin Cross <ccross@android.com>
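Illustrative arithmetic only (per-zone scaling and page-size conversion are
ignored, and the real change lives in the watermark setup code): the tunable
raises the low watermark while leaving min untouched, widening the gap.

    #include <stdio.h>

    int main(void)
    {
            unsigned long min_free_kbytes   = 8192;   /* existing tunable */
            unsigned long extra_free_kbytes = 16384;  /* new tunable */

            unsigned long wmark_min = min_free_kbytes;
            /* stock kernels use roughly low = min + min/4; the gap now also
             * grows by extra_free_kbytes */
            unsigned long wmark_low = wmark_min + wmark_min / 4 + extra_free_kbytes;

            printf("min=%lukB low=%lukB\n", wmark_min, wmark_low);
            return 0;
    }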
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit c9f01245 ("oom: remove oom_disable_count") has removed the
oom_disable_count counter which has been used for early break out from
oom_badness so we could never select a task with oom_score_adj set to
OOM_SCORE_ADJ_MIN (oom disabled).
Now that the counter is gone we always go through the heuristics
calculation and always return a non-zero positive value. This means
that we can end up killing a task with OOM disabled because it is
indistinguishable from regular tasks using 1% of memory (or
CAP_SYS_ADMIN tasks using 3%), or from tasks with oom_score_adj set but OOM enabled.
Let's break out early if the task should have OOM disabled.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
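The early bail-out amounts to a short check near the top of oom_badness(),
sketched here (not the exact hunk):

    /* after find_lock_task_mm() has pinned a live thread as p */
    if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN) {
            task_unlock(p);
            return 0;       /* OOM-disabled tasks are never selected */
    }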
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes mm->oom_disable_count entirely since it's unnecessary and
currently buggy. The counter was intended to be per-process but it's
currently decremented in the exit path for each thread that exits, causing
it to underflow.
The count was originally intended to prevent oom killing threads that
share memory with threads that cannot be killed since it doesn't lead to
future memory freeing. The counter could be fixed to represent all
threads sharing the same mm, but it's better to remove the count since:
- it is possible that the OOM_DISABLE thread sharing memory with the
victim is waiting on that thread to exit and will actually cause
future memory freeing, and
- there is no guarantee that a thread is disabled from oom killing just
because another thread sharing its mm is oom disabled.
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After selecting a task to kill, the oom killer iterates all processes and
kills all other threads that share the same mm_struct in different thread
groups. It would not otherwise be helpful to kill a thread if its memory
would not be subsequently freed.
A kernel thread, however, may assume a user thread's mm by using
use_mm(). This is only temporary and should not result in sending a
SIGKILL to that kthread.
This patch ensures that only user threads and not kthreads are sent a
SIGKILL if they share the same mm_struct as the oom killed task.
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
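In the loop that kills other processes sharing the victim's mm, the added
check boils down to the following (simplified sketch; victim and victim_mm
are shorthand names here for the selected task and its mm):

    for_each_process(q) {
            if (q->mm != victim_mm || same_thread_group(q, victim))
                    continue;
            if (q->flags & PF_KTHREAD)      /* e.g. a kthread that called use_mm() */
                    continue;
            force_sig(SIGKILL, q);
    }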
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds ARM NEON assembly implementation of SHA-512 and SHA-384
algorithms.
tcrypt benchmark results on Cortex-A8, sha512-generic vs sha512-neon-asm:
    block-size  bytes/update  old-vs-new
            16            16       2.99x
            64            16       2.67x
            64            64       3.00x
           256            16       2.64x
           256            64       3.06x
           256           256       3.33x
          1024            16       2.53x
          1024           256       3.39x
          1024          1024       3.52x
          2048            16       2.50x
          2048           256       3.41x
          2048          1024       3.54x
          2048          2048       3.57x
          4096            16       2.49x
          4096           256       3.42x
          4096          1024       3.56x
          4096          4096       3.59x
          8192            16       2.48x
          8192           256       3.42x
          8192          1024       3.56x
          8192          4096       3.60x
          8192          8192       3.60x
Change-Id: Ibc318f8c9136507f57e2bb8d8f51b4714d8ed70b
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Iliyan Malchev <malchev@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds ARM NEON assembly implementation of SHA-1 algorithm.
tcrypt benchmark results on Cortex-A8, sha1-arm-asm vs sha1-neon-asm:
    block-size  bytes/update  old-vs-new
            16            16       1.04x
            64            16       1.02x
            64            64       1.05x
           256            16       1.03x
           256            64       1.04x
           256           256       1.30x
          1024            16       1.03x
          1024           256       1.36x
          1024          1024       1.52x
          2048            16       1.03x
          2048           256       1.39x
          2048          1024       1.55x
          2048          2048       1.59x
          4096            16       1.03x
          4096           256       1.40x
          4096          1024       1.57x
          4096          4096       1.62x
          8192            16       1.03x
          8192           256       1.40x
          8192          1024       1.58x
          8192          4096       1.63x
          8192          8192       1.63x
Change-Id: I6df3c0a9ba8d450d034cf78785b6ce80a72bef4a
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Iliyan Malchev <malchev@google.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Common SHA-1 structures are defined in <crypto/sha.h> for code sharing.
This patch changes SHA-1/ARM glue code to use these structures.
Change-Id: I5b82530706fa7c6f5ec08926992b86d26fa1c24d
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the same alignment bug as in arm64 - we need to pass the number of
residual unprocessed bytes as the last argument to blkcipher_walk_done.
Change-Id: Ia4d3cacb006269aa5b9c0c542256eff5822e84ac
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 3.13+
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
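The fix is essentially the final argument of blkcipher_walk_done() in the
CBC/CTR/XTS helpers (sketch):

    while (walk.nbytes) {
            /* process the whole blocks provided by this walk step ... */
            err = blkcipher_walk_done(desc, &walk,
                                      walk.nbytes % AES_BLOCK_SIZE);  /* was 0 */
    }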
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Building a multi-arch kernel results in:
arch/arm/crypto/built-in.o: In function `aesbs_xts_decrypt':
sha1_glue.c:(.text+0x15c8): undefined reference to `bsaes_xts_decrypt'
arch/arm/crypto/built-in.o: In function `aesbs_xts_encrypt':
sha1_glue.c:(.text+0x1664): undefined reference to `bsaes_xts_encrypt'
arch/arm/crypto/built-in.o: In function `aesbs_ctr_encrypt':
sha1_glue.c:(.text+0x184c): undefined reference to `bsaes_ctr32_encrypt_blocks'
arch/arm/crypto/built-in.o: In function `aesbs_cbc_decrypt':
sha1_glue.c:(.text+0x19b4): undefined reference to `bsaes_cbc_encrypt'
This code is already runtime-conditional on NEON being supported, so
there's no point compiling it out depending on the minimum build
architecture.
Change-Id: Iff65acec7d30c508bf72132acad67332ea56bd3b
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
|
|
|
|
|
| |
This prevents the file from being incorrectly added to git.
Change-Id: If8d1d669d8565b1f1cf3751b202bae052d26b53b
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption
and around 25% for decryption. This implementation of the AES algorithm
does not rely on any lookup tables so it is believed to be invulnerable
to cache timing attacks.
This algorithm processes up to 8 blocks in parallel in constant time. This
means that it is not usable by chaining modes that are strictly sequential
in nature, such as CBC encryption. CBC decryption, however, can benefit from
this implementation and runs about 25% faster. The other chaining modes
implemented in this module, XTS and CTR, can execute fully in parallel in
both directions.
The core code has been adopted from the OpenSSL project (in collaboration
with the original author, on cc). For ease of maintenance, this version is
identical to the upstream OpenSSL code, i.e., all modifications that were
required to make it suitable for inclusion into the kernel have been made
upstream. The original can be found here:
http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130
Note to integrators:
While this implementation is significantly faster than the existing table
based ones (generic or ARM asm), especially in CTR mode, the effects on
power efficiency are unclear as of yet. This code does fundamentally more
work, calculating values that the table-based code obtains by a simple
lookup; only by doing all of that work in a SIMD fashion does it manage to
perform better.
Change-Id: Ife4f79ce9e8994e248d6fc01fcb23b0534265418
Cc: Andy Polyakov <appro@openssl.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
|
|
|
|
|
|
|
|
|
| |
Put the struct definitions for AES keys and the asm function prototypes in a
separate header and export the asm functions from the module.
This allows other drivers to use them directly.
Change-Id: Ic79a7da83232d4e3658f3fc64de4761c88ae73f3
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 40190c85f427dcfdbab5dbef4ffd2510d649da1f upstream.
Patch 638591c enabled building the AES assembler code in Thumb2 mode.
However, this code used arithmetic involving PC rather than adr{l}
instructions to generate PC-relative references to the lookup tables,
and this needs to take into account the different PC offset when
running in Thumb mode.
Change-Id: I7358a145be3f37420c8ce5b8fc83a761b0d863ac
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make the SHA1 asm code ABI conformant by making sure all stack
accesses occur above the stack pointer.
Origin:
http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=1a9d60d2
Change-Id: I89fe44b5021ee3d37ac924f04a82e9631e31843e
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes aes-armv4.S and sha1-armv4-large.S to work
natively in Thumb. This allows ARM/Thumb interworking workarounds
to be removed.
I also take the opportunity to convert some explicit assembler
directives for exported functions to the standard
ENTRY()/ENDPROC().
For the code itself:
* In sha1_block_data_order, use of TEQ with sp is deprecated in
ARMv7 and not supported in Thumb. For the branches back to
.L_00_15 and .L_40_59, the TEQ is converted to a CMP, under the
assumption that clobbering the C flag here will not cause
incorrect behaviour.
For the first branch back to .L_20_39_or_60_79 the C flag is
important, so sp is moved temporarily into another register so
that TEQ can be used for the comparison.
* In the AES code, most forms of register-indexed addressing with
shifts and rotates are not permitted for loads and stores in
Thumb, so the address calculation is done using a separate
instruction for the Thumb case.
The resulting code is unlikely to be optimally scheduled, but it
should not have a large impact given the overall size of the code.
I haven't run any benchmarks.
Change-Id: Ic80ff883d90ee1f83b775e0bb447672d81dff54b
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Tested-by: David McCullough <ucdevel@gmail.com> (ARM only)
Acked-by: David McCullough <ucdevel@gmail.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add assembler versions of AES and SHA1 for ARM platforms. This has provided
up to a 50% improvement in IPsec/TCP throughput for tunnels using AES128/SHA1.
Platform   CPU Speed   Endian   Before (bps)   After (bps)   Improvement
IXP425     533 MHz     big      11217042       15566294      ~38%
KS8695     166 MHz     little   3828549        5795373       ~51%
Change-Id: I5b77e7aa89c8b1d54aef75065827325e90305638
Signed-off-by: David McCullough <ucdevel@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The use_optimistic sysctl makes optimistic IPv6 addresses
equivalent to preferred addresses for source address selection
(e.g., when calling connect()), but it does not allow an
application to bind to optimistic addresses. This behaviour is
inconsistent - for example, it doesn't make sense for bind() to
an optimistic address to fail with EADDRNOTAVAIL while connect()
on the same socket chooses that address as its outgoing address.
Bug: 17769720
Bug: 18609055
Change-Id: I9de0d6c92ac45e29d28e318ac626c71806666f13
Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These 2 synchronize_rcu()s make attaching a task to a cgroup
quite slow, and it can't be ignored in some situations.
A real case from Colin Cross: Android uses cgroups heavily to
manage thread priorities, putting threads in a background group
with reduced cpu.shares when they are not visible to the user,
and in a foreground group when they are. Some RPCs from foreground
threads to background threads will temporarily move the background
thread into the foreground group for the duration of the RPC.
This results in many calls to cgroup_attach_task.
In cgroup_attach_task() it's task->cgroups that is protected by RCU,
and put_css_set() calls kfree_rcu() to free it.
If we remove this synchronize_rcu(), there can be threads in RCU-read
sections accessing their old cgroup via current->cgroups with
concurrent rmdir operation, but this is safe.
# time for ((i=0; i<50; i++)) { echo $$ > /mnt/sub/tasks; echo $$ > /mnt/tasks; }
real 0m2.524s
user 0m0.008s
sys 0m0.004s
With this patch:
real 0m0.004s
user 0m0.004s
sys 0m0.000s
tj: These synchronize_rcu()s are utterly confused. synchronize_rcu()
necessarily has to come between two operations to guarantee that
the changes made by the former operation are visible to all rcu
readers before proceeding to the latter operation. Here,
synchronize_rcu() is at the end of the attach operations with nothing
beyond it. Its only effect would be delaying completion of
write(2) to sysfs tasks/procs files until all rcu readers see the
change, which doesn't mean anything.
cherry-picked from:
https://android.googlesource.com/kernel/common/+/5d65bc0ca1bceb73204dab943922ba3c83276a8c
Bug: 17709419
Change-Id: I98dacd6c13da27cb3496fe4a24a24084e46bdd9c
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Colin Cross <ccross@google.com>
Signed-off-by: Devin Kim <dojip.kim@lge.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A userspace call to mmap(MAP_LOCKED) may result in the successful locking
of memory while also producing a confusing audit log denial. can_do_mlock
checks capable and rlimit. If either of these checks passes,
can_do_mlock returns true. The capable check leads to an LSM hook used by
AppArmor and SELinux, which produces the audit denial. Reordering so that
rlimit is checked first eliminates the denial on success, only recording a
denial when the lock actually fails for lack of the capability.
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Paul Cassella <cassella@cray.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
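After the reordering, can_do_mlock() in mm/mlock.c reads roughly:

    bool can_do_mlock(void)
    {
            if (rlimit(RLIMIT_MEMLOCK) != 0)   /* checked first: no LSM hook, no audit */
                    return true;
            if (capable(CAP_IPC_LOCK))         /* may log a denial when the capability is absent */
                    return true;
            return false;
    }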
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While running stress tests on adding and deleting ftrace instances I hit
this bug:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: selinux_inode_permission+0x85/0x160
PGD 63681067 PUD 7ddbe067 PMD 0
Oops: 0000 [#1] PREEMPT
CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
RIP: 0010:[<ffffffff812d8bc5>] [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
Call Trace:
security_inode_permission+0x1c/0x30
__inode_permission+0x41/0xa0
inode_permission+0x18/0x50
link_path_walk+0x66/0x920
path_openat+0xa6/0x6c0
do_filp_open+0x43/0xa0
do_sys_open+0x146/0x240
SyS_open+0x1e/0x20
system_call_fastpath+0x16/0x1b
Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
RIP selinux_inode_permission+0x85/0x160
CR2: 0000000000000020
Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.
in selinux_inode_permission():
isec = inode->i_security;
rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
Note, the crash came from stressing the deletion and reading of debugfs
files. I was not able to recreate this via normal files. But I'm not
sure they are safe. It may just be that the race window is much harder
to hit.
What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().
The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct. Now if the freeing of
the i_security happens with a call_rcu(), and the i_security field of
the inode is not changed (it gets freed as the inode gets freed) then
there will be no issue here. (Linus Torvalds suggested not setting the
field to NULL such that we do not need to check if it is NULL in the
permission check).
Note, this is a hack, but it fixes the problem at hand. A real fix is
to restructure the destroy_inode() to call all the destructor handlers
from the RCU callback. But that is a major job to do, and requires a
lot of work. For now, we just band-aid this bug with this fix (it
works), and work on a more maintainable solution in the future.
Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home
Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
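A sketch of the band-aid described above: free the inode_security_struct
through an RCU callback so a path walk still inside rcu_read_lock() never
dereferences freed memory (the rcu_head member is added for this purpose;
list maintenance is elided):

    static void inode_free_rcu(struct rcu_head *head)
    {
            struct inode_security_struct *isec =
                    container_of(head, struct inode_security_struct, rcu);

            kmem_cache_free(sel_inode_cache, isec);
    }

    static void inode_free_security(struct inode *inode)
    {
            struct inode_security_struct *isec = inode->i_security;

            /* unlink from the superblock list as before, then defer the free;
             * inode->i_security is intentionally left pointing at the object
             * until the RCU grace period ends. */
            call_rcu(&isec->rcu, inode_free_rcu);
    }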
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not apply permission checks to private files.
Fix security_binder_transfer_binder hook.
Drop the owning task argument to security_binder_transfer_binder
since ref->node->proc can be NULL (dead owner?).
Revise the SELinux checking to apply a single transfer check between
the source and destination tasks. Owning task is no longer relevant.
Drop the receive permission definition as it is no longer used.
This makes the transfer permission similar to the call permission; it is only
useful if you want to allow a binder IPC between two tasks (call permission)
but deny passing of binder references between them (transfer permission).
Change-Id: I51e7a9a6662e826073b35e4f70a57f9ec73e472e
Signed-off-by: William Roberts <w.roberts@sta.samsung.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Surfaceflinger uses binder heavily to receive/send frames from applications
while compositing the screen. Change the binder mutex to an rt mutex to minimize
instances where high priority surfaceflinger binder work is blocked by lower
priority binder ipc.
Change-Id: If7429040641d6e463f20301ec14f02ecf6b0da36
Signed-off-by: Riley Andrews <riandrews@google.com>
Conflicts:
drivers/android/binder.c
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 585650dcec88e704a19bb226a34b6a7166111623 upstream.
The default kernel mapping for the pages allocated for the binder
buffers is never used. Set the __GFP_HIGHMEM flag when allocating
these pages so we don't needlessly use low memory pages that may
be required elsewhere.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
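The change is effectively a one-liner in binder_update_page_range() (sketch):

    /* allow the buffer pages to come from highmem; the kernel mapping of
     * these pages is never used */
    *page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);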
|
|
|
|
|
|
|
|
|
|
|
| |
commit 675d66b0ed5fd170d6a44cf8dbb3fa56a5347bdb upstream.
If a thread or process exited while a reply, one-way transaction or
death notification was pending, the struct holding the pending work
was leaked.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add tracepoints:
- ioctl entry and exit
- Main binder lock: lock, locked and unlock
- Command and return buffer opcodes
- Transaction: create and receive
- Transaction buffer: create and free
- Object and file descriptor transfer
- binder_update_page_range
Change-Id: Ib09ae78b0b8b75062325318e2307afd71b7c4458
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
|
|
|
|
|
|
| |
Cached thread return errors, death notifications and new looper
requests were not included in the stats.
Change-Id: Iabe14b351b662d3f63009ecb3900f92fc3d72cc4
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
binder_update_page_range could read freed memory if the vma of the
selected process was freed right before the check that the vma
belongs to the mm struct it just locked.
If the vm_mm pointer in that freed vma struct had also been rewritten
with a value that matched the locked mm struct, then the code would
proceed and possibly modify the freed vma.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
| |
GCC warns that module_param_named() indirectly returns a bool type value,
which differs from the 'int' type binder_debug_no_lock is declared with. Change
it to bool because it is an internal switch for debugging.
Signed-off-by: Zhengwang Ruan <ruan.zhengwang@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Opening the binder driver and sharing the file returned with
other processes (e.g. by calling fork) can crash the kernel.
Prevent these crashes with the following changes:
- Add a mutex to protect against two processes mmapping the
same binder_proc.
- After locking mmap_sem, check that the vma we want to access
(still) points to the same mm_struct.
- Use proc->tsk instead of current to get the files struct since
this is where we get the rlimit from.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
| |
If user-space partially unmaps the driver, binder_vma_open
would dump the kernel stack. This is not a kernel bug however
and will be treated as if the whole area was unmapped once
binder_vma_close gets called.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
| |
This patch fixes a simple tab-space warning in binder.h found by the checkpatch tool.
Signed-off-by: Marco Navarra <fromenglish@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An issue was observed when a userspace task exits.
The page which hits the error here is the zero page.
In binder mmap, the whole of the vma is not mapped.
On a task crash, when debuggerd reads the binder regions,
the unmapped areas fall through to do_anonymous_page in handle_pte_fault,
due to the absence of a vm_fault handler. This results in the
zero page being mapped. Later, in zap_pte_range, vm_normal_page
returns the zero page in the VM_MIXEDMAP case, and that results in the
error below.
BUG: Bad page map in process mediaserver pte:9dff379f pmd:9bfbd831
page:c0ed8e60 count:1 mapcount:-1 mapping: (null) index:0x0
page flags: 0x404(referenced|reserved)
addr:40c3f000 vm_flags:10220051 anon_vma: (null) mapping:d9fe0764 index:fd
vma->vm_ops->fault: (null)
vma->vm_file->f_op->mmap: binder_mmap+0x0/0x274
CPU: 0 PID: 1463 Comm: mediaserver Tainted: G W 3.10.17+ #1
[<c001549c>] (unwind_backtrace+0x0/0x11c) from [<c001200c>] (show_stack+0x10/0x14)
[<c001200c>] (show_stack+0x10/0x14) from [<c0103d78>] (print_bad_pte+0x158/0x190)
[<c0103d78>] (print_bad_pte+0x158/0x190) from [<c01055f0>] (unmap_single_vma+0x2e4/0x598)
[<c01055f0>] (unmap_single_vma+0x2e4/0x598) from [<c010618c>] (unmap_vmas+0x34/0x50)
[<c010618c>] (unmap_vmas+0x34/0x50) from [<c010a9e4>] (exit_mmap+0xc8/0x1e8)
[<c010a9e4>] (exit_mmap+0xc8/0x1e8) from [<c00520f0>] (mmput+0x54/0xd0)
[<c00520f0>] (mmput+0x54/0xd0) from [<c005972c>] (do_exit+0x360/0x990)
[<c005972c>] (do_exit+0x360/0x990) from [<c0059ef0>] (do_group_exit+0x84/0xc0)
[<c0059ef0>] (do_group_exit+0x84/0xc0) from [<c0066de0>] (get_signal_to_deliver+0x4d4/0x548)
[<c0066de0>] (get_signal_to_deliver+0x4d4/0x548) from [<c0011500>] (do_signal+0xa8/0x3b8)
Add a vm_fault handler which returns VM_FAULT_SIGBUS, and prevents the
wrong fallback to do_anonymous_page.
Change-Id: I43c227e489f74f4907f199caf99f571b61883064
Signed-off-by: Vinayak Menon <vinayakm.list@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
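The handler itself is trivial; a sketch of the shape of the fix:

    static int binder_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
    {
            /* unmapped parts of the binder area must not be backed by the
             * zero page via do_anonymous_page */
            return VM_FAULT_SIGBUS;
    }

    static struct vm_operations_struct binder_vm_ops = {
            .open  = binder_vma_open,
            .close = binder_vma_close,
            .fault = binder_vm_fault,
    };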
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the release of Lollipop, Android no longer
requires the logger driver.
There are three patches which the android devs
still need before they drop logger on all their
devices:
[PATCH v4 1/5] pstores: use scnprintf
[PATCH v2 2/5] pstore: remove superfluous memory size check
[PATCH 3/5] pstore: handle zero-sized prz in series
[PATCH v4 4/5] pstore: add pmsg
[PATCH 5/5] pstore: selinux: add security in-core xattr support for pstore and debugfs
But these seem to have been acked and are hopefully
queued for upstream.
So this patch removes the logger driver from staging.
Cc: Rom Lemarchand <romlem@google.com>,
Cc: Mark Salyzyn <salyzyn@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 13505761
Change-Id: I21b6897f01871851e05b6eb53c7c08a1cb597e3d
Conflicts:
drivers/staging/android/Kconfig
drivers/staging/android/logger.c
drivers/staging/android/logger.h
Conflicts:
arch/arm/configs/cyanogenmod_tuna_defconfig
arch/arm/configs/tuna_defconfig
drivers/staging/android/logger.c
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lowmemorykiller debug messages are inscrutable and mostly useful
for debugging the lowmemorykiller, not explaining why a process
was killed. Make the messages more useful by prefixing them
with "lowmemorykiller: " and explaining in more readable terms
what was killed, who it was killed for, and why it was killed.
The messages now look like:
[ 76.997631] lowmemorykiller: Killing 'droid.gallery3d' (2172), adj 1000,
[ 76.997635] to free 27436kB on behalf of 'kswapd0' (29) because
[ 76.997638] cache 122624kB is below limit 122880kB for oom_score_adj 1000
[ 76.997641] Free memory is -53356kB above reserved
A negative number for free memory above reserved means some of the
reserved memory has been used and is being regenerated by kswapd,
which is likely what called the shrinkers.
Change-Id: I1fe983381e73e124b90aa5d91cb66e55eaca390f
Signed-off-by: Colin Cross <ccross@android.com>
|
|
|
|
|
|
|
|
|
|
| |
The select...to kill messages are not very useful when not debugging
the lowmemorykiller itself. After the change to check TIF_MEMDIE
instead of using a task notifier, this message can also get very
noisy.
Change-Id: Ice171c25801d6faa454b885a23b24b002423b754
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
|
|
|
|
|
|
|
| |
The amount of reserved memory varies between devices. Subtract it
here to reduce the amount of device-specific tuning needed for the
minfree values.
Change-Id: I466ae8b18f5972f6f6d8b5a7d8c4ae69660de53a
Signed-off-by: Arve Hjønnevåg <arve@android.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The conversion to use oom_score_adj instead of the deprecated oom_adj
values breaks existing user-space code. Add a config option to convert
oom_adj values written to oom_score_adj values if they appear to be
valid oom_adj values.
Change-Id: I68308125059b802ee2991feefb07e9703bc48549
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Conflicts:
drivers/staging/android/Kconfig
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The task handoff notifier leaks task_struct since it never gets freed
after the callback returns NOTIFY_OK, which means it is responsible for
doing so.
It turns out the lowmemorykiller actually doesn't need this notifier at
all. It's used to prevent unnecessary killing by waiting for a thread
to exit as a result of lowmem_shrink(), however, it's possible to do
this in the same way the kernel oom killer works by setting TIF_MEMDIE
and avoid killing if we're still waiting for it to exit.
The kernel oom killer will already automatically set TIF_MEMDIE for
threads that are attempting to allocate memory that have a fatal signal.
The thread selected by lowmem_shrink() will have such a signal after the
lowmemorykiller sends it a SIGKILL, so this won't result in an
unnecessary use of memory reserves for the thread to exit.
This has the added benefit that we don't have to rely on
CONFIG_PROFILING to prevent needlessly killing tasks.
Reported-by: Werner Landgraf <w.landgraf@ru.ru>
Cc: stable@vger.kernel.org
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
drivers/staging/android/lowmemorykiller.c
|
|
|
|
|
|
|
| |
Fix compiler warning about the type of the module parameter.
Cc: San Mehat <san@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
/proc/pid/oom_adj is deprecated and will be removed in August 2012
according to Documentation/feature-removal-schedule.txt. Convert its
usage in the lowmemorykiller to use the new interface, oom_score_adj,
instead.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conflicts:
drivers/staging/android/lowmemorykiller.c
|
|
|
|
|
|
|
|
|
|
| |
LMK should not try killing kernel threads.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
| |
task->signal == NULL is not possible, so no need for these checks.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LMK should not directly check for task->mm. The reason is that the
process' threads may exit or detach its mm via use_mm(), but other
threads may still have a valid mm. To catch this we use
find_lock_task_mm(), which walks up all threads and returns an
appropriate task (with lock held).
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
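In lowmem_shrink() the per-task handling becomes roughly (sketch):

    for_each_process(tsk) {
            struct task_struct *p;
            int oom_score_adj;

            p = find_lock_task_mm(tsk);  /* a thread of tsk that still has an mm, task-locked */
            if (!p)
                    continue;

            oom_score_adj = p->signal->oom_score_adj;
            task_unlock(p);
            /* ... selection logic continues using p rather than tsk->mm ... */
    }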
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Grabbing tasklist_lock has its disadvantages, i.e. it blocks
process creation and destruction. If there are lots of processes,
blocking doesn't sound like a great idea.
For LMK, it is sufficient to surround the task-list traversal with
rcu_read_{,un}lock().
From now on using force_sig() is not safe, as it can race with an
already exiting task, so we use send_sig() now. As a downside, it
won't kill PID namespace init processes, but that's not what we
want anyway.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conflicts:
drivers/staging/android/lowmemorykiller.c
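Shape of the resulting locking and kill, as a sketch:

    rcu_read_lock();                        /* was: read_lock(&tasklist_lock) */
    for_each_process(tsk) {
            /* ... pick 'selected' exactly as before ... */
    }
    if (selected)
            send_sig(SIGKILL, selected, 0); /* was: force_sig(); see note above */
    rcu_read_unlock();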
|