kernel_samsung_espresso10.git - kernel/samsung/espresso10

	Commit message (Collapse)	Author	Age	Files	Lines
*	tcp: make challenge acks less predictable	Eric Dumazet	2017-06-07	1	-5/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 75ff39ccc1bd5d3c455b6822ab09e533c551f758 upstream. Yue Cao claims that current host rate limiting of challenge ACKS (RFC 5961) could leak enough information to allow a patient attacker to hijack TCP sessions. He will soon provide details in an academic paper. This patch increases the default limit from 100 to 1000, and adds some randomization so that the attacker can no longer hijack sessions without spending a considerable amount of probes. Based on initial analysis and patch from Linus. Note that we also have per socket rate limiting, so it is tempting to remove the host limit in the future. v2: randomize the count of challenge acks per second, not the period. Change-Id: I89b43dd092449c8b7cac12d6d0d38a9b91bada77 Fixes: 282f23c6ee34 ("tcp: implement RFC 5961 3.2") Reported-by: Yue Cao <ycao009@ucr.edu> Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: - Adjust context - Use ACCESS_ONCE() instead of {READ,WRITE}_ONCE() - Open-code prandom_u32_max()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
*	tcp: Change possible SYN flooding messages	Eric Dumazet	2017-06-07	3	-49/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"Possible SYN flooding on port xxxx " messages can fill logs on servers. Change logic to log the message only once per listener, and add two new SNMP counters to track : TCPReqQFullDoCookies : number of times a SYNCOOKIE was replied to client TCPReqQFullDrop : number of times a SYN request was dropped because syncookies were not enabled. Based on a prior patch from Tom Herbert, and suggestions from David. Change-Id: I18f2f1593b13d1273ba4c67c92367b0221cab405 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ping: implement proper locking	Eric Dumazet	2017-06-07	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 43a6684519ab0a6c52024b5e25322476cabad893 upstream. We got a report of yet another bug in ping http://www.openwall.com/lists/oss-security/2017/03/24/6 ->disconnect() is not called with socket lock held. Fix this by acquiring ping rwlock earlier. Thanks to Daniel, Alexander and Andrey for letting us know this problem. Change-Id: I7de7df3a5ab5b5f7a41635799522bbf9a5395ad0 Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Daniel Jiang <danieljiang0415@gmail.com> Reported-by: Solar Designer <solar@openwall.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	net: avoid signed overflows for SO_{SND\|RCV}BUFFORCE	Eric Dumazet	2017-03-17	1	-8/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CAP_NET_ADMIN users should not be allowed to set negative sk_sndbuf or sk_rcvbuf values, as it can lead to various memory corruptions, crashes, OOM... Note that before commit 82981930125a ("net: cleanups in sock_setsockopt()"), the bug was even more serious, since SO_SNDBUF and SO_RCVBUF were vulnerable. This needs to be backported to all known linux kernels. Again, many thanks to syzkaller team for discovering this gem. Change-Id: I93f4b9b1e6d93747a096ab26b73d24c7911b21b4 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Revert "udp: remove redundant variable"	David S. Miller	2017-03-17	2	-14/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 81d54ec8479a2c695760da81f05b5a9fb2dbe40a. If we take the "try_again" goto, due to a checksum error, the 'len' has already been truncated. So we won't compute the same values as the original code did. Change-Id: I0503e45682377965571c4544385811765ef2e2bb Reported-by: paul bilke <fsmail@conspiracy.net> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 7c3b1de9c0ba32bd33ac15c62e8b8a0548641c6b)
*	net: add length argument to skb_copy_and_csum_datagram_iovec	Sabrina Dubroca	2017-03-17	6	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Without this length argument, we can read past the end of the iovec in memcpy_toiovec because we have no way of knowing the total length of the iovec's buffers. This is needed for stable kernels where 89c22d8c3b27 ("net: Fix skb csum races when peeking") has been backported but that don't have the ioviter conversion, which is almost all the stable trees <= 3.18. This also fixes a kernel crash for NFS servers when the client uses -onfsvers=3,proto=udp to mount the export. Change-Id: I1865e3d7a1faee42a5008a9ad58c4d3323ea4bab Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> (cherry picked from commit c91234366e4cfd4f70c73e7d79ede92a6e462a88)
*	mac80211: fix fragmentation code, particularly for encryption	Johannes Berg	2017-03-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "new" fragmentation code (since my rewrite almost 5 years ago) erroneously sets skb->len rather than using skb_trim() to adjust the length of the first fragment after copying out all the others. This leaves the skb tail pointer pointing to after where the data originally ended, and thus causes the encryption MIC to be written at that point, rather than where it belongs: immediately after the data. The impact of this is that if software encryption is done, then a) encryption doesn't work for the first fragment, the connection becomes unusable as the first fragment will never be properly verified at the receiver, the MIC is practically guaranteed to be wrong b) we leak up to 8 bytes of plaintext (!) of the packet out into the air This is only mitigated by the fact that many devices are capable of doing encryption in hardware, in which case this can't happen as the tail pointer is irrelevant in that case. Additionally, fragmentation is not used very frequently and would normally have to be configured manually. Fix this by using skb_trim() properly. Change-Id: I8d800e31b926a9e8b1cb9a3b6d15ebe1417a6a99 Cc: stable@vger.kernel.org Fixes: 2de8e0d999b8 ("mac80211: rewrite fragmentation") Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
*	packet: fix race condition in packet_set_ring	Philip Pettersson	2017-03-17	1	-8/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When packet_set_ring creates a ring buffer it will initialize a struct timer_list if the packet version is TPACKET_V3. This value can then be raced by a different thread calling setsockopt to set the version to TPACKET_V1 before packet_set_ring has finished. This leads to a use-after-free on a function pointer in the struct timer_list when the socket is closed as the previously initialized timer will not be deleted. The bug is fixed by taking lock_sock(sk) in packet_setsockopt when changing the packet version while also taking the lock at the start of packet_set_ring. Change-Id: Iec8b20f499134e1edd0f9214aa4dde477d1674e1 Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.") Signed-off-by: Philip Pettersson <philip.pettersson@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	UPSTREAM: net: Fix use after free in the recvmmsg exit path	Arnaldo Carvalho de Melo	2016-10-13	1	-19/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(cherry picked from commit 34b88a68f26a75e4fded796f1a49c40f82234b7d) The syzkaller fuzzer hit the following use-after-free: Call Trace: [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295 [<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261 [< inline >] SYSC_recvmmsg net/socket.c:2281 [<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270 [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185 And, as Dmitry rightly assessed, that is because we can drop the reference and then touch it when the underlying recvmsg calls return some packets and then hit an error, which will make recvmmsg to set sock->sk->sk_err, oops, fix it. Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Sasha Levin <sasha.levin@oracle.com> Fixes: a2e2725541fa ("net: Introduce recvmmsg socket syscall") http://lkml.kernel.org/r/20160122211644.GC2470@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Change-Id: I2adb0faf595b7b634d9b739dfdd1a47109e20ecb Bug: 30515201
*	fix infoleak in rtnetlink	kangjie	2016-10-13	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \|	the stack object “map” has a total size of 32 bytes. Its last 4 bytes are padding generated by compiler. These padding bytes are not initialized and sent out via “nla_put” Bug: 28620102 Change-Id: I13da380c6fe8abca49e3cf9f05293c02b44d2e5e Signed-off-by: kangjie <kangjielu@gmail.com>
*	Replace %p with %pK to prevent leaking kernel address	Mohamad Ayyash	2016-10-13	1	-1/+1
\| \| \| \| \| \|	BUG: 27532522 Change-Id: Ic0710a9a8cfc682acd88ecf3bbfeece2d798c4a4 Signed-off-by: Mohamad Ayyash <mkayyash@google.com>
*	UPSTREAM: netfilter: x_tables: validate e->target_offset early	Florian Westphal	2016-10-13	3	-27/+24
\| \| \| \| \| \| \| \| \| \| \| \| \|	(cherry pick from commit bdf533de6968e9686df777dc178486f600c6e617) We should check that e->target_offset is sane before mark_source_chains gets called since it will fetch the target entry for loop detection. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: Ic2dbc31c9525d698e94d4d8875886acf3524abbd Bug: 29637687
*	netfilter: x_tables: make sure e->next_offset covers remaining blob size	Florian Westphal	2016-10-12	3	-6/+12
\| \| \| \| \| \| \| \|	Otherwise this function may read data beyond the ruleset blob. Change-Id: I22f514057d3e0403d1af61f4d9555403ab9f72ea Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom	Al Viro	2016-10-12	1	-0/+4
\| \| \| \| \| \| \|	Change-Id: Ida19e5102b7faca17c685a261c20fbbf5c9614f9 Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netfilter: x_tables: check for size overflow	Florian Westphal	2016-10-12	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Ben Hawkes says: integer overflow in xt_alloc_table_info, which on 32-bit systems can lead to small structure allocation and a copy_from_user based heap corruption. Change-Id: I13c554c630651a37e3f6a195e9a5f40cddcb29a1 Reported-by: Ben Hawkes <hawkes@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	netfilter: x_tables: fix unconditional helper	Florian Westphal	2016-10-12	3	-33/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ben Hawkes says: In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it is possible for a user-supplied ipt_entry structure to have a large next_offset field. This field is not bounds checked prior to writing a counter value at the supplied offset. Problem is that mark_source_chains should not have been called -- the rule doesn't have a next entry, so its supposed to return an absolute verdict of either ACCEPT or DROP. However, the function conditional() doesn't work as the name implies. It only checks that the rule is using wildcard address matching. However, an unconditional rule must also not be using any matches (no -m args). The underflow validator only checked the addresses, therefore passing the 'unconditional absolute verdict' test, while mark_source_chains also tested for presence of matches, and thus proceeeded to the next (not-existent) rule. Unify this so that all the callers have same idea of 'unconditional rule'. Change-Id: Id2b4779f2e41b1a82b1d266bb9e11118c4428afc Reported-by: Ben Hawkes <hawkes@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
*	ipv4: Don't do expensive useless work during inetdev destroy.	David S. Miller	2016-10-12	3	-2/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an inetdev is destroyed, every address assigned to the interface is removed. And in this scenerio we do two pointless things which can be very expensive if the number of assigned interfaces is large: 1) Address promotion. We are deleting all addresses, so there is no point in doing this. 2) A full nf conntrack table purge for every address. We only need to do this once, as is already caught by the existing masq_dev_notifier so masq_inet_event() can skip this. Change-Id: I4b2a3ed665543728451c21465fb90ec89f739135 Reported-by: Solar Designer <solar@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
*	bluetooth: Validate socket address length in sco_sock_bind().	David S. Miller	2016-10-12	1	-0/+3
\| \| \| \| \|	Change-Id: I890640975f1af64f71947b6a1820249e08f6375b Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: add validation for the socket syscall protocol argument	Hannes Frederic Sowa	2016-10-12	5	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	郭永刚 reported that one could simply crash the kernel as root by using a simple program: int socket_fd; struct sockaddr_in addr; addr.sin_port = 0; addr.sin_addr.s_addr = INADDR_ANY; addr.sin_family = 10; socket_fd = socket(10,3,0x40000000); connect(socket_fd , &addr,16); AF_INET, AF_INET6 sockets actually only support 8-bit protocol identifiers. inet_sock's skc_protocol field thus is sized accordingly, thus larger protocol identifiers simply cut off the higher bits and store a zero in the protocol fields. This could lead to e.g. NULL function pointer because as a result of the cut off inet_num is zero and we call down to inet_autobind, which is NULL for raw sockets. kernel: Call Trace: kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70 kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80 kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110 kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80 kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200 kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10 kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89 I found no particular commit which introduced this problem. Change-Id: I653fad90da54908144cc8916c2dccb1fa6f14eed CVE: CVE-2015-8543 Cc: Cong Wang <cwang@twopensource.com> Reported-by: 郭永刚 <guoyonggang@360.cn> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fix iterating over hashtable in tcp_nuke_addr()	Dmitry Torokhov	2016-10-12	1	-1/+1
\| \| \| \| \| \| \| \| \|	The actual size of the tcp hashinfo table is tcp_hashinfo.ehash_mask + 1 so we need to adjust the loop accordingly to get the sockets hashed into the last bucket. Change-Id: I796b3c7b4a1a7fa35fba9e5192a4a403eb6e17de Signed-off-by: Dmitry Torokhov <dtor@google.com>
*	ipv6: addrconf: validate new MTU before applying it	Marcelo Leitner	2016-10-12	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we don't check if the new MTU is valid or not and this allows one to configure a smaller than minimum allowed by RFCs or even bigger than interface own MTU, which is a problem as it may lead to packet drops. If you have a daemon like NetworkManager running, this may be exploited by remote attackers by forging RA packets with an invalid MTU, possibly leading to a DoS. (NetworkManager currently only validates for values too small, but not for too big ones.) The fix is just to make sure the new value is valid. That is, between IPV6_MIN_MTU and interface's MTU. Note that similar check is already performed at ndisc_router_discovery(), for when kernel itself parses the RA. Change-Id: I6b70d0c12a77c7932066982f8797d8024f130d7c Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
*	udp: fix behavior of wrong checksums	Eric Dumazet	2016-10-12	2	-8/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have two problems in UDP stack related to bogus checksums : 1) We return -EAGAIN to application even if receive queue is not empty. This breaks applications using edge trigger epoll() 2) Under UDP flood, we can loop forever without yielding to other processes, potentially hanging the host, especially on non SMP. This patch is an attempt to make things better. We might in the future add extra support for rt applications wanting to better control time spent doing a recv() in a hostile environment. For example we could validate checksums before queuing packets in socket receive queue. Change-Id: I9355321ac7ee564d56c342fa7738b918052bf308 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: llc: use correct size for sysctl timeout entries	Sasha Levin	2016-10-12	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	The timeout entries are sizeof(int) rather than sizeof(long), which means that when they were getting read we'd also leak kernel memory to userspace along with the timeout values. Change-Id: I7d764fcc1b5aa022c48b0021fe77f58d21cb9f1b Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: Don't reduce hop limit for an interface	D.S. Ljungmark	2016-10-12	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A local route may have a lower hop_limit set than global routes do. RFC 3756, Section 4.2.7, "Parameter Spoofing" > 1. The attacker includes a Current Hop Limit of one or another small > number which the attacker knows will cause legitimate packets to > be dropped before they reach their destination. > As an example, one possible approach to mitigate this threat is to > ignore very small hop limits. The nodes could implement a > configurable minimum hop limit, and ignore attempts to set it below > said limit. Change-Id: I1090f30cf8b16e381d968376be6bd141a5f8787c Signed-off-by: D.S. Ljungmark <ljungmark@modio.se> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages	Daniel Borkmann	2016-10-12	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CVE-2014-2523 (severity rating: 10.0) Some occurences in the netfilter tree use skb_header_pointer() in the following way ... struct dccp_hdr _dh, *dh; ... skb_header_pointer(skb, dataoff, sizeof(_dh), &dh); ... where dh itself is a pointer that is being passed as the copy buffer. Instead, we need to use &_dh as the forth argument so that we're copying the data into an actual buffer that sits on the stack. Currently, we probably could overwrite memory on the stack (e.g. with a possibly mal-formed DCCP packet), but unintentionally, as we only want the buffer to be placed into _dh variable. Fixes: 2bc780499aa3 ("[NETFILTER]: nf_conntrack: add DCCP protocol support") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Change-Id: I7f7b32440a4f791e29ed9ca53cef88b3c4e412d3
*	netfilter: conntrack: disable generic tracking for known protocols	Florian Westphal	2016-10-12	1	-1/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Given following iptables ruleset: -P FORWARD DROP -A FORWARD -m sctp --dport 9 -j ACCEPT -A FORWARD -p tcp --dport 80 -j ACCEPT -A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT One would assume that this allows SCTP on port 9 and TCP on port 80. Unfortunately, if the SCTP conntrack module is not loaded, this allows all SCTP communication, to pass though, i.e. -p sctp -j ACCEPT, which we think is a security issue. This is because on the first SCTP packet on port 9, we create a dummy "generic l4" conntrack entry without any port information (since conntrack doesn't know how to extract this information). All subsequent packets that are unknown will then be in established state since they will fallback to proto_generic and will match the 'generic' entry. Our originally proposed version [1] completely disabled generic protocol tracking, but Jozsef suggests to not track protocols for which a more suitable helper is available, hence we now mitigate the issue for in tree known ct protocol helpers only, so that at least NAT and direction information will still be preserved for others. [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html Joint work with Daniel Borkmann. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Conflicts: net/netfilter/nf_conntrack_proto_generic.c Change-Id: I5f7ddfa2d29672cc4533ad907bd7605bdc6c4bf7
*	ipv4: try to cache dst_entries which would cause a redirect	Hannes Frederic Sowa	2016-10-12	2	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not caching dst_entries which cause redirects could be exploited by hosts on the same subnet, causing a severe DoS attack. This effect aggravated since commit f88649721268999 ("ipv4: fix dst race in sk_dst_get()"). Lookups causing redirects will be allocated with DST_NOCACHE set which will force dst_release to free them via RCU. Unfortunately waiting for RCU grace period just takes too long, we can end up with >1M dst_entries waiting to be released and the system will run OOM. rcuos threads cannot catch up under high softirq load. Attaching the flag to emit a redirect later on to the specific skb allows us to cache those dst_entries thus reducing the pressure on allocation and deallocation. This issue was discovered by Marcelo Leitner. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Conflicts: include/net/ip.h net/ipv4/route.c Change-Id: I53e4b500a4db2f5fece937a42a3bd810b2640c44
*	Fix security issues reported by the android.security CTS	Stevan Marinkovic	2016-03-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	package Contains following upstream changes: a134f083e79fb4c3d0a925691e732c56911b4326 ipv4: Missing sk_nulls_node_init() in ping_unhash(). 8176cced706b5e5d15887584150764894e94e02f perf: Treat attr.config as u64 in perf_swevent_init() e9c243a5a6de0be8e584c604d353412584b592f8 futex-prevent-requeue-pi-on-same-futex.patch futex: Forbid... 6f7b0a2a5c0fb03be7c25bd1745baa50582348ef futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi() Fixes CTS tests: android.security.cts.NativeCodeTest#testFutex android.security.cts.NativeCodeTest#testPerfEvent android.security.cts.NativeCodeTest#testPingPongRoot Change-Id: Ib9d389c875935e9eb9611be4fc11911383f627fc
*	proc: Usable inode numbers for the namespace file descriptors.	Eric W. Biederman	2016-03-11	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Assign a unique proc inode to each namespace, and use that inode number to ensure we only allocate at most one proc inode for every namespace in proc. A single proc inode per namespace allows userspace to test to see if two processes are in the same namespace. This has been a long requested feature and only blocked because a naive implementation would put the id in a global space and would ultimately require having a namespace for the names of namespaces, making migration and certain virtualization tricks impossible. We still don't have per superblock inode numbers for proc, which appears necessary for application unaware checkpoint/restart and migrations (if the application is using namespace file descriptors) but that is now allowd by the design if it becomes important. I have preallocated the ipc and uts initial proc inode numbers so their structures can be statically initialized. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> (cherry picked from commit 98f842e675f96ffac96e6c50315790912b2812be)
*	security: remove the security_netlink_recv hook as it is equivalent to capable()	Eric Paris	2016-03-11	7	-7/+7
\| \| \| \| \| \| \| \| \| \|	Once upon a time netlink was not sync and we had to get the effective capabilities from the skb that was being received. Today we instead get the capabilities from the current task. This has rendered the entire purpose of the hook moot as it is now functionally equivalent to the capable() call. Signed-off-by: Eric Paris <eparis@redhat.com>
*	cfg80211: allow registering to beacons	Johannes Berg	2016-03-11	2	-1/+71
\| \| \| \| \| \| \| \| \| \|	Add the ability to register to received beacon frames to allow implementing OLBC logic in userspace. The registration is per wiphy since there's no point in receiving the same frame multiple times. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	nl80211: add API to probe a client	Johannes Berg	2016-03-11	1	-0/+104
\| \| \| \| \| \| \| \| \|	When the AP SME in hostapd is used it wants to probe the clients when they have been idle for some time. Add explicit API to support this. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	nl80211: allow subscribing to unexpected class3 frames	Johannes Berg	2016-03-11	3	-0/+85
\| \| \| \| \| \| \| \| \| \| \|	To implement AP mode without monitor interfaces we need to be able to send a deauth to stations that send frames without being associated. Enable this by adding a new nl80211 event for such frames that an application can subscribe to. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	nl80211: advertise device AP SME	Johannes Berg	2016-03-11	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the ability to advertise that the device contains the AP SME and what features it can support. There are currently no features in the bitmap -- probe response offload will be advertised by a few patches Arik is working on now (who took over from Guy Eilam) and a device with AP SME will typically implement and require response offload. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Conflicts: drivers/net/wireless/ath/ath6kl/init.c Change-Id: Ib1a65814860cf97cadd142c17be0e91f43743832
*	nl80211: advertise GTK rekey support, new triggers	Johannes Berg	2016-03-11	2	-0/+53
\| \| \| \| \| \| \| \| \| \| \|	Since we now have the necessary API in place to support GTK rekeying, applications will need to know whether it is supported by a device. Add a pseudo-trigger that is used only to advertise that capability. Also, add some new triggers that match what iwlagn devices can do. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
*	nl80211: support sending TDLS commands/frames	Arik Nemtsov	2016-03-11	1	-0/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for sending high-level TDLS commands and TDLS frames via NL80211_CMD_TDLS_OPER and NL80211_CMD_TDLS_MGMT, respectively. Add appropriate cfg80211 callbacks for lower level drivers. Add wiphy capability flags for TDLS support and advertise them via nl80211. Signed-off-by: Arik Nemtsov <arik@wizery.com> Cc: Kalyan C Gaddam <chakkal@iit.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com> Conflicts: include/linux/nl80211.h include/net/cfg80211.h net/wireless/nl80211.c Change-Id: I08e4de6e92680aed35b98838aa999d31963b6d50
*	cfg80211/nl80211: Add PMKSA caching candidate event	Jouni Malinen	2016-03-11	3	-0/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the driver (or most likely firmware) decides which AP to use for roaming based on internal scan result processing, user space needs to be notified of PMKSA caching candidates to allow RSN pre-authentication to be used. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Conflicts: include/linux/nl80211.h Change-Id: I31aa113747b75f5f35658b857fdfe8d9a75e4534
*	cfg80211/nl80211: support GTK rekey offload	Johannes Berg	2016-03-11	3	-0/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In certain circumstances, like WoWLAN scenarios, devices may implement (partial) GTK rekeying on the device to avoid waking up the host for it. In order to successfully go through GTK rekeying, the KEK, KCK and the replay counter are required. Add API to let the supplicant hand the parameters to the driver which may store it for future GTK rekey operations. Note that, of course, if GTK rekeying is done by the device, the EAP frame must not be passed up to userspace, instead a rekey event needs to be sent to let userspace update its replay counter. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Conflicts: include/linux/nl80211.h net/wireless/nl80211.c Change-Id: Icd3a157742b08c01a3be20d46d4112e5d4b93a58
*	netfilter: xt_qtaguid: report only uid tags to non-privileged processes	JP Abgrall	2016-03-11	1	-2/+3
\| \| \| \| \| \| \| \| \| \|	In the past, a process could only see its own stats (uid-based summary, and details). Now we allow any process to see other UIDs uid-based stats, but still hide the detailed stats. Change-Id: I7666961ed244ac1d9359c339b048799e5db9facc Signed-off-by: JP Abgrall <jpa@google.com>
*	net: ipv6: Add a sysctl to make optimistic addresses useful candidates	Erik Kline	2016-03-11	1	-2/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a sysctl that causes an interface's optimistic addresses to be considered equivalent to other non-deprecated addresses for source address selection purposes. Preferred addresses will still take precedence over optimistic addresses, subject to other ranking in the source address selection algorithm. This is useful where different interfaces are connected to different networks from different ISPs (e.g., a cell network and a home wifi network). The current behaviour complies with RFC 3484/6724, and it makes sense if the host has only one interface, or has multiple interfaces on the same network (same or cooperating administrative domain(s), but not in the multiple distinct networks case. For example, if a mobile device has an IPv6 address on an LTE network and then connects to IPv6-enabled wifi, while the wifi IPv6 address is undergoing DAD, IPv6 connections will try use the wifi default route with the LTE IPv6 address, and will get stuck until they time out. Also, because optimistic nodes can receive frames, issue an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMSTIC flag appropriately set). A second RTM_NEWADDR is sent if DAD completes (the address flags have changed), otherwise an RTM_DELADDR is sent. Also: add an entry in ip-sysctl.txt for optimistic_dad. [backport of net-next 7fd2561e4ebdd070ebba6d3326c4c5b13942323f] Signed-off-by: Erik Kline <ek@google.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Bug: 17769720 Change-Id: I440a9b8c788db6767d191bbebfd2dff481aa9e0d
*	ipv4: Check if dev_out is NULL in ip_route_output_slow()	Dmitry Shmidt	2016-03-11	1	-1/+6
\| \| \| \| \|	Change-Id: If04a8e99942dbe7e099e736dc87c2a49e1e778f9 Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
*	net/ping: handle protocol mismatching scenario	Jane Zhou	2016-03-11	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. [cherry-pick of net 91a0b603469069cdcce4d572b7525ffc9fd352a6] Bug: 18512516 Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494 Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Jane Zhou <a17711@motorola.com> Signed-off-by: Yiwei Zhao <gbjc64@motorola.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Skip calling fib_detect_death() if fib_dev is NULL	Dmitry Shmidt	2016-03-11	1	-1/+1
\| \| \| \| \|	Change-Id: I1b8c6c7e79cb8a05b4b715ddb3299d74edef0e14 Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
*	net: ipv6: ping: Use socket mark in routing lookup	Lorenzo Colitti	2016-03-11	1	-0/+1
\| \| \| \| \| \| \| \|	[net-next commit bf439b3154ce49d81a79b14f9fab18af99018ae2] Change-Id: I8356e9132088c75d4510021c6e4c2641d772087a Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tcp: add a sysctl to config the tcp_default_init_rwnd	JP Abgrall	2016-03-11	3	-10/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The default initial rwnd was hardcoded to 10. Now we allow it to be controlled via /proc/sys/net/ipv4/tcp_default_init_rwnd which limits the values from 3 to 100 This is somewhat needed because ipv6 routes are autoconfigured by the kernel. See "An Argument for Increasing TCP's Initial Congestion Window" in https://developers.google.com/speed/articles/tcp_initcwnd_paper.pdf Change-Id: I7eac8a0a5133371aea9ecb9aec0b608bd7f2cc57 Signed-off-by: JP Abgrall <jpa@google.com>
*	nf: xt_qtaguid: fix handling for cases where tunnels are used.	JP Abgrall	2016-03-11	1	-81/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* fix skb->dev vs par->in/out When there is some forwarding going on, it introduces extra state around devs associated with xt_action_param->in/out and sk_buff->dev. E.g. par->in and par->out are both set, or skb->dev and par->out are both set (and different) This would lead qtaguid to make the wrong assumption about the direction and update the wrong device stats. Now we rely more on par->in/out. * Fix handling when qtaguid is used as "owner" When qtaguid is used as an owner module, and sk_socket->file is not there (happens when tunnels are involved), it would incorrectly do a tag stats update. * Correct debug messages. Bug: 11687690 Change-Id: I2b1ff8bd7131969ce9e25f8291d83a6280b3ba7f Signed-off-by: JP Abgrall <jpa@google.com> (cherry picked from commit 2b71479d6f5fe8f33b335f713380f72037244395)
*	netfilter: xt_qtaguid: extend iface stat to report protocols	JP Abgrall	2016-03-11	3	-43/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the past the iface_stat_fmt would only show global bytes/packets for the skb-based numbers. For stall detection in userspace, distinguishing tcp vs other protocols makes it easier. Now we report ifname total_skb_rx_bytes total_skb_rx_packets total_skb_tx_bytes total_skb_tx_packets {rx,tx}_{tcp,udp,ohter}_{bytes,packets} Bug: 6818637 Signed-off-by: JP Abgrall <jpa@google.com> Change-Id: I179c5ebf2fe822acec0bce4973b4bbb5e7d5076d
*	netfilter: xt_qtaguid: remove AID_* dependency for access control	JP Abgrall	2016-03-11	1	-25/+26
\| \| \| \| \| \| \| \| \| \|	qtaguid limits what can be done with /ctrl and /stats based on group membership. This changes removes AID_NET_BW_STATS and AID_NET_BW_ACCT, and picks up the groups from the gid of the matching proc entry files. Signed-off-by: JP Abgrall <jpa@google.com> Change-Id: I42e477adde78a12ed5eb58fbc0b277cdaadb6f94
*	netfilter: qtaguid: Don't BUG_ON if create_if_tag_stat fails	Pontus Fuchs	2016-03-11	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	If create_if_tag_stat fails to allocate memory (GFP_ATOMIC) the following will happen: qtaguid: iface_stat: tag stat alloc failed ... kernel BUG at xt_qtaguid.c:1482! Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
*	netfilter: xt_qtaguid: fix error exit that would keep a spinlock.	JP Abgrall	2016-03-11	1	-2/+2
\| \| \| \| \| \| \| \|	qtudev_open() could return with a uid_tag_data_tree_lock held when an kzalloc(..., GFP_ATOMIC) would fail. Very unlikely to get triggered AND survive the mayhem of running out of mem. Signed-off-by: JP Abgrall <jpa@google.com>