aboutsummaryrefslogtreecommitdiffstats
path: root/net/xfrm/xfrm_state.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2013-09-051-5/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c net/bridge/br_multicast.c net/ipv6/sit.c The conflicts were minor: 1) sit.c changes overlap with change to ip_tunnel_xmit() signature. 2) br_multicast.c had an overlap between computing max_delay using msecs_to_jiffies and turning MLDV2_MRC() into an inline function with a name using lowercase instead of uppercase letters. 3) stmmac had two overlapping changes, one which conditionally allocated and hooked up a dma_cfg based upon the presence of the pbl OF property, and another one handling store-and-forward DMA made. The latter of which should not go into the new of_find_property() basic block. Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm: make local error reporting more robustHannes Frederic Sowa2013-08-141-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In xfrm4 and xfrm6 we need to take care about sockets of the other address family. This could happen because a 6in4 or 4in6 tunnel could get protected by ipsec. Because we don't want to have a run-time dependency on ipv6 when only using ipv4 xfrm we have to embed a pointer to the correct local_error function in xfrm_state_afinet and look it up when returning an error depending on the socket address family. Thanks to vi0ss for the great bug report: <https://bugzilla.kernel.org/show_bug.cgi?id=58691> v2: a) fix two more unsafe interpretations of skb->sk as ipv6 socket (xfrm6_local_dontfrag and __xfrm6_output) v3: a) add an EXPORT_SYMBOL_GPL(xfrm_local_error) to fix a link error when building ipv6 as a module (thanks to Steffen Klassert) Reported-by: <vi0oss@gmail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: Make xfrm_state timer monotonicFan Du2013-08-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfrm_state timer should be independent of system clock change, so switch to CLOCK_BOOTTIME base which is not only monotonic but also counting suspend time. Thus issue reported in commit: 9e0d57fd6dad37d72a3ca6db00ca8c76f2215454 ("xfrm: SAD entries do not expire correctly after suspend-resume") could ALSO be avoided. v2: Use CLOCK_BOOTTIME to count suspend time, but still monotonic. Signed-off-by: Fan Du <fan.du@windriver.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | xfrm: constify mark argument of xfrm_find_acq()Mathias Krause2013-08-051-5/+7
|/ | | | | | | | | | The mark argument is read only, so constify it. Also make dummy_mark in af_key const -- only used as dummy argument for this very function. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: allow to avoid copying DSCP during encapsulationNicolas Dichtel2013-03-061-0/+1
| | | | | | | | | | | | | | | | | | | By default, DSCP is copying during encapsulation. Copying the DSCP in IPsec tunneling may be a bit dangerous because packets with different DSCP may get reordered relative to each other in the network and then dropped by the remote IPsec GW if the reordering becomes too big compared to the replay window. It is possible to avoid this copy with netfilter rules, but it's very convenient to be able to configure it for each SA directly. This patch adds a toogle for this purpose. By default, it's not set to maintain backward compatibility. Field flags in struct xfrm_usersa_info is full, hence I add a new attribute. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* hlist: drop the node parameter from iteratorsSasha Levin2013-02-271-26/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* xfrm: Convert xfrm_addr_cmp() to boolean xfrm_addr_equal().YOSHIFUJI Hideaki / 吉藤英明2013-01-291-17/+17
| | | | | | | | | All users of xfrm_addr_cmp() use its result as boolean. Introduce xfrm_addr_equal() (which is equal to !xfrm_addr_cmp()) and convert all users. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: use separated locks to protect pointers of struct xfrm_state_afinfoCong Wang2013-01-171-25/+18
| | | | | | | | | | | | | | afinfo->type_map and afinfo->mode_map deserve separated locks, they are different things. We should just take RCU read lock to protect afinfo itself, but not for the inner pointers. Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: replace rwlock on xfrm_km_list with rcuCong Wang2013-01-161-28/+30
| | | | | | | | Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: replace rwlock on xfrm_state_afinfo with rcuCong Wang2013-01-161-17/+16
| | | | | | | | | | | | Similar to commit 418a99ac6ad487dc9c42e6b0e85f941af56330f2 (Replace rwlock on xfrm_policy_afinfo with rcu), the rwlock on xfrm_state_afinfo can be replaced by RCU too. Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* xfrm: removes a superfluous check and add a statisticLi RongQing2013-01-071-3/+0
| | | | | | | | | | | | Remove the check if x->km.state equal to XFRM_STATE_VALID in xfrm_state_check_expire(), which will be done before call xfrm_state_check_expire(). add a LINUX_MIB_XFRMOUTSTATEINVALID statistic to record the outbound error due to invalid xfrm state. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2012-10-021-6/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increate the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-09-151-1/+3
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netlink: Rename pid to portid to avoid confusionEric W. Biederman2012-09-101-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | xfrm: remove redundant parameter "int dir" in struct xfrm_mgr.acquireFan Du2012-08-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Sematically speaking, xfrm_mgr.acquire is called when kernel intends to ask user space IKE daemon to negotiate SAs with peers. IOW the direction will *always* be XFRM_POLICY_OUT, so remove int dir for clarity. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2012-10-021-3/+3
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...
| * | userns: Convert the audit loginuid to be a kuidEric W. Biederman2012-09-171-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Always store audit loginuids in type kuid_t. Print loginuids by converting them into uids in the appropriate user namespace, and then printing the resulting uid. Modify audit_get_loginuid to return a kuid_t. Modify audit_set_loginuid to take a kuid_t. Modify /proc/<pid>/loginuid on read to convert the loginuid into the user namespace of the opener of the file. Modify /proc/<pid>/loginud on write to convert the loginuid rom the user namespace of the opener of the file. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Paul Moore <paul@paul-moore.com> ? Cc: David Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
* | | net/xfrm/xfrm_state.c: fix error return codeJulia Lawall2012-08-311-1/+3
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Fix unexpected SA hard expiration after changing dateFan Du2012-08-021-4/+17
|/ | | | | | | | | | | After SA is setup, one timer is armed to detect soft/hard expiration, however the timer handler uses xtime to do the math. This makes hard expiration occurs first before soft expiration after setting new date with big interval. As a result new child SA is deleted before rekeying the new one. Signed-off-by: Fan Du <fdu@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: remove ipv6_addr_copy()Alexey Dobriyan2011-11-221-8/+4
| | | | | | | C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* XFRM: Fix memory leak in xfrm_state_updateTushar Gohad2011-07-081-0/+2
| | | | | | | | | | | | | | Upon "ip xfrm state update ..", xfrm_add_sa() takes an extra reference on the user-supplied SA and forgets to drop the reference when xfrm_state_update() returns 0. This leads to a memory leak as the parameter SA is never freed. This change attempts to fix the leak by calling __xfrm_state_put() when xfrm_state_update() updates a valid SA (err = 0). The parameter SA is added to the gc list when the final reference is dropped by xfrm_add_sa() upon completion. Signed-off-by: Tushar Gohad <tgohad@mvista.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: constify ip headers and in6_addrEric Dumazet2011-04-221-6/+6
| | | | | | | | Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Assign esn pointers when cloning a stateSteffen Klassert2011-03-281-0/+6
| | | | | | | | | | | | | When we clone a xfrm state we have to assign the replay_esn and the preplay_esn pointers to the state if we use the new replay detection method. To this end, we add a xfrm_replay_clone() function that allocates memory for the replay detection and takes over the necessary values from the original state. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Fix initialize repl field of struct xfrm_stateWei Yongjun2011-03-211-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 'xfrm: Move IPsec replay detection functions to a separate file' (9fdc4883d92d20842c5acea77a4a21bb1574b495) introduce repl field to struct xfrm_state, and only initialize it under SA's netlink create path, the other path, such as pf_key, ipcomp/ipcomp6 etc, the repl field remaining uninitialize. So if the SA is created by pf_key, any input packet with SA's encryption algorithm will cause panic. int xfrm_input() { ... x->repl->advance(x, seq); ... } This patch fixed it by introduce new function __xfrm_init_state(). Pid: 0, comm: swapper Not tainted 2.6.38-next+ #14 Bochs Bochs EIP: 0060:[<c078e5d5>] EFLAGS: 00010206 CPU: 0 EIP is at xfrm_input+0x31c/0x4cc EAX: dd839c00 EBX: 00000084 ECX: 00000000 EDX: 01000000 ESI: dd839c00 EDI: de3a0780 EBP: dec1de88 ESP: dec1de64 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process swapper (pid: 0, ti=dec1c000 task=c09c0f20 task.ti=c0992000) Stack: 00000000 00000000 00000002 c0ba27c0 00100000 01000000 de3a0798 c0ba27c0 00000033 dec1de98 c0786848 00000000 de3a0780 dec1dea4 c0786868 00000000 dec1debc c074ee56 e1da6b8c de3a0780 c074ed44 de3a07a8 dec1decc c074ef32 Call Trace: [<c0786848>] xfrm4_rcv_encap+0x22/0x27 [<c0786868>] xfrm4_rcv+0x1b/0x1d [<c074ee56>] ip_local_deliver_finish+0x112/0x1b1 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44 [<c074ef77>] ip_local_deliver+0x3e/0x44 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1 [<c074ec03>] ip_rcv_finish+0x30a/0x332 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44 [<c074f188>] ip_rcv+0x20b/0x247 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332 [<c072797d>] __netif_receive_skb+0x373/0x399 [<c0727bc1>] netif_receive_skb+0x4b/0x51 [<e0817e2a>] cp_rx_poll+0x210/0x2c4 [8139cp] [<c072818f>] net_rx_action+0x9a/0x17d [<c0445b5c>] __do_softirq+0xa1/0x149 [<c0445abb>] ? __do_softirq+0x0/0x149 Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Add user interface for esn and big anti-replay windowsSteffen Klassert2011-03-131-0/+2
| | | | | | | | | | | | | | This patch adds a netlink based user interface to configure esn and big anti-replay windows. The new netlink attribute XFRMA_REPLAY_ESN_VAL is used to configure the new implementation. If the XFRM_STATE_ESN flag is set, we use esn and support for big anti-replay windows for the configured state. If this flag is not set we use the new implementation with 32 bit sequence numbers. A big anti-replay window can be configured in this case anyway. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Move IPsec replay detection functions to a separate fileSteffen Klassert2011-03-131-108/+3
| | | | | | | | | | To support multiple versions of replay detection, we move the replay detection functions to a separate file and make them accessible via function pointers contained in the struct xfrm_replay. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Put flowi_* prefix on AF independent members of struct flowiDavid S. Miller2011-03-121-1/+1
| | | | | | | | | | I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Pass const xfrm_address_t objects to xfrm_state_lookup* and xfrm_find_acq.David S. Miller2011-02-271-4/+8
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Const'ify xfrm_address_t args to xfrm_state_find.David S. Miller2011-02-231-2/+2
| | | | | | This required a const'ification in xfrm_init_tempstate() too. Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Remove unused 'saddr' and 'daddr' args to xfrm_state_look_at.David S. Miller2011-02-231-3/+2
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Const'ify xfrm_address_t args to __xfrm_state_lookup{,_byaddr}.David S. Miller2011-02-231-2/+8
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Const'ify xfrm_tmpl arg to xfrm_init_tempstate.David S. Miller2011-02-231-1/+1
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Const'ify xfrm_address_t args to xfrm_*_hash.David S. Miller2011-02-231-5/+6
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Const'ify pointer args to km_migrate() and implementations.David S. Miller2011-02-231-3/+3
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Pass km_event pointers around as const when possible.David S. Miller2011-02-231-2/+2
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Mark flowi arg to xfrm_state_find() const.David S. Miller2011-02-221-1/+1
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Mark flowi arg to xfrm_init_tempstate() const.David S. Miller2011-02-221-1/+1
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Mark flowi arg to xfrm_state_look_at() const.David S. Miller2011-02-221-1/+1
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Fix xfrm_state_migrate leakThomas Egerer2010-12-091-1/+1
| | | | | | | | | | | xfrm_state_migrate calls kfree instead of xfrm_state_put to free a failed state. According to git commit 553f9118 this can cause memory leaks. Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Allow different selector family in temporary stateThomas Egerer2010-09-201-18/+27
| | | | | | | | | | | | | | | | | | The family parameter xfrm_state_find is used to find a state matching a certain policy. This value is set to the template's family (encap_family) right before xfrm_state_find is called. The family parameter is however also used to construct a temporary state in xfrm_state_find itself which is wrong for inter-family scenarios because it produces a selector for the wrong family. Since this selector is included in the xfrm_user_acquire structure, user space programs misinterpret IPv6 addresses as IPv4 and vice versa. This patch splits up the original init_tempsel function into a part that initializes the selector respectively the props and id of the temporary state, to allow for differing ip address families whithin the state. Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-04-111-0/+1
|\ | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/stmmac/stmmac_main.c drivers/net/wireless/wl12xx/wl1271_cmd.c drivers/net/wireless/wl12xx/wl1271_main.c drivers/net/wireless/wl12xx/wl1271_spi.c net/core/ethtool.c net/mac80211/scan.c
| * include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* | xfrm: Remove xfrm_state_genidHerbert Xu2010-04-011-4/+1
|/ | | | | | | | | | | The xfrm state genid only needs to be matched against the copy saved in xfrm_dst. So we don't need a global genid at all. In fact, we don't even need to initialise it. Based on observation by Timo Teräs. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: SA lookups with markJamal Hadi Salim2010-02-221-0/+12
| | | | | | | Allow mark to be added to the SA lookup Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: SA lookups signature with markJamal Hadi Salim2010-02-221-24/+34
| | | | | | | | pass mark to all SA lookups to prepare them for when we add code to have them search. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Flushing empty SAD generates false eventsJamal Hadi Salim2010-02-191-2/+6
| | | | | | | | | | | | | To see the effect make sure you have an empty SAD. On window1 "ip xfrm mon" and on window2 issue "ip xfrm state flush" You get prompt back in window2 and you see the flush event on window1. With this fix, you still get prompt on window1 but no event on window2. Thanks to Alexey Dobriyan for finding a bug in earlier version when using pfkey to do the flushing. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Revert false event eliding commits.David S. Miller2010-02-171-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | As reported by Alexey Dobriyan: -------------------- setkey now takes several seconds to run this simple script and it spits "recv: Resource temporarily unavailable" messages. #!/usr/sbin/setkey -f flush; spdflush; add A B ipcomp 44 -m tunnel -C deflate; add B A ipcomp 45 -m tunnel -C deflate; spdadd A B any -P in ipsec ipcomp/tunnel/192.168.1.2-192.168.1.3/use; spdadd B A any -P out ipsec ipcomp/tunnel/192.168.1.3-192.168.1.2/use; -------------------- Obviously applications want the events even when the table is empty. So we cannot make this behavioral change. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-02-161-9/+3
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * xfrm: Fix xfrm_state_clone leakHerbert Xu2010-02-161-9/+3
| | | | | | | | | | | | | | | | | | | | | | | | xfrm_state_clone calls kfree instead of xfrm_state_put to free a failed state. Depending on the state of the failed state, it can cause leaks to things like module references. All states should be freed by xfrm_state_put past the point of xfrm_init_state. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | xfrm: avoid spinlock in get_acqseq() used by xfrm userjamal2010-02-161-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eric's version fixed it for pfkey. This one is for xfrm user. I thought about amortizing those two get_acqseq()s but it seems reasonable to have two of these sequence spaces for the two different interfaces. cheers, jamal commit d5168d5addbc999c94aacda8f28a4a173756a72b Author: Jamal Hadi Salim <hadi@cyberus.ca> Date: Tue Feb 16 06:51:22 2010 -0500 xfrm: avoid spinlock in get_acqseq() used by xfrm user This is in the same spirit as commit 28aecb9d7728dc26bf03ce7925fe622023a83a2a by Eric Dumazet. Use atomic_inc_return() in get_acqseq() to avoid taking a spinlock Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>