aboutsummaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | netlabel: Remove unneeded in-kernel API functionsPaul Moore2008-10-101-61/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After some discussions with the Smack folks, well just Casey, I now have a better idea of what Smack wants out of NetLabel in the future so I think it is now safe to do some API "pruning". If another LSM comes along that needs this functionality we can always add it back in, but I don't see any LSMs on the horizon which might make use of these functions. Thanks to Rami Rosen who suggested removing netlbl_cfg_cipsov4_del() back in February 2008. Signed-off-by: Paul Moore <paul.moore@hp.com> Reviewed-by: James Morris <jmorris@namei.org>
| * | | netlabel: Fix some sparse warningsPaul Moore2008-10-103-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a few sparse warnings. One dealt with a RCU lock being held on error, another dealt with an improper type caused by a signed/unsigned mixup while the rest appeared to be caused by using rcu_dereference() in a list_for_each_entry_rcu() call. The latter probably isn't a big deal, but I derive a certain pleasure from knowing that the net/netlabel is nice and clean. Thanks to James Morris for pointing out the issues and demonstrating how to run sparse. Signed-off-by: Paul Moore <paul.moore@hp.com>
* | | | gre: Initialise rtnl_link tunnel parameters properlyHerbert Xu2008-10-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Brown paper bag error of calling memset with sizeof(p) instead of sizeof(*p). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | ipvs: Add proper dependencies on IP_VS, and fix description header line.David S. Miller2008-10-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linus noted a build failure case: net/netfilter/ipvs/ip_vs_xmit.c: In function 'ip_vs_tunnel_xmit': net/netfilter/ipvs/ip_vs_xmit.c:616: error: implicit declaration of function 'ip_select_ident' The proper include file (net/ip.h) is being included in ip_vs_xmit.c to get that declaration. So the only possible case where this can happen is if CONFIG_INET is not enabled. This seems to be purely a missing dependency in the ipvs/Kconfig file IP_VS entry. Also, while we're here, remove the out of date "EXPERIMENTAL" string in the IP_VS config help header line. IP_VS no longer depends upon CONFIG_EXPERIMENTAL Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | af_key: fix SADB_X_SPDDELETE responseTobias Brunner2008-10-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When deleting an SPD entry using SADB_X_SPDDELETE, c.data.byid is not initialized to zero in pfkey_spddelete(). Thus, key_notify_policy() responds with a PF_KEY message of type SADB_X_SPDDELETE2 instead of SADB_X_SPDDELETE. Signed-off-by: Tobias Brunner <tobias.brunner@strongswan.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | gre: minor cleanups in netlink interfacePatrick McHardy2008-10-101-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - use typeful helpers for IFLA_GRE_LOCAL/IFLA_GRE_REMOTE - replace magic value by FIELD_SIZEOF - use MODULE_ALIAS_RTNL_LINK macro Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | gre: fix copy and paste errorPatrick McHardy2008-10-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The flags are dumped twice, the keys not at all. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tcpv6: fix error with CONFIG_TCP_MD5SIG disabledGuo-Fu Tseng2008-10-091-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fix error with CONFIG_TCP_MD5SIG disabled. Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | udp: complete port availability checkingEric Dumazet2008-10-091-24/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While looking at UDP port randomization, I noticed it was litle bit pessimistic, not looking at type of sockets (IPV6/IPV4) and not looking at bound addresses if any. We should perform same tests than when binding to a specific port. This permits a cleanup of udp_lib_get_port() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tcpv6: combine tcp_v6_send_(reset|ack)Ilpo Järvinen2008-10-091-99/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | $ codiff tcp_ipv6.o.old tcp_ipv6.o.new net/ipv6/tcp_ipv6.c: tcp_v6_md5_hash_hdr | -144 tcp_v6_send_ack | -585 tcp_v6_send_reset | -540 3 functions changed, 1269 bytes removed, diff: -1269 net/ipv6/tcp_ipv6.c: tcp_v6_send_response | +791 1 function changed, 791 bytes added, diff: +791 tcp_ipv6.o.new: 4 functions changed, 791 bytes added, 1269 bytes removed, diff: -478 I choose to leave the reset related netns comment in place (not the one that is killed) as I cannot understand its English so it's a bit hard for me to evaluate its usefulness :-). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tcpv6: convert opt[] -> topt in tcp_v6_send_resetIlpo Järvinen2008-10-091-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | after this I get: $ diff-funcs tcp_v6_send_reset tcp_ipv6.c tcp_ipv6.c tcp_v6_send_ack --- tcp_ipv6.c:tcp_v6_send_reset() +++ tcp_ipv6.c:tcp_v6_send_ack() @@ -1,4 +1,5 @@ -static void tcp_v6_send_reset(struct sock *sk, struct sk_buff *skb) +static void tcp_v6_send_ack(struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 ts, + struct tcp_md5sig_key *key) { struct tcphdr *th = tcp_hdr(skb), *t1; struct sk_buff *buff; @@ -7,31 +8,14 @@ struct sock *ctl_sk = net->ipv6.tcp_sk; unsigned int tot_len = sizeof(struct tcphdr); __be32 *topt; -#ifdef CONFIG_TCP_MD5SIG - struct tcp_md5sig_key *key; -#endif - - if (th->rst) - return; - - if (!ipv6_unicast_destination(skb)) - return; + if (ts) + tot_len += TCPOLEN_TSTAMP_ALIGNED; #ifdef CONFIG_TCP_MD5SIG - if (sk) - key = tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr); - else - key = NULL; - if (key) tot_len += TCPOLEN_MD5SIG_ALIGNED; #endif - /* - * We need to grab some memory, and put together an RST, - * and then put it into the queue to be sent. - */ - buff = alloc_skb(MAX_HEADER + sizeof(struct ipv6hdr) + tot_len, GFP_ATOMIC); if (buff == NULL) @@ -46,18 +30,20 @@ t1->dest = th->source; t1->source = th->dest; t1->doff = tot_len / 4; - t1->rst = 1; - - if(th->ack) { - t1->seq = th->ack_seq; - } else { - t1->ack = 1; - t1->ack_seq = htonl(ntohl(th->seq) + th->syn + th->fin - + skb->len - (th->doff<<2)); - } + t1->seq = htonl(seq); + t1->ack_seq = htonl(ack); + t1->ack = 1; + t1->window = htons(win); topt = (__be32 *)(t1 + 1); + if (ts) { + *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | + (TCPOPT_TIMESTAMP << 8) | TCPOLEN_TIMESTAMP); + *topt++ = htonl(tcp_time_stamp); + *topt++ = htonl(ts); + } + #ifdef CONFIG_TCP_MD5SIG if (key) { *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) | @@ -84,15 +70,10 @@ fl.fl_ip_sport = t1->source; security_skb_classify_flow(skb, &fl); - /* Pass a socket to ip6_dst_lookup either it is for RST - * Underlying function will use this to retrieve the network - * namespace - */ if (!ip6_dst_lookup(ctl_sk, &buff->dst, &fl)) { if (xfrm_lookup(&buff->dst, &fl, NULL, 0) >= 0) { ip6_xmit(ctl_sk, buff, &fl, NULL, 0); TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS); - TCP_INC_STATS_BH(net, TCP_MIB_OUTRSTS); return; } } ...which starts to be trivial to combine. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tcpv6: trivial formatting changes to send_(ack|reset)Ilpo Järvinen2008-10-091-4/+3
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tcpv[46]: fix md5 pseudoheader address field orderingIlpo Järvinen2008-10-092-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Maybe it's just me but I guess those md5 people made a mess out of it by having *_md5_hash_* to use daddr, saddr order instead of the one that is natural (and equal to what csum functions use). For the segment were sending, the original addresses are reversed so buff's saddr == skb's daddr and vice-versa. Maybe I can finally proceed with unification of some code after fixing it first... :-) Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | sctp: update SNMP statiscts when T5 timer expired.Vlad Yasevich2008-10-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The T5 timer is the timer for the over-all shutdown procedure. If this timer expires, then shutdown procedure has not completed and we ABORT the association. We should update SCTP_MIB_ABORTED and SCTP_MIB_CURRESTAB when aborting. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | sctp: Fix SNMP number of SCTP_MIB_ABORTED during violation handling.Vlad Yasevich2008-10-091-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If ABORT chunks require authentication and a protocol violation is triggered, we do not tear down the association. Subsequently, we should not increment SCTP_MIB_ABORTED. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | sctp: Fix the SNMP number of SCTP_MIB_CURRESTABWei Yongjun2008-10-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RFC3873 defined SCTP_MIB_CURRESTAB: sctpCurrEstab OBJECT-TYPE SYNTAX Gauge32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of associations for which the current state is either ESTABLISHED, SHUTDOWN-RECEIVED or SHUTDOWN-PENDING." REFERENCE "Section 4 in RFC2960 covers the SCTP Association state diagram." If the T4 RTO timer expires many times(timeout), the association will enter CLOSED state, so we should dec the number of SCTP_MIB_CURRESTAB, not inc the number of SCTP_MIB_CURRESTAB. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | inet: Make tunnel RX/TX byte counters more consistentHerbert Xu2008-10-091-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes the RX/TX byte counters for IPIP, GRE and SIT more consistent. Previously we included the external IP headers on the way out but not when the packet is inbound. The new scheme is to count payload only in both directions. For IPIP and SIT this simply means the exclusion of the external IP header. For GRE this means that we exclude the GRE header as well. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | gre: Add Transparent Ethernet BridgingHerbert Xu2008-10-091-32/+174
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for Ethernet over GRE encapsulation. This is exposed to user-space with a new link type of "gretap" instead of "gre". It will create an ARPHRD_ETHER device in lieu of the usual ARPHRD_IPGRE. Note that to preserver backwards compatibility all Transparent Ethernet Bridging packets are passed to an ARPHRD_IPGRE tunnel if its key matches and there is no ARPHRD_ETHER device whose key matches more closely. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | gre: Add netlink interfaceHerbert Xu2008-10-091-4/+243
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a netlink interface that will eventually displace the existing ioctl interface. It utilises the elegant rtnl_link_ops mechanism. This also means that user-space no longer needs to rely on the tunnel interface being of type GRE to identify GRE tunnels. The identification can now occur using rtnl_link_ops. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | gre: Move MTU setting out of ipgre_tunnel_bind_devHerbert Xu2008-10-091-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the dev->mtu setting out of ipgre_tunnel_bind_dev. This is in prepartion of using rtnl_link where we'll need to make the MTU setting conditional on whether the user has supplied an MTU. This also requires the move of the ipgre_tunnel_bind_dev call out of the dev->init function so that we can access the user parameters later. This patch also adds a check to prevent setting the MTU below the minimum of 68. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | gre: Use needed_headroomHerbert Xu2008-10-091-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have dev->needed_headroom, we can use it instead of having a bogus dev->hard_header_len. This also allows us to include dev->hard_header_len in the MTU computation so that when we do have a meaningful hard_harder_len in future it is included automatically in figuring out the MTU. Incidentally, this fixes a bug where we ignored the needed_headroom field of the underlying device in calculating our own hard_header_len. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | dsa: Need to select PHYLIB.David S. Miller2008-10-081-0/+1
| | | | | | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | dsa: add support for the Marvell 88E6060 switch chipLennert Buytenhek2008-10-083-0/+295
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for the Marvell 88E6060 switch chip. This chip only supports the Header and Trailer tagging formats, and we use it in Trailer mode since that mode is slightly easier to handle than Header mode. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Byron Bradley <byron.bbradley@gmail.com> Tested-by: Tim Ellis <tim.ellis@mac.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | dsa: add support for Trailer tagging formatLennert Buytenhek2008-10-087-0/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for the Trailer switch tagging format. This is another tagging that doesn't explicitly mark tagged packets with a distinct ethertype, so that we need to add a similar hack in the receive path as for the Original DSA tagging format. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Byron Bradley <byron.bbradley@gmail.com> Tested-by: Tim Ellis <tim.ellis@mac.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | dsa: add support for the Marvell 88E6131 switch chipLennert Buytenhek2008-10-085-0/+555
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for the Marvell 88E6131 switch chip. This chip only supports the original (ethertype-less) DSA tagging format. On the 88E6131, there is a PHY Polling Unit (PPU) which has exclusive access to each of the PHYs's MII management registers. If we want to talk to the PHYs from software, we have to disable the PPU and wait for it to complete its current transaction before we can do so, and we need to re-enable the PPU afterwards to make sure that the switch will notice changes in link state and speed on the individual ports as they occur. Since disabling the PPU is rather slow, and since MII management accesses are typically done in bursts, this patch keeps the PPU disabled for 10ms after a software access completes. This makes handling the PPU slightly more complex, but speeds up something like running ethtool on one of the switch slave interfaces from ~300ms to ~30ms on typical hardware. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Nicolas Pitre <nico@marvell.com> Tested-by: Peter van Valderen <linux@ddcrew.com> Tested-by: Dirk Teurlings <dirk@upexia.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | dsa: add support for original DSA tagging formatLennert Buytenhek2008-10-088-7/+244
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the DSA switches currently in the field do not support the Ethertype DSA tagging format that one of the previous patches added support for, but only the original DSA tagging format. The original DSA tagging format carries the same information as the Ethertype DSA tagging format, but with the difference that it does not have an ethertype field. In other words, when receiving a packet that is tagged with an original DSA tag, there is no way of telling in eth_type_trans() that this packet is in fact a DSA-tagged packet. This patch adds a hook into eth_type_trans() which is only compiled in if support for a switch chip that doesn't support Ethertype DSA is selected, and which checks whether there is a DSA switch driver instance attached to this network device which uses the old tag format. If so, it sets the protocol field to ETH_P_DSA without looking at the packet, so that the packet ends up in the right place. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Nicolas Pitre <nico@marvell.com> Tested-by: Peter van Valderen <linux@ddcrew.com> Tested-by: Dirk Teurlings <dirk@upexia.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | net: Distributed Switch Architecture protocol supportLennert Buytenhek2008-10-0811-0/+1893
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Distributed Switch Architecture is a protocol for managing hardware switch chips. It consists of a set of MII management registers and commands to configure the switch, and an ethernet header format to signal which of the ports of the switch a packet was received from or is intended to be sent to. The switches that this driver supports are typically embedded in access points and routers, and a typical setup with a DSA switch looks something like this: +-----------+ +-----------+ | | RGMII | | | +-------+ +------ 1000baseT MDI ("WAN") | | | 6-port +------ 1000baseT MDI ("LAN1") | CPU | | ethernet +------ 1000baseT MDI ("LAN2") | |MIImgmt| switch +------ 1000baseT MDI ("LAN3") | +-------+ w/5 PHYs +------ 1000baseT MDI ("LAN4") | | | | +-----------+ +-----------+ The switch driver presents each port on the switch as a separate network interface to Linux, polls the switch to maintain software link state of those ports, forwards MII management interface accesses to those network interfaces (e.g. as done by ethtool) to the switch, and exposes the switch's hardware statistics counters via the appropriate Linux kernel interfaces. This initial patch supports the MII management interface register layout of the Marvell 88E6123, 88E6161 and 88E6165 switch chips, and supports the "Ethertype DSA" packet tagging format. (There is no officially registered ethertype for the Ethertype DSA packet format, so we just grab a random one. The ethertype to use is programmed into the switch, and the switch driver uses the value of ETH_P_EDSA for this, so this define can be changed at any time in the future if the one we chose is allocated to another protocol or if Ethertype DSA gets its own officially registered ethertype, and everything will continue to work.) Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Tested-by: Nicolas Pitre <nico@marvell.com> Tested-by: Byron Bradley <byron.bbradley@gmail.com> Tested-by: Tim Ellis <tim.ellis@mac.com> Tested-by: Peter van Valderen <linux@ddcrew.com> Tested-by: Dirk Teurlings <dirk@upexia.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'master' of ↵David S. Miller2008-10-087-34/+33
|\ \ \ \ | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c drivers/net/e1000e/netdev.c
| * | | tcp: Fix tcp_hybla zero congestion window growth with small rho and large cwnd.Daniele Lacamera2008-10-071-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because of rounding, in certain conditions, i.e. when in congestion avoidance state rho is smaller than 1/128 of the current cwnd, TCP Hybla congestion control starves and the cwnd is kept constant forever. This patch forces an increment by one segment after #send_cwnd calls without increments(newreno behavior). Signed-off-by: Daniele Lacamera <root@danielinux.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: Fix netdev_run_todo dead-lockHerbert Xu2008-10-072-22/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Benjamin Thery tracked down a bug that explains many instances of the error unregister_netdevice: waiting for %s to become free. Usage count = %d It turns out that netdev_run_todo can dead-lock with itself if a second instance of it is run in a thread that will then free a reference to the device waited on by the first instance. The problem is really quite silly. We were trying to create parallelism where none was required. As netdev_run_todo always follows a RTNL section, and that todo tasks can only be added with the RTNL held, by definition you should only need to wait for the very ones that you've added and be done with it. There is no need for a second mutex or spinlock. This is exactly what the following patch does. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | tcp: Fix possible double-ack w/ user dmaAli Saidi2008-10-071-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From: Ali Saidi <saidi@engin.umich.edu> When TCP receive copy offload is enabled it's possible that tcp_rcv_established() will cause two acks to be sent for a single packet. In the case that a tcp_dma_early_copy() is successful, copied_early is set to true which causes tcp_cleanup_rbuf() to be called early which can send an ack. Further along in tcp_rcv_established(), __tcp_ack_snd_check() is called and will schedule a delayed ACK. If no packets are processed before the delayed ack timer expires the packet will be acked twice. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: only invoke dev->change_rx_flags when device is UPPatrick McHardy2008-10-071-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jesper Dangaard Brouer <hawk@comx.dk> reported a bug when setting a VLAN device down that is in promiscous mode: When the VLAN device is set down, the promiscous count on the real device is decremented by one by vlan_dev_stop(). When removing the promiscous flag from the VLAN device afterwards, the promiscous count on the real device is decremented a second time by the vlan_change_rx_flags() callback. The root cause for this is that the ->change_rx_flags() callback is invoked while the device is down. The synchronization is meant to mirror the behaviour of the ->set_rx_mode callbacks, meaning the ->open function is responsible for doing a full sync on open, the ->close() function is responsible for doing full cleanup on ->stop() and ->change_rx_flags() is meant to do incremental changes while the device is UP. Only invoke ->change_rx_flags() while the device is UP to provide the intended behaviour. Tested-by: Jesper Dangaard Brouer <jdb@comx.dk> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netrom: Fix sock_orphan() use in nr_releaseJarek Poplawski2008-10-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While debugging another bug it was found that NetRom socks are sometimes seen unorphaned in sk_free(). This patch moves sock_orphan() in nr_release() to the beginning (like in ax25, or rose). Reported-and-tested-by: Bernard Pidoux f6bvp <f6bvp@free.fr> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ax25: Quick fix for making sure unaccepted sockets get destroyed.David S. Miller2008-10-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we reverted 30902dc3cb0ea1cfc7ac2b17bcf478ff98420d74 ("ax25: Fix std timer socket destroy handling.") we have to put some kind of fix in to cure the issue whereby unaccepted connections do not get destroyed. The approach used here is from Tihomir Heidelberg - 9a4gl Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Revert "ax25: Fix std timer socket destroy handling."David S. Miller2008-10-061-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 30902dc3cb0ea1cfc7ac2b17bcf478ff98420d74. It causes all kinds of problems, based upon a report by Bernard (f6bvp) and analysis by Jarek Poplawski. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | ipvs: Remove stray file left over from ipvs moveSven Wegener2008-10-081-261/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit cb7f6a7b716e801097b564dec3ccb58d330aef56 ("IPVS: Move IPVS to net/netfilter/ipvs") has left a stray file in the old location of ipvs. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tcpv6: fix option space offsets with md5Ilpo Järvinen2008-10-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | More breakage :-), part of timestamps just were previously overwritten. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'lvs-next-2.6' of ↵David S. Miller2008-10-0829-5/+272
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-2.6 Conflicts: net/netfilter/Kconfig
| * \ \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 into ↵Simon Horman2008-10-0788-1180/+4637
| |\ \ \ \ | | | | | | | | | | | | | | | | | | lvs-next-2.6
| * | | | | IPVS: Move IPVS to net/netfilter/ipvsJulius Volz2008-10-0729-3/+266
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since IPVS now has partial IPv6 support, this patch moves IPVS from net/ipv4/ipvs to net/netfilter/ipvs. It's a result of: $ git mv net/ipv4/ipvs net/netfilter and adapting the relevant Kconfigs/Makefiles to the new path. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | | | | ipvs: Fix unused label warningSven Wegener2008-09-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | | | | ipvs: Restrict sync message to 255 connectionsSven Wegener2008-09-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nr_conns variable in the sync message header is only eight bits wide and will overflow on interfaces with a large MTU. As a result the backup won't parse all connections contained in the sync buffer. On regular ethernet with an MTU of 1500 this isn't a problem, because we can't overflow the value, but consider jumbo frames being used on a cross-over connection between both directors. We now restrict the size of the sync buffer, so that we never put more than 255 connections into a single sync buffer. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Signed-off-by: Simon Horman <horms@verge.net.au>
* | | | | | sctp: shrink sctp_tsnmap some more by removing gabs arrayVlad Yasevich2008-10-082-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gabs array in the sctp_tsnmap structure is only used in one place, sctp_make_sack(). As such, carrying the array around in the sctp_tsnmap and thus directly in the sctp_association is rather pointless since most of the time it's just taking up space. Now, let sctp_make_sack create and populate it and then throw it away when it's done. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | sctp: Rework the tsn map to use generic bitmap.Vlad Yasevich2008-10-085-192/+166
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tsn map currently use is 4K large and is stuck inside the sctp_association structure making memory references REALLY expensive. What we really need is at most 4K worth of bits so the biggest map we would have is 512 bytes. Also, the map is only really usefull when we have gaps to store and report. As such, starting with minimal map of say 32 TSNs (bits) should be enough for normal low-loss operations. We can grow the map by some multiple of 32 along with some extra room any time we receive the TSN which would put us outside of the map boundry. As we close gaps, we can shift the map to rebase it on the latest TSN we've seen. This saves 4088 bytes per association just in the map alone along savings from the now unnecessary structure members. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | inet: cleanup of local_port_rangeEric Dumazet2008-10-082-20/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed sysctl_local_port_range[] and its associated seqlock sysctl_local_port_range_lock were on separate cache lines. Moreover, sysctl_local_port_range[] was close to unrelated variables, highly modified, leading to cache misses. Moving these two variables in a structure can help data locality and moving this structure to read_mostly section helps sharing of this data among cpus. Cleanup of extern declarations (moved in include file where they belong), and use of inet_get_local_port_range() accessor instead of direct access to ports values. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | udp: Improve port randomizationEric Dumazet2008-10-081-44/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current UDP port allocation is suboptimal. We select the shortest chain to chose a port (out of 512) that will hash in this shortest chain. First, it can lead to give not so ramdom ports and ease give attackers more opportunities to break the system. Second, it can consume a lot of CPU to scan all table in order to find the shortest chain. Third, in some pathological cases we can fail to find a free port even if they are plenty of them. This patch zap the search for a short chain and only use one random seed. Problem of getting long chains should be addressed in another way, since we can obtain long chains with non random ports. Based on a report and patch from Vitaly Mayatskikh Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | pkt_sched: Update qdisc requeue stats in dev_requeue_skb()Jarek Poplawski2008-10-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the last change of requeuing there is no info about such incidents in tc stats. This patch updates the counter, but we should consider this should differ from previous stats because of additional checks preventing to repeat this. On the other hand, previous stats didn't include requeuing of gso_segmented skbs. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | tcp: fix length used for checksum in a resetIlpo Järvinen2008-10-082-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While looking for some common code I came across difference in checksum calculation between tcp_v6_send_(reset|ack) I couldn't explain. I checked both v4 and v6 and found out that both seem to have the same "feature". I couldn't find anything in rfc nor anywhere else which would state that md5 option should be ignored like it was in case of reset so I came to a conclusion that this is probably a genuine bug. I suspect that addition of md5 just was fooled by the excessive copy-paste code in those functions and the reset part was never tested well enough to find out the problem. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | ipv6: remove unused not init_ipv6_mibs/cleanup_ipv6_mibsDenis V. Lunev2008-10-081-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | ipv6: making ip and icmp statistics per/namespaceDenis V. Lunev2008-10-082-23/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>