aboutsummaryrefslogtreecommitdiffstats
path: root/net/ipv4
Commit message (Collapse)AuthorAgeFilesLines
...
* | | | | ipv4: arp announce, arp_proxy and windows ip conflict verificationDenys Fedoryshchenko2009-03-131-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Windows (XP at least) hosts on boot, with configured static ip, performing address conflict detection, which is defined in RFC3927. Here is quote of important information: " An ARP announcement is identical to the ARP Probe described above, except that now the sender and target IP addresses are both set to the host's newly selected IPv4 address. " But it same time this goes wrong with RFC5227. " The 'sender IP address' field MUST be set to all zeroes; this is to avoid polluting ARP caches in other hosts on the same link in the case where the address turns out to be already in use by another host. " When ARP proxy configured, it must not answer to both cases, because it is address conflict verification in any case. For Windows it is just causing to detect false "ip conflict". Already there is code for RFC5227, so just trivially we just check also if source ip == target ip. Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying ↵Neil Horman2009-03-132-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | end-of-line points for skbs Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/skbuff.h | 4 +++- net/core/datagram.c | 2 +- net/core/skbuff.c | 22 ++++++++++++++++++++++ net/ipv4/arp.c | 2 +- net/ipv4/udp.c | 2 +- net/packet/af_packet.c | 2 +- 6 files changed, 29 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: allow timestamps even if SYN packet has tsval=0Eric Dumazet2009-03-111-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some systems send SYN packets with apparently wrong RFC1323 timestamp option values [timestamp tsval=0 tsecr=0]. It might be for security reasons (http://www.secuobs.com/plugs/25220.shtml ) Linux TCP stack ignores this option and sends back a SYN+ACK packet without timestamp option, thus many TCP flows cannot use timestamps and lose some benefit of RFC1323. Other operating systems seem to not care about initial tsval value, and let tcp flows to negotiate timestamp option. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: convert usage of packet_type to read_mostlyStephen Hemminger2009-03-102-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Protocols that use packet_type can be __read_mostly section for better locality. Elminate any unnecessary initializations of NULL. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: tcp_init_wl / tcp_update_wl argument cleanupHantzis Fotis2009-03-023-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The above functions from include/net/tcp.h have been defined with an argument that they never use. The argument is 'u32 ack' which is never used inside the function body, and thus it can be removed. The rest of the patch involves the necessary changes to the function callers of the above two functions. Signed-off-by: Hantzis Fotis <xantzis@ceid.upatras.gr> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: get rid of two unnecessary u16s in TCP skb flags copyingIlpo Järvinen2009-03-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I guess these fields were one day 16-bit in the struct but nowadays they're just using 8 bits anyway. This is just a precaution, didn't result any change in my case but who knows what all those varying gcc versions & options do. I've been told that 16-bit is not so nice with some cpus. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: in sendmsg/pages open code the real goto targetIlpo Järvinen2009-03-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | copied was assigned zero right before the goto, so if (copied) cannot ever be true. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: kill eff_sacks "cache", the sole user can calculate itselfIlpo Järvinen2009-03-023-21/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also fixes insignificant bug that would cause sending of stale SACK block (would occur in some corner cases). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: add helper for AI algorithmIlpo Järvinen2009-03-026-49/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems that implementation in yeah was inconsistent to what other did as it would increase cwnd one ack earlier than the others do. Size benefits: bictcp_cong_avoid | -36 tcp_cong_avoid_ai | +52 bictcp_cong_avoid | -34 tcp_scalable_cong_avoid | -36 tcp_veno_cong_avoid | -12 tcp_yeah_cong_avoid | -38 = -104 bytes total Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | htcp: merge icsk_ca_state compareIlpo Järvinen2009-03-021-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to what is done elsewhere in TCP code when double state checks are being done. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: drop unnecessary local var in collapseIlpo Järvinen2009-03-021-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: cleanup ca_state mess in tcp_timerIlpo Järvinen2009-03-021-13/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Redundant checks made indentation impossible to follow. However, it might be useful to make this ca_state+is_sack indexed array. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: separate timeout marking loop to it's own functionIlpo Järvinen2009-03-021-24/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some comment about its current state added. So far I have seen very few cases where the thing is actually useful, usually just marginally (though admittedly I don't usually see top of window losses where it seems possible that there could be some gain), instead, more often the cases suffer from L-marking spike which is certainly not desirable (I'll bury improving it to my todo list, but on a low prio position). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: remove redundant code from tcp_mark_lost_retransIlpo Järvinen2009-03-021-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arnd Hannemann <hannemann@nets.rwth-aachen.de> noticed and was puzzled by the fact that !tcp_is_fack(tp) leads to early return near the beginning and the later on tcp_is_fack(tp) was still used in an if condition. The later check was a left-over from RFC3517 SACK stuff (== !tcp_is_fack(tp) behavior nowadays) as there wasn't clear way how to handle this particular check cheaply in the spirit of RFC3517 (using only SACK blocks, not holes + SACK blocks as with FACK). I sort of left it there as a reminder but since it's confusing other people just remove it and comment the missing-feature stuff instead. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Arnd Hannemann <hannemann@nets.rwth-aachen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: fix corner case issue in segmentation during rexmittingIlpo Järvinen2009-03-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If cur_mss grew very recently so that the previously G/TSOed skb now fits well into a single segment it would get send up in parts unless we calculate # of segments again. This corner-case could happen eg. after mtu probe completes or less than previously sack blocks are required for the opposite direction. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: Don't clear hints when tcp_fragmentingIlpo Järvinen2009-03-021-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1) We didn't remove any skbs, so no need to handle stale refs. 2) scoreboard_skb_hint is trivial, no timestamps were changed so no need to clear that one 3) lost_skb_hint needs tweaking similar to that of tcp_sacktag_one(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: deferring in middle of queue makes very little senseIlpo Järvinen2009-03-021-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If skb can be sent right away, we certainly should do that if it's in the middle of the queue because it won't get more data into it. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: fix lost_cnt_hint miscountsIlpo Järvinen2009-03-021-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible that lost_cnt_hint gets underflow in tcp_clean_rtx_queue because the cumulative ACK can cover the segment where lost_skb_hint points to only partially, which means that the hint is not cleared, opposite to what my (earlier) comment claimed. Also I don't agree what I ended up writing about non-trivial case there to be what I intented to say. It was not supposed to happen that the hint won't get cleared and we underflow in any scenario. In general, this is quite hard to trigger in practice. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | tcp: don't backtrack to sacked skbsIlpo Järvinen2009-03-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backtracking to sacked skbs is a horrible performance killer since the hint cannot be advanced successfully past them... ...And it's totally unnecessary too. In theory this is 2.6.27..28 regression but I doubt anybody can make .28 to have worse performance because of other TCP improvements. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge branch 'master' of ↵David S. Miller2009-03-011-4/+5
|\ \ \ \ \ | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-tx.c net/8021q/vlan_core.c net/core/dev.c
| * | | | tcp: fix retrans_out leaksIlpo Järvinen2009-03-011-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's conflicting assumptions in shifting, the caller assumes that dupsack results in S'ed skbs (or a part of it) for sure but never gave a hint to tcp_sacktag_one when dsack is actually in use. Thus DSACK retrans_out -= pcount was not taken and the counter became out of sync. Remove obstacle from that information flow to get DSACKs accounted in tcp_sacktag_one as expected. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | inet fragments: fix sparse warning: context imbalanceHannes Eder2009-02-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: Attribute function with __releases(...) Fix this sparse warning: net/ipv4/inet_fragment.c:276:35: warning: context imbalance in 'inet_frag_find' - unexpected unlock Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge branch 'master' of ↵David S. Miller2009-02-251-1/+1
|\ \ \ \ \ | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/orinoco/orinoco.c
| * | | | tcp_scalable: Update malformed & dead urlJoe Perches2009-02-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | ipip: used time_before for comparing jiffiesWei Yongjun2009-02-241-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The functions time_before is more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | gre: used time_before for comparing jiffiesWei Yongjun2009-02-241-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The functions time_before is more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | netlink: change nlmsg_notify() return value logicPablo Neira Ayuso2009-02-242-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the return value of nlmsg_notify() as follows: If NETLINK_BROADCAST_ERROR is set by any of the listeners and an error in the delivery happened, return the broadcast error; else if there are no listeners apart from the socket that requested a change with the echo flag, return the result of the unicast notification. Thus, with this patch, the unicast notification is handled in the same way of a broadcast listener that has set the NETLINK_BROADCAST_ERROR socket flag. This patch is useful in case that the caller of nlmsg_notify() wants to know the result of the delivery of a netlink notification (including the broadcast delivery) and take any action in case that the delivery failed. For example, ctnetlink can drop packets if the event delivery failed to provide reliable logging and state-synchronization at the cost of dropping packets. This patch also modifies the rtnetlink code to ignore the return value of rtnl_notify() in all callers. The function rtnl_notify() (before this patch) returned the error of the unicast notification which makes rtnl_set_sk_err() reports errors to all listeners. This is not of any help since the origin of the change (the socket that requested the echoing) notices the ENOBUFS error if the notification fails and should resync itself. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller2009-02-242-3/+7
|\ \ \ \ \ | |/ / / /
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2009-02-231-1/+0
| |\ \ \ \ | | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netns: fix double free at netns creation veth : add the set_mac_address capability sunlance: Beyond ARRAY_SIZE of ib->btx_ring sungem: another error printed one too early ISDN: fix sc/shmem printk format warning SMSC: timeout reaches -1 smsc9420: handle magic field of ethtool_eeprom sundance: missing parentheses? smsc9420: fix another postfixed timeout wimax/i2400m: driver loads firmware v1.4 instead of v1.3 vlan: Update skb->mac_header in __vlan_put_tag(). cxgb3: Add support for PCI ID 0x35. tcp: remove obsoleted comment about different passes TG3: &&/|| confusion ATM: misplaced parentheses? net/mv643xx: don't disable the mib timer too early and lock properly net/mv643xx: use GFP_ATOMIC while atomic atl1c: Atheros L1C Gigabit Ethernet driver net: Kill skb_truesize_check(), it only catches false-positives. net: forcedeth: Fix wake-on-lan regression
| | * | | tcp: remove obsoleted comment about different passesIlpo Järvinen2009-02-181-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is obsolete since the passes got combined. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | cipso: Fix documentation commentPaul Moore2009-02-231-2/+7
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CIPSO protocol engine incorrectly stated that the FIPS-188 specification could be found in the kernel's Documentation directory. This patch corrects that by removing the comment and directing users to the FIPS-188 documented hosted online. For the sake of completeness I've also included a link to the CIPSO draft specification on the NetLabel website. Thanks to Randy Dunlap for spotting the error and letting me know. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
* | | | Doc: Refer to ip-sysctl.txt for strict vs. loose rp_filter modeJesper Dangaard Brouer2009-02-241-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IP_ADVANCED_ROUTER Kconfig describes the rp_filter proc option. Recent changes added a loose mode. Instead of documenting this change too places, refer to the document describing it: Documentation/networking/ip-sysctl.txt I'm considering moving the rp_filter description away from the Kconfig file into ip-sysctl.txt. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tcp: Like icmp use register_pernet_subsysEric W. Biederman2009-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To remove the possibility of packets flying around when network devices are being cleaned up use reisger_pernet_subsys instead of register_pernet_device. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | netns: Fix icmp shutdown.Eric W. Biederman2009-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently I had a kernel panic in icmp_send during a network namespace cleanup. There were packets in the arp queue that failed to be sent and we attempted to generate an ICMP host unreachable message, but failed because icmp_sk_exit had already been called. The network devices are removed from a network namespace and their arp queues are flushed before we do attempt to shutdown subsystems so this error should have been impossible. It turns out icmp_init is using register_pernet_device instead of register_pernet_subsys. Which resulted in icmp being shut down while we still had the possibility of packets in flight, making a nasty NULL pointer deference in interrupt context possible. Changing this to register_pernet_subsys fixes the problem in my testing. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | ipv4: Clean whitespaces in net/ipv4/Kconfig.Jesper Dangaard Brouer2009-02-221-21/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | While going through net/ipv4/Kconfig cleanup whitespaces. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | ipv4: Fix rp_filter description in net/ipv4/Kconfig.Jesper Dangaard Brouer2009-02-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reverse path filter (rp_filter) will NOT get enabled when enabling forwarding. Read the code and tested in in practice. Most distributions do enable it in startup scripts. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | ip: ipip compile warningStephen Hemminger2009-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of compile warning about non-const format Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | ip: add loose reverse path filteringStephen Hemminger2009-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend existing reverse path filter option to allow strict or loose filtering. (See http://en.wikipedia.org/wiki/Reverse_path_filtering). For compatibility with existing usage, the value 1 is chosen for strict mode and 2 for loose mode. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | tcp: Always set urgent pointer if it's beyond snd_nxtHerbert Xu2009-02-211-4/+8
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our TCP stack does not set the urgent flag if the urgent pointer does not fit in 16 bits, i.e., if it is more than 64K from the sequence number of a packet. This behaviour is different from the BSDs, and clearly contradicts the purpose of urgent mode, which is to send the notification (though not necessarily the associated data) as soon as possible. Our current behaviour may in fact delay the urgent notification indefinitely if the receiver window does not open up. Simply matching BSD however may break legacy applications which incorrectly rely on the out-of-band delivery of urgent data, and conversely the in-band delivery of non-urgent data. Alexey Kuznetsov suggested a safe solution of following BSD only if the urgent pointer itself has not yet been transmitted. This way we guarantee that when the remote end sees the packet with non-urgent data marked as urgent due to wrap-around we would have advanced the urgent pointer beyond, either to the actual urgent data or to an as-yet untransmitted packet. The only potential downside is that applications on the remote end may see multiple SIGURG notifications. However, this would occur anyway with other TCP stacks. More importantly, the outcome of such a duplicate notification is likely to be harmless since the signal itself does not carry any information other than the fact that we're in urgent mode. Thanks to Ilpo Järvinen for fixing a critical bug in this and Jeff Chua for reporting that bug. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: replace commatas with semicolonsThomas Gleixner2009-02-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: syntax fix Interestingly enough this compiles w/o any complaints: orphans = percpu_counter_sum_positive(&tcp_orphan_count), sockets = percpu_counter_sum_positive(&tcp_sockets_allocated), Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | ip: support for TX timestamps on UDP and RAW socketsPatrick Ohly2009-02-154-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | Instructions for time stamping outgoing packets are take from the socket layer and later copied into the new skb. Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | gro: Optimise TCP packet receptionHerbert Xu2009-02-081-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gro: Optimise TCP packet reception As this function can be called more than half a million times for 10GbE, it's important to optimise it as much as we can. This patch uses bit ops to logical ops, as well as open coding memcmp to exploit alignment properties. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | gro: Optimise IPv4 packet receptionHerbert Xu2009-02-081-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As this function can be called more than half a million times for 10GbE, it's important to optimise it as much as we can. This patch does some obvious changes to use 2-byte and 4-byte operations instead of byte-oriented ones where possible. Bit ops are also used to replace logical ops to reduce branching. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'master' of ↵David S. Miller2009-02-072-11/+10
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-agn.c drivers/net/wireless/iwlwifi/iwl3945-base.c
| * | udp: Fix potential wrong ip_hdr(skb) pointersJesper Dangaard Brouer2009-02-061-2/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | | | Like the UDP header fix, pskb_may_pull() can potentially alter the SKB buffer. Thus the saddr and daddr, pointers may point to the old skb->data buffer. I haven't seen corruptions, as its only seen if the old skb->data buffer were reallocated by another user and written into very quickly (or poison'd by SLAB debugging). Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Revert "tcp: Always set urgent pointer if it's beyond snd_nxt"David S. Miller2009-02-051-8/+4
| | | | | | | | | | | | | | | | This reverts commit 64ff3b938ec6782e6585a83d5459b98b0c3f6eb8. Jeff Chua reports that it breaks rlogin for him. Signed-off-by: David S. Miller <davem@davemloft.net>
| * udp: Fix UDP short packet false positiveJesper Dangaard Brouer2009-02-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | The UDP header pointer assignment must happen after calling pskb_may_pull(). As pskb_may_pull() can potentially alter the SKB buffer. This was exposted by running multicast traffic through the NIU driver, as it won't prepull the protocol headers into the linear area on receive. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipmr: use goto to common label instead of opencodingIlpo Järvinen2009-02-061-2/+1
| | | | | | | | | | Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2009-02-031-1/+3
|\ \ | |/ | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/Kconfig
| * udp: increments sk_drops in __udp_queue_rcv_skb()Eric Dumazet2009-02-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | Commit 93821778def10ec1e69aa3ac10adee975dad4ff3 (udp: Fix rcv socket locking) accidentally removed sk_drops increments for UDP IPV4 sockets. This field can be used to detect incorrect sizing of socket receive buffers. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>