aboutsummaryrefslogtreecommitdiffstats
path: root/net/core
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | net: use batched device unregister in veth and macvlanEric Dumazet2011-05-091-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | veth devices dont use the batched device unregisters yet. Since veth are a pair of devices, it makes sense to use a batch of two unregisters, this roughly divides dismantle time by two. Fix this by changing dellink() callers to always provide a non NULL head. (Idea from Michał Mirosław) This patch also handles macvlan case : We now dismantle all macvlans on top of a lower dev at once. Reported-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Michał Mirosław <mirqus@gmail.com> Cc: Jesse Gross <jesse@nicira.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: Allow ethtool to set interface in loopback mode.Mahesh Bandewar2011-05-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables ethtool to set the loopback mode on a given interface. By configuring the interface in loopback mode in conjunction with a policy route / rule, a userland application can stress the egress / ingress path exposing the flows of the change in progress and potentially help developer(s) understand the impact of those changes without even sending a packet out on the network. Following set of commands illustrates one such example - a) ip -4 addr add 192.168.1.1/24 dev eth1 b) ip -4 rule add from all iif eth1 lookup 250 c) ip -4 route add local 0/0 dev lo proto kernel scope host table 250 d) arp -Ds 192.168.1.100 eth1 e) arp -Ds 192.168.1.200 eth1 f) sysctl -w net.ipv4.ip_nonlocal_bind=1 g) sysctl -w net.ipv4.conf.all.accept_local=1 # Assuming that the machine has 8 cores h) taskset 000f netserver -L 192.168.1.200 i) taskset 00f0 netperf -t TCP_CRR -L 192.168.1.100 -H 192.168.1.200 -l 30 Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | pktgen: use %pI6c for printing IPv6 addressesAlexey Dobriyan2011-05-081-96/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I don't know why %pI6 doesn't compress, but the format specifier is kernel-standard, so use it. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ethtool: remove phys_id from ethtool_opsStephen Hemminger2011-05-081-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After that all the upstream kernel drivers now use phys_id, and the old ethtool_ops interface (phys_id) can be removed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge branch 'master' of ↵David S. Miller2011-05-051-3/+3
| |\ \ \ \ | | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/tg3.c
| * | | | net: call dev_alloc_name from register_netdeviceJiri Pirko2011-05-052-26/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Force dev_alloc_name() to be called from register_netdevice() by dev_get_valid_name(). That allows to remove multiple explicit dev_alloc_name() calls. The possibility to call dev_alloc_name in advance remains. This also fixes veth creation regresion caused by 84c49d8c3e4abefb0a41a77b25aa37ebe8d6b743 Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: dont hold rtnl mutex during netlink dump callbacksEric Dumazet2011-05-022-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Four years ago, Patrick made a change to hold rtnl mutex during netlink dump callbacks. I believe it was a wrong move. This slows down concurrent dumps, making good old /proc/net/ files faster than rtnetlink in some situations. This occurred to me because one "ip link show dev ..." was _very_ slow on a workload adding/removing network devices in background. All dump callbacks are able to use RCU locking now, so this patch does roughly a revert of commits : 1c2d670f366 : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks 6313c1e0992 : [RTNETLINK]: Remove unnecessary locking in dump callbacks This let writers fight for rtnl mutex and readers going full speed. It also takes care of phonet : phonet_route_get() is now called from rcu read section. I renamed it to phonet_route_get_rcu() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Remi Denis-Courmont <remi.denis-courmont@nokia.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ethtool: Call ethtool's get/set_settings callbacks with cleaned dataDavid Decotigny2011-04-292-14/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes sure that when a driver calls the ethtool's get/set_settings() callback of another driver, the data passed to it is clean. This guarantees that speed_hi will be zeroed correctly if the called callback doesn't explicitely set it: we are sure we don't get a corrupted speed from the underlying driver. We also take care of setting the cmd field appropriately (ETHTOOL_GSET/SSET). This applies to dev_ethtool_get_settings(), which now makes sure it sets up that ethtool command parameter correctly before passing it to drivers. This also means that whoever calls dev_ethtool_get_settings() does not have to clean the ethtool command parameter. This function also becomes an exported symbol instead of an inline. All drivers visible to make allyesconfig under x86_64 have been updated. Signed-off-by: David Decotigny <decot@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | pktgen: create num frags requestedamit salecha2011-04-291-26/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pktgen doesn't generate number of frags requested. Divide packet size by number of frags and fill that in every frags. Example: With packet size 1470, it generate only 11 frags. Initial frags get lenght 706, 353, 177....so on. Last frag get divided by 2. Now with this fix, each frags will get 78 bytes. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: Use non-zero allocations in dst_alloc().David S. Miller2011-04-281-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make dst_alloc() and it's users explicitly initialize the entire entry. The zero'ing done by kmem_cache_zalloc() was almost entirely redundant. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: Make dst_alloc() take more explicit initializations.David S. Miller2011-04-281-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now the dst->dev, dev->obsolete, and dst->flags values can be specified as well. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: fix netdev_increment_features()Michał Mirosław2011-04-281-24/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify and fix netdev_increment_features() to conform to what is stated in netdevice.h comments about NETIF_F_ONE_FOR_ALL. Include FCoE segmentation and VLAN-challedged flags in computation. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: filter: Just In Time compiler for x86-64Eric Dumazet2011-04-272-60/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to speedup packet filtering, here is an implementation of a JIT compiler for x86_64 It is disabled by default, and must be enabled by the admin. echo 1 >/proc/sys/net/core/bpf_jit_enable It uses module_alloc() and module_free() to get memory in the 2GB text kernel range since we call helpers functions from the generated code. EAX : BPF A accumulator EBX : BPF X accumulator RDI : pointer to skb (first argument given to JIT function) RBP : frame pointer (even if CONFIG_FRAME_POINTER=n) r9d : skb->len - skb->data_len (headlen) r8 : skb->data To get a trace of generated code, use : echo 2 >/proc/sys/net/core/bpf_jit_enable Example of generated code : # tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24 flen=18 proglen=147 pass=3 image=ffffffffa00b5000 JIT code: ffffffffa00b5000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 60 JIT code: ffffffffa00b5010: 44 2b 4f 64 4c 8b 87 b8 00 00 00 be 0c 00 00 00 JIT code: ffffffffa00b5020: e8 24 7b f7 e0 3d 00 08 00 00 75 28 be 1a 00 00 JIT code: ffffffffa00b5030: 00 e8 fe 7a f7 e0 24 00 3d 00 14 a8 c0 74 49 be JIT code: ffffffffa00b5040: 1e 00 00 00 e8 eb 7a f7 e0 24 00 3d 00 14 a8 c0 JIT code: ffffffffa00b5050: 74 36 eb 3b 3d 06 08 00 00 74 07 3d 35 80 00 00 JIT code: ffffffffa00b5060: 75 2d be 1c 00 00 00 e8 c8 7a f7 e0 24 00 3d 00 JIT code: ffffffffa00b5070: 14 a8 c0 74 13 be 26 00 00 00 e8 b5 7a f7 e0 24 JIT code: ffffffffa00b5080: 00 3d 00 14 a8 c0 75 07 b8 ff ff 00 00 eb 02 31 JIT code: ffffffffa00b5090: c0 c9 c3 BPF program is 144 bytes long, so native program is almost same size ;) (000) ldh [12] (001) jeq #0x800 jt 2 jf 8 (002) ld [26] (003) and #0xffffff00 (004) jeq #0xc0a81400 jt 16 jf 5 (005) ld [30] (006) and #0xffffff00 (007) jeq #0xc0a81400 jt 16 jf 17 (008) jeq #0x806 jt 10 jf 9 (009) jeq #0x8035 jt 10 jf 17 (010) ld [28] (011) and #0xffffff00 (012) jeq #0xc0a81400 jt 16 jf 13 (013) ld [38] (014) and #0xffffff00 (015) jeq #0xc0a81400 jt 16 jf 17 (016) ret #65535 (017) ret #0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | bonding: move processing of recv handlers into handle_frame()Jiri Pirko2011-04-251-21/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since now when bonding uses rx_handler, all traffic going into bond device goes thru bond_handle_frame. So there's no need to go back into bonding code later via ptype handlers. This patch converts original ptype handlers into "bonding receive probes". These functions are called from bond_handle_frame and they are registered per-mode. Note that vlan packets are also handled because they are always untagged thanks to vlan_untag() Note that this also allows arpmon for eth-bond-bridge-vlan topology. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: make WARN_ON in dev_disable_lro() usefulMichał Mirosław2011-04-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | inet: constify ip headers and in6_addrEric Dumazet2011-04-222-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: fix hw_features ethtool_ops->set_flags compatibilityMichał Mirosław2011-04-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __ethtool_set_flags() was not taking into account features set but not user-toggleable. Since GFLAGS returns masked dev->features, EINVAL is returned when passed flags differ to it, and not to wanted_features. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge branch 'master' of ↵David S. Miller2011-04-191-3/+7
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x_ethtool.c
| * | | | | pktgen: Fix set-but-unused variable.David S. Miller2011-04-171-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "iph" in pktgen_output_ipsec() is set but never actually used. Kill it off. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: minor cleanup to net_namespace.c.Rob Landley2011-04-151-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inline a small static function that's only ever called from one place. Signed-off-by: Rob Landley <rlandley@parallels.com> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | ethtool: allow custom interval for physical identificationAllan, Bruce W2011-04-141-15/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When physical identification of an adapter is done by toggling the mechanism on and off through software utilizing the set_phys_id operation, it is done with a fixed duration for both on and off states. Some drivers may want to set a custom duration for the on/off intervals. This patch changes the API so the return code from the driver's entry point when it is called with ETHTOOL_ID_ACTIVE can specify the frequency at which to cycle the on/off states, and updates the drivers that have already been converted to use the new set_phys_id and use the synchronous method for identifying an adapter. The physical identification frequency set in the updated drivers is based on how it was done prior to the introduction of set_phys_id. Compile tested only. Also fixes a compiler warning in sfc. v2: drivers do not return -EINVAL for ETHOOL_ID_ACTIVE v3: fold patchset into single patch and cleanup per Ben's feedback Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Eilon Greenstein <eilong@broadcom.com> Cc: Divy Le Ray <divy@chelsio.com> Cc: Don Fry <pcnet32@frontier.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Steve Hodgson <shodgson@solarflare.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Matt Carlson <mcarlson@broadcom.com> Acked-by: Jon Mason <jdmason@kudzu.us> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: ethtool support to configure number of channelsamit salecha2011-04-131-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ethtool support to configure RX, TX and other channels. combined field in struct ethtool_channels to reflect set of channel (RX, TX or other). Other channel can be link interrupts, SR-IOV coordination etc. ETHTOOL_GCHANNELS will report max and current number of RX channels, max and current number of TX channels, max and current number of other channel or max and current number of combined channel. Number of channel can be modify upto max number of channel through ETHTOOL_SCHANNELS command. Ben Hutchings: o define 'combined' and 'other' types. Most multiqueue drivers pair up RX and TX queues so that most channels combine RX and TX work. o Please could you use a kernel-doc comment to describe the structure. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: add RTNL_ASSERT in __netdev_update_features()Michał Mirosław2011-04-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | net: vlan: make non-hw-accel rx path similar to hw-accelJiri Pirko2011-04-121-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now there are 2 paths for rx vlan frames. When rx-vlan-hw-accel is enabled, skb is untagged by NIC, vlan_tci is set and the skb gets into vlan code in __netif_receive_skb - vlan_hwaccel_do_receive. For non-rx-vlan-hw-accel however, tagged skb goes thru whole __netif_receive_skb, it's untagged in ptype_base hander and reinjected This incosistency is fixed by this patch. Vlan untagging happens early in __netif_receive_skb so the rest of code (ptype_all handlers, rx_handlers) see the skb like it was untagged by hw. Signed-off-by: Jiri Pirko <jpirko@redhat.com> v1->v2: remove "inline" from vlan_core.c functions Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | ethtool: time to blink provided in seconds not jiffiesAllan, Bruce W2011-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When blinking for a duration set by the user, the value specified is in seconds but it is used as the number of jiffies in the timeout after which the Physical ID indicator is deactivated. Fix by converting the timeout to seconds. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | Merge branch 'master' of ↵David S. Miller2011-04-116-16/+16
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/smsc911x.c
| * | | | | | ethtool: prevent null pointer dereference with NTUPLE set but no set_rx_ntupleAlexander Duyck2011-04-111-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change is meant to prevent a possible null pointer dereference if NETIF_F_NTUPLE is defined but the set_rx_ntuple function pointer is not. The main motivation behind this patch is to eventually replace the ntuple interfaces entirely with the network flow classifier interfaces. This allows the device drivers to maintain the ntuple check internally while using the network flow classifier interface for setting up and displaying rules. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Merge branch 'for-davem' of ↵David S. Miller2011-04-061-2/+53
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6
| | * | | | | | ethtool: Change ETHTOOL_PHYS_ID implementation to allow dropping RTNLBen Hutchings2011-04-051-2/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ethtool ETHTOOL_PHYS_ID command runs for an arbitrarily long period of time, holding the RTNL lock. This blocks routing updates, device enumeration, and various important operations that one might want to keep running while hunting for the flashing LED. We need to drop the RTNL lock during this operation, but currently the core implementation is a thin wrapper around a driver operation and drivers may well depend upon holding the lock. Define a new driver operation 'set_phys_id' with an argument that sets the ID indicator on/off/inactive/active (the last optional, for any driver or firmware that prefers to handle blinking asynchronously). When this is defined, the ethtool core drops the lock while waiting and only acquires it around calls to this operation. Deprecate the 'phys_id' operation in favour of this. It can be removed once all in-tree drivers are converted. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
| * | | | | | | net: Allow no-cache copy from user on transmitTom Herbert2011-04-042-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch uses __copy_from_user_nocache on transmit to bypass data cache for a performance improvement. skb_add_data_nocache and skb_copy_to_page_nocache can be called by sendmsg functions to use this feature, initial support is in tcp_sendmsg. This functionality is configurable per device using ethtool. Presumably, this feature would only be useful when the driver does not touch the data. The feature is turned on by default if a device indicates that it does some form of checksum offload; it is off by default for devices that do no checksum offload or indicate no checksum is necessary. For the former case copy-checksum is probably done anyway, in the latter case the device is likely loopback in which case the no cache copy is probably not beneficial. This patch was tested using 200 instances of netperf TCP_RR with 1400 byte request and one byte reply. Platform is 16 core AMD x86. No-cache copy disabled: 672703 tps, 97.13% utilization 50/90/99% latency:244.31 484.205 1028.41 No-cache copy enabled: 702113 tps, 96.16% utilization, 50/90/99% latency 238.56 467.56 956.955 Using 14000 byte request and response sizes demonstrate the effects more dramatically: No-cache copy disabled: 79571 tps, 34.34 %utlization 50/90/95% latency 1584.46 2319.59 5001.76 No-cache copy enabled: 83856 tps, 34.81% utilization 50/90/95% latency 2508.42 2622.62 2735.88 Note especially the effect on latency tail (95th percentile). This seems to provide a nice performance improvement and is consistent in the tests I ran. Presumably, this would provide the greatest benfits in the presence of an application workload stressing the cache and a lot of transmit data happening. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | net: Call netdev_features_change() from netdev_update_features()Michał Mirosław2011-04-022-9/+20
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue FEAT_CHANGE notification when features are changed by netdev_update_features(). This will allow changes made by extra constraints on e.g. MTU change to be properly propagated like changes via ethtool. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | sanitize <linux/prefetch.h> usageLinus Torvalds2011-05-203-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e66eed651fd1 ("list: remove prefetching from regular list iterators") removed the include of prefetch.h from list.h, which uncovered several cases that had apparently relied on that rather obscure header file dependency. So this fixes things up a bit, using grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]') grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]') to guide us in finding files that either need <linux/prefetch.h> inclusion, or have it despite not needing it. There are more of them around (mostly network drivers), but this gets many core ones. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2011-05-195-65/+12
|\ \ \ \ \ \ \ | |_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits) Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree() batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu() net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu() net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu() perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu() perf,rcu: convert call_rcu(free_ctx) to kfree_rcu() net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(net_generic_release) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu() security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu() net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu() net,rcu: convert call_rcu(xps_map_release) to kfree_rcu() net,rcu: convert call_rcu(rps_map_release) to kfree_rcu() ...
| * | | | | | net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()Lai Jiangshan2011-05-071-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu callback net_generic_release() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(net_generic_release). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
| * | | | | | net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()Lai Jiangshan2011-05-071-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu callback xps_dev_maps_release() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(xps_dev_maps_release). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
| * | | | | | net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()Lai Jiangshan2011-05-071-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu callback xps_map_release() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(xps_map_release). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
| * | | | | | net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()Lai Jiangshan2011-05-071-9/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu callback rps_map_release() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(rps_map_release). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
| * | | | | | net,rcu: convert call_rcu(free_dm_hw_stat) to kfree_rcu()Lai Jiangshan2011-05-071-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu callback free_dm_hw_stat() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(free_dm_hw_stat). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
| * | | | | | net,rcu: convert call_rcu(__gen_kill_estimator) to kfree_rcu()Lai Jiangshan2011-05-071-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu callback __gen_kill_estimator() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(__gen_kill_estimator). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
| * | | | | | net,rcu: convert call_rcu(ha_rcu_free) to kfree_rcu()Lai Jiangshan2011-05-071-10/+2
| | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rcu callback ha_rcu_free() just calls a kfree(), so we use kfree_rcu() instead of the call_rcu(ha_rcu_free). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
* | | | | | net: Change netdev_fix_features messages loglevelMichał Mirosław2011-05-161-14/+8
| |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Those reduced to DEBUG can possibly be triggered by unprivileged processes and are nothing exceptional. Illegal checksum combinations can only be caused by driver bug, so promote those messages to WARN. Since GSO without SG will now only cause DEBUG message from netdev_fix_features(), remove the workaround from register_netdevice(). Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | net: dev_close() should check IFF_UPEric Dumazet2011-05-101-4/+6
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 443457242beb (factorize sync-rcu call in unregister_netdevice_many) mistakenly removed one test from dev_close() Following actions trigger a BUG : modprobe bonding modprobe dummy ifconfig bond0 up ifenslave bond0 dummy0 rmmod dummy dev_close() must not close a non IFF_UP device. With help from Frank Blaschka and Einar EL Lueck Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Reported-by: Einar EL Lueck <ELELUECK@de.ibm.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | networking: inappropriate ioctl operation should return ENOTTYLifeng Sun2011-05-021-3/+3
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ioctl() calls against a socket with an inappropriate ioctl operation are incorrectly returning EINVAL rather than ENOTTY: [ENOTTY] Inappropriate I/O control operation. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=33992 Signed-off-by: Lifeng Sun <lifongsun@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: Disable NETIF_F_TSO_ECN when TSO is disabledBen Hutchings2011-04-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | NETIF_F_TSO_ECN has no effect when TSO is disabled; this just means that feature state will be accurately reported to user-space. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | net: Disable all TSO features when SG is disabledBen Hutchings2011-04-121-3/+3
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The feature flags NETIF_F_TSO and NETIF_F_TSO6 independently enable TSO for IPv4 and IPv6 respectively. However, the test in netdev_fix_features() and its predecessor functions was never updated to check for NETIF_F_TSO6, possibly because it was originally proposed that TSO for IPv6 would be dependent on both feature flags. Now that these feature flags can be changed independently from user-space and we depend on netdev_fix_features() to fix invalid feature combinations, it's important to disable them both if scatter-gather is disabled. Also disable NETIF_F_TSO_ECN so user-space sees all TSO features as disabled. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6Linus Torvalds2011-04-076-16/+16
|\ \ | |/ |/| | | | | * 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6: Fix common misspellings
| * Fix common misspellingsLucas De Marchi2011-03-316-16/+16
| | | | | | | | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* | netdev: fix mtu check when TSO is enabledDaniel Lezcano2011-03-301-2/+22
|/ | | | | | | | | | | | | | | | In case the device where is coming from the packet has TSO enabled, we should not check the mtu size value as this one could be bigger than the expected value. This is the case for the macvlan driver when the lower device has TSO enabled. The macvlan inherit this feature and forward the packets without fragmenting them. Then the packets go through dev_forward_skb and are dropped. This patch fix this by checking TSO is not enabled when we want to check the mtu size. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2011-03-292-55/+17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) xfrm: Restrict extended sequence numbers to esp xfrm: Check for esn buffer len in xfrm_new_ae xfrm: Assign esn pointers when cloning a state xfrm: Move the test on replay window size into the replay check functions netdev: bfin_mac: document TE setting in RMII modes drivers net: Fix declaration ordering in inline functions. cxgb3: Apply interrupt coalescing settings to all queues net: Always allocate at least 16 skb frags regardless of page size ipv4: Don't ip_rt_put() an error pointer in RAW sockets. net: fix ethtool->set_flags not intended -EINVAL return value mlx4_en: Fix loss of promiscuity tg3: Fix inline keyword usage tg3: use <linux/io.h> and <linux/uaccess.h> instead <asm/io.h> and <asm/uaccess.h> net: use CHECKSUM_NONE instead of magic number Net / jme: Do not use legacy PCI power management myri10ge: small rx_done refactoring bridge: notify applications if address of bridge device changes ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in ip_options_echo() can: c_can: Fix tx_bytes accounting can: c_can_platform: fix irq check in probe ...
| * net: fix ethtool->set_flags not intended -EINVAL return valueStanislaw Gruszka2011-03-271-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit d5dbda23804156ae6f35025ade5307a49d1db6d7 "ethtool: Add support for vlan accleration.", drivers that have NETIF_F_HW_VLAN_TX, and/or NETIF_F_HW_VLAN_RX feature, but do not allow enable/disable vlan acceleration via ethtool set_flags, always return -EINVAL from that function. Fix by returning -EINVAL only if requested features do not match current settings and can not be changed by driver. Change any driver that define ethtool->set_flags to use ethtool_invalid_flags() to avoid similar problems in the future (also on drivers that do not have the problem). Tested with modified (to reproduce this bug) myri10ge driver. Cc: stable@kernel.org # 2.6.37+ Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>