aboutsummaryrefslogtreecommitdiffstats
path: root/net/sched
Commit message (Collapse)AuthorAgeFilesLines
...
* | act_nat: use stack variableChangli Gao2010-06-301-21/+10
| | | | | | | | | | | | | | | | | | | | | | | | act_nat: use stack variable structure tc_nat isn't too big for stack, so we can put it in stack. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- net/sched/act_nat.c | 31 ++++++++++--------------------- 1 file changed, 10 insertions(+), 21 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* | act_mirred: combine duplicate codeChangli Gao2010-06-301-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | act_mirred: combine duplicate code tcf_bstats is updated in any way, so we can do it earlier to reduce the size of the code. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> ---- net/sched/act_mirred.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* | act_mirred: don't clone skb when skb isn't sharedChangli Gao2010-06-281-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | don't clone skb when skb isn't shared When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need to clone a new skb. As the skb will be freed after this function returns, we can use it freely once we get a reference to it. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- include/net/sch_generic.h | 11 +++++++++-- net/sched/act_mirred.c | 6 +++--- 2 files changed, 12 insertions(+), 5 deletions(-) Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2010-06-231-0/+1
|\ \ | |/ | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/ip_output.c
| * Clear IFF_XMIT_DST_RELEASE for teql interfacesTom Hughes2010-06-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://bugzilla.kernel.org/show_bug.cgi?id=16183 The sch_teql module, which can be used to load balance over a set of underlying interfaces, stopped working after 2.6.30 and has been broken in all kernels since then for any underlying interface which requires the addition of link level headers. The problem is that the transmit routine relies on being able to access the destination address in the skb in order to do address resolution once it has decided which underlying interface it is going to transmit through. In 2.6.31 the IFF_XMIT_DST_RELEASE flag was introduced, and set by default for all interfaces, which causes the destination address to be released before the transmit routine for the interface is called. The solution is to clear that flag for teql interfaces. Signed-off-by: Tom Hughes <tom@compton.nu> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | pkt_sched: gen_kill_estimator() rcu fixesEric Dumazet2010-06-112-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gen_kill_estimator() API is incomplete or not well documented, since caller should make sure an RCU grace period is respected before freeing stats_lock. This was partially addressed in commit 5d944c640b4 (gen_estimator: deadlock fix), but same problem exist for all gen_kill_estimator() users, if lock they use is not already RCU protected. A code review shows xt_RATEEST.c, act_api.c, act_police.c have this problem. Other are ok because they use qdisc lock, already RCU protected. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net sched: make pedit check for clones insteadjamal2010-06-071-2/+1
| | | | | | | | | | | | | | | | Now that the core path doesnt set OK to munge we detect writable skbs by looking to see if they are cloned. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* | htb: remove two unnecessary assignmentsChangli Gao2010-06-071-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | remove two unnecessary assignments we don't need to assign NULL when initialize structure objects. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- net/sched/sch_htb.c | 2 -- 1 file changed, 2 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of ↵David S. Miller2010-06-063-23/+54
|\ \ | |/ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/sfc/net_driver.h drivers/net/sfc/siena.c
| * act_pedit: access skb->data safelyChangli Gao2010-06-031-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | access skb->data safely we should use skb_header_pointer() and skb_store_bits() to access skb->data to handle small or non-linear skbs. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- net/sched/act_pedit.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
| * cls_u32: use skb_header_pointer() to dereference data safelyChangli Gao2010-06-021-13/+36
| | | | | | | | | | | | | | | | | | | | | | | | use skb_header_pointer() to dereference data safely the original skb->data dereference isn't safe, as there isn't any skb->len or skb_is_nonlinear() check. skb_header_pointer() is used instead in this patch. And when the skb isn't long enough, we terminate the function u32_classify() immediately with -1. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * act_nat: fix the wrong checksum when addr isn't in old_addr/maskChangli Gao2010-06-021-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | fix the wrong checksum when addr isn't in old_addr/mask For TCP and UDP packets, when addr isn't in old_addr/mask we don't do SNAT or DNAT, and we should not update layer 4 checksum. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- net/sched/act_nat.c | 4 ++++ 1 file changed, 4 insertions(+) Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: add additional lock to qdisc to increase throughputEric Dumazet2010-06-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When many cpus compete for sending frames on a given qdisc, the qdisc spinlock suffers from very high contention. The cpu owning __QDISC_STATE_RUNNING bit has same priority to acquire the lock, and cannot dequeue packets fast enough, since it must wait for this lock for each dequeued packet. One solution to this problem is to force all cpus spinning on a second lock before trying to get the main lock, when/if they see __QDISC_STATE_RUNNING already set. The owning cpu then compete with at most one other cpu for the main lock, allowing for higher dequeueing rate. Based on a previous patch from Alexander Duyck. I added the heuristic to avoid the atomic in fast path, and put the new lock far away from the cache line used by the dequeue worker. Also try to release the busylock lock as late as possible. Tests with following script gave a boost from ~50.000 pps to ~600.000 pps on a dual quad core machine (E5450 @3.00GHz), tg3 driver. (A single netperf flow can reach ~800.000 pps on this platform) for j in `seq 0 3`; do for i in `seq 0 7`; do netperf -H 192.168.0.1 -t UDP_STREAM -l 60 -N -T $i -- -m 6 & done done Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Define accessors to manipulate QDISC_STATE_RUNNINGEric Dumazet2010-06-021-2/+2
| | | | | | | | | | | | | | | | Define three helpers to manipulate QDISC_STATE_RUNNIG flag, that a second patch will move on another location. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | arp_notify: allow drivers to explicitly request a notification event.Ian Campbell2010-05-311-0/+18
|/ | | | | | | | | | | | | | | | | | | Currently such notifications are only generated when the device comes up or the address changes. However one use case for these notifications is to enable faster network recovery after a virtual machine migration (by causing switches to relearn their MAC tables). A migration appears to the network stack as a temporary loss of carrier and therefore does not trigger either of the current conditions. Rather than adding carrier up as a trigger (which can cause issues when interfaces a flapping) simply add an interface which the driver can use to explicitly trigger the notification. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: stable@kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
* cls_cgroup: Store classid in struct sockHerbert Xu2010-05-241-16/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up until now cls_cgroup has relied on fetching the classid out of the current executing thread. This runs into trouble when a packet processing is delayed in which case it may execute out of another thread's context. Furthermore, even when a packet is not delayed we may fail to classify it if soft IRQs have been disabled, because this scenario is indistinguishable from one where a packet unrelated to the current thread is processed by a real soft IRQ. In fact, the current semantics is inherently broken, as a single skb may be constructed out of the writes of two different tasks. A different manifestation of this problem is when the TCP stack transmits in response of an incoming ACK. This is currently unclassified. As we already have a concept of packet ownership for accounting purposes in the skb->sk pointer, this is a natural place to store the classid in a persistent manner. This patch adds the cls_cgroup classid in struct sock, filling up an existing hole on 64-bit :) The value is set at socket creation time. So all sockets created via socket(2) automatically gains the ID of the thread creating it. Whenever another process touches the socket by either reading or writing to it, we will change the socket classid to that of the process if it has a valid (non-zero) classid. For sockets created on inbound connections through accept(2), we inherit the classid of the original listening socket through sk_clone, possibly preceding the actual accept(2) call. In order to minimise risks, I have not made this the authoritative classid. For now it is only used as a backup when we execute with soft IRQs disabled. Once we're completely happy with its semantics we can use it as the sole classid. Footnote: I have rearranged the error path on cls_group module creation. If we didn't do this, then there is a window where someone could create a tc rule using cls_group before the cgroup subsystem has been registered. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: Fix qdisc_notify()Eric Dumazet2010-05-231-7/+7
| | | | | | | | | | | | | | | | | Ben Pfaff reported a kernel oops and provided a test program to reproduce it. https://kerneltrap.org/mailarchive/linux-netdev/2010/5/21/6277805 tc_fill_qdisc() should not be called for builtin qdisc, or it dereference a NULL pointer to get device ifindex. Fix is to always use tc_qdisc_dump_ignore() before calling tc_fill_qdisc(). Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Remove unnecessary returns from void function()sJoe Perches2010-05-177-7/+0
| | | | | | | | | | | | | | | | This patch removes from net/ (but not any netfilter files) all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net sched: cleanup and rate limit warningstephen hemminger2010-05-171-3/+6
| | | | | | | | | | | | | | If the user has a bad classification configuration, and gets a packet that goes through too many steps. Chances are more packets will arrive, and the message spew will overrun syslog because it is not rate limited. And because it is not tagged with appropriate priority it can't not be screened. Added the qdisc to the message to try and give some more context when the message does arrive. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* net sched: printk message severitystephen hemminger2010-05-178-28/+33
| | | | | | | | | | | The previous patch encourage me to go look at all the messages in the network scheduler and fix them. Many messages were missing any severity level. Some serious ones that should never happen were turned into WARN(), and the random noise messages that were handled changed to pr_debug(). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: sch_hfsc: fix classification loopsPatrick McHardy2010-05-171-1/+5
| | | | | | | | | | | | | | When attaching filters to a class pointing to a class higher up in the hierarchy, classification may enter an endless loop. Currently this is prevented for filters that are already resolved, but not for filters resolved at runtime. Only allow filters to point downwards in the hierarchy, similar to what CBQ does. Reported-by: Pawel Staszewski <pstaszewski@itcare.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* tbf: stop wanton destruction of children (v2)stephen hemminger2010-05-171-1/+5
| | | | | | | | | Several netem users use TBF for rate control. But every time the parameters of TBF are changed it destroys the child qdisc, requiring reconfigation. Better to just keep child qdisc and just notify it of changed limit. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: add a noref bit on skb dstEric Dumazet2010-05-171-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use low order bit of skb->_skb_dst to tell dst is not refcounted. Change _skb_dst to _skb_refdst to make sure all uses are catched. skb_dst() returns the dst, regardless of noref bit set or not, but with a lockdep check to make sure a noref dst is not given if current user is not rcu protected. New skb_dst_set_noref() helper to set an notrefcounted dst on a skb. (with lockdep check) skb_dst_drop() drops a reference only if skb dst was refcounted. skb_dst_force() helper is used to force a refcount on dst, when skb is queued and not anymore RCU protected. Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if !IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in sock_queue_rcv_skb(), in __nf_queue(). Use skb_dst_force() in dev_requeue_skb(). Note: dst_use_noref() still dirties dst, we might transform it later to do one dirtying per jiffies. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://dev.medozas.de/linuxPatrick McHardy2010-05-111-1/+1
|\
| * netfilter: xtables: substitute temporary defines by final nameJan Engelhardt2010-05-111-1/+1
| | | | | | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
* | Merge branch 'master' of /repos/git/net-next-2.6Patrick McHardy2010-05-102-6/+6
|\ \ | |/ |/| | | | | | | | | | | Conflicts: net/bridge/br_device.c net/bridge/br_forward.c Signed-off-by: Patrick McHardy <kaber@trash.net>
| * net: fix softnet_statChangli Gao2010-05-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per cpu variable softnet_data.total was shared between IRQ and SoftIRQ context without any protection. And enqueue_to_backlog should update the netdev_rx_stat of the target CPU. This patch renames softnet_data.total to softnet_data.processed: the number of packets processed in uppper levels(IP stacks). softnet_stat data is moved into softnet_data. Signed-off-by: Changli Gao <xiaosuo@gmail.com> ---- include/linux/netdevice.h | 17 +++++++---------- net/core/dev.c | 26 ++++++++++++-------------- net/sched/sch_generic.c | 2 +- 3 files changed, 20 insertions(+), 25 deletions(-) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Fix various endianness glitchesEric Dumazet2010-04-201-5/+5
| | | | | | | | | | | | | | | | Sparse can help us find endianness bugs, but we need to make some cleanups to be able to more easily spot real bugs. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'master' of /repos/git/net-next-2.6Patrick McHardy2010-04-2034-95/+185
|\ \ | |/ | | | | | | | | | | | | | | Conflicts: Documentation/feature-removal-schedule.txt net/ipv6/netfilter/ip6t_REJECT.c net/netfilter/xt_limit.c Signed-off-by: Patrick McHardy <kaber@trash.net>
| * Merge branch 'master' of ↵David S. Miller2010-04-1133-0/+33
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/stmmac/stmmac_main.c drivers/net/wireless/wl12xx/wl1271_cmd.c drivers/net/wireless/wl12xx/wl1271_main.c drivers/net/wireless/wl12xx/wl1271_spi.c net/core/ethtool.c net/mac80211/scan.c
| | * include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo2010-03-3033-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
| * | Merge branch 'master' of ↵David S. Miller2010-04-062-10/+31
| |\ \ | | |/ | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c drivers/net/via-velocity.c drivers/net/wireless/iwlwifi/iwl-agn.c
| | * cgroups: net_cls as moduleBen Blum2010-03-232-10/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allows the net_cls cgroup subsystem to be compiled as a module This patch modifies net/sched/cls_cgroup.c to allow the net_cls subsystem to be optionally compiled as a module instead of builtin. The cgroup_subsys struct is moved around a bit to allow the subsys_id to be either declared as a compile-time constant by the cgroup_subsys.h include in cgroup.h, or, if it's a module, initialized within the struct by cgroup_load_subsys. Signed-off-by: Ben Blum <bblum@andrew.cmu.edu> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | gen_estimator: deadlock fixEric Dumazet2010-04-011-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of my test machine got a deadlock during "tc" sessions, adding/deleting classes & filters, using traffic estimators. After some analysis, I believe we have a potential use after free case in est_timer() : spin_lock(e->stats_lock); << HERE >> read_lock(&est_lock); if (e->bstats == NULL) << TEST >> goto skip; Test is done a bit late, because after estimator is killed, and before rcu grace period elapsed, we might already have freed/reuse memory where e->stats_locks points to (some qdisc->q.lock) A possible fix is to respect a rcu grace period at Qdisc dismantle time. On 64bit, sizeof(struct Qdisc) is exactly 192 bytes. Adding 16 bytes to it (for struct rcu_head) is a problem because it might change performance, given QDISC_ALIGNTO is 32 bytes. This is why I also change QDISC_ALIGNTO to 64 bytes, to satisfy most current alignment requirements. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net_sched: minor netns related cleanupTom Goff2010-03-301-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These changes were suggested by Alexey Dobriyan <adobriyan@gmail.com>: - psched_show() does not use any private data so just pass NULL to psched_open() - remove unnecessary return statement Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: remove trailing space in messagesFrans Pop2010-03-241-2/+2
| | | | | | | | | | | | | | | Signed-off-by: Frans Pop <elendil@planet.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net_sched: make traffic control network namespace awareTom Goff2010-03-223-80/+107
| |/ | | | | | | | | | | | | | | | | | | Mostly minor changes to add a net argument to various functions and remove initial network namespace checks. Make /proc/net/psched per network namespace. Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netfilter: xtables: make use of xt_request_find_targetJan Engelhardt2010-03-251-2/+2
|/ | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
* Merge branch 'master' of ↵David S. Miller2010-02-091-8/+8
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * net/sched: Fix module name in KconfigJan Luebbe2010-02-081-8/+8
| | | | | | | | | | | | | | | | | | The action modules have been prefixed with 'act_', but the Kconfig description was not changed. Signed-off-by: Jan Luebbe <jluebbe@debian.org> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* | sched: add head drop fifo queueHagen Paul Pfeifer2010-01-282-0/+35
|/ | | | | | | | | | | | | This adds an additional queuing strategy, called pfifo_head_drop, to remove the oldest skb in the case of an overflow within the queue - the head element - instead of the last skb (tail). To remove the oldest skb in congested situations is useful for sensor network environments where newer packets reflect the superior information. Reviewed-by: Florian Westphal <fw@strlen.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-linus' of ↵Linus Torvalds2009-12-091-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits) tree-wide: fix misspelling of "definition" in comments reiserfs: fix misspelling of "journaled" doc: Fix a typo in slub.txt. inotify: remove superfluous return code check hdlc: spelling fix in find_pvc() comment doc: fix regulator docs cut-and-pasteism mtd: Fix comment in Kconfig doc: Fix IRQ chip docs tree-wide: fix assorted typos all over the place drivers/ata/libata-sff.c: comment spelling fixes fix typos/grammos in Documentation/edac.txt sysctl: add missing comments fs/debugfs/inode.c: fix comment typos sgivwfb: Make use of ARRAY_SIZE. sky2: fix sky2_link_down copy/paste comment error tree-wide: fix typos "couter" -> "counter" tree-wide: fix typos "offest" -> "offset" fix kerneldoc for set_irq_msi() spidev: fix double "of of" in comment comment typo fix: sybsystem -> subsystem ...
| * tree-wide: fix a very frequent spelling mistakeDirk Hohndel2009-11-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | something-bility is spelled as something-blity so a grep for 'blit' would find these lines this is so trivial that I didn't split it by subsystem / copy additional maintainers - all changes are to comments The only purpose is to get fewer false positives when grepping around the kernel sources. Signed-off-by: Dirk Hohndel <hohndel@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | net: Move && and || to end of previous lineJoe Perches2009-11-294-25/+30
| | | | | | | | | | | | | | | | | | | | | | Not including net/atm/ Compiled tested x86 allyesconfig only Added a > 80 column line or two, which I ignored. Existing checkpatch plaints willfully, cheerfully ignored. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: use net_eq to compare netsOctavian Purdila2009-11-253-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generated with the following semantic patch @@ struct net *n1; struct net *n2; @@ - n1 == n2 + net_eq(n1, n2) @@ struct net *n1; struct net *n2; @@ - n1 != n2 + !net_eq(n1, n2) applied over {include,net,drivers/net}. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: rename skb->iif to skb->skb_iifEric Dumazet2009-11-202-2/+2
| | | | | | | | | | | | | | To help grep games, rename iif to skb_iif Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netsched: Allow var_sk_bound_if meta to work on all namespacesEric Dumazet2009-11-181-1/+2
| | | | | | | | | | | | | | | | This fix can probably wait 2.6.33, or should use another patch if needed in 2.6.32 (no get_dev_by_index_rcu() before 2.6.33) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | act_mirred: optimization.Changli Gao2009-11-171-31/+29
| | | | | | | | | | | | | | | | move checking if eaction is valid in tcf_mirred_init() Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* | act_mirred: cleanupChangli Gao2009-11-171-26/+33
| | | | | | | | | | | | | | | | | | | | 1. don't let go back using goto. 2. don't call skb_act_clone() until it is necessary. 3. one exit of the critical context. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Optimize hard_start_xmit() return checkingJarek Poplawski2009-11-151-18/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent changes in the TX error propagation require additional checking and masking of values returned from hard_start_xmit(), mainly to separate cases where skb was consumed. This aim can be simplified by changing the order of NETDEV_TX and NET_XMIT codes, because the latter are treated similarly to negative (ERRNO) values. After this change much simpler dev_xmit_complete() is also used in sch_direct_xmit(), so it is moved to netdevice.h. Additionally NET_RX definitions in netdevice.h are moved up from between TX codes to avoid confusion while reading the TX comment. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>