aboutsummaryrefslogtreecommitdiffstats
path: root/net/sched
Commit message (Collapse)AuthorAgeFilesLines
* pkt_sched: Update qdisc requeue stats in dev_requeue_skb()Jarek Poplawski2008-10-081-0/+1
| | | | | | | | | | | After the last change of requeuing there is no info about such incidents in tc stats. This patch updates the counter, but we should consider this should differ from previous stats because of additional checks preventing to repeat this. On the other hand, previous stats didn't include requeuing of gso_segmented skbs. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: xtables: provide invoked family value to extensionsJan Engelhardt2008-10-081-2/+2
| | | | | | | | | By passing in the family through which extensions were invoked, a bit of data space can be reclaimed. The "family" member will be added to the parameter structures and the check functions be adjusted. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: xtables: move extension arguments into compound structure (6/6)Jan Engelhardt2008-10-081-3/+7
| | | | | | | This patch does this for target extensions' destroy functions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: xtables: move extension arguments into compound structure (5/6)Jan Engelhardt2008-10-081-3/+9
| | | | | | | This patch does this for target extensions' checkentry functions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: xtables: move extension arguments into compound structure (4/6)Jan Engelhardt2008-10-081-4/+8
| | | | | | | This patch does this for target extensions' target functions. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* netfilter: xtables: do centralized checkentry call (1/2)Jan Engelhardt2008-10-081-11/+3
| | | | | | | | It used to be that {ip,ip6,etc}_tables called extension->checkentry themselves, but this can be moved into the xtables core. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
* pkt_sched: Simplify dev_requeue_skb and dequeue_skbJarek Poplawski2008-10-061-16/+5
| | | | | | | | | | | qdisc->requeue was planned to universally replace all requeuing code, but at the top level we never requeue more than one skb, so qdisc-> gso_skb is enough for this. qdisc->requeue would be used on the lower levels only for one level deep requeuing (like in sch_hfsc) after finishing all the changes. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix handling of gso skbs on requeuingJarek Poplawski2008-10-061-5/+17
| | | | | | | | | | | Jay Cliburn noticed and diagnosed a bug triggered in dev_gso_skb_destructor() after last change from qdisc->gso_skb to qdisc->requeue list. Since gso_segmented skbs can't be queued to another list this patch brings back qdisc->gso_skb for them. Reported-by: Jay Cliburn <jcliburn@gmail.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Check the state of tx_queue in dequeue_skb()Jarek Poplawski2008-09-221-4/+14
| | | | | | | | | Check in dequeue_skb() the state of tx_queue for requeued skb to save on locking and re-requeuing, and possibly remove the current check in qdisc_run(). Based on the idea of Alexander Duyck. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Always use q->requeue in dev_requeue_skb().David S. Miller2008-09-221-4/+1
| | | | | | | | | | There is no reason to call into the complicated qdiscs just to remember the last SKB where we found the device blocked. The SKB is outside of the qdiscs realm at this point. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Make qdisc->gso_skb a list.David S. Miller2008-09-221-5/+7
| | | | | | | The idea is that we can use this to get rid of ->requeue(). Signed-off-by: David S. Miller <davem@davemloft.net>
* net: em_cmp.c use unaligned access helpersHarvey Harrison2008-09-221-6/+3
| | | | | | Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Use hton[sl]() instead of __constant_hton[sl]() where applicableArnaldo Carvalho de Melo2008-09-203-20/+20
| | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* multiq: requeue should rewind the current_bandAlexander Duyck2008-09-201-0/+5
| | | | | | | | | Currently dequeueing a packet and requeueing the same packet will cause a different packet to be pulled on the next dequeue. This change forces requeue to rewind the current_band. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* multiq: Further multiqueue cleanupAlexander Duyck2008-09-121-4/+9
| | | | | | | | | This patch resolves a few issues found with multiq including wording suggestions and a problem seen in the allocation of queues. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_action: add new action skbeditAlexander Duyck2008-09-123-0/+215
| | | | | | | | | This new action will have the ability to change the priority and/or queue_mapping fields on an sk_buff. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Add multiqueue scheduler supportAlexander Duyck2008-09-123-0/+477
| | | | | | | | | | | | | | | This patch is intended to add a qdisc to support the new tx multiqueue architecture by providing a band for each hardware queue. By doing this it is possible to support a different qdisc per physical hardware queue. This qdisc uses the skb->queue_mapping to select which band to place the traffic onto. It then uses a round robin w/ a check to see if the subqueue is stopped to determine which band to dequeue the packet from. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* warn: Turn the netdev timeout WARN_ON() into a WARN()Arjan van de Ven2008-09-081-2/+1
| | | | | | | | | | | this patch turns the netdev timeout WARN_ON_ONCE() into a WARN_ONCE(), so that the device and driver names are inside the warning message. This helps automated tools like kerneloops.org to collect the data and do statistics, as well as making it more likely that humans cut-n-paste the important message as part of a bugreport. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netlink: Remove compat API for nested attributesThomas Graf2008-09-022-7/+17
| | | | | | | | | | | Removes all _nested_compat() functions from the API. The prio qdisc no longer requires them and netem has its own format anyway. Their existance is only confusing. Resend: Also remove the wrapper macro. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix locking of qdisc_root with qdisc_root_sleeping_lock()Jarek Poplawski2008-08-297-11/+11
| | | | | | | | | | | Use qdisc_root_sleeping_lock() instead of qdisc_root_lock() where appropriate. The only difference is while dev is deactivated, when currently we can use a sleeping qdisc with the lock of noop_qdisc. This shouldn't be dangerous since after deactivation root lock could be used only by gen_estimator code, but looks wrong anyway. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix gen_estimator locksJarek Poplawski2008-08-274-9/+17
| | | | | | | | | | | | While passing a qdisc root lock to gen_new_estimator() and gen_replace_estimator() dev could be deactivated or even before grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create), so using qdisc_root_lock() is not enough. This patch adds qdisc_root_sleeping_lock() for this, plus additional checks, where necessary. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Use rcu_assign_pointer() to change dev_queue->qdiscJarek Poplawski2008-08-272-3/+3
| | | | | | | These pointers are RCU protected, so proper primitives should be used. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix dev_graft_qdisc() lockingJarek Poplawski2008-08-271-1/+1
| | | | | | | | | During dev_graft_qdisc() dev is deactivated, so qdisc_root_lock() returns wrong lock of noop_qdisc instead of qdisc_sleeping. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix qdisc list lockingJarek Poplawski2008-08-222-8/+41
| | | | | | | | | | | | | | | Since some qdiscs call qdisc_tree_decrease_qlen() (so qdisc_lookup()) without rtnl_lock(), adding and deleting from a qdisc list needs additional locking. This patch adds global spinlock qdisc_list_lock and wrapper functions for modifying the list. It is considered as a temporary solution until hfsc_dequeue(), netem_dequeue() and tbf_dequeue() (or qdisc_tree_decrease_qlen()) are redone. With feedback from Herbert Xu and David S. Miller. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() raceJarek Poplawski2008-08-212-0/+8
| | | | | | | | | | | | | dev_deactivate() can skip rescheduling of a qdisc by qdisc_watchdog() or other timer calling netif_schedule() after dev_queue_deactivate(). We prevent this checking aliveness before scheduling the timer. Since during deactivation the root qdisc is available only as qdisc_sleeping additional accessor qdisc_root_sleeping() is created. With feedback from Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Revert "pkt_sched: Add BH protection for qdisc_stab_lock."David S. Miller2008-08-181-7/+7
| | | | | | | | | This reverts commit 1cfa26661a85549063e369e2b40275eeaa7b923c. qdisc_destroy() runs fully under RTNL again and not from softint any longer, so this change is no longer needed. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: remove bogus block (cleanup)Ilpo Järvinen2008-08-181-7/+6
| | | | | | | ...Last block local var got just deleted. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Don't hold qdisc lock over qdisc_destroy().David S. Miller2008-08-182-17/+2
| | | | | | | | | | | | | | | | | | | | Based upon reports by Denys Fedoryshchenko, and feedback and help from Jarek Poplawski and Herbert Xu. We always either: 1) Never made an external reference to this qdisc. or 2) Did a dev_deactivate() which purged all asynchronous references. So do not lock the qdisc when we call qdisc_destroy(), it's illegal anyways as when we drop the lock this is free'd memory. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Add lockdep annotation for qdisc locksJarek Poplawski2008-08-181-0/+7
| | | | | | | | | | | | Qdisc locks are initialized in the same function, qdisc_alloc(), so lockdep can't distinguish tx qdisc lock from rx and reports "possible recursive locking detected" when both these locks are taken eg. while using act_mirred with ifb. This looks like a false positive. Anyway, after this patch these locks will be reported more exactly. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Never schedule non-root qdiscs.David S. Miller2008-08-182-2/+2
| | | | | | | | | | | Based upon initial discovery and patch by Jarek Poplawski. The qdisc watchdogs can be attached to any qdisc, not just the root, so make sure we schedule the correct one. CBQ has a similar bug. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix return value corruption in HTB and TBF.David S. Miller2008-08-182-11/+4
| | | | | | | | | | | | Based upon a bug report by Josip Rodin. Packet schedulers should only return NET_XMIT_DROP iff the packet really was dropped. If the packet does reach the device after we return NET_XMIT_DROP then TCP can crash because it depends upon the enqueue path return values being accurate. Signed-off-by: David S. Miller <davem@davemloft.net>
* sch_prio: Use NET_XMIT_SUCCESS instead of "0" constant.David S. Miller2008-08-171-1/+1
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* sch_prio: Use return value from inner qdisc requeueJussi Kivilinna2008-08-171-1/+1
| | | | | | | | Use return value from inner qdisc requeue when value returned isn't NET_XMIT_SUCCESS, instead of always returning NET_XMIT_DROP. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: No longer destroy qdiscs from RCU.David S. Miller2008-08-171-18/+9
| | | | | | | | | | We can now kill them synchronously with all of the previous dev_deactivate() cures. This makes netdev destruction and shutdown saner as the qdiscs hold references to the device. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Grab correct lock in notify_and_destroy().Jarek Poplawski2008-08-171-2/+2
| | | | | | | | | From: Jarek Poplawski <jarkao2@gmail.com> When we are destroying non-root qdiscs, we need to lock the root of the qdisc tree not the the qdisc itself. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Simplify dev_deactivate() polling loop.David S. Miller2008-08-171-26/+5
| | | | | | | | | | | | The condition under which the previous qdisc has no more references after we've attached &noop_qdisc is that both RUNNING and SCHED are both seen clear while holding the root lock. So just make specifically that check in the polling loop, instead of this overly complex "check without then check with lock held" sequence. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Add 'deactivated' state.David S. Miller2008-08-171-0/+6
| | | | | | | | | | | | | | | | This new state lets dev_deactivate() mark a qdisc as having been deactivated. dev_queue_xmit() and ing_filter() check for this bit and do not try to process the qdisc if the bit is set. dev_deactivate() polls the qdisc after setting the bit, waiting for both __QDISC_STATE_RUNNING and __QDISC_STATE_SCHED to clear. This isn't perfect yet, but subsequent changesets will make it so. This part is just one piece of the puzzle. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix unlocking in tc_ctl_tfilter()Jarek Poplawski2008-08-141-1/+1
| | | | | | | | | Fix a bug with spin_lock_bh() inserted instead of spin_unlock_bh() by some recent patch. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix queue quiescence testing in dev_deactivate().David S. Miller2008-08-131-5/+6
| | | | | | | | | | | | | | | | | | | Based upon discussions with Jarek P. and Herbert Xu. First, we're testing the wrong qdisc. We just reset the device queue qdiscs to &noop_qdisc and checking it's state is completely pointless here. We want to wait until the previous qdisc that was sitting at the ->qdisc pointer is not busy any more. And that would be ->qdisc_sleeping. Because of how we propagate the samples qdisc pointer down into qdisc_run and friends via per-cpu ->output_queue and netif_schedule, we have to wait also for the __QDISC_STATE_SCHED bit to clear as well. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix oops in htb_delete.Jarek Poplawski2008-08-131-1/+2
| | | | | | | | | Recent changes introduced a bug in htb_delete(): cl->parent->children counter update misses checking cl->parent for NULL, which is used for root classes, so deleting them causes an oops. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net-sched: fix Action flushing return codeJamal Hadi Salim2008-08-131-2/+2
| | | | | | | Flushing must consistently return ENOMEM on failure of any allocation Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* net-sched: Fix actions flushingJamal Hadi Salim2008-08-131-2/+7
| | | | | | | | | Flushing of actions has been broken since we changed the semantics of netlink parsed tb[X] to mean X is an attribute type. This makes the flushing work. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Add BH protection for qdisc_stab_lock.Jarek Poplawski2008-08-111-7/+7
| | | | | | | | | Since qdisc_stab_lock is used in qdisc_put_stab(), which is called in BH context from __qdisc_destroy() RCU callback, softirq safe locking is needed. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix ingress deletion and filter attachment.David S. Miller2008-08-081-13/+23
| | | | | | | | | | Based upon bug reports by Stephen Hemminger. We still had some cases using ->qdisc instead of ->qdisc_sleeping. Also, qdisc_lookup() should return ingress qdiscs. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix actions referencingJamal Hadi Salim2008-08-071-3/+2
| | | | | | | | | When an action is added several times with the same exact index it gets deleted on every even-numbered attempt. This fixes that issue. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix qdisc config when link is down.David S. Miller2008-08-071-4/+4
| | | | | | | | Bug reported by Stephen Hemminger. We need to fetch the root from ->qdisc_sleeping not ->qdisc. Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Fix "parent is root" test in qdisc_create().David S. Miller2008-08-061-1/+1
| | | | | | | As noticed by Stephen Hemminger, the root qdisc is denoted by TC_H_ROOT, not zero. Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: Add qdisc __NET_XMIT_BYPASS flagJarek Poplawski2008-08-048-16/+16
| | | | | | | | | | | | Patrick McHardy <kaber@trash.net> noticed that it would be nice to handle NET_XMIT_BYPASS by NET_XMIT_SUCCESS with an internal qdisc flag __NET_XMIT_BYPASS and to remove the mapping from dev_queue_xmit(). David Miller <davem@davemloft.net> spotted a serious bug in the first version of this patch. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: Add qdisc __NET_XMIT_STOLEN flagJarek Poplawski2008-08-0410-33/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patrick McHardy <kaber@trash.net> noticed: "The other problem that affects all qdiscs supporting actions is TC_ACT_QUEUED/TC_ACT_STOLEN getting mapped to NET_XMIT_SUCCESS even though the packet is not queued, corrupting upper qdiscs' qlen counters." and later explained: "The reason why it translates it at all seems to be to not increase the drops counter. Within a single qdisc this could be avoided by other means easily, upper qdiscs would still increase the counter when we return anything besides NET_XMIT_SUCCESS though. This means we need a new NET_XMIT return value to indicate this to the upper qdiscs. So I'd suggest to introduce NET_XMIT_STOLEN, return that to upper qdiscs and translate it to NET_XMIT_SUCCESS in dev_queue_xmit, similar to NET_XMIT_BYPASS." David Miller <davem@davemloft.net> noticed: "Maybe these NET_XMIT_* values being passed around should be a set of bits. They could be composed of base meanings, combined with specific attributes. So you could say "NET_XMIT_DROP | __NET_XMIT_NO_DROP_COUNT" The attributes get masked out by the top-level ->enqueue() caller, such that the base meanings are the only thing that make their way up into the stack. If it's only about communication within the qdisc tree, let's simply code it that way." This patch is trying to realize these ideas. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* pkt_sched: Use qdisc_lock() on already sampled root qdisc.David S. Miller2008-08-021-6/+6
| | | | | | | | | | | | | Based upon a bug report by Jeff Kirsher. Don't use qdisc_root_lock() in these cases as the root qdisc could have been changed, and we'd thus lock the wrong object. Tested by Emil S Tantilov who confirms that this seems to fix the problem. Signed-off-by: David S. Miller <davem@davemloft.net>