path: root/net/core
Commit message (Author, Age, Files, Lines)
* wext: let get_wireless_stats() sleep (Johannes Berg, 2009-10-05, 1 file, -2/+2)
| A number of drivers (recently including cfg80211-based ones) assume that all wireless handlers, including statistics, can sleep and they often also implicitly assume that the rtnl is held around their invocation. This is almost always true now except when reading from sysfs:
|
|   BUG: sleeping function called from invalid context at kernel/mutex.c:280
|   in_atomic(): 1, irqs_disabled(): 0, pid: 10450, name: head
|   2 locks held by head/10450:
|    #0: (&buffer->mutex){+.+.+.}, at: [<c10ceb99>] sysfs_read_file+0x24/0xf4
|    #1: (dev_base_lock){++.?..}, at: [<c12844ee>] wireless_show+0x1a/0x4c
|   Pid: 10450, comm: head Not tainted 2.6.32-rc3 #1
|   Call Trace:
|    [<c102301c>] __might_sleep+0xf0/0xf7
|    [<c1324355>] mutex_lock_nested+0x1a/0x33
|    [<f8cea53b>] wdev_lock+0xd/0xf [cfg80211]
|    [<f8cea58f>] cfg80211_wireless_stats+0x45/0x12d [cfg80211]
|    [<c13118d6>] get_wireless_stats+0x16/0x1c
|    [<c12844fe>] wireless_show+0x2a/0x4c
|
| Fix this by using the rtnl instead of dev_base_lock.
|
| Reported-by: Miles Lane <miles.lane@gmail.com>
| Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* pktgen: restore nanosec delays (Eric Dumazet, 2009-10-04, 1 file, -1/+1)
| Commit fd29cf72 (pktgen: convert to use ktime_t) inadvertently converted the "delay" parameter from nanosec to microsec.
|
| Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* pktgen: Fix multiqueue handling (Eric Dumazet, 2009-10-04, 1 file, -1/+1)
| It is not currently possible to instruct pktgen to use one selected tx queue.
|
| When Robert added multiqueue support in commit 45b270f8, he added an interval (queue_map_min, queue_map_max), and his code doesn't take into account the case of min = max, to select one tx queue exactly.
|
| I suspect a high performance setup on an eight txqueue device wants to use exactly eight cpus, and assign one tx queue to each sender.
|
| This patch makes pktgen select the right tx queue, not the first one.
|
| Also updates Documentation to reflect Robert's changes.
|
| Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
| Signed-off-by: David S. Miller <davem@davemloft.net>
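The min = max case from the pktgen multiqueue entry above can be illustrated with a small sketch. This is not the pktgen code: `pick_tx_queue` and its round-robin policy are hypothetical, and only the names `queue_map_min` and `queue_map_max` come from the commit message.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from pktgen): return the tx queue to use
 * after `cur`, staying inside the closed interval
 * [queue_map_min, queue_map_max].
 *
 * When min == max the interval contains exactly one queue, so that
 * queue is always returned instead of falling back to queue 0.
 */
static uint16_t pick_tx_queue(uint16_t queue_map_min,
                              uint16_t queue_map_max,
                              uint16_t cur)
{
    if (queue_map_min >= queue_map_max)
        return queue_map_min;        /* one selected queue */
    if (cur < queue_map_min || cur >= queue_map_max)
        return queue_map_min;        /* first call, or wrap around */
    return (uint16_t)(cur + 1);      /* next queue in the interval */
}
```

With `queue_map_min = queue_map_max = 3`, every call yields queue 3, which is the "one tx queue per sender" setup the message describes.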
* pktgen: Fix delay handling (Eric Dumazet, 2009-10-01, 1 file, -2/+4)
| After last pktgen changes, delay handling is wrong.
|
| pktgen actually sends packets at full line speed.
|
| Fix is to update pkt_dev->next_tx even if spin() returns early, so that next spin() calls have a chance to see a positive delay.
|
| Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net: restore tx timestamping for accelerated vlans (Eric Dumazet, 2009-09-30, 1 file, -3/+3)
| Since commit 9b22ea560957de1484e6b3e8538f7eef202e3596 (net: fix packet socket delivery in rx irq handler), we lost rx timestamping of packets received on accelerated vlans.
|
| Effect is that tcpdump on real dev can show strange timings, since it gets rx timestamps too late (i.e. at skb dequeueing time, not at skb queueing time):
|
|   14:47:26.986871 IP 192.168.20.110 > 192.168.20.141: icmp 64: echo request seq 1
|   14:47:26.986786 IP 192.168.20.141 > 192.168.20.110: icmp 64: echo reply seq 1
|   14:47:27.986888 IP 192.168.20.110 > 192.168.20.141: icmp 64: echo request seq 2
|   14:47:27.986781 IP 192.168.20.141 > 192.168.20.110: icmp 64: echo reply seq 2
|   14:47:28.986896 IP 192.168.20.110 > 192.168.20.141: icmp 64: echo request seq 3
|   14:47:28.986780 IP 192.168.20.141 > 192.168.20.110: icmp 64: echo reply seq 3
|
| Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Fix sock_wfree() race (Eric Dumazet, 2009-09-30, 1 file, -7/+12)
| Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 (net: No more expensive sock_hold()/sock_put() on each tx) opens a window in sock_wfree() where another cpu might free the socket we are working on.
|
| A fix is to call sk->sk_write_space(sk) while still holding a reference on sk.
|
| Reported-by: Jike Song <albcamus@gmail.com>
| Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Make setsockopt() optlen be unsigned. (David S. Miller, 2009-09-30, 1 file, -4/+4)
| This provides safety against negative optlen at the type level instead of depending upon (sometimes non-trivial) checks against this sprinkled all over the place, in each and every implementation.
|
| Based upon work done by Arjan van de Ven and feedback from Linus Torvalds.
|
| Signed-off-by: David S. Miller <davem@davemloft.net>
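To show why an unsigned optlen helps, here is a minimal sketch of the two styles of bounds check. Both functions are hypothetical and are not taken from any kernel setsockopt() implementation; they only contrast what a lone upper-bound check does with a signed versus an unsigned length.

```c
#include <stddef.h>

/*
 * Hypothetical check with a signed length: a negative optlen is
 * smaller than any buffer size, so an upper-bound test alone
 * accepts it.
 */
static int optlen_ok_signed(int optlen, size_t bufsize)
{
    return optlen <= (int)bufsize;
}

/*
 * Hypothetical check with an unsigned length: the same bit pattern
 * is now a very large value, so the single upper-bound test
 * rejects it.
 */
static int optlen_ok_unsigned(unsigned int optlen, size_t bufsize)
{
    return optlen <= bufsize;
}
```

With the signed variant, every handler also needs its own `optlen < 0` test; with the unsigned variant the type rules that case out.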
* wext: add back wireless/ dir in sysfs for cfg80211 interfaces (Johannes Berg, 2009-09-28, 1 file, -7/+5)
| The move away from having drivers assign wireless handlers, in favour of making cfg80211 assign them, broke the sysfs registration (the wireless/ dir went missing) because the handlers are now assigned only after registration, which is too late.
|
| Fix this by special-casing cfg80211-based devices, all of which are required to have an ieee80211_ptr, in the sysfs code, and also using get_wireless_stats() to have the same values reported as in procfs.
|
| Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
| Reported-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
| Tested-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
| Signed-off-by: John W. Linville <linville@tuxdriver.com>
* pktgen: better scheduler friendliness (Stephen Hemminger, 2009-09-24, 1 file, -79/+72)
| Previous update did not resched in inner loop causing watchdogs. Rewrite inner loop to:
|   * account for delays better with less clock calls
|   * more accurate timing of delay:
|     - only delay if packet was successfully sent
|     - if delay is 100ns and it takes 10ns to build packet then account for that
|   * use wait_event_interruptible_timeout rather than open coding it.
|
| Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
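The "100ns delay, 10ns to build the packet" accounting in the entry above amounts to subtracting the build time from the configured delay. A minimal sketch follows; `remaining_delay_ns` is a hypothetical name, and this is not the pktgen implementation.

```c
#include <stdint.h>

/*
 * Hypothetical helper: nanoseconds still to wait before the next
 * transmit, given the configured inter-packet delay and the time
 * already spent building the packet. Clamps at zero when building
 * took longer than the delay.
 */
static uint64_t remaining_delay_ns(uint64_t delay_ns, uint64_t build_ns)
{
    return build_ns >= delay_ns ? 0 : delay_ns - build_ns;
}
```

For the numbers in the message, a 100ns delay with 10ns spent building leaves 90ns to wait.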
* pktgen: T_TERMINATE flag is unused (Stephen Hemminger, 2009-09-24, 1 file, -5/+4)
| Get rid of unused flag bit.
|
| Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* mm: replace various uses of num_physpages by totalram_pages (Jan Beulich, 2009-09-22, 1 file, -2/+2)
| Sizing of memory allocations shouldn't depend on the number of physical pages found in a system, as that generally includes (perhaps a huge amount of) non-RAM pages. The amount of what actually is usable as storage should instead be used as a basis here.
|
| Some of the calculations (i.e. those not intending to use high memory) should likely even use (totalram_pages - totalhigh_pages).
|
| Signed-off-by: Jan Beulich <jbeulich@novell.com>
| Acked-by: Rusty Russell <rusty@rustcorp.com.au>
| Acked-by: Ingo Molnar <mingo@elte.hu>
| Cc: Dave Airlie <airlied@linux.ie>
| Cc: Kyle McMartin <kyle@mcmartin.ca>
| Cc: Jeremy Fitzhardinge <jeremy@goop.org>
| Cc: Pekka Enberg <penberg@cs.helsinki.fi>
| Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
| Cc: "David S. Miller" <davem@davemloft.net>
| Cc: Patrick McHardy <kaber@trash.net>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 (Linus Torvalds, 2009-09-17, 1 file, -2/+2)
|\
| | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (66 commits)
| |   be2net: fix some cmds to use mccq instead of mbox
| |   atl1e: fix 2.6.31-git4 -- ATL1E 0000:03:00.0: DMA-API: device driver frees DMA
| |   pkt_sched: Fix qstats.qlen updating in dump_stats
| |   ipv6: Log the affected address when DAD failure occurs
| |   wl12xx: Fix print_mac() conversion.
| |   af_iucv: fix race when queueing skbs on the backlog queue
| |   af_iucv: do not call iucv_sock_kill() twice
| |   af_iucv: handle non-accepted sockets after resuming from suspend
| |   af_iucv: fix race in __iucv_sock_wait()
| |   iucv: use correct output register in iucv_query_maxconn()
| |   iucv: fix iucv_buffer_cpumask check when calling IUCV functions
| |   iucv: suspend/resume error msg for left over pathes
| |   wl12xx: switch to %pM to print the mac address
| |   b44: the poll handler b44_poll must not enable IRQ unconditionally
| |   ipv6: Ignore route option with ROUTER_PREF_INVALID
| |   bonding: make ab_arp select active slaves as other modes
| |   cfg80211: fix SME connect
| |   rc80211_minstrel: fix contention window calculation
| |   ssb/sdio: fix printk format warnings
| |   p54usb: add Zcomax XG-705A usbid
| |   ...
| * bonding: remap muticast addresses without using dev_close() and dev_open() (Moni Shoua, 2009-09-15, 1 file, -2/+2)
| | This patch fixes commit e36b9d16c6a6d0f59803b3ef04ff3c22c3844c10. The approach there is to call dev_close()/dev_open() whenever the device type is changed in order to remap the device IP multicast addresses to HW multicast addresses. This approach suffers from 2 drawbacks:
| |
| |   * It assumes that the device is UP when calling dev_close(), or otherwise dev_close() has no effect. It is worth mentioning that initscripts (Redhat) and sysconfig (Suse) don't act the same in this matter.
| |   * dev_close() has other side effects, like deleting entries from the routing table, which might be unnecessary.
| |
| | The fix here is to directly remap the IP multicast addresses to HW multicast addresses for a bonding device that changes its type, and nothing else.
| |
| | Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
| | Signed-off-by: Moni Shoua <monis@voltaire.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
* | driver model: constify attribute groups (David Brownell, 2009-09-15, 1 file, -1/+1)
|/
| Let attribute group vectors be declared "const". We'd like to let most attribute metadata live in read-only sections... this is a start.
|
| Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
| Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 (Linus Torvalds, 2009-09-14, 12 files, -762/+837)
|\
| | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
| |   netxen: update copyright
| |   netxen: fix tx timeout recovery
| |   netxen: fix file firmware leak
| |   netxen: improve pci memory access
| |   netxen: change firmware write size
| |   tg3: Fix return ring size breakage
| |   netxen: build fix for INET=n
| |   cdc-phonet: autoconfigure Phonet address
| |   Phonet: back-end for autoconfigured addresses
| |   Phonet: fix netlink address dump error handling
| |   ipv6: Add IFA_F_DADFAILED flag
| |   net: Add DEVTYPE support for Ethernet based devices
| |   mv643xx_eth.c: remove unused txq_set_wrr()
| |   ucc_geth: Fix hangs after switching from full to half duplex
| |   ucc_geth: Rearrange some code to avoid forward declarations
| |   phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
| |   drivers/net/phy: introduce missing kfree
| |   drivers/net/wan: introduce missing kfree
| |   net: force bridge module(s) to be GPL
| |   Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
| |   ...
| |
| | Fixed up trivial conflicts:
| |
| |  - arch/x86/include/asm/socket.h
| |
| |    converted to <asm-generic/socket.h> in the x86 tree. The generic header has the same new #define's, so that works out fine.
| |
| |  - drivers/net/tun.c
| |
| |    fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that switched over to using 'tun->socket.sk' instead of the redundantly available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks to the TUN driver") which added a new 'tun->sk' use.
| |
| |    Noted in 'next' by Stephen Rothwell.
| * net: force bridge module(s) to be GPL (Stephen Hemminger, 2009-09-11, 1 file, -2/+2)
| | The only valid usage for the bridge frame hooks is by GPL components (such as the bridge module). The kernel should not leave a crack in the door for proprietary networking stacks to slip in.
| |
| | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net_sched: reintroduce dev->qdisc for use by sch_api (Patrick McHardy, 2009-09-06, 1 file, -4/+2)
| | Currently the multiqueue integration with the qdisc API suffers from a few problems:
| |
| | - with multiple queues, all root qdiscs use the same handle. This means they can't be exposed to userspace in a backwards compatible fashion.
| |
| | - all API operations always refer to queue number 0. Newly created qdiscs are automatically shared between all queues, it's not possible to address individual queues or restore multiqueue behaviour once a shared qdisc has been attached.
| |
| | - Dumps only contain the root qdisc of queue 0, in case of non-shared qdiscs this means the statistics are incomplete.
| |
| | This patch reintroduces dev->qdisc, which points to the (single) root qdisc from userspace's point of view. Currently it either points to the first (non-shared) default qdisc, or a qdisc shared between all queues. The following patches will introduce a classful dummy qdisc, which will be used as root qdisc and contain the per-queue qdiscs as children.
| |
| | Signed-off-by: Patrick McHardy <kaber@trash.net>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Remove debugging code (Eric Dumazet, 2009-09-03, 1 file, -2/+0)
| | Remove a debugging aid I accidentally left in previous 'cleanup' patch
| |
| | Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: net/core/dev.c cleanups (Eric Dumazet, 2009-09-03, 1 file, -297/+292)
| | Pure style cleanup patch before surgery :)
| |
| | Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/ethtool: Add support for the ethtool feature to flash firmware image from a specified file. (Ajit Khaparde, 2009-09-02, 1 file, -0/+16)
| | This patch adds support to flash a firmware image to a device using ethtool.
| |
| | The driver gets the filename of the firmware image and flashes the image using the request firmware path. The region "on the chip" to be flashed can be specified by an option. It is up to the device driver to map the region number passed by ethtool to the region to be flashed.
| |
| | The default behavior is to flash all the regions on the chip.
| |
| | Signed-off-by: Ajit Khaparde <ajitk@serverengines.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * vlan: multiqueue vlan device (Eric Dumazet, 2009-09-02, 1 file, -1/+9)
| | vlan devices are currently not multi-queue capable.
| |
| | We can do that with a new rtnl_link_ops method, get_tx_queues(), called from rtnl_create_link()
| |
| | This new method gets num_tx_queues/real_num_tx_queues from real device.
| |
| | register_vlan_device() is also handled.
| |
| | Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: drop_monitor: make last_rx timestamp private (Neil Horman, 2009-09-02, 1 file, -3/+12)
| | It was recently pointed out to me that the last_rx field of the net_device structure wasn't updated regularly. In fact only the bonding driver really uses it currently. Since the drop_monitor code relies on the last_rx field to detect drops on receive in hardware, we need to find a more reliable way to rate limit our drop checks (so that we don't check for drops on every frame received, which would be inefficient). This patch makes a last_rx timestamp that is private to the drop monitor code and is updated for every device that we track.
| |
| | Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 (David S. Miller, 2009-09-02, 4 files, -10/+20)
| |\
| | | Conflicts:
| | |   drivers/net/yellowfin.c
| * | drop_monitor: fix trace_napi_poll_hit() (Xiao Guangrong, 2009-09-01, 1 file, -1/+2)
| | | The net_dev of backlog napi is NULL, like below:
| | |
| | |   __get_cpu_var(softnet_data).backlog.dev == NULL
| | |
| | | So, we should check it in napi tracepoint's probe function
| | |
| | | Acked-by: Neil Horman <nhorman@tuxdriver.com>
| | | Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netdev: convert pseudo-devices to netdev_tx_t (Stephen Hemminger, 2009-09-01, 1 file, -1/+1)
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: convert remaining non-symbolic return values in dev_queue_xmit (Krishna Kumar, 2009-08-30, 1 file, -1/+1)
| | | Patch compiled and 32 simultaneous netperf testing ran fine.
| | |
| | | Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: use proc_create_data() (Alexey Dobriyan, 2009-08-28, 1 file, -2/+4)
| | | It looks like after rename device proc entry is unusable, because of no ->read_proc or ->proc_fops.
| | |
| | | And create_proc_entry() is deprecated.
| | |
| | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: increase version (Stephen Hemminger, 2009-08-28, 1 file, -6/+10)
| | | Increase module version, and cleanup module info.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: cleanup checkpatch warnings (Stephen Hemminger, 2009-08-28, 1 file, -139/+159)
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: use common idle routine (Stephen Hemminger, 2009-08-28, 1 file, -10/+4)
| | | Simpler to have one place that spins and accounts for delays, this will also make the last packet be detected faster for more repeatable timing.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: spin using hrtimer (Stephen Hemminger, 2009-08-28, 1 file, -21/+28)
| | | This changes how the pktgen thread spins/waits between packets if delay is configured. It uses a high res timer to wait for time to arrive.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: convert to use ktime_t (Stephen Hemminger, 2009-08-28, 1 file, -100/+84)
| | | The kernel ktime_t is a nice generic infrastructure for managing high resolution times, as is done in pktgen.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: avoid calling gettimeofday (Stephen Hemminger, 2009-08-28, 1 file, -7/+9)
| | | If not using delay then no need to update next_tx after each packet sent. This allows pktgen to send faster especially on systems with slower clock sources.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: reorganize transmit loop (Stephen Hemminger, 2009-08-28, 1 file, -12/+14)
| | | Handle standard (and non-standard) return values in a switch.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: use netdev_alloc_skb (Stephen Hemminger, 2009-08-28, 1 file, -4/+6)
| | | netdev_alloc_skb is NUMA node aware. Also, don't exhaust atomic emergency pool. Don't want pktgen to cause OOM behaviour.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: cleanup clone count test (Stephen Hemminger, 2009-08-28, 1 file, -17/+15)
| | | The if statement to test for "should a new packet be used" can be simplified.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: xmit logic reorganization (Stephen Hemminger, 2009-08-28, 1 file, -31/+24)
| | | Do some reorganization of transmit logic path:
| | |   * move transmit queue full idle to separate routine
| | |   * add a cpu_relax()
| | |   * eliminate some of the unneeded goto's
| | |   * if queue is still stopped, go back to main thread loop.
| | |   * don't give up transmitting if quantum is exhausted (be greedy)
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: stop_device cleanup (Stephen Hemminger, 2009-08-28, 1 file, -9/+3)
| | | All the callers were freeing skb after stopping device. Remove unneeded forward decl.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: mark read-only/mostly variables (Stephen Hemminger, 2009-08-28, 1 file, -5/+5)
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: change inlining (Stephen Hemminger, 2009-08-28, 1 file, -8/+7)
| | | Don't force inlining where not needed. Gcc does better job of deciding to inline local functions.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | pktgen: minor cleanup (Stephen Hemminger, 2009-08-28, 1 file, -16/+9)
| | | A couple of minor functions can be written more compactly.
| | |
| | | Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Drop ARPHRD_IEEE802154_PHY (Dmitry Eremin-Solenikov, 2009-08-19, 1 file, -2/+2)
| | | There are no master devices in mac802154 anymore, so drop the ARPHRD_IEEE802154_PHY definition.
| | |
| | | Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
| * | net: skb ftracer - add tracepoint to skb_copy_datagram_iovec (v3) (Neil Horman, 2009-08-13, 1 file, -0/+3)
| | | skb allocation / consumption tracer - Add consumption tracepoint
| | |
| | | This patch adds a tracepoint to skb_copy_datagram_iovec, which is called each time a userspace process copies a frame from a socket receive queue to a user space buffer. It allows us to hook in and examine each sk_buff that the system receives on a per-socket basis, and can be used to compile a list of which skb's were received by which processes.
| | |
| | | Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
| | |
| | |  include/trace/events/skb.h | 20 ++++++++++++++++++++
| | |  net/core/datagram.c        |  3 +++
| | |  2 files changed, 23 insertions(+)
| | |
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 (David S. Miller, 2009-08-12, 2 files, -10/+15)
| |\ \
| | | | Conflicts:
| | | |   arch/microblaze/include/asm/socket.h
| * | | net: Avoid enqueuing skb for default qdiscs (Krishna Kumar, 2009-08-06, 1 file, -13/+35)
| | | | dev_queue_xmit enqueue's a skb and calls qdisc_run which dequeue's the skb and xmits it. In most cases, the skb that is enqueue'd is the same one that is dequeue'd (unless the queue gets stopped or multiple cpu's write to the same queue and ends in a race with qdisc_run). For default qdiscs, we can remove the redundant enqueue/dequeue and simply xmit the skb since the default qdisc is work-conserving.
| | | |
| | | | The patch uses a new flag - TCQ_F_CAN_BYPASS to identify the default fast queue. The controversial part of the patch is incrementing qlen when a skb is requeued - this is to avoid checks like the second line below:
| | | |
| | | |   +  } else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
| | | |   >>            !q->gso_skb &&
| | | |   +             !test_and_set_bit(__QDISC_STATE_RUNNING, &q->state)) {
| | | |
| | | | Results of a 2 hour testing for multiple netperf sessions (1, 2, 4, 8, 12 sessions on a 4 cpu system-X). The BW numbers are aggregate Mb/s across iterations tested with this version on System-X boxes with Chelsio 10gbps cards:
| | | |
| | | |   ----------------------------------
| | | |   Size |  ORG BW    NEW BW |
| | | |   ----------------------------------
| | | |   128K |  156964    159381 |
| | | |   256K |  158650    162042 |
| | | |   ----------------------------------
| | | |
| | | | Changes from ver1:
| | | |
| | | | 1. Move sch_direct_xmit declaration from sch_generic.h to pkt_sched.h
| | | | 2. Update qdisc basic statistics for direct xmit path.
| | | | 3. Set qlen to zero in qdisc_reset.
| | | | 4. Changed some function names to more meaningful ones.
| | | |
| | | | Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: implement a SO_DOMAIN getsockoption (Jan Engelhardt, 2009-08-05, 1 file, -0/+5)
| | | | This sockopt goes in line with SO_TYPE and SO_PROTOCOL. It makes it possible for userspace programs to pass around file descriptors (I am referring to arguments-to-functions, but it may even work for the fd passing over UNIX sockets) without needing to also pass the auxiliary information (PF_INET6/IPPROTO_TCP).
| | | |
| | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: implement a SO_PROTOCOL getsockoption (Jan Engelhardt, 2009-08-05, 1 file, -0/+5)
| | | | Similar to SO_TYPE returning the socket type, SO_PROTOCOL allows to retrieve the protocol used with a given socket.
| | | |
| | | | I am not quite sure why we have that-many copies of socket.h, and why the values are not the same on all arches either, but for where hex numbers dominate, I use 0x1029 for SO_PROTOCOL as that seems to be the next free unused number across a bunch of operating systems, or so Google results make me want to believe. SO_PROTOCOL for others just uses the next free Linux number, 38.
| | | |
| | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: mark read-only arrays as const (Jan Engelhardt, 2009-08-05, 3 files, -5/+5)
| | | | String literals are constant, and usually, we can also tag the array of pointers const too, moving it to the .rodata section.
| | | |
| | | | Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
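The "array of pointers const too" wording above refers to the second `const` in a declaration like the one below. This is a minimal sketch; the table contents and the `proto_name` accessor are made up for illustration and do not come from the patch.

```c
#include <stddef.h>

/*
 * "const char *" alone makes only the characters read-only; the
 * second const makes the pointer array itself read-only as well,
 * which lets the toolchain place the whole table in read-only data.
 * (Illustrative table, not from the kernel.)
 */
static const char *const proto_names[] = { "tcp", "udp", "raw" };

/* Bounds-checked lookup; returns NULL for an out-of-range index. */
static const char *proto_name(size_t idx)
{
    size_t n = sizeof(proto_names) / sizeof(proto_names[0]);

    return idx < n ? proto_names[idx] : NULL;
}
```

With the second `const` in place, an assignment such as `proto_names[0] = "x";` is rejected at compile time.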
| * | | neigh: Convert garbage collection from softirq to workqueue (Eric Dumazet, 2009-08-02, 1 file, -46/+43)
| | | | Current neigh_periodic_timer() function is fired by timer IRQ, and scans one hash bucket each round (very little work in fact)
| | | |
| | | | As we are supposed to scan whole hash table in 15 seconds, this means neigh_periodic_timer() can be fired very often. (depending on the number of concurrent hash entries we stored in this table)
| | | |
| | | | Converting this to a workqueue permits scanning whole table, minimizing icache pollution, and firing this work every 15 seconds, independently of hash table size.
| | | |
| | | | This 15 seconds delay is not a hard number, as work is a deferrable one.
| | | |
| | | | Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
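The "fired very often" remark above follows from simple arithmetic: visiting one bucket per firing while covering the whole table within a fixed period means the firing rate grows with the table. The sketch below only illustrates that arithmetic; `firings_per_second` is a hypothetical name, the bucket counts are made-up inputs, and the real timer computation in the kernel may differ.

```c
/*
 * Hypothetical arithmetic: how many timer firings per second are
 * needed to visit every bucket once per scan period when each
 * firing handles a single bucket. Rounds up.
 */
static unsigned long firings_per_second(unsigned long nbuckets,
                                        unsigned long period_secs)
{
    return (nbuckets + period_secs - 1) / period_secs;
}
```

Under these assumptions a 4096-bucket table scanned over 15 seconds needs 274 firings per second, whereas the workqueue approach runs once per period regardless of table size.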
| * | | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 (David S. Miller, 2009-07-30, 1 file, -0/+1)
| |\ \ \