author | Eric Dumazet <eric.dumazet@gmail.com> | 2010-05-05 01:07:37 -0700 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2010-05-05 01:07:37 -0700 |
commit | ec7d2f2cf3a1b76202986519ec4f8ec75b2de232 (patch) | |
tree | 177c324eb0cf7e687d1bbd10a6add3a7d5979002 /net/sunrpc | |
parent | 8753d29fd5daf890004a38c80835e1eb3acda394 (diff) | |
net: __alloc_skb() speedup
With the following patch I can reach the maximum rate of my pktgen+udpsink
simulator:
- 'old' machine : dual quad core E5450 @3.00GHz
- 64 UDP rx flows (only differ by destination port)
- RPS enabled, NIC interrupts serviced on cpu0
- RPS dispatched to 7 other cores (~130,000 IPIs per second)
- SLAB allocator (faster than SLUB in this workload)
- tg3 NIC
- 1,080,000 pps without a single drop at NIC level.
The idea is to add two prefetchw() calls in __alloc_skb(): one to prefetch
the first sk_buff cache line, the second to prefetch the shinfo part.
A single memset() is also used to initialize all skb_shared_info fields
instead of setting them one by one, which reduces the number of
instructions by using long word moves.
All skb_shared_info fields before 'dataref' are cleared in
__alloc_skb().
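
The two ideas above (prefetch for write before touching the memory, then clear every field that precedes 'dataref' with one memset()) can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: struct demo_skb, struct demo_shinfo, demo_prefetchw(), demo_alloc_skb() and the 16-byte rounding are hypothetical stand-ins, malloc() replaces the slab allocator, and the GCC/Clang builtin __builtin_prefetch() stands in for prefetchw().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for skb_shared_info: every field that must start
 * at zero is placed before 'dataref', so one memset() can cover them. */
struct demo_shinfo {
    unsigned short nr_frags;
    unsigned short gso_size;
    unsigned short gso_segs;
    unsigned short gso_type;
    void *frag_list;
    int dataref;        /* plain int in this sketch, not an atomic */
    void *frags[4];     /* after 'dataref': deliberately not cleared */
};

/* Hypothetical stand-in for sk_buff. */
struct demo_skb {
    unsigned char *head;
    unsigned char *end; /* the shared info lives at 'end' */
    unsigned int len;
};

/* Userspace substitute for prefetchw(): hint that the line will be written. */
static inline void demo_prefetchw(const void *p)
{
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p, 1);
#else
    (void)p;
#endif
}

static struct demo_shinfo *demo_shinfo_of(const struct demo_skb *skb)
{
    return (struct demo_shinfo *)skb->end;
}

struct demo_skb *demo_alloc_skb(size_t size)
{
    struct demo_skb *skb = malloc(sizeof(*skb));
    if (!skb)
        return NULL;
    demo_prefetchw(skb);                    /* first prefetch: the skb itself */

    size = (size + 15) & ~(size_t)15;       /* keep the shinfo area aligned */
    unsigned char *data = malloc(size + sizeof(struct demo_shinfo));
    if (!data) {
        free(skb);
        return NULL;
    }
    demo_prefetchw(data + size);            /* second prefetch: the shinfo part */

    skb->head = data;
    skb->end = data + size;
    skb->len = 0;

    struct demo_shinfo *shinfo = demo_shinfo_of(skb);
    assert(((size_t)(skb->end - skb->head) & 15) == 0);
    /* One memset() clears all fields located before 'dataref'. */
    memset(shinfo, 0, offsetof(struct demo_shinfo, dataref));
    shinfo->dataref = 1;
    return skb;
}

void demo_free_skb(struct demo_skb *skb)
{
    if (skb) {
        free(skb->head);
        free(skb);
    }
}
```

A caller would pair demo_alloc_skb(size) with demo_free_skb(). The point of the layout is that the prefetches are issued before the stores that need those cache lines, and that the memset() length comes from offsetof(), so fields added before 'dataref' are cleared without further code changes.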
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/sunrpc')
0 files changed, 0 insertions, 0 deletions