aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ipv6: remove ipv6_statisticsEric Dumazet2010-06-251-2/+0
| | | | | | | | | | | | | | commit 9261e5370112 (ipv6: making ip and icmp statistics per/namespace) forgot to remove ipv6_statistics variable. commit bc417d99bf27 (ipv6: remove stale MIB definitions) took care of icmpv6_statistics & icmpv6msg_statistics Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Denis V. Lunev <den@openvz.org> CC: Alexey Dobriyan <adobriyan@gmail.com> CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* snmp: add align parameter to snmp_mib_init()Eric Dumazet2010-06-257-22/+40
| | | | | | | | | | | | | | | | In preparation for 64bit snmp counters for some mibs, add an 'align' parameter to snmp_mib_init(), instead of assuming mibs only contain 'unsigned long' fields. Callers can use __alignof__(type) to provide correct alignment. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> CC: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* loopback: use u64_stats_sync infrastructureEric Dumazet2010-06-251-46/+16
| | | | | | | | | | | | | | | Commit 6b10de38f0ef (loopback: Implement 64bit stats on 32bit arches) introduced 64bit stats in loopback driver, using a private seqcount and private helpers. David suggested to introduce a generic infrastructure, added in (net: Introduce u64_stats_sync infrastructure) This patch reimplements loopback 64bit stats using the u64_stats_sync infrastructure. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* arp: RCU change in arp_solicit()Eric Dumazet2010-06-251-5/+7
| | | | | | | Avoid two atomic ops in arp_solicit() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* dccp: make implementation of Syn-RTT symmetricGerrit Renker2010-06-252-3/+16
| | | | | | | | | | | | | | | | | | This patch is thanks to Andre Noll who reported the issue and helped testing. The Syn-RTT sampled during the initial handshake currently only works for the client sending the DCCP-Request. TFRC penalizes the absence of an RTT sample with a very slow initial speed (1 packet per second), which delays slow-start significantly, resulting in sluggish performance. This patch mirrors the "Syn RTT" principle by adding a timestamp also onto the DCCP-Response, producing an RTT sample when the (Data)Ack completing the handshake arrives. Also changed the documentation to 'TFRC' since Syn RTTs are also used by CCID-4. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* dccp: remove unused function argumentGerrit Renker2010-06-254-19/+13
| | | | | | | This removes an unused 'sk' argument from several option-inserting functions. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/fec: clean suspend/resumeEric Benard2010-06-251-18/+12
| | | | | | | | | | | | | | | | Commit 59d4289b83b11379d867e2f7146904b19cc96404 converted fec to dev_pm_ops but didn't update the suspend/resume functions thus leading to the following warning : "initialization from incompatible pointer type" when CONFIG_PM is set. This patch also fixe a few indentation and style around CONFIG_PM area. Signed-off-by: Eric Bénard <eric@eukrea.com> Cc: netdev@vger.kernel.org Cc: davem@davemloft.net Cc: amit.kucheria@canonical.com Cc: s.hauer@pengutronix.de Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb3: request 7.10 firmwareDivy Le Ray2010-06-251-2/+2
| | | | | | | | The driver requests FW 7.10 Bump up driver version. Signed-off-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb3: update FW to 7.10Divy Le Ray2010-06-253-1918/+1936
| | | | | | | Update FW to 7.10 Signed-off-by: Divy Le Ray <divy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/core/pktgen.c: Use pr_<level>Joe Perches2010-06-251-78/+64
| | | | | | | | | | | | Add pr_fmt(fmt) KBUILD_MODNAME ": " fmt Remove "pktgen: " from formats Convert printks to pr_<level> Added func_enter() for debugging Moved version to end of string at module_init Coalesced long formats Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: optimize Berkeley Packet Filter (BPF) processingHagen Paul Pfeifer2010-06-252-51/+209
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Gcc is currenlty not in the ability to optimize the switch statement in sk_run_filter() because of dense case labels. This patch replace the OR'd labels with ordered sequenced case labels. The sk_chk_filter() function is modified to patch/replace the original OPCODES in a ordered but equivalent form. gcc is now in the ability to transform the switch statement in sk_run_filter into a jump table of complexity O(1). Until this patch gcc generates a sequence of conditional branches (O(n) of 567 byte .text segment size (arch x86_64): 7ff: 8b 06 mov (%rsi),%eax 801: 66 83 f8 35 cmp $0x35,%ax 805: 0f 84 d0 02 00 00 je adb <sk_run_filter+0x31d> 80b: 0f 87 07 01 00 00 ja 918 <sk_run_filter+0x15a> 811: 66 83 f8 15 cmp $0x15,%ax 815: 0f 84 c5 02 00 00 je ae0 <sk_run_filter+0x322> 81b: 77 73 ja 890 <sk_run_filter+0xd2> 81d: 66 83 f8 04 cmp $0x4,%ax 821: 0f 84 17 02 00 00 je a3e <sk_run_filter+0x280> 827: 77 29 ja 852 <sk_run_filter+0x94> 829: 66 83 f8 01 cmp $0x1,%ax [...] With the modification the compiler translate the switch statement into the following jump table fragment: 7ff: 66 83 3e 2c cmpw $0x2c,(%rsi) 803: 0f 87 1f 02 00 00 ja a28 <sk_run_filter+0x26a> 809: 0f b7 06 movzwl (%rsi),%eax 80c: ff 24 c5 00 00 00 00 jmpq *0x0(,%rax,8) 813: 44 89 e3 mov %r12d,%ebx 816: e9 43 03 00 00 jmpq b5e <sk_run_filter+0x3a0> 81b: 41 89 dc mov %ebx,%r12d 81e: e9 3b 03 00 00 jmpq b5e <sk_run_filter+0x3a0> Furthermore, I reordered the instructions to reduce cache line misses by order the most common instruction to the start. Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Log clearer error messages for hardware monitorBen Hutchings2010-06-251-3/+8
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Use Toeplitz IPv4 hash for RSS and hash insertionBen Hutchings2010-06-252-3/+22
| | | | | | | Insertion of the Falcon hash is unreliable. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Move siena_nic_data::ipv6_rss_key to efx_nic::rx_hash_keyBen Hutchings2010-06-254-9/+8
| | | | | | | We will use this hash key for Toeplitz IPv4 hashing too. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Fix reading of inserted hashBen Hutchings2010-06-252-8/+6
| | | | | | | | | | The hash appears immediately before the packet data, not at the beginning of the buffer. This means we can easily use negative offsets from the start of packet data, so adjust the data and length at the top of __efx_rx_packet() instead of wherever we consume the hash. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Clean upsVasanthy Kolluri2010-06-2525-53/+41
| | | | | | | | | | | | | | | | 1) Update copyright 2) Fix hardware queue descriptor field size CQ_ENET_RQ_DESC_FCOE_SOF_BITS 3) Include rtnetlink.h instead of if_link.h 4) Selectively flush writes to interrupt mask register 5) Use pci_enable_device_mem 6) Remove unused variables and header files 7) Fix size mismatch between memory alloc and free operations of a variable 8) Check for non null arguments to vic_provinfo_alloc Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Bug Fix: Handle surprise hardware removalsVasanthy Kolluri2010-06-253-4/+20
| | | | | | | | | | Handle surprise hardware removals gracefully during devcmd issue and init, cleanup of queues. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Feature Add: Add loopback capability to enic devicesVasanthy Kolluri2010-06-255-27/+45
| | | | | | | | | | | Hardware has the loopback capability to queue the packets transmitted from a device to the receive queue of the same device. enic now supports the loopback capability. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Use receive queue buffer blocks of 32/64 entriesVasanthy Kolluri2010-06-254-26/+37
| | | | | | | | | | Change the receive queue buffer allocations into blocks of 32 entries when ring size is less than 64, otherwise use 64 entries per block. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Add new firmware devcmdsVasanthy Kolluri2010-06-253-9/+146
| | | | | | | | | | Add new firmware devcmds - CMD_PROXY_BY_BDF, CMD_PACKET_FILTER_ALL, CMD_ENABLE_WAIT. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Use (netdev|dev|pr)_<level> macro helpers for loggingVasanthy Kolluri2010-06-259-109/+98
| | | | | | | | | | | Replace all printk routines with the (netdev|dev|pr)_<level> macros that provide verbose logs. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Clean up: Add wrapper routines for firmware devcmd callsVasanthy Kolluri2010-06-255-84/+201
| | | | | | | | | | Add wrapper routines that issue devcmds to firmware and ensure that a devcmd lock is held for each devcmd call. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Use a lighter reset operation for enic devicesVasanthy Kolluri2010-06-254-8/+55
| | | | | | | | | | | | | | The port profile information for a dynamic enic device is set by the upper layers, that are oblivious to the device reset operation. We do not want a reset operation erase the network state of a dynamic enic device as there is no way to set up the port profile information again. Hence a lighter reset operation called hang reset is used. Hang reset, unlike soft reset does not reset the network state and resets the host side state only. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Bug Fix: Change hardware ingress vlan rewrite modeVasanthy Kolluri2010-06-255-7/+68
| | | | | | | | | | | | | The current ingress vlan rewrite mode setting lets the hardware strip off the tag control information of a packet received on native vlan. As a result, the priority bits are also lost. The fix is to change the ingress vlan rewrite mode setting such that the complete tag control information is retained for packets that belong to native vlan. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* enic: Feature Add: Replace LRO with GROVasanthy Kolluri2010-06-253-80/+9
| | | | | | | | | | enic now uses the GRO mechanism instead of LRO to pass skbs to upper layers. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cnic: Update version to 2.1.3.Michael Chan2010-06-251-2/+2
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* cnic: Further unify kcq handling code.Michael Chan2010-06-251-37/+34
| | | | | | | | This eliminates some of the duplicate code for the various devices that require the same basic kcq handling. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cnic: Restructure kcq processing.Michael Chan2010-06-251-54/+22
| | | | | | | | By doing more work in the common function cnic_get_kcqes(), and making full use of the kcq_info structure. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cnic: Unify kcq allocation for all devices.Michael Chan2010-06-252-64/+98
| | | | | | | | | By creating a common data stucture kcq_info for all devices, the kcq (kernel completion queue) for all devices can be allocated by common code. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cnic: Unify IRQ code for all hardware types.Michael Chan2010-06-251-13/+20
| | | | | | | By creating a common cnic_doirq(). Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cnic: Fine-tune CID memory space calculation.Michael Chan2010-06-252-15/+21
| | | | | | | | | | | | The current code makes assumptions about the CID (context ID) memory space and starting CID that may not be always correct when firmware changes. In particular, BNX2_ISCSI_START_CID may not always be fixed. We now calculate cp->max_cid_space and cp->iscsi_start_cid dynamically instead of using fixed constants. The unused cp->max_iscsi_conn is also eliminated. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Record hardware RX hash on each skb where possibleBen Hutchings2010-06-247-3/+49
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Disable setting feature flags that are not implementedBen Hutchings2010-06-241-1/+0
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Replace EFX_DRIVER_NAME with KBUILD_MODNAMEBen Hutchings2010-06-243-5/+3
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Implement message level controlBen Hutchings2010-06-2420-529/+727
| | | | | | | | | | | | | | | Replace EFX_ERR() with netif_err(), EFX_INFO() with netif_info(), EFX_LOG() with netif_dbg() and EFX_TRACE() and EFX_REGDUMP() with netif_vdbg(). Replace EFX_ERR_RL(), EFX_INFO_RL() and EFX_LOG_RL() using explicit calls to net_ratelimit(). Implement the ethtool operations to get and set message level flags, and add a 'debug' module parameter for the initial value. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Log MTD errors using partition name, not just net device nameBen Hutchings2010-06-241-9/+14
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sfc: Implement ethtool register dump operationBen Hutchings2010-06-244-0/+292
| | | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: do not send reset to already closed socketsKonstantin Khorenko2010-06-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | i've found that tcp_close() can be called for an already closed socket, but still sends reset in this case (tcp_send_active_reset()) which seems to be incorrect. Moreover, a packet with reset is sent with different source port as original port number has been already cleared on socket. Besides that incrementing stat counter for LINUX_MIB_TCPABORTONCLOSE also does not look correct in this case. Initially this issue was found on 2.6.18-x RHEL5 kernel, but the same seems to be true for the current mainstream kernel (checked on 2.6.35-rc3). Please, correct me if i missed something. How that happens: 1) the server receives a packet for socket in TCP_CLOSE_WAIT state that triggers a tcp_reset(): Call Trace: <IRQ> [<ffffffff8025b9b9>] tcp_reset+0x12f/0x1e8 [<ffffffff80046125>] tcp_rcv_state_process+0x1c0/0xa08 [<ffffffff8003eb22>] tcp_v4_do_rcv+0x310/0x37a [<ffffffff80028bea>] tcp_v4_rcv+0x74d/0xb43 [<ffffffff8024ef4c>] ip_local_deliver_finish+0x0/0x259 [<ffffffff80037131>] ip_local_deliver+0x200/0x2f4 [<ffffffff8003843c>] ip_rcv+0x64c/0x69f [<ffffffff80021d89>] netif_receive_skb+0x4c4/0x4fa [<ffffffff80032eca>] process_backlog+0x90/0xec [<ffffffff8000cc50>] net_rx_action+0xbb/0x1f1 [<ffffffff80012d3a>] __do_softirq+0xf5/0x1ce [<ffffffff8001147a>] handle_IRQ_event+0x56/0xb0 [<ffffffff8006334c>] call_softirq+0x1c/0x28 [<ffffffff80070476>] do_softirq+0x2c/0x85 [<ffffffff80070441>] do_IRQ+0x149/0x152 [<ffffffff80062665>] ret_from_intr+0x0/0xa <EOI> [<ffffffff80008a2e>] __handle_mm_fault+0x6cd/0x1303 [<ffffffff80008903>] __handle_mm_fault+0x5a2/0x1303 [<ffffffff80033a9d>] cache_free_debugcheck+0x21f/0x22e [<ffffffff8006a263>] do_page_fault+0x49a/0x7dc [<ffffffff80066487>] thread_return+0x89/0x174 [<ffffffff800c5aee>] audit_syscall_exit+0x341/0x35c [<ffffffff80062e39>] error_exit+0x0/0x84 tcp_rcv_state_process() ... // (sk_state == TCP_CLOSE_WAIT here) ... /* step 2: check RST bit */ if(th->rst) { tcp_reset(sk); goto discard; } ... --------------------------------- tcp_rcv_state_process tcp_reset tcp_done tcp_set_state(sk, TCP_CLOSE); inet_put_port __inet_put_port inet_sk(sk)->num = 0; sk->sk_shutdown = SHUTDOWN_MASK; 2) After that the process (socket owner) tries to write something to that socket and "inet_autobind" sets a _new_ (which differs from the original!) port number for the socket: Call Trace: [<ffffffff80255a12>] inet_bind_hash+0x33/0x5f [<ffffffff80257180>] inet_csk_get_port+0x216/0x268 [<ffffffff8026bcc9>] inet_autobind+0x22/0x8f [<ffffffff80049140>] inet_sendmsg+0x27/0x57 [<ffffffff8003a9d9>] do_sock_write+0xae/0xea [<ffffffff80226ac7>] sock_writev+0xdc/0xf6 [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe [<ffffffff8001fb49>] __pollwait+0x0/0xdd [<ffffffff8008d533>] default_wake_function+0x0/0xe [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e [<ffffffff800f0b49>] do_readv_writev+0x163/0x274 [<ffffffff80066538>] thread_return+0x13a/0x174 [<ffffffff800145d8>] tcp_poll+0x0/0x1c9 [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3 [<ffffffff800f0dd0>] sys_writev+0x49/0xe4 [<ffffffff800622dd>] tracesys+0xd5/0xe0 3) sendmsg fails at last with -EPIPE (=> 'write' returns -EPIPE in userspace): F: tcp_sendmsg1 -EPIPE: sk=ffff81000bda00d0, sport=49847, old_state=7, new_state=7, sk_err=0, sk_shutdown=3 Call Trace: [<ffffffff80027557>] tcp_sendmsg+0xcb/0xe87 [<ffffffff80033300>] release_sock+0x10/0xae [<ffffffff8016f20f>] vgacon_cursor+0x0/0x1a7 [<ffffffff8026bd32>] inet_autobind+0x8b/0x8f [<ffffffff8003a9d9>] do_sock_write+0xae/0xea [<ffffffff80226ac7>] sock_writev+0xdc/0xf6 [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe [<ffffffff8001fb49>] __pollwait+0x0/0xdd [<ffffffff8008d533>] default_wake_function+0x0/0xe [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e [<ffffffff800f0b49>] do_readv_writev+0x163/0x274 [<ffffffff80066538>] thread_return+0x13a/0x174 [<ffffffff800145d8>] tcp_poll+0x0/0x1c9 [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3 [<ffffffff800f0dd0>] sys_writev+0x49/0xe4 [<ffffffff800622dd>] tracesys+0xd5/0xe0 tcp_sendmsg() ... /* Wait for a connection to finish. */ if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { int old_state = sk->sk_state; if ((err = sk_stream_wait_connect(sk, &timeo)) != 0) { if (f_d && (err == -EPIPE)) { printk("F: tcp_sendmsg1 -EPIPE: sk=%p, sport=%u, old_state=%d, new_state=%d, " "sk_err=%d, sk_shutdown=%d\n", sk, ntohs(inet_sk(sk)->sport), old_state, sk->sk_state, sk->sk_err, sk->sk_shutdown); dump_stack(); } goto out_err; } } ... 4) Then the process (socket owner) understands that it's time to close that socket and does that (and thus triggers sending reset packet): Call Trace: ... [<ffffffff80032077>] dev_queue_xmit+0x343/0x3d6 [<ffffffff80034698>] ip_output+0x351/0x384 [<ffffffff80251ae9>] dst_output+0x0/0xe [<ffffffff80036ec6>] ip_queue_xmit+0x567/0x5d2 [<ffffffff80095700>] vprintk+0x21/0x33 [<ffffffff800070f0>] check_poison_obj+0x2e/0x206 [<ffffffff80013587>] poison_obj+0x36/0x45 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d [<ffffffff80023481>] dbg_redzone1+0x1c/0x25 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d [<ffffffff8000ca94>] cache_alloc_debugcheck_after+0x189/0x1c8 [<ffffffff80023405>] tcp_transmit_skb+0x764/0x786 [<ffffffff8025df8a>] tcp_send_active_reset+0xf9/0x14d [<ffffffff80258ff1>] tcp_close+0x39a/0x960 [<ffffffff8026be12>] inet_release+0x69/0x80 [<ffffffff80059b31>] sock_release+0x4f/0xcf [<ffffffff80059d4c>] sock_close+0x2c/0x30 [<ffffffff800133c9>] __fput+0xac/0x197 [<ffffffff800252bc>] filp_close+0x59/0x61 [<ffffffff8001eff6>] sys_close+0x85/0xc7 [<ffffffff800622dd>] tracesys+0xd5/0xe0 So, in brief: * a received packet for socket in TCP_CLOSE_WAIT state triggers tcp_reset() which clears inet_sk(sk)->num and put socket into TCP_CLOSE state * an attempt to write to that socket forces inet_autobind() to get a new port (but the write itself fails with -EPIPE) * tcp_close() called for socket in TCP_CLOSE state sends an active reset via socket with newly allocated port This adds an additional check in tcp_close() for already closed sockets. We do not want to send anything to closed sockets. Signed-off-by: Konstantin Khorenko <khorenko@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* broadcom: Add 5241 supportDmitry Baryshkov2010-06-242-0/+23
| | | | | | | This patch adds the 5241 PHY ID to the broadcom module. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* broadcom: move all PHY_ID's to headerDmitry Baryshkov2010-06-242-12/+18
| | | | | | | Move all PHY IDs to brcmphy.h header for completeness and unification of code. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: fix "netpoll: Allow netpoll_setup/cleanup recursion"Andrew Morton2010-06-241-1/+0
| | | | | | | | | Remove rtnl_unlock() which had no corresponding rtnl_lock(). Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-06-2325-66/+116
|\ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/ip_output.c
| * sky2: enable rx/tx in sky2_phy_reinit()Brandon Philips2010-06-231-5/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sky2_phy_reinit is called by the ethtool helpers sky2_set_settings, sky2_nway_reset and sky2_set_pauseparam when netif_running. However, at the end of sky2_phy_init GM_GP_CTRL has GM_GPCR_RX_ENA and GM_GPCR_TX_ENA cleared. So, doing these commands causes the device to stop working: $ ethtool -r eth0 $ ethtool -A eth0 autoneg off Fix this issue by enabling Rx/Tx after running sky2_phy_init in sky2_phy_reinit. Signed-off-by: Brandon Philips <bphilips@suse.de> Tested-by: Brandon Philips <bphilips@suse.de> Cc: stable@kernel.org Tested-by: Mike McCormack <mikem@ring3k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cnic: Disable statistics initialization for eth clients that do not support ↵Dmitry Kravkov2010-06-231-21/+34
| | | | | | | | | | | | | | | | | | | | | | statistics Disable statistics initialization for eth clients that do not support statistics. This prevents memory corruption on bnx2x hw. Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
| * net: add dependency on fw class module to qlcnic and netxen_nicAnirban Chakraborty2010-06-231-0/+2
| | | | | | | | | | | | | | | | netxen_nic and qlcnic driver depends on firmware_class module. Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * snmp: fix SNMP_ADD_STATS()Eric Dumazet2010-06-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | commit aa2ea0586d9d (tcp: fix outsegs stat for TSO segments) incorrectly assumed SNMP_ADD_STATS() was used from BH context. Fix this using mib[!in_softirq()] instead of mib[0] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵David S. Miller2010-06-221-0/+1
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
| | * ath5k: initialize ah->ah_current_channelBob Copeland2010-06-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ath5k assumes ah_current_channel is always a valid pointer in several places, but a newly created interface may not have a channel. To avoid null pointer dereferences, set it up to point to the first available channel until later reconfigured. This fixes the following oops: $ rmmod ath5k $ insmod ath5k $ iw phy0 set distance 11000 BUG: unable to handle kernel NULL pointer dereference at 00000006 IP: [<d0a1ff24>] ath5k_hw_set_coverage_class+0x74/0x1b0 [ath5k] *pde = 00000000 Oops: 0000 [#1] last sysfs file: /sys/devices/pci0000:00/0000:00:0e.0/ieee80211/phy0/index Modules linked in: usbhid option usb_storage usbserial usblp evdev lm90 scx200_acb i2c_algo_bit i2c_dev i2c_core via_rhine ohci_hcd ne2k_pci 8390 leds_alix2 xt_IMQ imq nf_nat_tftp nf_conntrack_tftp nf_nat_irc nf_cc Pid: 1597, comm: iw Not tainted (2.6.32.14 #8) EIP: 0060:[<d0a1ff24>] EFLAGS: 00010296 CPU: 0 EIP is at ath5k_hw_set_coverage_class+0x74/0x1b0 [ath5k] EAX: 000000c2 EBX: 00000000 ECX: ffffffff EDX: c12d2080 ESI: 00000019 EDI: cf8c0000 EBP: d0a30edc ESP: cfa09bf4 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process iw (pid: 1597, ti=cfa09000 task=cf88a000 task.ti=cfa09000) Stack: d0a34f35 d0a353f8 d0a30edc 000000fe cf8c0000 00000000 1900063d cfa8c9e0 <0> cfa8c9e8 cfa8c0c0 cfa8c000 d0a27f0c 199d84b4 cfa8c200 00000010 d09bfdc7 <0> 00000000 00000000 ffffffff d08e0d28 cf9263c0 00000001 cfa09cc4 00000000 Call Trace: [<d0a27f0c>] ? ath5k_hw_attach+0xc8c/0x3c10 [ath5k] [<d09bfdc7>] ? __ieee80211_request_smps+0x1347/0x1580 [mac80211] [<d08e0d28>] ? nl80211_send_scan_start+0x7b8/0x4520 [cfg80211] [<c10f5db9>] ? nla_parse+0x59/0xc0 [<c11ca8d9>] ? genl_rcv_msg+0x169/0x1a0 [<c11ca770>] ? genl_rcv_msg+0x0/0x1a0 [<c11c7e68>] ? netlink_rcv_skb+0x38/0x90 [<c11c9649>] ? genl_rcv+0x19/0x30 [<c11c7c03>] ? netlink_unicast+0x1b3/0x220 [<c11c893e>] ? netlink_sendmsg+0x26e/0x290 [<c11a409e>] ? sock_sendmsg+0xbe/0xf0 [<c1032780>] ? autoremove_wake_function+0x0/0x50 [<c104d846>] ? __alloc_pages_nodemask+0x106/0x530 [<c1074933>] ? do_lookup+0x53/0x1b0 [<c10766f9>] ? __link_path_walk+0x9b9/0x9e0 [<c11acab0>] ? verify_iovec+0x50/0x90 [<c11a42b1>] ? sys_sendmsg+0x1e1/0x270 [<c1048e50>] ? find_get_page+0x10/0x50 [<c104a96f>] ? filemap_fault+0x5f/0x370 [<c1059159>] ? __do_fault+0x319/0x370 [<c11a55b4>] ? sys_socketcall+0x244/0x290 [<c101962c>] ? do_page_fault+0x1ec/0x270 [<c1019440>] ? do_page_fault+0x0/0x270 [<c1002ae5>] ? syscall_call+0x7/0xb Code: 00 b8 fe 00 00 00 b9 f8 53 a3 d0 89 5c 24 14 89 7c 24 10 89 44 24 0c 89 6c 24 08 89 4c 24 04 c7 04 24 35 4f a3 d0 e8 7c 30 60 f0 <0f> b7 43 06 ba 06 00 00 00 a8 10 75 0e 83 e0 20 83 f8 01 19 d2 EIP: [<d0a1ff24>] ath5k_hw_set_coverage_class+0x74/0x1b0 [ath5k] SS:ESP 0068:cfa09bf4 CR2: 0000000000000006 ---[ end trace 54f73d6b10ceb87b ]--- Cc: stable@kernel.org Reported-by: Steve Brown <sbrown@cortland.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
| * | hso: remove setting of low_latency flagFilip Aben2010-06-221-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the setting of the low_latency flag. tty_flip_buffer_push() is occasionally being called in irq context, which causes a hang if the low_latency flag is set. Removing the low_latency flag only seems to impact the flush to ldisc, which will now be put on a workqueue. Signed-off-by: Filip Aben <f.aben@option.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | udp: Fix bogus UFO packet generationHerbert Xu2010-06-211-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It has been reported that the new UFO software fallback path fails under certain conditions with NFS. I tracked the problem down to the generation of UFO packets that are smaller than the MTU. The software fallback path simply discards these packets. This patch fixes the problem by not generating such packets on the UFO path. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>