Commit message | Author | Age | Files | Lines
* [XFRM]: CONFIG_XFRM_MIGRATE optionShinta Sugimoto2007-02-081-0/+11
| | | | | | | | | | Add the CONFIG_XFRM_MIGRATE option, which makes it possible for a user application to send or receive MIGRATE messages to/from a netlink socket. Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [XFRM]: User interface for handling XFRM_MSG_MIGRATEShinta Sugimoto2007-02-081-0/+173
| | | | | | | | | | | Add a user interface for handling XFRM_MSG_MIGRATE. The message is issued by a user application. When the kernel receives the message, it updates the XFRM databases. Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [XFRM]: Extension for dynamic update of endpoint address(es)Shinta Sugimoto2007-02-084-0/+467
| | | | | | | | | | | | | | Extend the XFRM framework so that endpoint address(es) in the XFRM databases could be dynamically updated according to a request (MIGRATE message) from user application. Target XFRM policy is first identified by the selector in the MIGRATE message. Next, the endpoint addresses of the matching templates and XFRM states are updated according to the MIGRATE message. Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: ip6_tables: remove redundant structure definitionsPatrick McHardy2007-02-084-57/+19
| | | | | | | | Move ip6t_standard/ip6t_error_target/ip6t_error definitions to ip6_tables.h instead of defining them in each table individually. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: ip_tables: remove declaration of non-existant ipt_find_target ↵Patrick McHardy2007-02-081-3/+0
| | | | | | | function Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: ip6_tables: support MH matchMasahide NAKAMURA2007-02-084-0/+132
| | | | | | | | | | | This introduces match for Mobility Header (MH) described by Mobile IPv6 specification (RFC3775). User can specify the MH type or its range to be matched. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: Yasuyuki Kozakai <kozakai@linux-ipv6.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
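The range match described above can be sketched in plain C. This is only an illustration under stated assumptions: `struct mh_rule` and `mh_type_match()` are hypothetical names, not the structures the patch adds, and the sketch covers nothing beyond "MH type or its range".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rule: match MH types in the inclusive range [type_min, type_max].
 * A single type is expressed as type_min == type_max. */
struct mh_rule {
    uint8_t type_min;
    uint8_t type_max;
};

/* Return 1 if the Mobility Header type falls inside the rule's range. */
static int mh_type_match(const struct mh_rule *r, uint8_t mh_type)
{
    assert(r->type_min <= r->type_max);
    return mh_type >= r->type_min && mh_type <= r->type_max;
}
```

With RFC 3775 numbering, a rule of {5, 6} would cover Binding Update (5) and Binding Acknowledgement (6) but not Care-of Test (4).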
* [NETFILTER]: {ip,ip6}_tables: use struct xt_table instead of redefined ↵Jan Engelhardt2007-02-0811-27/+24
| | | | | | | | structure names Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: {ip,ip6}_tables: remove x_tables wrapper functionsJan Engelhardt2007-02-0836-171/+202
| | | | | | | | | Use the x_tables functions directly to make it better visible which parts are shared between ip_tables and ip6_tables. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: x_tables: fix return values for LOG/ULOGJan Engelhardt2007-02-083-7/+14
| | | | | | Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: NAT: optional source port randomization supportEric Leblond2007-02-0810-4/+48
| | | | | | | | This patch adds support to NAT to randomize source ports. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: add IPv6-capable TCPMSS targetPatrick McHardy2007-02-089-238/+337
| | | | | Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Add UDPLITE support in a few missing spotsPatrick McHardy2007-02-085-0/+6
| | | | | Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: bridge-netfilter: use nf_register_hooks/nf_unregister_hooksPatrick McHardy2007-02-081-22/+7
| | | | | | | Additionally mark the init function __init. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: nf_nat: remove broken HOOKNAME macroPatrick McHardy2007-02-081-6/+0
| | | | | Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Remove useless comparisons before assignmentsJan Engelhardt2007-02-086-21/+9
| | | | | | | | Remove unnecessary if() constructs before assignment. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: tcp conntrack: do liberal tracking for picked up connectionsPatrick McHardy2007-02-083-51/+33
| | | | | | | | | | | | Do liberal tracking (only RSTs need to be in-window) for connections picked up without seeing a SYN to deal with window scaling. Also change logging of invalid packets not to log packets accepted by liberal tracking to avoid spamming the logs. Based on suggestion from James Ralston <ralston@pobox.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETFILTER]: Add SANE connection tracking helperMichal Schmidt2007-02-085-0/+279
| | | | | | | | | | | | This is nf_conntrack_sane, a netfilter connection tracking helper module for the SANE protocol used by the 'saned' daemon to make scanners available via network. The SANE protocol uses separate control & data connections, similar to passive FTP. The helper module is needed to recognize the data connection as RELATED to the control one. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IRLAN]: handle out of memory errorsAkinobu Mita2007-02-081-3/+20
| | | | | | | | | | | | This patch checks return values: - irlmp_register_client() - irlmp_register_service() - irlan_open() Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IRDA]: handle out of memory errorsAkinobu Mita2007-02-081-0/+40
| | | | | | | | | This patch checks return value of memory allocation functions for irda subsystem and fixes memory leaks in error cases. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: unregister_netdevice as voidStephen Hemminger2007-02-086-13/+14
| | | | | | | | | There was no real useful information from the unregister_netdevice() return code, the only error occurred in a situation that was a driver bug. So change it to a void function. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6] RAW: Add checksum default defines for MH.Masahide NAKAMURA2007-02-082-28/+11
| | | | | | | | | | | | | | Add checksum default defines for the Mobility Header (MH), which goes through a raw socket. As a result, the kernel handles the MH checksum by default. This patch also removes the inbound MH checksum verification in mip6_mh_filter(), since it did not consider a user-specified checksum offset and duplicated the check in the raw socket code. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4/IPV6] multicast: Check add_grhead() return valueAlexey Dobriyan2007-02-082-0/+4
| | | | | | | | add_grhead() allocates memory with GFP_ATOMIC, and in at least two places the skb it returns is passed to skb_put() without being checked. Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [XFRM]: Fix missed error setting in xfrm4_policy.cDavid S. Miller2007-02-081-0/+1
| | | | | | | When we can't find the afinfo we should return EAFNOSUPPORT. GCC warned about the uninitialized 'err' for this path as well. Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPSEC]: IPv4 over IPv6 IPsec tunnelMiika Komu2007-02-082-27/+65
| | | | | | | | | This is the patch to support IPv4 over IPv6 IPsec. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPSEC]: IPv6 over IPv4 IPsec tunnelMiika Komu2007-02-082-26/+77
| | | | | | | | | This is the patch to support IPv6 over IPv4 IPsec Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPSEC]: exporting xfrm_state_afinfoMiika Komu2007-02-084-5/+10
| | | | | | | | | This patch exports xfrm_state_afinfo. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [BONDING]: Replace kmalloc() + memset() pairs with the appropriate kzalloc() ↵Joe Jin2007-02-082-6/+2
| | | | | | | | | | calls Replace kmalloc() + memset() pairs with the appropriate kzalloc() calls in the bonding driver. Signed-off-by: Joe Jin <lkmaillist@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* [TG3]: Avoid an expensive divide.Eric Dumazet2007-02-081-1/+1
| During an oprofile session of linux-2.6.20 on a dual opteron system, I noticed an expensive divide was done in tg3_poll().
|
| I am using gcc-4.1.1, so the following comment from drivers/net/tg3.c seems over-optimistic:
|
| /* Do not place this n-ring entries value into the tp struct itself,
|  * we really want to expose these constants to GCC so that modulo et
|  * al. operations are done with shifts and masks instead of with
|  * hw multiply/modulo instructions. Another solution would be to
|  * replace things like '% foo' with '& (foo - 1)'.
|  */
| #define TG3_RX_RCB_RING_SIZE(tp) \
|         ((tp->tg3_flags2 & TG3_FLG2_5705_PLUS) ? 512 : 1024)
|
| Assembly code before patch: (oprofile results included)
|
|   6434 0.0088  :ffffffff803684b9: mov 0x6f0(%r15),%eax
|    587 8.0e-04 :ffffffff803684c0: and $0x40000,%eax
|   2170 0.0030  :ffffffff803684c5: cmp $0x1,%eax
|                :ffffffff803684c8: lea 0x1(%r13),%eax
|                :ffffffff803684cc: sbb %ecx,%ecx
|   2051 0.0028  :ffffffff803684ce: xor %edx,%edx
|                :ffffffff803684d0: and $0x200,%ecx
|     20 2.7e-05 :ffffffff803684d6: add $0x200,%ecx
|   1986 0.0027  :ffffffff803684dc: div %ecx
| 103427 0.1410  :ffffffff803684de: cmp %edx,0xffffffffffffff7c(%rbp)
|
| Assembly code after the suggested patch:
|
| ffffffff803684b9: mov 0x6f0(%r15),%eax
| ffffffff803684c0: and $0x40000,%eax
| ffffffff803684c5: cmp $0x1,%eax
| ffffffff803684c8: sbb %eax,%eax
| ffffffff803684ca: inc %r13d
| ffffffff803684cd: and $0x200,%eax
| ffffffff803684d2: add $0x1ff,%eax
| ffffffff803684d7: and %eax,%r13d
| ffffffff803684da: cmp %r13d,0xffffffffffffff7c(%rbp)
|
| Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
| Acked-by: Michael Chan <mchan@broadcom.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
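The '% foo' versus '& (foo - 1)' substitution mentioned in the quoted comment can be shown with a small standalone sketch. The helper names below are hypothetical and this is not the tg3 code itself; it only illustrates that the two forms agree when the ring size is a power of two (512 or 1024 in the message above), which is what lets the divide be dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: advance a ring index using modulo.
 * With a size that is not a compile-time constant, this can become a hardware divide. */
static inline uint32_t ring_next_mod(uint32_t idx, uint32_t ring_size)
{
    return (idx + 1) % ring_size;
}

/* Hypothetical helper: same result using a mask.
 * Only valid when ring_size is a non-zero power of two. */
static inline uint32_t ring_next_mask(uint32_t idx, uint32_t ring_size)
{
    assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
    return (idx + 1) & (ring_size - 1);
}
```

The mask form wraps index 511 back to 0 for a 512-entry ring, exactly as the modulo form does, without a `div` instruction.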
* [DCCP]: Warning fixes.Andrew Morton2007-02-081-2/+3
| | | | | | | | | | | net/dccp/ccids/ccid3.c: In function `ccid3_hc_rx_packet_recv': net/dccp/ccids/ccid3.c:1007: warning: long int format, different type arg (arg 3) net/dccp/ccids/ccid3.c:1007: warning: long int format, different type arg (arg 4) opaque types must be suitably cast for printing. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] slip: Replace kmalloc() + memset() pairs with the appropriate ↵Joe Jin2007-02-081-4/+1
| | | | | | | | | kzalloc() calls This patch replaces kmalloc() + memset() pairs with the appropriate kzalloc(). Signed-off-by: Joe Jin <joe.jin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] net/wanrouter/wanmain.c: cleanupsAdrian Bunk2007-02-082-17/+8
| | | | | | | | | | | | | | This patch contains the following cleanups: - make the following needlessly global functions static: - lock_adapter_irq() - unlock_adapter_irq() - #if 0 the following unused global functions: - wanrouter_encapsulate() - wanrouter_type_trans() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [ATM]: Fix for crash in adummy_init()Daniel Walker2007-02-081-1/+2
| | | | | | | | | | | | | | | | | | | This was reported by Ingo Molnar here, http://lkml.org/lkml/2006/12/18/119 The problem is that adummy_init() depends on atm_init() , but adummy_init() is called first. So I put atm_init() into subsys_initcall which seems appropriate, and it will still get module_init() if it becomes a module. Interesting to note that you could crash your system here if you just load the modules in the wrong order. Signed-off-by: Daniel Walker <dwalker@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: user of the jiffies rounding code: NetworkingArjan van de Ven2007-02-083-3/+13
| | | | | | | | | | | | | | | | | | | | This patch introduces users of the round_jiffies() function in the networking code. These timers all were of the "about once a second" or "about once every X seconds" variety and several showed up in the "what wakes the cpu up" profiles that the tickless patches provide. Some timers are highly dynamic based on network load; but even on low activity systems they still show up so the rounding is done only in cases of low activity, allowing higher frequency timers in the high activity case. The various hardware watchdogs are an obvious case; they run every 2 seconds but aren't otherwise specific of exactly when they need to run. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
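The rounding idea in the message above can be sketched in userspace C. This is only an illustration: `TICKS_PER_SEC` and `round_ticks_up()` are hypothetical names, and the kernel's round_jiffies() may apply a different rounding rule. The point is only that timers snapped to whole-second boundaries tend to expire together, so an idle CPU is woken once instead of several times.

```c
#include <assert.h>

/* Stand-in for the kernel's HZ; the value is an assumption for this sketch. */
#define TICKS_PER_SEC 250UL

/* Hypothetical helper: round an expiry time (in ticks) up to the next
 * whole-second boundary. Times already on a boundary are left unchanged. */
static unsigned long round_ticks_up(unsigned long ticks)
{
    unsigned long rem = ticks % TICKS_PER_SEC;

    assert(TICKS_PER_SEC > 0);
    return rem ? ticks + (TICKS_PER_SEC - rem) : ticks;
}
```

Two timers due at ticks 1001 and 1249 would both be moved to tick 1250 and fire in the same wakeup.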
* [TCP]: Don't apply FIN exception to full TSO segments.John Heffner2007-02-081-1/+2
| | | | | Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP]: Check num sacks in SACK fast pathBaruch Even2007-02-081-0/+5
| | | | | | | | | | We clear the unused parts of the SACK cache. This prevents us from mistakenly taking the cache data if the old data in the SACK cache is the same as the data in the SACK block. This assumes that we never receive an empty SACK block with start and end both at zero. Signed-off-by: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP]: Seperate DSACK from SACK fast pathBaruch Even2007-02-082-36/+32
| | | | | | | | | | | | | | Move DSACK code outside the SACK fast-path checking code. If the DSACK determined that the information was too old, we were left with a partially copied cache. Most likely this matters very little, since the next packet will not be a DSACK and we will find it in the cache, but it's still not good form and there is little reason to couple the two checks. Since the SACK receive cache doesn't need the data to be in host order, we also remove the ntohl in the checking loop. Signed-off-by: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP]: Advance fast path pointer for first block onlyBaruch Even2007-02-081-10/+24
| | | | | | | | | Only advance the SACK fast-path pointer for the first block, the fast-path assumes that only the first block advances next time so we should not move the cached skb for the next sack blocks. Signed-off-by: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PACKET]: Fix skb->cb clobbering between aux and sockaddrHerbert Xu2007-02-081-16/+30
| | | | | | | | | | | Both aux data and sockaddr try to use the same buffer, which obviously doesn't work. We just happen to have 4 bytes free in the skb->cb if you take away the maximum length of sockaddr_ll. That's just enough to store the one piece of info from aux data that we can't generate at recvmsg(2) time. This is what the following patch does. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* [PACKET]: Add optional checksum computation for recvmsgHerbert Xu2007-02-082-9/+58
| This patch is needed to allow ISC's DHCP server (and probably other DHCP servers/clients using AF_PACKET) to serve another client on the same Xen host.
|
| The problem is that packets between different domains on the same Xen host only have partial checksums. Unfortunately this piece of information is not passed along in AF_PACKET unless you're using the mmap interface. Since dhcpd doesn't support packet-mmap, UDP packets from the same host come out with apparently bogus checksums.
|
| This patch adds a mechanism for AF_PACKET recvmsg(2) to return the status along with the packet. It does so by adding a new cmsg that contains this information along with some other relevant data such as the original packet length.
|
| I didn't include the time stamp information since there is already a cmsg for that.
|
| This patch also changes the mmap code to set the CSUMNOTREADY flag on all packets instead of just outgoing packets on cooked sockets.
|
| Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
| Signed-off-by: David S. Miller <davem@davemloft.net>
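A receiver could pick the new control message out of a recvmsg(2) result roughly as follows. This is a sketch, not code from the patch: the message above does not name the cmsg, so the use of `PACKET_AUXDATA`, `struct tpacket_auxdata` and `TP_STATUS_CSUMNOTREADY` from `<linux/if_packet.h>` is an assumption about how the interface is exposed to userspace, and `find_auxdata()` and `checksum_pending()` are hypothetical helper names. On current kernels the application also has to enable the option with setsockopt() on the packet socket before the cmsg is delivered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Hypothetical helper: scan the control messages of a received msghdr.
 * Returns 1 and copies the aux data into *aux if a PACKET_AUXDATA cmsg is
 * present, 0 otherwise. */
static int find_auxdata(struct msghdr *msg, struct tpacket_auxdata *aux)
{
    struct cmsghdr *c;

    assert(msg != NULL && aux != NULL);
    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == SOL_PACKET &&
            c->cmsg_type == PACKET_AUXDATA &&
            c->cmsg_len >= CMSG_LEN(sizeof(*aux))) {
            memcpy(aux, CMSG_DATA(c), sizeof(*aux));
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: non-zero when the packet's checksum has not been
 * filled in yet, so a "bad" checksum should not be treated as corruption. */
static int checksum_pending(const struct tpacket_auxdata *aux)
{
    return (aux->tp_status & TP_STATUS_CSUMNOTREADY) != 0;
}
```

A DHCP server using this approach would skip UDP checksum validation when `checksum_pending()` is true, and could use `tp_len` to recover the original packet length.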
* [IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts.David S. Miller2007-02-0812-13/+14
| | | | | | | | | | Do this even for non-blocking sockets. This avoids the silly -EAGAIN that applications can see now, even for non-blocking sockets in some cases (f.e. connect()). With help from Venkat Tekkirala. Signed-off-by: David S. Miller <davem@davemloft.net>
* [TCP]: remove tcp header from tcp_v4_check (take #2)Frederik Deweerdt2007-02-085-12/+11
| | | | | | | | | | | The tcphdr struct passed to tcp_v4_check is not used, the following patch removes it from the parameter list. This adds the netfilter modifications missing in the patch I sent for rc3-mm1. Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV6] ROUTE: Do not route packets to link-local address on other device.YOSHIFUJI Hideaki2007-02-081-5/+14
| | | | | | | With help from Wei Dong <weid@np.css.fujitsu.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETLINK]: Don't BUG on undersized allocationsPatrick McHardy2007-02-0812-85/+149
| | | | | | | | | | | | | | | Currently netlink users BUG when the allocated skb for an event notification is undersized. While this is certainly a kernel bug, it's not critical and crashing the kernel is too drastic, especially when considering that these errors have appeared multiple times in the past and it BUGs even if no listeners are present. This patch replaces BUG by WARN_ON and changes the notification functions to inform potential listeners of undersized allocations using a unique error code (EMSGSIZE). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
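The change in error handling can be sketched outside the kernel. The names `struct msg_buf`, `msg_put()` and `build_notification()` are hypothetical and stand in for the skb and the netlink fill functions; the sketch only shows the pattern of returning -EMSGSIZE to the caller when the buffer is too small instead of aborting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size message buffer standing in for an skb. */
struct msg_buf {
    char   data[32];
    size_t len;
};

/* Append an attribute. If it does not fit, report -EMSGSIZE to the caller
 * rather than treating the undersized buffer as fatal. */
static int msg_put(struct msg_buf *m, const void *attr, size_t attr_len)
{
    assert(m->len <= sizeof(m->data));
    if (attr_len > sizeof(m->data) - m->len)
        return -EMSGSIZE;
    memcpy(m->data + m->len, attr, attr_len);
    m->len += attr_len;
    return 0;
}

/* Build a notification carrying a name. The error is propagated so the
 * caller can warn and tell listeners about it. */
static int build_notification(struct msg_buf *m, const char *name)
{
    m->len = 0;
    return msg_put(m, name, strlen(name) + 1);
}
```

A caller that gets -EMSGSIZE back can log a warning and pass the error on to any listeners, which is the behaviour the message above describes in place of BUG().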
* [NET_SCHED] sch_prio: class statistics printing enabledJarek Poplawski2007-02-081-0/+15
| | | | | | | | | | This patch adds a dump_stats callback to enable printing of basic statistics of prio classes. (With help of Patrick McHardy). Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'upstream-linus' of ↵Linus Torvalds2007-02-0817-468/+1211
|\
| | master.kernel.org:/pub/scm/linux/kernel/git/mfasheh/ocfs2
| |
| | * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/mfasheh/ocfs2: (22 commits)
| |   configfs: Zero terminate data in configfs attribute writes.
| |   [PATCH] ocfs2 heartbeat: clean up bio submission code
| |   ocfs2: introduce sc->sc_send_lock to protect outbound outbound messages
| |   [PATCH] ocfs2: drop INET from Kconfig, not needed
| |   ocfs2_dlm: Add timeout to dlm join domain
| |   ocfs2_dlm: Silence some messages during join domain
| |   ocfs2_dlm: disallow a domain join if node maps mismatch
| |   ocfs2_dlm: Ensure correct ordering of set/clear refmap bit on lockres
| |   ocfs2: Binds listener to the configured ip address
| |   ocfs2_dlm: Calling post handler function in assert master handler
| |   ocfs2: Added post handler callable function in o2net message handler
| |   ocfs2_dlm: Cookies in locks not being printed correctly in error messages
| |   ocfs2_dlm: Silence a failed convert
| |   ocfs2_dlm: wake up sleepers on the lockres waitqueue
| |   ocfs2_dlm: Dlm dispatch was stopping too early
| |   ocfs2_dlm: Drop inflight refmap even if no locks found on the lockres
| |   ocfs2_dlm: Flush dlm workqueue before starting to migrate
| |   ocfs2_dlm: Fix migrate lockres handler queue scanning
| |   ocfs2_dlm: Make dlmunlock() wait for migration to complete
| |   ocfs2_dlm: Fixes race between migrate and dirty
| |   ...
| * configfs: Zero terminate data in configfs attribute writes.Joel Becker2007-02-071-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Attributes in configfs are text files. As such, most handlers expect to be able to call functions like simple_strtoul() without checking the bounds of the buffer. Change the call to zero terminate the buffer before calling the client's ->store() method. This does reduce the attribute size from PAGE_SIZE to PAGE_SIZE-1. Also, change get_zeroed_page() to alloc_page(), as we are handling the termination. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
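The termination step can be illustrated with a userspace sketch. `BUF_SIZE` stands in for PAGE_SIZE, and `fill_write_buffer()` and `store_value()` are hypothetical names for this example rather than the configfs functions; the sketch shows why the usable attribute size drops by one byte and why a handler can then call strtoul() without checking bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for PAGE_SIZE; the value is an assumption for this sketch. */
#define BUF_SIZE 4096

/* Hypothetical helper: copy at most BUF_SIZE - 1 bytes into the page and
 * NUL-terminate it. Returns the number of data bytes kept. */
static size_t fill_write_buffer(char *page, const char *src, size_t count)
{
    assert(page != NULL && src != NULL);
    if (count >= BUF_SIZE)
        count = BUF_SIZE - 1;   /* reserve one byte for the terminator */
    memcpy(page, src, count);
    page[count] = '\0';
    return count;
}

/* Hypothetical store handler: safe to parse because the buffer is always
 * terminated, even if the page previously held other data. */
static unsigned long store_value(const char *page)
{
    return strtoul(page, NULL, 10);
}
```

Because the terminator is written explicitly, the page does not need to be zeroed on allocation, which matches the switch away from get_zeroed_page() described above.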
| * [PATCH] ocfs2 heartbeat: clean up bio submission codePhilipp Reisner2007-02-071-127/+31
| | As Mathieu Avila already pointed out on Thu, 07 Sep 2006 03:15:25 -0700, OCFS2 expects bio_add_page() to add pages to BIOs in an easily predictable manner.
| |
| | That is not true, especially for devices with their own merge_bvec_fn().
| |
| | Therefore OCFS2's heartbeat code is very likely to fail on such devices.
| |
| | Move the bio_put() call into the bio's bi_end_io() function. This makes the whole idea of trying to predict the behaviour of bio_add_page() unnecessary. Removed compute_max_sectors() and o2hb_compute_request_limits().
| |
| | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
| | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
| * ocfs2: introduce sc->sc_send_lock to protect outbound outbound messagesZhen Wei2007-02-072-0/+10
| | When there is a lot of multithreaded I/O usage, two threads can collide while sending out a message to the other nodes. This is due to the lack of locking between threads while sending out the messages.
| |
| | When a connected TCP send(), sendto(), or sendmsg() arrives in the Linux kernel, it eventually comes through tcp_sendmsg(). tcp_sendmsg() protects itself by acquiring a lock at invocation by calling lock_sock(). tcp_sendmsg() then loops over the buffers in the iovec, allocating associated sk_buff's and cache pages for use in the actual send. As it does so, it pushes the data out to tcp for actual transmission. However, if one of those allocations fails (because a large number of large sends is being processed, for example), it must wait for memory to become available. It does so by jumping to wait_for_sndbuf or wait_for_memory, both of which eventually cause a call to sk_stream_wait_memory(). sk_stream_wait_memory() contains a code path that calls sk_wait_event(). Finally, sk_wait_event() contains the call to release_sock().
| |
| | The following patch adds a lock to the socket container in order to properly serialize outbound requests.
| |
| | From: Zhen Wei <zwei@novell.com>
| | Acked-by: Jeff Mahoney <jeffm@suse.com>
| | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
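The serialization described above can be sketched with a userspace mutex. `struct sock_container`, `sc_init()` and `sc_send_message()` are hypothetical stand-ins, and a byte array plays the role of the TCP stream; the kernel patch uses its own lock type. The sketch only shows the idea of holding one lock for the whole message so that the header and payload of two senders cannot interleave.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the socket container. */
struct sock_container {
    pthread_mutex_t send_lock;  /* plays the role of sc->sc_send_lock */
    char            wire[256];  /* bytes "on the wire", for illustration */
    size_t          wire_len;
};

static void sc_init(struct sock_container *sc)
{
    pthread_mutex_init(&sc->send_lock, NULL);
    sc->wire_len = 0;
}

/* Send header and body as one unit. The lock is held across both writes,
 * so another thread cannot slip its own data between them.
 * Returns 0 on success, -1 if the message does not fit. */
static int sc_send_message(struct sock_container *sc,
                           const char *hdr, const char *body)
{
    size_t hdr_len, body_len;
    int ret = -1;

    assert(hdr != NULL && body != NULL);
    hdr_len = strlen(hdr);
    body_len = strlen(body);

    pthread_mutex_lock(&sc->send_lock);
    if (hdr_len + body_len <= sizeof(sc->wire) - sc->wire_len) {
        memcpy(sc->wire + sc->wire_len, hdr, hdr_len);
        memcpy(sc->wire + sc->wire_len + hdr_len, body, body_len);
        sc->wire_len += hdr_len + body_len;
        ret = 0;
    }
    pthread_mutex_unlock(&sc->send_lock);
    return ret;
}
```

The socket's own lock is not enough here because, as the message explains, it can be released in the middle of a send while waiting for memory; a lock owned by the container stays held until the whole message has been queued.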
| * [PATCH] ocfs2: drop INET from Kconfig, not neededRandy Dunlap2007-02-071-1/+0
| | | | | | | | | | | | | | OCFS2: drop 'depends on INET' since local mounts are now allowed. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
| * ocfs2_dlm: Add timeout to dlm join domainSunil Mushran2007-02-071-1/+13
| | | | | | | | | | | | | | | | | | | | Currently the ocfs2 dlm has no timeout during dlm join domain. While this is not a problem in normal operation, this does become an issue if, say, the other node is refusing to let the node join the domain because of a stuck recovery. This patch adds a 90 sec timeout. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>