aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nfsd41: encode replay sequence from the slot valuesAndy Adamson2009-07-282-36/+37
| | | | | | | | | | | | The sequence operation is not cached; always encode the sequence operation on a replay from the slot table and session values. This simplifies the sessions replay logic in nfsd4_proc_compound. If this is a replay of a compound that was specified not to be cached, return NFS4ERR_RETRY_UNCACHED_REP. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: rename nfsd4_enc_uncached_replayAndy Adamson2009-07-281-2/+2
| | | | | | | This function is only used for SEQUENCE replay. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: Use separate DRC for setclientidAndy Adamson2009-07-285-42/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of trying to share the generic 4.1 reply cache code for the CREATE_SESSION reply cache, it's simpler to handle CREATE_SESSION separately. The nfs41 single slot clientid DRC holds the results of create session processing. CREATE_SESSION can be preceeded by a SEQUENCE operation (an embedded CREATE_SESSION) and the create session single slot cache must be maintained. nfsd4_replay_cache_entry() and nfsd4_store_cache_entry() do not implement the replay of an embedded CREATE_SESSION. The clientid DRC slot does not need the inuse, cachethis or other fields that the multiple slot session cache uses. Replace the clientid DRC cache struct nfs4_slot cache with a new nfsd4_clid_slot cache. Save the xdr struct nfsd4_create_session into the cache at the end of processing, and on a replay, replace the struct for the replay request with the cached version all while under the state lock. nfsd4_proc_compound will handle both the solo and embedded CREATE_SESSION case via the normal use of encode_operation. Errors that do not change the create session cache: A create session NFS4ERR_STALE_CLIENTID error means that a client record (and associated create session slot) could not be found and therefore can't be changed. NFSERR_SEQ_MISORDERED errors do not change the slot cache. All other errors get cached. Remove the clientid DRC specific check in nfs4svc_encode_compoundres to put the session only if cstate.session is set which will now always be true. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: change check_slot_seqid parametersAndy Adamson2009-07-281-11/+13
| | | | | | | For separation of session slot and clientid slot processing. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: remove redundant forechannel max requests checkAndy Adamson2009-07-281-4/+0
| | | | | | | This check is done in set_forechannel_maxreqs. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: hange from page to memory based drc limitsAndy Adamson2009-07-284-24/+22
| | | | | | | | | | | NFSD_SLOT_CACHE_SIZE is the size of all encoded operation responses (excluding the sequence operation) that we want to cache. For now, keep NFSD_SLOT_CACHE_SIZE at PAGE_SIZE. It will be reduced when the DRC is changed from page based to memory based. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: reserve less memory for DRCAndy Adamson2009-07-281-2/+1
| | | | | | | Also remove a slightly misleading comment. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: minor set_forechannel_maxreqs cleanupAndy Adamson2009-07-281-8/+7
| | | | | Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: reclaim DRC memory on session freeAndy Adamson2009-07-281-0/+3
| | | | | | | This fixes a leak which would eventually lock out new clients. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd: minor write_pool_threads exit cleanupJ. Bruce Fields2009-07-281-5/+1
| | | | Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Fix memory leak in write_pool_threadsEric Sesterhenn2009-07-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | kmemleak produces the following warning unreferenced object 0xc9ec02a0 (size 8): comm "cat", pid 19048, jiffies 730243 backtrace: [<c01bf970>] create_object+0x100/0x240 [<c01bfadb>] kmemleak_alloc+0x2b/0x60 [<c01bcd4b>] __kmalloc+0x14b/0x270 [<c02fd027>] write_pool_threads+0x87/0x1d0 [<c02fcc08>] nfsctl_transaction_write+0x58/0x70 [<c02fcc6f>] nfsctl_transaction_read+0x4f/0x60 [<c01c2574>] vfs_read+0x94/0x150 [<c01c297d>] sys_read+0x3d/0x70 [<c0102d6b>] sysenter_do_call+0x12/0x32 [<ffffffff>] 0xffffffff write_pool_threads() only frees nthreads on error paths, in the success case we leak it. Signed-off-by: Eric Sesterhenn <eric.sesterhenn@lsexperts.de> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: use globals for DRC limitsAndy Adamson2009-07-144-11/+23
| | | | | | | | | | | | | The version 4.1 DRC memory limit and tracking variables are server wide and session specific. Replace struct svc_serv fields with globals. Stop using the svc_serv sv_lock. Add a spinlock to serialize access to the DRC limit management variables which change on session creation and deletion (usage counter) or (future) administrative action to adjust the total DRC memory limit. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
* SUNRPC: handle IPv6 PKTINFO when extracting destination addressChuck Lever2009-07-141-28/+56
| | | | | | | | | | | | | | | | PKTINFO is needed to scrape the caller's IP address off the socket so RPC datagram replies are routed correctly. Fill in missing pieces in the kernel RPC server's UDP receive path to request IPv6 PKTINFO and correctly parse the IPv6 cmsg header. Without this patch, kernel RPC services drop all incoming requests on UDP on IPv6. Related commit: 7a37f5787e76bf1765c1add3a9a7163f841a28bb Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* NFSv4: ACL in operations 'open' and 'create' should be usedYu Zhiguo2009-07-141-5/+42
| | | | | | | | | | ACL in operations 'open' and 'create' is decoded but never be used. It should be set as the initial ACL for the object according to RFC3530. If error occurs when setting the ACL, just clear the ACL bit in the returned attr bitmap. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* nfsd41: gather and report statistics also for v4.1 opsBenny Halevy2009-07-101-1/+1
| | | | | Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* svcrdma: fix error handling of rdma_alloc_frmr()Wei Yongjun2009-07-031-2/+2
| | | | | | | | ib_alloc_fast_reg_mr() and ib_alloc_fast_reg_page_list() returns ERR_PTR() and not NULL. Compile tested only. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Linux 2.6.31-rc1Linus Torvalds2009-06-241-2/+2
|
* Revert "PCI: use ACPI _CRS data by default"Linus Torvalds2009-06-245-6/+6
| | | | | | | | | | | | | | This reverts commit 9e9f46c44e487af0a82eb61b624553e2f7118f5b. Quoting from the commit message: "At this point, it seems to solve more problems than it causes, so let's try using it by default. It's an easy revert if it ends up causing trouble." And guess what? The _CRS code causes trouble. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge git://git.infradead.org/battery-2.6Linus Torvalds2009-06-248-17/+420
|\ | | | | | | | | | | | | | | | | * git://git.infradead.org/battery-2.6: da9030_battery: Fix race between event handler and monitor Add MAX17040 Fuel Gauge driver w1: ds2760_battery: add support for sleep mode feature w1: ds2760: add support for EEPROM read and write ds2760_battery: cleanups in ds2760_battery_probe()
| * da9030_battery: Fix race between event handler and monitorMike Rapoport2009-06-091-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are cases when charging monitor and the event handler try to change the charger state simultaneously. For instance, a charger is connected to the system, there's the detection event and the event handler tries to enable charging. It is possible that the periodic charging monitor runs at the same time and it still thinks there's no external charger. So it tries to disable the charging. As the result, even if the conditions necessary to charge the battery hold, there will be no actual charging. The patch changes the event handler so that instead of enabling/ disabling the charger immediately it would rather make the monitor run. The monitor code then decides what should be the charger state. Signed-off-by: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
| * Add MAX17040 Fuel Gauge driverMinkyu Kang2009-06-094-1/+338
| | | | | | | | | | | | | | | | The MAX17040 is a I2C interfaced Fuel Gauge systems for lithium-ion batteries This patch adds support the MAX17040 Fuel Gauge Signed-off-by: Minkyu Kang <mk7.kang@samsung.com> Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
| * w1: ds2760_battery: add support for sleep mode featureDaniel Mack2009-06-082-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for ds2760's sleep mode feature. With this feature enabled, the chip enters a deep sleep mode and disconnects from the battery when the w1 line is held down for more than 2 seconds. This new behaviour can be switched on and off using a new module parameter. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Szabolcs Gyurko <szabolcs.gyurko@tlt.hu> Acked-by: Matt Reimer <mreimer@vpop.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
| * w1: ds2760: add support for EEPROM read and writeDaniel Mack2009-06-082-0/+32
| | | | | | | | | | | | | | | | | | | | | | In order to modify the DS2762's status registers and to add support for sleep mode, there is need for functions to write the internal EEPROM. Signed-off-by: Daniel Mack <daniel@caiaq.de> Acked-by: Matt Reimer <mreimer@vpop.net> Acked-by: Szabolcs Gyurko <szabolcs.gyurko@tlt.hu> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
| * ds2760_battery: cleanups in ds2760_battery_probe()Daniel Mack2009-06-081-9/+7
| | | | | | | | | | | | | | | | | | | | Removed struct ds2760_platform_data which wasn't defined anywhere. Indentation cleanups. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Szabolcs Gyurko <szabolcs.gyurko@tlt.hu> Acked-by: Matt Reimer <mreimer@vpop.net> Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
| |
| \
*-. \ Merge branches 'for-linus' of ↵Linus Torvalds2009-06-246-18/+23
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/{vfs-2.6,audit-current} * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: another race fix in jfs_check_acl() Get "no acls for this inode" right, fix shmem breakage inline functions left without protection of ifdef (acl) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALL
| | * | audit: inode watches depend on CONFIG_AUDIT not CONFIG_AUDIT_SYSCALLEric Paris2009-06-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even though one cannot make use of the audit watch code without CONFIG_AUDIT_SYSCALL the spaghetti nature of the audit code means that the audit rule filtering requires that it at least be compiled. Thus build the audit_watch code when we build auditfilter like it was before cfcad62c74abfef83762dc05a556d21bdf3980a2 Clearly this is a point of potential future cleanup.. Reported-by: Frans Pop <elendil@planet.nl> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | another race fix in jfs_check_acl()Al Viro2009-06-241-6/+7
| | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | Get "no acls for this inode" right, fix shmem breakageAl Viro2009-06-244-10/+13
| | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | inline functions left without protection of ifdef (acl)Markus Trippelsdorf2009-06-241-1/+2
| |/ / | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge branch 'futexes-for-linus' of ↵Linus Torvalds2009-06-241-21/+24
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: futex: Fix the write access fault problem for real
| * | | futex: Fix the write access fault problem for realThomas Gleixner2009-06-241-21/+24
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 64d1304a64 (futex: setup writeable mapping for futex ops which modify user space data) did address only half of the problem of write access faults. The patch was made on two wrong assumptions: 1) access_ok(VERIFY_WRITE,...) would actually check write access. On x86 it does _NOT_. It's a pure address range check. 2) a RW mapped region can not go away under us. That's wrong as well. Nobody can prevent another thread to call mprotect(PROT_READ) on that region where the futex resides. If that call hits between the get_user_pages_fast() verification and the actual write access in the atomic region we are toast again. The solution is to not rely on access_ok and get_user() for any write access related fault on private and shared futexes. Instead we need to fault it in with verification of write access. There is no generic non destructive write mechanism which would fault the user page in trough a #PF, but as we already know that we will fault we can as well call get_user_pages() directly and avoid the #PF overhead. If get_user_pages() returns -EFAULT we know that we can not fix it anymore and need to bail out to user space. Remove a bunch of confusing comments on this issue as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org
* | | SLUB: Don't pass __GFP_FAIL for the initial allocationPekka Enberg2009-06-241-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | SLUB uses higher order allocations by default but falls back to small orders under memory pressure. Make sure the GFP mask used in the initial allocation doesn't include __GFP_NOFAIL. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Don't warn about order-1 allocations with __GFP_NOFAILLinus Torvalds2009-06-241-2/+2
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally, we never failed small orders (even regardless of any __GFP_NOFAIL flags), and slab will allocate order-1 allocations even for small allocations that could fit in a single page (in order to avoid excessive fragmentation). Maybe we should remove this warning entirely, but before making that judgement, at least limit it to bigger allocations. Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linusLinus Torvalds2009-06-2463-492/+996
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: Staging: octeon-ethernet: Fix race freeing transmit buffers. Staging: octeon-ethernet: Convert to use net_device_ops. MIPS: Cavium: Add CPU hotplugging code. MIPS: SMP: Allow suspend and hibernation if CPU hotplug is available MIPS: Add arch generic CPU hotplug DMA: txx9dmac: use dma_unmap_single if DMA_COMPL_{SRC,DEST}_UNMAP_SINGLE set MIPS: Sibyte: Fix build error if CONFIG_SERIAL_SB1250_DUART is undefined. MIPS: MIPSsim: Fix build error if MSC01E_INT_BASE is undefined. MIPS: Hibernation: Remove SMP TLB and cacheflushing code. MIPS: Build fix - include <linux/smp.h> into all smp_processor_id() users. MIPS: bug.h Build fix - include <linux/compiler.h>.
| * | Staging: octeon-ethernet: Fix race freeing transmit buffers.David Daney2009-06-244-66/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing code had the following race: Thread-1 Thread-2 inc/read in_use inc/read in_use inc tx_free_list[qos].len inc tx_free_list[qos].len The actual in_use value was incremented twice, but thread-1 is going to free memory based on its stale value, and will free one too many times. The result is that memory is freed back to the kernel while its packet is still in the transmit buffer. If the memory is overwritten before it is transmitted, the hardware will put a valid checksum on it and send it out (just like it does with good packets). If by chance the TCP flags are clobbered but not the addresses or ports, the result can be a broken TCP stream. The fix is to track the number of freed packets in a single location (a Fetch-and-Add Unit register). That way it can never get out of sync with itself. We try to free up to MAX_SKB_TO_FREE (currently 10) buffers at a time. If fewer are available we adjust the free count with the difference. The action of claiming buffers to free is atomic so two threads cannot claim the same buffers. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | Staging: octeon-ethernet: Convert to use net_device_ops.David Daney2009-06-2410-396/+390
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the driver to use net_device_ops as it is now mandatory. Also compensate for the removal of struct sk_buff's dst field. The changes are mostly mechanical, the content of ethernet-common.c was moved to ethernet.c and ethernet-common.{c,h} are removed. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Cavium: Add CPU hotplugging code.Ralf Baechle2009-06-245-2/+365
| | | | | | | | | | | | | | | | | | Thanks to Cavium Inc. for the code contribution and help. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: SMP: Allow suspend and hibernation if CPU hotplug is availableRalf Baechle2009-06-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The SMP implementation of suspend and hibernate depends on CPU hotplugging. In the past we didn't have CPU hotplug so suspend and hibernation were not possible on SMP systems. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Add arch generic CPU hotplugRalf Baechle2009-06-247-7/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each platform has to add support for CPU hotplugging itself by providing suitable definitions for the cpu_disable and cpu_die of the smp_ops methods and setting SYS_SUPPORTS_HOTPLUG_CPU. A platform should only set SYS_SUPPORTS_HOTPLUG_CPU once all it's smp_ops definitions have the necessary changes. This patch contains the changes to the dummy smp_ops definition for uni-processor systems. Parts of the code contributed by Cavium Inc. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | DMA: txx9dmac: use dma_unmap_single if DMA_COMPL_{SRC,DEST}_UNMAP_SINGLE setAtsushi Nemoto2009-06-241-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does not change actual behaviour since dma_unmap_page is just an alias of dma_unmap_single on MIPS. Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Ralf Baechle <ralf@linux-mips.org> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Sibyte: Fix build error if CONFIG_SERIAL_SB1250_DUART is undefined.Ralf Baechle2009-06-241-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes kernel.org bugzilla 13596, see http://bugzilla.kernel.org/show_bug.cgi?id=13596 Reported-by: dvice_null@yahoo.com Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: MIPSsim: Fix build error if MSC01E_INT_BASE is undefined.Ralf Baechle2009-06-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes kernel.org bugzilla 13595, see http://bugzilla.kernel.org/show_bug.cgi?id=13595 Reported-by: dvice_null@yahoo.com Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Hibernation: Remove SMP TLB and cacheflushing code.Ralf Baechle2009-06-241-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can't perform any flushes on SMP from swsusp_arch_resume because interrupts are disabled. A cross-CPU flush is unnecessary anyway because all but the local CPU have already been disabled. A local flush is not needed either because we didn't change any mappings. So just delete the code. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: Build fix - include <linux/smp.h> into all smp_processor_id() users.Ralf Baechle2009-06-2439-1/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | Some of the were relying into smp.h being dragged in by another header which of course is fragile. <asm/cpu-info.h> uses smp_processor_id() only in macros and including smp.h there leads to an include loop, so don't change cpu-info.h. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | MIPS: bug.h Build fix - include <linux/compiler.h>.Ralf Baechle2009-06-241-0/+1
| | | | | | | | | | | | | | | | | | In the past this file somehow used to be dragged in. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dmLinus Torvalds2009-06-2435-435/+3993
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (48 commits) dm mpath: change to be request based dm: disable interrupt when taking map_lock dm: do not set QUEUE_ORDERED_DRAIN if request based dm: enable request based option dm: prepare for request based option dm raid1: add userspace log dm: calculate queue limits during resume not load dm log: fix create_log_context to use logical_block_size of log device dm target:s introduce iterate devices fn dm table: establish queue limits by copying table limits dm table: replace struct io_restrictions with struct queue_limits dm table: validate device logical_block_size dm table: ensure targets are aligned to logical_block_size dm ioctl: support cookies for udev dm: sysfs add suspended attribute dm table: improve warning message when devices not freed before destruction dm mpath: add service time load balancer dm mpath: add queue length load balancer dm mpath: add start_io and nr_bytes to path selectors dm snapshot: use barrier when writing exception store ...
| * | | dm mpath: change to be request basedKiyoshi Ueda2009-06-221-65/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts dm-multipath target to request-based from bio-based. Basically, the patch just converts the I/O unit from struct bio to struct request. In the course of the conversion, it also changes the I/O queueing mechanism. The change in the I/O queueing is described in details as follows. I/O queueing mechanism change ----------------------------- In I/O submission, map_io(), there is no mechanism change from bio-based, since the clone request is ready for retry as it is. However, in I/O complition, do_end_io(), there is a mechanism change from bio-based, since the clone request is not ready for retry. In do_end_io() of bio-based, the clone bio has all needed memory for resubmission. So the target driver can queue it and resubmit it later without memory allocations. The mechanism has almost no overhead. On the other hand, in do_end_io() of request-based, the clone request doesn't have clone bios, so the target driver can't resubmit it as it is. To resubmit the clone request, memory allocation for clone bios is needed, and it takes some overheads. To avoid the overheads just for queueing, the target driver doesn't queue the clone request inside itself. Instead, the target driver asks dm core for queueing and remapping the original request of the clone request, since the overhead for queueing is just a freeing memory for the clone request. As a result, the target driver doesn't need to record/restore the information of the original request for resubmitting the clone request. So dm_bio_details in dm_mpath_io is removed. multipath_busy() --------------------- The target driver returns "busy", only when the following case: o The target driver will map I/Os, if map() function is called and o The mapped I/Os will wait on underlying device's queue due to their congestions, if map() function is called now. In other cases, the target driver doesn't return "busy". Otherwise, dm core will keep the I/Os and the target driver can't do what it wants. (e.g. the target driver can't map I/Os now, so wants to kill I/Os.) Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm: disable interrupt when taking map_lockKiyoshi Ueda2009-06-221-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch disables interrupt when taking map_lock to avoid lockdep warnings in request-based dm. request-based dm takes map_lock after taking queue_lock with disabling interrupt: spin_lock_irqsave(queue_lock) q->request_fn() == dm_request_fn() => dm_get_table() => read_lock(map_lock) while queue_lock could be (but isn't) taken in interrupt context. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Christof Schmitt <christof.schmitt@de.ibm.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm: do not set QUEUE_ORDERED_DRAIN if request basedKiyoshi Ueda2009-06-223-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Request-based dm doesn't have barrier support yet. So we need to set QUEUE_ORDERED_DRAIN only for bio-based dm. Since the device type is decided at the first table loading time, the flag set is deferred until then. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm: enable request based optionKiyoshi Ueda2009-06-224-26/+285
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables request-based dm. o Request-based dm and bio-based dm coexist, since there are some target drivers which are more fitting to bio-based dm. Also, there are other bio-based devices in the kernel (e.g. md, loop). Since bio-based device can't receive struct request, there are some limitations on device stacking between bio-based and request-based. type of underlying device bio-based request-based ---------------------------------------------- bio-based OK OK request-based -- OK The device type is recognized by the queue flag in the kernel, so dm follows that. o The type of a dm device is decided at the first table binding time. Once the type of a dm device is decided, the type can't be changed. o Mempool allocations are deferred to at the table loading time, since mempools for request-based dm are different from those for bio-based dm and needed mempool type is fixed by the type of table. o Currently, request-based dm supports only tables that have a single target. To support multiple targets, we need to support request splitting or prevent bio/request from spanning multiple targets. The former needs lots of changes in the block layer, and the latter needs that all target drivers support merge() function. Both will take a time. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>