aboutsummaryrefslogtreecommitdiffstats
path: root/fs/ocfs2
Commit message (Collapse)AuthorAgeFilesLines
* [PATCH] r/o bind mounts: clean up OCFS2 nlink handlingMark Fasheh2006-10-011-20/+19
| | | | | | | | | | | | | | | | | | | | | | OCFS2 does some operations on i_nlink, then reverts them if some of its operations fail to complete. This does not fit in well with the drop_nlink() logic where we expect i_nlink to stay at zero once it gets there. So, delay all of the nlink operations until we're sure that the operations have completed. Also, introduce a small helper to check whether an inode has proper "unlinkable" i_nlink counts no matter whether it is a directory or regular inode. This patch is broken out from the others because it does contain some logical changes. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] r/o bind mount prepwork: inc_nlink() helperDave Hansen2006-10-012-7/+7
| | | | | | | | | | | This is mostly included for parity with dec_nlink(), where we will have some more hooks. This one should stay pretty darn straightforward for now. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] r/o bind mounts: unlink: monitor i_nlinkDave Hansen2006-10-011-2/+2
| | | | | | | | | | | | | | | | | When a filesystem decrements i_nlink to zero, it means that a write must be performed in order to drop the inode from the filesystem. We're shortly going to have keep filesystems from being remounted r/o between the time that this i_nlink decrement and that write occurs. So, add a little helper function to do the decrements. We'll tie into it in a bit to note when i_nlink hits zero. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Vectorize aio_read/aio_write fileop methodsBadari Pulavarty2006-10-011-15/+13
| | | | | | | | | | | | This patch vectorizes aio_read() and aio_write() methods to prepare for collapsing all aio & vectored operations into one interface - which is aio_read()/aio_write(). Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Michael Holzheu <HOLZHEU@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] inode-diet: Eliminate i_blksize from the inode structureTheodore Ts'o2006-09-272-6/+0
| | | | | | | | | | | | | | | | This eliminates the i_blksize field from struct inode. Filesystems that want to provide a per-inode st_blksize can do so by providing their own getattr routine instead of using the generic_fillattr() function. Note that some filesystems were providing pretty much random (and incorrect) values for i_blksize. [bunk@stusta.de: cleanup] [akpm@osdl.org: generic_fillattr() fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_privateTheodore Ts'o2006-09-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something something where the union will actually be used. [judith@osdl.org: powerpc build fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Judith Lebzelter <judith@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Really ignore kmem_cache_destroy return valueAlexey Dobriyan2006-09-271-3/+1
| | | | | | | | | | | | | | | | * Rougly half of callers already do it by not checking return value * Code in drivers/acpi/osl.c does the following to be sure: (void)kmem_cache_destroy(cache); * Those who check it printk something, however, slab_error already printed the name of failed cache. * XFS BUGs on failed kmem_cache_destroy which is not the decision low-level filesystem driver should make. Converted to ignore. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* ocfs2: Teach ocfs2_drop_lock() to use ->set_lvb() callbackMark Fasheh2006-09-241-31/+28
| | | | | | | | With this, we don't need to pass an additional struct with function pointer. Now that the callbacks are fully used, comment the remaining API. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove ->unblock lockres operationMark Fasheh2006-09-241-146/+6
| | | | | | | | | | | Have ocfs2_process_blocked_lock() call ocfs2_generic_unblock_lock(), which gets to be ocfs2_unblock_lock() now that it's the only possible unblock function. Remove the ->unblock() callback from the structure, and all lock type specific unblock functions. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: move downconvert worker to lockres opsMark Fasheh2006-09-241-18/+32
| | | | | | | This way lock types don't have to manually pass it to ocfs2_generic_unblock_lock(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove unused dlmglue functionsMark Fasheh2006-09-241-103/+0
| | | | | | | The meta data unblocking code no longer needs ocfs2_do_unblock_meta() or ocfs2_can_downconvert_meta_lock(), so remove them. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Have the metadata lock use generic dlmglue functionsMark Fasheh2006-09-241-1/+31
| | | | | | | | Fill in the ->check_downconvert and ->set_lvb callbacks with meta data specific operations and switch ocfs2_unblock_meta() to call ocfs2_generic_unblock_lock() Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add ->set_lvb callback in dlmglueMark Fasheh2006-09-241-2/+29
| | | | | | This allows a lock type to set the value block before downconvert. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add ->check_downconvert callback in dlmglueMark Fasheh2006-09-241-1/+18
| | | | | | This will allow lock types to force a requeue of a lock downconvert. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Check for refreshing locks in generic unblock functionMark Fasheh2006-09-241-12/+19
| | | | | | Tidy up the exit path a bit too. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: don't unconditionally pass LVB flagsMark Fasheh2006-09-241-3/+15
| | | | | | | | | Allow a lock type to specifiy whether it makes use of the LVB. The only type which does this right now is the meta data lock. This should save us some space on network messages since they won't have to needlessly transmit value blocks. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: combine inode and generic blocking AST functionsMark Fasheh2006-09-241-112/+11
| | | | | | | There is extremely little difference between the two now. We can remove the callback from ocfs2_lock_res_ops as well. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add ->get_osb() dlmglue locking operationMark Fasheh2006-09-241-0/+33
| | | | | | Will be used to find the ocfs2_super structure from a given lockres. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: remove ->unlock_ast() callback from ocfs2_lock_res_opsMark Fasheh2006-09-241-13/+3
| | | | | | | This was always defined to the same function in all locks, so clean things up by removing and passing ocfs2_unlock_ast() directly to the DLM. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: combine inode and generic AST functionsMark Fasheh2006-09-241-110/+10
| | | | | | | There is extremely little difference between the two now. We can remove the callback from ocfs2_lock_res_ops as well. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Clean up lock resource refresh flagsMark Fasheh2006-09-241-14/+35
| | | | | | | Use of the refresh mechanism is lock-type wide, so move knowledge of that to the ocfs2_lock_res_ops structure. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove i_generation from inode lock namesMark Fasheh2006-09-2410-53/+170
| | | | | | | | | | | | | | | | | | | | | | | OCFS2 puts inode meta data in the "lock value block" provided by the DLM. Typically, i_generation is encoded in the lock name so that a deleted inode on and a new one in the same block don't share the same lvb. Unfortunately, that scheme means that the read in ocfs2_read_locked_inode() is potentially thrown away as soon as the meta data lock is taken - we cannot encode the lock name without first knowing i_generation, which requires a disk read. This patch encodes i_generation in the inode meta data lvb, and removes the value from the inode meta data lock name. This way, the read can be covered by a lock, and at the same time we can distinguish between an up to date and a stale LVB. This will help cold-cache stat(2) performance in particular. Since this patch changes the protocol version, we take the opportunity to do a minor re-organization of two of the LVB fields. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Encode i_generation in the meta data lvbMark Fasheh2006-09-242-7/+12
| | | | | | | | When i_generation is removed from the lockname, this will help us determine whether a meta data lvb has information that is in sync with the local struct inode. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Free up some space in the lvbMark Fasheh2006-09-242-4/+6
| | | | | | | | lvb_version doesn't need to be a whole 32 bits. Make it an 8 bit field to free up some space. This should be backwards compatible until we use one of the fields, in which case we'd bump the lvb version anyway. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove special casing for inode creation in ocfs2_dentry_attach_lock()Mark Fasheh2006-09-243-37/+14
| | | | | | | | We can't use LKM_LOCAL for new dentry locks because an unlink and subsequent re-create of a name/inode pair may result in the lock still being mastered somewhere in the cluster. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: manually d_move() during ocfs2_rename()Mark Fasheh2006-09-242-2/+5
| | | | | | | | | | Make use of FS_RENAME_DOES_D_MOVE to avoid a race condition that can occur during ->rename() if we d_move() outside of the parent directory cluster locks, and another node discovers the new name (created during the rename) and unlinks it. d_move() will unconditionally rehash a dentry - which will leave stale data in the system. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove the dentry voteMark Fasheh2006-09-242-183/+2
| | | | | | This is unused now. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Hook rest of the file system into dentry locking APIMark Fasheh2006-09-244-41/+94
| | | | | | | Actually replace the vote calls with the new dentry operations. Make any necessary adjustments to get the scheme to work. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add dentry tracking APIMark Fasheh2006-09-243-32/+369
| | | | | | | | | | | | | | | | | Replace the dentry vote mechanism with a cluster lock which covers a set of dentries. This allows us to force d_delete() only on nodes which actually care about an unlink. Every node that does a ->lookup() gets a read only lock on the dentry, until an unlink during which the unlinking node, will request an exclusive lock, forcing the other nodes who care about that dentry to d_delete() it. The effect is that we retain a very lightweight ->d_revalidate(), and at the same time get to make large improvements to the average case performance of the ocfs2 unlink and rename operations. This patch adds the higher level API and the dentry manipulation code. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Add new cluster lock typeMark Fasheh2006-09-245-104/+436
| | | | | | | | | | | | | | | | | | | Replace the dentry vote mechanism with a cluster lock which covers a set of dentries. This allows us to force d_delete() only on nodes which actually care about an unlink. Every node that does a ->lookup() gets a read only lock on the dentry, until an unlink during which the unlinking node, will request an exclusive lock, forcing the other nodes who care about that dentry to d_delete() it. The effect is that we retain a very lightweight ->d_revalidate(), and at the same time get to make large improvements to the average case performance of the ocfs2 unlink and rename operations. This patch adds the cluster lock type which OCFS2 can attach to dentries. A small number of fs/ocfs2/dcache.c functions are stubbed out so that this change can compile. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Update dlmglue for new dlmlock() APIMark Fasheh2006-09-241-0/+3
| | | | | | | File system lock names are very regular right now, so we really only need to pass an extra parameter to dlmlock(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Update dlmfs for new dlmlock() APIMark Fasheh2006-09-242-51/+31
| | | | | | | | We just need to add a namelen field to the user_lock_res structure, and update a few debug prints. Instead of updating all debug prints, I took the opportunity to remove a few that are likely unnecessary these days. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Allow binary names in the DLMMark Fasheh2006-09-245-8/+11
| | | | | | | | The OCFS2 DLM uses strlen() to determine lock name length, which excludes the possibility of putting binary values in the name string. Fix this by requiring that string length be passed in as a parameter. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Silence dlm error printMark Fasheh2006-09-241-3/+3
| | | | | | | An AST can be delivered via the network after a lock has been removed, so no need to print an error when we see that. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Remove overzealous BUG_ON()Mark Fasheh2006-09-201-8/+20
| | | | | | | | | The truncate code was never supposed to BUG() on an allocator it doesn't know about, but rather to ignore it. Right now, this does nothing, but when we change our allocation paths to use all suballocator files, this will allow current versions of the fs module to work fine. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Don't print on unknown remote blocking callMark Fasheh2006-09-201-6/+4
| | | | Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: implement directory read-aheadMark Fasheh2006-09-207-47/+115
| | | | | | | | | | Uptodate.c now knows about read-ahead buffers. Use some more aggressive logic in ocfs2_readdir(). The two functions which currently use directory read-ahead are ocfs2_find_entry() and ocfs2_readdir(). Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: properly update i_mtime on buffered writeMark Fasheh2006-09-201-49/+34
| | | | | | | | We weren't always updating i_mtime on writes, so fix ocfs2_commit_write() to handle this. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: Zach Brown <zach.brown@oracle.com>
* ocfs2: Fix directory link count checks in ocfs2_link()Tiger Yang2006-09-201-5/+5
| | | | | | | | Remove the redundant "i_nlink >= OCFS2_LINK_MAX" check and adds an unlinked directory check in ocfs2_link(). Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: move nlink check in ocfs2_mknod()Mark Fasheh2006-09-201-7/+5
| | | | | | | | The dir nlink check in ocfs2_mknod() was being done outside of the cluster lock, which means we could have been checking against a stale version of the inode. Fix this by doing the check after the cluster lock instead. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix heartbeat sector calculationMathieu Avila2006-09-201-2/+6
| | | | | | This fixes things for devices which set max_sectors to 8. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs/ocfs2/ioctl.c should #include "ioctl.h"Adrian Bunk2006-09-201-0/+2
| | | | | | | | Every file should #include the headers containing the prototypes for its global functions. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: add ext2 attributesHerbert Poetzl2006-09-209-6/+217
| | | | | | | | | Support immutable, and other attributes. Some renaming and other minor fixes done by myself. Signed-off-by: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: allocation hintsMark Fasheh2006-08-074-13/+145
| | | | | | | | | Record the most recently used allocation group on the allocation context, so that subsequent allocations can attempt to optimize for contiguousness. Local alloc especially should benefit from this as the current chain search tends to let it spew across the disk. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: better group descriptor consistency checksMark Fasheh2006-08-071-20/+94
| | | | | | | | Try to catch corrupted group descriptors with some stronger checks placed in a couple of strategic locations. Detect a failed resizefs and refuse to allocate past what bitmap i_clusters allows. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: limit cluster bitmap information saved at mountMark Fasheh2006-08-072-3/+6
| | | | | | | | | We were storing cluster count on the ocfs2_super structure, but never actually using it so remove that. Also, we don't want to populate the uptodate cache with the unlocked block read - it is technically safe as is, but we should change it for correctness. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* [PATCH] fs/ocfs2/dlm/dlmmaster.c: unexport dlm_migrate_lockresAdrian Bunk2006-08-071-1/+0
| | | | | | | This patch removes the unused EXPORT_SYMBOL_GPL(dlm_migrate_lockres). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: fix check for locally granted state during dlmunlock()Kurt Hackel2006-08-071-1/+1
| | | | | | | | If a process requests a lock cancel but the lock has been remotely granted already then there is no need to send the cancel message. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: do not modify lksb->status in the unlock astKurt Hackel2006-08-071-25/+12
| | | | | | | | This can race with other ast notification, which can cause bad status values to propagate into the unlock ast. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
* ocfs2: Fix lvb corruptionKurt Hackel2006-08-071-0/+6
| | | | | | | | Properly ignore LVB flags during a PR downconvert. This avoids an illegal lvb update. Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>