aboutsummaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | ocfs2: Use the right group in nfs sync check.Tao Ma2010-09-081-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have added discontig block group now, and now an inode can be allocated in an discontig block group. So get it in ocfs2_get_suballoc_slot_bit. The old ocfs2_test_suballoc_bit gets group block no from the allocation inode which is wrong. Fix it by passing the right group. Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>
| * | | ocfs2: Flush drive's caches on fdatasyncJan Kara2010-09-081-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When 'barrier' mount option is specified, we have to issue a cache flush during fdatasync(2). We have to do this even if inode doesn't have I_DIRTY_DATASYNC set because we still have to get written *data* to disk so that they are not lost in case of crash. Acked-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Jan Kara <jack@suse.cz> Singed-off-by: Tao Ma <tao.ma@oracle.com>
| * | | ocfs2: make __ocfs2_page_mkwrite handle file end properly.Tao Ma2010-09-081-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __ocfs2_page_mkwrite now is broken in handling file end. 1. the last page should be the page contains i_size - 1. 2. the len in the last page is also calculated wrong. So change them accordingly. Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Tao Ma <tao.ma@oracle.com>
| * | | ocfs2: Fix incorrect checksum validation errorSunil Mushran2010-09-081-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For local mounts, ocfs2_read_locked_inode() calls ocfs2_read_blocks_sync() to read the inode off the disk. The latter first checks to see if that block is cached in the journal, and, if so, returns that block. That is ok. But ocfs2_read_locked_inode() goes wrong when it tries to validate the checksum of such blocks. Blocks that are cached in the journal may not have had their checksum computed as yet. We should not validate the checksums of such blocks. Fixes ossbz#1282 http://oss.oracle.com/bugzilla/show_bug.cgi?id=1282 Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Cc: stable@kernel.org Singed-off-by: Tao Ma <tao.ma@oracle.com>
| * | | ocfs2: Fix metaecc error messagesSunil Mushran2010-09-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Like tools, the checksum validate function now prints the values in hex. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Singed-off-by: Tao Ma <tao.ma@oracle.com>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2010-09-082-20/+30
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix lock annotations fuse: flush background queue on connection close
| * | | | fuse: fix lock annotationsMiklos Szeredi2010-09-072-16/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sparse doesn't understand lock annotations of the form __releases(&foo->lock). Change them to __releases(foo->lock). Same for __acquires(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| * | | | fuse: flush background queue on connection closeMiklos Szeredi2010-09-071-4/+12
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Bartly reported that fuse can hang in fuse_get_req_nofail() when the connection to the filesystem server is no longer active. If bg_queue is not empty then flush_bg_queue() called from request_end() can put more requests on to the pending queue. If this happens while ending requests on the processing queue then those background requests will be queued to the pending list and never ended. Another problem is that fuse_dev_release() didn't wake up processes sleeping on blocked_waitq. Solve this by: a) flushing the background queue before calling end_requests() on the pending and processing queues b) setting blocked = 0 and waking up processes waiting on blocked_waitq() Thanks to David for an excellent bug report. Reported-by: David Bartley <andareed@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org
* | | | Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linuxLinus Torvalds2010-09-071-1/+1
|\ \ \ \ | |_|_|/ |/| | | | | | | | | | | * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: nfsd4: mask out non-access bits in nfs4_access_to_omode
| * | | nfsd4: mask out non-access bits in nfs4_access_to_omodeJ. Bruce Fields2010-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This fixes an unnecessary BUG(). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* | | | Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds2010-09-077-14/+35
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: Make fiemap work with sparse files xfs: prevent 32bit overflow in space reservation xfs: Disallow 32bit project quota id xfs: improve buffer cache hash scalability
| * \ \ \ Merge branch '2.6.36-xfs-misc' of ↵Alex Elder2010-09-033-11/+11
| |\ \ \ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev
| | * | | | xfs: prevent 32bit overflow in space reservationDave Chinner2010-09-031-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we attempt to preallocate more than 2^32 blocks of space in a single syscall, the transaction block reservation will overflow leading to a hangs in the superblock block accounting code. This is trivially reproduced with xfs_io. Fix the problem by capping the allocation reservation to the maximum number of blocks a single xfs_bmapi() call can allocate (2^21 blocks). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| | * | | | xfs: improve buffer cache hash scalabilityDave Chinner2010-09-022-8/+1
| | | |/ / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When doing large parallel file creates on a 16p machines, large amounts of time is being spent in _xfs_buf_find(). A system wide profile with perf top shows this: 1134740.00 19.3% _xfs_buf_find 733142.00 12.5% __ticket_spin_lock The problem is that the hash contains 45,000 buffers, and the hash table width is only 256 buffers. That means we've got around 200 buffers per chain, and searching it is quite expensive. The hash table size needs to increase. Secondly, every time we do a lookup, we promote the buffer we find to the head of the hash chain. This is causing cachelines to be dirtied and causes invalidation of cachelines across all CPUs that may have walked the hash chain recently. hence every walk of the hash chain is effectively a cold cache walk. Remove the promotion to avoid this invalidation. The results are: 1045043.00 21.2% __ticket_spin_lock 326184.00 6.6% _xfs_buf_find A 70% drop in the CPU usage when looking up buffers. Unfortunately that does not result in an increase in performance underthis workload as contention on the inode_lock soaks up most of the reduction in CPU usage. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
| * | | | xfs: Make fiemap work with sparse filesTao Ma2010-09-033-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In xfs_vn_fiemap, we set bvm_count to fi_extent_max + 1 and want to return fi_extent_max extents, but actually it won't work for a sparse file. The reason is that in xfs_getbmap we will calculate holes and set it in 'out', while out is malloced by bmv_count(fi_extent_max+1) which didn't consider holes. So in the worst case, if 'out' vector looks like [hole, extent, hole, extent, hole, ... hole, extent, hole], we will only return half of fi_extent_max extents. This patch add a new parameter BMV_IF_NO_HOLES for bvm_iflags. So with this flags, we don't use our 'out' in xfs_getbmap for a hole. The solution is a bit ugly by just don't increasing index of 'out' vector. I felt that it is not easy to skip it at the very beginning since we have the complicated check and some function like xfs_getbmapx_fix_eof_hole to adjust 'out'. Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Alex Elder <aelder@sgi.com>
| * | | | xfs: Disallow 32bit project quota idArkadiusz Mi?kiewicz2010-09-021-0/+7
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently on-disk structure is able to keep only 16bit project quota id, so disallow 32bit ones. This fixes a problem where parts of kernel structures holding project quota id are 32bit while parts (on-disk) are 16bit variables which causes project quota member files to be inaccessible for some operations (like mv/rm). Signed-off-by: Arkadiusz Mi?kiewicz <arekm@maven.pl> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2010-09-071-1/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: potential ERR_PTR() dereference
| * | | | 9p: potential ERR_PTR() dereferenceDan Carpenter2010-08-301-1/+2
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | p9_client_walk() can return error values if we run out of space or there is a problem with the network. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6Linus Torvalds2010-09-071-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: sysfs: checking for NULL instead of ERR_PTR
| * | | | sysfs: checking for NULL instead of ERR_PTRDan Carpenter2010-09-031-1/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | d_path() returns an ERR_PTR and it doesn't return NULL. Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: stable <stable@kernel.org> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2010-09-071-0/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: fix leak of shadow dat inode in error path of load_nilfs
| * | | | nilfs2: fix leak of shadow dat inode in error path of load_nilfsRyusuke Konishi2010-08-301-0/+1
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If load_nilfs() gets an error while doing recovery, it will fail to free the shadow inode of dat (nilfs->ns_gc_dat). This fixes the leak issue. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
* | | | VFS: Sanity check mount flags passed to change_mnt_propagation()Valerie Aurora2010-09-071-1/+22
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sanity check the flags passed to change_mnt_propagation(). Exactly one flag should be set. Return EINVAL otherwise. Userspace can pass in arbitrary combinations of MS_* flags to mount(). do_change_type() is called if any of MS_SHARED, MS_PRIVATE, MS_SLAVE, or MS_UNBINDABLE is set. do_change_type() clears MS_REC and then calls change_mnt_propagation() with the rest of the user-supplied flags. change_mnt_propagation() clearly assumes only one flag is set but do_change_type() does not check that this is true. For example, mount() with flags MS_SHARED | MS_RDONLY does not actually make the mount shared or read-only but does clear MNT_UNBINDABLE. Signed-off-by: Valerie Aurora <vaurora@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds2010-08-283-36/+64
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.infradead.org/users/eparis/notify: fsnotify: drop two useless bools in the fnsotify main loop fsnotify: fix list walk order fanotify: Return EPERM when a process is not privileged fanotify: resize pid and reorder structure fanotify: drop duplicate pr_debug statement fanotify: flush outstanding perm requests on group destroy fsnotify: fix ignored mask handling between inode and vfsmount marks fanotify: add MAINTAINERS entry fsnotify: reset used_inode and used_vfsmount on each pass fanotify: do not dereference inode_mark when it is unset
| * | | fsnotify: drop two useless bools in the fnsotify main loopEric Paris2010-08-271-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fsnotify main loop has 2 bools which indicated if we processed the inode or vfsmount mark in that particular pass through the loop. These bool can we replaced with the inode_group and vfsmount_group variables and actually make the code a little easier to understand. Signed-off-by: Eric Paris <eparis@redhat.com>
| * | | fsnotify: fix list walk orderEric Paris2010-08-271-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Marks were stored on the inode and vfsmonut mark list in order from highest memory address to lowest memory address. The code to walk those lists thought they were in order from lowest to highest with unpredictable results when trying to match up marks from each. It was possible that extra events would be sent to userspace when inode marks ignoring events wouldn't get matched with the vfsmount marks. This problem only affected fanotify when using both vfsmount and inode marks simultaneously. Signed-off-by: Eric Paris <eparis@redhat.com>
| * | | fanotify: Return EPERM when a process is not privilegedAndreas Gruenbacher2010-08-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The appropriate error code when privileged operations are denied is EPERM, not EACCES. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <paris@paris.rdu.redhat.com>
| * | | fanotify: drop duplicate pr_debug statementTvrtko Ursulin2010-08-221-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reminded me... you have two pr_debugs in fanotify_should_send_event which output redundant information. Maybe you intended it like that so it is selectable how much log spam you want, or if not you may want to apply this patch. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| * | | fanotify: flush outstanding perm requests on group destroyEric Paris2010-08-221-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an fanotify listener is closing it may cause a deadlock between the listener and the original task doing an fs operation. If the original task is waiting for a permissions response it will be holding the srcu lock. The listener cannot clean up and exit until after that srcu lock is syncronized. Thus deadlock. The fix introduced here is to stop accepting new permissions events when a listener is shutting down and to grant permission for all outstanding events. Thus the original task will eventually release the srcu lock and the listener can complete shutdown. Reported-by: Andreas Gruenbacher <agruen@suse.de> Cc: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com>
| * | | fsnotify: fix ignored mask handling between inode and vfsmount marksEric Paris2010-08-221-18/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The interesting 2 list lockstep walking didn't quite work out if the inode marks only had ignores and the vfsmount list requested events. The code to shortcut list traversal would not run the inode list since it didn't have real event requests. This code forces inode list traversal when a vfsmount mark matches the event type. Maybe we could add an i_fsnotify_ignored_mask field to struct inode to get the shortcut back, but it doesn't seem worth it to grow struct inode again. I bet with the recent changes to lock the way we do now it would actually not be a major perf hit to just drop i_fsnotify_mark_mask altogether. But that is for another day. Signed-off-by: Eric Paris <eparis@redhat.com>
| * | | fsnotify: reset used_inode and used_vfsmount on each passEric Paris2010-08-221-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fsnotify main loop has 2 booleans which tell if a particular mark was sent to the listeners or if it should be processed in the next pass. The problem is that the booleans were not reset on each traversal of the loop. So marks could get skipped even when they were not sent to the notifiers. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| * | | fanotify: do not dereference inode_mark when it is unsetEric Paris2010-08-221-4/+11
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | The fanotify code is supposed to get the group from the mark. It accidentally only used the inode_mark. If the vfsmount_mark was set but not the inode_mark it would deref the NULL inode_mark. Get the group from the correct place. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com> Signed-off-by: Eric Paris <eparis@redhat.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2010-08-286-12/+30
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6: eCryptfs: Fix encrypted file name lookup regression ecryptfs: properly mark init functions fs/ecryptfs: Return -ENOMEM on memory allocation failure
| * | | eCryptfs: Fix encrypted file name lookup regressionTyler Hicks2010-08-272-8/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes a regression caused by 21edad32205e97dc7ccb81a85234c77e760364c8 When file name encryption was enabled, ecryptfs_lookup() failed to use the encrypted and encoded version of the upper, plaintext, file name when performing a lookup in the lower file system. This made it impossible to lookup existing encrypted file names and any newly created files would have plaintext file names in the lower file system. https://bugs.launchpad.net/ecryptfs/+bug/623087 Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| * | | ecryptfs: properly mark init functionsJerome Marchand2010-08-274-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some ecryptfs init functions are not prefixed by __init and thus not freed after initialization. This patch saved about 1kB in ecryptfs module. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| * | | fs/ecryptfs: Return -ENOMEM on memory allocation failureJulia Lawall2010-08-271-0/+2
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In this code, 0 is returned on memory allocation failure, even though other failures return -ENOMEM or other similar values. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression ret; expression x,e1,e2,e3; @@ ret = 0 ... when != ret = e1 *x = \(kmalloc\|kcalloc\|kzalloc\)(...) ... when != ret = e2 if (x == NULL) { ... when != ret = e3 return ret; } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2010-08-2813-107/+184
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: fix get_ticket_handler() error handling ceph: don't BUG on ENOMEM during mds reconnect ceph: ceph_mdsc_build_path() returns an ERR_PTR ceph: Fix warnings ceph: ceph_get_inode() returns an ERR_PTR ceph: initialize fields on new dentry_infos ceph: maintain i_head_snapc when any caps are dirty, not just for data ceph: fix osd request lru adjustment when sending request ceph: don't improperly set dir complete when holding EXCL cap mm: exporting account_page_dirty ceph: direct requests in snapped namespace based on nonsnap parent ceph: queue cap snap writeback for realm children on snap update ceph: include dirty xattrs state in snapped caps ceph: fix xattr cap writeback ceph: fix multiple mds session shutdown
| * | | ceph: fix get_ticket_handler() error handlingDan Carpenter2010-08-261-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get_ticket_handler() returns a valid pointer or it returns ERR_PTR(-ENOMEM) if kzalloc() fails. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: don't BUG on ENOMEM during mds reconnectSage Weil2010-08-261-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | We are in a position to return an error; do that instead. Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: ceph_mdsc_build_path() returns an ERR_PTRDan Carpenter2010-08-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ceph_mdsc_build_path() returns an ERR_PTR but this code is set up to handle NULL returns. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: Fix warningsAlan Cox2010-08-251-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just scrubbing some warnings so I can see real problem ones in the build noise. For 32bit we need to coax gcc politely into believing we really honestly intend to the casts. Using (u64)(unsigned long) means we cast from a pointer to a type of the right size and then extend it. This stops the warning spew. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: ceph_get_inode() returns an ERR_PTRDan Carpenter2010-08-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ceph_get_inode() returns an ERR_PTR and it doesn't return a NULL. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: initialize fields on new dentry_infosSage Weil2010-08-241-1/+1
| | | | | | | | | | | | | | | | Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: maintain i_head_snapc when any caps are dirty, not just for dataSage Weil2010-08-244-7/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to use i_head_snapc to keep track of which snapc the current epoch of dirty data was dirtied under. It is used by queue_cap_snap to set up the cap_snap. However, since we queue cap snaps for any dirty caps, not just for dirty file data, we need to keep a valid i_head_snapc anytime we have dirty|flushing caps. This fixes a NULL pointer deref in queue_cap_snap when writing back dirty caps without data (e.g., snaptest-authwb.sh). Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: fix osd request lru adjustment when sending requestHenry C Chang2010-08-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix argument order. We want to move the item to the end of the list, not change the position of the head. Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: don't improperly set dir complete when holding EXCL capSage Weil2010-08-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we hold the EXCL cap, we cannot trust the dir stats from the MDS (num files, subdirs) and must not incorrectly conclude that the directory is empty. If we do, we get can bad results from lookup (bad ENOENT) and bad readdir results. Signed-off-by: Sage Weil <sage@newdream.net>
| * | | mm: exporting account_page_dirtyMichael Rubin2010-08-221-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows code outside of the mm core to safely manipulate page state and not worry about the other accounting. Not using these routines means that some code will lose track of the accounting and we get bugs. This has happened once already. Signed-off-by: Michael Rubin <mrubin@google.com> Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: direct requests in snapped namespace based on nonsnap parentSage Weil2010-08-221-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When making a request in the virtual snapdir or a snapped portion of the namespace, we should choose the MDS based on the first nonsnap parent (and its caps). If that is not the best place, we will get forward hints to find the right MDS in the cluster. This fixes ESTALE errors when using the .snap directory and namespace with multiple MDSs. Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: queue cap snap writeback for realm children on snap updateSage Weil2010-08-221-23/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a realm is updated, we need to queue writeback on inodes in that realm _and_ its children. Otherwise, if the inode gets cowed on the server, we can get a hang later due to out-of-sync cap/snap state. Signed-off-by: Sage Weil <sage@newdream.net>
| * | | ceph: include dirty xattrs state in snapped capsSage Weil2010-08-224-11/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we snapshot dirty metadata that needs to be written back to the MDS, include dirty xattr metadata. Make the capsnap reference the encoded xattr blob so that it will be written back in the FLUSHSNAP op. Also fix the capsnap creation guard to include dirty auth or file bits, not just tests specific to dirty file data or file writes in progress (this fixes auth metadata writeback). Signed-off-by: Sage Weil <sage@newdream.net>