| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Return -ENOMEM if kmalloc() fails.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The original code dereferenced range on the next line.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
We can use this simple method to make source more readable.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
We need to check return value of btrfs_search_slot() in
btrfs_read_chunk_tree() and do corresponding error handing.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
We only need to call finish_wait() after wait loop.
By the way, this patch makes code of waiting loop similar to
example in wait.h(no functional change)
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
btrfs_find_create_tree_block() may return NULL, so we must check the returned
value, or we will access a NULL pointer.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
fs/btrfs/ioctl.c: ctree.h is included more than once.
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|\ \ \ \ \ \ \
| | |/ / / / /
| |/| | | | | |
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
In initial design, walk_page_range() was designed just for walking page
table and it didn't require mmap_sem. Now, find_vma() etc.. are used
in walk_page_range() and we need mmap_sem around it.
This patch adds mmap_sem around walk_page_range().
Because /proc/<pid>/pagemap's callback routine use put_user(), we have
to get rid of it to do sane fix.
Changelog: 2010/Apr/2
- fixed start_vaddr and end overflow
Changelog: 2010/Apr/1
- fixed start_vaddr calculation
- removed unnecessary cast.
- removed unnecessary change in smaps.
- use GFP_TEMPORARY instead of GFP_KERNEL
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: San Mehat <san@google.com>
Cc: Brian Swetland <swetland@google.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[ Fixed kmalloc failure return code as per Matt ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |\ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
reiserfs: Fix locking BUG during mount failure
|
| | | |_|_|_|/
| | |/| | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Commit 8ebc423238341b52912c7295b045a32477b33f09 (reiserfs: kill-the-BKL)
introduced a bug in the mount failure case.
The error label releases the lock before calling journal_release_error,
but it requires that the lock be held. do_journal_release unlocks and
retakes it. When it releases it without it held, we trigger a BUG().
The error_alloc label skips the unlock since the lock isn't held yet
but none of the other conditions that are clean up exist yet either.
This patch returns immediately after the kzalloc failure and moves
the reiserfs_write_unlock after the journal_release_error call.
This was reported in https://bugzilla.novell.com/show_bug.cgi?id=591807
Reported-by: Thomas Siedentopf <thomas.siedentopf@novell.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Thomas Siedentopf <thomas.siedentopf@novell.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: 2.6.33.x <stable@kernel.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
proc_oom_score(task) has a reference to task_struct, but that is all.
If this task was already released before we take tasklist_lock
- we can't use task->group_leader, it points to nowhere
- it is not safe to call badness() even if this task is
->group_leader, has_intersects_mems_allowed() assumes
it is safe to iterate over ->thread_group list.
- even worse, badness() can hit ->signal == NULL
Add the pid_alive() check to ensure __unhash_process() was not called.
Also, use "task" instead of task->group_leader. badness() should return
the same result for any sub-thread. Currently this is not true, but
this should be changed anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |/ / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
When using the string representation of a random counter as part of the base
name, ensure that it is no longer than 4 bytes.
Since we are repeatedly decrementing the counter in a loop until we have found a
unique base name, the counter may wrap around zero; therefore, it is not enough
to mask its higher bits before entering the loop, this must be done inside the
loop.
[hirofumi@mail.parknet.co.jp: use snprintf()]
Signed-off-by: Nikolaus Schulz <microschulz@web.de>
Cc: stable@kernel.org
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |\ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
* git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs:
[LogFS] Erase new journal segments
[LogFS] Move reserved segments with journal
[LogFS] Clear PagePrivate when moving journal
Simplify and fix pad_wbuf
Prevent data corruption in logfs_rewrite_block()
Use deactivate_locked_super
Fix logfs_get_sb_final error path
Write out both superblocks on mismatch
Prevent schedule while atomic in __logfs_readdir
Plug memory leak in writeseg_end_io
Limit max_pages for insane devices
Open segment file before using it
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
If the device contains on old logfs image and the journal is moved to
segment that have never been used by the current logfs and not all
journal segments are erased before the next mount, the old content can
confuse mount code. To prevent this, always erase the new journal
segments.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fixes a GC livelock.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
do_logfs_journal_wl_pass() must call freeseg(), thereby clear
PagePrivate on all pages of the current journal segment.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
A comment in the old code read:
/* The math in this function can surely use some love */
And indeed it did. In the case that area->a_used_bytes is exactly
4096 bytes below segment size it fell apart. pad_wbuf is now split
into two helpers that are significantly less complicated.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The comment was correct, so make the code match the comment. As the
new comment indicates, we might be able to do a little less work. But
for the current -rc series let's keep it simple and just fix the bug.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Found by Al Viro.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
rootdir was already allocated, so we must iput it again.
Found by Al Viro.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
If the first superblock is wrong and the second gets written, there
will still be a mismatch on next mount. Write both to make sure they
match.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Apparently filldir can sleep, which forbids kmap_atomic.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Intel SSDs have a limit of 0xffff as queue_max_hw_sectors(q). Such a
limit may make sense from a hardware pov, but it causes bio_alloc() to
return NULL.
Signed-off-by: Joern Engel <joern@logfs.org>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
logfs_recover_sb() needs it open.
Signed-off-by: Joern Engel <joern@logfs.org>
|
|/ / / / / /
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
ocfs2: Fix a race in o2dlm lockres mastery
Ocfs2: Handle deletion of reflinked oprhan inodes correctly.
Ocfs2: Journaling i_flags and i_orphaned_slot when adding inode to orphan dir.
ocfs2: Clear undo bits when local alloc is freed
ocfs2: Init meta_ac properly in ocfs2_create_empty_xattr_block.
ocfs2: Fix the update of name_offset when removing xattrs
ocfs2: Always try for maximum bits with new local alloc windows
ocfs2: set i_mode on disk during acl operations
ocfs2: Update i_blocks in reflink operations.
ocfs2: Change bg_chain check for ocfs2_validate_gd_parent.
[PATCH] Skip check for mandatory locks when unlocking
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
In o2dlm, the master of a lock resource keeps a map of all interested
nodes. This prevents the master from purging the resource before an
interested node can create a lock.
A race between the mastery thread and the mastery handler allowed an
interested node to discover who the master is without informing the
master directly. This is easily fixed by holding the dlm spinlock a
little longer in the mastery handler.
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The rule is that all inodes in the orphan dir have ORPHANED_FL,
otherwise we treated it as an ERROR. This rule works well except
for some rare cases of reflink operation:
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1215
The problem is caused by how reflink and our orphan_scan thread
interact.
* The orphan scan pulls the orphans into a queue first, then runs the
queue at a later time. We only hold the orphan_dir's lock
during scanning.
* Reflink create a oprhaned target in orphan_dir as its first step.
It removes the target and clears the flag as the final step.
These two steps take the orphan_dir's lock, but it is not held for
the duration.
Based on the above semantics, a reflink inode can be moved out of the
orphan dir and have its ORPHANED_FL cleared before the queue of orphans
is run. This leads to a ERROR in ocfs2_query_wipde_inode().
This patch teaches ocfs2_query_wipe_inode() to detect previously
orphaned reflink targets. If a reflink fails or a crash occurs during
the relfink operation, the inode will retain ORPHANED_FL and will be
properly wiped.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Currently, some callers were missing to journal the dirty inode after
adding it to orphan dir.
Now we're going to journal such modifications within the ocfs2_orphan_add()
itself, It's safe to do so, though some existing caller may duplicate this,
and it makes the logic look more straightforward anyway.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
When the local alloc file changes windows, unused bits are freed back to the
global bitmap. By defnition, those bits can not be in use by any file. Also,
the local alloc will never have been able to allocate those bits if they
were part of a previous truncate. Therefore it makes sense that we should
clear unused local alloc bits in the undo buffer so that they can be used
immediatly.
[ Modified to call it ocfs2_release_clusters() -- Joel ]
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
You can't store a pointer that you haven't filled in yet and expect it
to work.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
When replacing a xattr's value, in some case we wipe its name/value
first and then re-add it. The wipe is done by
ocfs2_xa_block_wipe_namevalue() when the xattr is in the inode or
block. We currently adjust name_offset for all the entries which have
(offset < name_offset). This does not adjust the entrie we're replacing.
Since we are replacing the entry, we don't adjust the total entry count.
When we calculate a new namevalue location, we trust the entries
now-wrong offset in ocfs2_xa_get_free_start(). The solution is to
also adjust the name_offset for the replaced entry, allowing
ocfs2_xa_get_free_start() to calculate the new namevalue location
correctly.
The following script can trigger a kernel panic easily.
echo 'y'|mkfs.ocfs2 --fs-features=local,xattr -b 4K $DEVICE
mount -t ocfs2 $DEVICE $MNT_DIR
FILE=$MNT_DIR/$RANDOM
for((i=0;i<76;i++))
do
string_76="a$string_76"
done
string_78="aa$string_76"
string_82="aaaa$string_78"
touch $FILE
setfattr -n 'user.test1234567890' -v $string_76 $FILE
setfattr -n 'user.test1234567890' -v $string_78 $FILE
setfattr -n 'user.test1234567890' -v $string_82 $FILE
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
What we were doing before was to ask for the current window size as the
maximum allocation. This had the effect of limiting the amount of allocation
we could get for the local alloc during times when the window size was
shrunk due to fragmentation. In some cases, that could actually *increase*
fragmentation by artificially limiting the number of bits we can accept. So
while we still want to ask for a minimum number of bits equal to window
size, there is no reason why we should limit the number of bits the local
alloc should accept. Hence always allow the maximum number of local alloc
bits.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
ocfs2_set_acl() and ocfs2_init_acl() were setting i_mode on the in-memory
inode, but never setting it on the disk copy. Thus, acls were some times not
getting propagated between nodes. This patch fixes the issue by adding a
helper function ocfs2_acl_set_mode() which does this the right way.
ocfs2_set_acl() and ocfs2_init_acl() are then updated to call
ocfs2_acl_set_mode().
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
In reflink, we need to upate i_blocks for the target inode.
Reported-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
In ocfs2_validate_gd_parent, we check bg_chain against the
cl_next_free_rec of the dinode. Actually in resize, we have
the chance of bg_chain == cl_next_free_rec. So add some
additional condition check for it.
I also rename paramter "clean_error" to "resize", since the
old one is not clearly enough to indicate that we should only
meet with this case in resize.
btw, the correpsonding bug is
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1230.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
| | |_|_|_|/
| |/| | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
ocfs2_lock() will skip locks on file which has mode set to 02666. This
is a problem in cases where the mode of the file is changed after a
process has obtained a lock on the file.
ocfs2_lock() should skip the check for mandatory locks when unlocking a
file.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (28 commits)
ceph: update discussion list address in MAINTAINERS
ceph: some documentations fixes
ceph: fix use after free on mds __unregister_request
ceph: avoid loaded term 'OSD' in documention
ceph: fix possible double-free of mds request reference
ceph: fix session check on mds reply
ceph: handle kmalloc() failure
ceph: propagate mds session allocation failures to caller
ceph: make write_begin wait propagate ERESTARTSYS
ceph: fix snap rebuild condition
ceph: avoid reopening osd connections when address hasn't changed
ceph: rename r_sent_stamp r_stamp
ceph: fix connection fault con_work reentrancy problem
ceph: prevent dup stale messages to console for restarting mds
ceph: fix pg pool decoding from incremental osdmap update
ceph: fix mds sync() race with completing requests
ceph: only release unused caps with mds requests
ceph: clean up handle_cap_grant, handle_caps wrt session mutex
ceph: fix session locking in handle_caps, ceph_check_caps
ceph: drop unnecessary WARN_ON in caps migration
...
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
There was a use after free in __unregister_request that would trigger
whenever the request map held the last reference. This appears to have
triggered an oops during 'umount -f' when requests are being torn down.
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Clear pointer to mds request after dropping the reference to
ensure we don't drop it again, as there is at least one error
path through this function that does not reset fi->last_readdir
to a new value.
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Fix a broken check that a reply came back from the same MDS we sent the
request to. I don't think a case that actually triggers this would ever
come up in practice, but it's clearly wrong and easy to fix.
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Return ERR_PTR(-ENOMEM) if kmalloc() fails. We handle allocation
failures the same way later in the function.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Return error to original caller if register_session() fails.
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Currently, if the wait_event_interruptible is interrupted, we
return EAGAIN unconditionally and loop, such that we aren't, in
fact, interruptible. So, propagate ERESTARTSYS if we get it.
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
We were rebuilding the snap context when it was not necessary
(i.e. when the realm seq hadn't changed _and_ the parent seq
was still older), which caused page snapc pointers to not match
the realm's snapc pointer (even though the snap context itself
was identical). This confused begin_write and put it into an
endless loop.
The correct logic is: rebuild snapc if _my_ realm seq changed, or
if my parent realm's seq is newer than mine (and thus mine needs
to be rebuilt too).
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
We get a fault callback on _every_ tcp connection fault. Normally, we
want to reopen the connection when that happens. If the address we have
is bad, however, and connection attempts always result in a connection
refused or similar error, explicitly closing and reopening the msgr
connection just prevents the messenger's backoff logic from kicking in.
The result can be a console full of
[ 3974.417106] ceph: osd11 10.3.14.138:6800 connection failed
[ 3974.423295] ceph: osd11 10.3.14.138:6800 connection failed
[ 3974.429709] ceph: osd11 10.3.14.138:6800 connection failed
Instead, if we get a fault, and have outstanding requests, but the osd
address hasn't changed and the connection never successfully connected in
the first place, do nothing to the osd connection. The messenger layer
will back off and retry periodically, because we never connected and thus
the lossy bit is not set.
Instead, touch each request's r_stamp so that handle_timeout can tell the
request is still alive and kicking.
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Make variable name slightly more generic, since it will (soon)
reflect either the time the request was sent OR the time it was
last determined to be still retrying.
Signed-off-by: Sage Weil <sage@newdream.net>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
The messenger fault was clearing the BUSY bit, for reasons unclear. This
made it possible for the con->ops->fault function to reopen the connection,
and requeue work in the workqueue--even though the current thread was
already in con_work.
This avoids a problem where the client busy loops with connection failures
on an unreachable OSD, but doesn't address the root cause of that problem.
Signed-off-by: Sage Weil <sage@newdream.net>
|