diff options
author | KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> | 2012-01-12 17:17:44 -0800 |
---|---|---|
committer | Greg Kroah-Hartman <gregkh@suse.de> | 2012-01-25 17:24:43 -0800 |
commit | ea1c62778121f6ece5e0120250716b45e204cb13 (patch) | |
tree | 65501073ab7e5a33a351f375bb0fa24216852ef9 /include | |
parent | b9e11747e1227d7ad67c5b80be4b206e4059687e (diff) | |
download | kernel_samsung_espresso10-ea1c62778121f6ece5e0120250716b45e204cb13.zip kernel_samsung_espresso10-ea1c62778121f6ece5e0120250716b45e204cb13.tar.gz kernel_samsung_espresso10-ea1c62778121f6ece5e0120250716b45e204cb13.tar.bz2 |
memcg: add mem_cgroup_replace_page_cache() to fix LRU issue
commit ab936cbcd02072a34b60d268f94440fd5cf1970b upstream.
Commit ef6a3c6311 ("mm: add replace_page_cache_page() function") added a
function replace_page_cache_page(). This function replaces a page in the
radix-tree with a new page. WHen doing this, memory cgroup needs to fix
up the accounting information. memcg need to check PCG_USED bit etc.
In some(many?) cases, 'newpage' is on LRU before calling
replace_page_cache(). So, memcg's LRU accounting information should be
fixed, too.
This patch adds mem_cgroup_replace_page_cache() and removes the old hooks.
In that function, old pages will be unaccounted without touching
res_counter and new page will be accounted to the memcg (of old page).
WHen overwriting pc->mem_cgroup of newpage, take zone->lru_lock and avoid
races with LRU handling.
Background:
replace_page_cache_page() is called by FUSE code in its splice() handling.
Here, 'newpage' is replacing oldpage but this newpage is not a newly allocated
page and may be on LRU. LRU mis-accounting will be critical for memory cgroup
because rmdir() checks the whole LRU is empty and there is no account leak.
If a page is on the other LRU than it should be, rmdir() will fail.
This bug was added in March 2011, but no bug report yet. I guess there
are not many people who use memcg and FUSE at the same time with upstream
kernels.
The result of this bug is that admin cannot destroy a memcg because of
account leak. So, no panic, no deadlock. And, even if an active cgroup
exist, umount can succseed. So no problem at shutdown.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Diffstat (limited to 'include')
-rw-r--r-- | include/linux/memcontrol.h | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 50940da..313a00e 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -119,6 +119,8 @@ struct zone_reclaim_stat* mem_cgroup_get_reclaim_stat_from_page(struct page *page); extern void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p); +extern void mem_cgroup_replace_page_cache(struct page *oldpage, + struct page *newpage); #ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP extern int do_swap_account; @@ -370,6 +372,10 @@ static inline void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx) { } +static inline void mem_cgroup_replace_page_cache(struct page *oldpage, + struct page *newpage) +{ +} #endif /* CONFIG_CGROUP_MEM_CONT */ #if !defined(CONFIG_CGROUP_MEM_RES_CTLR) || !defined(CONFIG_DEBUG_VM) |