aboutsummaryrefslogtreecommitdiffstats
path: root/include
diff options
context:
space:
mode:
authorKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>2012-01-12 17:17:44 -0800
committerGreg Kroah-Hartman <gregkh@suse.de>2012-01-25 17:24:43 -0800
commitea1c62778121f6ece5e0120250716b45e204cb13 (patch)
tree65501073ab7e5a33a351f375bb0fa24216852ef9 /include
parentb9e11747e1227d7ad67c5b80be4b206e4059687e (diff)
downloadkernel_samsung_espresso10-ea1c62778121f6ece5e0120250716b45e204cb13.zip
kernel_samsung_espresso10-ea1c62778121f6ece5e0120250716b45e204cb13.tar.gz
kernel_samsung_espresso10-ea1c62778121f6ece5e0120250716b45e204cb13.tar.bz2
memcg: add mem_cgroup_replace_page_cache() to fix LRU issue
commit ab936cbcd02072a34b60d268f94440fd5cf1970b upstream. Commit ef6a3c6311 ("mm: add replace_page_cache_page() function") added a function replace_page_cache_page(). This function replaces a page in the radix-tree with a new page. WHen doing this, memory cgroup needs to fix up the accounting information. memcg need to check PCG_USED bit etc. In some(many?) cases, 'newpage' is on LRU before calling replace_page_cache(). So, memcg's LRU accounting information should be fixed, too. This patch adds mem_cgroup_replace_page_cache() and removes the old hooks. In that function, old pages will be unaccounted without touching res_counter and new page will be accounted to the memcg (of old page). WHen overwriting pc->mem_cgroup of newpage, take zone->lru_lock and avoid races with LRU handling. Background: replace_page_cache_page() is called by FUSE code in its splice() handling. Here, 'newpage' is replacing oldpage but this newpage is not a newly allocated page and may be on LRU. LRU mis-accounting will be critical for memory cgroup because rmdir() checks the whole LRU is empty and there is no account leak. If a page is on the other LRU than it should be, rmdir() will fail. This bug was added in March 2011, but no bug report yet. I guess there are not many people who use memcg and FUSE at the same time with upstream kernels. The result of this bug is that admin cannot destroy a memcg because of account leak. So, no panic, no deadlock. And, even if an active cgroup exist, umount can succseed. So no problem at shutdown. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Miklos Szeredi <mszeredi@suse.cz> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Diffstat (limited to 'include')
-rw-r--r--include/linux/memcontrol.h6
1 files changed, 6 insertions, 0 deletions
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 50940da..313a00e 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -119,6 +119,8 @@ struct zone_reclaim_stat*
mem_cgroup_get_reclaim_stat_from_page(struct page *page);
extern void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
struct task_struct *p);
+extern void mem_cgroup_replace_page_cache(struct page *oldpage,
+ struct page *newpage);
#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
extern int do_swap_account;
@@ -370,6 +372,10 @@ static inline
void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
{
}
+static inline void mem_cgroup_replace_page_cache(struct page *oldpage,
+ struct page *newpage)
+{
+}
#endif /* CONFIG_CGROUP_MEM_CONT */
#if !defined(CONFIG_CGROUP_MEM_RES_CTLR) || !defined(CONFIG_DEBUG_VM)