aboutsummaryrefslogtreecommitdiffstats
path: root/mm/slob.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2010-08-061-1/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: slub: Allow removal of slab caches during boot Revert "slub: Allow removal of slab caches during boot" slub numa: Fix rare allocation from unexpected node slab: use deferable timers for its periodic housekeeping slub: Use kmem_cache flags to detect if slab is in debugging mode. slub: Allow removal of slab caches during boot slub: Check kasprintf results in kmem_cache_init() SLUB: Constants need UL slub: Use a constant for a unspecified node. SLOB: Free objects to their own list slab: fix caller tracking on !CONFIG_DEBUG_SLAB && CONFIG_TRACING
| * SLOB: Free objects to their own listBob Liu2010-07-161-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SLOB has alloced smaller objects from their own list in reduce overall external fragmentation and increase repeatability, free to their own list also. This is /proc/meminfo result in my test machine: without this patch: === MemTotal: 1030720 kB MemFree: 750012 kB Buffers: 15496 kB Cached: 160396 kB SwapCached: 0 kB Active: 105024 kB Inactive: 145604 kB Active(anon): 74816 kB Inactive(anon): 2180 kB Active(file): 30208 kB Inactive(file): 143424 kB Unevictable: 16 kB .... with this patch: === MemTotal: 1030720 kB MemFree: 751908 kB Buffers: 15492 kB Cached: 160280 kB SwapCached: 0 kB Active: 102720 kB Inactive: 146140 kB Active(anon): 73168 kB Inactive(anon): 2180 kB Active(file): 29552 kB Inactive(file): 143960 kB Unevictable: 16 kB ... The result shows an improvement of 1 MB! And when I tested it on a embeded system with 64 MB, I found this path is never called during kernel bootup. Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Bob Liu <lliubbo@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2010-08-061-1/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits) tracing/kprobes: unregister_trace_probe needs to be called under mutex perf: expose event__process function perf events: Fix mmap offset determination perf, powerpc: fsl_emb: Restore setting perf_sample_data.period perf, powerpc: Convert the FSL driver to use local64_t perf tools: Don't keep unreferenced maps when unmaps are detected perf session: Invalidate last_match when removing threads from rb_tree perf session: Free the ref_reloc_sym memory at the right place x86,mmiotrace: Add support for tracing STOS instruction perf, sched migration: Librarize task states and event headers helpers perf, sched migration: Librarize the GUI class perf, sched migration: Make the GUI class client agnostic perf, sched migration: Make it vertically scrollable perf, sched migration: Parameterize cpu height and spacing perf, sched migration: Fix key bindings perf, sched migration: Ignore unhandled task states perf, sched migration: Handle ignored migrate out events perf: New migration tool overview tracing: Drop cpparg() macro perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call ... Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
| * | tracing: Remove kmemtrace ftrace pluginLi Zefan2010-06-091-1/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | We have been resisting new ftrace plugins and removing existing ones, and kmemtrace has been superseded by kmem trace events and perf-kmem, so we remove it. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> [ remove kmemtrace from the makefile, handle slob too ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
* | mm: remove all rcu head initializationsPaul E. McKenney2010-06-141-1/+0
|/ | | | | | | | | | | | | | Remove all rcu head inits. We don't care about the RCU head state before passing it to call_rcu() anyway. Only leave the "on_stack" variants so debugobjects can keep track of objects on stack. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Matt Mackall <mpm@selenic.com> Cc: Andrew Morton <akpm@linux-foundation.org>
* mm: Move ARCH_SLAB_MINALIGN and ARCH_KMALLOC_MINALIGN to <linux/slob_def.h>David Woodhouse2010-05-191-8/+0
| | | | | | Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* slab: remove duplicate kmem_cache_init_late() declarationsWu Fengguang2009-08-061-0/+5
| | | | | | | | | | kmem_cache_init_late() has been declared in slab.h CC: Nick Piggin <npiggin@suse.de> CC: Matt Mackall <mpm@selenic.com> CC: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* fix RCU-callback-after-kmem_cache_destroy problem in sl[aou]bPaul E. McKenney2009-06-261-0/+2
| | | | | | | | | | | | Jesper noted that kmem_cache_destroy() invokes synchronize_rcu() rather than rcu_barrier() in the SLAB_DESTROY_BY_RCU case, which could result in RCU callbacks accessing a kmem_cache after it had been destroyed. Cc: <stable@kernel.org> Acked-by: Matt Mackall <mpm@selenic.com> Reported-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
*---. Merge branches 'slab/documentation', 'slab/fixes', 'slob/cleanups' and ↵Pekka Enberg2009-06-171-3/+3
|\ \ \ | | | | | | | | | | | | 'slub/fixes' into for-linus
| | | * slob: use PG_slab for identifying SLOB pagesWu Fengguang2009-05-111-3/+3
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For the sake of consistency. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Andi Kleen <andi@firstfloor.org> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | | page allocator: do not check NUMA node ID when the caller knows the node is ↵Mel Gorman2009-06-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | valid Callers of alloc_pages_node() can optionally specify -1 as a node to mean "allocate from the current node". However, a number of the callers in fast paths know for a fact their node is valid. To avoid a comparison and branch, this patch adds alloc_pages_exact_node() that only checks the nid with VM_BUG_ON(). Callers that know their node is valid are then converted. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Paul Mundt <lethal@linux-sh.org> [for the SLOB NUMA bits] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | kmemleak: Add the slob memory allocation/freeing hooksCatalin Marinas2009-06-111-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the callbacks to kmemleak_(alloc|free) functions from the slob allocator. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Matt Mackall <mpm@selenic.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | | Merge branch 'tracing-for-linus' of ↵Linus Torvalds2009-06-101-1/+1
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits) Revert "x86, bts: reenable ptrace branch trace support" tracing: do not translate event helper macros in print format ftrace/documentation: fix typo in function grapher name tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK tracing: add protection around module events unload tracing: add trace_seq_vprint interface tracing: fix the block trace points print size tracing/events: convert block trace points to TRACE_EVENT() ring-buffer: fix ret in rb_add_time_stamp ring-buffer: pass in lockdep class key for reader_lock tracing: add annotation to what type of stack trace is recorded tracing: fix multiple use of __print_flags and __print_symbolic tracing/events: fix output format of user stack tracing/events: fix output format of kernel stack tracing/trace_stack: fix the number of entries in the header ring-buffer: discard timestamps that are at the start of the buffer ring-buffer: try to discard unneeded timestamps ring-buffer: fix bug in ring_buffer_discard_commit ftrace: do not profile functions when disabled tracing: make trace pipe recognize latency format flag ...
| * | tracing, kmemtrace: Separate include/trace/kmemtrace.h to kmemtrace part and ↵Zhaolei2009-04-121-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tracepoint part Impact: refactor code for future changes Current kmemtrace.h is used both as header file of kmemtrace and kmem's tracepoints definition. Tracepoints' definition file may be used by other code, and should only have definition of tracepoint. We can separate include/trace/kmemtrace.h into 2 files: include/linux/kmemtrace.h: header file for kmemtrace include/trace/kmem.h: definition of kmem tracepoints Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <49DEE68A.5040902@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | mm: SLOB fix reclaim_stateNick Piggin2009-05-061-1/+4
|/ | | | | | | | | | | | SLOB does not correctly account reclaim_state.reclaimed_slab, so it will break memory reclaim. Account it like SLAB does. Cc: stable@kernel.org Cc: linux-mm@kvack.org Acked-by: Matt Mackall <mpm@selenic.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* kmemtrace: trace kfree() calls with NULL or zero-length objectsPekka Enberg2009-04-031-2/+2
| | | | | | | | | | | | | Impact: also output kfree(NULL) entries This patch moves the trace_kfree() calls before the ZERO_OR_NULL_PTR check so that we can trace call-sites that call kfree() with NULL many times which might be an indication of a bug. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> LKML-Reference: <1237971957.30175.18.camel@penberg-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* kmemtrace: use tracepointsEduard - Gabriel Munteanu2009-04-031-16/+12
| | | | | | | | | | | | | kmemtrace now uses tracepoints instead of markers. We no longer need to use format specifiers to pass arguments. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> [ folded: Use the new TP_PROTO and TP_ARGS to fix the build. ] [ folded: fix build when CONFIG_KMEMTRACE is disabled. ] [ folded: define tracepoints when CONFIG_TRACEPOINTS is enabled. ] Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <ae61c0f37156db8ec8dc0d5778018edde60a92e3.1237813499.git.eduard.munteanu@linux360.ro> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge branch 'tracing/core-v2' into tracing-for-linusIngo Molnar2009-04-021-5/+30
|\ | | | | | | | | | | | | | | Conflicts: include/linux/slub_def.h lib/Kconfig.debug mm/slob.c mm/slub.c
| * Merge branch 'core/locking' into tracing/ftraceIngo Molnar2009-03-041-0/+2
| |\
| * \ Merge branches 'tracing/ftrace', 'tracing/ring-buffer', 'tracing/sysprof', ↵Ingo Molnar2009-02-131-0/+1
| |\ \ | | | | | | | | | | | | 'tracing/urgent' and 'linus' into tracing/core
| * | | tracing/kmemtrace: normalize the raw tracer event to the unified tracing APIFrederic Weisbecker2008-12-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: new tracer plugin This patch adapts kmemtrace raw events tracing to the unified tracing API. To enable and use this tracer, just do the following: echo kmemtrace > /debugfs/tracing/current_tracer cat /debugfs/tracing/trace You will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE | | | | | | | | # | type_id 1 call_site 18446744071565527833 ptr 18446612134395152256 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1 type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1 type_id 1 call_site 18446744071565585534 ptr 18446612134405955584 That was to stay backward compatible with the format output produced in inux/tracepoint.h. This is the default ouput, but note that I tried something else. If you change an option: echo kmem_minimalistic > /debugfs/trace_options and then cat /debugfs/trace, you will have the following output: # tracer: kmemtrace # # # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER # FREE | | | | | | | | # | - C 0xffff88007c088780 file_free_rcu + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc780 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc870 -1 d_alloc - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname + K 240 240 000000d0 0xffff8800790dc960 -1 d_alloc + K 1304 1312 000000d0 0xffff8800791d7340 -1 reiserfs_alloc_inode - C 0xffff88007cad6000 putname + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname - C 0xffff88007cad6000 putname + K 992 1000 000000d0 0xffff880079045b58 -1 alloc_inode + K 768 1024 000080d0 0xffff88007c096400 -1 alloc_pipe_info + K 240 240 000000d0 0xffff8800790dca50 -1 d_alloc + K 272 320 000080d0 0xffff88007c088780 -1 get_empty_filp + K 272 320 000080d0 0xffff88007c088000 -1 get_empty_filp Yeah I shall confess kmem_minimalistic should be: kmem_alternative. Whatever, I find it more readable but this a personal opinion of course. We can drop it if you want. On the ALLOC/FREE column, + means an allocation and - a free. On the type column, you have K = kmalloc, C = cache, P = page I would like the flags to be GFP_* strings but that would not be easy to not break the column with strings.... About the node...it seems to always be -1. I don't know why but that shouldn't be difficult to find. I moved linux/tracepoint.h to trace/tracepoint.h as well. I think that would be more easy to find the tracer headers if they are all in their common directory. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | Merge branch 'topic/kmemtrace' of ↵Ingo Molnar2008-12-291-6/+31
| |\ \ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 into tracing/kmemtrace
| | * | | kmemtrace: SLOB hooks.Eduard - Gabriel Munteanu2008-12-291-6/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds hooks for the SLOB allocator, to allow tracing with kmemtrace. We also convert some inline functions to __always_inline to make sure _RET_IP_, which expands to __builtin_return_address(0), always works as expected. Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | | | | Merge branch 'locking-for-linus' of ↵Linus Torvalds2009-03-301-0/+2
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (33 commits) lockdep: fix deadlock in lockdep_trace_alloc lockdep: annotate reclaim context (__GFP_NOFS), fix SLOB lockdep: annotate reclaim context (__GFP_NOFS), fix lockdep: build fix for !PROVE_LOCKING lockstat: warn about disabled lock debugging lockdep: use stringify.h lockdep: simplify check_prev_add_irq() lockdep: get_user_chars() redo lockdep: simplify get_user_chars() lockdep: add comments to mark_lock_irq() lockdep: remove macro usage from mark_held_locks() lockdep: fully reduce mark_lock_irq() lockdep: merge the !_READ mark_lock_irq() helpers lockdep: merge the _READ mark_lock_irq() helpers lockdep: simplify mark_lock_irq() helpers #3 lockdep: further simplify mark_lock_irq() helpers lockdep: simplify the mark_lock_irq() helpers lockdep: split up mark_lock_irq() lockdep: generate usage strings lockdep: generate the state bit definitions ...
| * | | | | lockdep: annotate reclaim context (__GFP_NOFS), fix SLOBIngo Molnar2009-03-301-1/+1
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: build fix fix typo in mm/slob.c: mm/slob.c:469: error: ‘flags’ undeclared (first use in this function) mm/slob.c:469: error: (Each undeclared identifier is reported only once mm/slob.c:469: error: for each function it appears in.) Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090128135457.350751756@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | lockdep: annotate reclaim context (__GFP_NOFS)Nick Piggin2009-02-141-0/+2
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is another version, with the incremental patch rolled up, and added reclaim context annotation to kswapd, and allocation tracing to slab allocators (which may only ever reach the page allocator in rare cases, so it is good to put annotations here too). Haven't tested this version as such, but it should be getting closer to merge worthy ;) -- After noticing some code in mm/filemap.c accidentally perform a __GFP_FS allocation when it should not have been, I thought it might be a good idea to try to catch this kind of thing with lockdep. I coded up a little idea that seems to work. Unfortunately the system has to actually be in __GFP_FS page reclaim, then take the lock, before it will mark it. But at least that might still be some orders of magnitude more common (and more debuggable) than an actual deadlock condition, so we have some improvement I hope (the concept is no less complete than discovery of a lock's interrupt contexts). I guess we could even do the same thing with __GFP_IO (normal reclaim), and even GFP_NOIO locks too... but filesystems will have the most locks and fiddly code paths, so let's start there and see how it goes. It *seems* to work. I did a quick test. ================================= [ INFO: inconsistent lock state ] 2.6.28-rc6-00007-ged31348-dirty #26 --------------------------------- inconsistent {in-reclaim-W} -> {ov-reclaim-W} usage. modprobe/8526 [HC0[0]:SC0[0]:HE1:SE1] takes: (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd] {in-reclaim-W} state was registered at: [<ffffffff80267bdb>] __lock_acquire+0x75b/0x1a60 [<ffffffff80268f71>] lock_acquire+0x91/0xc0 [<ffffffff8070f0e1>] mutex_lock_nested+0xb1/0x310 [<ffffffffa002002b>] brd_init+0x2b/0x216 [brd] [<ffffffff8020903b>] _stext+0x3b/0x170 [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0 [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff irq event stamp: 3929 hardirqs last enabled at (3929): [<ffffffff8070f2b5>] mutex_lock_nested+0x285/0x310 hardirqs last disabled at (3928): [<ffffffff8070f089>] mutex_lock_nested+0x59/0x310 softirqs last enabled at (3732): [<ffffffff8061f623>] sk_filter+0x83/0xe0 softirqs last disabled at (3730): [<ffffffff8061f5b6>] sk_filter+0x16/0xe0 other info that might help us debug this: 1 lock held by modprobe/8526: #0: (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd] stack backtrace: Pid: 8526, comm: modprobe Not tainted 2.6.28-rc6-00007-ged31348-dirty #26 Call Trace: [<ffffffff80265483>] print_usage_bug+0x193/0x1d0 [<ffffffff80266530>] mark_lock+0xaf0/0xca0 [<ffffffff80266735>] mark_held_locks+0x55/0xc0 [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd] [<ffffffff802667ca>] trace_reclaim_fs+0x2a/0x60 [<ffffffff80285005>] __alloc_pages_internal+0x475/0x580 [<ffffffff8070f29e>] ? mutex_lock_nested+0x26e/0x310 [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd] [<ffffffffa002006a>] brd_init+0x6a/0x216 [brd] [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd] [<ffffffff8020903b>] _stext+0x3b/0x170 [<ffffffff8070f8b9>] ? mutex_unlock+0x9/0x10 [<ffffffff8070f83d>] ? __mutex_unlock_slowpath+0x10d/0x180 [<ffffffff802669ec>] ? trace_hardirqs_on_caller+0x12c/0x190 [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0 [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | |
| \ \ \
| \ \ \
| \ \ \
*---. \ \ \ Merge branches 'topic/slob/cleanups', 'topic/slob/fixes', 'topic/slub/core', ↵Pekka Enberg2009-03-241-16/+27
|\ \ \ \ \ \ | | | |/ / / | | | | | / | |_|_|_|/ |/| | | | 'topic/slub/cleanups' and 'topic/slub/perf' into for-linus
| | * | | slob: fix lockup in slob_free()Nick Piggin2009-03-231-1/+2
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't hold SLOB lock when freeing the page. Reduces lock hold width. See the following thread for discussion of the bug: http://marc.info/?l=linux-kernel&m=123709983214143&w=2 Reported-by: Ingo Molnar <mingo@elte.hu> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
| * | | slob: clean up the codeAmérico Wang2009-01-191-15/+25
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Use NULL instead of plain 0; - Rename slob_page() to is_slob_page(); - Define slob_page() to convert void* to struct slob_page*; - Rename slob_new_page() to slob_new_pages(); - Define slob_free_pages() accordingly. Compile tests only. Signed-off-by: WANG Cong <wangcong@zeuux.org> Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | | mm: Export symbol ksize()Kirill A. Shutemov2009-02-121-0/+1
|/ / | | | | | | | | | | | | | | | | | | Commit 7b2cd92adc5430b0c1adeb120971852b4ea1ab08 ("crypto: api - Fix zeroing on free") added modular user of ksize(). Export that to fix crypto.ko compilation. Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* | slob: do not pass the SLAB flags as GFP in kmem_cache_create()Catalin Marinas2008-12-151-1/+1
|/ | | | | | | | | | | | | | The kmem_cache_create() function in the slob allocator passes the SLAB flags as GFP flags to the slob_alloc() function. The patch changes this call to pass GFP_KERNEL as the other allocators seem to do. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* SLOB: fix bogus ksize calculation fixMatt Mackall2008-10-091-3/+5
| | | | | | | | | | | | | This fixes the previous fix, which was completely wrong on closer inspection. This version has been manually tested with a user-space test harness and generates sane values. A nearly identical patch has been boot-tested. The problem arose from changing how kmalloc/kfree handled alignment padding without updating ksize to match. This brings it in sync. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* SLOB: fix bogus ksize calculationMatt Mackall2008-10-071-1/+1
| | | | | | | | | | | SLOB's ksize calculation was braindamaged and generally harmlessly underreported the allocation size. But for very small buffers, it could in fact overreport them, leading code depending on krealloc to overrun the allocation and trample other data. Signed-off-by: Matt Mackall <mpm@selenic.com> Tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: unexport ksizeAdrian Bunk2008-07-291-1/+0
| | | | | | | This patch removes the obsolete and no longer used exports of ksize. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* SL*B: drop kmem cache argument from constructorAlexey Dobriyan2008-07-261-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* slob: record page flag overlays explicitlyAndy Whitcroft2008-07-241-6/+6
| | | | | | | | | | | | | | | | | | SLOB reuses two page bits for internal purposes, it overlays PG_active and PG_private. This is hidden away in slob.c. Document these overlays explicitly in the main page-flags enum along with all the others. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* slob: Fix to return wrong pointerMinChan Kim2008-05-191-2/+3
| | | | | | | | | | | | | Although slob_alloc return NULL, __kmalloc_node returns NULL + align. Because align always can be changed, it is very hard for debugging problem of no page if it don't return NULL. We have to return NULL in case of no page. [penberg@cs.helsinki.fi: fix formatting as suggested by Matt.] Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: MinChan Kim <minchan.kim@gmail.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* slob: fix bug - when slob allocates "struct kmem_cache", it does not force ↵Yi Li2008-04-271-1/+2
| | | | | | | | | | | alignment. This may trigger misaligned memory access exception. Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Yi Li <yi.li@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
* slob: reduce external fragmentation by using three free listsMatt Mackall2008-02-051-14/+33
| | | | | | | | | | By putting smaller objects on their own list, we greatly reduce overall external fragmentation and increase repeatability. This reduces total SLOB overhead from > 50% to ~6% on a simple boot test. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* slob: fix free block merging at head of subpageMatt Mackall2008-02-051-0/+4
| | | | | | | | | We weren't merging freed blocks at the beginning of the free list. Fixing this showed a 2.5% efficiency improvement in a userspace test harness. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Avoid double memclear() in SLOB/SLUBLinus Torvalds2007-12-091-1/+1
| | | | | | | | | | Both slob and slub react to __GFP_ZERO by clearing the allocation, which means that passing the GFP_ZERO bit down to the page allocator is just wasteful and pointless. Acked-by: Matt Mackall <mpm@selenic.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Add EXPORT_SYMBOL(ksize);Tetsuo Handa2007-12-051-0/+1
| | | | | | | | | | | | mm/slub.c exports ksize(), but mm/slob.c and mm/slab.c don't. It's used by binfmt_flat, which can be built as a module. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Christoph Lameter <clameter@sgi.com> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* slob: fix memory corruptionNick Piggin2007-11-151-1/+2
| | | | | | | | | | | | | | | | | | Previously, it would be possible for prev->next to point to &free_slob_pages, and thus we would try to move a list onto itself, and bad things would happen. It seems a bit hairy to be doing list operations with the list marker as an entry, rather than a head, but... this resolves the following crash: http://bugzilla.kernel.org/show_bug.cgi?id=9379 Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Slab API: remove useless ctor parameter and reorder parametersChristoph Lameter2007-10-171-3/+3
| | | | | | | | | | | | | | | | | | | | | Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void *object, struct kmem_cache *s, unsigned long flags) to ctor(struct kmem_cache *s, void *object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Slab allocators: fail if ksize is called with a NULL parameterChristoph Lameter2007-10-161-1/+2
| | | | | | | | | | | | | | | | A NULL pointer means that the object was not allocated. One cannot determine the size of an object that has not been allocated. Currently we return 0 but we really should BUG() on attempts to determine the size of something nonexistent. krealloc() interprets NULL to mean a zero sized object. Handle that separately in krealloc(). Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* {slub, slob}: use unlikely() for kfree(ZERO_OR_NULL_PTR) checkSatyam Sharma2007-10-161-3/+3
| | | | | | | | | | | | | Considering kfree(NULL) would normally occur only in error paths and kfree(ZERO_SIZE_PTR) is uncommon as well, so let's use unlikely() for the condition check in SLUB's and SLOB's kfree() to optimize for the common case. SLAB has this already. Signed-off-by: Satyam Sharma <satyam@infradead.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* slob: reduce list scanningMatt Mackall2007-07-211-5/+16
| | | | | | | | | | | | | | | | | | | | The version of SLOB in -mm always scans its free list from the beginning, which results in small allocations and free segments clustering at the beginning of the list over time. This causes the average search to scan over a large stretch at the beginning on each allocation. By starting each page search where the last one left off, we evenly distribute the allocations and greatly shorten the average search. Without this patch, kernel compiles on a 1.5G machine take a large amount of system time for list scanning. With this patch, compiles are within a few seconds of performance of a SLAB kernel with no notable change in system time. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: Remove slab destructors from kmem_cache_create().Paul Mundt2007-07-201-2/+1
| | | | | | | | | | | | | | Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* Slab allocators: Cleanup zeroing allocationsChristoph Lameter2007-07-171-10/+0
| | | | | | | | | | | It becomes now easy to support the zeroing allocs with generic inline functions in slab.h. Provide inline definitions to allow the continued use of kzalloc, kmem_cache_zalloc etc but remove other definitions of zeroing functions from the slab allocators and util.c. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Slab allocators: support __GFP_ZERO in all allocatorsChristoph Lameter2007-07-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A kernel convention for many allocators is that if __GFP_ZERO is passed to an allocator then the allocated memory should be zeroed. This is currently not supported by the slab allocators. The inconsistency makes it difficult to implement in derived allocators such as in the uncached allocator and the pool allocators. In addition the support zeroed allocations in the slab allocators does not have a consistent API. There are no zeroing allocator functions for NUMA node placement (kmalloc_node, kmem_cache_alloc_node). The zeroing allocations are only provided for default allocs (kzalloc, kmem_cache_zalloc_node). __GFP_ZERO will make zeroing universally available and does not require any addititional functions. So add the necessary logic to all slab allocators to support __GFP_ZERO. The code is added to the hot path. The gfp flags are on the stack and so the cacheline is readily available for checking if we want a zeroed object. Zeroing while allocating is now a frequent operation and we seem to be gradually approaching a 1-1 parity between zeroing and not zeroing allocs. The current tree has 3476 uses of kmalloc vs 2731 uses of kzalloc. Signed-off-by: Christoph Lameter <clameter@sgi.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>