aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/trace
Commit message (Collapse)AuthorAgeFilesLines
* tracing: Fix compile issue for trace_sched_wakeup.cSteven Rostedt2010-10-191-2/+1
| | | | | | | | | | | | | | The function start_func_tracer() was incorrectly added in the #ifdef CONFIG_FUNCTION_TRACER condition, but is still used even when function tracing is not enabled. The calls to register_ftrace_function() and register_ftrace_graph() become nops (and their arguments are even ignored), thus there is no reason to hide start_func_tracer() when function tracing is not enabled. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Remove parent recording in latency tracer graph optionsSteven Rostedt2010-10-181-1/+0
| | | | | | | | | | | | | | | | | | | | | | Even though the parent is recorded with the normal function tracing of the latency tracers (irqsoff and wakeup), the function graph recording is bogus. This is due to the function graph messing with the return stack. The latency tracers pass in as the parent CALLER_ADDR0, which works fine for plain function tracing. But this causes bogus output with the graph tracer: 3) <idle>-0 | d.s3. 0.000 us | return_to_handler(); 3) <idle>-0 | d.s3. 0.000 us | _raw_spin_unlock_irqrestore(); 3) <idle>-0 | d.s3. 0.000 us | return_to_handler(); 3) <idle>-0 | d.s3. 0.000 us | trace_hardirqs_on(); The "return_to_handle()" call is the trampoline of the function graph tracer, and is meaningless in this context. Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Use one prologue for the preempt irqs off tracer function tracersSteven Rostedt2010-10-181-48/+48
| | | | | | | | | | | | | The preempt and irqsoff tracers have three types of function tracers. Normal function tracer, function graph entry, and function graph return. Each of these use a complex dance to prevent recursion and whether to trace the data or not (depending if interrupts are enabled or not). This patch moves the duplicate code into a single routine, to prevent future mistakes with modifying duplicate complex code. Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Use one prologue for the wakeup tracer function tracersSteven Rostedt2010-10-181-52/+50
| | | | | | | | | | | | | The wakeup tracer has three types of function tracers. Normal function tracer, function graph entry, and function graph return. Each of these use a complex dance to prevent recursion and whether to trace the data or not (depending on the wake_task variable). This patch moves the duplicate code into a single routine, to prevent future mistakes with modifying duplicate complex code. Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Graph support for wakeup tracerJiri Olsa2010-10-181-10/+221
| | | | | | | | | | Add function graph support for wakeup latency tracer. The graph output is enabled by setting the 'display-graph' trace option. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <1285243253-7372-4-git-send-email-jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Make graph related irqs/preemptsoff functions globalJiri Olsa2010-10-183-52/+71
| | | | | | | | | Move trace_graph_function() and print_graph_headers_flags() functions to the trace_function_graph.c to be globaly available. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <1285243253-7372-3-git-send-email-jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing: Add proper check for irq_depth routinesJiri Olsa2010-10-181-4/+20
| | | | | | | | | | | | | The check_irq_entry and check_irq_return could be called from graph event context. In such case there's no graph private data allocated. Adding checks to handle this case. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <20100924154102.GB1818@jolsa.brq.redhat.com> [ Fixed some grammar in the comments ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* tracing/trivial: Remove cast from void*matt mooney2010-10-182-3/+3
| | | | | | | Unnecessary cast from void* in assignment. Signed-off-by: matt mooney <mfm@muteddisk.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* Merge branch 'tip/perf/recordmcount-2' of ↵Ingo Molnar2010-10-151-0/+5
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
| * ftrace: Rename config option HAVE_C_MCOUNT_RECORD to HAVE_C_RECORDMCOUNTSteven Rostedt2010-10-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The config option used by archs to let the build system know that the C version of the recordmcount works for said arch is currently called HAVE_C_MCOUNT_RECORD which enables BUILD_C_RECORDMCOUNT. To be more consistent with the name that all archs may use, it has been renamed to HAVE_C_RECORDMCOUNT. This will be less confusing since we are building a C recordmcount and not a mcount_record. Suggested-by: Ingo Molnar <mingo@elte.hu> Cc: <linux-arch@vger.kernel.org> Cc: Michal Marek <mmarek@suse.cz> Cc: linux-kbuild@vger.kernel.org Cc: John Reiser <jreiser@bitwagon.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * ftrace/x86: Add support for C version of recordmcountSteven Rostedt2010-10-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the support for the C version of recordmcount and compile times show ~ 12% improvement. After verifying this works, other archs can add: HAVE_C_MCOUNT_RECORD in its Kconfig and it will use the C version of recordmcount instead of the perl version. Cc: <linux-arch@vger.kernel.org> Cc: Michal Marek <mmarek@suse.cz> Cc: linux-kbuild@vger.kernel.org Cc: John Reiser <jreiser@bitwagon.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | tracing: Fix function-graph build warning on 32-bitBorislav Petkov2010-10-131-2/+3
|/ | | | | | | | | | | | | | | | | Fix kernel/trace/trace_functions_graph.c: In function ‘trace_print_graph_duration’: kernel/trace/trace_functions_graph.c:652: warning: comparison of distinct pointer types lacks a cast when building 36-rc6 on a 32-bit due to the strict type check failing in the min() macro. Signed-off-by: Borislav Petkov <bp@alien8.de> Cc: Chase Douglas <chase.douglas@canonical.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20100929080823.GA13595@liondog.tnic> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
* jump label: Initialize workqueue tracepoints *before* they are registeredJason Baron2010-09-221-5/+5
| | | | | | | | | Initialize the workqueue data structures *before* they are registered so that they are ready for callbacks. Signed-off-by: Jason Baron <jbaron@redhat.com> LKML-Reference: <e3a3383fc370ac7086625bebe89d9480d7caf372.1284733808.git.jbaron@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* Merge branch 'tip/perf/core' of ↵Ingo Molnar2010-09-152-31/+78
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
| * tracing: Fix reading of set_ftrace_filter across listsSteven Rostedt2010-09-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we do: # cd /sys/kernel/debug # echo 'do_IRQ:traceon schedule:traceon sys_write:traceon' > \ set_ftrace_filter # cat set_ftrace_filter We get the following output: #### all functions enabled #### sys_write:traceon:unlimited schedule:traceon:unlimited do_IRQ:traceon:unlimited This outputs two lists. One is the fact that all functions are currently enabled for function tracing, the other has three probed functions, which happen to have 'traceon' as their commands. Currently, when reading the first list (functions enabled) the seq_file code will receive a "NULL" from the t_next() function causing it to exit early. This makes "read()" from userspace stop reading the code at this boarder. Although read is allowed to do this, some (broken) applications might consider this an end of file and stop early. This patch adds the start of the second list to t_next() when it finishes the first list. It is a simple change and gives the set_ftrace_filter file nicer reading ability. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Keep track of set_ftrace_filter position and allow lseek againSteven Rostedt2010-09-141-8/+26
| | | | | | | | | | | | | | | | | | | | This patch keeps track of the index within the elements of set_ftrace_filter and if the position goes backwards, it nicely resets and starts from the beginning again. This allows for lseek and pread to work properly now. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Replace typecasted void pointer in set_ftrace_filter codeSteven Rostedt2010-09-141-21/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The set_ftrace_filter uses seq_file and reads from two lists. The pointer returned by t_next() can either be of type struct dyn_ftrace or struct ftrace_func_probe. If there is a bug (there was one) the wrong pointer may be used and the reference can cause an oops. This patch makes t_next() and friends only return the iterator structure which now has a pointer of type struct dyn_ftrace and struct ftrace_func_probe. The t_show() can now test if the pointer is NULL or not and if the pointer exists, it is guaranteed to be of the correct type. Now if there's a bug, only wrong data will be shown but not an oops. Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * tracing: Do not reset *pos in set_ftrace_filterSteven Rostedt2010-09-141-2/+6
| | | | | | | | | | | | | | | | | | | | After the filtered functions are read, the probed functions are read from the hash in set_ftrace_filter. When the hashed probed functions are read, the *pos passed in is reset. Instead of modifying the pos given to the read function, just record the pos where the filtered functions ended and subtract from that. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2010-09-103-19/+31
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: t_start: reset FTRACE_ITER_HASH in case of seek/pread perf symbols: Fix multiple initialization of symbol system perf: Fix CPU hotplug perf, trace: Fix module leak tracing/kprobe: Fix handling of C-unlike argument names tracing/kprobes: Fix handling of argument names perf probe: Fix handling of arguments names perf probe: Fix return probe support tracing/kprobe: Fix a memory leak in error case tracing: Do not allow llseek to set_ftrace_filter
| | * tracing: t_start: reset FTRACE_ITER_HASH in case of seek/preadChris Wright2010-09-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Be sure to avoid entering t_show() with FTRACE_ITER_HASH set without having properly started the iterator to iterate the hash. This case is degenerate and, as discovered by Robert Swiecki, can cause t_hash_show() to misuse a pointer. This causes a NULL ptr deref with possible security implications. Tracked as CVE-2010-3079. Cc: Robert Swiecki <swiecki@google.com> Cc: Eugene Teo <eugene@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds2010-09-081-2/+0
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: gcc-4.6: kernel/*: Fix unused but set warnings mutex: Fix annotations to include it in kernel-locking docbook pid: make setpgid() system call use RCU read-side critical section MAINTAINERS: Add RCU's public git tree
| | * | gcc-4.6: kernel/*: Fix unused but set warningsAndi Kleen2010-09-051-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No real bugs I believe, just some dead code. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: andi@firstfloor.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | tracing: Remove leftover FTRACE_ENABLE/DISABLE_MCOUNT enumsSteven Rostedt2010-09-141-14/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The enums for FTRACE_ENABLE_MCOUNT and FTRACE_DISABLE_MCOUNT were used as commands to ftrace_run_update_code(). But these commands were used by the old nasty ftrace daemon that has long been slain. This is a clean up patch to remove the references to these enums and simplify the code a little. Reported-by: Wu Zhangjin <wuzhangjin@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | | tracing: Do not trace in irq when funcgraph-irq option is zeroSteven Rostedt2010-09-141-1/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the function graph tracer funcgraph-irq option is zero, disable tracing in IRQs. This makes the option have two effects. 1) When reading the trace file, do not display the functions that happen in interrupt context (when detected) 2) [*new*] When recording a trace, skip those that are detected to be in interrupt by the 'in_irq()' function Note, in_irq() is updated at irq_enter() and irq_exit(). There are still functions that are recorded by the function graph tracer that is in interrupt context but outside the irq_enter/exit() routines. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | | tracing: Add funcgraph-irq option for function graph tracer.Jiri Olsa2010-09-141-1/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's handy to be able to disable the irq related output and not to have to jump over each irq related code, when you have no interrest in it. The option is by default enabled, so there's no change to current behaviour. It affects only the final output, so all the irq related data stay in the ring buffer. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <20100907145344.GC1912@jolsa.brq.redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | | perf: Rework the PMU methodsPeter Zijlstra2010-09-091-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace pmu::{enable,disable,start,stop,unthrottle} with pmu::{add,del,start,stop}, all of which take a flags argument. The new interface extends the capability to stop a counter while keeping it scheduled on the PMU. We replace the throttled state with the generic stopped state. This also allows us to efficiently stop/start counters over certain code paths (like IRQ handlers). It also allows scheduling a counter without it starting, allowing for a generic frozen state (useful for rotating stopped counters). The stopped state is implemented in two different ways, depending on how the architecture implemented the throttled state: 1) We disable the counter: a) the pmu has per-counter enable bits, we flip that b) we program a NOP event, preserving the counter state 2) We store the counter state and ignore all read/overflow events Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Merge branch 'perf/urgent' into perf/coreIngo Molnar2010-09-093-23/+40
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | Merge reason: Pick up pending fixes before applying dependent new changes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | perf, trace: Fix module leakLi Zefan2010-09-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1c024eca (perf, trace: Optimize tracepoints by using per-tracepoint-per-cpu hlist to track events) caused a module refcount leak. Reported-And-Tested-by: Avi Kivity <avi@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4C7E1F12.8030304@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | tracing/kprobe: Fix handling of C-unlike argument namesMasami Hiramatsu2010-09-081-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check the argument name whether it is invalid (not C-like symbol name). This makes event format simple. Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20100827113912.22882.62313.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | tracing/kprobes: Fix handling of argument namesMasami Hiramatsu2010-09-081-7/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set "argN" name for each argument automatically if it has no specified name. Since dynamic trace event(kprobe_events) accepts special characters for its argument, its format can show those special characters (e.g. '$', '%', '+'). However, perf can't parse those format because of the character (especially '%') mess up the format. This sets "argX" name for those arguments if user omitted the argument names. E.g. # echo 'p do_fork %ax IP=%ip $stack' > tracing/kprobe_events # cat tracing/kprobe_events p:kprobes/p_do_fork_0 do_fork arg1=%ax IP=%ip arg3=$stack Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20100827113906.22882.59312.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | tracing/kprobe: Fix a memory leak in error caseMasami Hiramatsu2010-09-081-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a memory leak which happens when a field name conflicts with others. In error case, free_trace_probe() will free all arguments until nr_args, so this increments nr_args the begining of the loop instead of the end. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20100827113846.22882.12670.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | tracing: Do not allow llseek to set_ftrace_filterSteven Rostedt2010-09-081-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reading the file set_ftrace_filter does three things. 1) shows whether or not filters are set for the function tracer 2) shows what functions are set for the function tracer 3) shows what triggers are set on any functions 3 is independent from 1 and 2. The way this file currently works is that it is a state machine, and as you read it, it may change state. But this assumption breaks when you use lseek() on the file. The state machine gets out of sync and the t_show() may use the wrong pointer and cause a kernel oops. Luckily, this will only kill the app that does the lseek, but the app dies while holding a mutex. This prevents anyone else from using the set_ftrace_filter file (or any other function tracing file for that matter). A real fix for this is to rewrite the code, but that is too much for a -rc release or stable. This patch simply disables llseek on the set_ftrace_filter() file for now, and we can do the proper fix for the next major release. Reported-by: Robert Swiecki <swiecki@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Tavis Ormandy <taviso@google.com> Cc: Eugene Teo <eugene@redhat.com> Cc: vendor-sec@lst.de Cc: <stable@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | tracing: Fix a race in function profileLi Zefan2010-08-311-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While we are reading trace_stat/functionX and someone just disabled function_profile at that time, we can trigger this: divide error: 0000 [#1] PREEMPT SMP ... EIP is at function_stat_show+0x90/0x230 ... This fix just takes the ftrace_profile_lock and checks if rec->counter is 0. If it's 0, we know the profile buffer has been reset. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: stable@kernel.org LKML-Reference: <4C723644.4040708@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | ring-buffer: Place duplicate expression into a single functionSteven Rostedt2010-09-011-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While discussing the strictness of the 80 character limit on the Kernel Summit Discussion mailing list, I showed examples that I broke that limit slightly with some algorithms. In discussing with John Linville, what looked better, I realized that two of the 80 char breaking culprits were an identical expression. As a clean up, this patch moves the identical expression into its own helper function and that is used instead. As a side effect, the offending code is now under the 80 character limit. :-) This clean up code also changes the expression from (A - B) - C to A - (B + C) This makes the code look a little nicer too. Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* | | Merge branch 'perf/urgent' into perf/coreFrederic Weisbecker2010-08-271-1/+1
|\ \ \ | |/ / | | | | | | | | | | | | | | | | | | Conflicts: tools/perf/util/callchain.h Merge reason: Fix a non-trivial conflict with latest fixes
| * | tracing/trace_stack: Fix stack trace on ppc64Anton Blanchard2010-08-251-1/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | save_stack_trace() stores the instruction pointer, not the function descriptor. On ppc64 the trace stack code currently dereferences the instruction pointer and shows 8 bytes of instructions in our backtraces: # cat /sys/kernel/debug/tracing/stack_trace Depth Size Location (26 entries) ----- ---- -------- 0) 5424 112 0x6000000048000004 1) 5312 160 0x60000000ebad01b0 2) 5152 160 0x2c23000041c20030 3) 4992 240 0x600000007c781b79 4) 4752 160 0xe84100284800000c 5) 4592 192 0x600000002fa30000 6) 4400 256 0x7f1800347b7407e0 7) 4144 208 0xe89f0108f87f0070 8) 3936 272 0xe84100282fa30000 Since we aren't dealing with function descriptors, use %pS instead of %pF to fix it: # cat /sys/kernel/debug/tracing/stack_trace Depth Size Location (26 entries) ----- ---- -------- 0) 5424 112 ftrace_call+0x4/0x8 1) 5312 160 .current_io_context+0x28/0x74 2) 5152 160 .get_io_context+0x48/0xa0 3) 4992 240 .cfq_set_request+0x94/0x4c4 4) 4752 160 .elv_set_request+0x60/0x84 5) 4592 192 .get_request+0x2d4/0x468 6) 4400 256 .get_request_wait+0x7c/0x258 7) 4144 208 .__make_request+0x49c/0x610 8) 3936 272 .generic_make_request+0x390/0x434 Signed-off-by: Anton Blanchard <anton@samba.org> Cc: rostedt@goodmis.org Cc: fweisbec@gmail.com LKML-Reference: <20100825013238.GE28360@kryten> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'tip/perf/urgent' of ↵Ingo Molnar2010-08-1910-128/+398
|\ \ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
| * | tracing: Clean up seqfile code for format fileLi Zefan2010-08-181-37/+18
| |/ | | | | | | | | | | | | | | | | Remove the nasty hack that marks a pointer's LSB to distinguish common fields from event fields. Replace it with a more sane approach. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4C6A23C2.9020606@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * Merge branch 'tip/perf/urgent-3' of ↵Steven Rostedt2010-08-164-68/+163
| |\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into trace/tip/perf/urgent-4 Conflicts: kernel/trace/trace_events.c Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * tracing: Sanitize value returned from write(trace_marker, "...", len)Marcin Slusarz2010-08-131-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When userspace code writes non-new-line-terminated string to trace_marker file, write handler appends new-line and returns number of bytes written to trace buffer, so write(fd, "abc", 3) will return 4 That's unexpected and unfortunately it confuses glibc's fprintf function. Example: int main() { fprintf(stderr, "abc"); return 0; } $ gcc test.c -o test $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer $ ./test 2>/sys/kernel/debug/tracing/trace_marker results in infinite loop: write(fd, "abc", 3) = 4 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 write(fd, "", 1) = 0 (...) ...and kernel trace buffer full of empty markers. Fix it by sanitizing write return value. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> LKML-Reference: <20100727231801.GB2826@joi.lan> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * tracing/events: Convert format output to seq_fileSteven Rostedt2010-08-121-67/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two new events were added that broke the current format output. Both from the SCSI system: scsi_dispatch_cmd_done and scsi_dispatch_cmd_timeout The reason is that their print_fmt exceeded a page size. Since the output of the format used simple_read_from_buffer and trace_seq, it was limited to a page size in output. This patch converts the printing of the format of an event into seq_file, which allows greater than a page size to be shown. I diffed all event formats comparing the output with and without this patch. All matched except for the above two, which showed just: FORMAT TOO BIG without this patch, but now properly displays the output with this patch. v2: Remove updating *pos in seq start function. [ Thanks to Li Zefan for pointing that out ] Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Kei Tokunaga <tokunaga.keiich@jp.fujitsu.com> Cc: James Bottomley <James.Bottomley@suse.de> Cc: Tomohiro Kusumi <kusumi.tomohiro@jp.fujitsu.com> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * tracing: Fix ring_buffer_read_page reading out of page boundaryHuang Ying2010-08-061-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the configuration: CONFIG_DEBUG_PAGEALLOC=y and Shaohua's patch: [PATCH]x86: make spurious_fault check correct pte bit Function call graph trace with the following will trigger a page fault. # cd /sys/kernel/debug/tracing/ # echo function_graph > current_tracer # cat per_cpu/cpu1/trace_pipe_raw > /dev/null BUG: unable to handle kernel paging request at ffff880006e99000 IP: [<ffffffff81085572>] rb_event_length+0x1/0x3f PGD 1b19063 PUD 1b1d063 PMD 3f067 PTE 6e99160 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC last sysfs file: /sys/devices/virtual/net/lo/operstate CPU 1 Modules linked in: Pid: 1982, comm: cat Not tainted 2.6.35-rc6-aes+ #300 /Bochs RIP: 0010:[<ffffffff81085572>] [<ffffffff81085572>] rb_event_length+0x1/0x3f RSP: 0018:ffff880006475e38 EFLAGS: 00010006 RAX: 0000000000000ff0 RBX: ffff88000786c630 RCX: 000000000000001d RDX: ffff880006e98000 RSI: 0000000000000ff0 RDI: ffff880006e99000 RBP: ffff880006475eb8 R08: 000000145d7008bd R09: 0000000000000000 R10: 0000000000008000 R11: ffffffff815d9336 R12: ffff880006d08000 R13: ffff880006e605d8 R14: 0000000000000000 R15: 0000000000000018 FS: 00007f2b83e456f0(0000) GS:ffff880002100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff880006e99000 CR3: 00000000064a8000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process cat (pid: 1982, threadinfo ffff880006474000, task ffff880006e40770) Stack: ffff880006475eb8 ffffffff8108730f 0000000000000ff0 000000145d7008bd <0> ffff880006e98010 ffff880006d08010 0000000000000296 ffff88000786c640 <0> ffffffff81002956 0000000000000000 ffff8800071f4680 ffff8800071f4680 Call Trace: [<ffffffff8108730f>] ? ring_buffer_read_page+0x15a/0x24a [<ffffffff81002956>] ? return_to_handler+0x15/0x2f [<ffffffff8108a575>] tracing_buffers_read+0xb9/0x164 [<ffffffff810debfe>] vfs_read+0xaf/0x150 [<ffffffff81002941>] return_to_handler+0x0/0x2f [<ffffffff810248b0>] __bad_area_nosemaphore+0x17e/0x1a1 [<ffffffff81002941>] return_to_handler+0x0/0x2f [<ffffffff810248e6>] bad_area_nosemaphore+0x13/0x15 Code: 80 25 b2 16 b3 00 fe c9 c3 55 48 89 e5 f0 80 0d a4 16 b3 00 02 c9 c3 55 31 c0 48 89 e5 48 83 3d 94 16 b3 00 01 c9 0f 94 c0 c3 55 <8a> 0f 48 89 e5 83 e1 1f b8 08 00 00 00 0f b6 d1 83 fa 1e 74 27 RIP [<ffffffff81085572>] rb_event_length+0x1/0x3f RSP <ffff880006475e38> CR2: ffff880006e99000 ---[ end trace a6877bb92ccb36bb ]--- The root cause is that ring_buffer_read_page() may read out of page boundary, because the boundary checking is done after reading. This is fixed via doing boundary checking before reading. Reported-by: Shaohua Li <shaohua.li@intel.com> Cc: <stable@kernel.org> Signed-off-by: Huang Ying <ying.huang@intel.com> LKML-Reference: <1280297641.2771.307.camel@yhuang-dev> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| | * tracing: Fix an unallocated memory access in function_graphShaohua Li2010-08-061-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With CONFIG_DEBUG_PAGEALLOC, I observed an unallocated memory access in function_graph trace. It appears we find a small size entry in ring buffer, but we access it as a big size entry. The access overflows the page size and touches an unallocated page. Cc: <stable@kernel.org> Signed-off-by: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <1280217994.32400.76.camel@sli10-desk.sh.intel.com> [ Added a comment to explain the problem - SDR ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
| * | block: add secure discardAdrian Hunter2010-08-121-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Secure discard is the same as discard except that all copies of the discarded sectors (perhaps created by garbage collection) must also be erased. Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com> Acked-by: Jens Axboe <axboe@kernel.dk> Cc: Kyungmin Park <kmpark@infradead.org> Cc: Madhusudhan Chikkature <madhu.cr@ti.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ben Gardiner <bengardiner@nanometrics.ca> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | Merge branch 'for-2.6.36' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2010-08-101-17/+63
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block: (149 commits) block: make sure that REQ_* types are seen even with CONFIG_BLOCK=n xen-blkfront: fix missing out label blkdev: fix blkdev_issue_zeroout return value block: update request stacking methods to support discards block: fix missing export of blk_types.h writeback: fix bad _bh spinlock nesting drbd: revert "delay probes", feature is being re-implemented differently drbd: Initialize all members of sync_conf to their defaults [Bugz 315] drbd: Disable delay probes for the upcomming release writeback: cleanup bdi_register writeback: add new tracepoints writeback: remove unnecessary init_timer call writeback: optimize periodic bdi thread wakeups writeback: prevent unnecessary bdi threads wakeups writeback: move bdi threads exiting logic to the forker thread writeback: restructure bdi forker loop a little writeback: move last_active to bdi writeback: do not remove bdi from bdi_list writeback: simplify bdi code a little writeback: do not lose wake-ups in bdi threads ... Fixed up pretty trivial conflicts in drivers/block/virtio_blk.c and drivers/scsi/scsi_error.c as per Jens.
| | * | block: push BKL into blktrace ioctlsArnd Bergmann2010-08-071-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The blktrace driver currently needs the BKL, but we should not need to take that in the block layer, so just push it down into the driver itself. It is quite likely that the BKL is not actually required in blktrace code and could be removed in a follow-on patch. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * | block: unify flags for struct bio and struct requestChristoph Hellwig2010-08-071-12/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the current bio flags and reuse the request flags for the bio, too. This allows to more easily trace the type of I/O from the filesystem down to the block driver. There were two flags in the bio that were missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've renamed two request flags that had a superflous RW in them. Note that the flags are in bio.h despite having the REQ_ name - as blkdev.h includes bio.h that is the only way to go for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * | block: remove wrappers for request type/flagsChristoph Hellwig2010-08-071-5/+5
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | Remove all the trivial wrappers for the cmd_type and cmd_flags fields in struct requests. This allows much easier grepping for different request types instead of unwinding through macros. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | Merge branch 'bkl/core' of ↵Linus Torvalds2010-08-071-8/+0
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing * 'bkl/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing: do_coredump: Do not take BKL init: Remove the BKL from startup code
| | * | init: Remove the BKL from startup codeArnd Bergmann2010-07-091-8/+0
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I have shown by code review that no driver takes the BKL at init time any more, so whatever the init code was locking against is no longer there and it is now safe to remove the BKL there. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Steven Rostedt <rostedt@goodmis> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>