path: root/tools
Commit message    Author    Age    Files    Lines
...
| * | | | | perf stat: Fix -nan% output in perf stat noise printouts    Ingo Molnar    2011-04-26    1    -5/+13

    Before:

        0 CPU-migrations # 0.000 M/sec ( +- -nan% )

    After:

        0 CPU-migrations # 0.000 M/sec ( +- 0.00% )

    Also factor out the noise printing function.

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-z89h2v1bk1mikcbsf7e6v34q@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
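The "-nan%" in the Before line is what a relative-noise computation prints when the average is zero, since a floating-point 0/0 in C yields NaN. A minimal Python sketch of a guarded computation follows; the names `noise_pct` and `format_noise` are illustrative and are not taken from the perf source.

```python
def noise_pct(stddev: float, avg: float) -> float:
    """Relative noise in percent; report 0.0 when the average is zero.

    Hypothetical helper: without the guard, C code computing
    100.0 * stddev / avg with avg == 0.0 produces NaN.
    """
    if avg == 0.0:
        return 0.0
    return 100.0 * stddev / avg


def format_noise(stddev: float, avg: float) -> str:
    """Render the noise column in the style of the 'After' line."""
    return f"( +- {noise_pct(stddev, avg):.2f}% )"
```

With a counter that never fired (average 0, deviation 0), `format_noise(0.0, 0.0)` gives the string shown in the After line.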
| * | | | | perf stat: Add stalled cycles to the default output    Ingo Molnar    2011-04-26    2    -8/+8

    The new default output looks like this:

        Performance counter stats for './loop_1b_instructions':

           236.010686 task-clock       # 0.996 CPUs utilized
                    0 context-switches # 0.000 M/sec
                    0 CPU-migrations   # 0.000 M/sec
                   99 page-faults      # 0.000 M/sec
          756,487,646 cycles           # 3.205 GHz
          354,938,996 stalled-cycles   # 46.92% of all cycles are idle
        1,001,403,797 instructions     # 1.32 insns per cycle
                                       # 0.35 stalled cycles per insn
          100,279,773 branches         # 424.895 M/sec
               12,646 branch-misses    # 0.013 % of all branches

          0.236902540 seconds time elapsed

    We dropped cache-refs and cache-misses and added stalled-cycles - this
    is a more generic "how well utilized is the CPU" metric.

    If the stalled-cycles ratio is too high then more specific measurements
    can be taken to figure out the source of the inefficiency.

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-pbpl2l4mn797s69bclfpwkwn@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
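The right-hand column of this output can be re-derived from the raw counts. The sketch below recomputes the four derived values from the numbers quoted in the commit message; the function name and dictionary keys are made up for illustration, and the formulas are inferred from the printed labels rather than read from the perf source.

```python
def derived_metrics(task_clock_ms: float, cycles: int, stalled: int, insns: int) -> dict:
    """Recompute the shadow metrics shown next to the raw counters."""
    return {
        # cycles per nanosecond of task-clock time equals GHz
        "ghz": cycles / (task_clock_ms * 1e6),
        "stalled_pct": 100.0 * stalled / cycles,
        "insns_per_cycle": insns / cycles,
        "stalled_per_insn": stalled / insns,
    }


metrics = derived_metrics(
    task_clock_ms=236.010686,
    cycles=756_487_646,
    stalled=354_938_996,
    insns=1_001_403_797,
)
```

Rounded to the precision perf prints, these reproduce 3.205 GHz, 46.92%, 1.32 insns per cycle and 0.35 stalled cycles per insn.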
| * | | | | perf stat: Add stalled cycles accounting, prettify the resulting output    Ingo Molnar    2011-04-26    1    -13/+30

    Add stalled cycles accounting and use it to print the "cycles stalled
    per instruction" value.

    Also change the unit of the cycles output from M/sec to GHz - this is
    more intuitive.

    Prettify the output to:

        Performance counter stats for './loop_1b_instructions':

           239.775036 task-clock       # 0.997 CPUs utilized
          761,903,912 cycles           # 3.178 GHz
          356,620,620 stalled-cycles   # 46.81% of all cycles are idle
        1,001,578,351 instructions     # 1.31 insns per cycle
                                       # 0.36 stalled cycles per insn
               14,782 cache-references # 0.062 M/sec
                5,694 cache-misses     # 38.520 % of all cache refs

          0.240493656 seconds time elapsed

    Also adjust the --repeat output to make the percentages align
    vertically:

        Performance counter stats for './loop_1b_instructions' (10 runs):

           236.096793 task-clock       # 0.997 CPUs utilized              ( +- 0.011% )
          756,553,086 cycles           # 3.204 GHz                        ( +- 0.002% )
          354,942,692 stalled-cycles   # 46.92% of all cycles are idle    ( +- 0.008% )
        1,001,389,700 instructions     # 1.32 insns per cycle
                                       # 0.35 stalled cycles per insn     ( +- 0.000% )
               10,166 cache-references # 0.043 M/sec                      ( +- 0.742% )
                  468 cache-misses     # 4.608 % of all cache refs        ( +- 13.385% )

          0.236874136 seconds time elapsed ( +- 0.01% )

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-uapziqny39601apdmmhoz7hk@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf stat: Factor out shadow stats    Ingo Molnar    2011-04-26    1    -14/+19

    Create update_shadow_stats() which is then used in both
    read_counter_aggr() and read_counter().

    This not only simplifies the code but also fixes a bug:
    HW_CACHE_REFERENCES was not updated in read_counter().

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-9uc55z3g88r47exde7zxjm6p@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf stat: Make all displayed event names parseable as well    Ingo Molnar    2011-04-26    1    -2/+2

    Right now we display this by default:

        0.202204 task-clock-msecs # 0.282 CPUs
               0 context-switches # 0.000 M/sec
               0 CPU-migrations   # 0.000 M/sec
              85 page-faults      # 0.420 M/sec

    The task-clock-msecs event cannot actually be passed back as an event
    name, the event name we recognize is 'task-clock'.

    So change the output of the cpu-clock and task-clock events to be
    idempotent.

    ( Units should be printed out in the right-side column, if needed. )

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-lexrnbzy09asscgd4f7oac4i@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf stat: Fail more clearly when an invalid modifier is specified    Ingo Molnar    2011-04-26    1    -11/+22

    Currently we fail without printing any error message on
    "perf stat -e task-clock-msecs".

    The reason is that the task-clock event is matched and the "-msecs"
    postfix is assumed to be an event modifier - but is not recognized.

    This patch changes the code to be more informative:

        $ perf stat -e task-clock-msecs true
        invalid event modifier: '-msecs'
        Run 'perf list' for a list of valid events and modifiers

    And restructures the return value of parse_event_modifier() to allow
    the printing of all variants of invalid event modifiers.

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-wlaw3dvz1ly6wple8l52cfca@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
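The failure mode described here (a known event name matches as a prefix and the leftover text is treated as a modifier) can be illustrated with a small sketch. Everything below is hypothetical: the event table, the `:`-separated modifier syntax and the set of accepted modifier letters are assumptions made for illustration, not a copy of perf's parse_event_modifier().

```python
KNOWN_EVENTS = ("task-clock", "cpu-clock", "cycles")  # illustrative subset
VALID_MODIFIER_CHARS = set("ukhp")  # assumption: letters accepted after ':'


def parse_event(spec: str) -> tuple[str, str]:
    """Split an event spec into (name, modifier) or raise a descriptive error."""
    # Try longer names first so a short name cannot shadow a longer one.
    for name in sorted(KNOWN_EVENTS, key=len, reverse=True):
        if not spec.startswith(name):
            continue
        rest = spec[len(name):]
        if rest == "":
            return name, ""
        if rest.startswith(":") and len(rest) > 1 and set(rest[1:]) <= VALID_MODIFIER_CHARS:
            return name, rest[1:]
        # Leftover text that is not a valid modifier: say so explicitly.
        raise ValueError(f"invalid event modifier: '{rest}'")
    raise ValueError(f"invalid or unsupported event: '{spec}'")
```

The point of the sketch is only that the error carries the offending suffix, so `task-clock-msecs` is reported as an invalid modifier `-msecs` rather than failing silently.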
| * | | | | perf tools: Accept case-insensitive symbolic event variants    Ingo Molnar    2011-04-26    1    -3/+5

    We currently fail on something like '-e CPU-migrations', with:

        invalid or unsupported event: 'CPU-migrations'

    While 'CPU-migrations' is how we actually print out the event in the
    default perf stat output:

        Performance counter stats for 'true':

        0.202204 task-clock-msecs # 0.282 CPUs
               0 context-switches # 0.000 M/sec
               0 CPU-migrations   # 0.000 M/sec

    So change the matching to be case-insensitive.

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-omcm3edjjtx83a4kh2e244se@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf stat: Print cache misses as percentage    Ingo Molnar    2011-04-26    1    -3/+15

    Before:

        113,393,041 cache-references # 83.636 M/sec
          7,052,454 cache-misses     # 5.202 M/sec

    After:

        112,589,441 cache-references # 87.925 M/sec
          6,556,354 cache-misses     # 5.823 %

    misses/hits percentages are more expressive than absolute numbers or
    rates.

    (Also prettify the CPUs printout line to not have a trailing
    whitespace.)

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-axm28f43x439bl41zkvfzd63@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf stat: Print stalled cycles percentage    Ingo Molnar    2011-04-26    1    -2/+9

    Print:

        611,527 cycles
        400,553 instructions   # ( 0.71 instructions per cycle )
         77,809 stalled-cycles # ( 12.71% of all cycles )

        0.000610987 seconds time elapsed

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Link: http://lkml.kernel.org/n/tip-fd6x8r1cpyb6zhlrc4ix8m45@git.kernel.org
| * | | | | perf events: Add stalled cycles generic event - PERF_COUNT_HW_STALLED_CYCLES    Ingo Molnar    2011-04-26    2    -0/+2

    The new PERF_COUNT_HW_STALLED_CYCLES event tries to approximate the
    cycles in which the CPU does nothing useful, because it is stalled on
    a cache-miss or some other condition.

    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Link: http://lkml.kernel.org/n/tip-fue11vymwqsoo5to72jxxjyl@git.kernel.org
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf script: improve validation of sample attributes for output fields    David Ahern    2011-04-20    3    -17/+109

    Check for required sample attributes using evsel rather than
    sample_type in the session header. If the attribute for a default
    field is not present for the event type (e.g., new command operating
    on file from older kernel) the field is removed from the output list.

    Expected event types must exist. For example, if a user specifies

        -f trace:time,trace -f sw:time,cpu,sym

    the perf.data file must contain both tracepoints and software events
    (i.e., it is an error if either does not exist in the file).

    Attribute checking is done once at the beginning of perf-script rather
    than for each sample.

    v1 -> v2:
    - addressed comments from acme

    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/1302148460-570-1-git-send-email-daahern@cisco.com
    Signed-off-by: David Ahern <daahern@cisco.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | | perf script: Add support for PERF_TYPE_RAW    Arun Sharma    2011-04-19    1    -1/+13

    Useful for getting stack traces for hardware events not handled by
    PERF_TYPE_HARDWARE.

    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tom Zanussi <tzanussi@gmail.com>
    Signed-off-by: Arun Sharma <asharma@fb.com>
    Link: http://lkml.kernel.org/n/tip-qimdcdpekjqxuyqovy4kjusx@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | | perf tools: git mv tools/perf/{features-tests.mak,config/}    Michael Witten    2011-04-19    2    -1/+1

    Signed-off-by: Michael Witten <mfwitten@gmail.com>
    Link: http://lkml.kernel.org/n/tip-a6zhefjayuounko1tk5sjji2@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | | perf tools: Move `try-cc'    Michael Witten    2011-04-19    2    -8/+8

    The `try-cc' user-defined function was in
    tools/perf/feature-tests.mak; this commit moves it to
    tools/perf/config/utilities.mak.

    Signed-off-by: Michael Witten <mfwitten@gmail.com>
    Link: http://lkml.kernel.org/n/tip-bqhwcuxsrve0iodn6q4ejaoi@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | | perf tools: Makefile: PYTHON{,_CONFIG} to bandage Python 3 incompatibility    Michael Witten    2011-04-19    3    -21/+264

    Currently, Python 3 is not supported by perf's code; this can cause
    the build to fail for systems that have Python 3 installed as the
    default python:

        python{,-config}

    The Correct Solution is to write compatibility code so that Python 3
    works out-of-the-box.

    However, users often have an ancillary Python 2 installed:

        python2{,-config}

    Therefore, a quick fix is to allow the user to specify those ancillary
    paths as the python binaries that Makefile should use, thereby
    avoiding Python 3 altogether; as an added benefit, the Python binaries
    may be installed in non-standard locations without the need for
    updating any PATH variable.

    This commit adds the ability to set PYTHON and/or PYTHON_CONFIG either
    as environment variables or as make variables on the command line; the
    paths may be relative, and usually only PYTHON is necessary in order
    for PYTHON_CONFIG to be defined implicitly. Some rudimentary error
    checking is performed when the user explicitly specifies a value for
    any of these variables.

    In addition, this commit introduces significantly robust makefile
    infrastructure for working with paths and communicating with the
    shell; it's currently only used for handling Python, but I hope it
    will prove useful in refactoring the makefiles.

    Thanks to:

        Raghavendra D Prabhu <rprabhu@wnohang.net>

    for motivating this patch.

    Acked-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
    Link: http://lkml.kernel.org/r/e987828e-87ec-4973-95e7-47f10f5d9bab-mfwitten@gmail.com
    Signed-off-by: Michael Witten <mfwitten@gmail.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
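The "only PYTHON is usually necessary" behaviour can be pictured as a defaulting rule. The sketch below assumes that an unset PYTHON_CONFIG is derived by appending `-config` to PYTHON, which matches the `python{,-config}` and `python2{,-config}` pairs named in the message; the real Makefile logic may differ, and `resolve_python` is a hypothetical name.

```python
def resolve_python(settings: dict) -> tuple[str, str]:
    """Pick the python and python-config binaries from user settings.

    Assumption for illustration: PYTHON_CONFIG defaults to PYTHON + "-config".
    """
    python = settings.get("PYTHON", "python")
    python_config = settings.get("PYTHON_CONFIG", python + "-config")
    return python, python_config
```

Under that assumption, setting only `PYTHON=python2` selects `python2-config` as well, while an explicit PYTHON_CONFIG always wins.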
| * | | | | perf tools: Makefile: Clean up `python/perf.so' rule    Michael Witten    2011-04-19    1    -6/+4

    There is no need for a subshell or an explicit `export'; as per the
    POSIX Shell Command Language specification:

        http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_09_01
        http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_10_02

    It is only necessary to include the environment variable assignment
    just before the command to be run.

    Also, it is better to use single-quotes, because GNU make might expand
    `$(BASIC_CFLAGS)' into something that the shell could interpret within
    double-quotes.

    Acked-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
    Link: http://lkml.kernel.org/n/tip-58n38o02ocuzrm9qh096hsf5@git.kernel.org
    Signed-off-by: Michael Witten <mfwitten@gmail.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
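The shell form `VAR=value command` makes the variable visible to that one command only, which is why no subshell or `export` is needed. A rough Python analogue, using a made-up variable name `SKETCH_CFLAGS` and a hypothetical helper `run_with_env`, passes the extra variable to a single child process and leaves the parent environment untouched:

```python
import os
import subprocess
import sys


def run_with_env(argv, **extra_env):
    """Run argv with extra variables visible only to that child process."""
    env = {**os.environ, **extra_env}
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# The child prints the variable it received; the parent never exports it.
child = [sys.executable, "-c", "import os; print(os.environ['SKETCH_CFLAGS'])"]
result = run_with_env(child, SKETCH_CFLAGS="-O2 -Wall")
```

Because the value is handed over as data rather than re-parsed by a shell, this sketch also sidesteps the quoting concern the commit message raises for make-expanded flags.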
| * | | | | perf symbols: Give more useful names to 'self' parameters    Arnaldo Carvalho de Melo    2011-04-19    2    -346/+361

    One more installment on an area that is mostly dormant.

    Suggested-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tom Zanussi <tzanussi@gmail.com>
    LKML-Reference: <new-submission>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | | Merge branch 'perf/urgent' into perf/core    Ingo Molnar    2011-04-19    18    -117/+181

    Merge reason: we'll be queueing up dependent changes.

    Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | perf script: Add more documentation about the -f/--fields parameters    Arnaldo Carvalho de Melo    2011-03-30    1    -2/+50

    Using the commit log for 2c9e45f.

    Cc: David Ahern <daahern@cisco.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Tom Zanussi <tzanussi@gmail.com>
    LKML-Reference: <new-submission>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | | | perf script: If type not given fields apply to all event types    David Ahern    2011-03-30    1    -49/+122

    Allow:

        perf script -f <fields>

    to be equivalent to:

        perf script -f trace:<fields> -f sw:<fields> -f hw:<fields>

    i.e., the specified fields apply to all event types if the type string
    is not given.

    The field (-f) arguments are processed in the order received. A later
    usage can reset a prior request. e.g.,

        -f trace: -f comm,tid,time,sym

    The first -f suppresses trace events (field list is ""), but then the
    second invocation sets the fields to comm,tid,time,sym. In this case a
    warning is given to the user:

        "Overriding previous field request for all events."

    Alternatively, consider the order:

        -f comm,tid,time,sym -f trace:

    The first -f sets the fields for all events and the second -f
    suppresses trace events. The user is given a warning message about the
    override, and the result of the above is that only S/W and H/W events
    are displayed with the given fields.

    For the 'wildcard' option, if a user-selected field is invalid for an
    event type, a message is displayed to the user that the option is
    ignored for that type. For example:

        perf script -f comm,tid,trace 2>&1 | less
        'trace' not valid for hardware events. Ignoring.
        'trace' not valid for software events. Ignoring.

    Alternatively, if the type is given and an invalid field is specified,
    it is an error. For example:

        perf script -v -f sw:comm,tid,trace 2>&1 | less
        'trace' not valid for software events.

    At this point usage is displayed, and perf-script exits.

    Finally, a user may not set fields to none for all event types. i.e.,
    -f "" is not allowed.

    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: linux-kernel@vger.kernel.org
    LKML-Reference: <1300377801-27246-1-git-send-email-daahern@cisco.com>
    Signed-off-by: David Ahern <daahern@cisco.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
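The ordering and override rules described in this message can be modelled in a few lines. This is a sketch under stated assumptions, not perf's implementation: the per-type table of valid fields is invented for illustration (only the rule that 'trace' is valid for trace events alone comes from the message), the warning text for a typed override is made up, and `apply_field_args` is a hypothetical name.

```python
EVENT_TYPES = {"hw": "hardware", "sw": "software", "trace": "trace"}
COMMON_FIELDS = {"comm", "tid", "pid", "time", "cpu", "sym"}  # assumed set
VALID_FIELDS = {
    "hw": COMMON_FIELDS,
    "sw": COMMON_FIELDS,
    "trace": COMMON_FIELDS | {"trace"},
}


def apply_field_args(args):
    """Process -f arguments in order; return (fields per type, warnings)."""
    selected, warnings = {}, []
    for arg in args:
        etype, sep, rest = arg.partition(":")
        if sep:
            # Typed form, e.g. "sw:comm,tid": an invalid field is an error.
            if etype not in EVENT_TYPES:
                raise ValueError(f"unknown event type: '{etype}'")
            fields = [f for f in rest.split(",") if f]
            for f in fields:
                if f not in VALID_FIELDS[etype]:
                    raise ValueError(f"'{f}' not valid for {EVENT_TYPES[etype]} events.")
            if etype in selected:
                # Hypothetical wording; the message only says a warning is given.
                warnings.append(f"Overriding previous field request for {etype} events.")
            selected[etype] = fields
        else:
            # Wildcard form: applies to every type, skipping invalid fields.
            fields = [f for f in arg.split(",") if f]
            if not fields:
                raise ValueError("cannot set fields to none for all event types")
            if selected:
                warnings.append("Overriding previous field request for all events.")
            for t, label in EVENT_TYPES.items():
                for f in fields:
                    if f not in VALID_FIELDS[t]:
                        warnings.append(f"'{f}' not valid for {label} events. Ignoring.")
                selected[t] = [f for f in fields if f in VALID_FIELDS[t]]
    return selected, warnings
```

Running the two orderings from the message through this model shows the asymmetry: `trace:` followed by a wildcard list re-enables trace output, while the reverse order leaves the trace field list empty.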
| * | | | | perf probe: Add fastpath to do lookup by function name    Lin Ming    2011-03-29    2    -0/+74

    v3 -> v2:
    - Make pubname_search_cb more generic
    - Add fastpath to find_probes also

    v2 -> v1:
    - Don't compare file names with cu_find_realpath(...), instead, compare
      them with the name returned by dwarf_decl_file(sp_die)

    The vmlinux file may have thousands of CUs. We can look up the
    function name in the .debug_pubnames section to avoid the slow loop
    over CUs.

    1. Improvement data for find_line_range

        ./perf stat -e cycles -r 10 -- ./perf probe -k /home/mlin/vmlinux \
                -s /home/mlin/linux-2.6 \
                --line csum_partial_copy_to_user > tmp.log

        before patch applied
        =====================
            847,988,276 cycles
            0.355075856 seconds time elapsed

        after patch applied
        =====================
            206,102,622 cycles
            0.086883555 seconds time elapsed

    2. Improvement data for find_probes

        ./perf stat -e cycles -r 10 -- ./perf probe -k /home/mlin/vmlinux \
                -s /home/mlin/linux-2.6 \
                --vars csum_partial_copy_to_user > tmp.log

        before patch applied
        =====================
            848,490,844 cycles
            0.355307901 seconds time elapsed

        after patch applied
        =====================
            205,684,469 cycles
            0.086694010 seconds time elapsed

    Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: linux-kernel <linux-kernel@vger.kernel.org>
    LKML-Reference: <1301041668.14111.52.camel@minggr.sh.intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
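The idea behind the fastpath is to replace a scan over every compilation unit with a lookup in a name index. The toy model below uses plain dictionaries in place of DWARF data; the CU records, function names and helper names are all invented for illustration, and in a real binary the index (.debug_pubnames) is already present in the file rather than built at lookup time. The last lines compute the speedup implied by the cycle counts quoted for find_line_range.

```python
def find_cu_slow(cus, func):
    """Walk every CU until one defines `func` (the slow loop)."""
    for cu in cus:
        if func in cu["functions"]:
            return cu["name"]
    return None


def build_pubnames(cus):
    """Stand-in for a name-to-CU index such as .debug_pubnames."""
    index = {}
    for cu in cus:
        for fn in cu["functions"]:
            index.setdefault(fn, cu["name"])
    return index


def find_cu_fast(pubnames, func):
    """Single dictionary lookup instead of a loop over CUs."""
    return pubnames.get(func)


# Made-up CUs for the sketch.
cus = [
    {"name": "a.c", "functions": {"foo", "bar"}},
    {"name": "b.c", "functions": {"csum_partial_copy_to_user"}},
]
pubnames = build_pubnames(cus)

# Ratio of the before/after cycle counts reported in the commit message.
speedup = 847_988_276 / 206_102_622
```

The quoted measurements correspond to roughly a fourfold reduction in cycles for both find_line_range and find_probes.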
* | | | | | Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent    Ingo Molnar    2011-05-15    6    -54/+116
| * | | | | perf evlist: Fix per thread mmap setup    Arnaldo Carvalho de Melo    2011-05-15    6    -53/+115

    The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using
    --pid when monitoring multithreaded apps, as we can only share a ring
    buffer for events on the same thread if not doing per cpu.

    Fix it by using per thread ring buffers.

    Tested with:

        [root@felicio ~]# tuna -t 26131 -CP | nl
             1                        thread       ctxt_switches
             2      pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
             3    26131  OTHER     0      0,1  10814276      2397830 chromium-browse
             4      642  OTHER     0      0,1     14688            0 chromium-browse
             5    26148  OTHER     0      0,1    713602       115479 chromium-browse
             6    26149  OTHER     0      0,1    801958         2262 chromium-browse
             7    26150  OTHER     0      0,1   1271128          248 chromium-browse
             8    26151  OTHER     0      0,1         3            0 chromium-browse
             9    27049  OTHER     0      0,1     36796            9 chromium-browse
            10      618  OTHER     0      0,1     14711            0 chromium-browse
            11      661  OTHER     0      0,1     14593            0 chromium-browse
            12    29048  OTHER     0      0,1     28125            0 chromium-browse
            13    26143  OTHER     0      0,1   2202789          781 chromium-browse
        [root@felicio ~]#

    So 11 threads under pid 26131, then:

        [root@felicio ~]# perf record -F 50000 --pid 26131

        [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
             1  7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             2  7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             3  7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             4  7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             5  7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             6  7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             7  7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             8  7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             9  7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
            10  7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
            11  7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        [root@felicio ~]#

    11 mmaps, one per thread since we didn't specify any CPU list, so we
    need one mmap per thread and:

        [root@felicio ~]# perf record -F 50000 --pid 26131
        ^M
        ^C[ perf record: Woken up 79 times to write data ]
        [ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]

        [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
             1   371310 26131
             2    96516 26148
             3    95694 26149
             4    95203 26150
             5     7291 26143
             6       87 27049
             7       76 661
             8       60 29048
             9       47 618
            10       43 642
        [root@felicio ~]#

    Ok, one of the threads, 26151 was quiescent, so no samples there, but
    all the others are there.

    Then, if I specify one CPU:

        [root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]

        [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
             1     8444 26131
             2     2584 26149
             3     2518 26148
             4     2324 26150
             5      123 26143
             6        9 661
             7        9 29048
        [root@felicio ~]#

    This machine has two cores, so fewer threads appeared on the radar,
    and:

        [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
             1  7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        [root@felicio ~]#

    Just one mmap, as now we can use just one per-cpu buffer instead of
    the per-thread needed in the previous case.

    For global profiling:

        [root@felicio ~]# perf record -F 50000 -a
        ^C[ perf record: Woken up 26 times to write data ]
        [ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]

        [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
             1  7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
             2  7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        [root@felicio ~]#

    It uses per-cpu buffers.

    For just one thread:

        [root@felicio ~]# perf record -F 50000 --tid 26148
        ^C[ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]

        [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
             1     9969 26148
        [root@felicio ~]#

        [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
             1  7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        [root@felicio ~]#

    Tested-by: David Ahern <dsahern@gmail.com>
    Tested-by: Lin Ming <ming.m.lin@intel.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Tom Zanussi <tzanussi@gmail.com>
    Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
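The four test runs in this message follow one rule: with a CPU list (or system-wide mode) there is one ring buffer per CPU, otherwise one per monitored thread. The sketch below only summarises those observations; `expected_mmaps` is a hypothetical helper, not a perf function, and the thread list is copied from the tuna output above.

```python
def expected_mmaps(threads, cpus):
    """Number of perf_event mmaps implied by the observations in this commit.

    threads: monitored thread ids (empty for system-wide profiling)
    cpus:    selected CPUs (empty when no CPU list was given)
    """
    if cpus:
        return len(cpus)      # per-cpu buffers
    return len(threads)       # per-thread buffers


pid_26131_threads = [26131, 642, 26148, 26149, 26150, 26151,
                     27049, 618, 661, 29048, 26143]
```

Applied to the runs above this gives 11 mmaps for `--pid 26131`, one for `--pid 26131 --cpu 1`, two for `-a` on the two-core test machine, and one for `--tid 26148`.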
| * | | | | perf tools: Honour the cpu list parameter when also monitoring a thread list    Arnaldo Carvalho de Melo    2011-05-15    1    -1/+1

    The perf_evlist__create_maps was discarding the --cpu parameter when a
    --pid or --tid was specified, fix that.

    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Tom Zanussi <tzanussi@gmail.com>
    Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* / | | | perf tools: Makefile: Use gcc to determine ARCH    Lin Ming    2011-05-07    1    -6/+10

    The original Makefile uses "uname -m" to determine ARCH.

    This causes a problem on x86 when compiling the perf tool in a 32-bit
    userspace with a 64-bit kernel:

        bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages:
        bench/../../../arch/x86/lib/memcpy_64.S:28: Error: bad register name `%rdi'

    This is because "uname -m" returns x86_64, so memcpy_64.S is included
    in the 32-bit build.

    Reported-by: Riccardo Magliocchetti <riccardo.magliocchetti@gmail.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
    Link: http://lkml.kernel.org/r/1304743274.3132.17.camel@localhost
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | perf evsel: Fix use of inheritArnaldo Carvalho de Melo2011-04-158-42/+41
| | | |
perf stat doesn't mmap and it's perfectly fine for it to use task-bound counters with inheritance.

So set the attr.inherit on the caller and leave the syscall itself to validate it.

When the mmap fails perf_evlist__mmap will just emit a warning if this is the failure reason.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | perf hists browser: Fix seg fault when annotate null symbolLin Ming2011-04-152-3/+5
| |/ /
|/| |
In the hists browser, pressing the hotkey 'a' annotates the current symbol. Currently this causes a segmentation fault if 'a' is pressed on a null symbol. There are two small bugs:

- In perf_evsel__hists_browse, the condition checked after 'a' is pressed is not correct: we should check ->sym instead of ->map.
- In symbol__tui_annotate, we must check whether sym is NULL before getting the annotation structure.

This patch fixes both bugs.

Link: http://lkml.kernel.org/r/1302244286.4106.36.camel@minggr.sh.intel.com
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | perf: Fix a build error with some GCC versionsEric Dumazet2011-04-081-1/+1
| | |
Fix this:

  util/cgroup.c: In function ‘open_cgroup’:
  util/cgroup.c:16:16: error: ‘saved_ptr’ may be used uninitialized in this function
  util/cgroup.c:16:16: note: ‘saved_ptr’ was declared here

Apparently newer GCC (4.6) can figure out that this variable is properly initialized - but some versions of GCC (such as 4.5.2) need help.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
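The warning above concerns a strtok_r()-style save pointer. As a minimal sketch (not the actual util/cgroup.c code; the function name count_fields is hypothetical), explicitly initializing the save pointer is a common way to help compilers that cannot prove it is written before it is read:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: count the non-empty fields in a comma-separated
 * list. saved_ptr is set by the first strtok_r() call, but some compilers
 * cannot see that, so it is initialized explicitly.
 */
int count_fields(const char *list)
{
	char *copy, *tok, *saved_ptr = NULL;
	int n = 0;

	assert(list != NULL);

	copy = strdup(list);	/* strtok_r() modifies its input */
	if (!copy)
		return -1;

	for (tok = strtok_r(copy, ",", &saved_ptr); tok;
	     tok = strtok_r(NULL, ",", &saved_ptr))
		n++;

	free(copy);
	return n;
}
```

The explicit NULL costs nothing at run time and keeps the build warning-free across compiler versions.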
| | |
| \ \
| \ \
| \ \
*---. \ \ Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', ↵Linus Torvalds2011-04-072-65/+113
|\ \ \ \ \
'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-32, fpu: Fix FPU exception handling on non-SSE systems
  x86, hibernate: Initialize mmu_cr4_features during boot
  x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
  x86: visws: Fixup irq overhaul fallout

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Clean up rebalance_domains() load-balance interval calculation

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
  rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Fix cpumask leak in __setup_irq()

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf probe: Fix listing incorrect line number with inline function
  perf probe: Fix to find recursively inlined function
  perf probe: Fix multiple --vars options behavior
  perf probe: Fix to remove redundant close
  perf probe: Fix to ensure function declared file
| | | * | | perf probe: Fix listing incorrect line number with inline functionMasami Hiramatsu2011-04-051-53/+81
| | | | | |
Fix a bug showing an incorrect line number when a probe is put on the head of an inline function.

This patch updates find_perf_probe_point() and introduces new rules to get the correct line number.

- If debuginfo doesn't have a correct file name, we shouldn't return a line number either, because without a file name the line number is meaningless.
- If the address is in a function, it stores the function name and the offset from the function entry.
- If the address is on a line, it tries to get the relative line number from the function entry line, unless the address is the same as the entry address of the function (in this case, the relative line number should be 0).
- If the address is in an inline function entry (call-site), it uses the inline function call line number as the line on which the address is.
- If the address is in an inline function body, it stores the inline function name and offset from the inline function call site instead of the (non-inlined) function.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092605.2132.11629.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | | perf probe: Fix to find recursively inlined functionMasami Hiramatsu2011-04-051-1/+14
| | | | | |
Fix die_find_inlinefunc() to return correct innermost inlined function at given address. Without this fix, it returns the outermost inlined function.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092559.2132.78634.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
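The difference between returning the outermost and the innermost inlined function can be illustrated with a small sketch. This is not the libdw-based die_find_inlinefunc(); struct inline_scope and find_innermost() are hypothetical names, and the address ranges stand in for DWARF inlined-subroutine entries:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a DWARF inlined-subroutine entry. */
struct inline_scope {
	unsigned long lo, hi;			/* covers [lo, hi) */
	const char *name;
	const struct inline_scope *children;	/* nested inlined calls */
	size_t nr_children;
};

/*
 * Return the deepest scope containing addr, or NULL if addr is outside
 * this scope. Stopping at the first match instead of recursing would
 * return the outermost inlined function.
 */
const struct inline_scope *find_innermost(const struct inline_scope *s,
					  unsigned long addr)
{
	size_t i;

	assert(s != NULL);

	if (addr < s->lo || addr >= s->hi)
		return NULL;

	for (i = 0; i < s->nr_children; i++) {
		const struct inline_scope *hit;

		hit = find_innermost(&s->children[i], addr);
		if (hit)
			return hit;
	}
	return s;
}
```

Recursing into the children before accepting the current scope is what makes the result the innermost match.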
| | | * | | perf probe: Fix multiple --vars options behaviorMasami Hiramatsu2011-04-051-8/+8
| | | | | |
Fix a bug that perf-probe fails to initialize libdwfl and shows incorrect error when user gives multiple --vars options.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092553.2132.42691.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | | perf probe: Fix to remove redundant closeMasami Hiramatsu2011-04-052-3/+2
| | | | | |
Since dwfl_end() closes given fd with dwfl, caller doesn't need to close its fd when finishing process.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092547.2132.93728.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | | perf probe: Fix to ensure function declared fileMasami Hiramatsu2011-04-051-0/+8
| | |/ / /
Fix to ensure function declared file matches given file name. This fixes a potential bug.

As I've commented on Lin Ming's fastpath enhancement, decl_file should be checked on each probe point if user gives a probe point as func@file.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092541.2132.3584.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | | Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6Linus Torvalds2011-04-072-2/+2
|\ \ \ \ \
| |_|/ / /
|/| | | |
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
  Fix common misspellings
| * | | | Fix common misspellingsLucas De Marchi2011-03-312-2/+2
| |/ / /
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* | | | perf: mmap 512 kiB by defaultFrederic Weisbecker2011-03-311-1/+5
| | | |
The default setting of perf record is to mmap 128 pages if the user did not override with -m.

However the page size may vary across different architecture settings, giving a different default size on each.

Moreover the kernel side still has a default max number of mlocked pages of 512 kiB + 1 page for unprivileged users. 128 + 1 pages with page size > 4096 exceeds this threshold.

Thus, better adapt to this limitation and set the default number of pages to fit those 512 kiB + 1 page.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1301535324-9735-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
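The arithmetic behind this change can be sketched as follows. This is an illustration under assumptions, not the patch itself: default_mmap_pages() and DEFAULT_MMAP_BYTES are hypothetical names, and the extra "+ 1 page" of the mlock limit is taken to cover a separate control page on top of the data pages computed here:

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_MMAP_BYTES (512 * 1024)	/* 512 kiB budget for data pages */

/*
 * Hypothetical helper: number of data pages that fit in the 512 kiB
 * budget for a given page size, instead of a fixed 128 pages.
 */
size_t default_mmap_pages(size_t page_size)
{
	assert(page_size > 0);
	return DEFAULT_MMAP_BYTES / page_size;
}
```

With 4 kiB pages this still yields 128 pages, so the common case is unchanged, while 64 kiB pages yield 8 pages rather than a 128-page buffer that exceeds the limit. In a real tool the page size would typically come from sysconf(_SC_PAGESIZE).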
* | | | Merge commits 'ca6a42586fae' and 'c286c419c784' into perf/urgentIngo Molnar2011-03-304-20/+54
|\ \ \ \
| | |/ /
| |/| |
Pick up these two commits from Arnaldo's perf/core tree:

  ca6a42586fae: perf tools: Emit clearer message for sys_perf_event_open ENOENT return
  c286c419c784: perf tools: Fixup exit path when not able to open events

As they are really fixes we want to have sooner than later.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | perf tools: Emit clearer message for sys_perf_event_open ENOENT returnDavid Ahern2011-03-292-1/+15
| | | |
Resend of patch sent back in January 2011 in light of recent confusion around unsupported events for a given platform.

Improve sys_perf_event_open ENOENT return handling in top and record, just like 5a3446b does for stat.

Retry of Arnaldo's patch using ui_warning instead of die which allows the fallback from hardware cycles to software clock.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
LKML-Reference: <1301080271-20945-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
[ committer note: Some adjustments to make it apply to newer codebase ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | | perf tools: Fixup exit path when not able to open eventsArnaldo Carvalho de Melo2011-03-294-19/+39
| | | |
We have to deal with the TUI mode in perf top, so that we don't end up with a garbled screen when, say, a non root user on a machine with a paranoid setting (the default) tries to use 'perf top'.

Introduce a ui__warning_paranoid() routine shared by top and record that tells the user the valid values for /proc/sys/kernel/perf_event_paranoid.

Cc: David Ahern <daahern@cisco.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | perf tools: Fix NO_NEWT=1 python build errorRobert Richter2011-03-292-3/+12
| | | |
Fix the following build error:

  GEN python/perf.so
  In file included from util/evsel.h:10,
                   from util/python.c:6:
  util/hist.h:106:18: error: newt.h: No such file or directory
  error: command 'x86_64-pc-linux-gnu-gcc' failed with exit status 1
  make: *** [python/perf.so] Error 1

by passing BASIC_CFLAGS to setup.py. BASIC_CFLAGS variable contains the -DNO_NEWT_SUPPORT switch which prevents building python c extension with newt.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20110329180236.GA19366@erda.amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | | | perf symbols: Properly align symbol_conf.priv_sizeDavid S. Miller2011-03-291-0/+2
|/ / /
If symbol_conf.priv_size is not a multiple of "sizeof(u64)" we'll bus error on sparc64 in symbol__new because the "struct symbol *" pointer is computed by adding symbol_conf.priv_size to the memory allocated.

We cannot isolate the fix to symbol__new and symbol__delete since the private area is computed by subtracting the priv_size value from a "struct symbol" pointer, so then the private area can still be potentially unaligned.

So, simply align the symbol_conf.priv_size value in symbol__init()

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110328.175849.112593455.davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
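Rounding a size up to a multiple of sizeof(u64) can be sketched like this. The helper name align_priv_size() is hypothetical and uint64_t stands in for the kernel's u64; the patch itself may well use an existing alignment macro instead:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: round priv_size up to the next multiple of
 * sizeof(uint64_t), so that a struct placed right after the private
 * area starts on an 8-byte boundary.
 */
size_t align_priv_size(size_t priv_size)
{
	const size_t a = sizeof(uint64_t);

	assert((a & (a - 1)) == 0);	/* mask trick needs a power of two */
	return (priv_size + a - 1) & ~(a - 1);
}
```

Because the same aligned value is used both when adding to the allocation and when subtracting from a "struct symbol" pointer, the two computations stay consistent.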
* / / perf symbols: Fix vsyscall symbol lookupAndrew Lutomirski2011-03-282-1/+4
|/ /
Perf can't currently trace into the vsyscall page. It looks like it was meant to work. Tested on 2.6.38 and today's -git.

The bug is easy to reproduce. Compile this:

  #include <time.h>

  int main()
  {
          int i;
          struct timespec t;

          for (i = 0; i < 10000000; i++)
                  clock_gettime(CLOCK_MONOTONIC, &t);
          return 0;
  }

and run it through perf record; perf report. The top entry shows "[unknown]" and you can't zoom in.

It looks like there are two issues. The first is that a test for user mode executing in kernel space is backwards. (That's the first hunk below). The second (I think) is that something's wrong with the code that generates lots of little struct dso objects for different sections -- when it runs on vmlinux it results in bogus long_name values which cause objdump to fail.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LPU-Reference: <AANLkTikxSw5+wJZUWNz++nL7mgivCh_Zf=2Kq6=f9Ce_@mail.gmail.com>
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2011-03-2520-60/+145
|\ \
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue
  perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
  perf symbols: Look at .dynsym again if .symtab not found
  perf build-id: Add quirk to deal with perf.data file format breakage
  perf session: Pass evsel in event_ops->sample()
  perf: Better fit max unprivileged mlock pages for tools needs
  perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx()
  perf top: Fix uninitialized 'counter' variable
  tracing: Fix set_ftrace_filter probe function display
  perf, x86: Fix Intel fixed counters base initialization
| * | perf symbols: Look at .dynsym again if .symtab not foundArnaldo Carvalho de Melo2011-03-231-12/+13
| | |
The original intent of the code was to repeat the search with want_symtab = 0. But as the code stands now, we never hit the "default" case of the switch statement. Which means we never repeat the search.

Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Arun Sharma <asharma@fb.com>
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Dave Martin <dave.martin@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf build-id: Add quirk to deal with perf.data file format breakageArnaldo Carvalho de Melo2011-03-231-1/+56
| | |
The a1645ce1 changeset:

  "perf: 'perf kvm' tool for monitoring guest performance from host"

Added a field to struct build_id_event that broke the file format.

Since the kernel build-id is the first entry, process the table using the old format if the well known '[kernel.kallsyms]' string for the kernel build-id has the first 4 characters chopped off (where the pid_t sits).

Reported-by: Han Pingtian <phan@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Cc: stable@kernel.org
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
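The detection heuristic described above can be sketched roughly as follows. The function name looks_like_old_format() and the constants are hypothetical, and the sketch assumes the added field is 4 bytes wide, as the "first 4 characters" wording implies:

```c
#include <assert.h>
#include <string.h>

#define KERNEL_DSO_NAME   "[kernel.kallsyms]"
#define CHOPPED_PREFIX_LEN 4	/* assumed width of the added pid_t field */

/*
 * Hypothetical check: when an old-format record is read through the new
 * struct layout, the filename field starts 4 bytes too late, so the
 * well-known kernel name shows up as "nel.kallsyms]".
 */
int looks_like_old_format(const char *filename)
{
	const char *chopped = KERNEL_DSO_NAME + CHOPPED_PREFIX_LEN;

	assert(filename != NULL);
	return strncmp(filename, chopped, strlen(chopped)) == 0;
}
```

A reader that sees the chopped name in the first entry can then re-parse the whole table with the older, shorter record layout.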
| * | perf session: Pass evsel in event_ops->sample()Arnaldo Carvalho de Melo2011-03-2317-46/+73
| | |
Resolve the sample->id to an evsel, since the most advanced tools, report and annotate, need it, and the others will too when they evolve to properly support multi-event perf.data files.

Good also because it does an extra validation, checking that the ID is valid when present. When that is not the case, the overhead is just a branch + function call (perf_evlist__id2evsel).

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| * | perf top: Fix uninitialized 'counter' variableAkihiro Nagai2011-03-231-1/+3
| | |
builtin-top.c has an uninitialized variable. gcc(version 4.5.1) warns about it and it results in build failure:

  builtin-top.c: In function 'display_thread':
  builtin-top.c:518:9: error: 'counter' may be used uninitialized

This situation can indeed trigger, if the getline() call in prompt_integer() fails.

Signed-off-by: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110323072939.11638.50173.stgit@localhost6.localdomain6>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2011-03-212-20/+161
|\ \ \
| |_|/
|/| |
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest:
  ktest: Add STOP_TEST_AFTER to stop the test after a period of time
  ktest: Monitor kernel while running of user tests
  ktest: Fix bug where the test would not end after failure
  ktest: Add BISECT_FILES to run git bisect on paths
  ktest: Add BISECT_SKIP
  ktest: Add manual bisect
  ktest: Handle kernels before make oldnoconfig
  ktest: Start failure timeout on panic too
  ktest: Print logfile name on failure
| * | ktest: Add STOP_TEST_AFTER to stop the test after a period of timeSteven Rostedt2011-03-082-1/+20
| | |
Currently, if a test causes constant output but never reaches a boot prompt, or crashes, the test will never stop. Add STOP_TEST_AFTER to create a variable that will stop (and fail) the test after it has run for this amount of time. The default is 10 minutes. Setting this variable to -1 will disable it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>