Re: [PATCH] perf: Add a new sort order: SORT_INCLUSIVE (v4)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


On 3/19/12 8:57 AM, Frederic Weisbecker wrote:

Each hist have a period of 1, but the total period is 1.
So the end result should be (IIUC):

100%    foo     a
100%    foo     b
                 |
                 --- a
100%    foo     c
                 |
                 --- b
                     |
                     --- c


That is correct. The first column no longer adds up to 100%.

So do we really want this?


I think so. It's a different way of presenting the data. Pie chart vs a bar chart of OS market share where people may be using more than one OS.

I'll post some documentation updates.

If we don't do this, total_period will be inflated.

Yeah right I've just tried and callchains look right. I'm just puzzled
by the percentages:


Thanks for testing this!

+  98,99%  [k] execve
+  98,99%  [k] stub_execve
+  98,99%  [k] do_execve
+  98,99%  [k] do_execve_common
+  98,99%  [k] sys_execve
+  53,12%  [k] __libc_start_main
+  53,12%  [k] cmd_record

These look like they belong to the perf binary and are incorrectly classified as kernel samples. Problem is that callchain_get() is not populating the privilege level - it's simply propagating the privilege level of the sample:


+       for (i = 0; i < cursor->nr; i++) {
+               struct addr_location al_child = *al;
+
+               err = callchain_get(&iter, &al_child);

Not all fields of al_child are populated by callchain_get().

+  53,12%  [k] T.101
+  53,12%  [k] main
+  53,12%  [k] run_builtin
+  52,11%  [k] perf_evlist__prepare_workload
+  52,09%  [k] T.1163

The rest of them look ok to me. If something doesn't make sense, please point me at the output of "perf script".



Also this feature reminds me a lot the -b option in perf report.
Branch sorting and callchain inclusive sorting are a bit different in
the way they handle the things but the core idea is the same. Callchains
are branches as well.


Yes - I kept asking why the branch stack stuff doesn't use the
existing callchain logic.

Because I fear that loops branches could make the tree representation useless.


The loops could happen in callgraphs too right (eg: recursive programs)? The other problem in branch stacks/LBR is that they're sampled branches. Just because I got a sample with:

a -> b
b -> c

doesn't necessarily mean that the callchain was a -> b -> c.

I still don't have the branch stack setup working properly. But I'm now more sympathetic to the view that last branch sampling and callchains may have different representations in perf.

 -Arun
--
To unsubscribe from this list: send the line "unsubscribe linux-perf-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]    [Yosemite Photos]    [Free Online Dating]     [Linux Kernel]     [Linux SCSI]     [XFree86]

Add to Google Powered by Linux