I need to profile my multithreaded C++ application to find its bottlenecks. The problem is that I need a wall-clock profile. I have tried oprofile and perf, but neither of them gives me this information.
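
To make clear what I mean by wall-clock time as opposed to CPU time, here is a minimal sketch. The `Timing` struct and the `measure_sleep` helper are hypothetical names made up for this illustration, and it assumes Linux/glibc, where `std::clock` reports the CPU time used by the process. A thread that blocks uses almost no CPU time but still costs elapsed time, and that elapsed time is what I want a profiler to attribute.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Elapsed wall-clock time and consumed CPU time, both in milliseconds.
struct Timing {
    double wall_ms;
    double cpu_ms;
};

// Hypothetical helper: blocks for `ms` milliseconds and measures both clocks.
// Assumption: std::clock() returns process CPU time (true on Linux/glibc).
Timing measure_sleep(int ms) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    // Stands in for any blocking operation (I/O wait, lock contention, sleep).
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.wall_ms = std::chrono::duration<double, std::milli>(wall_end - wall_start).count();
    t.cpu_ms = 1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

Calling `measure_sleep(200)` should report a wall-clock time of at least 200 ms and a CPU time close to zero, so a profiler that samples only CPU time would not see this wait at all.
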
I tried perf record -g -e sched:sched_stat_sleep <cmd>, but perf record crashes with a SIGFPE exception, which is frustrating.
Valgrind doesn't suit me because I use the fanotify_mark syscall, which is not implemented in that tool.
I'm not sure whether Google's perftools can do wall-clock profiling; I haven't seen any information about it in their documentation.
Can anyone suggest a tool? Thank you.
 
    