Previous article: https://exadata-dba.blogspot.com/2014/01/system-performance-utilities-show-wrong.html
A real-life example. Look at the "Host CPU" table.
The customer asks: why do the batch jobs take twice as long when more than 520 of them run simultaneously?
Look at the AWR extract:
I'd like to ask you: is the system overloaded, or does it have plenty of available resources?
On the one hand, it has 706 working processes on 256 physical cores.
On the other hand, it reports 68.9% CPU idle! How is that possible?
ADDM extract "CPU was NOT A BOTTLENECK for the instance":
-------------------------------------------------------------------------
About Intel perf:
Oracle Linux: %Steal Ratio Is High even if Guest/Instance is Using Pinned CPUs (Doc ID 2783819.1): "With hyperthreading there is no doubling CPU possibilities/performance, per Intel documentation system can achieve 15% performance boost compared to non hyperthreading enabled CPUs"
Oracle Linux: How to understand OS load average and run queue / blocked queue in terms of CPU utilization (Doc ID 2221159.1):
“The rule of thumb is:
- Single Core system - if load average is 1.00 it means that system is fully utilized and if there will be more tasks incoming they will be queue-up and wait for execution
- Single Core system - if load average is 2.00 it means that System is already utilized and some tasks are already queued-up and waiting for execution
- Multi core system ( 4 cores ) - if load average is 1.00 it means that system uses 1/4 of his CPU capabilities, one task is actively running and there are still 3 cores at 'idle' stage
- Multi core system ( 4 cores ) - if load average is 4.00 it means that system uses all 4 cores and it indicate that system is fully utilized”
“Run queue column should be always lower/same as number of cores installed on system - of course run queue of 100 can be visible on system with only 8 cores - it will mean that 8 processes are actively being served by CPU and rest 92 are queued and waiting for execution.”
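The rule of thumb quoted above can be written down as a few lines of arithmetic. This is only a sketch: the function name load_summary and its return keys are mine, not Oracle's, and it treats the load average as a plain count of runnable tasks (on Linux the load average also includes tasks in uninterruptible sleep, which this ignores).

```python
def load_summary(load_average: float, cores: int) -> dict:
    """Split a load average into running and queued tasks for a given core count.

    Hypothetical helper illustrating the quoted rule of thumb; it assumes
    one runnable task occupies one core.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    running = min(load_average, cores)        # tasks that can be on a core right now
    queued = max(load_average - cores, 0.0)   # tasks waiting for a free core
    return {
        "utilization": running / cores,       # 1.0 means every core is busy
        "running": running,
        "queued": queued,
    }


if __name__ == "__main__":
    # The five cases mentioned in the quoted text: (load average, cores)
    for load, cores in [(1.0, 1), (2.0, 1), (1.0, 4), (4.0, 4), (100.0, 8)]:
        print(load, cores, load_summary(load, cores))
```

For the last case (run queue of 100 on 8 cores) the sketch gives 8 running and 92 queued, which matches the quoted text.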
Intel provides good document about HT systems and how much gain can be achieved:
In real life scenario HT will allow for additional 15% -> 20%
performance gain but it can't be used as general rule to gain as twice
possible 'power' to boost application/database or virtualization layer
for additional core sources.
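To see what a 15% to 20% gain means for the system in this post, here is a rough estimate. It is a sketch under loud assumptions: the model "capacity = physical cores x (1 + HT gain)" is my simplification of the quoted statement, the function names are hypothetical, and it treats all 706 working processes as if they wanted CPU at the same time, which the AWR extract does not prove.

```python
def effective_capacity(physical_cores: int, ht_gain: float) -> float:
    """Capacity in 'core equivalents', assuming HT adds ht_gain on top of the cores."""
    return physical_cores * (1.0 + ht_gain)


def processes_per_core_equivalent(processes: int, physical_cores: int, ht_gain: float) -> float:
    """How many processes compete for each core equivalent (simplified model)."""
    return processes / effective_capacity(physical_cores, ht_gain)


if __name__ == "__main__":
    cores = 256        # physical cores, from the post
    processes = 706    # working processes, from the post
    for gain in (0.15, 0.20):   # the quoted HT gain range
        cap = effective_capacity(cores, gain)
        ratio = processes_per_core_equivalent(processes, cores, gain)
        print(f"HT gain {gain:.0%}: ~{cap:.1f} core equivalents, "
              f"~{ratio:.2f} processes per core equivalent")
```

Under these assumptions the machine offers roughly 294 to 307 core equivalents, not 512, so 706 busy processes would be well over two per core equivalent even though the OS still shows idle logical CPUs.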
See also my previous article: https://exadata-dba.blogspot.com/2014/01/system-performance-utilities-show-wrong.html
About SPARC perf:
CPU utilization of multi-threaded architectures explained:
"Example:
Assuming 1 CPU core has 4 threads. Currently 2 (single-threaded)
processes are scheduled to run on this core and these 2 processes
already saturate all available shared compute resources (ALU, FPU,
Cache, Memory bandwidth, etc.) of the core. Commonly used performance
tools would still report (at least) 50% idle since 2 logical processors
(hardware threads) appear completely idle."
" .. at 25% of
the maximum throughput load the operating system only reports 8% CPU
utilization with 92% idle.
At half of the maximum achievable throughput the system only appears to be 21%
busy with 79% idle. "
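The two data points in the quote show how far the reported utilization lags behind the real load. A small sketch makes the gap explicit; the names QUOTED_POINTS and understatement are mine, and the only numbers used are the ones quoted above.

```python
# (fraction of maximum throughput, CPU utilization reported by the OS),
# taken from the quoted SPARC text
QUOTED_POINTS = [(0.25, 0.08), (0.50, 0.21)]


def understatement(throughput_fraction: float, reported_utilization: float) -> float:
    """Factor by which the reported utilization understates the real load."""
    return throughput_fraction / reported_utilization


if __name__ == "__main__":
    for throughput, reported in QUOTED_POINTS:
        factor = understatement(throughput, reported)
        print(f"at {throughput:.0%} of max throughput the OS reports {reported:.0%} busy "
              f"({1 - reported:.0%} idle), understating the load about {factor:.1f}x")
```

So an "idle" percentage from the OS on a multi-threaded core cannot be read as the share of throughput still available.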
---------------------------------------------------------------
VMware: