Friday, June 11, 2021

CPU utilization as a performance metric

Previous article: https://exadata-dba.blogspot.com/2014/01/system-performance-utilities-show-wrong.html

A real-life example. Look at the "Host CPU" table.

The customer asks: why do its batch jobs take twice as long when it runs more than 520 batch jobs simultaneously?


Look at the AWR extract:



I'd like to ask you: is this system overloaded, or does it have plenty of available resources?

On the one hand, it has 706 working processes for 256 physical cores.

On the other hand, it shows 68.9% CPU IDLE! How is that possible?
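
The two sides of this question can be put next to each other with simple arithmetic. Below is a minimal Python sketch, not something taken from the AWR report: the process count, the core count and the idle percentage are the figures quoted above, while threads_per_core = 2 is an assumption (hyperthreading enabled) that the text does not state.

```python
# Figures quoted in the text above.
working_processes = 706
physical_cores = 256
cpu_idle_pct = 68.9

# ASSUMPTION (not stated in the text): 2 hardware threads per physical core.
threads_per_core = 2


def summarize(processes, cores, idle_pct, threads):
    """Put the process count and the reported idle figure side by side."""
    logical_cpus = cores * threads
    busy_fraction = 1 - idle_pct / 100
    return {
        "processes_per_core": processes / cores,
        "logical_cpus": logical_cpus,
        # Logical CPUs the OS counts as busy, on average.
        "busy_logical_cpus": busy_fraction * logical_cpus,
    }


summary = summarize(working_processes, physical_cores, cpu_idle_pct, threads_per_core)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption the OS divides CPU time by 512 logical CPUs, not by 256 cores, so the idle percentage is measured against a denominator that the hardware cannot fully deliver.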

 

ADDM extract "CPU was NOT A BOTTLENECK for the instance":


 

-------------------------------------------------------------------------


About Intel perf

Oracle Linux: %Steal Ratio Is High even if Guest/Instance is Using Pinned CPUs (Doc ID 2783819.1):   "With hyperthreading there is no doubling CPU possibilities/performance, per Intel documentation system can achieve 15% performance boost compared to non hyperthreading enabled CPUs"  
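
To see what that 15% figure means for capacity, here is a rough Python sketch. The 256 cores are the ones from the example in this post and the 15% boost is the number from the quote; effective_capacity is a hypothetical helper, and 2 threads per core is an assumption.

```python
def effective_capacity(physical_cores, ht_boost=0.15):
    """Throughput capacity in units of 'one core without hyperthreading'."""
    return physical_cores * (1 + ht_boost)


physical_cores = 256
threads_per_core = 2  # ASSUMPTION: hyperthreading with 2 threads per core

# What the OS counts when it computes %busy and %idle.
logical_cpus = physical_cores * threads_per_core

# What the hardware can deliver according to the quoted 15% boost.
capacity = effective_capacity(physical_cores)

# How many times the logical CPU count overstates the real capacity.
overstatement = logical_cpus / capacity

print(f"logical CPUs seen by the OS: {logical_cpus}")
print(f"capacity in core equivalents: {capacity:.1f}")
print(f"overstatement factor: {overstatement:.2f}")
```

With these inputs the OS counts 512 CPUs while the quoted boost gives only about 294 core equivalents, so a utilization percentage based on logical CPUs looks much lower than the real load.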

 

Oracle Linux: How to understand OS load average and run queue / blocked queue in terms of CPU utilization (Doc ID 2221159.1):

“The rule of thumb is:

  • Single Core system - if load average is 1.00 it means that system is fully utilized and if there will be more tasks incoming they will be queue-up and wait for execution
  • Single Core system - if load average is 2.00 it means that System is already utilized and some tasks are already queued-up and waiting for execution
  • Multi core system ( 4 cores ) - if load average is 1.00 it means that system uses 1/4 of his CPU capabilities, one task is actively running and there are still 3 cores at 'idle' stage
  • Multi core system ( 4 cores ) - if load average is 4.00 it means that system uses all 4 cores and it indicate that system is fully utilized 

Run queue column should be always lower/same as number of cores installed on system - of course run queue of 100 can be visible on system with only 8 cores - it will mean that 8 processes are actively being served by CPU and rest 92 are queued and waiting for execution.

Intel provides good document about HT systems and how much gain can be achieved:

https://software.intel.com/content/www/us/en/develop/articles/how-to-determine-the-effectiveness-of-hyper-threading-technology-with-an-application.html?wapkw=(explains)

In real life scenario HT will allow for additional 15% -> 20% performance gain but it can't be used as general rule to gain as twice possible 'power' to boost application/database or virtualization layer for additional core sources.”
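
The load average rule of thumb quoted above can be written down as a small function. This is only an illustrative Python sketch: describe_load is a hypothetical helper, and it treats the load average as a plain count of runnable tasks, which is a simplification (on Linux the load average also includes tasks in uninterruptible sleep).

```python
def describe_load(load_average, cores):
    """Split a load average into tasks on CPU and tasks waiting in the queue."""
    running = min(load_average, cores)     # at most one task per core runs
    queued = max(load_average - cores, 0)  # the rest wait for a free core
    return {
        "utilization": running / cores,
        "running": running,
        "queued": queued,
    }


# The cases from the quoted rule of thumb, as (load average, cores).
examples = [(1.0, 1), (2.0, 1), (1.0, 4), (4.0, 4), (100, 8)]
for load, cores in examples:
    print(load, cores, describe_load(load, cores))
```

The last case is the run queue of 100 on an 8-core system: 8 tasks are on CPU and 92 are queued.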


I also wrote about this earlier: https://exadata-dba.blogspot.com/2014/01/system-performance-utilities-show-wrong.html

 

About SPARC perf:

 CPU utilization of multi-threaded architectures explained:

 

"Example:
Assuming 1 CPU core has 4 threads. Currently 2 (single-threaded) processes are scheduled to run on this core and these 2 processes already saturate all available shared compute resources (ALU, FPU, Cache, Memory bandwidth, etc.) of the core. Commonly used performance tools would still report (at least) 50% idle since 2 logical processors (hardware threads) appear completely idle."
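
The arithmetic behind that example is easy to reproduce. A minimal Python sketch, assuming the tool computes idle time simply as the share of hardware threads with nothing scheduled on them; reported_idle_pct is a hypothetical helper, not a function of any real monitoring tool.

```python
def reported_idle_pct(busy_threads, threads_per_core, cores=1):
    """Idle percentage as seen by a tool that counts hardware threads."""
    total_threads = threads_per_core * cores
    return 100 * (total_threads - busy_threads) / total_threads


# The quoted case: 1 core, 4 threads, 2 processes that already saturate the core.
idle = reported_idle_pct(busy_threads=2, threads_per_core=4)
print(f"reported idle: {idle:.0f}%")  # prints "reported idle: 50%"
```

The tool reports 50% idle even though, by the premise of the example, the core has no compute resources left.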


" .. at 25% of the maximum throughput load the operating system only reports 8% CPU utilization with 92% idle.
At half of the maximum achievable throughput the system only appears to be 21% busy with 79% idle.
"
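
The two data points in that quote show how far the reported utilization lags behind the real load. A small Python sketch using only the quoted numbers; the variable and function names are mine.

```python
# (percent of maximum throughput, CPU utilization reported by the OS in percent)
quoted_points = [(25, 8), (50, 21)]


def understatement(throughput_pct, reported_busy_pct):
    """How many times the real load exceeds the reported utilization."""
    return throughput_pct / reported_busy_pct


factors = [understatement(load, busy) for load, busy in quoted_points]
for (load, busy), factor in zip(quoted_points, factors):
    print(f"{load}% of max throughput is reported as {busy}% busy: factor {factor:.2f}")
```

At a quarter of the maximum throughput the real load is about 3 times the reported utilization, and at half of the maximum it is still more than 2 times higher.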

---------------------------------------------------------------

VMWare:

My conclusion from real-life experience: CPU IS THE BOTTLENECK FOR THIS INSTANCE,
DESPITE THE CPU BEING 69% IDLE!

 

http://www.oracle.com/technetwork/database/availability/exadata-health-resource-usage-2021227.pdf

https://blog.pythian.com/cpu-utilization-not-useful-metric/
http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html
https://tools.ietf.org/html/rfc546
http://www.hpts.ws/papers/2007/Cockcroft_HPTS-Useless.pdf
https://www.dasher.com/will-hyper-threading-improve-processing-performance/

 
