Friday, January 31, 2014

System performance utilities show wrong CPU load on Intel processors when HyperThreading is enabled

The interesting information is published in the

 
pages 28-29:



"Compute node CPU utilization can be measured through many different tools – top, AWR, iostat, vmstat, etc. and they all give the same number, and % CPU utilization typically averaged over a set period of time. Choose whichever tool is most convenient, but allow for Intel CPU hyper-threading.
The Intel CPUs used in all Exadata models run with two threads per CPU core. This helps to boost overall performance, but the second thread is not as powerful as the first. The operating system assumes that all threads are equal thus overstates the available CPU capacity by the operating system. We need to allow for this. Here is an approximate rule of thumb that can be used to estimate actual CPU utilization, but note that this can vary with different workloads:
∙ For CPU utilization less than 50%, multiply by 1.7.
∙ For CPU utilization over 50%, assume 85% plus (util-50%)* 0.3.

Here is a table that summarizes the effect:
Measured Utilization    Actual Utilization
10%                     17%
20%                     34%
30%                     51%
40%                     68%
50%                     85%
60%                     88%
70%                     91%
80%                     94%
90%                     97%
100%                    100%
"

This information is applicable to all x86 and x86-64 servers with HyperThreading enabled.

Enabling HT causes system statistics tools (vmstat, sar) to show an incorrect CPU load,
and Oracle performance tools (AWR, ASH) also show and store incorrect CPU load statistics.

Be careful!
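
The rule of thumb quoted above is easy to apply in a script. Here is a minimal sketch in Python; the function name is my own, and the rule itself is only an approximation that, as the quote says, varies with the workload. At exactly 50% both branches of the rule give 85%, so the boundary does not matter.

```python
def estimate_actual_utilization(measured_pct: float) -> float:
    """Apply the quoted rule of thumb to a measured CPU utilization (0-100).

    Up to 50% measured: multiply by 1.7.
    Above 50% measured: 85% plus 0.3 for every point above 50%.
    """
    if not 0 <= measured_pct <= 100:
        raise ValueError("measured utilization must be between 0 and 100")
    if measured_pct <= 50:
        return measured_pct * 1.7
    return 85 + (measured_pct - 50) * 0.3


# Reproduce the table from the quote.
for measured in range(10, 101, 10):
    estimated = estimate_actual_utilization(measured)
    print(f"{measured:3d}% measured -> {estimated:.0f}% estimated")
```

The loop prints the same ten rows as the table, which is a quick check that the two branches were transcribed correctly.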



ExaWatcher in X4-2 Exadata

A new-old product has appeared in Exadata: ExaWatcher. Its documentation is in note 1617454.1.

It can monitor IB switches and databases.

The new path resembles the old OSW path: /opt/oracle.ExaWatcher

What can it monitor? These are the directories in /opt/oracle.ExaWatcher/archive:

Diskinfo
IBCardInfo
IBprocs
Iostat
LGWR
Lsof
MegaRaidFW
Meminfo
Mpstat
Netstat
Ps
RDSinfo
Slabinfo
Top
Vmstat


ExaWatcher saves its files in the / file system (/opt/oracle.ExaWatcher/archive is on the / FS); the size of / is 30 GB.
It runs the following commands to fill the directories above:

iostat -t -x
cat /sys/class/net/ib0/carrier;/bin/cat /sys/class/net/ib1/carrier;/usr/sbin/ibstatus;/usr/bin/ibv_devinfo;/sbin/ifconfig -a | /bin/cat
/opt/oracle.ExaWatcher/LGWRExaWatcher.sh  -  runs cat /proc/$pid_LGWR/schedstat
top -b
vmstat
ps -eo flags,s,ruser,pid,ppid,c,psr,pri,ni,addr,sz,wchan,stime,tty,time,cmd
netstat -a -i -n; netstat -s; netstat -n -p -l
/opt/oracle.ExaWatcher/RDSinfoExaWatcher.sh 2>/dev/null
mpstat -P ALL
lsof +c0 -w +L -b -R -i; lsof +c0 -w +L -b -R -U; lsof +c0 -w +L1 -b -R
/opt/oracle.ExaWatcher/IBCardInfoExaWatcher.sh 2>/dev/null
cat /proc/meminfo
cat /proc/slabinfo
/opt/MegaRAID/MegaCli/MegaCli64 fwtermlog dsply -a0

There is a difference between the original OSW and the Exadata version (OSW and EW).
The original OSW stores historical files as *.dat, but the Exadata version stores them compressed with bzip2:

[root@ed02dbadm01 oracle.ExaWatcher]# ls -l archive/RDSinfo.ExaWatcher/
total 15224
-rw-r----- 1 root root   5136 Dec  8 10:08 2013_12_07_21_18_23_RDSinfoExaWatcher_db01.domain.mycompany.com.dat.bz2
-rw-r----- 1 root root   5438 Jan 23 16:02 2014_01_23_03_02_25_RDSinfoExaWatcher_db01.domain.mycompany.com.dat.bz2
-rw-r----- 1 root root   4998 Jan 23 17:02 

...
2014_01_31_16_49_45_RDSinfoExaWatcher_ed02dbadm01.distr.fors.ru.dat

But the file currently being written is a plain *.dat.
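
Since the archive holds both compressed and plain files, a reader has to handle both. Here is a minimal Python sketch that does so using the standard bz2 module; the function name is hypothetical, and the choice of UTF-8 with replacement of undecodable bytes is my assumption about the file contents.

```python
import bz2
from pathlib import Path


def read_exawatcher_file(path):
    """Return the text of an archive file, decompressing it if it ends in .bz2."""
    path = Path(path)
    if path.suffix == ".bz2":
        # Rotated (historical) files: *.dat.bz2
        with bz2.open(path, "rt", encoding="utf-8", errors="replace") as f:
            return f.read()
    # The file currently being written: plain *.dat
    return path.read_text(encoding="utf-8", errors="replace")


# Example use (the path pattern is taken from the listing above):
# for p in sorted(Path("/opt/oracle.ExaWatcher/archive/RDSinfo.ExaWatcher").iterdir()):
#     text = read_exawatcher_file(p)
```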



How long is the history kept? Looking at the code of
/opt/oracle.ExaWatcher/ExaWatcherCleanup.sh
I see that deletion is triggered when the / file system is 80% full. The code picks the single oldest file in each directory and deletes it, then repeats this cycle until / is 20% free.
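
To make the described logic concrete, here is a rough Python sketch of such a cleanup loop. It is not the code of ExaWatcherCleanup.sh (which is a bash script); all names are hypothetical, and the exact comparison at the 80% boundary is my assumption.

```python
import shutil
from pathlib import Path


def oldest_file(directory: Path):
    """Return the file with the oldest modification time, or None if empty."""
    files = [p for p in directory.iterdir() if p.is_file()]
    return min(files, key=lambda p: p.stat().st_mtime, default=None)


def root_fs_used_fraction(path="/"):
    """Fraction (0.0-1.0) of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def cleanup(archive_root, used_fraction=root_fs_used_fraction, high_water=0.80):
    """Delete the oldest file in each subdirectory, one pass at a time,
    until the usage reported by used_fraction() drops below high_water."""
    deleted = []
    root = Path(archive_root)
    while used_fraction() >= high_water:
        removed_any = False
        for sub in sorted(p for p in root.iterdir() if p.is_dir()):
            victim = oldest_file(sub)
            if victim is not None:
                victim.unlink()
                deleted.append(victim)
                removed_any = True
        if not removed_any:
            break  # nothing left to delete; avoid looping forever
    return deleted
```

Passing the usage check in as a callable keeps the sketch testable against a temporary directory without touching the real / file system.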


The log for deleted files: /opt/oracle.ExaWatcher/log/OldFilesDeleted.log

The running processes are:

[root@ed02dbadm01 ~]# ps -ef|grep -i ExaWatcher
root      12753      1  0 14:58 ?        00:00:00 /bin/bash ./ExaWatcher.sh --fromconf
root      16161  12753  0 14:59 ?        00:00:04 /usr/bin/perl /opt/oracle.ExaWatcher/ExecutorExaWatcher.pl /opt/oracle.ExaWatcher/ExaWatcher.execonf
root      16237  16161  0 14:59 ?        00:00:00 sh -c /usr/bin/iostat -t -x  5  720 2>/dev/null >> /opt/oracle.ExaWatcher/archive/Iostat.ExaWatcher/2014_01_24_14_59_00_IostatExaWatcher_ed02dbadm01.distr.fors.ru.dat
root      16259  16161  0 14:59 ?        00:00:00 sh -c /usr/bin/top -b -d 5 -n 720 2>/dev/null >> /opt/oracle.ExaWatcher/archive/Top.ExaWatcher/2014_01_24_14_59_00_TopExaWatcher_ed02dbadm01.distr.fors.ru.dat
root      16312  16161  0 14:59 ?        00:00:00 sh -c /usr/bin/mpstat -P ALL  5  720 2>/dev/null >> /opt/oracle.ExaWatcher/archive/Mpstat.ExaWatcher/2014_01_24_14_59_00_MpstatExaWatcher_ed02dbadm01.distr.fors.ru.dat
root      16557  16161  0 14:59 ?        00:00:00 sh -c /opt/oracle.ExaWatcher/ExaWatcherCleanup.sh 1386431083 1701963863 3600 /opt/oracle.ExaWatcher/archive/ 3145728 2>>/dev/null
root      16558  16557  0 14:59 ?        00:00:00 /bin/bash /opt/oracle.ExaWatcher/ExaWatcherCleanup.sh 1386431083 1701963863 3600 /opt/oracle.ExaWatcher/archive/ 3145728
root      30003  16161  0 15:33 ?        00:00:00 sh -c /usr/bin/vmstat  5  2 >> /opt/oracle.ExaWatcher/archive/Vmstat.ExaWatcher/2014_01_24_14_59_00_VmstatExaWatcher_ed02dbadm01.distr.fors.ru.dat


You can see some of the input arguments for the cleaner:

ExaWatcherCleanup.sh 1386431083 1701963863 3600 /opt/oracle.ExaWatcher/archive/ 3145728 2>>/dev/null

I think the first two are the StartTime and EndTime for this script, so ExaWatcher is supposed to run between:

[root@ed02dbadm02 oracle.ExaWatcher]# date --date @1386431083
Sat Dec  7 19:44:43 MSK 2013
[root@ed02dbadm02 oracle.ExaWatcher]# date --date @1701963863
Thu Dec  7 19:44:23 MSK 2023

3600 - the cycle length (CleanupInterval).
/opt/oracle.ExaWatcher/archive/ - the location of the historical files.
3145728 - the maximum size of the archive directory in KB (SpaceLimit).
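
The same decoding can be done in one place with a short Python sketch. The argument meanings follow my guesses above, not the script's documentation; the function name is hypothetical, and treating the interval as seconds is an additional assumption.

```python
from datetime import datetime, timezone


def describe_cleanup_args(start_epoch, end_epoch, interval, archive_dir, space_limit_kb):
    """Turn the raw ExaWatcherCleanup.sh arguments into readable values."""
    return {
        "start_utc": datetime.fromtimestamp(start_epoch, tz=timezone.utc),
        "end_utc": datetime.fromtimestamp(end_epoch, tz=timezone.utc),
        "interval_hours": interval / 3600,  # assumes the interval is in seconds
        "archive_dir": archive_dir,
        "space_limit_gib": space_limit_kb / 1024 / 1024,  # KB -> GiB
    }


info = describe_cleanup_args(
    1386431083, 1701963863, 3600, "/opt/oracle.ExaWatcher/archive/", 3145728
)
for key, value in info.items():
    print(f"{key}: {value}")
```

The times are printed in UTC, so they differ from the MSK output of the date commands above by the time zone offset. The space limit works out to 3 GiB.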


How to disable/set up autostart parameters for a specified instance?

Q: We have a 4-node RAC. I need to disable autostart of the DB on one node only.    How to do it and how to see autostart parameters, confir...