Saturday, September 7, 2019

Exadata image 18.1.19 pitfall

After installing Exadata image 18.1.19, we started getting errors related to new process creation:

Running "tail" on alert.log produced this message:
$ tail -100f alert.log
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable


Some time after the reboot, once the DBs had started, we were unable to switch to the oracle account:
# su - oracle
su: /bin/bash: Resource temporarily unavailable


Because the problem concerned new process creation, we checked the following files and found them to be OK. The files were old and nobody had modified them:
/etc/sysctl.conf
/etc/security/limits.conf
oracle soft nproc 400000
oracle hard nproc 400000

But "ulimit" showed a very different value:
[oracle@exa6dbadm01 ~]$ ulimit -u
2047

The number of oracle processes was close to the 2047 limit:
# ps -ef | grep oracle | wc -l
2326
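
Note that "ps -ef | grep oracle" counts every line containing the string "oracle", not only tasks owned by the oracle user. On Linux the nproc limit is charged per real user ID and counts threads. A minimal sketch of a closer count, assuming the procps "ps" is installed (the function name count_user_tasks is made up for this example):

```shell
# Count the tasks (threads) whose real UID belongs to the given user.
# This is what RLIMIT_NPROC is compared against on Linux.
count_user_tasks() {
    local uid
    uid=$(id -u "$1") || return 1
    # -L prints one line per thread; "ruid=" prints the real UID without a header
    ps -eL -o ruid= | awk -v uid="$uid" '$1 == uid { n++ } END { print n + 0 }'
}
```

For example, "count_user_tasks oracle" can be compared directly with the output of "ulimit -u" for that account.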

Rebooting Linux did not change the ulimit value.

"strace" was the way to find the root cause:

# strace su - oracle
...
open("/etc/security/limits.d/90-nproc.conf", O_RDONLY) = 3
...
setrlimit(RLIMIT_NPROC, {rlim_cur=2047, rlim_max=400000}) = 0
...
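
The full strace output is long, so it can help to save it to a file and keep only the lines about limits. A small sketch, assuming the trace was written with "strace -o" (the path /tmp/su_oracle.trace and the function name are hypothetical):

```shell
# Assumed capture step, run as root (the trace path is only an example):
#   strace -f -o /tmp/su_oracle.trace su - oracle -c true

# Print only the trace lines that mention a limits file or RLIMIT_NPROC.
nproc_trace_lines() {
    grep -E '/etc/security/limits|RLIMIT_NPROC' "$1"
}
```

Calling "nproc_trace_lines /tmp/su_oracle.trace" should then show which limits files were opened and the setrlimit call that followed.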


# cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
*          soft    nproc     1024
root       soft    nproc     unlimited
#Oracle recommended value for nproc is set to 2047 for user oracle
oracle  soft  nproc  2047
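
pam_limits reads /etc/security/limits.conf first and then the *.conf files in /etc/security/limits.d, so a later entry for the same user overrides an earlier one. The sketch below lists every nproc entry that applies to a user in that order. It is a simplification: it looks only at entries for the user name and the "*" wildcard, and ignores group entries. The function name list_nproc_entries and the optional directory argument are assumptions made for this example.

```shell
# List nproc entries for a user in the order the limits files are parsed.
# For a given user and limit type, the last matching user entry printed wins.
list_nproc_entries() {
    local user="$1" dir="${2:-/etc/security}"
    local LC_ALL=C   # pam_limits orders the limits.d files by the C locale
    local f
    for f in "$dir/limits.conf" "$dir"/limits.d/*.conf; do
        [ -f "$f" ] || continue
        awk -v u="$user" -v f="$f" '
            $1 !~ /^#/ && ($1 == u || $1 == "*") && $3 == "nproc" {
                print f ": " $0
            }' "$f"
    done
}
```

On the system described here, "list_nproc_entries oracle" would be expected to end with the 2047 line from 90-nproc.conf, which is the soft limit that took effect.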


After we changed the "oracle  soft  nproc" value to 400000 in this file, the system worked fine.
Bingo!
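
The edit itself can be scripted. This is only a sketch, assuming GNU sed and a line in the same "oracle soft nproc N" form as above; the function name set_oracle_soft_nproc is made up, and a .bak copy of the file is kept:

```shell
# Replace the numeric soft nproc value for oracle in the given limits file.
# A backup of the original file is written next to it with a .bak suffix.
set_oracle_soft_nproc() {
    local file="$1" value="$2"
    sed -i.bak -E \
        "s/^(oracle[[:space:]]+soft[[:space:]]+nproc[[:space:]]+)[0-9]+/\1${value}/" \
        "$file"
}

# Assumed usage on the affected host (run as root):
#   set_oracle_soft_nproc /etc/security/limits.d/90-nproc.conf 400000
#   su - oracle -c 'ulimit -u'
```

The new value applies to sessions opened after the change. Processes that are already running keep the limits they started with, which can be checked in /proc/<pid>/limits.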

