Wednesday, October 23, 2019

NUMA on the Exadata and ORA-04031

At the very beginning when Exadata appeared at 2010 the settings of Linux OS and Oracle RDBMS were configured to use flat memory. There were no NUMA settings despite the SUN 4275 server is actualy NUMA server. And some notes at old times prohibited using NUMA settings on the Exadata.

But starting from November 2018 (
Exadata image 18.10), Oracle silently enabled NUMA settings for Exadata. Which can increase appearance of ORA-04031.

Here is a some words about enabling NUMA on Exadata in
2378765.1:
" Normally, PX message buffers come out of the large pool as long as sga_target or memory_target is set. However, when "_enable_NUMA_support" is 'TRUE' and you are running on Exadata (then _px_numa_support_enabled is 'TRUE') causes PX message buffers to be allocated from the numa pool. "

Monday, September 16, 2019

ORA-01405: fetched column value is NULL after upgrade from 12.1 to 18c

SYMPTOMs:

After upgrade 12.1.0.2 -> 18.6 we obtained at any SQL (select and DML) :

ORA-00604: error occurred at recursive SQL level 1
ORA-01405: fetched column value is NULL

At first (after successful upgrade) we see warning messages in alert log:

Completed: ALTER DATABASE OPEN /* db agent *//* {1:44916:15577} */
Unable to obtain current patch information due to error: 1405, ORA-01405: fetched column value is NULL
===========================================================
Dumping current patch information
===========================================================
Unable to obtain current patch information due to error: 1405
===========================================================
## jox_ujs_status: database not open read-write or Java not installed, returning FALSE in pid 258708

As you can see there is no expected patch list writen in alert log and issue concerning java.

Investigating the patch issue we found that all SQLs lead to ORA-1405 error:

SQL> select * from dual;
select * from dual
              *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01405: fetched column value is NULL

We did the trace 10046 for the query above.
And for "select * from dual" we obtained simple trace file:

*** 2019-09-15T11:41:35.514839+03:00
WAIT #139918613321520: nam='SQL*Net message from client' ela= 4123724 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=5429869063221
CLOSE #139918613321520:c=5,e=5,dep=0,type=1,tim=5429869063321
=====================
PARSING IN CURSOR #139918613299752 len=102 dep=1 uid=0 oct=3 lid=0 tim=5429869064742 hv=3908278695 ad='27f813db8' sqlid='68hhnz3ng76d7'
select max_iops, max_mbps, max_pmbps, latency, num_disks, additional_info  from resource_io_calibrate$
END OF STMT
PARSE #139918613299752:c=399,e=966,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=4,plh=1003572626,tim=5429869064742
EXEC  #139918613299752:c=21,e=21,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=1003572626,tim=5429869064821
FETCH #139918613299752:c=35,e=35,p=0,cr=2,cu=0,mis=0,r=0,dep=1,og=4,plh=1003572626,tim=5429869064871
STAT  #139918613299752 id=1 cnt=1 pid=0 pos=1 obj=303 op='TABLE ACCESS STORAGE FULL RESOURCE_IO_CALIBRATE$ (cr=2 pr=0 pw=0 str=1 time=34 us cost=2 size=41 card=1)'
CLOSE #139918613299752:c=56,e=56,dep=1,type=0,tim=5429869064963
=====================
PARSE ERROR #139918613303136:len=18 dep=0 uid=0 oct=3 lid=0 tim=5429869064981 err=604
select * from dual
WAIT #139918613303136: nam='Disk file operations I/O' ela= 29 FileOperation=8 fileno=0 filetype=8 obj#=-1 tim=5429869065096
WAIT #139918613303136: nam='SQL*Net break/reset to client' ela= 2 driver id=1650815232 break?=1 p3=0 obj#=-1 tim=5429869065126
WAIT #139918613303136: nam='SQL*Net break/reset to client' ela= 63 driver id=1650815232 break?=0 p3=0 obj#=-1 tim=5429869065201
WAIT #139918613303136: nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=5429869065217

*** 2019-09-15T11:41:37.746371+03:00
WAIT #139918613303136: nam='SQL*Net message from client' ela= 2229498 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=5429871294743
XCTEND rlbk=0, rd_only=1, tim=5429871294873
CLOSE #139918613303136:c=6,e=7,dep=0,type=0,tim=5429871294938


Nothing that would catch on the problem:  "PARSE ERROR ... err=604"

Security restrictions ? Broken audit triggers? VPD ?

None !

As it turned out, the message "ORA-01405: fetched column value is NULL" appears at intersection of:
-    Parallel SQL (parallel_degree_policy=AUTO)
-    parallel_degree_limit=IO 
-    upgrade 12.1 => 18c


REASON:

Oracle changed data format stored in the resource_io_calibrate$.ADDITIONAL_INFO column
and after the upgrade 18c database cannot recognise the content of this column.
As oracle say: " The DDL for resource_io_calibrate$ is different between 12c and 18.5.
the upgrade isn't handling the new ADDITIONAL_INFO column correctly"

So, if you use the parallel_degree_limit=IO + parallel_degree_policy=AUTO and there is data in the resource_io_calibrate$ table then can  obtain ORA-01405.

After we set parallel_degree_limit=CPU the error is disappeared.


SOLUTIONs:

- delete from resource_io_calibrate$;
  commit;
 And if you  need to populate resource_io_calibrate$ with new values, then: exec dbms_resource_manager.calibrate_io - .

  OR

- change the parallel_degree_limit to CPU
  (an i recommend next "delete from resource_io_calibrate$"  to avoid this error in the future)


Getting ORA-604, ORA-01405: fetched column value is NULL when PARALLEL_DEGREE_LIMIT = IO in 18c (Doc ID 2537431.1)

Saturday, September 7, 2019

Exadata image 18.1.19 pitfall

After Exadata image 18.1.19 have been installed we obtained some errors, concerned to new process creation:

The "tail  alert.log" bring the message:
$ tail -100f alert.log
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable


After some time after reboot and DBs have started we were unable to switch to oracle account:
# su - oracle
su: /bin/bash: Resource temporarily unavailable


Because the problem is concerned to new process creation,  we checked some files and found they are ok.We found that these files were old enough, nobody modified it:
/etc/sysctl.conf
/etc/security/limits.conf
oracle soft nproc 400000
oracle hard nproc 400000

But "ulimit" show  the very different values:
[oracle@exa6dbadm01 ~]$ ulimit -u
2047

The number of oracle processes is close to 2048 value:
# ps -ef | grep oracle | wc -l
2326

The linux reboot didn't bring new ulimit values.

The "strace" is the way to found the root cause:

# strace su - oracle
...
open("/etc/security/limits.d/90-nproc.conf", O_RDONLY) = 3
...
setrlimit(RLIMIT_NPROC, {rlim_cur=2047, rlim_max=400000}) = 0
...


# cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
*          soft    nproc     1024
root       soft    nproc     unlimited
#Oracle recommended value for nproc is set to 2047 for user oracle
oracle  soft  nproc  2047


After we changed the "oracle  soft  nproc " to 400000 the system is work ok.
Bingo !


Friday, September 6, 2019

In defense of HCC: ORA-39726: unsupported add/drop column operation on compressed tables

A friend asked to check the

ISSUE:
ORA-39726 is hit when adding column with default value to basic compressed table.

CAUSE:
compressed table will not allow some operation, which are allowed in uncompressed table.


"Is there any error during add column with default value to table compressed with HCC ?
Does Oracle unpack HCC data during add column with default value to the table compressed with HCC ?"  he asked.

----------------------------------------------------------

And the answer is here.

RDBMS version is 12.1.0.2.171017 (i choosed old enough version i can find. newer versions belived work the same).

$ opatch lspatches
26635845;Database PSU 12.1.0.2.171017, Oracle JavaVM Component (OCT2017)
25729214;OCW Interim patch for 25729214
26717470;Database Bundle Patch : 12.1.0.2.171017 (26717470)



$ sqlplus / as sysdba
SQL> create tablespace yu extent management local segment space management auto;
SQL> create user yu identified by yu default tablespace yu;
SQL> grant dba to yu ;

SQL> conn yu/yu

SQL> create table NOCOMP as select owner,object_type,object_name,object_id,created from dba_objects order by owner,object_type,object_name;
insert into nocomp select * from nocomp;
insert into nocomp select * from nocomp;
insert into nocomp select * from nocomp;
insert into nocomp select * from nocomp;
commit;

Table created.

SQL>
151156 rows created.

SQL>
302312 rows created.

SQL>
604624 rows created.

SQL>
1209248 rows created.

SQL>
Commit complete.

SQL> select table_name from user_tables;

TABLE_NAME
--------------------------------------------
NOCOMP

SQL> select segment_name, bytes, bytes/1048576 MB, blocks from user_segments order by 1;

SEGMENT_NAME        BYTES   MB    BLOCKS
-------------- ---------- ------- ----------
NOCOMP          159383552   152   9728


SQL> create table BASICOMP compress as select * from NOCOMP;

create table OLTPCOMP  compress for OLTP as select * from NOCOMP;

create table QL   compress for query LOW as select * from NOCOMP;
create table QLRL compress for query LOW row level locking as select * from NOCOMP;

create table QH   compress for query HIGH as select * from NOCOMP;
create table QHRL compress for query HIGH row level locking as select * from NOCOMP;

create table AL   compress for archive LOW as select * from NOCOMP;
create table ALRL compress for archive LOW row level locking as select * from NOCOMP;

create table AH   compress for archive HIGH as select * from NOCOMP;
create table AHRL compress for archive HIGH row level locking as select * from NOCOMP;


Table created.

SQL>
Table created.

SQL> SQL>
Table created.

SQL>
Table created.

SQL> SQL>
Table created.

SQL>
Table created.

SQL> SQL>
Table created.

SQL>
Table created.

SQL> SQL>
Table created.

SQL>
Table created.

SQL> exec dbms_stats.gather_schema_stats('YU');
PL/SQL procedure successfully completed.

SQL> col segment_name for a8
SQL> select segment_name, bytes, bytes/1048576 MB, blocks from user_segments order by 2 desc;

SEGMENT_      BYTES        MB     BLOCKS
-------- ---------- ---------- ---------

NOCOMP    159383552       152       9728
OLTPCOMP  100663296        96       6144
BASICOMP   92274688        88       5632
QLRL       47185920        45       2880
QL         44040192        42       2688
QHRL       26214400        25       1600
ALRL       25165824        24       1536
QH         24117248        23       1472
AL         23068672        22       1408
AHRL       18874368        18       1152
AH         16777216        16       1024

SQL> col table_name for a8
SQL> select table_name,num_rows,blocks from user_tables order by blocks desc;

TABLE_NA   NUM_ROWS    BLOCKS
-------- ---------- ---------
NOCOMP      2418496      9652
OLTPCOMP    2418496      5799
BASICOMP    2418496      5211
QLRL        2418496      2848
QL          2418496      2670
QHRL        2418496      1577
ALRL        2418496      1492
QH          2418496      1456
AL          2418496      1400
AHRL        2418496      1148
AH          2418496      1010

SQL> 

alter table NOCOMP   add new_column number(3) default 0;
alter table BASICOMP add new_column number(3) default 0;
alter table OLTPCOMP add new_column number(3) default 0;
alter table QLRL     add new_column number(3) default 0;
alter table QL       add new_column number(3) default 0;
alter table QHRL     add new_column number(3) default 0;
alter table ALRL     add new_column number(3) default 0;
alter table QH       add new_column number(3) default 0;
alter table AL       add new_column number(3) default 0;
alter table AHRL     add new_column number(3) default 0;
alter table AH       add new_column number(3) default 0;
Table altered.

alter table BASICOMP add new_column number(3) default 0
                        *
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL>
Table altered.

SQL>
Table altered.

SQL>
Table altered.

SQL>
Table altered.

SQL>
Table altered.

SQL>
Table altered.

SQL>
Table altered.

SQL>
Table altered.

SQL>
Table altered.

SQL> exec dbms_stats.gather_schema_stats('YU');
PL/SQL procedure successfully completed.

SQL> col segment_name for a8
SQL> select segment_name, bytes, bytes/1048576 MB, blocks from user_segments order by 2 desc;

SEGMENT_      BYTES     MB   BLOCKS  BLOCKS before add column
-------- ---------- ------- ------- --------------------------

NOCOMP    159383552    152     9728    9728
OLTPCOMP  100663296     96     6144    6144
BASICOMP   92274688     88     5632    5632
QLRL       47185920     45     2880    2880
QL         44040192     42     2688    2688
QHRL       26214400     25     1600    1600
ALRL       25165824     24     1536    1536
QH         24117248     23     1472    1472
AL         23068672     22     1408    1408
AHRL       18874368     18     1152    1152
AH         16777216     16     1024    1024

SQL> col table_name for a8
SQL> select table_name,num_rows,blocks from user_tables order by blocks desc;

TABLE_NA   NUM_ROWS    BLOCKS   BLOCKS before add column

-------- ---------- ----------   ------------------------
NOCOMP      2418496      9652     9652
OLTPCOMP    2418496      5799     5799
BASICOMP    2418496      5211     5211
QLRL        2418496      2848     2848
QL          2418496      2670     2670
QHRL        2418496      1577     1577
ALRL        2418496      1492     1492
QH          2418496      1456     1456
AL          2418496      1400     1400
AHRL        2418496      1148     1148
AH          2418496      1010     1010


No one block is added after ADD COLUMN modification!

As you can see the only issue for BASIC COMPRESSION:

alter table BASICOMP add new_column number(3) default 0
                         *
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

In other cases "alter table add column" work well: no decompression at all.
As the DOC 12.1.0.2 say:
"... the default value is stored as metadata, the column itself is not populated with data"




Wednesday, April 17, 2019

How to measure network throughput with RMAN ?

Implementing the ZDLRA into customer's environment we often need to measure network throughput. The easy way is to use RMAN. 

I assume you prepared customer database to work with ZDLRA (created wallet, installed libra.so ) and you can backup this database or its datafiles to ZDLRA without errors.
But if the backup run slow, then you need to test network between protected database and ZDLRA.

This step is to prepare the nettest.rman file:

$ cat nettest.rman
# rman target / |tee /home/oracle/rman_log/nettest_`date +%d%H%M`.log
RUN  {
ALLOCATE CHANNEL ch01 DEVICE TYPE sbt_tape PARMS 'SBT_LIBRARY=/home/oracle/zdlra/libra.so,ENV=(RA_WALLET=LOCATION=file:/home/oracle/zdlra/ CREDENTIAL_ALIAS=zdlra_ib)' FORMAT '%I_%T_%p_%u_%c';

SEND CHANNEL ch01 'NETTEST BACKUP 1024M';


Then connect to target database and run this script:

$ rman target / |tee /home/oracle/rman_log/nettest_`date +%d%H%M`.log
Recovery Manager: Release 12.2.0.1.0 - Production on Wed Apr 17 09:46:02 2019
Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.
connected to target database: BR (DBID=1302487780)

RMAN> @nettest.rman

RMAN> # rman target / |tee /home/oracle/rman_log/nettest_`date +%d%H%M`.log
2> RUN  {
3> ALLOCATE CHANNEL ch01 DEVICE TYPE sbt_tape PARMS 'SBT_LIBRARY=/home/oracle/zdlra/libra.so,ENV=(RA_WALLET=LOCATION=file:/home/oracle/zdlra/ CREDENTIAL_ALIAS=zdlra_ib)' FORMAT '%I_%T_%p_%u_%c';
4> SEND CHANNEL ch01 'NETTEST BACKUP 1024M';
5> }

using target database control file instead of recovery catalog
allocated channel: ch01
channel ch01: SID=1460 instance=br1 device type=SBT_TAPE
channel ch01: RA Library (ZDLRA) SID=86B53DD7A23715B3E053E20F1FAC745F

RMAN-06918: warning: allocated SBT channel to the Recovery Appliance in NOCATALOG mode

released channel: ch01
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of send command at 04/17/2019 09:46:17
ORA-19559: error sending device command: NETTEST BACKUP 1024M
ORA-19557: device error, device type: SBT_TAPE, device name:
ORA-27194: skgfdvcmd: sbtcommand returned error
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
   KBHS-00402: NETTEST sucessfully completed
KBHS-00400: NETTEST BACKUP: 1073741824 bytes sent in 4033428 microseconds

RMAN>
RMAN> **end-of-file**

RMAN>

Recovery Manager complete.



Don't pay attention to the errors, they doesn't matter.
Look the numbers in the line
NETTEST BACKUP: 1073741824 bytes sent in 4033428 microseconds

RMAN transferred 1g during about 4 seconds via my network .

The same way you can test the restore operation:
Syntax: NETTEST {[BACKUP | RESTORE]} [<size>M];


RMAN> @nettest.rman

RMAN> # rman target / |tee /home/oracle/rman_log/nettest_`date +%d%H%M`.log
2> RUN  {
3> ALLOCATE CHANNEL ch01 DEVICE TYPE sbt_tape PARMS 'SBT_LIBRARY=/home/oracle/zdlra/libra.so,ENV=(RA_WALLET=LOCATION=file:/home/oracle/zdlra/ CREDENTIAL_ALIAS=zdlra_ib)' FORMAT '%I_%T_%p_%u_%c';
4>
5> #SEND CHANNEL ch01 'NETTEST BACKUP 1024M';
6> SEND CHANNEL ch01 'NETTEST RESTORE 1024M';
7> }
using target database control file instead of recovery catalog
allocated channel: ch01
channel ch01: SID=127 instance=br1 device type=SBT_TAPE
channel ch01: RA Library (ZDLRA) SID=86B5F9BE6256D7B5E053E20F1FACB9D2
RMAN-06918: warning: allocated SBT channel to the Recovery Appliance in NOCATALOG mode

released channel: ch01
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of send command at 04/17/2019 10:38:47
ORA-19559: error sending device command: NETTEST RESTORE 1024M
ORA-19557: device error, device type: SBT_TAPE, device name:
ORA-27194: skgfdvcmd: sbtcommand returned error
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
   KBHS-00402: NETTEST sucessfully completed
KBHS-00401: NETTEST RESTORE: 1073741824 bytes received in 813152 microseconds



Source:
How to measure network performance from RMAN for ZDLRA or Cloud Backups (Doc ID 2371860.1)

Wednesday, April 3, 2019

Exadata X8 released

Exadata X8 released:

https://www.oracle.com/engineered-systems/exadata/database-machine-x8/

Cells of new type at $ 16500:
https://www.oracle.com/assets/exadata-pricelist-070598.pdf

NUMA on the Exadata and ORA-04031

At the very beginning when Exadata appeared at 2010 the settings of Linux OS and Oracle RDBMS were configured to use flat memory. There wer...