Saturday, February 16, 2019

ORA-15017: diskgroup ... cannot be mounted




Some days ago (preparing an Exadata Storage Snapshots  environment ) i changed the asm_diskgroup in ASM parameter file. Before change this parameter pointed diskgroups RECO and DATA:
asm_diskgroups = 'DATAC1, RECOC1'

I dropped diskgroup RECOC1, created SPARSE on the RECOC1 space and changed this parameter to the value:  
asm_diskgroups = 'DATAC1', 'SPARSE'

After a day or two i noticed the message in the ASM aler log which appeared every minute and mentioned RECO disk group:
ERROR: ALTER DISKGROUP RECOC1 MOUNT  /* asm agent *//* {1:40706:63383} */
SQL> ALTER DISKGROUP RECOC1 MOUNT  /* asm agent *//* {1:40706:63408} */
NOTE: cache registered group RECOC1 4/0xAAB48873
NOTE: cache began mount (first) of group RECOC1 4/0xAAB48873
ERROR: no read quorum in group: required 2, found 0 disks
NOTE: cache dismounting (clean) group 4/0xAAB48873 (RECOC1)
NOTE: messaging CKPT to quiesce pins Unix process pid: 213192, image: oracle@ed03dbadm01.distr.fors.ru (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: LGWR not being messaged to dismount
NOTE: cache dismounted group 4/0xAAB48873 (RECOC1)
NOTE: cache ending mount (fail) of group RECOC1 number=4 incarn=0xaab48873
NOTE: cache deleting context for group RECOC1 4/0xaab48873
GMON dismounting group 4 at 63274 for pid 42, osid 213192
ERROR: diskgroup RECOC1 was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "RECOC1" cannot be mounted
ORA-15040: diskgroup is incomplete

Very strange!
This message indicates that the list of groups is stored somewhere else in the Grid.



The solution is easy:
[grid@ed03dbadm01 ~]$ srvctl -h |grep diskgroup


Usage: srvctl start diskgroup -diskgroup <dg_name> [-node "<node_list>"]
Usage: srvctl stop diskgroup -diskgroup <dg_name> [-node "<node_list>"] [-force]
Usage: srvctl status diskgroup -diskgroup <dg_name> [-node "<node_list>"] [-detail] [-verbose]

Usage: srvctl enable diskgroup -diskgroup <dg_name> [-node "<node_list>"]
Usage: srvctl disable diskgroup -diskgroup <dg_name> [-node "<node_list>"]
Usage: srvctl remove diskgroup -diskgroup <dg_name> [-force]
Usage: srvctl predict diskgroup -diskgroup <diskgroup_name> [-verbose]

[grid@ed03dbadm01 ~]$ srvctl status diskgroup -diskgroup recoc1
Disk Group recoc1 is not running
[grid@ed03dbadm01 ~]$ srvctl remove diskgroup -diskgroup RECOC1
 


And the message no longer occur !

Friday, February 15, 2019

Exadata Storage Snapshots, Part5: Clone from Clone

(start is here)

Clone from clone


We want to make clone database KK from our clone database K.

I stopped 2nd instance of K on 2nd RAC node and have 1st instance K working because I need "create controlfile".


At the 1st node make preparation:

SQL> ALTER DATABASE BACKUP CONTROLFILE TO TRACE;

SQL> SET newpage 0
SET linesize 999
SET pagesize 0
SET feedback off
SET heading off
SET echo off
SET space 0
SET tab off
SET trimspool on
SPOOL rename.sql
SELECT 'EXECUTE dbms_dnfs.clonedb_renamefile('||''''||name||''''||','||''''||REPLACE(REPLACE(REPLACESQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> SQL> (name,'.','_'),'K','KK'),'+DATA','+SPARSE')||''''||');'
FROM v$datafile;


I prepared 3 files:

-rw-r----- 1 oracle oinstall 791 Jan 29 17:14 cf.sql
-rw-r----- 1 oracle oinstall 965 Jan 29 17:18 initkk.ora
-rw-r--r-- 1 oracle oinstall 746 Jan 29 16:57 rename.sql

$ cat initkk.ora


audit_file_dest=/u01/app/oracle/admin/kk/adump  # $ mkdir -p /u01/app/oracle/admin/kk/adump
*.cluster_database=FALSE
kk2.cluster_interconnects='192.168.13.2'
kk1.cluster_interconnects='192.168.13.1'
*.compatible='18.0.0.0.0'
*.control_files='+DATAC1/kk/CONTROLFILE/CF1.dbf'
*.db_block_size=8192
*.db_create_file_dest='+DATAC1'
*.db_files=1024
*.db_name='kk'
*.db_recovery_file_dest_size=100g
*.db_recovery_file_dest='+SPARSE'
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP)(SERVICE=kkXDB)'
*.fast_start_mttr_target=300
*.filesystemio_options='setall'
*.global_names=TRUE
kk2.instance_number=2
kk1.instance_number=1
*.local_listener='-oraagent-dummy-'
*.log_archive_dest_1='location=USE_DB_RECOVERY_FILE_DEST'
*.log_archive_format='%t_%s_%r.dbf'
*.open_cursors=1000
*.parallel_threads_per_cpu=1
*.pga_aggregate_target=6442450944
*.processes=2048
*.remote_login_passwordfile='exclusive'
*.sga_target=8192m
kk2.thread=2
kk1.thread=1
kk2.undo_tablespace='UNDOTBS2'
kk1.undo_tablespace='UNDOTBS1'

CREATE CONTROLFILE REUSE SET DATABASE kk RESETLOGS ARCHIVELOG
    MAXLOGFILES 1024
    MAXLOGMEMBERS 5
    MAXDATAFILES 32767
    MAXINSTANCES 32
    MAXLOGHISTORY 33012
LOGFILE
  GROUP 1 '+DATAC1' SIZE 4096M ,
  GROUP 2 '+DATAC1' SIZE 4096M ,
  GROUP 3 '+DATAC1' SIZE 4096M
DATAFILE
  '+SPARSE/K/DATAFILE/system_260_998393855',
  '+SPARSE/K/DATAFILE/sysaux_261_998393857',
  '+SPARSE/K/DATAFILE/undotbs1_262_998393859',
  '+SPARSE/K/DATAFILE/undotbs2_264_998393867',
  '+SPARSE/K/DATAFILE/users_265_998393867',
  '+SPARSE/K/DATAFILE/tpch_291_998477197'

CHARACTER SET AL32UTF8 ;


$ cat rename_files.sql
EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855');
EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/sysaux_261_998393857','+SPARSE/KK/DATAFILE/sysaux_261_998393857');
EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/undotbs1_262_998393859','+SPARSE/KK/DATAFILE/undotbs1_262_998393859');
EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/undotbs2_264_998393867','+SPARSE/KK/DATAFILE/undotbs2_264_998393867');
EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/users_265_998393867','+SPARSE/KK/DATAFILE/users_265_998393867');
EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/tpch_291_998477197','+SPARSE/KK/DATAFILE/tpch_291_998477197');


Look at K datafiles :

[grid@ed03dbadm01 ~]$ asmcmd
ASMCMD> cd spa*/k/da*
ASMCMD> ls -l
Type      Redund  Striped  Time             Sys  Name
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  N    sysaux_261_998393857 => +SPARSE/K/DATAFILE/sysaux_261_998393857.264.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  Y    sysaux_261_998393857.264.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  N    system_260_998393855 => +SPARSE/K/DATAFILE/system_260_998393855.265.998749747
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  Y    system_260_998393855.265.998749747
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  N    tpch_291_998477197 => +SPARSE/K/DATAFILE/tpch_291_998477197.274.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  Y    tpch_291_998477197.274.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  N    undotbs1_262_998393859 => +SPARSE/K/DATAFILE/undotbs1_262_998393859.263.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  Y    undotbs1_262_998393859.263.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  N    undotbs2_264_998393867 => +SPARSE/K/DATAFILE/undotbs2_264_998393867.262.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  Y    undotbs2_264_998393867.262.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  N    users_265_998393867 => +SPARSE/K/DATAFILE/users_265_998393867.281.998749765
DATAFILE  MIRROR  COARSE   JAN 29 16:00:00  Y    users_265_998393867.281.99874976


I intentionally don't shutdown the source database K to demonstrate the errors.

Let's make the clone:

[oracle@ed03dbadm01 klone_from_klone]$ sqlplus / as sysdba

Connected to an idle instance.
SQL> startup nomount pfile='initkk.ora'
ORACLE instance started.

Total System Global Area 8589933416 bytes
Fixed Size           12217192 bytes
Variable Size         2264924160 bytes
Database Buffers     6241124352 bytes
Redo Buffers           71667712 bytes

SQL> @cf

Control file created.
          
SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855');
BEGIN dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855'); END;

*
ERROR at line 1:
ORA-17515: Creation of clonedb failed using snapshot file
+SPARSE/K/DATAFILE/system_260_998393855
ORA-17515: Creation of clonedb failed using snapshot file
+SPARSE/K/DATAFILE/system_260_998393855
ORA-15304: operation requires ACCESS_CONTROL.ENABLED attribute to be TRUE
ORA-06512: at "SYS.X$DBMS_DNFS", line 10
ORA-06512: at line 1

Our mistake: SPARSE ASM diskgoup hasn't set 'ACCESS_CONTROL.ENABLED' = 'TRUE'

[grid]$ sqlplus / as sysasm

SQL> ALTER DISKGROUP SPARSE SET ATTRIBUTE 'ACCESS_CONTROL.ENABLED' = 'TRUE';

Diskgroup altered.


Repeat rename.sql at KK database:

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855');
BEGIN dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855'); END;

*
ERROR at line 1:
ORA-17515: Creation of clonedb failed using snapshot file
+SPARSE/K/DATAFILE/system_260_998393855
ORA-17515: Creation of clonedb failed using snapshot file
+SPARSE/K/DATAFILE/system_260_998393855
ORA-06512: at "SYS.X$DBMS_DNFS", line 10
ORA-06512: at line 1

The source database is not shutdowned clearly.
Oracle, please make this message more informative, for example "source DB is not in Read Only state" ;) .

Connect to K and shutdown it:

[oracle@ed03dbadm01 ~]$ . k
[oracle@ed03dbadm01 ~]$ sqlplus / as sysdba
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.


Back to KK:

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855');
BEGIN dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855'); END;

*
ERROR at line 1:
ORA-17515: Creation of clonedb failed using snapshot file
+SPARSE/K/DATAFILE/system_260_998393855
ORA-17515: Creation of clonedb failed using snapshot file
+SPARSE/K/DATAFILE/system_260_998393855
ORA-15260: permission denied on ASM disk group
ORA-06512: at "SYS.X$DBMS_DNFS", line 10
ORA-06512: at line 1

Permission denied !


At ASM:
$ sqlplus / as sysasm
ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/system_260_998393855.265.998749747';  
ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/sysaux_261_998393857';  
ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/undotbs1_262_998393859';
ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/undotbs2_264_998393867';
ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/users_265_998393867';
ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/tpch_291_998477197';    


SQL> ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/system_260_998393855.265.998749747';  
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/sysaux_261_998393857';  
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/undotbs1_262_998393859';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/undotbs2_264_998393867';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/users_265_998393867';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read ONLY, group=read ONLY, other=none for file '+SPARSE/K/DATAFILE/tpch_291_998477197';    
Diskgroup altered.



And obtain error one more time:

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855');
BEGIN dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855'); END;

*
ERROR at line 1:
ORA-01565: error in identifying file '+SPARSE/K/DATAFILE/system_260_998393855'
ORA-17503: ksfdopn:2 Failed to open file
+SPARSE/K/DATAFILE/system_260_998393855
ORA-15260: permission denied on ASM disk group
ORA-06512: at "SYS.X$DBMS_DNFS", line 10
ORA-06512: at line 1


Look at ASM file permissions :

[grid@ed03dbadm01 ~]$ asmcmd ls --permission +SPARSE/K/DATAFILE
User  Group  Permission  Name
              r--r-----  sysaux_261_998393857 => +SPARSE/K/DATAFILE/sysaux_261_998393857.264.998749765
              r--r-----  sysaux_261_998393857.264.998749765
              r--r-----  system_260_998393855 => +SPARSE/K/DATAFILE/system_260_998393855.265.998749747
              r--r-----  system_260_998393855.265.998749747
              r--r-----  tpch_291_998477197 => +SPARSE/K/DATAFILE/tpch_291_998477197.274.998749765
              r--r-----  tpch_291_998477197.274.998749765
              r--r-----  undotbs1_262_998393859 => +SPARSE/K/DATAFILE/undotbs1_262_998393859.263.998749765
              r--r-----  undotbs1_262_998393859.263.998749765
              r--r-----  undotbs2_264_998393867 => +SPARSE/K/DATAFILE/undotbs2_264_998393867.262.998749765
              r--r-----  undotbs2_264_998393867.262.998749765
              r--r-----  users_265_998393867 => +SPARSE/K/DATAFILE/users_265_998393867.281.998749765
              r--r-----  users_265_998393867.281.998749765

See there is no User & Group permissions on the K database.
The solution:

ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/system_260_998393855';
ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/sysaux_261_998393857';
ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/undotbs1_262_998393859';
ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/undotbs2_264_998393867';
ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/users_265_998393867';
ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/tpch_291_998477197';


SQL> ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/system_260_998393855';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/sysaux_261_998393857';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/undotbs1_262_998393859';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/undotbs2_264_998393867';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/users_265_998393867';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/tpch_291_998477197';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE SET OWNERSHIP OWNER='oracle' FOR FILE '+SPARSE/K/DATAFILE/tpch_291_998477197';
Diskgroup altered.

Now we see the User appeared for ASM files:

[grid@ed03dbadm01 ~]$ asmcmd ls --permission +SPARSE/K/DATAFILE
User                                     Group  Permission  Name
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  sysaux_261_998393857 => +SPARSE/K/DATAFILE/sysaux_261_998393857.264.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  sysaux_261_998393857.264.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  system_260_998393855 => +SPARSE/K/DATAFILE/system_260_998393855.265.998749747
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  system_260_998393855.265.998749747
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  tpch_291_998477197 => +SPARSE/K/DATAFILE/tpch_291_998477197.274.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  tpch_291_998477197.274.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  undotbs1_262_998393859 => +SPARSE/K/DATAFILE/undotbs1_262_998393859.263.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  undotbs1_262_998393859.263.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  undotbs2_264_998393867 => +SPARSE/K/DATAFILE/undotbs2_264_998393867.262.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  undotbs2_264_998393867.262.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  users_265_998393867 => +SPARSE/K/DATAFILE/users_265_998393867.281.998749765
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  users_265_998393867.281.998749765


And our rename command now work well:

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/system_260_998393855','+SPARSE/KK/DATAFILE/system_260_998393855');
PL/SQL procedure successfully completed.

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/sysaux_261_998393857','+SPARSE/KK/DATAFILE/sysaux_261_998393857');
PL/SQL procedure successfully completed.

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/undotbs1_262_998393859','+SPARSE/KK/DATAFILE/undotbs1_262_998393859');
PL/SQL procedure successfully completed.

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/undotbs2_264_998393867','+SPARSE/KK/DATAFILE/undotbs2_264_998393867');
PL/SQL procedure successfully completed.

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/users_265_998393867','+SPARSE/KK/DATAFILE/users_265_998393867');
PL/SQL procedure successfully completed.

SQL> EXECUTE dbms_dnfs.clonedb_renamefile('+SPARSE/K/DATAFILE/tpch_291_998477197','+SPARSE/KK/DATAFILE/tpch_291_998477197');
PL/SQL procedure successfully completed.



SQL> alter database open resetlogs;
*
ERROR at line 1:
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '+SPARSE/KK/DATAFILE/system_260_998393855'

This error appeared because KK's controlfile is newer than K's datafiles.
We shutdowned K after controlfile for KK was created.
Solution:

SQL> alter database recover until cancel using backup controlfile;
alter database recover until cancel using backup controlfile
*
ERROR at line 1:
ORA-00279: change 2543440 generated at 01/29/2019 17:30:18 needed for thread 1
ORA-00289: suggestion : +DATAC1
ORA-15173: entry 'ARCHIVELOG' does not exist in directory 'KK'
ORA-00280: change 2543440 for thread 1 is in sequence #2


SQL> alter database recover cancel;

Database altered.

SQL> alter database open resetlogs;

Database altered.


The KK (clone from clone) is created !

Let see the V$CLONEDFILE (aka x$ksfdsscloneinfo)

SQL> col num for 999
SQL> col parent for a42
SQL> col child for a42
SQL> SELECT filenumber num, snapshotfilename parent, clonefilename child FROM x$ksfdsscloneinfo;

 NUM PARENT                    CHILD
---- ------------------------------------------ ------------------------------------------
   1 +SPARSE/K/DATAFILE/system_260_998393855    +SPARSE/KK/DATAFILE/system_260_998393855
   2 +SPARSE/K/DATAFILE/sysaux_261_998393857    +SPARSE/KK/DATAFILE/sysaux_261_998393857
   3 +SPARSE/K/DATAFILE/undotbs1_262_998393859    +SPARSE/KK/DATAFILE/undotbs1_262_998393859
   4 +SPARSE/K/DATAFILE/undotbs2_264_998393867    +SPARSE/KK/DATAFILE/undotbs2_264_998393867
   5 +SPARSE/K/DATAFILE/users_265_998393867    +SPARSE/KK/DATAFILE/users_265_998393867
   6 +SPARSE/K/DATAFILE/tpch_291_998477197    +SPARSE/KK/DATAFILE/tpch_291_998477197

6 rows selected.

At the end of process we have the test master database "TM".
From TM we created 1st level clone database "K" - child of TM.
From 1st level clone K we created 2nd level clone KK - child database of K.

Now we have running KK and stopped K and TM databases.
What is we open READ-WRITE the K ?


[oracle@ed03dbadm01 ~]$ srvctl start db -d k
PRCR-1079 : Failed to start resource ora.k.db
CRS-5017: The resource action "ora.k.db start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 379727
Session ID: 1167 Serial number: 15923
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/ed03dbadm01/crs/trace/crsd_oraagent_oracle.trc".

CRS-2674: Start of 'ora.k.db' on 'ed03dbadm01' failed
CRS-2632: There are no more servers to try to place resource 'ora.k.db' on that would satisfy its placement policy
CRS-5017: The resource action "ora.k.db start" encountered the following error:
ORA-03113: end-of-file on communication channel
Process ID: 241835
Session ID: 1070 Serial number: 36687
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/ed03dbadm02/crs/trace/crsd_oraagent_oracle.trc".

CRS-2674: Start of 'ora.k.db' on 'ed03dbadm02' failed


Look at K's alert.log:




ALTER DATABASE MOUNT
...
ALTER DATABASE OPEN
...
Errors in file /u01/app/oracle/diag/rdbms/k/k1/trace/k1_lgwr_379662.trc:
ORA-01110: data file 2: '+SPARSE/K/DATAFILE/sysaux_261_998393857'
ORA-01114: IO error writing block to file 2 (block # 1)
ORA-27009: cannot write to file opened for read

Errors in file /u01/app/oracle/diag/rdbms/k/k1/trace/k1_lgwr_379662.trc:
ORA-01110: data file 1: '+SPARSE/K/DATAFILE/system_260_998393855'
ORA-01114: IO error writing block to file 1 (block # 1)
ORA-27009: cannot write to file opened for read

Errors in file /u01/app/oracle/diag/rdbms/k/k1/trace/k1_lgwr_379662.trc:
ORA-01114: IO error writing block to file 3 (block # 1)
ORA-27009: cannot write to file opened for read
ORA-01114: IO error writing block to file 4 (block # 1)
ORA-27009: cannot write to file opened for read
ORA-17528: A read-only file or a file opened read-only cannot be written to: +SPARSE/K/DATAFILE/system_260_998393855.

Errors in file /u01/app/oracle/diag/rdbms/k/k1/trace/k1_ora_379727.trc:
ORA-01114: IO error writing block to file 1 (block # )
2019-01-30T12:56:21.435186+03:00
...
Dumping diagnostic data in directory=[cdmp_20190130125621], requested by (instance=1, osid=379727), summary=[abnormal instance termination].Instance terminated by USER, pid = 379727


ASM files are READ-ONLY at ASM leve, protection is working!

Set datafiles RW:

SQL> ALTER DISKGROUP SPARSE set permission owner=read write, group=read write, other=none for file '+SPARSE/K/DATAFILE/system_260_998393855.265.998749747';  
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read write, group=read write, other=none for file '+SPARSE/K/DATAFILE/sysaux_261_998393857';  
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read write, group=read write, other=none for file '+SPARSE/K/DATAFILE/undotbs1_262_998393859';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read write, group=read write, other=none for file '+SPARSE/K/DATAFILE/undotbs2_264_998393867';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read write, group=read write, other=none for file '+SPARSE/K/DATAFILE/users_265_998393867';
Diskgroup altered.

SQL> ALTER DISKGROUP SPARSE set permission owner=read write, group=read write, other=none for file '+SPARSE/K/DATAFILE/tpch_291_998477197';    
Diskgroup altered.


Then we start database K:

SQL> startup
ORACLE instance started.

Total System Global Area 8589933416 bytes
Fixed Size           12217192 bytes
Variable Size         2415919104 bytes
Database Buffers     6090129408 bytes
Redo Buffers           71667712 bytes
Database mounted.
Database opened.


How KK feels itcelf ?

At the moment there is no any messages in alert log. So, let's restart KK:
SQL> shutdown immediate

ORA-03113: end-of-file on communication channel
Process ID: 335855
Session ID: 390 Serial number: 8981

Alert_KK.log

Shutting down instance (immediate) (OS id: 335855)

Stopping background process SMCO

Shutting down instance: further logons disabled

Stopping background process CJQ0
Stopping background process MMNL

Stopping background process MMON
License high water mark = 34
OS process OFSD (ospid 211761) idle for 30 seconds, exiting
Dispatchers and shared servers shutdown
ALTER DATABASE CLOSE NORMAL
Stopping Emon pool

IM on ADG: Start of Empty Journal

IM on ADG: End of Empty Journal
Stopping Emon pool
stopping change tracking

Errors in file /u01/app/oracle/diag/rdbms/kk/kk1/trace/kk1_lgwr_211799.trc:
ORA-01243: system tablespace file suffered media failure
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '+SPARSE/KK/DATAFILE/system_260_998393855'
ORA-17532: snapshot (parent)=+SPARSE/K/DATAFILE/system_260_998393855 checkpoint SCN 2543451 is different from snapshot checkpoint SCN 2543440 recorded in file +SPARSE/KK/DATAFILE/system_260_998393855
ORA-17531: snapshot (parent)=+SPARSE/K/DATAFILE/system_260_998393855 modification time 01/30/2019 13:16:09 is different from snapshot creation time 01/29/2019 17:30:18 recorded in file +SPARSE/KK/DATAFILE/system_260_998393855
ORA-17530: snapshot (parent), +SPARSE/K/DATAFILE/system_260_998393855, modified after clone (child) created, +SPARSE/KK/DATAFILE/system_260_998393855


Errors in file /u01/app/oracle/diag/rdbms/kk/kk1/trace/kk1_lgwr_211799.trc:
ORA-01243: system tablespace file suffered media failure
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '+SPARSE/KK/DATAFILE/system_260_998393855'
ORA-17532: snapshot (parent)=+SPARSE/K/DATAFILE/system_260_998393855 checkpoint SCN 2543451 is different from snapshot checkpoint SCN 2543440 recorded in file +SPARSE/KK/DATAFILE/system_260_998393855
ORA-17531: snapshot (parent)=+SPARSE/K/DATAFILE/system_260_998393855 modification time 01/30/2019 13:16:09 is different from snapshot creation time 01/29/2019 17:30:18 recorded in file +SPARSE/KK/DATAFILE/system_260_998393855
ORA-17530: snapshot (parent), +SPARSE/K/DATAFILE/system_260_998393855, modified after clone (child) created, +SPARSE/KK/DATAFILE/system_260_998393855
Errors in file /u01/app/oracle/diag/rdbms/kk/kk1/trace/kk1_lgwr_211799.trc  (incident=249):
ORA-1243 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u01/app/oracle/diag/rdbms/kk/kk1/incident/incdir_249/kk1_lgwr_211799_i249.trc

USER (ospid: 211799): terminating the instance due to ORA error 1243
2019-01-30T13:19:25.122213+03:00
System state dump requested by (instance=1, osid=211799 (LGWR)), summary=[abnormal instance termination]. error - 'Instance is terminating.'
System State dumped to trace file /u01/app/oracle/diag/rdbms/kk/kk1/trace/kk1_diag_211759_20190130131925.trc
2019-01-30T13:19:25.369428+03:00
Dumping diagnostic data in directory=[cdmp_20190130131925], requested by (instance=1, osid=211799 (LGWR)), summary=[abnormal instance termination].
2019-01-30T13:19:26.161385+03:00
DIA0 (ospid: 211780): terminating the instance due to ORA error 1092
Cause - 'Instance is terminating.'

Instance terminated by DIA0, pid = 211780


Ater we open RW the K datapase (the parent of KK), then KK was crashed.




Sunday, February 10, 2019

Exadata Storage Snapshots, Part 4, Technical details

(start is here)

Create Clone database

Before we create the clone database we need prepare some stuff ("create controlfile" command, ini.ora, a set of rename datafile commands) :


Prepare cf.sql, on the TM: ALTER DATABASE BACKUP CONTROLFILE TO TRACE  + edit trace file

Prepare init$KLON.ora

Prepare rename.sql:
                               SELECT
'EXECUTE dbms_dnfs.clonedb_renamefile(
                               '||''''||
name||''''||','||''''||REPLACE(REPLACE(REPLACE(name,'.','_'),
                               '
TM',‘KLON'),'+DATA','+SPARSE')||''''||');' from v$datafiles;

The stuff related to Test Master database is colored by red.
The stuff related to new clone database is colored by green.

 

Clone database have two parts: private - colored by green and shared between clones - colored by red :

Clone’s private files
parameter file (spfile or pfile), passwordfile
Control file
Redo log files
Temporary tablespace files
TM’s datafiles                                  

 

1. Prepare controlfile



At 1st step you need to connect to source database and prepare text file with "create controlfile" command. Then you need to edit this file and change database name to new.



SQL> ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
SQL> SELECT value FROM v$diag_info WHERE name = 'Default Trace File';


This is example of out file (K means klone, new database name is k):

$ cat cf.sql

CREATE CONTROLFILE REUSE SET DATABASE k RESETLOGS ARCHIVELOG
    MAXLOGFILES 1024
    MAXLOGMEMBERS 5
    MAXDATAFILES 32767
    MAXINSTANCES 32
    MAXLOGHISTORY 33012

LOGFILE
  GROUP 1 '+DATAC1/k/ONLINELOG/thread_1_group_1.dbf' SIZE 4096M ,
  GROUP 2 '+DATAC1/k/ONLINELOG/thread_1_group_2.dbf' SIZE 4096M
    …

DATAFILE
  '+DATAC1/TM/DATAFILE/system.260.998393855',
  '+DATAC1/TM/DATAFILE/sysaux.261.998393857',
  '+DATAC1/TM/DATAFILE/undotbs1.262.998393859',
  '+DATAC1/TM/DATAFILE/undotbs2.264.998393867',
  '+DATAC1/TM/DATAFILE/users.265.998393867',
  '+DATAC1/TM/DATAFILE/tpch.291.998477197'

2. Prepare init$KLON.ora

Then we need to prepare new parameter file for K database. The best way is to take parameter file from source DB TM and change some parameters. Because my source database is RAC database I need to set cluster_database=FALSE (create controlfile command produce the error).
Also we need change path for CONTROL_FILES and set the memory for new instance SGA & PGA size:


*.cluster_database=FALSE

*.compatible='18.0.0.0.0'

*.control_files='+DATAC1/k/CONTROLFILE/CF1.dbf'

*.db_block_size=8192

*.db_create_file_dest='+SPARSE'

*.db_files=1024

*.db_name='k'

*.db_recovery_file_dest_size=48g

*.db_recovery_file_dest='+DATAC1'

*.diagnostic_dest='/u01/app/oracle'

*.dispatchers='(PROTOCOL=TCP) (SERVICE=db1db1XDB)'

k2.instance_number=2

k1.instance_number=1

*.local_listener='-oraagent-dummy-'

*.log_archive_dest_1='location=USE_DB_RECOVERY_FILE_DEST'

*.log_archive_format='%t_%s_%r.dbf'

*.pga_aggregate_target=6g

*.processes=2048

*.remote_login_passwordfile='exclusive'

*.sga_target=8192m

k2.thread=2

k1.thread=1

k2.undo_tablespace='UNDOTBS2'

k1.undo_tablespace='UNDOTBS1'


DB_CREATE_FILE_DEST was set to +SPARSE in order to allow for K database create new tablespaces in SPARSE diskgroup. Sparse diskgroup can store sparse and non-sparce datafiles and databases. You can use any diskgroup for DB_CREATE_FILE_DEST.


3. Prepare a rename.sql file

Run the query below to generate a set of rename commands and store them into file:

SET newpage 0
SET linesize 999
SET pagesize 0
SET feedback off
SET heading off
SET echo off
SET space 0
SET tab off
SET trimspool on
SPOOL rename.sql
SELECT 'EXECUTE dbms_dnfs.clonedb_renamefile -
('||''''||name||''''||','||''''||REPLACE(REPLACE(REPLACE(name,'.','_'),-
'TM','K'),'+DATAC1','+SPARSE')||''''||');' FROM v$datafile;
My output is:


EXECUTE dbms_dnfs.clonedb_renamefile('+DATAC1/TM/DATAFILE/system.260.998393855','+SPARSE/K/DATAFILE/system_260_998393855');
EXECUTE dbms_dnfs.clonedb_renamefile('+DATAC1/TM/DATAFILE/sysaux.261.998393857','+SPARSE/K/DATAFILE/sysaux_261_998393857');
EXECUTE dbms_dnfs.clonedb_renamefile('+DATAC1/TM/DATAFILE/undotbs1.262.998393859','+SPARSE/K/DATAFILE/undotbs1_262_998393859');
EXECUTE dbms_dnfs.clonedb_renamefile('+DATAC1/TM/DATAFILE/undotbs2.264.998393867','+SPARSE/K/DATAFILE/undotbs2_264_998393867');
EXECUTE dbms_dnfs.clonedb_renamefile('+DATAC1/TM/DATAFILE/users.265.998393867','+SPARSE/K/DATAFILE/users_265_998393867');
EXECUTE dbms_dnfs.clonedb_renamefile('+DATAC1/TM/DATAFILE/tpch.291.998477197','+SPARSE/K/DATAFILE/tpch_291_998477197');


The "dbms_dnfs.clonedb_renamefile ..." is completely different then "alter database rename file". There are 2 different things.

4.  Prepare ASM

Oracle recommend to set READ ONLY permissions at the source datafiles before make clone (to protect source datafiles). Let demonstrate how to do it. The command is

ALTER DISKGROUP DATA set permission 
owner=read ONLY, 
group=read ONLY, 
other=none for file 'all source data files'

For example:


SQL> ALTER DISKGROUP DATAC1 set permission owner=read ONLY, group=read ONLY, other=none for file '+DATAC1/TM/DATAFILE/users.265.998393867'

*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15304: operation requires ACCESS_CONTROL.ENABLED attribute to be TRUE

This means i need to set ACCESS_CONTROL.ENABLED'='TRUE' on the diskgroup before doing "read only" :

SQL> ALTER DISKGROUP DATAC1 SET ATTRIBUTE 'ACCESS_CONTROL.ENABLED' = 'TRUE';
Diskgroup altered.


Now we're allowed to set permissions:


SQL> ALTER DISKGROUP DATAC1 set permission owner=read ONLY, group=read ONLY, other=none for file '+DATAC1/TM/DATAFILE/system.260.998393855';
Diskgroup altered.


To revert permissions back:


ALTER DISKGROUP DATAC1 set permission owner=read write, group=read write, other=none for file '+DATAC1/TM/DATAFILE/system.260.998393855';

ALTER DISKGROUP DATAC1 SET ATTRIBUTE 'ACCESS_CONTROL.ENABLED' = 'FALSE';


In my Exadata GI is under "grid" OS account and RDBMS is under "oracle" OS account.
Datafiles in ASM diskgroup have no any owner by default and this bring us to error later at rename.sql phase:

SQL> EXECUTE dbms_dnfs.clonedb_renamefile(...);

*

ERROR at line 1:

ORA-01565: error in identifying file '+DATAC1/TM/DATAFILE/system.260.998393855'

ORA-17503: ksfdopn:2 Failed to open file

+DATAC1/TM/DATAFILE/system.260.998393855

ORA-15260: permission denied on ASM disk group

ORA-06512: at "SYS.X$DBMS_DNFS", line 10

ORA-06512: at line 1

So, we need to set the permissions. Look what we have now :

[grid@ed03dbadm01 ~]$ asmcmd
ASMCMD> ls --permission +DATAC1/TM/DATAFILE/system.260.998393855
User  Group  Permission  Name
              r--r-----  system.260.998393855

 Empty User and Group.

Let we add "oracle" user:

SQL> ALTER DISKGROUP DATAC1 ADD USER 'oracle';
ALTER DISKGROUP DATAC1 ADD USER 'oracle'
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15261: user 'oracle' already exists in disk group 'DATAC1'

This message shows that "oracle" account is well known to ASM.
So, let do "alter diskgroup set ownership" for all datafiles:



ALTER DISKGROUP DATAC1 SET OWNERSHIP OWNER='oracle' FOR FILE '+DATAC1/DB1DB1/DATAFILE/system.260.998393855';



I demonstrated one datafile, but you need apply "set ownership" for all source datafiles.
You'll obtain something like that:


[grid@ed03dbadm01 ~]$ asmcmd
ASMCMD> ls --permission +DATAC1/DB1DB1/DATAFILE/*
User                                     Group  Permission  Name
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  SYSAUX.261.998393857
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  SYSTEM.260.998393855
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  TPCH.291.998477197
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  UNDOTBS1.262.998393859
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  UNDOTBS2.264.998393867
oracle@697ac408d88aff8bbfa7863a8fbb6e8d          r--r-----  USERS.265.998393867

 


 You may want to create clone under another OS account, not an "oracle". In this case you need to add this new account to disk group owners.






5. Shutdown the Test Master


Previous actions don't require downtime, but now we need to shutdown our Test Master database.
The obvious recommendation "shutdown database clearly" with "shutdown immediate".
Not "shutdown abort" and fuzzy datafiles - this will bring you to errors later!



We forgot shutdown the 2nd instance of TM at the 2nd RAC node and obtained the error later at rename phase:



EXEC dbms_dnfs.clonedb_renamefile('+DATAC1/TM/DATAFILE/system.260.998393855',

'+SPARSE/K/DATAFILE/system_260_998393855');

ERROR at line 1:
ORA-17515: Creation of clonedb failed using snapshot file
+DATAC1/TM/DATAFILE/system.260.998393855
ORA-17515: Creation of clonedb failed using snapshot file
+DATAC1/TM/DATAFILE/system.260.998393855
ORA-06512: at "SYS.X$DBMS_DNFS", line 10
ORA-06512: at line 1




Is you got similar message check the datafile's headers. These files have same SCN, so they good for cloning:


FHFNO HXIFZ HXNRCV HXSTS         FHSCN       FHAFS

----- ----- ------ ------- ------------ ----------

    1     0      0 2368471      2368471          0

    2     0      0 2368471      2368471          0

    3     0      0 2368471      2368471          0

    4     0      0 2368471      2368471          0

    5     0      0 2368471      2368471          0

    6     0      0 2368471      2368471          0




6. Cloning

Now you'd run some commands, prepared at previous steps:

 

SQL> startup nomount pfile=init$K.ora
ORACLE instance started.

SQL> @cf.sql
Control file created.

SQL> @rename.sql

    File renamed.

SQL> alter database open resetlogs;

    Database altered.

ALTER TABLESPACE temp ADD TEMPFILE  '+…'  size …g autoextend on next 1024m;

 

 

 

 

 

 

 



Does DEALLOCATE UNUSED or SHRINK SPACE will free space occupied by LOB segment?

Lets check how it works. My env is DB 19.20@Linux-x64 1) I created the table with 4 LOB columns of 4 different LOB types: BASICFILE BLOB, BA...