Wednesday, 19 June 2013

Copy control file from Primary to Stanby DB

DataGuard - Copy control file from Primary to Standby DB

Steps to do the control file copy to standby database. There are number of times you need to do the same in DG environment.

1. Create the standby control file from Primary database.
      
          ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/test_stdby.ctl';

2. Shutdown the the standby database.
3. Transfer the file from Primary to the standby server, copy to the control file locations.
4. startup mount  (standby)
5. switch on the flashback if it was on before you copy the controlfile.

         ALTER DATABASE FLASHBACK ON;

Note: Flashback is required for FSFO configuration.

Now the above seems pretty straight forward - but sometimes you may see dataguard configuration is broken after the controlfile copy from Primary to standby, here is the case -

DGMGRL> show configuration verbose

Configuration - FSF

  Protection Mode: MaxAvailability
  Databases:
    test_a - Primary database
      Error: ORA-16825: multiple errors or warnings, including fast-start failover-related errors or warnings, detected for the database

    test_b - (*) Physical standby database
      Warning: ORA-16817: unsynchronized fast-start failover configuration

  (*) Fast-Start Failover target


Cause:
REAL TIME APPLY is not active as standby redologs are created after the controlfile copy from the primary and once you have copied the old control file from primary it doesn't have the info regarding the new standby redo logs.

Fix:
Drop and recreate the standby redologs on standby database.

1. check the status if any standby redolog is active, if none - that means RFS is not using standby redo logs, lets recreate the same.

SQL> select GROUP#,THREAD#,SEQUENCE#,BYTES/1024/1024/1024 ,status from v$standby_log;

    GROUP#    THREAD#  SEQUENCE# BYTES/1024/1024/1024 STATUS
---------- ---------- ---------- -------------------- ----------
        21          1          0                    1 UNASSIGNED
        22          1          0                    1 UNASSIGNED
        23          0          0                    1 UNASSIGNED
        24          0          0                    1 UNASSIGNED
        25          0          0                    1 UNASSIGNED
        26          0          0                    1 UNASSIGNED
        27          0          0                    1 UNASSIGNED
        28          0          0                    1 UNASSIGNED
        29          0          0                    1 UNASSIGNED

1. stop the MRP in standby database -

alter database recover managed standby database cancel;

2. Drop the standby redo logs.

ALTER DATABASE DROP STANDBY LOGFILE GROUP 21;
ALTER DATABASE DROP STANDBY LOGFILE GROUP 22;
ALTER DATABASE DROP STANDBY LOGFILE GROUP 23;
ALTER DATABASE DROP STANDBY LOGFILE GROUP 24;
ALTER DATABASE DROP STANDBY LOGFILE GROUP 25;
ALTER DATABASE DROP STANDBY LOGFILE GROUP 26;
ALTER DATABASE DROP STANDBY LOGFILE GROUP 27;
ALTER DATABASE DROP STANDBY LOGFILE GROUP 28;
ALTER DATABASE DROP STANDBY LOGFILE GROUP 29;

3. Recreate all with reuse.

alter database add standby logfile group 21 '/redo-01/databases/test/stby-g21-m1.log'  size 1073741824 reuse;
alter database add standby logfile group 22 '/redo-01/databases/test/stby-g22-m1.log'  size 1073741824 reuse;
alter database add standby logfile group 23 '/redo-01/databases/test/stby-g23-m1.log'  size 1073741824 reuse;
alter database add standby logfile group 24 '/redo-01/databases/test/stby-g24-m1.log'  size 1073741824 reuse;
alter database add standby logfile group 25 '/redo-01/databases/test/stby-g25-m1.log'  size 1073741824 reuse;
alter database add standby logfile group 26 '/redo-01/databases/test/stby-g26-m1.log'  size 1073741824 reuse;
alter database add standby logfile group 27 '/redo-01/databases/test/stby-g27-m1.log'  size 1073741824 reuse;
alter database add standby logfile group 28 '/redo-01/databases/test/stby-g28-m1.log'  size 1073741824 reuse;
alter database add standby logfile group 29 '/redo-01/databases/test/stby-g29-m1.log'  size 1073741824 reuse;

4. Done. check the dgmgrl status.

DGMGRL> show configuration verbose

Configuration - FSF

  Protection Mode: MaxAvailability
  Databases:
    test_a - Primary database
    test_b - (*) Physical standby database

  (*) Fast-Start Failover target



Tuesday, 18 June 2013

ORA-16664: unable to receive the result from a database


Error Code: ORA-16664: unable to receive the result from a database


Case: You can see the redo shipping is working fine from Primary to standby DB. you can notice this error when you try to enable the DG broker.

Test the following:

1. connection to standby from Primary server - should be good if the redo shipping is fine
2. connection from standby to Primary server

This could be the cause - As during normal redo shipping we don't need the vice versa connection from standby to Primary but when you enable the DG Broker it needs that for the auto switchover, fail over.

From standby server, try to connect to the Primary DB

sqlplus sys@dg1_a as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Wed Jun 19 05:46:02 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Enter password:
ERROR:
ORA-12154: TNS:could not resolve the connect identifier specified

In My case  the issue is because I changed the tnsnames port to replicate the issue. 
From the standby drc*log also you can make out the same -

drc*.log location - same as alert log.

$ tail -f drc$ORACLE_SID.log
06/19/2013 05:37:54
Failed to send message to site dg1_a. Error code is ORA-16501.
06/19/2013 05:39:09
Failed to send message to site dg1_a. Error code is ORA-16501.
06/19/2013 05:40:24
Failed to send message to site dg1_a. Error code is ORA-16501.
06/19/2013 05:41:39
Failed to send message to site dg1_a. Error code is ORA-16501.
06/19/2013 05:42:44

Fix:

Check the tnsnames in the standby server and make it similar to the Primary server. Make sure you have right ports in tnsnames which is used in listener.ora and local_listener parameter.


Some other causes for the same issue:

The logon trigger was preventing the connection from the standby database during a switchover.
Fix: disable the logon trigger in primary database.  (Metalink id 1530881.1)

Check the non default port used in connect string of both primary and standby(tnsnames.ora). Metalink id 1228797.1


 






Saturday, 20 October 2012

Rebuilding New Standby Oracle 11g DataGuard

Enable FSFO/TestCases 


Current Configuration


Server Name                        Database Name        DB_unique_Name
dg1.localdomain.com            DG1                           DG1_A  --Primary   
dg2.localdomain.com            DG1                           DG1_B  --Standby
dg3.localdomain.com            DG1                           DG1_C  --To be Configured   

For the DG setup, check the following links -

ORACLE11g-dataguard-unfolded
How-to-recreate-standby-redologs

What are we upto?

We are rebuilding the new standby database from the current Primary Standby which is FSFO enabled. This can be used in case of HOST migration.

Process

Let quickly check that we have a proper DG setup already-

select  SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME

 SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
        41 YES 11-OCT-2012 23:10:33
        41 NO  11-OCT-2012 23:10:33
        42 YES 11-OCT-2012 23:51:13
        42 NO  11-OCT-2012 23:51:13

select  SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME       
       
        40 NO  11-OCT-2012 23:10:31
        41 NO  11-OCT-2012 23:10:33
        42 NO  11-OCT-2012 23:51:13


Switch the log file on primary and verify the redo transport -

SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

select  SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME

 SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
        43 YES 12-OCT-2012 00:25:39
        43 NO  12-OCT-2012 00:25:40
        44 YES 12-OCT-2012 00:25:41
        44 NO  12-OCT-2012 00:25:41

From Standby        

select  SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME

 SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
        43 NO  12-OCT-2012 00:25:39
        44 NO  12-OCT-2012 00:25:41

 
Looks perfect so lets get into creating a new standby Database.
I am doing the password less ssh configuration between servers which is not mandatory for the setup but always beneficial for obvious reasons. So do that (easy process)

Passwordless Ssh configuration-


Generate the RSA keys in the new machine which is dg3.localdomain.com
Use

/usr/bin/ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/opt/app/oracle/.ssh/id_rsa):
Created directory '/opt/app/oracle/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /opt/app/oracle/.ssh/id_rsa.
Your public key has been saved in /opt/app/oracle/.ssh/id_rsa.pub.
The key fingerprint is:
36:14:60:fc:c0:f2:f3:d3:9f:d7:e4:8b:27:b3:f5:7e

The key entry would be there in directory you mentioned above /opt/app/oracle/.ssh/id_rsa

cat id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAmzYYJU+qnANZQh/5Uh2CVs9SbpP+NulfbHK5z/IlQbPIJa4GOKWXB8f7z6GQIyGe5ivibHK3bTPCNEKJIJS7HWb1X+HzeVwaqYVUjDMH3bohNvjpIx8wqezOysyXOmGNi6uRy8XDiiQguRgdF/cKxGCrBen2pQecLQQujqVt5fFCiga0DuZaQik+m6sVsVDmqArTeLETBB5g/pOqcAt8CrVzT8z8T6gC4Be7IrT/gI+AQnsR1teBhbaWS+j+PZv9Ced/mH/neM3nL7ZknXtKaUpjYhIuvMKHta97unxG6JH9DXV+OM4QnNfU7KG3+Ql8peF2KkXOXJMSxOARESZUHw== oracle@****.desktop.localdomain.com

Add this entry in the other server's authorized_keys file (you have to create the file first time)
Good to go ..check you are able to access the servers without the password. Same procedure vice versa.
Note: DO this using 'oracle' login or with the account where you want the ssh to be passwordless

Check the File locations from Primary

SQL> select db_unique_name, protection_mode from v$database;

DB_UNIQUE_NAME                 PROTECTION_MODE
------------------------------ --------------------
dg1_b                          MAXIMUM AVAILABILITY

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/system01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/sysaux01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/undotbs01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/AUBX_DATA_01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/AUBX_IDX_01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/users01.dbf


SQL> select name from v$controlfile;

NAME
--------------------------------------------------------------------------------
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control01.ctl
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control02.ctl
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control03.ctl


SQL> select member from v$logfile;

MEMBER
--------------------------------------------------------------------------------
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_log1a.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_log2a.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_log3a.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_standby_redo01.log
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_standby_redo02.log
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_standby_redo03.log
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_standby_redo04.log


NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /opt/app/oracle/11.2.0.2/A13db
                                                 /dbs/spfiledg1.ora
       
Create the directories in the dg3 server-
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/dbs/
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1/diag/rdbms/dg1_b/dg1/trace
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1/diag/rdbms/dg1_b/dg1/cdump
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/FRA
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/arch


*.db_recovery_file_dest='/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/FRA'
*.standby_archive_dest='/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/arch'
*.diagnostic_dest='/opt/app/oracle/11.2.0.2/A13db/network/admin/dg1'

Shutdown the standby database -

SQL> shut immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.

Start the File transfer, following files are needed to be copied and transferred after the standby shutdown-
DATAFILE,
REDOLOG FILE (can be recreated also), CONTROLFILE, SPFILE, DATAGUARD BROKER FILE, PASSWORD FILE, LISTENER , TNSNAMES,

dg2.localdomain$ cd /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/
dg2.localdomain$

scp * oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/
scp spfiledg1.ora dr2dg1_b.dat dr1dg1_b.dat orapwdg1 oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/dbs/
scp tnsnames.ora listener.ora oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/network/admin/

dg2.localdomain$ pwd
/opt/app/oracle/11.2.0.2/A13db/dbs
dg2.localdomain$
dg2.localdomain$
dg2.localdomain$ scp spfiledg1.ora dr2dg1_b.dat dr1dg1_b.dat orapwdg1 oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/dbs/

Now we will do some transactions in Primary database to make it out of sync with the standby.

SQL> create user test identified by test;

User created.

SQL> grant dba to test;

Grant succeeded.

SQL> create table aa as select * from all_objects;

Table created.

SQL> insert into aa  select * from aa;

12679 rows created.
....keep on doing the same for large data entry

Check the current log sequence
SQL> conn / as sysdba
Connected.
SQL>
OR
select  SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME

244 NO  12-OCT-2012 04:47:38
245 NO  12-OCT-2012 04:47:53

--As the standby is down you can see there are no duplicates for the sequence#

33710 rows selected.

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     245
Next log sequence to archive   247
Current log sequence           247



dg3.localdomain$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Fri Oct 12 03:59:40 2012

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

Connected to an idle instance.

SQL>
SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORA-00119: invalid specification for system parameter LOCAL_LISTENER
ORA-00130: invalid listener address '(ADDRESS=(PROTOCOL=TCP)(HOST=dg3.localdomain.com)(PORT=1521))'
SQL> exit


To resolve the following error add the hostname entry in the /etc/hosts
from the new standby server dg3 start the oracle and check the last log sequence
SQL> startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area  271437824 bytes
Fixed Size                  2225584 bytes
Variable Size             213912144 bytes
Database Buffers           50331648 bytes
Redo Buffers                4968448 bytes
Database mounted.

select  SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
 SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
        49 NO  12-OCT-2012 03:07:22
        50 NO  12-OCT-2012 03:07:25


Check the DGMGRL configuration from the primary

DGMGRL> connect sys/system
Connected.
DGMGRL>
DGMGRL>
DGMGRL> show configuration

Configuration - dg1_conf

  Protection Mode: MaxAvailability
  Databases:
    dg1_a - Primary database
      Error: ORA-16825: multiple errors or warnings, including fast-start failover-related errors or warnings, detected for the database

    dg1_b - (*) Physical standby database
      Error: ORA-01034: ORACLE not available

Fast-Start Failover: ENABLED

Configuration Status:
ERROR

Update tnsnames in primary to change the hostname to dg3

dg1_b =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = dg3.localdomain.com)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = dg1)
    )
  )

Make the changes in the tnsnames, listener for dg3 server (hostname needs to be modified )

DGMGRL> disable FAST_START FAILOVER;
Error: ORA-12545: Connect failed because target host or object does not exist

Failed.

DGMGRL> disable configuration;
Error: ORA-16654: fast-start failover is enabled

 
Oops I am not able to turn off fast start failover, we Should have done that before shutting down the standby.
I am going to start the old standby and do the changes and then stop it once again.

DGMGRL> disable FAST_START FAILOVER;
DGMGRL> disable configuration;
       

SQL> alter database recover managed standby database disconnect from session;

Database altered.

Logs are not applied to dg3, errors in the alert logs are as follows

Errors in file /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1/diag/rdbms/dg1_b/dg1/trace/dg1_rvwr_8362.trc:
ORA-38701: Flashback database log 3 seq 210 thread 1: "/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/FRA/DG1_B/flashback/o1_mf_85fxk962_.flb"
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory


So  as message states some of the archive-logs are missing, check whats the difference in SCN for the primary and standby database

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    4037233

You have to check this query from the Primary database as standby is in mount state, run the same in Primary with the standby SCN to note the time difference-
SQL>  select scn_to_timestamp(4037233) from dual;

SCN_TO_TIMESTAMP(4037233)
---------------------------------------------------------------------------
15-OCT-12 12.37.14.000000000 AM

From Standby Server

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
    3758124

 SQL> select scn_to_timestamp(3758124) from dual;  


SCN_TO_TIMESTAMP(3758124)
---------------------------------------------------------------------------
12-OCT-12 03.18.18.000000000 AM

Have to resolve this GAP, we can do that by taking the incremental backups from Primary and applying the same to Standby. So login to Primary DB and run the following -       
dg1.localdomain$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Mon Oct 15 00:42:00 2012

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: DG1 (DBID=1774279666)
RMAN> run {
2> allocate channel c1 type disk format '/opt/app/oracle/%U.rmb';
3> backup incremental from scn 3758124 database;
4> }
.........

Create the Standby database control file from the Primary and transfer the control file and incremental backup from Primary to standby server-

alter database create standby controlfile as '/opt/app/oracle/Standby_control.ctl';

scp 0innpa51_1_1.rmb 0jnnpa8p_1_1.rmb Standby_control.ctl oracle@172.27.154.66:/opt/app/oracle/

dg3.localdomain$ cp Standby_control.ctl /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control01.ctl
dg3.localdomain$ cp Standby_control.ctl /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control02.ctl
dg3.localdomain$ cp Standby_control.ctl /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control03.ctl
dg3.localdomain$
dg3.localdomain$
dg3.localdomain$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.2.0 Production on Mon Oct 15 01:25:41 2012

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter database mount standby database;

Database altered.

exit

keep the backups in a seperate directory, catalog the same with the standby rman and recover the database.
dg3.localdomain$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Mon Oct 15 01:27:01 2012

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: DG1 (DBID=1774279666, not open)

RMAN> catalog start with '/opt/app/oracle/backup_from_prim';

RMAN> recover database
2> ;

Starting recover at 15-OCT-12
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 10/15/2012 01:29:47
RMAN-06094: datafile 7 must be restored

RMAN>

We Have added one datafile to the system tablespace after restoring the standby database , So need to restore the same first.

RMAN> restore datafile 7;

Starting restore at 15-OCT-12
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00007 to /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/system02.dbf
channel ORA_DISK_1: reading from backup piece /opt/app/oracle/backup_from_prim/0innpa51_1_1.rmb
channel ORA_DISK_1: piece handle=/opt/app/oracle/backup_from_prim/0innpa51_1_1.rmb tag=TAG20121015T011304
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:25
Finished restore at 15-OCT-12

RMAN>

RMAN> recover database
2> ;

Starting recover at 15-OCT-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/system01.dbf
destination for restore of datafile 00002: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/sysaux01.dbf
destination for restore of datafile 00003: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/undotbs01.dbf
destination for restore of datafile 00004: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/AUBX_DATA_01.dbf
destination for restore of datafile 00005: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/AUBX_IDX_01.dbf
destination for restore of datafile 00006: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/users01.dbf
channel ORA_DISK_1: reading from backup piece /opt/app/oracle/backup_from_prim/0innpa51_1_1.rmb
channel ORA_DISK_1: piece handle=/opt/app/oracle/backup_from_prim/0innpa51_1_1.rmb tag=TAG20121015T011304
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:35

starting media recovery

unable to find archived log
archived log thread=1 sequence=339
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 10/15/2012 01:43:51
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 339 and starting SCN of 4039532

The above error is fine as we have not applied all the archive logs yet, Lets start the managed recovery and see if this catches up with Primary DB


SQL> alter database recover managed standby database disconnect from session;

Database altered.

SQL> select  SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME;

no rows selected

/
no rows selected


SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     337
Next log sequence to archive   339
Current log sequence           339

Doing log switch on primary is not transporting the redo properly as we cant see new sequence#. Last one was 339 which is been applied through the incremental backups.
Checking the primary DB alert log, Ahh ..here is the culprit-

TNS-12545: Connect failed because target host or object does not exist
    ns secondary err code: 12560
    nt main err code: 515

I have not made the entry of the dg3 in the /etc/hosts file of the primary server yet. Making the same entry, lets c now -

cat /etc/hosts

172.27.152.123          dg1.localdomain.com dg1
172.27.154.139          dg2.localdomain.com dg2
172.27.154.66           dg3.localdomain.com dg3  --> made this entry was missing


SQL> select  SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME;

 SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
       341 NO  15-OCT-2012 02:00:29
       340 NO  15-OCT-2012 02:00:29
       342 NO  15-OCT-2012 02:00:33
       339 NO  15-OCT-2012 02:00:33
       343 NO  15-OCT-2012 02:00:34

Bingo :) Standby is upto date with Primary now. let do the broker configuration once again

DGMGRL> connect sys/system
Connected.
DGMGRL>
DGMGRL>
DGMGRL>
DGMGRL> show configuration

Configuration - dg1_conf

  Protection Mode: MaxAvailability
  Databases:
    dg1_a - Primary database
    dg1_b - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
DISABLED

DGMGRL> enable configuration;
Enabled.
DGMGRL>
DGMGRL>
DGMGRL>
DGMGRL>  show configuration

Configuration - dg1_conf

  Protection Mode: MaxAvailability
  Databases:
    dg1_a - Primary database
    dg1_b - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS


DGMGRL> show database verbose dg1_b

Database - dg1_b

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       0 seconds
  Real Time Query: OFF
  Instance(s):
    dg1

  Properties:
    DGConnectIdentifier             = 'dg1_b'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'SYNC'
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '10'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '30'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = 'dg1_b, dg1_a'
    LogFileNameConvert              = 'dg1_b, dg1_a'
    FastStartFailoverTarget         = 'dg1_a'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    SidName                         = 'dg1'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg3.localdomain.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=dg1_b_DGMGRL.localdomain.com)(INSTANCE_NAME=dg1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.arc'
    TopWaitEvents                   = '(monitor)'

Database Status:
SUCCESS

All looks good, lets try the switchover and see if this is smooth-
check the switchover using DGMGRL

DGMGRL> switchover to dg1_b
Performing switchover NOW, please wait...
New primary database "dg1_b" is opening...
Operation requires shutdown of instance "dg1" on database "dg1_a"
Shutting down instance "dg1"...
ORACLE instance shut down.
Operation requires startup of instance "dg1" on database "dg1_a"
Starting instance "dg1"...
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "dg1_b"

So far so good :) lets switch it back and try to enable the fast start fail over

Lets enable the FSFO now -

DGMGRL> ENABLE FAST_START FAILOVER;
Error: ORA-16651: requirements not met for enabling fast-start failover

:-) One more error (if u are wondering why a happy smile so a good DBA "minimize errors on Prod..welcome them on test" )       

Things to check before doing FSFO
  • The protection mode should be either in max availability or max protection mode
  • logxptmode ( log transport mode) for both primary and standby should be 'SYNC'
  • Both primary and standby should be in flash back enabled mode.
  • FastStartFailoverTarget if you have more than one standby database, not valid for our case.

In our case the issue is with flashback, its not enabled for the standby database. Enabling the same

SQL> select flashback_on from  v$database;

FLASHBACK_ON
------------------
NO

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Database altered.

SQL> alter database flashback on;

Database altered.

SQL>  select flashback_on from  v$database;

FLASHBACK_ON
------------------
YES

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

Database altered.


DGMGRL> enable fast_start failover;
Enabled.
DGMGRL>

start the observer-

DGMGRL> start observer
Observer started

DGMGRL>
DGMGRL>  show configuration;

Configuration - dg1_conf

  Protection Mode: MaxAvailability
  Databases:
    dg1_a - Primary database
    dg1_b - (*) Physical standby database
      Warning: ORA-16826: apply service state is inconsistent with the DelayMins property

Fast-Start Failover: ENABLED

Configuration Status:
WARNING

One more error and a very common one -

DGMGRL> show database verbose dg1_b

Database - dg1_b

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       2 minutes 52 seconds
  Real Time Query: OFF
  Instance(s):
    dg1

  Database Warning(s):
    ORA-16826: apply service state is inconsistent with the DelayMins property

  Properties:
    DGConnectIdentifier             = 'dg1_b'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'SYNC'
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'

Why this Error??

Redo Apply is running without USING CURRENT LOGFILE option while DelayMins=0
That means broker property DelayMins is set to 0, but you didn't start recover managed standby database as real apply service.
So DelayMins=0 cannot be achieved. we have to start the apply from the current redo log files

SQL> alter database recover managed standby database cancel;

Database altered.

SQL>

SQL>
SQL>  ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

Database altered.

Now check the status of the verbose

DGMGRL> show database verbose dg1_b

Database - dg1_b

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       0 seconds
  Real Time Query: OFF
  Instance(s):
    dg1

  Properties:
    DGConnectIdentifier             = 'dg1_b'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'SYNC'
    DelayMins                       = '0'

Now Its Fine. see the
Apply Lag:       0 seconds

DGMGRL> show configuration

Configuration - dg1_conf

  Protection Mode: MaxAvailability
  Databases:
    dg1_a - Primary database
    dg1_b - (*) Physical standby database

Fast-Start Failover: ENABLED

Configuration Status:
SUCCESS

SQL>  select DB_UNIQUE_NAME,DATABASE_ROLE,CURRENT_SCN,OPEN_MODE,FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET FSFO_CURR_TARGET from v$database;
DB_UNIQUE_NAME                 DATABASE_ROLE    CURRENT_SCN OPEN_MODE            FS_FAILOVER_STATUS     FSFO_CURR_TARGET
------------------------------ ---------------- ----------- -------------------- ---------------------- ------------------------------
DG1_A                          PRIMARY              4089498 READ WRITE           SYNCHRONIZED           dg1_b


Standby-

SQL>  select DB_UNIQUE_NAME,DATABASE_ROLE,CURRENT_SCN,OPEN_MODE,FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET FSFO_CURR_TARGET from v$database;

DB_UNIQUE_NAME                 DATABASE_ROLE    CURRENT_SCN OPEN_MODE            FS_FAILOVER_STATUS     FSFO_CURR_TARGET
------------------------------ ---------------- ----------- -------------------- ---------------------- ------------------------------
dg1_b                          PHYSICAL STANDBY     4088545 MOUNTED              SYNCHRONIZED           dg1_b

Test the Failover process

Shutdown the primary database and lets check if standby takes it place

SQL> select db_unique_name from v$database;

DB_UNIQUE_NAME
------------------------------
DG1_A

SQL>
SQL>
SQL> shut abort
ORACLE instance shut down.


Verifying the standby database alert logs -

RFS[3]: Possible network disconnect with primary database
Mon Oct 15 03:34:01 2012
Attempting Fast-Start Failover because the threshold of 30 seconds has elapsed.
Mon Oct 15 03:34:01 2012
Data Guard Broker: Beginning failover
Mon Oct 15 03:34:01 2012
..
.....
.......
Failover succeeded. Primary database is now dg1_b.


Login to Sql plus and check the same -

SQL> select DB_UNIQUE_NAME,DATABASE_ROLE,CURRENT_SCN,OPEN_MODE,FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET FSFO_CURR_TARGET from v$database;

DB_UNIQUE_NAME                 DATABASE_ROLE    CURRENT_SCN OPEN_MODE            FS_FAILOVER_STATUS     FSFO_CURR_TARGET
------------------------------ ---------------- ----------- -------------------- ---------------------- ------------------------------
dg1_b                          PRIMARY              4090200 READ WRITE           REINSTATE REQUIRED     dg1_a

As you can see dg1_b is the new PRIMARY and FSFO target is dg1_a.
But "REINSTATE REQUIRED" what needs to be done fot this, nothing just start the old primary (new standby) in the mount state

SQL> startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area  835104768 bytes
Fixed Size                  2231088 bytes
Variable Size             536872144 bytes
Database Buffers          289406976 bytes
Redo Buffers                6594560 bytes
Database mounted.

Check the query once again

SQL>  select DB_UNIQUE_NAME,DATABASE_ROLE,CURRENT_SCN,OPEN_MODE,FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET FSFO_CURR_TARGET from v$database;

DB_UNIQUE_NAME                 DATABASE_ROLE    CURRENT_SCN OPEN_MODE            FS_FAILOVER_STATUS     FSFO_CURR_TARGET
------------------------------ ---------------- ----------- -------------------- ---------------------- ------------------------------
dg1_b                          PRIMARY              4090321 READ WRITE           SYNCHRONIZED           dg1_a


So thats all guys..if I make it any more somebody is gonna find me and beat me to death so lets wrap up.Will take some other things in my next blog...till thn bbye!!!