Rebuilding New Standby Oracle 11g DataGuard
Enable FSFO/TestCases
Current Configuration
Server Name Database Name DB_unique_Name
dg1.localdomain.com DG1 DG1_A --Primary
dg2.localdomain.com DG1 DG1_B --Standby
dg3.localdomain.com DG1 DG1_C --To be Configured
For the DG setup, check the following links -
ORACLE11g-dataguard-unfolded
How-to-recreate-standby-redologs
What are we upto?
We are rebuilding the new standby database from the current Primary Standby which is FSFO enabled. This can be used in case of HOST migration.
Process
Let quickly check that we have a proper DG setup already-
select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
41 YES 11-OCT-2012 23:10:33
41 NO 11-OCT-2012 23:10:33
42 YES 11-OCT-2012 23:51:13
42 NO 11-OCT-2012 23:51:13
select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
40 NO 11-OCT-2012 23:10:31
41 NO 11-OCT-2012 23:10:33
42 NO 11-OCT-2012 23:51:13
Switch the log file on primary and verify the redo transport -
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
43 YES 12-OCT-2012 00:25:39
43 NO 12-OCT-2012 00:25:40
44 YES 12-OCT-2012 00:25:41
44 NO 12-OCT-2012 00:25:41
From Standby
select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
43 NO 12-OCT-2012 00:25:39
44 NO 12-OCT-2012 00:25:41
Looks perfect so lets get into creating a new standby Database.
I am doing the password less ssh configuration between servers which is not mandatory for the setup but always beneficial for obvious reasons. So do that (easy process)
Passwordless Ssh configuration-
Generate the RSA keys in the new machine which is dg3.localdomain.com
Use
/usr/bin/ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/opt/app/oracle/.ssh/id_rsa):
Created directory '/opt/app/oracle/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /opt/app/oracle/.ssh/id_rsa.
Your public key has been saved in /opt/app/oracle/.ssh/id_rsa.pub.
The key fingerprint is:
36:14:60:fc:c0:f2:f3:d3:9f:d7:e4:8b:27:b3:f5:7e
The key entry would be there in directory you mentioned above /opt/app/oracle/.ssh/id_rsa
cat id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAmzYYJU+qnANZQh/5Uh2CVs9SbpP+NulfbHK5z/IlQbPIJa4GOKWXB8f7z6GQIyGe5ivibHK3bTPCNEKJIJS7HWb1X+HzeVwaqYVUjDMH3bohNvjpIx8wqezOysyXOmGNi6uRy8XDiiQguRgdF/cKxGCrBen2pQecLQQujqVt5fFCiga0DuZaQik+m6sVsVDmqArTeLETBB5g/pOqcAt8CrVzT8z8T6gC4Be7IrT/gI+AQnsR1teBhbaWS+j+PZv9Ced/mH/neM3nL7ZknXtKaUpjYhIuvMKHta97unxG6JH9DXV+OM4QnNfU7KG3+Ql8peF2KkXOXJMSxOARESZUHw== oracle@****.desktop.localdomain.com
Add this entry in the other server's authorized_keys file (you have to create the file first time)
Good to go ..check you are able to access the servers without the password. Same procedure vice versa.
Note: DO this using 'oracle' login or with the account where you want the ssh to be passwordless
Check the File locations from Primary
SQL> select db_unique_name, protection_mode from v$database;
DB_UNIQUE_NAME PROTECTION_MODE
------------------------------ --------------------
dg1_b MAXIMUM AVAILABILITY
SQL> select name from v$datafile;
NAME
--------------------------------------------------------------------------------
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/system01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/sysaux01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/undotbs01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/AUBX_DATA_01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/AUBX_IDX_01.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/users01.dbf
SQL> select name from v$controlfile;
NAME
--------------------------------------------------------------------------------
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control01.ctl
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control02.ctl
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control03.ctl
SQL> select member from v$logfile;
MEMBER
--------------------------------------------------------------------------------
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_log1a.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_log2a.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_log3a.dbf
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_standby_redo01.log
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_standby_redo02.log
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_standby_redo03.log
/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_standby_redo04.log
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile string /opt/app/oracle/11.2.0.2/A13db
/dbs/spfiledg1.ora
Create the directories in the dg3 server-
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/dbs/
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1/diag/rdbms/dg1_b/dg1/trace
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1/diag/rdbms/dg1_b/dg1/cdump
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/FRA
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1
dg3.localdomain$ mkdir -p /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/arch
*.db_recovery_file_dest='/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/FRA'
*.standby_archive_dest='/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/arch'
*.diagnostic_dest='/opt/app/oracle/11.2.0.2/A13db/network/admin/dg1'
Shutdown the standby database -
SQL> shut immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Start the File transfer, following files are needed to be copied and transferred after the standby shutdown-
DATAFILE,
REDOLOG FILE (can be recreated also), CONTROLFILE, SPFILE, DATAGUARD BROKER FILE, PASSWORD FILE, LISTENER , TNSNAMES,
dg2.localdomain$ cd /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/
dg2.localdomain$
scp * oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/
scp spfiledg1.ora dr2dg1_b.dat dr1dg1_b.dat orapwdg1 oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/dbs/
scp tnsnames.ora listener.ora oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/network/admin/
dg2.localdomain$ pwd
/opt/app/oracle/11.2.0.2/A13db/dbs
dg2.localdomain$
dg2.localdomain$
dg2.localdomain$ scp spfiledg1.ora dr2dg1_b.dat dr1dg1_b.dat orapwdg1 oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/dbs/
Now we will do some transactions in Primary database to make it out of sync with the standby.
SQL> create user test identified by test;
User created.
SQL> grant dba to test;
Grant succeeded.
SQL> create table aa as select * from all_objects;
Table created.
SQL> insert into aa select * from aa;
12679 rows created.
....keep on doing the same for large data entry
Check the current log sequence
SQL> conn / as sysdba
Connected.
SQL>
OR
select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
244 NO 12-OCT-2012 04:47:38
245 NO 12-OCT-2012 04:47:53
--As the standby is down you can see there are no duplicates for the sequence#
33710 rows selected.
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 245
Next log sequence to archive 247
Current log sequence 247
dg3.localdomain$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.2.0 Production on Fri Oct 12 03:59:40 2012
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to an idle instance.
SQL>
SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORA-00119: invalid specification for system parameter LOCAL_LISTENER
ORA-00130: invalid listener address '(ADDRESS=(PROTOCOL=TCP)(HOST=dg3.localdomain.com)(PORT=1521))'
SQL> exit
To resolve the following error add the hostname entry in the /etc/hosts
from the new standby server dg3 start the oracle and check the last log sequence
dg2.localdomain$ cd /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/
dg2.localdomain$
scp * oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/
scp spfiledg1.ora dr2dg1_b.dat dr1dg1_b.dat orapwdg1 oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/dbs/
scp tnsnames.ora listener.ora oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/network/admin/
dg2.localdomain$ pwd
/opt/app/oracle/11.2.0.2/A13db/dbs
dg2.localdomain$
dg2.localdomain$
dg2.localdomain$ scp spfiledg1.ora dr2dg1_b.dat dr1dg1_b.dat orapwdg1 oracle@172.27.154.66:/opt/app/oracle/11.2.0.2/A13db/dbs/
Now we will do some transactions in Primary database to make it out of sync with the standby.
SQL> create user test identified by test;
User created.
SQL> grant dba to test;
Grant succeeded.
SQL> create table aa as select * from all_objects;
Table created.
SQL> insert into aa select * from aa;
12679 rows created.
....keep on doing the same for large data entry
Check the current log sequence
SQL> conn / as sysdba
Connected.
SQL>
OR
select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
244 NO 12-OCT-2012 04:47:38
245 NO 12-OCT-2012 04:47:53
--As the standby is down you can see there are no duplicates for the sequence#
33710 rows selected.
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 245
Next log sequence to archive 247
Current log sequence 247
dg3.localdomain$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.2.0 Production on Fri Oct 12 03:59:40 2012
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to an idle instance.
SQL>
SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORA-00119: invalid specification for system parameter LOCAL_LISTENER
ORA-00130: invalid listener address '(ADDRESS=(PROTOCOL=TCP)(HOST=dg3.localdomain.com)(PORT=1521))'
SQL> exit
To resolve the following error add the hostname entry in the /etc/hosts
from the new standby server dg3 start the oracle and check the last log sequence
SQL> startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Total System Global Area 271437824 bytes
Fixed Size 2225584 bytes
Variable Size 213912144 bytes
Database Buffers 50331648 bytes
Redo Buffers 4968448 bytes
Database mounted.
select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
49 NO 12-OCT-2012 03:07:22
50 NO 12-OCT-2012 03:07:25
Check the DGMGRL configuration from the primary
DGMGRL> connect sys/system
Connected.
DGMGRL>
DGMGRL>
DGMGRL> show configuration
Configuration - dg1_conf
Protection Mode: MaxAvailability
Databases:
dg1_a - Primary database
Error: ORA-16825: multiple errors or warnings, including fast-start failover-related errors or warnings, detected for the database
dg1_b - (*) Physical standby database
Error: ORA-01034: ORACLE not available
Fast-Start Failover: ENABLED
Configuration Status:
ERROR
Update tnsnames in primary to change the hostname to dg3
dg1_b =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = dg3.localdomain.com)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = dg1)
)
)
Make the changes in the tnsnames, listener for dg3 server (hostname needs to be modified )
DGMGRL> disable FAST_START FAILOVER;
Error: ORA-12545: Connect failed because target host or object does not exist
Failed.
DGMGRL> disable configuration;
Error: ORA-16654: fast-start failover is enabled
Oops I am not able to turn off fast start failover, we Should have done that before shutting down the standby.
I am going to start the old standby and do the changes and then stop it once again.
DGMGRL> disable FAST_START FAILOVER;
DGMGRL> disable configuration;
SQL> alter database recover managed standby database disconnect from session;
Database altered.
Logs are not applied to dg3, errors in the alert logs are as follows
Errors in file /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1/diag/rdbms/dg1_b/dg1/trace/dg1_rvwr_8362.trc:
ORA-38701: Flashback database log 3 seq 210 thread 1: "/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/FRA/DG1_B/flashback/o1_mf_85fxk962_.flb"
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
So as message states some of the archive-logs are missing, check whats the difference in SCN for the primary and standby database
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
4037233
You have to check this query from the Primary database as standby is in mount state, run the same in Primary with the standby SCN to note the time difference-
SQL> select scn_to_timestamp(4037233) from dual;
SCN_TO_TIMESTAMP(4037233)
---------------------------------------------------------------------------
15-OCT-12 12.37.14.000000000 AM
From Standby Server
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Total System Global Area 271437824 bytes
Fixed Size 2225584 bytes
Variable Size 213912144 bytes
Database Buffers 50331648 bytes
Redo Buffers 4968448 bytes
Database mounted.
select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME
SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
49 NO 12-OCT-2012 03:07:22
50 NO 12-OCT-2012 03:07:25
Check the DGMGRL configuration from the primary
DGMGRL> connect sys/system
Connected.
DGMGRL>
DGMGRL>
DGMGRL> show configuration
Configuration - dg1_conf
Protection Mode: MaxAvailability
Databases:
dg1_a - Primary database
Error: ORA-16825: multiple errors or warnings, including fast-start failover-related errors or warnings, detected for the database
dg1_b - (*) Physical standby database
Error: ORA-01034: ORACLE not available
Fast-Start Failover: ENABLED
Configuration Status:
ERROR
Update tnsnames in primary to change the hostname to dg3
dg1_b =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = dg3.localdomain.com)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = dg1)
)
)
Make the changes in the tnsnames, listener for dg3 server (hostname needs to be modified )
DGMGRL> disable FAST_START FAILOVER;
Error: ORA-12545: Connect failed because target host or object does not exist
Failed.
DGMGRL> disable configuration;
Error: ORA-16654: fast-start failover is enabled
Oops I am not able to turn off fast start failover, we Should have done that before shutting down the standby.
I am going to start the old standby and do the changes and then stop it once again.
DGMGRL> disable FAST_START FAILOVER;
DGMGRL> disable configuration;
SQL> alter database recover managed standby database disconnect from session;
Database altered.
Logs are not applied to dg3, errors in the alert logs are as follows
Errors in file /opt/app/oracle/11.2.0.2/A13db/network/admin/dg1/diag/rdbms/dg1_b/dg1/trace/dg1_rvwr_8362.trc:
ORA-38701: Flashback database log 3 seq 210 thread 1: "/opt/app/oracle/11.2.0.2/A13db/oradata/dg1/FRA/DG1_B/flashback/o1_mf_85fxk962_.flb"
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
So as message states some of the archive-logs are missing, check whats the difference in SCN for the primary and standby database
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
4037233
You have to check this query from the Primary database as standby is in mount state, run the same in Primary with the standby SCN to note the time difference-
SQL> select scn_to_timestamp(4037233) from dual;
SCN_TO_TIMESTAMP(4037233)
---------------------------------------------------------------------------
15-OCT-12 12.37.14.000000000 AM
From Standby Server
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
3758124
SQL> select scn_to_timestamp(3758124) from dual;
SCN_TO_TIMESTAMP(3758124)
---------------------------------------------------------------------------
12-OCT-12 03.18.18.000000000 AM
Have to resolve this GAP, we can do that by taking the incremental backups from Primary and applying the same to Standby. So login to Primary DB and run the following -
dg1.localdomain$ rman target /
Recovery Manager: Release 11.2.0.2.0 - Production on Mon Oct 15 00:42:00 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: DG1 (DBID=1774279666)
RMAN> run {
2> allocate channel c1 type disk format '/opt/app/oracle/%U.rmb';
3> backup incremental from scn 3758124 database;
4> }
.........
2> allocate channel c1 type disk format '/opt/app/oracle/%U.rmb';
3> backup incremental from scn 3758124 database;
4> }
.........
Create the Standby database control file from the Primary and transfer the control file and incremental backup from Primary to standby server-
alter database create standby controlfile as '/opt/app/oracle/Standby_control.ctl';
scp 0innpa51_1_1.rmb 0jnnpa8p_1_1.rmb Standby_control.ctl oracle@172.27.154.66:/opt/app/oracle/
dg3.localdomain$ cp Standby_control.ctl /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control01.ctl
dg3.localdomain$ cp Standby_control.ctl /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control02.ctl
dg3.localdomain$ cp Standby_control.ctl /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/dg1_control03.ctl
dg3.localdomain$
dg3.localdomain$
dg3.localdomain$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.2.0 Production on Mon Oct 15 01:25:41 2012
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> alter database mount standby database;
Database altered.
exit
keep the backups in a seperate directory, catalog the same with the standby rman and recover the database.
dg3.localdomain$ rman target /
Recovery Manager: Release 11.2.0.2.0 - Production on Mon Oct 15 01:27:01 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: DG1 (DBID=1774279666, not open)
RMAN> catalog start with '/opt/app/oracle/backup_from_prim';
RMAN> recover database
2> ;
Starting recover at 15-OCT-12
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 10/15/2012 01:29:47
RMAN-06094: datafile 7 must be restored
RMAN>
We Have added one datafile to the system tablespace after restoring the standby database , So need to restore the same first.
RMAN> restore datafile 7;
Starting restore at 15-OCT-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00007 to /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/system02.dbf
channel ORA_DISK_1: reading from backup piece /opt/app/oracle/backup_from_prim/0innpa51_1_1.rmb
channel ORA_DISK_1: piece handle=/opt/app/oracle/backup_from_prim/0innpa51_1_1.rmb tag=TAG20121015T011304
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:25
Finished restore at 15-OCT-12
RMAN>
RMAN> recover database
2> ;
Starting recover at 15-OCT-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/system01.dbf
destination for restore of datafile 00002: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/sysaux01.dbf
destination for restore of datafile 00003: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/undotbs01.dbf
destination for restore of datafile 00004: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/AUBX_DATA_01.dbf
destination for restore of datafile 00005: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/AUBX_IDX_01.dbf
destination for restore of datafile 00006: /opt/app/oracle/11.2.0.2/A13db/oradata/dg1/users01.dbf
channel ORA_DISK_1: reading from backup piece /opt/app/oracle/backup_from_prim/0innpa51_1_1.rmb
channel ORA_DISK_1: piece handle=/opt/app/oracle/backup_from_prim/0innpa51_1_1.rmb tag=TAG20121015T011304
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:35
starting media recovery
unable to find archived log
archived log thread=1 sequence=339
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 10/15/2012 01:43:51
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 339 and starting SCN of 4039532
The above error is fine as we have not applied all the archive logs yet, Lets start the managed recovery and see if this catches up with Primary DB
SQL> alter database recover managed standby database disconnect from session;
Database altered.
SQL> select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME;
no rows selected
/
no rows selected
SQL> archive log list
Database log mode Archive Mode
Automatic archival Enabled
Archive destination USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence 337
Next log sequence to archive 339
Current log sequence 339
Doing log switch on primary is not transporting the redo properly as we cant see new sequence#. Last one was 339 which is been applied through the incremental backups.
Checking the primary DB alert log, Ahh ..here is the culprit-
TNS-12545: Connect failed because target host or object does not exist
ns secondary err code: 12560
nt main err code: 515
I have not made the entry of the dg3 in the /etc/hosts file of the primary server yet. Making the same entry, lets c now -
cat /etc/hosts
172.27.152.123 dg1.localdomain.com dg1
172.27.154.139 dg2.localdomain.com dg2
172.27.154.66 dg3.localdomain.com dg3 --> made this entry was missing
SQL> select SEQUENCE# ,STANDBY_DEST,COMPLETION_TIME from v$archived_log order by COMPLETION_TIME;
SEQUENCE# STA COMPLETION_TIME
---------- --- --------------------
341 NO 15-OCT-2012 02:00:29
340 NO 15-OCT-2012 02:00:29
342 NO 15-OCT-2012 02:00:33
339 NO 15-OCT-2012 02:00:33
343 NO 15-OCT-2012 02:00:34
Bingo :) Standby is upto date with Primary now. let do the broker configuration once again
DGMGRL> connect sys/system
Connected.
DGMGRL>
DGMGRL>
DGMGRL>
DGMGRL> show configuration
Configuration - dg1_conf
Protection Mode: MaxAvailability
Databases:
dg1_a - Primary database
dg1_b - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
DISABLED
DGMGRL> enable configuration;
Enabled.
DGMGRL>
DGMGRL>
DGMGRL>
DGMGRL> show configuration
Configuration - dg1_conf
Protection Mode: MaxAvailability
Databases:
dg1_a - Primary database
dg1_b - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL> show database verbose dg1_b
Database - dg1_b
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
dg1
Properties:
DGConnectIdentifier = 'dg1_b'
ObserverConnectIdentifier = ''
LogXptMode = 'SYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '10'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '30'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = 'dg1_b, dg1_a'
LogFileNameConvert = 'dg1_b, dg1_a'
FastStartFailoverTarget = 'dg1_a'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
SidName = 'dg1'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg3.localdomain.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=dg1_b_DGMGRL.localdomain.com)(INSTANCE_NAME=dg1)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.arc'
TopWaitEvents = '(monitor)'
Database Status:
SUCCESS
All looks good, lets try the switchover and see if this is smooth-
check the switchover using DGMGRL
DGMGRL> switchover to dg1_b
Performing switchover NOW, please wait...
New primary database "dg1_b" is opening...
Operation requires shutdown of instance "dg1" on database "dg1_a"
Shutting down instance "dg1"...
ORACLE instance shut down.
Operation requires startup of instance "dg1" on database "dg1_a"
Starting instance "dg1"...
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "dg1_b"
So far so good :) lets switch it back and try to enable the fast start fail over
Lets enable the FSFO now -
DGMGRL> ENABLE FAST_START FAILOVER;
Error: ORA-16651: requirements not met for enabling fast-start failover
:-) One more error (if u are wondering why a happy smile so a good DBA "minimize errors on Prod..welcome them on test" )
Things to check before doing FSFO
- The protection mode should be either in max availability or max protection mode
- logxptmode ( log transport mode) for both primary and standby should be 'SYNC'
- Both primary and standby should be in flash back enabled mode.
- FastStartFailoverTarget if you have more than one standby database, not valid for our case.
In our case the issue is with flashback, its not enabled for the standby database. Enabling the same
SQL> select flashback_on from v$database;
FLASHBACK_ON
------------------
NO
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
SQL> alter database flashback on;
Database altered.
SQL> select flashback_on from v$database;
FLASHBACK_ON
------------------
YES
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
Database altered.
DGMGRL> enable fast_start failover;
Enabled.
DGMGRL>
start the observer-
DGMGRL> start observer
Observer started
DGMGRL>
DGMGRL> show configuration;
Configuration - dg1_conf
Protection Mode: MaxAvailability
Databases:
dg1_a - Primary database
dg1_b - (*) Physical standby database
Warning: ORA-16826: apply service state is inconsistent with the DelayMins property
Fast-Start Failover: ENABLED
Configuration Status:
WARNING
One more error and a very common one -
DGMGRL> show database verbose dg1_b
Database - dg1_b
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 2 minutes 52 seconds
Real Time Query: OFF
Instance(s):
dg1
Database Warning(s):
ORA-16826: apply service state is inconsistent with the DelayMins property
Properties:
DGConnectIdentifier = 'dg1_b'
ObserverConnectIdentifier = ''
LogXptMode = 'SYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
Why this Error??
Redo Apply is running without USING CURRENT LOGFILE option while DelayMins=0
That means broker property DelayMins is set to 0, but you didn't start recover managed standby database as real apply service.
So DelayMins=0 cannot be achieved. we have to start the apply from the current redo log files
SQL> alter database recover managed standby database cancel;
Database altered.
SQL>
SQL>
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
Database altered.
Now check the status of the verbose
DGMGRL> show database verbose dg1_b
Database - dg1_b
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds
Apply Lag: 0 seconds
Real Time Query: OFF
Instance(s):
dg1
Properties:
DGConnectIdentifier = 'dg1_b'
ObserverConnectIdentifier = ''
LogXptMode = 'SYNC'
DelayMins = '0'
Now Its Fine. see the
Apply Lag: 0 seconds
DGMGRL> show configuration
Configuration - dg1_conf
Protection Mode: MaxAvailability
Databases:
dg1_a - Primary database
dg1_b - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS
SQL> select DB_UNIQUE_NAME,DATABASE_ROLE,CURRENT_SCN,OPEN_MODE,FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET FSFO_CURR_TARGET from v$database;
DB_UNIQUE_NAME DATABASE_ROLE CURRENT_SCN OPEN_MODE FS_FAILOVER_STATUS FSFO_CURR_TARGET
------------------------------ ---------------- ----------- -------------------- ---------------------- ------------------------------
DG1_A PRIMARY 4089498 READ WRITE SYNCHRONIZED dg1_b
Standby-
SQL> select DB_UNIQUE_NAME,DATABASE_ROLE,CURRENT_SCN,OPEN_MODE,FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET FSFO_CURR_TARGET from v$database;
DB_UNIQUE_NAME DATABASE_ROLE CURRENT_SCN OPEN_MODE FS_FAILOVER_STATUS FSFO_CURR_TARGET
------------------------------ ---------------- ----------- -------------------- ---------------------- ------------------------------
dg1_b PHYSICAL STANDBY 4088545 MOUNTED SYNCHRONIZED dg1_b
Test the Failover process
Shutdown the primary database and lets check if standby takes it place
SQL> select db_unique_name from v$database;
DB_UNIQUE_NAME
------------------------------
DG1_A
SQL>
SQL>
SQL> shut abort
ORACLE instance shut down.
Verifying the standby database alert logs -
RFS[3]: Possible network disconnect with primary database
Mon Oct 15 03:34:01 2012
Attempting Fast-Start Failover because the threshold of 30 seconds has elapsed.
Mon Oct 15 03:34:01 2012
Data Guard Broker: Beginning failover
Mon Oct 15 03:34:01 2012
..
.....
.......
Failover succeeded. Primary database is now dg1_b.
Login to Sql plus and check the same -
SQL> select DB_UNIQUE_NAME,DATABASE_ROLE,CURRENT_SCN,OPEN_MODE,FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET FSFO_CURR_TARGET from v$database;
DB_UNIQUE_NAME DATABASE_ROLE CURRENT_SCN OPEN_MODE FS_FAILOVER_STATUS FSFO_CURR_TARGET
------------------------------ ---------------- ----------- -------------------- ---------------------- ------------------------------
dg1_b PRIMARY 4090200 READ WRITE REINSTATE REQUIRED dg1_a
As you can see dg1_b is the new PRIMARY and FSFO target is dg1_a.
But "REINSTATE REQUIRED" what needs to be done fot this, nothing just start the old primary (new standby) in the mount state
SQL> startup mount
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Total System Global Area 835104768 bytes
Fixed Size 2231088 bytes
Variable Size 536872144 bytes
Database Buffers 289406976 bytes
Redo Buffers 6594560 bytes
Database mounted.
Check the query once again
SQL> select DB_UNIQUE_NAME,DATABASE_ROLE,CURRENT_SCN,OPEN_MODE,FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET FSFO_CURR_TARGET from v$database;
DB_UNIQUE_NAME DATABASE_ROLE CURRENT_SCN OPEN_MODE FS_FAILOVER_STATUS FSFO_CURR_TARGET
------------------------------ ---------------- ----------- -------------------- ---------------------- ------------------------------
dg1_b PRIMARY 4090321 READ WRITE SYNCHRONIZED dg1_a
So thats all guys..if I make it any more somebody is gonna find me and beat me to death so lets wrap up.Will take some other things in my next blog...till thn bbye!!!
Great tutorial. Found it when getting error ORA-16651: requirements not met for enabling fast-start failover. Great to see a well written tutorial that doesn't just cover what happens when things go right!
ReplyDelete