Oracle11G数据库DataGuard灾备切换方案一、检查1、确定MRP进程在正常运行备库执行如下SQL确定MRP进程正常:SELECTPROCESSFROMV$MANAGED_STANDBYWHEREPROCESSLIKE'MRP%';主库执行如下SQL,确定备库是“REALTIMEAPPLY”状态SQLSELECTRECOVERY_MODEFROMV$ARCHIVE_DEST_STATUSWHEREDEST_ID=2;RECOVERY_MODE-----------------------MANAGEDREALTIMEAPPLY如果备库没有启用real-timeapply,则需要重新将备库启动至real-timeapply:SQLALTERDATABASERECOVERMANAGEDSTANDBYDATABASECANCEL;SQLALTERDATABASERECOVERMANAGEDSTANDBYDATABASEUSINGCURRENTLOGFILEDISCONNECT;2、确定有足够的归档进程在所有的主备库实例上查询参数LOG_ARCHIVE_MAX_PROCESSES,确定其值大于等于4,但不会太大3、确定目标备库的REDO为clear状态虽然在发起SWITCHOVERTOPRIMARY命令时,备库的REDO会自动转换为CLEAR状态,但依然建议在SWITCHOVER前REDO为CLEAR状态。确保正确设置了LOG_FILE_NAME_CONVERT参数。使用如下SQL在目标备库上查看REDO状态:SQLSELECTDISTINCTL.GROUP#FROMV$LOGL,V$LOGFILELFWHEREL.GROUP#=LF.GROUP#ANDL.STATUSNOTIN(‘UNUSED’,‘CLEARING’,’CLEARING_CURRENT’);如果如上的查询有结果,则需要停止备库的REDOAPPLY,并通过如下的SQL来对其进行CLEARSQLALTERDATABASECLEARLOGFILEGROUPORLGROUP#fromthequeryabove;4、确定没有大量的GAP主库执行如下SQL查看主库当前的REDOSEQUENCESQLSELECTTHREAD#,SEQUENCE#FROMV$THREAD;在备库上执行如下查询,确定查询出来的结果与上面的结果相比较只差1-2个数值SELECTTHREAD#,MAX(SEQUENCE#)FROMV$ARCHIVED_LOGWHEREAPPLIED='YES'ANDRESETLOGS_CHANGE#=(SELECTRESETLOGS_CHANGE#FROMV$DATABASE_INCARNATIONWHERESTATUS=‘CURRENT’)GROUPBYTHREAD#;5、确定主库以及目标备库的所有文件都为ONLINE主备库分别执行如下SQL,查看tempfile是否正常,如果备库上缺失文件则需要进行处理:SELECTTMP.NAMEFILENAME,BYTES,TS.NAMETABLESPACEFROMV$TEMPFILETMP,V$TABLESPACETSWHERETMP.TS#=TS.TS#;在主备库分别执行如下SQL,查看数据文件状态,结果应该一致SELECTNAMEFROMV$DATAFILEWHERESTATUS=’OFFLINE’;如果备库上有比主库多出的OFFLINE状态的数据文件,则将其ONLINE:ALTERDATABASEDATAFILE&FILE_IDONLINE;二、切换1、检查主库是否可切换至STANDBY主库执行如下SQL执行检查SQLSELECTSWITCHOVER_STATUSFROMV$DATABASE;SWITCHOVER_STATUS-----------------TOSTANDBY如上的SQL查询结果如果为”TOSTANDBY”或者”SESSIONSACTIVE”表示主库可切换至STANDBY,如果不为这两个值,则说明REDO传输存在问题。2、停止主库第一个节点以外的所有实例(RAC)最好使用shutdownnormal或者shutdownimmediate方式停止数据库。如果使用了shutdownabort将其他节点进行了关闭,则需等待RACreconfig完成,且第一个节点将其余REDO正常前滚或回滚3、切换主库至STANDBY角色将主库切换至STANDBYALTERDATABASECOMMITTOSWITCHOVERTOSTANDBYWITHSESSIONSHUTDOWN;如果遇到ORA-16139报错,且V$DATABASE视图中DATABASE_ROLE字段的值已为”PHYSICALSTANDBY”,则可继续(这种问题的出现其中一个可能是数据库有大量的数据文件)。4、确定STANDBY收到EOR在主库的ALERT日志中可以看到类似如下的信息:Switchover:Primarycontrolfileconvertedtostandbycontrolfilesuccesfully.TueMar1516:12:152011MRP0startedwithpid=17,OSid=2717MRP0:BackgroundManagedStandbyRecoveryprocessstarted(SFO)SerialMediaRecoverystartedManagedStandbyRecoverynotusingRealTimeApplyOnlinelogfilepre-clearingoperationdisabledbyswitchoverMediaRecoveryLog/u01/app/flash_recovery_area/SFO/archivelog/2011_03_15/o1_mf_1_133_6qzl0yvd_.arcIdentifiedEnd-Of-Redoforthread1sequence133ResettingstandbyactivationID0(0x0)MediaRecoveryEnd-Of-RedoindicatorencounteredMediaRecoveryApplieduntilchange4314801MRP0:MediaRecoveryComplete:End-Of-REDO(SFO)MRP0:BackgroundMediaRecoveryprocessshutdown(SFO)TueMar1516:12:212011Switchover:Complete-Databaseshutdownrequired(SFO)Completed:ALTERDATABASECOMMITTOSWITCHOVERTOPHYSICALSTANDBYWITHSESSIONSHUTDOWN同时在所有备库的ALERT日志中可以看到类似如下的信息:TueMar1516:12:152011RFS[8]:AssignedtoRFSprocess2715RFS[8]:Identifieddatabasetypeas'physicalstandby':ClientisForegroundpid2568MediaRecoveryLog/u01/app/flash_recovery_area/NYC/archivelog/2011_03_15/o1_mf_1_133_6qzl0yjp_.arcIdentifiedEnd-Of-Redoforthread1sequence133ResettingstandbyactivationID2680651518(0x9fc77efe)MediaRecoveryEnd-Of-RedoindicatorencounteredMediaRecoveryContinuingResettingstandbyactivationID2680651518(0x9fc77efe)MediaRecoveryWaitingforthread1sequence1345、检查STANDBY能够切换至PRIMARY目标备库上执行如下SQL进行检查SQLSELECTSWITCHOVER_STATUSFROMV$DATABASE;SWITCHOVER_STATUS-----------------TOPRIMARY如上的SQL查询结果如果为”TOPRIMARY”或者”SESSIONSACTIVE”表示目标备库可切换至PRIMARY,如果不为这两个值,则说明REDO传输或者应用存在问题。6、切换备库至PRIMARY在目标备库执行如下命令ALTERDATABASECOMMITTOSWITCHOVERTOPRIMARYWITHSESSIONSHUTDOWN;同时在alert日志中有类似如下信息TueMar1516:16:442011ALTERDATABASECOMMITTOSWITCHOVERTOPRIMARYWITHSESSIONSHUTDOWNALTERDATABASESWITCHOVERTOPRIMARY(NYC)Maximumwaitforroletransitionis15minutes.Switchover:MediarecoveryisstillactiveRoleChange:CancelingMRP-nomoreredotoapplyTueMar1516:16:452011MRP0:BackgroundMediaRecoverycancelledwithstatus16037Errorsinfile/u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_2467.trc:ORA-16037:userrequestedcancelofmanagedrecoveryoperationManagedStandbyRecoverynotusingRealTimeApplyRecoveryinterrupted!WaitingforMRP0pid2460toterminateErrorsinfile/u01/app/diag/rdbms/nyc/NYC/trace/NYC_pr00_2467.trc:ORA-16037:userrequestedcancelofmanagedrecoveryoperationTueMar1516:16:452011MRP0:BackgroundMediaRecoveryprocessshutdown(NYC)RoleChange:CanceledMRP7、打开新的主库在新的主库上打开数据库ALTERDATABASEOPEN;8、检查新主库的TEMPFILE如果存在问题则进行处理。9、重启新的备库首先停止新的备库SHUTDOWNABORT;注:如果使用immediate停止数据库,则其依然会使用abort方式停止数据库,会在alert日志中看到类似如下信息:PerformingimplicitshutdownabortduetoswitchovertophysicalstandbyShuttingdowninstance(abort)Licensehighwatermark=15USER(ospid:14665):terminatingtheinstanceInstanceterminatedbyUSER,pid=14665启动新的备库:SQLSTARTUPMOUNT;SQLALTERDATABASERECOVERMANAGEDSTANDBYDATABASEUSINGCURRENTLOGFILEDISCONNECTFROMSESSION;10、意外或回退参考AppendixA.4.5RollBackAfterUnsuccessfulSwitchoverandStartOver三、无法正常切换的处理若主数据库异常中断无法连接做switchover处理,需要将灾备环境强制切换为主库(即failover),需