Back to Main 

 

Download  Zipped scripts for this Automation (goldengate_dr_automation.zip)

PDF Version of this Document

 

Oracle GoldenGate Disaster Recovery

Switchover / Failover Automation

 

 

 

 

 

 

 

 

Author:                        Vladimir Grigorian

Review Date:              March 20, 2015

Last Updated:            March 20, 2015

Version:                       2.0

 

 

 

 

Table of Contents

 

Document Control   ……………………………………………………………………….      3

Introduction  ………………………………………………………………………………      4

Overview of Standby and GoldenGate Configuration  ………………………………..       5

1.      GoldenGate Failover scripts Functionality and Purpose………………………      8

2.      Implementation …………………………………….……………………………     10

3.      Testing of GoldenGate Failover Capaility.abase for Target …………………..    11

 

APPENDIX ………………………………………………………………………………..    16

 

Document Control

 

Change Record

                                                    

Date

Author

Version

Change Reference

 

 

 

 

 

 

March 10, 2011

Vladimir Grigorian

1.0

First Draft

 

 

 

 

 

 

March 20, 2011

Vladimir Grigorian

2.0

Changed scripts from 5 mins checks from cron to on-demand run. Added first SCN after failover check.

 

 

 

 

 

 

 

Reviewers

Name

Position

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Distribution

Copy No.

Name

Location

 

 

 

 

 

1          

 

 

 

2          

 

 

 

3

 

 

 

 

This publication may not be reproduced, stored in a retrieval system, or transmitted in whole or in part, in any form, or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of Cognizant/ Xerox Company.

 

 

 

 

 

Introduction

 

Purpose

The purpose of this document is document is to outline a procedure for failing over GoldenGate from a primary database to its standby in an Oracle DataGuard configuration. This is a Proof of Concept setup.

Usually, production databases with a running GoldenGate configuration, which replicates to a DSS or another environment, also utilize a standby database for disaster recovery. The challenge most DBA’s are facing is that if a primary datacenter is inaccessible, the rebuilding of a new GoldenGate setup on the failedover standby instance either takes time, or becomes impossible because primary GoldenGate configuration files and trails are no longer available. If DataGuard failover takes minutes to complete, the reinstantiation of GoldenGate on the new primary could take weeks.

Therefore, the purpose of this proof of concept test is to implement and successfully test a functionality that will allow the following:

1)       The periodic (every 5 minutes) parameter file sync up between a primary GoldenGate and standby. This will run from the primary database. It will first make backups of the files being replaced. The sync up should automatically change copied parameter files on the standby to reflect its specific parameters, such as $TNS_ADMIN, $ORACLE_SID, dirdat, etc. The sync up should not overwrite parameter files that do not exist on primary.

2)       The on-demand script run which will get the first SCN after failover to ensure no data is lost after standby has been started as primary. The script will enable instance GoldenGate parameters, logon to GGSCI, enable and start processes synced up and changed to the standby’s specs. The script then will pass the SCN to the obey GoldenGate script. The script should then clean up the obey script, so the first SCN after failover is deleted.

 

Audience

This document is intended for:

·         Oracle DBAs and GoldenGate Administrators.

 

 

Assumptions

Following assumptions have been verified prior to installing and configuring Oracle GoldenGate.

1.       Oracle Dataguard has been installed and is functioning properly.

2.       GoldenGate is installed on the primary and standby, but functioning on primary.

3.       This is a test system which is acceptable to switchover for testing purposes.

Overview of Standby and GoldenGate Configuration

 

The following outlines the intended setup for this configuration.

 

Table 1. Server Information (Due to Hardware Constraint this POC was executed on a Single Server)

 

 

Source

Destination

Hostname

Oralinux6

Oralinux6

OS

3.8.13-55.1.2.el7uek.x86_64

3.8.13-55.1.2.el7uek.x86_64

GoldenGate Home

/u02/gg/source

/u02/gg/target

Oracle Instance

PBMP

STBMP

Oracle Version

12.1.0.2.0

12.1.0.2.0

GoldenGate version

11.2.1.0.27

11.2.1.0.27

 

Table 2. GoldenGate Processes Information

 

 

Primary

Standby

Manager Port

7809

7810

DataPump

ESNDRT01

ESNDRT01

Failover Scripts in

/u02/gg/source/scripts

/u02/gg/source/scripts

Trails

/u02/gg/source/dirdat/s1

/u02/gg/source/dirdat/s1

GoldenGate Admin

GGATE

GGATE

User schemas

SENDER

SENDER

 

 

 

Fig 1. Regular DataGuard and GoldenGate Operation

 

 

 

 

Fig 2. Switchedover DataGuard and GoldenGate Configuration

 

 

 

 

 

 

 

 

 

 

 

1.     GoldenGate Failover  Scripts Functionality and Purpose

 

 

Script Name

Purpose

Function

ggdr_sync_01_rsync.ksh

Sync up

Backup parameter files, syncup files, but exclude MGR

ggdr_sync_02_rename_SID.ksh

Sync up

Rename SID to the standby’s or other STBY specific parameters

ggdr_01_if_switchedover_enable_GG.ksh

GG Failover

Check STANDBY_BECAME_PRIMARY_SCN in V$DATABASE. Insert it into obey script (script #4)

ggdr_02_SQL_Eable_GG_param.ksh

GG Failover

Login as SYSDBA and issue “ALTER SYSTEM SET enable_goldengate_replication=TRUE SCOPE=BOTH; then kick off the next script

ggdr_03_failover_GoldenGate.ksh

GG Failover

This is a GG wrapper script that takes input from GGSCI script _obey_switchover.ksh

ggdr_04_obey_switchover.ksh (identical to ggdr_04_obey_switchover_VANILLA.ksh when idle, except at run time when SCN is inserted into it)

GG Failover

This is the actual GG script that logs into db in GGSCI, starts manager, forcestops processes, adds new processes, adds exttrails, starts processes from STANDBY_BECAME_PRIMARY_SCN (from script #1).

ggdr_05_rename_obey_afterGG_switchover.ksh

GG Failover

Deletes the obey script ggdr_04_obey_switchover.ksh  with a STANDBY_BECAME_PRIMARY_SCN so it cannot run again. Copies ggdr_04_obey_switchover_VANILLA..ksh (which doesn’t have the SCN) to ggdr_04_obey_switchover.ksh so it can be used for future switchover.

Table 3. Script Names and their Purpose

 

 

 

 

Fig 3. Sync Up Functionality

 

 

 

 

 

Fig 4. GoldenGate Failover Functionality

 

 

2.     Implementation

 

To enable automatic DataGuard and GoldenGate failover capability the following tasks are executed:

1)      Alter scripts in the APPENDIX of this document so they reflect your environment ($GG, $SID, $ORACLE_HOME, etc.). Put them in $GG/scripts

2)      Schedule the sync up script to run from cron (the failover script is run on-demand). The failover script can be run on-demand. To failover GoldenGate only the first script needs to be run - ggdr_01_if_switchedover_enable_GG.ksh

3)      Maintain scripts , especially the obey scripts, to reflect new processes. All this information can be received from dirprm and periodic reports.

Once these tasks are executed,  GoldenGate will failover automatically 5 minutes after standby is started as the primary.

 

 

Task

Scripts

Server

Runs from

Sync up

ggdr_sync_01_rsync.ksh, ggdr_sync_02_rename_SID.ksh

PRIMARY

Cron, every 5 mins

GG Failover

ggdr_01_if_switchedover_enable_GG.ksh, ggdr_02_SQL_Eable_GG_param.ksh, ggdr_03_failover_GoldenGate.ksh,

ggdr_04_obey_switchover.ksh,

ggdr_05_rename_obey_afterGG_switchover.ksh

 

 

 

STANDBY

On-demand

Table 5. DataGuard and GoldenGate Failover Scripts Implementation

 

 

 

 

 

 

 

 

 

 

 

 

3.     Testing of DataGuard GoldenGate Failover  Capability

 

To test this functionality simply switchover Dataguard and the sync up scripts will be kicked off automatically. Don’t forget to disable sync up scripts from the primary because they are based on the assumption that the primary server is out of commission and they therefore cannot run, which is not the case in the failover situation. The progress of the process can be derived from /tmp logs, ggserr.log or simple email alerts as shown below.

 

 

 

Fig 6. Email Notification for GoldenGate PRIM => STBY Parameters Sync Up

 

 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+ Oracle GoldenGate Disaster Recovery Failover

+ Step 3 of 5.

+ Start Goldengate wrapper OBEY script from new primary GG home.

+ Proceed to step 4.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

 

Oracle GoldenGate Command Interpreter for Oracle

Version 12.1.2.0.2 19269784 19683584_FBO

Linux, x64, 64bit (optimized), Oracle 12c on Oct 19 2014 16:12:14

Operating system character set identified as UTF-8.

 

Copyright (C) 1995, 2014, Oracle and/or its affiliates. All rights reserved.

 

 

 

GGSCI (oralinux6) 1> Successfully logged into database.

 

GGSCI (oralinux6) 2> MGR is already running.

 

 

GGSCI (oralinux6) 3> ERROR: Invalid command.

 

GGSCI (oralinux6) 4>

 

 

GGSCI (oralinux6) 5>

Sending FORCESTOP request to EXTRACT ESNDRT01 ...

Request processed.

 

 

GGSCI (oralinux6) 6> Deleted EXTRACT ESNDRT01.

 

 

GGSCI (oralinux6) 7>

 

 

GGSCI (oralinux6) 8> EXTRACT added.

 

 

GGSCI (oralinux6) 9>

 

 

GGSCI (oralinux6) 10> EXTTRAIL added.

 

 

GGSCI (oralinux6) 11>

 

 

GGSCI (oralinux6) 12>

GGSCI (oralinux6) 13> EXTRACT altered.

 

 

GGSCI (oralinux6) 14>

GGSCI (oralinux6) 15>

Sending START request to MANAGER ...

EXTRACT ESNDRT01 starting

 

 

GGSCI (oralinux6) 16>

GGSCI (oralinux6) 17>

GGSCI (oralinux6) 18>

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

 

MANAGER     RUNNING

EXTRACT     STOPPED     ESNDRT01    00:00:00      00:00:00

 

 

GGSCI (oralinux6) 19>

GGSCI (oralinux6) 20>

GGSCI (oralinux6) 21> bash: EOF: command not found...

 

 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+ Oracle GoldenGate Disaster Recovery Failover

+ Step 4 of 5.

+ Wipe out old process. Add extract. Add exttrail. Start mgr and extract.

+ Proceed to step 5.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

 

 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+ Renaming GG OBEY file ggdr_04_obey_switchover.ksh

+ to ggdr_04_obey_switchover.ksh.already_run

+ to prevent in-loop processes dropping

+ and recreation.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

 

 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+ Oracle GoldenGate Disaster Recovery Failover

+ Step 5 of 5.

+ GoldenGate has now switched over and runs on STANDY

+ Comments to vladimir.grigorian@xerox.com.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

Completed GoldenGate DR failover

 

Fig 7. A Completed GoldenGate Failover (from /tmp Log)

 

 

 

 

APPENDIX

 

Exclude_list.txt

mgr.prm

/u02/gg/target/dirprm/mgr.prm

jagent.prm

/u02/gg/target/dirprm/jagent.prm

 

ggdr_sync_01_rsync.ksh

 

############################################

# Name: ggdr_sync_01_rsync.ksh #

# Author: vladimir.grigorian@xerox.com

###########################################

export GGATE=/opt/app/gg2/gg

alias gate='clear;cd $GGATE;./ggsci'

export PATH=/opt/app/gg2/gg:/usr/sbin:/usr/lib/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:$PATH

export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/usr/lib:/opt/app/gg2/gg:

LOGDIR=/opt/app/gg2/gg/log

EMAILFile=/tmp/ggdr.log

BOX=$(uname -a | awk '{print $2}')

 

rm -rf /tmp/ggdr.log

 

rm -rf /u02/gg/target/diprm

 

echo -e "Starting GoldenGate parameters sync between production MA $BOX and Disaster recovery server tt0sludb07pma \n Refreshing dirdat\n " >> /tmp/ggdr.log

 

rsync -avzhe ssh /u02/gg/source/dirprm --exclude-from '/u02/gg/source/scripts/exclude_list.txt' localhost:/u02/gg/target/ >> /tmp/ggdr.log

 

echo -e " \n \n Changing DR GoldenGate parameter files to reflect the Standby SID  \n \n " >> /tmp/ggdr.log

 

/u02/gg/source/scripts/ggdr_sync_02_rename_SID.ksh >> /tmp/ggdr.log

 

echo -e " \n \n GoldenGate DR Sync Completed successfully on $(date) on $BOX \n \n " >> /tmp/ggdr.log

 

 

##########################################################

## SENDING EMAIL IF ERRORS ARE IN LOGFILE ###

##########################################################

 

mailx -s "PROD MA GoldenGate Disaster Recovery Parameters Sync Completed on $BOX at $(date)" grigorianvlad@gmail.com < $EMAILFile

 

 

ggdr_sync_02_rename_SID.ksh

 

############################################

# Name: ggs_lag.ksh #

# Author: vladimir.grigorian@acs-in.com

# PURPOSE: TO MONITOR LAG OF GOLDEN GATE #

# NOTE: THIS SCRIPT CALLS ggs.ksh #

# THIS SCRIPT NOTIFY IF LAG IS MORE THEN 30 MIN #

# ONLY FOR FOR EXT AND PMP PROCESS GROUP #

###########################################

export GGATE=/opt/app/gg2/gg

alias gate='clear;cd $GGATE;./ggsci'

export PATH=/opt/app/gg2/gg:/usr/sbin:/usr/lib/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:$PATH

export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/usr/lib:/opt/app/gg2/gg:

LOGDIR=/opt/app/gg2/gg/log

EMAILFile=/tmp/ggdr.log

BOX=$(uname -a | awk '{print $2}')

 

filepath="/u02/gg/target/dirprm"

searchstring="PBMP"

replacestring="STBMP"

 

i=0;

 

for file in $(grep -l -R $searchstring $filepath)

do

  cp $file $file.bak

  sed -e "s/$searchstring/$replacestring/ig" $file > tempfile.tmp

  mv tempfile.tmp $file

 

  let i++;

 

  echo "Modified: " $file

done

ggdr_01_if_switchedover_enable_GG.ksh

#!/bin/bash

echo -e " \n \n +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n

+ Oracle GoldenGate Disaster Recovery Failover \n

+ Created 3/15/2015 vladimir.grigorian@xerox.com 5 \n

+ Step 1 of 5. Check if standby has failed or switched over, if it has get the first SCN for GoldenGate. \n

+ Proceed to step 2. \n

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n "

 

. /home/oracle/stbmp.env

export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1

export ORACLE_SID=STBMP

sqlplus -s sys/oracle12cDB@STBMP as sysdba <<EOF  > get_STANDBY_BECAME_PRIMARY_SCN.log

SET PAGESIZE 0 FEEDBACK OFF VERIFY OFF HEADING OFF ECHO OFF trimspool on

select STANDBY_BECAME_PRIMARY_SCN from v\$database;

EXIT;

EOF

result=`cat get_STANDBY_BECAME_PRIMARY_SCN.log`

#echo "alter extract ESNDRT01 SCN $result " > /u02/gg/source/scripts/SCN.log

echo "alter extract ESNDRT01 SCN $result " >> /u02/gg/source/scripts/ggdr_04_obey_switchover.ksh

echo -e " \n start e* \n " >> /u02/gg/source/scripts/ggdr_04_obey_switchover.ksh

echo -e " \n info all \n " >> /u02/gg/source/scripts/ggdr_04_obey_switchover.ksh

echo -e " \n exit \n " >> /u02/gg/source/scripts/ggdr_04_obey_switchover.ksh

echo "The first after failover SCN is written to GoldenGate obey script $result "

/u02/gg/source/scripts/ggdr_02_SQL_Eable_GG_param.ksh

ggdr_02_SQL_Eable_GG_param.ksh

#!/bin/sh

echo -e " \n \n +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n + Oracle GoldenGate Disaster Recovery Failover \n + Step 2 of 5.  \n + Enable SQL "enable_goldengate_replication=TRUE". \n + Proceed to step 3. \n ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n "

export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1

export ORACLE_SID=STBMP

cd /home/oracle

. ./stbmp.env

 

sqlplus / as sysdba <<EOF

 

ALTER SYSTEM SET enable_goldengate_replication=TRUE SCOPE=BOTH;

exit

EOF

#exit

/u02/gg/source/scripts/ggdr_03_failover_GoldenGate.ksh

 

ggdr_03_failover_GoldenGate.ksh (identical to ggdr_04_obey_switchover_VANILLA.ksh)

 

#!/bin/bash

echo -e " \n \n +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n + Oracle GoldenGate Disaster Recovery Failover \n + Step 3 of 5.  \n + Start Goldengate wrapper OBEY script from new primary GG home. \n + Proceed to step 4. \n ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n "

. /home/oracle/stbmp.env

/u02/gg/target/ggsci < /u02/gg/source/scripts/ggdr_04_obey_switchover.ksh

EOF

echo -e " \n \n +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n + Oracle GoldenGate Disaster Recovery Failover \n + Step 4 of 5.  \n + Wipe out old process. Add extract. Add exttrail. Start mgr and extract. \n + Proceed to step 5. \n ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n "

/u02/gg/source/scripts/ggdr_05_rename_obey_afterGG_switchover.ksh

echo "Completed GoldenGate DR failover"

exit

ggdr_04_obey_switchover.ksh

 

dblogin userid ggate@stbmp, password oracle

start mgr

sleep 7

sh sleep 7

stop e*, forcestop!

delete ESNDRT01

sh sleep 5

add extract ESNDRT01, tranlog, begin now

sh sleep 5

add exttrail /u02/gg/source/dirdat/s1, extract ESNDRT01

sh sleep 5

 

 

 

ggdr_05_rename_obey_afterGG_switchover.ksh

 

echo -e " \n \n +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n + Cleaning GG OBEY file ggdr_04_obey_switchover.ksh \n + for the next failover \n ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n "

rm -rf /u02/gg/source/scripts/ggdr_04_obey_switchover.ksh

cp  /u02/gg/source/scripts/ggdr_04_obey_switchover_VANILLA.ksh /u02/gg/source/scripts/ggdr_04_obey_switchover.ksh

echo -e " \n \n +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n + Oracle GoldenGate Disaster Recovery Failover \n + Step 5 of 5.  \n + GoldenGate has now switched over and runs on STANDY \n + Comments to vladimir.grigorian@xerox.com. \n ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ \n "

 

END OF DOCUMENT