Thursday, August 16, 2012

DataGuard ReadOnly to Apply and back

Dataguard and Reporting Before the days of Active Dataguard, were the days of read-only reporting off the physical standby. It pains to see data-centers with production like hardware and unused physical standby databases doing nothing but receive redo content once every few hours. Active Dataguard is a boon in that way.

In older versions, reporting environments typically received all the archive logs through the day and then, applied them overnight. The database is available through the day as read only , for querying and taken offline once a day for applying the logs and bringing back online to read-only mode.

Below is a shell script to do exactly that.

Its self explanatory




#!/bin/ksh
################################################################
################################################################
########## AUTHOR          : Harris
########## DATE            : 13 Aug 2012
########## VERSION         : 1.0
########## DESCRIPTION     :This script is used to stop physical
#### standby in readonly mode, apply the logs and bring back
##### online
################################################################
# VERSION |    DATE    |   Author   | DETAIL OF CHANGES
#---------------------------------------------------------------
# V0.1    | 08/13/2012 |Harris Baskaran |Initial Script
################################################################
################################################################
set -x
ORATAB=/var/opt/oracle/oratab
##Entries to update##########
export ORACLE_SID=STANDBYTNS
export CONNECTSTR=dg/dg@PRODTNS
export ENVIRON=DB service name
#####Review before editing parameters below this

export NLS_DATE_FORMAT='Mon DD YYYY HH24:MI:SS'
export ORACLE_HOME=`grep -i "$ORACLE_SID:" $ORATAB | grep -v "^#" | grep -v "^\*"| grep ":" | awk -F":" '{print $2}'`
export PATH=$ORACLE_HOME/bin:$PATH
export LD_LIBRARY_PATH=$ORACLE_HOME/ctx/lib:$ORACLE_HOME/lib:$LD_LIBRARY_PATH

touch error.log
touch shutdown_report.log
touch apply_report.log
touch check_report.log
touch switch2ro.log
touch handshake.log


## Email functionality

mail_it_to_all()
{
#Email the report
SEND_DATE=`date`
RECIPIENTS=dba@company.com
MESSAGE_TITLE="Reporting DB ${ORACLE_SID} on ${ENVIRON} ${SEND_DATE}"
/usr/bin/mailx -s "${MESSAGE_TITLE}" ${RECIPIENTS}  << EOF
${SEND_DATE}
`cat error.log`
`cat shutdown_report.log`
`cat apply_report.log`
`cat check_report.log`
`cat switch2ro.log`
`cat handshake.log`
EOF
}


##Find Sequence of logs shipped
log_shipping_check()
{
sqlplus -s -l / as sysdba  << LOGSHIP
whenever sqlerror exit sql.sqlcode;
whenever oserror exit FAILURE
set heading off
SELECT SEQUENCE# FROM V\$ARCHIVED_LOG WHERE FIRST_TIME = (SELECT MAX(FIRST_TIME) FROM V\$ARCHIVED_LOG GROUP BY THREAD#);
exit
LOGSHIP
}

## function to get Production Sequence
runsqlprod()
{
sqlplus -s -l ${CONNECTSTR}  << PRDCHK
whenever sqlerror exit sql.sqlcode;
whenever oserror exit FAILURE
set heading off
SELECT distinct SEQUENCE# FROM V\$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V\$ARCHIVED_LOG GROUP BY THREAD#);
exit
PRDCHK
}

## function to get Standby Sequence
runsqlstdby()
{
sqlplus -s -l / as sysdba  << STDCHK
whenever sqlerror exit sql.sqlcode;
whenever oserror exit FAILURE
set heading off
select max(sequence#) from V\$ARCHIVED_LOG where applied='YES';
exit
STDCHK
}

shutdown_db()
{
sqlplus -s -l / as sysdba <<E1
whenever sqlerror exit sql.sqlcode;
whenever oserror exit FAILURE
spool shutdown_report.log
select name, open_mode from V\$database;
select username, machine from v\$session where status='ACTIVE';
shutdown immediate;
spool off
exit
E1
}

apply_logs()
{
sqlplus -s -l / as sysdba <<E2
whenever sqlerror exit sql.sqlcode;
whenever oserror exit FAILURE
spool apply_report.log
startup nomount;
alter database mount standby database;
recover managed standby database using current logfile disconnect from session;
spool off
exit
E2
}


#### custom handshake function with whatever reporting job you run
handshake()
{
sqlplus -s -l / as sysdba <<E4
whenever sqlerror exit sql.sqlcode;
whenever oserror exit FAILURE
##Replace with code here
##@handshake.sql
exit
E4
}

switch_to_ro()
{
sqlplus -s -l / as sysdba <<E2
whenever sqlerror exit sql.sqlcode;
whenever oserror exit FAILURE
spool switch2ro.log
recover managed standby database cancel;
alter database open read only;
spool off
exit
E2
}



##Find out if the logs are shipped already, if not no point continuing

PRODSEQ=`runsqlprod`
SHIPPED=`log_shipping_check`

if [ ${SHIPPED} -lt ${PRODSEQ} ]
then
echo "The "${ENVIRON}" Reporting and Source Logs are not shipped through the day. Need DBA intervention" >> error.log
echo "Production sequence is "${PRODSEQ} >> error.log
echo "Standby Sequence shipped is "${SHIPPED} >> error.log
mail_it_to_all()
exit 1
fi




####Shutdown the reporting to allow log application

shutdown_db

if grep -c 'ORA-' shutdown_report.log
then
echo ${ENVIRON}"-Error in bringing down reporting database..."
mail_it_to_all
exit 1
else
echo ${ENVIRON}"-Reporting Shutdown..."
fi


####Apply the archive logs and bring to current

apply_logs

if grep -c 'ORA-' apply_report.log
then
echo ${ENVIRON}"-Error in applying the logs on reporting database..."
mail_it_to_all
exit 1
else
echo ${ENVIRON}"-Reporting Logs Applying..."
fi




####Ensure it is synched with Production



### Sleep 15 minutes 10 times until Standby catches up with Prod site
### Increase freq as required


for i in 1 2 3 4 5 6 7 8 9 10
do
STDBSEQ=`runsqlstdby`
if [ ${STDBSEQ} -lt ${PRODSEQ} ]
then
sleep 60
else
break
fi
done


## If catch up still doesnt happen, send email and die!
if [ ${STDBSEQ} -lt ${PRODSEQ} ]
then
echo "The ${ENVIRON} Reporting and Source could not be synched. Need DBA intervention" >> error.log
echo "Production sequence is ${PRODSEQ} " >> error.log
echo "Standby Sequence is ${STDBSEQ} " >> error.log
mail_it_to_all()
exit 1
EEOF
fi

## Bring to read only

switch_to_ro

####Handshake with Application

#handshake()

if grep -c 'ORA-' handshake.log
then
echo ${ENVIRON}"-Error in handshaking with reporting module..."
mail_it_to_all
exit 1
else
echo ${ENVIRON}"-Reporting handshake done..."
fi

mail_it_to_all
mkdir `date "+%m%d%y"`
/usr/bin/mv *.log `date "+%m%d%y"`\/.

Thursday, January 19, 2012

SCN headroom and CVE-2012-0082

CVE-2012-0082 is a security flaw in Oracle Databases; which manifests in some scenarios which can significantly increase the rate SCN increases. InfoWorld also has documented the problem in details - InfoWorld(http://www.infoworld.com/d/security/fundamental-oracle-flaw-revealed-184163-0)
Oracle’s CPU2012Jan addresses this problem. Oracle Document 1374524.1

To understand the problem it is necessary to define SCN headroom.
The SCN value is an ever increasing value for a database. The threshold for this SCN value is also ever increasing number too. The max limit of SCN is defined by the number of seconds past the year 1988 X 16k, hence as time passes this value also keeps increasing. This is the soft-limit. There is also a hard limit of 218 trillion.

It is also necessary to understand that when a DB link operation between two databases, Oracle automatically selects the higher SCN out of the two databases and syncs them between the two databases.


--On first database
SQL> select name from V$database;

NAME
---------
Database1

SQL> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
398829972857 -- note the SCN number



--On second database
SQL> select name from V$database;

NAME
---------
Database2

SQL> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
391409782850


SQL> select sysdate from dual@Database1.WORLD; -- DB link query

SYSDATE
---------
19-JAN-12



SQL> select dbms_flashback.get_system_change_number from dual;

GET_SYSTEM_CHANGE_NUMBER
------------------------
398829973230 -- notice the significant increase for synchronizing the SCNs




InfoWorld also documents that Oracle Hot backups can significantly increase the scn generation. To find the headroom at any given time use the below query.


select
version,
to_char(SYSDATE,'YYYY/MM/DD HH24:MI:SS') CURR_TIME,
((((
((to_number(to_char(sysdate,'YYYY'))-1988)*12*31*24*60*60) + ((to_number(to_char(sysdate,'MM'))-1)*31*24*60*60) +
(((to_number(to_char(sysdate,'DD'))-1))*24*60*60) + (to_number(to_char(sysdate,'HH24'))*60*60) +
(to_number(to_char(sysdate,'MI'))*60) + (to_number(to_char(sysdate,'SS')))) * (16*1024))
- dbms_flashback.get_system_change_number)/ (16*1024*60*60*24)) scn_headroom
from v$instance;


See Oracle document : ID 1393363.1
Reading through Oracle’ patch if the headroom is less than 62, urgent attention is required. i.e. Apply patch immediately and contact Oracle.
To ensure that the headroom is not being reached rapidly, run the below query.


select max(first_change#)-min(first_change#), to_char(first_time,'YYYY-MM-DD') from V$log_history
group by to_char(first_time,'YYYY-MM-DD')
order by 2 desc;




Sometimes there may be a significant jump in the SCN which can be associated with a DBlink activity. Hot backups can also significantly increase the SCN generation rate as detailed by InfoWorld
If you notice that the SCN rate is very high there is plenty to worry about. If the high SCN rate database is interconnected via DBlink with other databases it can bring down a lot of databases together.
Along with the patch, Oracle have also added a hidden parameter _external_scn_rejection_threshold_hours which needs to be set to 24. There is not much details about this hidden parameter , but it might have something to do with the distributed transaction SCN synch.
As always please ensure you have read and understood the Oracle documentation before applying.