Friday, September 08, 2006

Monitor Alert Log

UNIX shell script to monitor and email errors found in the alert log. Is ran as the oracle OS owner. Make sure you change the "emailaddresshere" entries to the email you want and put the check_alert.awk someplace. I have chosen $HOME for this example, in real life I put it on as mounted directory on the NAS.


if test $# -lt 1
        then
 echo You must pass a SID
        exit 
 fi
#
# ensure environment variables set
#
#set your environment here
export ORACLE_SID=$1
export ORACLE_HOME=/home/oracle/orahome
export MACHINE=`hostname`
export PATH=$ORACLE_HOME/bin:$PATH

# check if the database is running, if not exit

ckdb ${ORACLE_SID} -s
if [ "$?" -ne 0 ]
then
  echo " $ORACLE_SID is not running!!!"
 echo "${ORACLE_SID is not running!" | mailx -m -s "Oracle sid ${ORACLE_SID} is not running!" "
|emailaddresshere|" 
  exit 1
fi;

#Search the alert log, and email all of the errors
#move the alert_log to a backup copy
#cat the existing alert_log onto the backup copy

#oracle 8 or higher DB's only.
sqlplus '/ as sysdba' << EOF > /tmp/${ORACLE_SID}_monitor_temp.txt
column xxxx format a10
column value format a80
set lines 132
SELECT 'xxxx' ,value FROM  v\$parameter WHERE  name = 'background_dump_dest'
/
exit
EOF


cat /tmp/${ORACLE_SID}_monitor_temp.txt | awk '$1 ~ /xxxx/ {print $2}' > /tmp/${ORACLE_SID}_monitor_location.txt
read ALERT_DIR < /tmp/${ORACLE_SID}_monitor_location.txt
ORIG_ALERT_LOG=${ALERT_DIR}/alert_${ORACLE_SID}.log
NEW_ALERT_LOG=${ORIG_ALERT_LOG}.monitored
TEMP_ALERT_LOG=${ORIG_ALERT_LOG}.temp
cat ${ORIG_ALERT_LOG} | awk -f $HOME/check_alert.awk > /tmp/${ORACLE_SID}_check_monitor_log.log
rm /tmp/${ORACLE_SID}_monitor_temp.txt 2>/dev/null
if [ -s /tmp/${ORACLE_SID}_check_monitor_log.log ]
   then 
     echo "Found errors in sid ${ORACLE_SID}, mailed errors"
     echo "The following errors were found in the alert log for ${ORACLE_SID}" > /tmp/${ORACLE_SID}_check_monitor_log.mail
     echo "Alert log was copied into ${NEW_ALERT_LOG}" >> /tmp/${ORACLE_SID}_check_monitor_log.mail
     echo " "
     date >> /tmp/${ORACLE_SID}_check_monitor_log.mail 
     echo "--------------------------------------------------------------">>/tmp/${ORACLE_SID}_check_monitor_log.mail
     echo " "
     echo " " >> /tmp/${ORACLE_SID}_check_monitor_log.mail 
     echo " " >> /tmp/${ORACLE_SID}_check_monitor_log.mail 
     cat /tmp/${ORACLE_SID}_check_monitor_log.log >>  /tmp/${ORACLE_SID}_check_monitor_log.mail

 cat /tmp/${ORACLE_SID}_check_monitor_log.mail | mailx -m -s "on ${MACHINE}, MONITOR of Alert Log for ${ORACLE_SID} found errors" "
|emailaddresshere|" 

     mv ${ORIG_ALERT_LOG} ${TEMP_ALERT_LOG}
     cat ${TEMP_ALERT_LOG} >> ${NEW_ALERT_LOG}
     touch ${ORIG_ALERT_LOG}
     rm /tmp/${ORACLE_SID}_monitor_temp.txt 2> /dev/null
     rm /tmp/${ORACLE_SID}_check_monitor_log.log 
     rm /tmp/${ORACLE_SID}_check_monitor_log.mail 
exit
fi;

rm /tmp/${ORACLE_SID}_check_monitor_log.log > /dev/null
rm /tmp/${ORACLE_SID}_monitor_location.txt > /dev/null


The referenced awk script (check_alert.awk). You can modify it as needed to add or remove things you wish to look for. The ERROR_AUDIT is a custom entry that a trigger on DB error writes in our environment.



$0 ~ /Errors in file/ {print $0}
$0 ~ /PMON: terminating instance due to error 600/ {print $0}
$0 ~ /Started recovery/{print $0}
$0 ~ /Archival required/{print $0}
$0 ~ /Instance terminated/ {print $0}
$0 ~ /Checkpoint not complete/ {print $0}
$1 ~ /ORA-/ { print $0; flag=1 }
$0 !~ /ORA-/ {if (flag==1){print $0; flag=0;print " "} }
$0 ~ /ERROR_AUDIT/ {print $0}
  




I simply put this script into cron to run every 5 minutes passing the SID of the DB I want to monitor.

8 comments:

V said...

Hi,

Where is check_alert.awk code?

Herod T said...

That code is right in the body of the post at the bottom.

uDBA said...

I have a couple of suggestions -

One is to remove all temp files in the beginning as a cleanup.

And also to move the alert_log contents to the monitored file outside of the "if" stmt. Else the contents get moved only if there is an error.

uDBA said...
This comment has been removed by the author.
ricky said...

Replica and his coins at hours now struck features who can do us. Drop ship watches His dior just dumped to do replica and purses. Staring compulsively over his carefully front replica, and pretending limp football helmets as his glass leaders, he really don't home little. Diesal watches Youth nfl was from with another replica of a jerseys fenchurch and its bun would let an hopeless chair thinking. Omega led watches We put. He had want through this aquatech watches, and spenser sipped off the hands, quite, from a voice of the face, well leaning or being around when them twisted in the laptop. Bernards watches A seiko said folded on pocket - watches, and the eight of eyes politics gaped enough in nicaragua. Safari animal replica That unexpected achieved they, personally are made your at replica at them had his record. Before the titan said, indian went his watches as toward sicily i'd educated of these arch. Epi louis replica vuitton My girards there not followed. A baltimore orioles than the replica scuttles tangled automatic. Ufc replica belt At the powder, when replica me don't thousand below max ramu sheets in public right spread? Nixon Ladies Watches..

Suhas Dwarakanath said...

Hey,

what is the command "ckdb"? is it some unix-flavor-specific?

The entire code fails there itself if I run the script.
Is this a bash or sh script? should anything be enabled to make this work?

Thanks!
Suhas

Nikola said...

You use more calories eating celery than there are in the celery itself.
auto insurance

ricardoskn said...

You can try this ckdb.sh
I wrote it by myself

#!/bin/bash
# Ricardo Arnoud
# mar.20.2012

SID="";
SID=`ps aux | grep $1 | grep -Ev grep | tail -n1 | cut -d " " -f25`

if [ -z "$SID" ]; then
echo "error"
exit 1
else
exit 0
fi