PRTG - Veritas Backup Exec monitoring

This was originally posted by myself here: https://kb.paessler.com/en/topic/58233-symantec-backupexec-monitoring#reply-262024

We monitor backups a little bit more advanced – I thought I should share this knowledge as well…

Single Job Monitoring: Monitor a single job and it’s results – allows you configure the Job-Name with a PRTG filter value in the SQL sensor. The results will include various values – most notable are:

FinalJobStatus (as text)
TotalDataSizeBytes
TotalNumberOfDirectories
TotalNumberOfFiles
TotalRateMBMin

In order to implement this sensor, add a SQLv2 Sensor and configure like this:

Database: BEDB
Instance Name: BKUPEXEC (in most cases)
Use input parameter: specify the exact job name
DBNull = Error

Channels: Most channels are rather simple to configure, they are counters, SpeedDisk or BytesDisk – as PRTG has those channel types integrated already. The special channel is FINALJOBSTATUS – in order to have this working you will need the “backupexec.jobstatus.ovl” file in your %programfilesx86%\PRTG…\Lookups directory – see below for the file.

SQL Script for single job monitoring:

SELECT TOP 1
	J.JobName,
	H.TotalDataSizeBytes, H.TotalRateMBMin, H.TotalNumberOfFiles, H.TotalNumberOfDirectories,
	H.FinalJobStatus,
	H.ActualStartTime, H.EndTime, H.ElapsedTimeSeconds

FROM Jobs J 
	LEFT JOIN TaskDefinition T ON T.JobDefinitionID = J.JobDefinitionID
	LEFT JOIN Schedule S ON S.ScheduleId = T.ScheduleID
	LEFT JOIN JobHistorySummary H ON H.JobID = J.JobID

WHERE S.SystemId IS NOT NULL
	AND H.ActualStartTime > GETDATE()-8 --adjust this value if needed, 8 helps for the last 8 days.. weekly backups (full backups)
	AND J.JobName LIKE @prtg

ORDER BY H.ActualStartTime DESC

SELECT TOP 1

J.JobName,

H.TotalDataSizeBytes, H.TotalRateMBMin, H.TotalNumberOfFiles, H.TotalNumberOfDirectories,

H.FinalJobStatus,

H.ActualStartTime, H.EndTime, H.ElapsedTimeSeconds

FROM Jobs J

LEFT JOIN TaskDefinition T ON T.JobDefinitionID = J.JobDefinitionID

LEFT JOIN Schedule S ON S.ScheduleId = T.ScheduleID

LEFT JOIN JobHistorySummary H ON H.JobID = J.JobID

WHERE S.SystemId IS NOT NULL

AND H.ActualStartTime > GETDATE()-8 --adjust this value if needed, 8 helps for the last 8 days.. weekly backups (full backups)

AND J.JobName LIKE @prtg

ORDER BY H.ActualStartTime DESC

The backupexec.jobstatus.ovl file:

<?xml version="1.0" encoding="UTF-8"?>
<ValueLookup id="backupexec.jobstatus" desiredValue="1" undefinedState="error" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="PaeValueLookup.xsd">
   <Lookups>
	<SingleInt state="warning" value="1">
		Job Cancelled
	</SingleInt>
	<SingleInt state="Ok" value="2">
		Job Completed
	</SingleInt>
	<SingleInt state="warning" value="3">
		Job Exception
	</SingleInt>
	<SingleInt state="warning" value="4">
		Job Dispatched
	</SingleInt>
	<SingleInt state="warning" value="5">
		Job On Hold
	</SingleInt>
	<SingleInt state="error" value="6">
		Job Error
	</SingleInt>
	<SingleInt state="error" value="7">
		Job Invalid Schedule
	</SingleInt>
	<SingleInt state="error" value="8">
		Job Invalid Time Window
	</SingleInt>
	<SingleInt state="error" value="9">
		unknown status code (9)
	</SingleInt>
	<SingleInt state="error" value="10">
		Job Missed
	</SingleInt>
	<SingleInt state="warning" value="11">
		Job Paused
	</SingleInt>
	<SingleInt state="warning" value="12">
		Job Pending
	</SingleInt>
	<SingleInt state="warning" value="13">
		Job Recovered
	</SingleInt>
	<SingleInt state="error" value="14">
		Job Resource Disabled
	</SingleInt>
	<SingleInt state="warning" value="15">
		Job Resumed
	</SingleInt>
	<SingleInt state="Ok" value="16">
		Job Running
	</SingleInt>
	<SingleInt state="Ok" value="17">
		Job Ready
	</SingleInt>
	<SingleInt state="Ok" value="18">
		Job Scheduled
	</SingleInt>
	<SingleInt state="Ok" value="19">
		Job Success
	</SingleInt>
	<SingleInt state="warning" value="20">
		Job Superceded
	</SingleInt>
	<SingleInt state="error" value="21">
		Job Abort (Time)
	</SingleInt>
	<SingleInt state="warning" value="22">
		Job To Be Scheduled
	</SingleInt>
	<SingleInt state="warning" value="23">
		Job Linked to Another Job
	</SingleInt>
	<SingleInt state="error" value="24">
		unknown status value (24)
	</SingleInt>
	<SingleInt state="error" value="25">
		Job Blocked by Template Rule
	</SingleInt>	
   </Lookups>
</ValueLookup>

<?xml version="1.0" encoding="UTF-8"?>

Job Cancelled

</SingleInt>

Job Completed

</SingleInt>

Job Exception

</SingleInt>

Job Dispatched

</SingleInt>

Job On Hold

</SingleInt>

Job Error

</SingleInt>

Job Invalid Schedule

</SingleInt>

Job Invalid Time Window

</SingleInt>

unknown status code (9)

</SingleInt>

Job Missed

</SingleInt>

Job Paused

</SingleInt>

Job Pending

</SingleInt>

Job Recovered

</SingleInt>

Job Resource Disabled

</SingleInt>

Job Resumed

</SingleInt>

Job Running

</SingleInt>

Job Ready

</SingleInt>

Job Scheduled

</SingleInt>

Job Success

</SingleInt>

Job Superceded

</SingleInt>

Job Abort (Time)

</SingleInt>

Job To Be Scheduled

</SingleInt>

Job Linked to Another Job

</SingleInt>

unknown status value (24)

</SingleInt>

Job Blocked by Template Rule

</SingleInt>

</Lookups>

</ValueLookup>

Another SQL script we use is the one below – this actually approaches the whole monitoring more in an overview – it still depends on the JobHistory table, meaning, the job must have been running.. in theory you could work around this and actually get information on the scheduler etc.. the script below is a pure example.

DECLARE @JobFilter AS VARCHAR(500)
DECLARE @TimeWindow AS Integer

SET @JobFilter = '%-FULL' -- we only want to see jobs that with "-FULL" at the end of the job name / use @PRGT if you want to control it in PRTG
SET @TimeWindow = 7 --7 days in each direction for history and scheduled jobs

SELECT 
	COUNT(J.JobName) AS TotalJobs,
	SUM(CASE WHEN H.IsJobActive = 1 THEN 1 ELSE 0 END) AS ActiveJobs,

--	SUM(CASE WHEN H.FinalJobStatus = 1 THEN 1 ELSE 0 END) AS FinalStatus_JobCancelled, 
--	SUM(CASE WHEN H.FinalJobStatus = 2 THEN 1 ELSE 0 END) AS FinalStatus_JobCompleted, 
--	SUM(CASE WHEN H.FinalJobStatus = 3 THEN 1 ELSE 0 END) AS FinalStatus_JobException, 
--	SUM(CASE WHEN H.FinalJobStatus = 4 THEN 1 ELSE 0 END) AS FinalStatus_JobDispatched,
--	SUM(CASE WHEN H.FinalJobStatus = 5 THEN 1 ELSE 0 END) AS FinalStatus_JobOnHold,
	SUM(CASE WHEN H.FinalJobStatus = 6 THEN 1 ELSE 0 END) AS FinalStatus_JobError, 
--	SUM(CASE WHEN H.FinalJobStatus = 7 THEN 1 ELSE 0 END) AS FinalStatus_JobInvalidSchedule,
--	SUM(CASE WHEN H.FinalJobStatus = 8 THEN 1 ELSE 0 END) AS FinalStatus_JobInvalidTimeWindow,
--	SUM(CASE WHEN H.FinalJobStatus = 9 THEN 1 ELSE 0 END) AS FinalStatus_unknownStatusCode9, 
--	SUM(CASE WHEN H.FinalJobStatus = 10 THEN 1 ELSE 0 END) AS FinalStatus_JobMissed, 
--	SUM(CASE WHEN H.FinalJobStatus = 11 THEN 1 ELSE 0 END) AS FinalStatus_JobPaused, 
--	SUM(CASE WHEN H.FinalJobStatus = 12 THEN 1 ELSE 0 END) AS FinalStatus_JobPending,
	--SUM(CASE WHEN H.FinalJobStatus = 13 THEN 1 ELSE 0 END) AS FinalStatus_JobRecovered, 
--	SUM(CASE WHEN H.FinalJobStatus = 14 THEN 1 ELSE 0 END) AS FinalStatus_JobResourceDisabled, 
--	SUM(CASE WHEN H.FinalJobStatus = 15 THEN 1 ELSE 0 END) AS FinalStatus_JobResumed, 
	SUM(CASE WHEN H.FinalJobStatus = 16 THEN 1 ELSE 0 END) AS FinalStatus_JobRunning, 
--	SUM(CASE WHEN H.FinalJobStatus = 17 THEN 1 ELSE 0 END) AS FinalStatus_JobReady, 
--	SUM(CASE WHEN H.FinalJobStatus = 18 THEN 1 ELSE 0 END) AS FinalStatus_JobScheduled, 
	SUM(CASE WHEN H.FinalJobStatus = 19 THEN 1 ELSE 0 END) AS FinalStatus_JobSuccess, 
	SUM(CASE WHEN H.FinalJobStatus = 20 THEN 1 ELSE 0 END) AS FinalStatus_JobSuperceded, 
--	SUM(CASE WHEN H.FinalJobStatus = 21 THEN 1 ELSE 0 END) AS FinalStatus_JobAbort, 
--	SUM(CASE WHEN H.FinalJobStatus = 22 THEN 1 ELSE 0 END) AS FinalStatus_JobToBeScheduled, 
--	SUM(CASE WHEN H.FinalJobStatus = 23 THEN 1 ELSE 0 END) AS FinalStatus_JobLinkedToAnotherJob, 
--	SUM(CASE WHEN H.FinalJobStatus = 24 THEN 1 ELSE 0 END) AS FinalStatus_unknownStatusCode24, 
--	SUM(CASE WHEN H.FinalJobStatus = 25 THEN 1 ELSE 0 END) AS FinalStatus_JobBlockedByTemplateTule, 

	SUM(CASE WHEN H.FinalJobStatus = 3 THEN 1 ELSE 0 END) AS FinalStatus_Exceptions --3 = With Exceptions
--	SUM(CASE WHEN H.FinalJobStatus = 19 THEN 1 ELSE 0 END) AS FinalStatus_SuccessFull --19 = Successful
	
FROM Jobs J 
	LEFT JOIN TaskDefinition T ON T.JobDefinitionID = J.JobDefinitionID
	LEFT JOIN Schedule S ON S.ScheduleId = T.ScheduleID
	LEFT JOIN JobHistorySummary H ON H.JobID = J.JobID

WHERE S.SystemId IS NOT NULL
	AND H.ActualStartTime > GETDATE() - @TimeWindow
	AND J.NextDueDate < GETDATE() + @TimeWindow
	AND J.JobName LIKE @JobFilter

DECLARE @JobFilter AS VARCHAR(500)

DECLARE @TimeWindow AS Integer

SET @JobFilter = '%-FULL' -- we only want to see jobs that with "-FULL" at the end of the job name / use @PRGT if you want to control it in PRTG

SET @TimeWindow = 7 --7 days in each direction for history and scheduled jobs

SELECT

COUNT(J.JobName) AS TotalJobs,

SUM(CASE WHEN H.IsJobActive = 1 THEN 1 ELSE 0 END) AS ActiveJobs,

-- SUM(CASE WHEN H.FinalJobStatus = 1 THEN 1 ELSE 0 END) AS FinalStatus_JobCancelled,

-- SUM(CASE WHEN H.FinalJobStatus = 2 THEN 1 ELSE 0 END) AS FinalStatus_JobCompleted,

-- SUM(CASE WHEN H.FinalJobStatus = 3 THEN 1 ELSE 0 END) AS FinalStatus_JobException,

-- SUM(CASE WHEN H.FinalJobStatus = 4 THEN 1 ELSE 0 END) AS FinalStatus_JobDispatched,

-- SUM(CASE WHEN H.FinalJobStatus = 5 THEN 1 ELSE 0 END) AS FinalStatus_JobOnHold,

SUM(CASE WHEN H.FinalJobStatus = 6 THEN 1 ELSE 0 END) AS FinalStatus_JobError,

-- SUM(CASE WHEN H.FinalJobStatus = 7 THEN 1 ELSE 0 END) AS FinalStatus_JobInvalidSchedule,

-- SUM(CASE WHEN H.FinalJobStatus = 8 THEN 1 ELSE 0 END) AS FinalStatus_JobInvalidTimeWindow,

-- SUM(CASE WHEN H.FinalJobStatus = 9 THEN 1 ELSE 0 END) AS FinalStatus_unknownStatusCode9,

-- SUM(CASE WHEN H.FinalJobStatus = 10 THEN 1 ELSE 0 END) AS FinalStatus_JobMissed,

-- SUM(CASE WHEN H.FinalJobStatus = 11 THEN 1 ELSE 0 END) AS FinalStatus_JobPaused,

-- SUM(CASE WHEN H.FinalJobStatus = 12 THEN 1 ELSE 0 END) AS FinalStatus_JobPending,

--SUM(CASE WHEN H.FinalJobStatus = 13 THEN 1 ELSE 0 END) AS FinalStatus_JobRecovered,

-- SUM(CASE WHEN H.FinalJobStatus = 14 THEN 1 ELSE 0 END) AS FinalStatus_JobResourceDisabled,

-- SUM(CASE WHEN H.FinalJobStatus = 15 THEN 1 ELSE 0 END) AS FinalStatus_JobResumed,

SUM(CASE WHEN H.FinalJobStatus = 16 THEN 1 ELSE 0 END) AS FinalStatus_JobRunning,

-- SUM(CASE WHEN H.FinalJobStatus = 17 THEN 1 ELSE 0 END) AS FinalStatus_JobReady,

-- SUM(CASE WHEN H.FinalJobStatus = 18 THEN 1 ELSE 0 END) AS FinalStatus_JobScheduled,

SUM(CASE WHEN H.FinalJobStatus = 19 THEN 1 ELSE 0 END) AS FinalStatus_JobSuccess,

SUM(CASE WHEN H.FinalJobStatus = 20 THEN 1 ELSE 0 END) AS FinalStatus_JobSuperceded,

-- SUM(CASE WHEN H.FinalJobStatus = 21 THEN 1 ELSE 0 END) AS FinalStatus_JobAbort,

-- SUM(CASE WHEN H.FinalJobStatus = 22 THEN 1 ELSE 0 END) AS FinalStatus_JobToBeScheduled,

-- SUM(CASE WHEN H.FinalJobStatus = 23 THEN 1 ELSE 0 END) AS FinalStatus_JobLinkedToAnotherJob,

-- SUM(CASE WHEN H.FinalJobStatus = 24 THEN 1 ELSE 0 END) AS FinalStatus_unknownStatusCode24,

-- SUM(CASE WHEN H.FinalJobStatus = 25 THEN 1 ELSE 0 END) AS FinalStatus_JobBlockedByTemplateTule,

SUM(CASE WHEN H.FinalJobStatus = 3 THEN 1 ELSE 0 END) AS FinalStatus_Exceptions --3 = With Exceptions

-- SUM(CASE WHEN H.FinalJobStatus = 19 THEN 1 ELSE 0 END) AS FinalStatus_SuccessFull --19 = Successful

FROM Jobs J

LEFT JOIN TaskDefinition T ON T.JobDefinitionID = J.JobDefinitionID

LEFT JOIN Schedule S ON S.ScheduleId = T.ScheduleID

LEFT JOIN JobHistorySummary H ON H.JobID = J.JobID

WHERE S.SystemId IS NOT NULL

AND H.ActualStartTime > GETDATE() - @TimeWindow

AND J.NextDueDate < GETDATE() + @TimeWindow

AND J.JobName LIKE @JobFilter

Finally I wanted to mention what are our real challenges are, and we don’t yet have a really good solution: Our backup runs FULL starting Friday evening… during the week we run incremental backups. Now the incremental backups are not as critical… so let’s focus on the weekends.

What happened every now and then was that e.g. only a few tapes where write able and other might still have been locked or one of our libraries jammed etc..

In the end, it means – we e.g. came in Monday morning and discovered that 50+ % of the backups did not run.

Now, the question is, how do you monitor this. There are about a 150 jobs – they are stacked on each other. In theory I expect let say 5 running jobs, 0 completed and a 145 pending – starting Friday night – over the weekend this number will change constantly.

What I did not yet find is that a good solution that when Backup Exec waits for user interaction like insert tapes, offline library, etc. does wait for user interaction.

As well as the fact that I can’t tell PRTG on Friday I expect e.g. 150 jobs pending, on Saturday 1 PM the number should be more like 75 jobs pending and on Sunday 6 AM is should be down to 50 pending and Sunday 8 PM it should be 0 pending and 150 successful.

This is very granular, making it hard to find a solution. The jobs in our case will not finish – they are within their weekend time-window and will not be auto cancelled and therefor only manually looking in to Backup Exec will tell you if we are making progress or not.

It could be a solution to constantly see if the Total Bytes backed up goes up – but this again is challenging, we would need to compare values over time.. PRTG is as far as I know not directly able to do so and this would mean we would need to have a temp file with values form the last check in some kind of script or database that we would compare too…

So far I did not come up with the ultimate solution – every now and then I think about it a little more.. but well, I am not there yet.

Tags: backup exec sql veritas

PRTG – Veritas Backup Exec monitoring

You may also like...

Recent blog posts

Blog Archives

You may also like...

PRTG sensor to monitor a directory for a specific file type and minimum size and age

APC InRow A/C error monitoring with PRTG

Print Server backup script

Recent blog posts

Blog Archives

Tags