VMware alert monitoring with PRTG and PowerShell

There is a way to read out and process ALL alerts of your VMware environment using PowerShell and report the results back to PRTG. The script further down in this article does exactly that. What you get looks similar to the graphic here.

This shows you the following channels:

  • Overall status
    • this will stay green as long as there are no unacknowledged warnings or alerts in VMware
    • once a warning or alert is acknowledged, the sensor / script returns to green, because it is no longer anything new
  • Total Alerts – count of acknowledged and not acknowledged alerts
  • Total Alerts – Acknowledged
  • Total Alerts – NOT Acknowledged
  • Total Warnings
  • Total Warnings – Acknowledged
  • Total Warnings – NOT Acknowledged
  • Total Warnings and Alerts
  • Total Warnings and Alerts – Acknowledged
  • Total Warnings and Alerts – NOT Acknowledged

As you can see, you can get more granular with your PRTG statuses if you use the channels for acknowledged warnings / alerts. You could set an upper warning or error limit of 0 on those channels if you still want PRTG to show a warning / error level for acknowledged items.

While writing the script, I decided to create a new value lookup in PRTG to make the overall status clearer. If you adjust the script to add additional statuses for the Overall status channel, you will need to adjust this lookup file as well.

Let’s start with the value lookup file: copy the text from the first script block into a file you store here: C:\Program Files (x86)\PRTG Network Monitor\lookups\custom

Name the file: vmware.alerts.search.ovl

Now we need to create a custom EXE/XML sensor in this directory: C:\Program Files (x86)\PRTG Network Monitor\Custom Sensors\EXEXML

Name the file: VMwareAlerts.ps1

Once both files are created, go to PRTG, add a new EXE/Script Advanced sensor and select the newly created script file. As parameter, either type the host name of your vSphere server or, if you created the sensor underneath the vSphere device in PRTG, simply use %host.

UPDATE: I changed the script because it is better to go with the following expected parameters, making sure you always have control over the username and password used to connect to VMware. Please use the new parameters (username and password, see the connection sketch below) moving forward.

There are still a few challenges you might need to overcome on top of this:

  • install the VMware PowerShell extensions on your PRTG probe server
  • credentials to connect to VMware can be a challenge, as I found while testing this
    • you might need to give the service account of the PRTG probe sufficient access rights – this needs working SSO
    • alternatively, use a stored credentials file in PowerShell – somewhat secure
    • or provide the credentials in clear text in PowerShell – least secure
    • please see line 20, respectively the command “Connect-VIServer”, for more details (a minimal connection sketch follows this list)
  • update: the script now expects username and password as parameters
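To illustrate the last two points, here is a minimal, hypothetical sketch of how the parameter block and the Connect-VIServer call could look – the parameter names are placeholders and not necessarily those of the actual script:

```powershell
# Hypothetical parameter names - the posted script defines its own; adjust accordingly.
param(
    [string]$ViServer,   # vSphere server name / address, e.g. %host in PRTG
    [string]$User,       # account that is allowed to read the alarms
    [string]$Password
)

# Load the VMware PowerShell extensions - newer PowerCLI versions ship as a module,
# older ones as a snap-in (Add-PSSnapin VMware.VimAutomation.Core).
Import-Module VMware.PowerCLI -ErrorAction Stop

Connect-VIServer -Server $ViServer -User $User -Password $Password | Out-Null
```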

You might want to test the script before you add the sensor to PRTG – the best way to do this is directly on the PRTG probe server, running it as the service account of the PRTG probe, to make sure it will work as a sensor later on.

Keep in mind that the script expects a parameter – the VMware vSphere server name / web-address.

This was also posted on the PRTG KB here.

 

SQL Express SQLState 08001 and Error 17

One of the challenges especially with SQL Express is that you need to enable some protocols on the network level first in order to connect to it. You might see an error message like the one below when you try to connect to SQL, stating SQLState 08001 and Error 17.

In order to resolve this, you need to enable named pipes and TCP in the SQL Server Configuration Manager that was installed by default on your system. See the image below for how it should look. Please note that you need to restart the SQL service for those changes to take effect.
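If you prefer scripting this instead of clicking through the GUI, the same change can be made via the SQL Server WMI/SMO provider – a sketch, assuming the default instance name SQLEXPRESS:

```powershell
# Sketch only - assumes the instance is named SQLEXPRESS; adjust the instance and service name.
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SqlWmiManagement')
$wmi = New-Object Microsoft.SqlServer.Management.Smo.Wmi.ManagedComputer

foreach ($protocol in 'Np', 'Tcp') {   # named pipes and TCP/IP
    $p = $wmi.ServerInstances['SQLEXPRESS'].ServerProtocols[$protocol]
    $p.IsEnabled = $true
    $p.Alter()
}

# the SQL service has to be restarted for the change to take effect
Restart-Service -Name 'MSSQL$SQLEXPRESS' -Force
```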

Please note – additional configuration such as the Windows Firewall or other parameters might still be needed; the above just addresses a rather common issue.

Microsoft RADIUS / NPS SQL logging

An issue or question I see again and again – proper RADIUS logging with Microsoft NPS / Network Policy Server.

Let me guide you through a few steps:

  1. Install Microsoft SQL Server or, if not available, SQL Express
    1. be aware – SQL Express has very tight database size limits and no SQL Agent – this might be an issue
  2. Create a new database via SQL Management Studio on the SQL server
    1. name it e.g. RADIUSLogging
  3. run the SQL script from this Microsoft website in a new query window against this database (make sure it is not run against any other database by accident)
    1. you could add a line like USE RADIUSLogging at the very top to prevent this
  4. configure your RADIUS server to log to this SQL server and database
  5. make sure you have fail-over logging to a text file – to avoid issues in case your SQL database grows too big or is not reachable for any reason
    1. decide in the text-file configuration if you want to deny access if there is an issue or if you still want to proceed with the logon

Now you have RADIUS logging the information to a SQL database – actually a single table – and you can dig around in it. The IT-Assets database provides a front-end example for this – you don’t need to use it – but it might be of help – see here.

To interpret all those columns and values – look at the following links for additional information:

You will face the issue that the database grows rapidly, depending on how many requests hit your RADIUS system. Keep a close eye on it – use a monitoring solution like Paessler / PRTG to monitor the size, and keep in mind that SQL Express has size limits, e.g. 10 GB. The full version of Microsoft SQL Server has no such limits and additionally lets you use the SQL Agent to execute tasks. The following script can help you purge data from the RADIUS database to keep its size under control. You can use the SQL Agent (not available in SQL Express) to run it automatically; with SQL Express, either run it manually or automate it with another solution that deletes older entries from the database.

The script will purge data older than 14 days – you can adjust the number of days to your liking / needs.
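As a minimal sketch of such a purge – assuming the table and column names created by the Microsoft script (accounting_data with a timestamp column), which you should verify against your database first – it boils down to a single DELETE, here wrapped in Invoke-Sqlcmd so it can be scheduled:

```powershell
# Sketch only - table/column names are assumptions based on the Microsoft NPS logging script.
$query = @"
USE RADIUSLogging;
DELETE FROM dbo.accounting_data
WHERE [timestamp] < DATEADD(DAY, -14, GETDATE());
"@
Invoke-Sqlcmd -ServerInstance 'SQLSERVER01' -Query $query
```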

Updated domain join script including KeePass / Pleasant Password server entries for local admins

Today I post an updated version of the domain-join script I initially posted here.

In theory you can just replace the script with the new version – assuming you did not make any changes other than adjusting it to your domain / server names.

What changed in the newer version:

  • the top lines in the script hold the basic configuration parameters
    • line 1: NetBIOS name of your Active Directory domain
    • line 2: your DNS domain name
    • line 3: your distinguished domain name / root DN of your domain
    • line 4: your default OU for new workstations
    • line 5: empty
    • line 6: KeePass / Pleasant Password Server URL
    • line 7: KeePass folder to store the password in
  • the script now relies on the above parameters rather than specifying them in various places throughout the script, making the use / adjustment of the script much easier
  • advanced error handling
    • after the user enters the computer name and their domain admin credentials, the script checks whether it can connect to the domain and whether the computer name already exists
      • if the domain credentials are invalid (they can belong to a non-admin, as long as they are valid), you get a message explaining that the script will stop due to wrong credentials
      • if the computer name already exists in the domain, you get a message about it and the script stops
    • KeePass or Pleasant Password Server connection – if it fails to connect with the credentials provided, you get a message about it and the script will stop
  • adjusted messages with various colours
    • white text – standard, as it was before
    • yellow text – highlighted information so it stands out better for the end user
    • magenta text – handled error / failure message – an explanation that something stopped the script from going further
    • red text – real PowerShell error messages – either from unhandled errors, or handled errors printed to the screen as additional reference and help

For additional information, please look at the original post here.

This script is also mentioned on the API Examples page on the Pleasant Solutions web site here.

WDS respective PXE boot and VMware

If you try to PXE boot a VMware guest system that e.g. uses WDS / Windows Deployment Services or similar, you might find that the boot.wim etc. downloads unbelievably slowly – this can take several hours. It is specifically related to booting via the VMware EFI environment; VMware BIOS does not cause this issue. You could switch an EFI system to BIOS to capture / deploy the image, but that is not really a solution, rather a workaround.

The solution is pretty simple: the download is transferred through TFTP, and VMware has an issue with the block size – things get messed up due to the variable block size negotiation between the VM guest system and your PXE server.

Set the Maximum Block Size to 1456, which is the exact value VMware needs to work properly. Additionally, disable the Variable Window Extension and try to PXE boot again – you will see it now loads in about a minute (depending on your WinPE image size) and your issues are in the past.

In detail:

  1. open your Windows Deployment Services
  2. right click on the server and select Properties
  3. navigate to the TFTP tab
  4. set the Maximum Block Size to 1456
  5. uncheck the Enable Variable Window Extension checkbox

The advantage of DFS and how to set up a working structure

File shares are something every IT professional will work with. Many companies have way too complicated and unstructured network file systems, with permissions applied too deep, too many shares and access points, often several mapped drives and – from an IT perspective – nightmares when it comes to migrating to newer servers or giving satellite offices and subsidiaries access, especially over lower-speed connections.

Having been in IT for about 20 years now, I have seen a lot and been challenged with this quite a bit. One of the best solutions I came across is the one I am about to show you here. It is very structured while giving you the flexibility to extend it as you go, and it should work in most businesses.

First of all – please note that I will not go as far as explaining and exploring the differences between Active Directory integrated and stand-alone namespaces. If at all possible, I suggest you use Active Directory integrated namespaces to simplify the rollout, but both would work.

The structure example:

The structure example relies on a DFS root server and a separate file server per root folder of the later network drive. This is just an example – you do not need to split it all up like this, though it helps to keep it as structured as possible.

Example target file system structure:

  • N:\
    • N:\Archive
      • N:\Archive\John Doe
      • N:\Archive\Jane Doe
    • N:\Departments
      • N:\Departments\Marketing
        • General
        • Management
        • Public (anyone has read access)
      • N:\Departments\Accounting
        • General
        • Management
        • Public (anyone has read access)
    • N:\Other
      • N:\Other\Manufacturing
      • N:\Other\Projects

The declared goal is to keep the NTFS rights structure as simple as possible and not go any deeper than e.g. level three – e.g. N:\Departments\Marketing\General.

Each department folder in this example has a Public folder where members of the department have read/write access, while everyone else has read access to the files published there.

The archive tree is for terminated employees and archived data. Their information gets collected in a sub-folder of this tree, a group is created for each of those folders, and only people who have approval to access this data will see and be able to read those archived files (read-only is recommended as the NTFS permission).

The file servers and their preparation:

DFS Root-Server
  • create a folder on the data-partition like D:\DFSRoots – there will not be any real data in this folder – but it will hold the actual DFS structure
    • create sub-folders for the branches of the shared DFS drive, like:
      • D:\DFSRoots\Departments
      • D:\DFSRoots\Archive
      • D:\DFSRoots\Other
DFS Department Server
  • create a folder on the data-partition like D:\SharedFolders\Departments
    • remove the Everyone or Authenticated Users groups from this folder – only System and Domain Admins should have read/write permission here, while the group N_Departments will have read-only access on this folder
    • create a sub-folder for each main folder you want to see under the path N:\Departments and share it
    • add a $ (dollar) sign to the share name so it remains a hidden share
    • Examples:
      • D:\SharedFolders\Departments\Marketing
      • D:\SharedFolders\Departments\Accounting
  • now create the following sub-folders for each department folder, as shown in the Marketing example
    • D:\SharedFolders\Departments\Marketing\General
    • D:\SharedFolders\Departments\Marketing\Management
    • D:\SharedFolders\Departments\Marketing\Public
  • create two groups in Active Directory for Marketing
    • N_Departments_Marketing_General
    • N_Departments_Marketing_Management
  • create a general group N_Departments to use it for all Public folders
  • assign the groups to their respective sub-folders General and Management with read/write rights – you will probably need to remove the read access the group N_Departments inherited on those folders
  • assign the group N_Departments to the Public folder in all departments with read-only rights (if not inherited)
  • assign the group N_Departments_Marketing_General to the Marketing\Public folder with read/write access – allowing each member of marketing to publish information for access to other people – only marketing can write in this folder, other people only have read-access to it
DFS Archive Server
  • create a folder on the data-partition like D:\SharedFolders\Archive
    • create a sub-folder for each main folder you want to see under the path N:\Archive and share it 
    • add a $ (dollar) sign to the share name so it remains a hidden share
    • Examples:
      • D:\SharedFolders\Archive\John Doe
      • D:\SharedFolders\Archive\Jane Doe
DFS Other Server
  • create a folder on the data-partition like D:\SharedFolders\Other
    • create a sub-folder for each main folder you want to see under the path N:\Other and share it 
    • add a $ (dollar) sign to the share name so it remains a hidden share
    • Examples:
      • D:\SharedFolders\Other\Manufacturing
      • D:\SharedFolders\Other\Projects

The DFS namespace setup and configuration

  • add the Namespace \\domain.local\N for the N: drive (just an example)
  • add the folders Archive, Departments and Other to the namespace
  • for each of those folders, add the shared sub-folders indicated in the list below (they will appear on the Namespace tab when you click on the folder in DFS Management) and set the target to the according file share on the specific DFS server where the data will reside (see the PowerShell sketch after this list)
    • Departments\Marketing
    • Departments\Accounting
    • Archive\John Doe
    • Archive\Jane Doe
    • Other\Manufacturing
    • Other\Projects
    • This will actually create a shared sub-folder on the DFS root server for each of those folders in D:\DFSRoots\
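If you prefer PowerShell over the DFS Management console, the same namespace could be built with the DFSN module – a sketch, with placeholder server and share names:

```powershell
# Sketch - server names, the root share and the hidden folder shares are placeholders.
New-DfsnRoot -Type DomainV2 -Path '\\domain.local\N' -TargetPath '\\DFSROOT01\N'

# creating a folder with a target automatically creates the intermediate folder (e.g. Departments)
New-DfsnFolder -Path '\\domain.local\N\Departments\Marketing'  -TargetPath '\\DEPTSRV01\Marketing$'
New-DfsnFolder -Path '\\domain.local\N\Departments\Accounting' -TargetPath '\\DEPTSRV01\Accounting$'
New-DfsnFolder -Path '\\domain.local\N\Archive\John Doe'       -TargetPath '\\ARCHSRV01\John Doe$'
New-DfsnFolder -Path '\\domain.local\N\Other\Projects'         -TargetPath '\\OTHERSRV01\Projects$'
```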

Note – information about the above example

The example above is kept simple – I did not go into each and every right you would need to assign, for the sole purpose of keeping it simple and understandable. Please investigate and set the rights as you really need them.

As for the Archive tree, it might be beneficial to have a PowerShell script automate the folder creation, group creation and rights assignment for those NTFS paths, so you limit the possible failure rate when archiving terminated employee data and other material in this branch.
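A rough sketch of such an automation – the group naming convention, share, server and DFS paths are just examples and not part of the structure described above:

```powershell
# Sketch only - adjust naming convention, server names and DFS paths to your environment.
param([string]$EmployeeName = 'John Doe')

$folder = "D:\SharedFolders\Archive\$EmployeeName"
$share  = "$EmployeeName`$"                                # hidden share
$group  = 'N_Archive_' + ($EmployeeName -replace ' ', '')  # hypothetical naming convention

New-Item -Path $folder -ItemType Directory | Out-Null
New-SmbShare -Name $share -Path $folder -FullAccess 'Domain Admins' | Out-Null
New-ADGroup -Name $group -GroupScope Global -GroupCategory Security

# read-only NTFS permission for the approval group, as recommended above
icacls $folder /grant "${group}:(OI)(CI)R" | Out-Null

# publish the folder in the namespace
New-DfsnFolder -Path "\\domain.local\N\Archive\$EmployeeName" -TargetPath "\\ARCHSRV01\$share"
```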

What are the real benefits of this

  • add multiple folder targets for replication
    • replication can be beneficial in a server-migration scenario as well as in a subsidiary scenario
    • in the departments branch, for example, you can add replication per department folder – not every subsidiary will need a mirror of every department, rather just a few – this decreases the amount of data, the load on the connection and the size of the server respective its disk space, and reduces cost as well
  • a simple rights structure fully based on groups
    • in general you should never ever use a user account to assign any rights – always create a group, whether for a drive-share, NTFS rights or any other purpose. Always create a group!
    • you can add and remove users from those groups
    • you can audit the permissions on the NTFS side rather quickly because they relate to meaningful group names
    • the groups can be audited against HR lists of members of the department or by department managers and directors to make sure only people that need to have certain access levels will have them
  • while limiting access to certain folders you limit the amount of damage a possible attack by malware could cause 
  • you can divide or summarize the actual file-servers that hold the data as needed in the long run
  • a simple group design with limited depth permissions is easier to maintain and audit
  • you have one central network drive that you assign to give everyone access – all data resides centrally under this path, independent from any file server host name. This can be a huge advantage because some applications might reference a UNC path instead of the mapped drive, which could cause you major headaches whenever you want to migrate / upgrade or retire your file servers later on
  • other file shares within the corporation, in other locations, could be made accessible by linking them in as a folder in e.g. the Other branch – users no longer need to know and remember the UNC path or be allowed to browse arbitrary UNC paths – it acts like part of the mapped drive while pointing to a UNC path in the background

There are many more advantages to DFS and the whole design. I hope this gives you a good overview and idea of how to design or re-design your file-server structure and simplify the whole access structure. 

Full text search and DFS drive mappings

This is a challenge that is not easy to overcome. Still, though there is no official, directly implemented solution from Microsoft for this, I was able to develop and provide a solution that accesses the Windows Search index and provides it back to the end user using only standard Windows components. All you need to know and do is described in the IT Search section of this web site.

Enable SMBv1 on Windows 10 per GPO

SMBv1 is an insecure protocol that you should avoid if at all possible. Windows 10 has SMBv1 disabled by default. To enable it, you would need to go to the Control Panel and activate the Windows Feature “SMB 1.0/CIFS File Sharing Support”, or at a bare minimum the “SMB 1.0/CIFS Client”. You might want to limit yourself to just the client part, because you really shouldn’t add more SMBv1 servers to your network.

Before you proceed reading – if you really need to enable this protocol – please make sure your systems are all patched! Especially your target servers should be patched as well – assuming they are Windows XP / 2003 / Vista / 2008 / 7 / 2008 R2 / 8 / 8.1 / 2012 / 2012 R2 / 2016 or 10. I highly recommend looking at this Microsoft link: https://docs.microsoft.com/en-us/security-updates/securitybulletins/2017/ms17-010. Additionally, I want to mention that Windows XP and Windows 2003 can be patched as well, though they are not on the list in the previous link. Look at Microsoft KB4012598 for more information or use this download link: https://www.microsoft.com/en-us/download/details.aspx?id=55245. I cannot warn enough about SMBv1 – you open the doors for malware that can bring down your network in minutes and cause huge damage!

Please note – I did not research in detail whether other, previous Windows versions already disable SMBv1 by default; this article might apply to Windows 7, 8 and 8.1 as well, and be applicable to Windows 2008, 2008 R2, 2012, 2012 R2 and 2016 as well as newer Windows versions to come.

Now, the issue with Windows 10 and SMBv1 disabled is that often old legacy Windows 2003 servers are around that can’t just be upgraded or replaced. In order to access their file shares, you need to enable SMBv1 on the client workstations. This could of course be done by preparing your installation image etc. – but if you did not plan for this or want more granular control, you might consider using Group Policies / GPOs to enable this Windows feature.

 

It is also worth noting that the easiest way to identify the issue is not to access the UNC share via the server name but to type in the IP address directly. This way you get a much clearer error message from Windows. I mention this to show that there actually is a difference between accessing a UNC path via a server name and via an IP address – especially on Windows 10 and with the error messages you might see.

Officially, enabling a Windows feature is not supported via GPO, nor is there much information out there on how to enable SMBv1 per GPO. Having faced this challenge recently, I found a working approach that is pretty easy to implement.

  1. enable the feature on one Windows 10 client (see the PowerShell sketch after this list)
    1. export / document the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\mrxsmb10
    2. copy the file %windir%\system32\drivers\mrxsmb10.sys
  2. create a GPO
    1. put the mrxsmb10.sys in the GPO or on a centrally accessible share (the target computer account must be able to read the file! – I often put it either in NETLOGON or directly in the GPO / scripts folder)
    2. Computer Configuration \ Preferences \ Windows Settings \ Files
      1. create a new entry to copy the file to the target system
      2. Source file: where you centrally placed the mrxsmb10.sys
      3. Destination file: %windir%\system32\drivers\mrxsmb10.sys
    3. Computer Configuration \ Preferences \ Windows Settings \ Registry
      1. Create or import all the registry keys from HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\mrxsmb10
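For step 1, the reference client can also be prepared with PowerShell instead of the Control Panel – a sketch; the optional feature name can differ between Windows 10 builds (newer builds split it into client and server sub-features), and the temp path is just an example:

```powershell
# Run on the one reference Windows 10 client (step 1) - feature name may vary by build.
Enable-WindowsOptionalFeature -Online -FeatureName 'SMB1Protocol' -NoRestart

# export the service key and grab the driver file for the GPO (paths from the steps above)
reg export "HKLM\SYSTEM\CurrentControlSet\Services\mrxsmb10" 'C:\Temp\mrxsmb10.reg' /y
Copy-Item "$env:windir\System32\drivers\mrxsmb10.sys" -Destination 'C:\Temp\'
```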

A registry hive export would look like this:

Apply the GPO to your target systems / workstations and reboot them – after that you will be able to access the necessary shares. The downside: you don’t see the feature as enabled in Windows Features. It will work nevertheless.

 

Build your own lab environment with VMware

If you have a virtual environment, like VMware, and a storage device behind it, you might be able to mirror your whole real / live environment and create a complete playground or shadow network that simulates your guest VMs like the real system, letting you change, update and adjust configurations within this lab environment. This is actually rather easy to accomplish and can save you a lot of headaches.

This is just an example – you might be able to accomplish something similar by just cloning single VMs, and in theory it would not even matter whether you use VMware vSphere or something like Microsoft Hyper-V. Still, doing it this way has advantages, so let me explain.

Assumed scenario:

You work with a VMware environment with one or more host systems and several guest VMs. You need to update software and configurations on those guest VMs but need to test this beforehand to ensure everything runs smoothly. The VMs are stored on a central storage device that can do volume-level snapshots.

Prepare the environment

  1. you need one NIC per host that you will connect to a switch
  2. ideally use an independent switch that has no other network connections than this one NIC per host
  3. create a shadow-LAN virtual switch in your VMware cluster and assign the one NIC per host to it so VMs across the cluster can communicate properly

What you do:

  1. create a storage snapshot (clone) and mount it to the VMware host systems as an additional volume
  2. browse the cloned volume’s file system and add the VMs you need to the inventory (you might want to rename them to shadow-<servername> so you can easily identify them)
  3. re-configure their network connection from your regular LAN virtual switch to the shadow-LAN virtual switch you prepared (see the PowerCLI sketch after this list)
  4. start the added guest VM
    1. if you are asked whether you moved or copied the guest, just say you moved it – to avoid hardware / MAC and other changes possibly causing Windows to want to re-activate
    2. VMware might complain about a duplicate MAC address – you can ignore this, because you are on two different / independent networks
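Registering the guests and re-wiring them to the shadow switch can also be done with PowerCLI – a rough sketch; vCenter, host, datastore path and switch names are placeholders:

```powershell
# Sketch only - names and paths are examples, not taken from a real environment.
Connect-VIServer -Server 'vcenter.domain.local'

# register a guest from the mounted snapshot clone and rename it so it is easy to identify
$vm = New-VM -VMFilePath '[clone-datastore] SERVER01/SERVER01.vmx' -VMHost 'esxi01.domain.local'
$vm = Set-VM -VM $vm -Name 'shadow-SERVER01' -Confirm:$false

# re-wire it to the shadow switch before the first power-on
Get-NetworkAdapter -VM $vm | Set-NetworkAdapter -NetworkName 'SHADOW-LAN' -Confirm:$false

Start-VM -VM $vm   # then answer the “moved or copied” question with “I moved it” (step 4.1)
```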

Real usage example:

Let me give you a more detailed example of what I personally used and did with this. It should help you understand the whole principle better.

  • VMware cluster with e.g. 10x host systems – we had enough RAM and CPU power that we could have 3x hosts go down – you won’t need that much, but of course you would have buffer for RAM and CPU usage
  • Nimble All-Flash storage arrays in the background connected to all the VMware hosts and using the Nimble VMware plugins (Note: Nimble was bought by HP / HPE as of today)
    • the Nimbles are configured to do volume level snapshots multiple times per day
  • all physical host systems had a dedicated network card (NIC) connected to an independent physical switch that was NOT connected to any other network switch
  • a virtual switch SHADOW-LAN was created and those physical NICs of the host systems were assigned to it
    • this allowed any VM connected to this virtual switch to communicate with other VMs connected to the same virtual switch on other hosts

Due to migrations, software updates and quality-controlled systems, we constantly had the challenge of testing changes and adjustments thoroughly. So I came up with the solution of simply cloning a snapshot on the Nimble storage array the VM resided on and mounting it to the VMware cluster – taking only minutes – and then adding e.g. domain controllers, DHCP servers, necessary file servers and the target guest system to the inventory in VMware, adjusting their names so we could quickly identify them (even adding them to resource pools if necessary) and, most importantly, changing their virtual switch configuration to the shadow switch.

Advantages and possibilities:

This allowed us to simulate the whole real-world system (VMs) and every change that we wanted. To get software in there, we attached VHDs holding what we needed, or even used a secondary internet connection to briefly connect to licensing or update services that vendors only provide online. The advantages of the solution go even further:

  • simulate everything you have in your VMware environment available and have it working like the real / live system
  • if necessary, provide internet access while connecting a SECONDARY internet connection (router / firewall) to physical shadow network switch
  • adding real printers to the shadow switch to be able to test print-outs (we had those cases)
  • add physical workstations to simulate whole production environments
  • update / refresh the whole system in only a few minutes by using a fresh snapshot clone
  • only minimal to almost no impact on the storage / free space of your storage device
    • this is because the cloned Nimble snapshot creates a new branch, and only the deltas (changes) have an impact on the storage – even for “bigger” simulations we are talking about only a few gigabytes of changed data, if that much at all, depending of course on your storage and what you do
  • we installed VMware console connections on quality testing workstations so they could access the system directly on the console
    • of course only granting them minimal rights to this specific pool of VMs
    • avoiding that their access to the VMware environment had any impact to the real system
  • documenting any changes, challenges faced and solutions found
  • thanks to the physical switch – best practice is a layer-3 switch – we were able to simulate whole VLANs, routing etc. within the environment and even connect various physical systems like printers, workstations and temporarily an internet connection to this environment

It takes only a small amount of effort to prepare for those simulations initially, because the virtual shadow switch, the physical shadow switch and the hosts’ network card connections to this physical switch are a one-time effort. After this, you just clone and mount snapshots and add the VMs you need while adjusting their network connection to the virtual shadow switch.

Once set up, preparing a simulation usually takes less than 30 minutes until everything is cloned, mounted, added to the inventory (incl. the NIC adjusted to the shadow switch) and booted up.

Why not just clone all VMs via VMware?

Good question – the answer is simple: it would have an impact on your storage capacity, because it would create an actual clone. It also takes longer to clone individual VMs than to just grab a storage-level snapshot, where you can adjust what you want down to the volume level on the storage. Even the clean-up might be more work or leave unwanted data behind – while a clone of the volume only requires you to remove the VM guest systems from the inventory and then unmount and delete the whole shadow volume.

I wrote this all up because I wanted to share it – the whole idea is not that special in theory, but I think it is a good example of how you can get a large, decent lab environment with only minimal effort. In any case, I hope the idea behind it helps some of you out there 🙂

 

 

Windows 2016 DHCP load balancer and its quirks

Windows 2016, and even 2012, allows you to create a real load-balanced / full-failover DHCP server configuration, unlike Windows 2008, which only allowed you to split scopes.

Now, this works pretty great for the most part – but it actually has two major flaws you need to be aware of and take action on.

Neither reservations nor server / scope options are replicated.

This actually is a big deal. If you change pool settings on server A, you end up with clients that, depending on which DHCP server answered them, either apply the new settings from server A or pull the old ones from server B.

Furthermore, a reservation might work when you put it in place, and then, all of a sudden, a few days later a ticket comes in telling you there is an issue, and you find out that the reservation no longer applies. What happened? Well – you might have set the reservation on server B but not on server A – depending on which server answered the client, you again run into an issue.

Microsoft seems to have put a quick-and-dirty synchronization in place, and the only true way around it is to force the two DHCP servers to synchronize with the following PowerShell command:
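The command the post refers to is most likely the DHCP failover replication cmdlet – a sketch with a placeholder server name:

```powershell
# pushes the scope configuration of the failover scopes from the given server to its partner
Invoke-DhcpServerv4FailoverReplication -ComputerName 'dhcp-a.domain.local' -Force
```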

This could be automated via the Task Scheduler by invoking the command from a command prompt via:
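A sketch of such a scheduled task action (program plus arguments), assuming the cmdlet above:

```powershell
# as a single command line, e.g. for a scheduled task or a DOS prompt
powershell.exe -NoProfile -Command "Invoke-DhcpServerv4FailoverReplication -ComputerName 'dhcp-a.domain.local' -Force"
```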

But even then, you had better check all your DHCP servers and always make sure any changes are made on all of them, or at least correctly replicated. Otherwise you might encounter the weirdest issues.

Monitor user accounts in Active Directory with PRTG

The following script will read through your current Active Directory and filter for user accounts with the following specific conditions:

  • Locked-out users – please read below for further information about this
    • all users that are locked out
    • must be an enabled user
    • that is not expired
  • disabled users
    • all users that have been disabled
  • expired users
    • must be an enabled user
    • the expiration date is set and past the current date
  • users with password never expires set
    • must be an enabled user

This will give you a pure counter output per channel as an XML result for the PRTG EXE/Script Advanced sensor.
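The full sensor script is linked below; as a minimal sketch (not the author's original code), the four counters and the PRTG XML output boil down to something like this:

```powershell
# Minimal sketch of the counters - the published sensor script has more channels and filtering.
Import-Module ActiveDirectory
$now = Get-Date

# locked out: enabled accounts that are locked and not account-expired (see the caveat below)
$locked = (Search-ADAccount -LockedOut -UsersOnly |
    Where-Object { $_.Enabled -and (-not $_.AccountExpirationDate -or $_.AccountExpirationDate -gt $now) }).Count

$disabled = (Search-ADAccount -AccountDisabled -UsersOnly).Count
$expired  = (Search-ADAccount -AccountExpired -UsersOnly | Where-Object { $_.Enabled }).Count
$pwdNever = (Get-ADUser -Filter 'Enabled -eq $true -and PasswordNeverExpires -eq $true').Count

# PRTG EXE/Script Advanced sensors expect an XML result like this
@"
<prtg>
  <result><channel>Locked out users</channel><value>$locked</value></result>
  <result><channel>Disabled users</channel><value>$disabled</value></result>
  <result><channel>Expired users</channel><value>$expired</value></result>
  <result><channel>Password never expires</channel><value>$pwdNever</value></result>
</prtg>
"@
```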

But there is a theoretical flaw in one of the methods – the locked-out users. User accounts get locked out in Active Directory due to too many logon attempts with an invalid password. This causes Active Directory to set the locked-out bit in the object properties. The issue is that this bit is not set back to 0 once the defined lockout duration (GPO) has passed; the property is only reset once the lockout duration has passed and the user has successfully logged on again.

This means the counter might give you more results than are currently true: it might count users that were locked out where the lockout duration has already passed, but who have not yet logged on successfully. This is somewhat of a false positive, while not totally false. In any case, you need to be aware of this.

The script could also be more efficient in the way it filters a few things; so far I have optimized it as far as I could. The LockedOut value cannot be used in a -Filter; in theory it might be possible to speed it up with a -Filter on UserAccountControl (if that is even possible – not tested), but I am not certain this would work. If you really want to speed it up, you would need to work with -LDAPFilter – but this has to completely replace the internal filter capabilities of Get-ADUser – you cannot use both, it is one or the other.

This script was updated with a corrected version as of February 2019 and was also posted in the PRTG knowledge base here.

Monitor DFS replication backlog between servers in PRTG

One of the challenges with DFS is monitoring the DFS replication backlog. There are various scripts out there to accomplish this; unfortunately, I found nothing I really liked that gave me the simple insight I wanted.

The goal was simple – a script that monitors the backlog between two systems in both directions, meaning Server-A to Server-B and Server-B to Server-A. For both directions I wanted to see the number of files as well as their total size. I did not care about which DFS groups and/or DFS folders are affected in detail – the number of groups might change and the number of folders will likely change rather frequently, so it would be a challenge to monitor this efficiently per group or even per folder. Monitoring the number of groups and folders alone has no real advantage, because such changes would have been made by an administrator anyway.

Below you will find my script, which expects three parameters – the two server names and a limit integer value. The limit does not influence the XML response of the script; you could add the <text>$Response</text> and <text>$Response2</text> tags in lines 77 and 79, after the </unit> and before the </result> tag, if you want – I have currently removed them.

See the picture below for an example of how the result looks in PRTG.

Create the following script in C:\Program Files (x86)\PRTG Network Monitor\Custom Sensors\EXEXML and make sure you have the C:\Windows\system32\WindowsPowerShell\v1.0\Modules\Dfsr\ and the C:\Windows\SysWow64\WindowsPowerShell\v1.0\Modules\Dfsr\ folders – you might need to copy them over. If they are missing entirely, you might need to add the needed Windows roles / features or install the RSAT (Remote Server Administration Tools) on your system.
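As a rough idea of the approach (not the full sensor script, which additionally sums up file sizes, applies the limit and builds the PRTG XML), the backlog count per direction looks roughly like this:

```powershell
# Sketch only - counts backlogged files per direction across all groups and folders.
param([string]$ServerA, [string]$ServerB)
Import-Module Dfsr

function Get-BacklogCount([string]$From, [string]$To) {
    $records = Get-DfsReplicationGroup | Get-DfsReplicatedFolder | ForEach-Object {
        Get-DfsrBacklog -GroupName $_.GroupName -FolderName $_.FolderName `
                        -SourceComputerName $From -DestinationComputerName $To
    }
    ($records | Measure-Object).Count
}

$aToB = Get-BacklogCount -From $ServerA -To $ServerB   # Server-A -> Server-B
$bToA = Get-BacklogCount -From $ServerB -To $ServerA   # Server-B -> Server-A
```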

Print Server backup script

Print servers need to be backed up, for two main reasons. One is that users heavily depend on printers, and a print server that is not working properly will cause immediate helpdesk tickets and unhappy users. The other is that installing a new driver – be it a new version, a new model or even an additional manufacturer – can cause other print drivers to act up or even stop working; many administrators know and fear that.

Windows Server actually allows you to back up the current print drivers, installed printers and their configuration. You can use this to migrate your printers or to back them up. Of course, you can simply depend on e.g. VMware snapshots, storage-level snapshots or other backups of your server. But you could also just export the whole print server configuration using the scripts below. They call the Windows print-backup facilities and store everything in a file that you can keep centrally. You don’t rely on just snapshots or a full server backup for e.g. your SQL databases either, do you?
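The heart of such an export is typically a single PrintBrm.exe call (the engine behind the Print Management backup / restore) – a sketch with example paths; the posted .CMD wraps this with parameters, mailing and cleanup:

```powershell
# Sketch - assumes the default PrintBrm.exe location and an example target path.
$printBrm = Join-Path $env:windir 'System32\spool\tools\PrintBrm.exe'
$target   = 'D:\Backups\PrintServer_{0:yyyy-MM-dd}.printerExport' -f (Get-Date)

& $printBrm -B -F $target   # -B = back up the local print server into the given file
```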

The solution uses a .CMD file that executes the actual backup and sends an email report, using the SMTPSEND program from Michael Kocum (https://www.dataenter.com/download.asp) since I already had it flying around – you could replace the mail-send option with another SMTP send client, a VBS script, or just remove it completely. Additionally, there is a .VBS script that cleans up the target backup files depending on the age of the files in the specified directory.

All the parameters are explained and set in the top part of the .CMD file – I will therefore not explain them here again. You should not need to modify the scripts beyond that, but feel free to do so. Of course, you should create a scheduled task and execute the .CMD periodically. This can save you time and headaches in case you have a malfunctioning print server. A restore can easily be done through the Print Management MMC that Windows provides, because the actual backup files are created using the same Windows APIs. Your end users will hopefully be happy that their printers are back to work in no time.

Automate your SUS clean up

Many companies rely on WSUS respective SUS services from Microsoft – aka Windows Server Update Services – as the internal source and control point of their update deployment to clients and servers within their network.

One of the big challenges for IT is to keep them clean and performant. The cleanup assistant in the WSUS management console tends to run forever and in any case means manual labor over and over again.

Below are two scripts – a CMD script that needs to be adjusted with parameters, and a PowerShell script that is called with those parameters. The scripts actually call the same API as the MMC assistant does, just that this can be performed automatically via a scheduled task in Windows.
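On Windows Server 2012 and newer there is also a built-in cmdlet that triggers the same cleanup options as the assistant – not the posted scripts, but a handy alternative for a scheduled task:

```powershell
# UpdateServices module (installed with the WSUS role) - pick the cleanup options you need
Get-WsusServer | Invoke-WsusServerCleanup `
    -DeclineSupersededUpdates -DeclineExpiredUpdates `
    -CleanupObsoleteUpdates -CleanupUnneededContentFiles `
    -CleanupObsoleteComputers -CompressUpdates
```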

It helps you to keep your SUS slim and more performant.

In any case – I highly recommend not blindly enabling all categories, but rather limiting them to the ones you actually have in place, and, once you have reached a certain patch level, even actively declining updates you will never need again (keep in mind that newly rolled-out systems might still need older updates – but you could refresh your base images or rely on Microsoft online updates for those cases).

The combination of making updates obsolete and actually running a cleanup periodically will improve your SUS server performance.

As for the parameters, those are explained in the CMD script header – therefore I will not explain them here again.

Solution for VSS exceptions with VMware guests / VM tools

Today’s blog post is about “Solution for VSS exceptions with VMware guests / VM tools” and was initially posted by myself here: https://vox.veritas.com/t5/Backup-Exec/Solution-for-VSS-exceptions-with-VMware-guests-VM-tools/td-p/829072 – it actually became a KB entry for a (by now older) version of Veritas Backup Exec, but I did not want to leave it out of my blog. Here is the link to the KB article: https://www.veritas.com/docs/000009506

Here is the article I wrote – starting with a description of the actual issue:

We have been experiencing VSS issues with VMware guests in regards to the installed Backup Exec Agent and a previously installed VMware Tools VSS option.

Uninstalling the VMware Tools VSS option in various ways including restarts did not fix our issues. If you search the internet for solutions, you find many attempts but no real solution or explanation.

One of our admins spent several hours with Veritas support without a solution; he was about to escalate the issue with them when we found the root cause and could actually fix our issues.

First the steps to solve this:

  1. Uninstall the VMware Tools VSS option (no restart will be required)
  2. Make sure the VMware VSS service was deleted
    1. If this is not the case, you might need to do so manually and remove additional DLLs etc. as well as restart the system, but this is independent from this solution
  3. You might have already done steps 1 and 2, but you still get VSS exceptions from the backup saying you have more than one VSS provider installed:
    1. V-79-8192-38331 – The backup has detected that both the VMware VSS Provider and the Symantec VSS Provider have been installed on the virtual machine ‘hostname’. However, only one VSS Provider can be used on a virutal machine. You must uninstall the VMware VSS Provider.
    2. Now you wonder what causes this and you get stuck
    3. You could uninstall the Veritas/Symantec Backup Exec Agent and only back the system up per VMDK
    4. You would lose the GRT / granular backup / restore capabilities
  4. Check your registry for the following reg key (a quick check sketch follows this list)
    1. HKLM\System\CurrentControlSet\Services\BeVssProviderConflict
    2. If this key exists, but your VMware VSS provider is uninstalled, you need to follow up with step 5
  5. Open Notepad as administrator
  6. Open this file in Notepad
    1. C:\ProgramData\VMware\VMware Tools\manifest.txt
  7. Search for the following two entries:
    1. Vcbprovider_2003.installed
    2. vcbprovider.installed
  8. Make sure both of them are set to FALSE, most likely one of them is TRUE
  9. Run a test backup
    1. This test backup now should not show the exception anymore
    2. The registry key should vanish (refresh/press F5) without you taking action
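A quick way to check the two indicators from steps 4 and 7 on an affected guest – just a small sketch using the default paths mentioned above:

```powershell
# returns True while the conflict marker from the pre-freeze script exists
Test-Path 'HKLM:\SYSTEM\CurrentControlSet\Services\BeVssProviderConflict'

# shows the two vcbprovider entries and their TRUE/FALSE values
Select-String -Path 'C:\ProgramData\VMware\VMware Tools\manifest.txt' `
              -Pattern 'vcbprovider_2003\.installed|vcbprovider\.installed'
```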

So what happened?

You uninstalled the VMware Tools VSS provider, but this manifest file did not get updated. We could actually see that it sometimes gets updated and sometimes does not. This seems to be some kind of issue with the VMware Tools uninstaller / installer.

But why this manifest.txt file?

As we found out, there are scripts that get executed by Symantec/Veritas Backup Exec before the backup. You might find them in two locations, and which script is executed (from which location) seems to depend a bit on the Windows version. You could edit them both and just remove the checks in the scripts, but this wouldn’t be correct. It is more correct to update the manifest.txt file. If you want, you can check the date/time of the manifest.txt file before you change it – you might see it was not updated when you uninstalled the VMware Tools VSS provider (assuming you only did this and no additional installs/uninstalls within the VMware Tools, and that you still experienced those issues).

Now, back to those scripts, you find them here:

  • C:\Windows
  • C:\Program Files\Symantec\Backup Exec\RAWS\VSS Provider

The name of the script that matters:

  • pre-freeze-script.bat

This script checks several DLLs, registry entries and paths, and – on Windows 2008 and newer – the ProgramData path for this specific manifest.txt file and the two entries mentioned.

Once you uninstall the VMware VSS provider and the file does not get updated, you might see this issue and wonder how to solve it. The solution is to simply update the file to mirror the uninstallation of the VMware VSS provider (vcbprovider). We double-checked this with several installations: if the file actually gets updated, those two values are set to FALSE; if it does not, at least one of the values remains TRUE, which causes the pre-freeze-script.bat to write the registry key mentioned earlier and therefore causes the exceptions in the backup.

If you still have the same issues after updating the manifest.txt, simply check all the DLLs that are mentioned in the script and make sure they don’t exist. You might also consider manually deleting the registry key (it seems to be just a dummy key) to make sure there is no issue that prevents the script from deleting it. Make sure it does not reappear after a backup! Otherwise you might still have some DLLs left on your system that cause the script to re-create the registry key.

Hope this helps a few of you out there. This was an ongoing issue for a while, and I came across those issues many times ever since Windows 2008. This applies to Windows 2008 R2, Windows 2012, Windows 2012 R2 and pretty surely to Windows 2016 as well.

It helped us get rid of those issues completely, without even needing a single restart of the guest VM (removing the VMware VSS provider did not require a restart).

Password expiration notifications for end users

Today I wanted to share a script with you that lets you inform your users per email that their password will expire, or has already expired, and reminds them about your password policies, such as complex passwords and how to choose a password. This is a simple VBScript and can easily be adjusted. The email is generated from a file, in this case an HTML file that you provide; you can adjust the content of this HTML file as you need. There are surely many commercial solutions out there that can do more than this script, but if you want to save the money and are satisfied with the provided options, this can be a good alternative.

Let me mention one thing about passwords first – most of us live with the usual policy that passwords should be changed periodically and need to be of a certain length and complexity. We all live with the daily help-desk calls about forgotten passwords, not-changed passwords (pretty much why I wrote the script) and so on. What changed is a newer recommendation from late 2017 / early 2018 that is based on statistics and data and now says: yes, complex passwords of a certain length – but do not enforce periodic changes, because that actually results in less secure passwords, as users tend to only change a number or are more likely to write passwords down, compromising the whole attempt to secure the system.

Anyways – the next few lines will explain the parameters you can adjust in the top section of the VBS script – further below I will post the script and an example HTML file so you can start right away.

The options are between lines 7 and 61 – don’t be scared – most of them are pretty simple to understand and are actually explained in the script itself. You should not need to modify anything outside those lines.

About the parameter naming convention and possible values:

  • starting with str – strings; these expect quoted alphanumeric values, e.g. “text”
  • starting with int as integer values – those are direct numeric values – e.g. 123
  • starting with bol are boolean values – those can be either TRUE or FALSE – meaning on or off

Here are the options you can set:

  • strSMTPServer: SMTP mail server DNS name or IP address
  • intSMTPServerPort: SMTP mail server port – normally 25
  • strFrom: SMTP mail from address
  • strToAdmin: SMTP mail to address for administrator emails
  • strAdminMailSubject: subject for mail to administrators
  • strUserMailSubjectExpired: subject for mails to user when password is expired
  • strUserMailSubjectWillExpire: subject for mail to user when password will expire – the exact word REPLACEWITHDAYS will be replaced by the days left value so mention it in the subject line if you want to see the value there
  • strBodyURL: URL or full file-path (HTML file path e.g. file://) to import for body, the entire content of this URL/FILE will be imported to the body of the email and should explain ways how to change the password
  • strAttachment: full file-path to an attachment for the email to the users / leave empty if no attachment
  • strLDAPSortColumn: per default: pwdLastSet / sort column for LDAP query
  • intStartWithPWexpiresInDays: if the password expires in N days or less, the script will inform the user – keep in mind, if you run the script daily, those users will get an email every day once their password expires in less than the indicated number of days. 5 is surely a good start.
  • bolIgnoreDisabledAccounts: Disabled accounts should always be ignored
  • bolInformAdminAboutPWexpires: this will inform the admin about expiring passwords
  • bolInformAdminAboutPWisExpired: this will inform the admin about accounts with expired passwords
  • bolInformAdminAboutPWneverExpires: this will inform the admin about accounts with password set to never expire
  • bolInformAdminAboutUserCantChangePW: this will inform the admin about users who are not allowed to change their password
  • bolInformAdminAboutAccountDisabled: this will inform the admin about disabled accounts found – this would have been done in ADS by an administrator
  • bolInformAdminAboutExpiredUserAccount: this will inform the admin if the user account has an expiration date and the account is expired
  • bolInformAdminAboutAccountWithoutEMail: this will inform the admin about accounts without a set email address
  • bolInformAdminAboutStillGoodPasswords: this will inform the admin about users/passwords that are still valid
  • bolInformAdminAboutIgnoredUsersExcludedByGroup: this will inform the admin about users that have been ignored by the strGroupsExclude filter
  • Please Note: the status account locked will not be checked, this should be corrected automatically by the default security GPO instead (will be in most cases by default)
  • strSearchOUs: Filter Priority 1 – only users in those OU paths will be processed. Use an LDAP DN like: “OU=Folder,OU=Folder,DC=Domain,DC=local”; you do not need to include the DC=Domain,DC=local – the script will add this information if necessary. Use | (pipe) if you want to add more than one LDAP DN path. Leave empty (“”) to disable this filter
  • strGroupsExclude: Filter Priority 2 – if the user object is still not excluded, this group exclude filter will be applied. If the user is a member of one of those groups (if multiple groups are defined), he will be ignored. Use | (pipe) if you want to add more than one GroupName. Leave empty (“”) to disable this filter. Example: “Group Number1|GroupNumber2”
  • strGroupsInclude: Filter Priority 3 – if the user object is still not excluded, this group include filter will be applied. The user has to be a member of one of those groups (if multiple groups are defined). Use | (pipe) if you want to add more than one GroupName. Leave empty (“”) to disable this filter. Example: “Group Number1|GroupNumber2”
  • bolDebug: set TRUE for script-output, highly recommended to execute the Script in CMD with CSCRIPT <ScriptName> so you see it in a command window instead of dialog boxes.
  • bolAttachDebugToAdminMail: the debug output will be attached to the admin-mail (independent from bolDebug)
  • bolTestDebugOutputToConsoleOnly: this will disable the mail.send – only output to the CMD will be generated, please enable bolDebug
  • bolRedirectMailToAdmin: this will redirect all mails to the admin, instead of sending them to the user – the subject line will include the user-mail address in this case – this allows you to do a real test and actually see what would be send out to whom – without actually sending the emails to the end user
  • bolAdminMailOnly: this will send the admin-mail only, no user mail will be generated

As always – feel free to reach out to me if you have any questions or comments.

Script based SQL Express backups

SQL Express is widely used but has a huge downside: there is no SQL Agent available. Even Windows internal databases, especially WSUS / Windows Server Update Services, are in the end SQL Express-like databases that do not have a SQL Agent.

Now, you can have central SQL servers with agents do the backups – and I recommend doing so if possible. But for the many cases where this is not possible, you will need another way to create those nice little .BAK files, aka SQL maintenance plan backups. To work around this, I once wrote a script that automates the backup for each database found on a specific SQL server. It creates the backups via SQLCMD commands and even cleans up obsolete files (files older than x days), almost like SQL maintenance plans do.
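Reduced to a sketch, the core of that approach looks like this – instance name, backup path and the database filter are examples, and the posted scripts add configuration, mailing and cleanup on top:

```powershell
# Sketch only - back up every database except tempdb via SQLCMD with a trusted connection.
$instance  = '.\SQLEXPRESS'
$backupDir = 'D:\SQLBackup'

$databases = sqlcmd -S $instance -E -h -1 -W `
    -Q "SET NOCOUNT ON; SELECT name FROM sys.databases WHERE name <> 'tempdb'" |
    Where-Object { $_ -and $_.Trim() }

foreach ($db in $databases) {
    $file = Join-Path $backupDir ('{0}_{1:yyyyMMdd}.bak' -f $db.Trim(), (Get-Date))
    sqlcmd -S $instance -E -Q "BACKUP DATABASE [$($db.Trim())] TO DISK = N'$file' WITH INIT"
}
```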

The script is divided into a .CMD file that executes the actual backup and where you set the configuration / parameters, and a .VBS file that is controlled by the .CMD script and performs the backup cleanup. At the end, you can have the .CMD send an email report – I used the SMTPSEND program from Michael Kocum (https://www.dataenter.com/download.asp) for this since I already had it flying around – you could replace the mail-send option with another SMTP send client, a VBS script, or just remove it completely.

Adjusting the settings / parameter:

This is all done in the SQLBACKUP.CMD file – the header section pretty much explains all you need to know, from SQL server, SQL user and password to mail server and recipients.

If you want to execute the SQL backups as the Windows user that runs the script, you need to swap the REM (remark) markers on the two relevant lines further down in the script. I apologize for the inconvenience – this is an old script I never updated to have those settings in the header (more automated); I always just changed the lines.

Everything else should be rather easy. Of course, you will need sufficient access rights to the SQL databases and your destination backup folder. The Task Scheduler might work best if you execute the script with “cmd /c c:\scripts\sqlbackup.cmd” (change the path as you need it) and set the working directory / start-in folder correctly. It might help to execute the task with elevated rights etc. – all depending on your system’s configuration.

Below are the two scripts – I hope this helps some of you. The generated .BAK files can simply be restored via the SQL Server GUI because they are native SQL backup files.

Join systems to a domain and create KeePass server entries for local admins

Please note – this script was updated – you find the updated post here.

One of the challenges in most daily IT operations is onboarding of workstations and servers (respective domain join). Over the years I came across and tried many ways to accomplish this. Today I wanted to share a script and solution others might find helpful, but first let’s get down to some theory and background.

The goals and challenges:

  • simple domain join after a system was imaged
    • this is in theory possible in a fully automated process via various imaging solutions – I found that WDS (Microsoft Windows Deployment Services are in most cases the easiest way to accomplish this while having the possibility to use this in consulting for various clients, in enterprise for various departments etc. Since Windows 10 came in to the equation some of the automation with WDS became more challenging – so keeping it simple with some additional manual labor is often the easiest way to accomplish this – to simplify the process a PowerShell script became a perfect solution).
  • systems should have a local admin account (not Administrator / SID 500, which should remain disabled) with an individual password
    • typing this manually, you always risk that the password is misspelled either in your password database or on the actual operating system
    • if you think it is a good idea to have the same password on all your clients, I actually suggest you do some security-related research!

The PowerShell script below will do the following for you:

  1. Ask for the name of the system (this will change the hostname/computername)
  2. Ask for credentials for the Pleasant Password Server
  3. Ask for credentials to join the system to the domain
  4. Create a local admin user account on the system
  5. Generate a password for this account
  6. Check if there is an existing Pleasant Password Server entry for this system
  7. If not – it will proceed and create an entry with the machine name, username, password and various additional information like
    1. manufacturer
    2. model
    3. serial number / service tag
    4. UEFI BIOS Windows license key
    5. MAC addresses of all network cards Windows knows about
  8. And finally it will join the domain and put the system right away into the defined OU
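Before walking through the actual script, here is a heavily simplified sketch of the core building blocks – prompting for a name, generating a password, creating the local admin and joining the domain. The character set, domain name and OU path are placeholders, and the password server part is left out here (a sketch of that part follows after the line-by-line walkthrough):

```powershell
# Heavily simplified sketch - not the actual script. Domain, OU path and the
# character set are placeholders; the password server part is omitted here.
$ComputerName = Read-Host "New computer name"
$DomainCred   = Get-Credential -Message "Credentials for the domain join"

# Generate a password from a reduced character set (hard-to-read characters removed)
$charSet  = "abcdefghkmnpqrstuvwxyzABCDEFGHKMNPQRSTUVWXYZ23456789!#%+=?".ToCharArray()
$password = -join (1..20 | ForEach-Object { Get-Random -InputObject $charSet })
$securePw = ConvertTo-SecureString $password -AsPlainText -Force

# Create the local admin account (hostname_Admin) and put it into the local Administrators group
$localAdminUser = $("$ComputerName" + "_Admin")
New-LocalUser -Name $localAdminUser -Password $securePw -PasswordNeverExpires
Add-LocalGroupMember -Group "Administrators" -Member $localAdminUser

# Rename the machine, join the domain, drop it into the target OU and reboot
Add-Computer -NewName $ComputerName -DomainName "corp.example.com" `
    -OUPath "OU=Workstations,DC=corp,DC=example,DC=com" `
    -Credential $DomainCred -Restart
```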

The whole script is only an example – you don’t have to use Pleasant Password Server, nor is the script perfect for every situation – you can take it and modify it as you need it: point it to a different IT asset database, let it choose from predefined OUs, etc. – adjust it as needed. In general it is a very useful baseline and I wanted to share it.

One of the challenges is to execute the script as administrator (elevated rights) while also bypassing the script execution restrictions without compromising them in the default image, i.e. without disabling this important security feature in the image itself. To accomplish this, a simple CMD script executes the actual PowerShell script: a CMD script can be right-clicked and run as administrator to gain elevated rights, which is as of today not possible by default with PowerShell scripts (.ps1).

Create the following two files “Execute-DomainJoin.cmd” and “Execute-DomainJoin.ps1” and save them in the same directory or e.g. on a portable flash drive. Adjust the PowerShell script so it connects to your domain and local systems.

Please note – this script was updated – you can find the updated post here.

Explaining, adjusting and guiding you through the PowerShell script

It is important that you understand the script so you can make adjustments to it. I will try to explain everything that is important and reference some line-numbers while doing so.

  1. Lines 1-30 are just a general introduction and show some generic information
  2. Lines 31-76 hold some functions to generate a password, to bypass some certificate issues etc
    1. Lines 35-38 are worth taking a look at: they hold all the characters of the four categories that will be used to generate a password. Characters that are hard to read in some fonts and other characters that might cause issues are already excluded – of course, adjust especially line 38 to your preferences and add more symbols or remove what you don’t want to use
  3. Lines 77-89 are just informational
  4. Lines 90-96 expect some user-input
    1. new computername
    2. get credentials for the domain join (admin)
      1. the script will not validate the credentials, in theory this could be done but I never found it that important
    3. get credentials to read/write on the password database server (often not the actual admin credentials, therefore I separated those two)
      1. the script will not validate the credentials, in theory this could be done but I never found it that important
    4. the local admin username that will be created
      1. $localAdminUser = $(“$ComputerName” + “_Admin”)
      2. the above line will create a hostname_admin account – you can adjust this to your preferences
    5. 94-95 will generate a password and encrypt it so it can be used to create the local account
  5. Lines 97-103 are just informational
  6. Lines 104-216 – this is the whole password server communication, entry check and entry creation (a rough sketch of what such REST calls can look like follows after this list)
    1. 104-115 – those lines gather various information from your current system like serial number, UEFI Windows keys, etc. – you can keep them as is
    2. 116 – please enter the URL to your password server here 
    3. 117 – here you need to enter the folder on your password server where the generated credentials will be stored
    4. 118 – this is the subject of the entry that will be generated – adjust this to your preferences
    5. 119-120 – those are username/password for the entry – you should leave this as is
    6. 121-134 – those lines are the details of your password server entry – adjust them to your liking
    7. 135-165 – this actually will execute the following on the REST API on your password server
      1. connect to it
      2. check if an entry with the same username already exists
    8. 166-189 – this will raise an alert that this user already exists on your password server – 189 will actually exit the whole script
    9. 190-216 – this block will write to the password server – because it did not find an entry with the new username
  7. Lines 217-241 show the newly created username and password – and suggest that you compare the entry on your password server with the information shown to make sure everything is correct
  8. Lines 242-251 will create the new local admin account on the system and set the password
  9. Lines 252-267 are informational
  10. Line 268 will execute the actual domain join
    1. please adjust the -Domain and the -OUPath parameter to your specific needs
    2. note that the command will automatically restart the system
  11. Lines 269-282 are informational – if anything goes wrong, those lines are shown and help you take further steps after the failed domain join. In most cases those suggestions will help; in the end, the error output of the domain join command (line 268) indicates what went wrong. Keep in mind that the automatic restart of the system would more or less skip past this message anyway
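As referenced above, here is a rough sketch of what the password server part can look like, continuing with the variables from the earlier sketch ($ComputerName, $localAdminUser, $password). I am using the Pleasant Password Server REST API here, but the token endpoint, API version/paths and field names are assumptions based on their public documentation and may differ on your server version – treat this purely as an illustration:

```powershell
# Rough illustration only - the endpoint paths, API version and field names are assumptions
# and may differ depending on your Pleasant Password Server version.
$pwServerUrl = "https://passwords.example.com:10001"
$pwCred      = Get-Credential -Message "Credentials for the password server"

# 1) Authenticate and get an OAuth2 bearer token
$tokenBody = @{
    grant_type = "password"
    username   = $pwCred.UserName
    password   = $pwCred.GetNetworkCredential().Password
}
$token   = (Invoke-RestMethod -Uri "$pwServerUrl/OAuth2/Token" -Method Post -Body $tokenBody).access_token
$headers = @{ Authorization = "Bearer $token" }

# 2) Check whether an entry with this username already exists
$search = Invoke-RestMethod -Uri "$pwServerUrl/api/v5/rest/search" -Method Post -Headers $headers `
          -ContentType "application/json" -Body (@{ search = $localAdminUser } | ConvertTo-Json)
if ($search.Credentials) {
    Write-Warning "An entry for $localAdminUser already exists on the password server - aborting."
    exit 1
}

# 3) Gather some hardware details and create the new entry
$bios   = Get-WmiObject Win32_BIOS
$system = Get-WmiObject Win32_ComputerSystem
$entry  = @{
    GroupId  = "00000000-0000-0000-0000-000000000000"   # GUID of the target folder (placeholder)
    Name     = "$ComputerName (local admin)"
    Username = $localAdminUser
    Password = $password
    Notes    = "Manufacturer: $($system.Manufacturer); Model: $($system.Model); Serial: $($bios.SerialNumber)"
}
Invoke-RestMethod -Uri "$pwServerUrl/api/v5/rest/entries" -Method Post -Headers $headers `
    -ContentType "application/json" -Body ($entry | ConvertTo-Json)
```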

If you have any questions, feel free to reach out to me. The script could be cleaned up further – but I wanted to provide a working version of it, so I just did a quick clean-up of some special stuff and posted it here. Personally I like things a bit more structured, but as said – this is just a general example.

Please note – this script was updated – you can find the updated post here.

This script is also mentioned on the API Examples page on the Pleasant Solutions web site here.

Monitoring Shadow-Copies with PRTG

Monitoring Shadow-Copies with PRTG

This was originally posted here by myself: https://kb.paessler.com/en/topic/65026-monitor-shadow-copies-age#reply-247626

This is my solution for it – we monitor specific drives we enabled shadow copies on and wanted to see the number of shadow copies, where the newest should be within x hours and the oldest should be at least n hours old – those limits can be configured rather easily via the channel limits.

The main issue is that we are talking about WMI classes that are only available in x64 on an x64 system, while PRTG executes sensors in x86 even though it is installed on x64. I played around for a while and came up with this simple solution.

Parameters for the parser-script (the one you need to execute) are: %host C: %host D: etc.

Parser script, needs to be in EXEXML directory: Name: Get-ShadowCopyStatsXMLx64parser.cmd

PS1 script, should be in the EXEXML directory – if not, adjust the path in the parser script: Name: Get-ShadowCopyStatsXML.ps1

PS: yes, the PS1 could be optimized further, but it already took me a while to figure out that the main issue was the x86/x64 combination, which simply does not work – see the Microsoft articles in MSDN/KB for more information on why the shadow copy WMI classes are only available in x64 on an x64 system.
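To give you an idea of what the PS1 side can look like, here is a minimal sketch: it reads the shadow copies for one drive via WMI and prints PRTG-style XML channels for the count and the age of the newest and oldest copy. The parameter and channel names are assumptions, not the actual script – and the parser .cmd would typically launch the 64-bit PowerShell (e.g. via %windir%\sysnative\…) to get around the x86/x64 issue:

```powershell
# Minimal sketch - not the actual Get-ShadowCopyStatsXML.ps1.
# Expects e.g.:  .\Get-ShadowCopyStatsXML.ps1 -ComputerName SERVER01 -Drive C:
param(
    [string]$ComputerName = "localhost",
    [string]$Drive        = "C:"
)

# Win32_ShadowCopy is only visible to 64-bit processes on x64 systems,
# which is why the wrapper has to launch the 64-bit PowerShell.
$volume  = Get-WmiObject Win32_Volume -ComputerName $ComputerName |
           Where-Object { $_.DriveLetter -eq $Drive }
$shadows = Get-WmiObject Win32_ShadowCopy -ComputerName $ComputerName |
           Where-Object { $_.VolumeName -eq $volume.DeviceID }

# Age of each shadow copy in hours
$ages = $shadows | ForEach-Object {
    (New-TimeSpan -Start ([Management.ManagementDateTimeConverter]::ToDateTime($_.InstallDate)) `
                  -End (Get-Date)).TotalHours
}

$count  = ($shadows | Measure-Object).Count
$newest = [math]::Round(($ages | Measure-Object -Minimum).Minimum, 1)
$oldest = [math]::Round(($ages | Measure-Object -Maximum).Maximum, 1)

# PRTG EXE/XML Advanced output
"<prtg>"
"  <result><channel>Shadow copies $Drive</channel><value>$count</value></result>"
"  <result><channel>Newest copy age (h)</channel><value>$newest</value><float>1</float></result>"
"  <result><channel>Oldest copy age (h)</channel><value>$oldest</value><float>1</float></result>"
"</prtg>"
```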

Hope this helps others with the same challenge 🙂

PRTG – Veritas Backup Exec monitoring

This was originally posted by myself here: https://kb.paessler.com/en/topic/58233-symantec-backupexec-monitoring#reply-262024

We monitor backups a little bit more advanced – I thought I should share this knowledge as well…

Single Job Monitoring: Monitor a single job and its results – this allows you to configure the job name with a PRTG filter value in the SQL sensor. The results will include various values – most notable are:

  • FinalJobStatus (as text)
  • TotalDataSizeBytes
  • TotalNumberOfDirectories
  • TotalNumberOfFiles
  • TotalRateMBMin

In order to implement this sensor, add a SQLv2 sensor and configure it like this:

  • Database: BEDB
  • Instance Name: BKUPEXEC (in most cases)
  • Use input parameter: specify the exact job name
  • DBNull = Error

Channels: Most channels are rather simple to configure: they are counters, SpeedDisk or BytesDisk, channel types that PRTG has integrated already. The special channel is FINALJOBSTATUS – in order to get this working you will need the “backupexec.jobstatus.ovl” file in your %programfilesx86%\PRTG…\Lookups directory – see below for the file.

SQL Script for single job monitoring:

The backupexec.jobstatus.ovl file:

Another SQL script we use is the one below – this approaches the whole monitoring more from an overview perspective. It still depends on the JobHistory table, meaning the job must have run at least once; in theory you could work around this and pull information from the scheduler etc. – the script below is purely an example.

Finally I wanted to mention what our real challenges are, for which we don’t yet have a really good solution: our FULL backup runs starting Friday evening… during the week we run incremental backups. The incremental backups are not as critical… so let’s focus on the weekends.

What happened every now and then was that e.g. only a few tapes were writable while others might still have been locked, or one of our libraries jammed, etc.

In the end it meant that we came in Monday morning and discovered that 50+ % of the backups did not run.

Now, the question is: how do you monitor this? There are about 150 jobs – they are stacked on top of each other. In theory I expect, let’s say, 5 running jobs, 0 completed and 145 pending starting Friday night – and over the weekend these numbers change constantly.

What I have not yet found is a good way to detect when Backup Exec is waiting for user interaction, e.g. insert tapes, library offline, etc.

Nor can I tell PRTG that on Friday I expect e.g. 150 pending jobs, that by Saturday 1 PM the number should be more like 75 pending, by Sunday 6 AM it should be down to 50 pending, and by Sunday 8 PM it should be 0 pending and 150 successful.

This is very granular, which makes it hard to find a solution. The jobs in our case will not finish – they are within their weekend time window and will not be auto-cancelled, and therefore only manually looking into Backup Exec will tell you whether we are making progress or not.
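One way to approximate the “expected pending jobs over the weekend” idea would be to interpolate the expected number of pending jobs between the start and the planned end of the backup window and compare that against the actual count (which could come from the overview SQL query above). This is just a thought sketch with made-up numbers:

```powershell
# Thought sketch with made-up numbers - the backup window and job counts are examples.
$now             = Get-Date
$daysSinceFriday = (([int]$now.DayOfWeek - [int][DayOfWeek]::Friday) + 7) % 7
$windowStart     = $now.Date.AddDays(-$daysSinceFriday).AddHours(20)       # most recent Friday 8 PM
if ($windowStart -gt $now) { $windowStart = $windowStart.AddDays(-7) }
$windowEnd       = $windowStart.AddHours(48)                               # Sunday 8 PM
$totalJobs       = 150

# Linear interpolation: how far into the backup window are we, and how many jobs should still be pending?
$progress        = [math]::Min(1, [math]::Max(0, ($now - $windowStart).TotalHours / ($windowEnd - $windowStart).TotalHours))
$expectedPending = [math]::Round($totalJobs * (1 - $progress))

$actualPending   = 140   # example value - in practice this would come from the BEDB overview query
if ($actualPending -gt $expectedPending * 1.2) {
    Write-Output "WARNING: $actualPending jobs still pending, expected roughly $expectedPending by now"
}
```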

Another idea could be to constantly check whether the total bytes backed up keep going up – but this again is challenging, since we would need to compare values over time. PRTG is, as far as I know, not directly able to do so, which means we would need a temp file (or a small database) holding the values from the last check in some kind of script that we could compare against – roughly like the sketch below.
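And for that “compare values over time” idea, a custom sensor script could persist the last total-bytes value in a small state file and report the delta, roughly like this – the file path and the example value are placeholders:

```powershell
# Sketch: persist the last total-bytes value and report the delta since the previous run.
# The state file path is a placeholder; $currentTotalBytes would come from the BEDB query.
$stateFile         = "C:\Scripts\BackupExec_LastTotalBytes.txt"
$currentTotalBytes = 123456789   # example value - replace with the real query result

$lastTotalBytes = 0
if (Test-Path $stateFile) {
    $lastTotalBytes = [int64](Get-Content $stateFile -First 1)
}

$delta = $currentTotalBytes - $lastTotalBytes
Set-Content -Path $stateFile -Value $currentTotalBytes

# Report the delta, e.g. as a PRTG EXE/XML channel; a delta of 0 over several
# checks during the backup window would indicate that nothing is progressing.
"<prtg><result><channel>Bytes backed up since last check</channel><value>$delta</value></result></prtg>"
```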

So far I have not come up with the ultimate solution – every now and then I think about it a little more… but well, I am not there yet.