The following script was created to bypass an issue in the SOAP API involving VMware, hardware vendor drivers, and PRTG. You could of course use the same script for other monitoring systems or any other purpose, adjusting it to your needs.
You can find more information about the issue here: https://kb.paessler.com/en/topic/82458-vmware-host-hardware-status-soap-sensor-returns-warnings-after-update-to-vmware-6-7
To make this work you need to install the VMware PowerCLI PowerShell extension on the PRTG probe server. You will also need to pass in the username and password, as well as the vCenter name and the host's internal hostname in vCenter.
Test it in PowerShell as the probe user first – you should see the results. The script creates a sensor with multiple channels: hardware items in GREEN status are only counted, items in UNKNOWN status are counted and returned as text, and as long as no YELLOW or RED status (warning or error) occurs, the sensor stays green/okay. Warning or error levels apply automatically and name the problematic hardware systems in the sensor message text.
My first attempt was to show all channels on top of the summary. Since over 100 separate hardware statuses came back and PRTG limits a sensor to 50 channels, I dropped the idea – though the script still contains all the code to handle it.
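The core of the approach can be sketched in a few lines of PowerCLI. This is a heavily simplified sketch, not the actual script: the parameter names are assumptions, and the PRTG XML output part is omitted – only the hardware-status collection and per-state counting are shown.

```powershell
# Sketch only - assumes VMware PowerCLI is installed for all users on the probe server.
# Parameter names are hypothetical; the real script defines its own.
param($VCenter, $VMHostName, $User, $Password)

Connect-VIServer -Server $VCenter -User $User -Password $Password | Out-Null

# reach the host's health status system through the vSphere API views
$esx     = Get-VMHost -Name $VMHostName | Get-View
$health  = Get-View $esx.ConfigManager.HealthStatusSystem
$sensors = $health.Runtime.SystemHealthInfo.NumericSensorInfo

# HealthState.Key is "green", "yellow", "red" or "unknown"
$green   = ($sensors | Where-Object { $_.HealthState.Key -eq "green" }).Count
$unknown =  $sensors | Where-Object { $_.HealthState.Key -eq "unknown" }
$bad     =  $sensors | Where-Object { $_.HealthState.Key -in "yellow","red" }

Disconnect-VIServer -Server $VCenter -Confirm:$false
```

From there, green sensors are only counted, unknown ones are returned as text, and any yellow/red entries raise the sensor's warning/error state with the hardware names in the message.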
There is a way to read out and process ALL alerts of your VMware environment using PowerShell and report the results back to PRTG. The script further down in this article does exactly that; the result is similar to the graphic here.
It shows the following channels:
the sensor stays green as long as there are no unacknowledged warnings or alerts in VMware
if a warning or alert is acknowledged, the sensor/script returns to green, since it is nothing new
Total Alerts – number of alerts, acknowledged and not acknowledged
Total Alerts – Acknowledged
Total Alerts – NOT Acknowledged
Total Warnings – Acknowledged
Total Warnings – NOT Acknowledged
Total Warnings and Alerts
Total Warnings and Alerts – Acknowledged
Total Warnings and Alerts – NOT Acknowledged
As you can see, you can get more granular with your PRTG statuses if you use the channels for acknowledged warnings/alerts. You could set an upper warning or error limit of 0 to keep a warning/error level in PRTG if you still want to see them.
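For reference, an EXE/Script Advanced sensor reports channels like the ones above by writing PRTG's XML result format to stdout. A minimal sketch of what the script's output could look like – channel names taken from the list above, values and the lookup id purely illustrative:

```xml
<prtg>
  <result>
    <channel>Total Alerts - NOT Acknowledged</channel>
    <value>2</value>
  </result>
  <result>
    <channel>Total Warnings and Alerts</channel>
    <value>5</value>
    <!-- optional: bind a custom lookup to drive the channel status; id is hypothetical -->
    <ValueLookup>custom.vmware.alerts.status</ValueLookup>
  </result>
  <text>2 unacknowledged alerts</text>
</prtg>
```

Setting upper limits of 0 on the "NOT Acknowledged" channels then turns any new, unhandled alert into a PRTG warning or error.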
While writing the script, I decided to create a new lookup value in PRTG to make the status clearer. If you adjust the script to add additional statuses for the channel overall status, you will need to adjust this file as well.
Let's start with the value lookup file: copy the text from the first script block into a file stored here: C:\Program Files (x86)\PRTG Network Monitor\lookups\custom
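If you no longer have the original file, a PRTG custom lookup is a small XML file following Paessler's PaeValueLookup schema. The example below is illustrative only – the id, states and value mapping are assumptions, not the author's original file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<ValueLookup id="custom.vmware.alerts.status" desiredValue="0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:noNamespaceSchemaLocation="PaeValueLookup.xsd">
  <Lookups>
    <!-- map the script's numeric overall status to PRTG sensor states -->
    <SingleInt state="Ok" value="0">OK</SingleInt>
    <SingleInt state="Warning" value="1">Warning</SingleInt>
    <SingleInt state="Error" value="2">Alert</SingleInt>
  </Lookups>
</ValueLookup>
```

After saving the file, reload the lookups in PRTG (Setup → System Administration → Administrative Tools) so the new id becomes selectable.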
Once you have both files created, go to PRTG and add a new sensor of type EXE/Script Advanced and select the newly created script file. As parameter, either type the hostname of your vSphere server or, if you created the sensor underneath the device in PRTG, just use %host.
UPDATE: I changed the script because I found it better to go with the following expected parameters, always making sure you have control over the username and password used to connect to VMware. Please use the following parameters moving forward:
credentials to connect to VMware can be a challenge, as I found while testing this
you might need to give the PRTG probe's service account sufficient access rights – requires working SSO
alternatively, use a stored credentials file in PowerShell – somewhat secure
or provide the credentials in clear text in PowerShell – least secure
please see line 20, respectively the command "connect-viserver", for more details
updated the script – it now expects username and password as parameters
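The three credential options from the bullets above might look like this in the connection part of the script – a sketch with hypothetical parameter names and file paths, not the script's actual code:

```powershell
param($Server, $User, $Password)

# Option 1: credentials passed as sensor parameters (clear text - least secure)
Connect-VIServer -Server $Server -User $User -Password $Password

# Option 2: stored credential file (somewhat secure - readable only by the
# account that created it, so create it once as the probe's service account):
#   Get-Credential | Export-Clixml "C:\Scripts\vmware-cred.xml"
# $cred = Import-Clixml "C:\Scripts\vmware-cred.xml"
# Connect-VIServer -Server $Server -Credential $cred

# Option 3: pass-through of the probe's service account (needs working SSO)
# Connect-VIServer -Server $Server
```

Whichever option you pick, test it interactively as the probe's service account first, since that is the context the sensor will run in.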
You might want to test the script before you add a sensor to PRTG – the best way is directly on the PRTG server, with the service account of the PRTG probe, to make sure it will work as a sensor later on.
Keep in mind that the script expects a parameter – the VMware vSphere server name / web address.
Today we look at independent backup networks, especially with regard to LTO7 and VMware ESX hosts. Be aware: this example also applies to any backup-to-disk (B2D) solution. A good reseller/vendor would inform you about this right away anyway.
LTO7 and later drives like LTO8 have a write speed faster than a 1 GBit network can handle, which makes it really necessary to think about options. On top of that, you do not want to over-utilize the LAN side of your servers, so that the impact on the user/application-facing side stays minimal. This leaves you with two options: group switch ports, assuming you have enough 1 GB ports (you will need at least 3 combined), or create a whole backup network on a 10 GB basis.
Let’s run some numbers:
LTO7 has a write speed of about 300 MB/s uncompressed and up to 750 MB/s compressed
LTO8 has a write speed of about 360 MB/s uncompressed and up to 900 MB/s compressed
Now – your network connection is measured in MBit/s, not MByte/s. One byte is 8 bits, so we multiply the speeds in bytes by 8 to get the network speed numbers.
LTO7 uncompressed = 300 MB/s * 8 = 2400 MBit/s
LTO7 compressed = 750 MB/s * 8 = 6000 MBit/s
LTO8 uncompressed = 360 MB/s * 8 = 2880 MBit/s
LTO8 compressed = 900 MB/s * 8 = 7200 MBit/s
Assuming you want to go with grouped ports, you can see that with LTO7 you would need 6 ports and with LTO8 7 to 8 ports to fully utilize the speed and minimize your backup window. Additionally, think about the read speed, which might affect you as well – not just for recovery but also for the verify pass of your backup.
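The port counts follow directly from dividing the line rate by the 1000 MBit/s one grouped port can carry. A quick sanity check of the numbers above in PowerShell:

```powershell
# required 1 GBit ports = ceiling(tape speed in MBit/s / 1000 MBit/s per port)
$speeds = [ordered]@{ "LTO7 uncompressed" = 300; "LTO7 compressed" = 750;
                      "LTO8 uncompressed" = 360; "LTO8 compressed" = 900 }
foreach ($name in $speeds.Keys) {
    $mbit  = $speeds[$name] * 8                 # MB/s -> MBit/s
    $ports = [math]::Ceiling($mbit / 1000)      # round up to whole ports
    "{0}: {1} MBit/s -> {2}x 1 GBit ports" -f $name, $mbit, $ports
}
```

For LTO8 compressed this yields 8 ports – which is exactly the point where a single 10 GB NIC becomes the simpler and cheaper option.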
Now – this means – add at least one 10 GB switch and one 10 GB NIC to each server – let’s do this with an example:
3x VMware ESX hosts – LAN side and management configured at 1 GB – we assume there is some kind of storage behind them that has the IOPS and speed we need, like an SSD-based storage
1x Backup media server that has an LTO7 or LTO8 drive connected – 1 GB on the LAN side
What we need – minimal:
4x 10 GB NICs
1x 10 GB switch
4x CAT6a or CAT7 cables
What I would recommend – nice to have:
4x 10 GB NICs – dual port
2x 10 GB switches
10x CAT7 cables – 2x to stack/trunk the switches if not stacked otherwise
This is a nice-to-have – a fail-over setup – but the minimal configuration is sufficient as well.
Cable this all in and create a new IP scope / VLAN on the backup side – you do not need any default gateway etc. on the backup network (10 GB) side. Just an independent IP scope, with every host assigned a static address.
This keeps the regular network traffic and any broadcasts away from this network, and your backup will run totally independently. You might also need to disable your anti-virus solution on this NIC / IP scope on the backup media server, because it can influence the speed quite drastically. Having it separated and independent helps keep security up.
On the VMware hosts I even like to allow vMotion on this backup LAN – simply because it is extremely efficient there, independent from your LAN and, if you have one, from your iSCSI network as well. But that's just an idea.
Now – the backup – how will it grab the data from the 10 GB side of your VMware hosts – especially if you have a vSphere cluster and grab the backup through the cluster?
Simple – you adjust the hosts file on your media server. Each and every VMware host needs to be listed in the hosts file on the media server with the IP it has in your 10 GB backup network. This way DNS and everything else behaves normally in your environment; only the backup media server will reach out to those hosts on the 10 GB network, because it resolves their IPs locally. This is the easiest way to accomplish this.
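On a Windows media server this means a few extra lines in C:\Windows\System32\drivers\etc\hosts. The hostnames and the backup subnet below are made up for illustration:

```
# backup network (10 GB) - resolve the ESX hosts to their backup-LAN IPs
10.10.10.11   esx01
10.10.10.12   esx02
10.10.10.13   esx03
```

Make sure the names match exactly what vCenter hands back to the backup software (possibly the FQDN), otherwise the redirect will still resolve via DNS to the LAN-side addresses.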
You do not need to add a 10 GB connection, backup-network IP address etc. to your VMware vSphere controller – it can stay on your LAN or server-management network as-is. This also means there is no reason to list it in the hosts file on the media server.
How this works: your backup contacts the vSphere controller on the LAN side; it is then redirected to the host that currently holds the VM you want to back up; the media server now contacts the VMware host directly on the 10 GB backup network, thanks to the hosts-file entry; the backup proceeds.
This would of course work with a physical server as well – like a physical file server – though today this is rather rare, and VMware backups in particular are large files that benefit most from the LTO7 write speed, so the above makes the most sense there. It wouldn't matter if you did the same with a Hyper-V environment or any other VM host/guest solution; in theory it always works the same.
What real-world write speeds can you expect? That is the big question – here are some real-world examples. These are single jobs on a per-VM basis, meaning they include even the tape-load and tape-unload processing time and updating the catalogs, while using Veritas Backup Exec.
[Example table: backup size (VM size) · elapsed time in minutes · job rate write/overall · job rate verify]
The above list is just an example – realistically we see overall speeds between about 3,000 MB/min and 18,000 MB/min. This is partly due to the VM itself: thin or thick provisioned, what kind of data it holds, how busy the host is (we might stress it twice, since multiple drives may back up from the same host at the same time), etc. On average we see around 8,000 to 9,000 MB/min, which is still great – and I wanted to show that it can vary quite a bit, so don't be scared. Going from an LTO4 LAN-based backup scenario to an LTO7 independent backup network, we still cut the backup time in half – actually, to even less than half. The slowest speeds we see today are on systems that can only be backed up on the LAN side; the ports are grouped there, but we still don't reach the speed we see on the backup-network side. Many factors come into play, and it all depends on the individual situation.
Hoping the information above helps some of you out there – keep in mind that your situation might be different, run some examples and ideas and if you have questions, reach out – this remains an example of what I really implemented at a company and how it affected the backup configuration and management.
If you try to PXE boot a VMware guest system that e.g. uses WDS / Windows Deployment Services or similar, you might find that the boot.wim etc. downloads unbelievably slowly. This can take several hours. It is specifically related to booting via the VMware EFI environment; the VMware BIOS does not cause this issue. You could switch an EFI system to BIOS to capture/deploy the image, but that is more of a workaround than a solution.
The solution is pretty simple: the download is transferred via TFTP, and VMware has an issue with the block size – things get messed up due to the variable block size negotiated between the VM guest system and your PXE server.
Set the Maximum Block Size to 1456, which is the exact value VMware needs to work properly. Also disable the Variable Window Extension and try to PXE boot again – it will now load in about a minute (depending on your WinPE image size), and your issues are in the past.
open your Windows Deployment Services
right click on the server and select Properties
navigate to the TFTP tab
set the Maximum Block Size to 1456
uncheck the Enable Variable Window Extension checkbox
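If you prefer to script the GUI steps above, the WDS TFTP settings are commonly reported to live under the WDSTFTP provider key in the registry. This is an assumption from memory – verify the key and value names on your own WDS server before relying on it:

```powershell
# Assumption: WDS stores the TFTP tab settings under this provider key -
# check your server before scripting this in production.
$key = "HKLM:\SYSTEM\CurrentControlSet\Services\WDSServer\Providers\WDSTFTP"
Set-ItemProperty -Path $key -Name "MaximumBlockSize" -Value 1456 -Type DWord
Set-ItemProperty -Path $key -Name "EnableVariableWindowExtension" -Value 0 -Type DWord
Restart-Service WDSServer   # restart so the new TFTP settings take effect
```

Either way, the effect is the same as the five GUI steps: fixed 1456-byte blocks and no variable window extension.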
If you have a virtual environment, like VMware, with a storage device behind it, you might be able to mirror your whole real/live environment and create a complete playground or shadow network: simulate any of your guest VMs like the real system and change, update and adjust configurations within this lab environment. This is actually rather easy to accomplish and can save you a lot of headache.
This is just an example – you might accomplish something similar by cloning single VMs, and in theory it wouldn't even matter whether you use VMware vSphere or something like Microsoft Hyper-V. Still, doing it this way has advantages – let me explain.
You work with a VMware environment with one or more host systems and several guest VMs. You need to update software and configurations on those guest VMs, but have to test this beforehand to ensure everything runs smoothly. The VMs are stored on a central storage device that can do volume-level snapshots.
Prepare the environment
you need one NIC per host that you will connect to a switch
best use an independent switch that has no other network connections than the 1x NIC per host
create a Shadow-LAN virtual switch in your VMware cluster and use that 1x NIC per host so VMs across the cluster can communicate properly
What you do:
create a snapshot and mount it to the VMware host systems as an additional volume
go to the cloned volume's file system and add the VMs you need to the inventory (you might want to rename them to shadow-<servername> so you can easily identify them)
re-configure their network connection from your regular LAN virtual switch to the shadow-LAN virtual switch you prepared
start the added guest VM
if you are asked whether you moved or copied the guest, say you moved it – to avoid hardware/MAC and other changes that could cause Windows to want to re-activate
VMware might complain about a duplicate MAC address – you can ignore this, because you are on two different/independent networks
Real usage example:
Let me give you a more detailed example of what I personally used and did with this. It should help you understand the whole principle better.
VMware cluster with e.g. 10x host systems – we had enough RAM and CPU power that 3x hosts could go down; you won't need that much, but of course you want some buffer for RAM and CPU usage
Nimble All-Flash storage arrays in the background, connected to all the VMware hosts and using the Nimble VMware plugins (Note: Nimble has since been bought by HP / HPE)
the Nimbles are configured to do volume level snapshots multiple times per day
all physical host systems had a dedicated network card (NIC) connected to an independent physical switch that was NOT connected to any other network switch
a virtual switch SHADOW-LAN was created and those physical NICs of the host systems were assigned to it
this allowed any VM connected to this virtual switch to communicate with other VMs connected to the same virtual switch on other hosts
Due to migrations, software updates and quality-controlled systems, we constantly faced the challenge of testing changes and adjustments thoroughly. So I came up with the solution of cloning a snapshot on the Nimble storage array where the VM resided and mounting it to the VMware cluster – taking only minutes – then adding e.g. domain controllers, DHCP servers, necessary file servers and the target guest system to the VMware inventory, adjusting their names so we could quickly identify them (even adding them to resource pools if necessary) and, most importantly, changing their virtual switch configuration to the shadow switch.
Advantages and possibilities:
This now allowed us to simulate the whole real-world system (VMs) and every change we wanted. To get software in, we attached VHDs holding what we needed, or briefly connected a secondary internet connection for licensing or update services that vendors only provide online. The advantages of the solution go even further:
simulate everything you have in your VMware environment available and have it working like the real / live system
if necessary, provide internet access by connecting a SECONDARY internet connection (router / firewall) to the physical shadow network switch
adding real printers to the shadow switch to be able to test print-outs (we had those cases)
add physical workstations to simulate whole production environments
update / refresh the whole system in only a few minutes by using a fresh-snapshot clone
only minimal to almost no impact on the storage / free space of your storage device
this is because the Nimble snapshot clone creates a new branch, and only the deltas (changes) have an impact on the storage – even for "bigger" simulations we are talking about only a few gigabytes of changed data, if that much, depending of course on your storage and what you do
we installed VMware console connections on quality testing workstations so they could access the system directly on the console
of course only granting them minimal rights to this specific pool of VMs
avoiding that their access to the VMware environment had any impact to the real system
documenting any changes, challenges faced and solutions found
thanks to the physical switch – best practice is a layer-3 switch – we were able to simulate whole VLANs, routing etc. within the environment and even connect various physical systems like printers, workstations and, temporarily, an internet connection
It takes only a small initial effort to prepare for those simulations, because the virtual shadow switch, the physical shadow switch and the hosts' network card connections to that physical switch are a one-time effort. After that you just clone and mount snapshots and add the VMs you need, adjusting their network connection to the virtual shadow switch.
Once set up, preparing a simulation usually takes less than 30 minutes until everything is cloned, mounted, added to the inventory (incl. NICs adjusted to the shadow switch) and booted up.
Why not just clone all VMs via VMware?
Good question – the answer is simple: it would impact your storage capacity, because it creates an actual clone. It also takes longer to clone individual VMs than to just grab a storage-level snapshot, where you can adjust what you want down to the volume level on the storage. Even the clean-up might be more intense or leave unwanted data behind – whereas a clone of the volume only requires you to remove the VM guest systems from the inventory and then unmount and delete the whole shadow volume.
I wrote this all up because I wanted to share it – the whole idea is not that special in theory, but I thought it is a good example of how you can build a huge yet decent lab environment with only minimal effort. In any case, I hope the idea behind it helps some of you out there 🙂
Here is the article I wrote – starting with a description of the actual issue:
We have been experiencing VSS issues with VMware guests related to the installed Backup Exec agent and a previously installed VMware Tools VSS option.
Uninstalling the VMware Tools VSS option in various ways including restarts did not fix our issues. If you search the internet for solutions, you find many attempts but no real solution or explanation.
One of our admins spent several hours with Veritas support without a solution; he was about to escalate the issue with them when we found the root cause and could actually fix our issues.
First the steps to solve this:
Uninstall the VMware Tools VSS option (no restart will be required)
Make sure VMware VSS service was deleted
If this is not the case, you might need to delete it manually and remove additional DLLs etc., as well as restart the system – but that is independent of this solution
You might have already done steps 1 and 2 but still get VSS exceptions from the backup saying you have more than one VSS agent installed:
V-79-8192-38331 – The backup has detected that both the VMware VSS Provider and the Symantec VSS Provider have been installed on the virtual machine ‘hostname’. However, only one VSS Provider can be used on a virutal machine. You must uninstall the VMware VSS Provider.
Now you wonder what causes this and you get stuck
You could uninstall the Veritas/Symantec Backup Exec Agent and only back the system up per VMDK
You would lose the GRT / granular backup / restore capabilities
If this key exists, but your VMware VSS provider is uninstalled, you need to follow up with step 5
Open Notepad as administrator
Open this file in Notepad
Search for the following two entries:
Make sure both of them are set to FALSE; most likely one of them is TRUE
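The two entries are not quoted above; on the systems where I have seen this discussed they are the vcbprovider lines of the VMware Tools manifest.txt. The names below are given from memory as an assumption – verify them against your own manifest.txt before editing:

```
vcbprovider.installed = "FALSE"
vcbprovider_2003.installed = "FALSE"
```

If either of these still reads TRUE after the VSS option was uninstalled, the pre-backup script will keep flagging a second VSS provider.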
Run a test backup
This test backup now should not show the exception anymore
The registry key should vanish (refresh/press F5) without you taking action
So what happened?
You uninstalled the VMware Tools VSS provider, but this manifest file did not get updated. We could see that it sometimes gets updated and sometimes does not. This seems to be some kind of issue with the VMware Tools uninstaller/installer.
But why this manifest.txt file?
As we found out, there are scripts that get executed by Symantec/Veritas Backup Exec before the backup. You might find them in two locations, and which script is executed (at which location) seems to depend a bit on the Windows version. You could edit them both and simply remove the checks from the scripts, but that wouldn't be correct – it is more correct to update the manifest.txt file. If you want, check the date/time of the manifest.txt file before you change it: you might see it was not updated when you uninstalled the VMware Tools VSS provider (assuming you only did this and no additional installs/uninstalls within the VMware Tools, and that you were still experiencing these issues).
This script checks several DLLs, registry entries, paths and, on Windows 2008 and newer, the ProgramData path for this specific manifest.txt file and the two entries mentioned.
Once you uninstall the VMware VSS provider and the file does not get updated, you might see this issue and wonder how to solve it. The solution is simply to update the file to mirror the uninstallation of the VMware VSS provider (vcbprovider). We double-checked this on several installations: if the file actually gets updated, those two values are set to FALSE; if it doesn't, at least one of the values remains TRUE, which causes the pre-freeze-script.bat to write the registry key mentioned earlier and therefore the exceptions in the backup.
If you still have the same issues after updating the manifest.txt, simply check all the DLLs mentioned in the script and make sure they don't exist. You might also consider manually deleting the registry key (it seems to be just a dummy key) to make sure nothing prevents the script from deleting it. Make sure it does not re-appear after a backup! Otherwise you might still have some DLLs left on your system that cause the script to re-create the registry key.
Hope this helps a few of you out there. This was an ongoing issue for a while, and I have come across it many times ever since Windows 2008. This applies to Windows 2008 R2, Windows 2012, Windows 2012 R2 and pretty surely Windows 2016 as well.
It helped us get rid of those issues completely, without even needing a single restart of the guest VM (removing the VMware VSS provider did not require a restart).