Hi All

I have a weird one. I’m hoping someone can give me an idea where to start looking.

I have Windows Server 2012 R2 virtual machines that hang at random.
All of the hosts are running Windows Server 2012 R2 Hyper-V and have at least two guest virtual servers: one Windows Server 2012 R2 guest running Active Directory and one Windows Server 2012 R2 Remote Desktop server.

When a guest VM hangs, the other VMs and the host continue running fine. None of the Windows servers log any faults around the time of the hang. All of the servers have the latest Windows updates. On one of the servers I replaced all of the hardware and it still happens. I’ve disabled the backups and antivirus. I’m at a loss as to what to try next.

The hangs seem to occur only overnight, when the servers are not in use. The Remote Desktop servers hang more often than the Active Directory servers.

Does anyone have any idea where I can start to track down the cause?


Sounds like the VMQ issue. I think it happens with Broadcom NICs mostly. I lived through this several years ago.

On the host and all the VMs, go into the properties of the NICs and turn off VMQ.

You can search for this for more info.

You say “hang” but if this is the cause, they’re accessible in a console session. They just don’t respond via network.

Some more info I just noticed.
The VM hasn’t completely hung. I can still ping it. I can’t do anything else; it’s completely unresponsive.
In Hyper-V Manager the heartbeat is still showing as OK.

Unfortunately I don’t think it’s the network VMQ issue. The servers don’t have Broadcom NICs and the VMs are not accessible via the console.

  • What is running on the host: just the hypervisor on bare metal, or a full 2012 R2 operating system with the Hyper-V role and possibly other applications?

  • Which license / editions of 2012 R2 do you have and what’s the limit on number of VMs and physical hosts to be covered per license?

  • How many Windows 2012 R2 licenses for which edition do you have?

I don’t remember the license restrictions of 2012 R2. But with the current Standard license for 2019, you can run 2 VMs when the host runs just a bare-metal installation of Hyper-V.

What is running on the hosts (Server 2012 R2 with the Hyper-V role, or Hyper-V Server)?

What are the specs of the host?

What is the make and model of the host? Did you update the firmware for the host?

Did you check the health of the host components, mainly temperature, RAM, RAID controller and storage? Did you check the resource availability on the host? Your scenario sounds like resource contention.


What is the host hardware? What are the host NICs?

Default 1G NICs on Dell and HPE servers use Broadcom/QLogic chips. The native Microsoft drivers for those chips are terrible and known to cause this exact problem.

I would get the latest NIC firmware and drivers from your server OEM and apply them to your host.


What power mode is the host set to?


Have you also tried replacing the virtual machine’s virtual hardware? You can do that by creating a similar virtual machine from scratch and reattaching the virtual disks, or by cloning the virtual machine with the Hyper-V Manager Export/Import routine or the free V2V Converter (V2V Converter / P2V Converter - Converting VM Formats).

The VM hasn’t completely hung. I can still ping it. I can’t do anything else; it’s completely unresponsive.

That is pretty strange. Can you check whether any ports are still open and listening while it responds to ping? Are you able to use remote PowerShell or any other RPC-based options to interact with the virtual machine when it is stuck?
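
One way to do the port check from another machine is a small TCP probe. This is only a rough sketch in Python: the port list is my assumption about what a 2012 R2 Remote Desktop or Active Directory guest usually listens on (RDP, SMB, RPC endpoint mapper, WinRM), and the function names are made up for illustration. Adjust both to your environment.

```python
import socket

# Assumed ports for a Windows guest; not taken from the original post.
DEFAULT_PORTS = {
    "RDP": 3389,
    "SMB": 445,
    "RPC endpoint mapper": 135,
    "WinRM (remote PowerShell)": 5985,
}


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or unreachable host.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(host, ports=None, timeout=3.0):
    """Probe each named port and return a dict of {name: True/False}."""
    if ports is None:
        ports = DEFAULT_PORTS
    return {name: is_port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `probe("rds01")` (hypothetical hostname) while the VM is stuck returns a dict showing which ports still accept connections. Keep in mind that a successful connect only shows that the guest's TCP stack accepted it; it does not prove the service behind the port is actually responding.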

Not necessarily. That could be typical of a busy system, especially in combination with a lack of resources.

  • Do you have some monitors accessible during such a seemingly hanging situation?

  • If not, does the hang happen as frequently when you recreate the VM, following the steps mentioned above, with more physical and virtual memory? Give the VM much more memory than it has now, but still no more than the host’s physical memory.

  • If you do have access to such monitors during a hang, is the clock advancing at normal speed or much more slowly? Which services or applications are using the most CPU and RAM, and how much swapping is going on due to lack of RAM?


Lack of resources does not happen abruptly and is usually clearly reflected in Windows System logs, which is not the case here as OP describes.

That said, I agree with Scheff that running some hypervisor monitoring to find any possible bottlenecks would be a valid option. Veeam ONE could be of help here: https://www.veeam.com/virtualization-management-one-solution.html
