I’m having an issue with PXE booting machines for WDS imaging. It worked fine for 25 machines, then suddenly stopped working. I’ve been struggling with it for about a week trying to figure out what went wrong, and I’m no further along. I’ve taken out a lot of variables, and here’s the situation:

My setup:

The physical server is a nice and beefy Dell server running Server 2012R2, with teamed Broadcom NICs. It’s running as a domain controller, DNS server and DHCP server in addition to WDS. The DHCP server is set for option 60 - PXEClient (per numerous guides for running DHCP and WDS on the same server).

I have isolated the server and test clients onto a plain old unmanaged network switch, with 5 Lenovo Thinkpad Edge 530c laptops and 5 Dell OptiPlex 980 PCs as the PXE clients.

The issue:
When I PXE boot the client machines one of three things happens.

  1. 2 or 3 machines work fine, they get the boot file and begin the imaging process.

  2. Several machines start the TFTP transfer, then keep restarting the transfer until they error out.

  3. Several machines give the error - PXE-E55 Proxy DHCP Service did not reply to request on port 4011.

The last message is certainly curious. This server did indeed work fine for 25 machines a couple of weeks ago, and 2 or 3 machines work fine now, so I’m really confused as to what’s causing it. It’s a known problem when WDS and DHCP reside on the same server without DHCP option 60 set, but I do have option 60 set.

It’s almost like something on the network is conflicting with PXE, which is why I isolated these machines onto their own network switch for testing.

Clearly the issue resides on the server, but I simply cannot figure out what the problem is.

In addition to isolating everything, I have:

  1. Deleted the contents of the \RemoteInstall\Mgmt folder and let the server recreate it.

  2. Completely uninstalled WDS, deleted the RemoteInstall folder, and reinstalled WDS.

No luck.

I’m currently in the process of capturing some Wireshark logs, but I’m not that great at deciphering those, so I’m not sure how much help that will be to me.
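
One thing that can be checked in a capture like this is whether the DHCP offer actually carries option 60 set to PXEClient, since that is what sends the client on to port 4011. Below is a minimal Python sketch of reading the options field of a DHCP packet (the bytes after the magic cookie), copied out of Wireshark as raw bytes. The function names are made up for illustration, and it only handles the plain code/length/value layout.

```python
from typing import Dict


def parse_dhcp_options(blob: bytes) -> Dict[int, bytes]:
    """Parse a DHCP options field (bytes after the magic cookie) into {code: value}."""
    options: Dict[int, bytes] = {}
    i = 0
    while i < len(blob):
        code = blob[i]
        if code == 255:  # End option: stop parsing
            break
        if code == 0:  # Pad option: a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(blob):
            raise ValueError("truncated option header")
        length = blob[i + 1]
        value = blob[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options[code] = value
        i += 2 + length
    return options


def offer_advertises_pxe(options: Dict[int, bytes]) -> bool:
    """True if option 60 (vendor class identifier) starts with 'PXEClient'."""
    return options.get(60, b"").startswith(b"PXEClient")
```

If the offer from the server does not show option 60 here, the DHCP side is the place to look; if it does, the missing reply on port 4011 points at the WDS service instead.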

On a complete whim, while sitting at home tonight, I remoted in and created a bunch of virtual machines on my Hyper-V server, and tested them PXE booting against the WDS machine. They failed in the same ways the physical clients did: a few worked, a few started getting the file, and a few got the 4011 error.

Next, I uninstalled WDS, then reinstalled it, but chose the radio button for a standalone server rather than AD integrated.

I reset all my Hyper-V machines and had them PXE boot against the WDS machine, and every single one almost instantly received the PXE boot file and worked.

And by instantly, I mean like 1 or 2 seconds after starting the PXE process.

Before, the clients that did work took 10 to 15 seconds.

So something about tying my WDS server to Active Directory caused an issue that, for some reason, didn’t crop up until a week or so after a successful deployment. Weird. I did notice that the WDS configuration UI is much more responsive in standalone mode. I lose the ability to have it drop my freshly imaged machines into the correct OU, but it works.

To the best of my knowledge there’s nothing wrong with my AD structure. I have two domain controllers running Server 2012R2 at the 2012R2 domain functional level.

I wonder what I need to troubleshoot to figure out why it works standalone vs AD Integrated?

Is it possible to connect a PXE client directly to the WDS server to eliminate the network and see what happens?

@techsup. WDS and DHCP are running on different servers but on the same subnet so broadcasts are not a problem: As long as the client deploying an image from the WDS server and the WDS server are on the same subnet and different servers, everything works fine with no DHCP options configured.

Can you use a third-party TFTP client on a computer to connect to the WDS server and see if you can see the boot file?

Maybe it will experience the same error and help narrow things down with an error message to work with.
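
If no TFTP client is handy, the same check can be roughed out in a few lines of Python. This is only a sketch of a TFTP read request as described in RFC 1350: it asks for the file and reports the first reply (a data block or an error), without downloading the whole thing. The function names are hypothetical, and the boot file path in the usage note is an assumption about a default WDS install.

```python
import socket
import struct

OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5  # TFTP opcodes from RFC 1350


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def describe_reply(packet: bytes) -> str:
    """Summarise the first packet the server sends back."""
    if len(packet) < 4:
        return "malformed reply"
    opcode, number = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return f"DATA block {number}, {len(packet) - 4} bytes"
    if opcode == OP_ERROR:
        message = packet[4:].split(b"\x00", 1)[0].decode("ascii", "replace")
        return f"ERROR {number}: {message}"
    return f"unexpected opcode {opcode}"


def probe(server: str, filename: str, timeout: float = 5.0) -> str:
    """Send one read request to UDP port 69 and describe the first reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            packet, _ = sock.recvfrom(65536)
        except socket.timeout:
            return "no reply (timeout)"
    return describe_reply(packet)
```

Something like `print(probe("192.168.1.10", "boot\\x64\\wdsnbp.com"))` would then show whether the server hands out the first block; the address is a placeholder for the WDS server's IP.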

Also, I think you need DHCP options 60, 66 and 67 set.
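
For reference, those three options are just code/length/value entries in the DHCP reply: 60 is the vendor class identifier, 66 the TFTP server name and 67 the boot file name. A small Python sketch of how they would look on the wire, assuming placeholder values for the server and file (the helper names are made up):

```python
def encode_option(code: int, value: bytes) -> bytes:
    """Encode one DHCP option as code byte, length byte, then the value."""
    if not 1 <= code <= 254:
        raise ValueError("option code must be between 1 and 254")
    if len(value) > 255:
        raise ValueError("option value is longer than 255 bytes")
    return bytes([code, len(value)]) + value


def pxe_options(tftp_server: str, boot_file: str) -> bytes:
    """Options 60, 66 and 67 followed by the End option (255)."""
    return (
        encode_option(60, b"PXEClient")
        + encode_option(66, tftp_server.encode("ascii"))
        + encode_option(67, boot_file.encode("ascii"))
        + b"\xff"
    )
```

Comparing this layout against the options shown in a packet capture makes it easier to tell which of the three the server is really sending.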

Yeah, I really have no clue what’s happening here, but when I went back and installed it as a standalone WDS server, instead of integrated with AD, it worked fine.

The TFTP transfer works, and having only option 60 set works. I don’t know. It just seems that something’s wrong with the AD DS integration portion.

As a test, I spun up a virtual server, installed WDS integrated with AD and pointed DHCP to it. Same issue. When I went back and reinstalled WDS on the virtual server as a standalone server, it worked fine as well.

So, for some reason in my environment making the WDS server integrated with AD causes problems.

I just went back and edited my Unattend.xml files to take care of the PC naming and domain joining.