
We have training rooms where Windows XP is normally installed via PXE. The "normal" DNS/DHCP infrastructure runs on Windows servers. The training room has its own VLAN (different from that of the Windows servers), so there is most probably an IP helper for DHCP requests active on the Cisco router to which all PCs in that room are connected.

Now we wanted to convert some of the PCs to Linux instead. The idea was to put our own laptop with a DHCP server into the room's VLAN and override the "normal" DHCP response. We expected this to work, since a DHCP server attached directly to that VLAN should respond faster than the "normal" DHCP server, which is located some hops away.

It turned out that this did not work. We had to manually release the lease on the original DHCP server to get it working.

On the laptop we could see the client requesting its IP. "Our" DHCP server sent NACKs in reply to the request for the Windows-assigned IP; before that, it had sent its own offer.

Old Question: Why did this not work out as expected? What is making the PC regain its old lease?

Update 2012-08-08:

The regain-issue has been explained in the DHCP-RFC. Now this explains why the PC regains its old lease.

Now we do release the IP from the Windows-DHCP-server before giving it another try.

Again - the Windows-DHCP-server wins.

I suspect that there is some algorithm for the dhcp-client which determines the "best" dhcp-answer for the client. The new question is:

How does the client choose the "best" answer?

Nils

4 Answers


Assuming the router is still acting as a DHCP relay and forwarding the request to your original server, the client kept its old address simply because the Windows DHCP server told it to go ahead and use that IP. In this instance the DHCPNACK from the new server is irrelevant: a DHCP client will consider all responses, and since it got a positive answer from the Windows DHCP box, it's perfectly happy to use it.

PC: Oh hi world, can I use 192.168.1.123?

New-DHCP: I say no.

Old-DHCP: I say yes.

PC: Someone said yes! Sweet, I'll use it!

ThatGraemeGuy
  • After a cold boot of the PC the conversation starts with "my MAC is XYZ - please give me an IP". Then both DHCP servers offer IPs... the only difference is that it has an active lease on one of the servers - but this is just the server's perspective. – Nils Aug 03 '12 at 20:54
  • not if the PC already had an IP address. if it previously had an IP address assigned by a DHCP server, it will ask to use that one first before asking for another address. – longneck Aug 03 '12 at 20:56
  • @longneck where will that IP be stored on the PC? – Nils Aug 03 '12 at 20:57
  • off the top of my head, i don't know. but the proper way to clear it is to use ipconfig /release – longneck Aug 03 '12 at 21:03
  • @longneck - the op is asking about a PXE environment, where we're assuming that the boot BIOS has no recollection of previous boots or IP addresses – Mark Henderson Aug 03 '12 at 21:11
  • good point, but how do you "release" a PXE IP address? the OP specifically mentions doing that. – longneck Aug 03 '12 at 21:13
  • Your windows DHCP administrator could set a 1000ms delay on that scope... might fix your issue. – SpacemanSpiff Aug 27 '12 at 02:32
  • @SpacemanSpiff on which "scope" and what issue might be fixed by this? Perhaps make that an answer of its own? – Nils Aug 30 '12 at 21:11

How a client reacts to multiple DHCP answers is vendor-specific, even firmware-specific.

Variants I have seen over the years are:

1) Accept the first regardless whether it is an ACK or NACK.

2) Take the first ACK, ignore NACKs completely.

3) Take the last ACK received within a set time-interval (usually 5-10 seconds).

Example: Some years ago we had issues with Ricoh MFPs.
We had 2 DHCP servers. One supplied the addresses, the other only additional DHCP options. The 2nd server always answered first.
The Ricohs used variant 1) even if the 1st offer only contained DHCP options. Ricoh changed it to variant 2) with a firmware update after we explained the problem to them.
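
The three variants above can be compared side by side in a small simulation. This is only a sketch: the `Reply` record, the function names, and the arrival times are made up for illustration and are not taken from any real DHCP client.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reply:
    server: str        # hypothetical server label
    kind: str          # "ACK" or "NAK"
    arrival_ms: float  # hypothetical arrival time at the client


def by_arrival(replies: List[Reply]) -> List[Reply]:
    """Return the replies ordered by arrival time."""
    return sorted(replies, key=lambda r: r.arrival_ms)


def accept_first_reply(replies: List[Reply]) -> Optional[Reply]:
    """Variant 1: act on whatever arrives first, ACK or NAK."""
    ordered = by_arrival(replies)
    return ordered[0] if ordered else None


def first_ack(replies: List[Reply]) -> Optional[Reply]:
    """Variant 2: take the first ACK and ignore NAKs completely."""
    for reply in by_arrival(replies):
        if reply.kind == "ACK":
            return reply
    return None


def last_ack_within(replies: List[Reply], window_ms: float = 5000.0) -> Optional[Reply]:
    """Variant 3: take the last ACK that arrives inside the time window."""
    acks = [r for r in by_arrival(replies)
            if r.kind == "ACK" and r.arrival_ms <= window_ms]
    return acks[-1] if acks else None


# Made-up timings: the nearby laptop answers first with a NAK,
# the relayed Windows server answers later with an ACK.
replies = [Reply("laptop", "NAK", 1.0), Reply("windows", "ACK", 12.0)]

for strategy in (accept_first_reply, first_ack, last_ack_within):
    chosen = strategy(replies)
    print(strategy.__name__, "->", chosen.server if chosen else None)
```

With these timings only variant 1 ends up acting on the laptop's reply; variants 2 and 3 both settle on the Windows server's ACK.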

Tonny
  • The `OFFER` packets are what the client system needs to decide between. `ACK` and `NACK` packets are only sent in response to a `REQUEST`, which only occurs after the client has "decided" which offer to go after. That is a pretty cool bug with the printers, though! – Shane Madden Aug 29 '12 at 15:19
  • @ShaneMadden That is correct, but I have seen numerous cases of clients sending a request in response to BOTH offers and then acting on the replies as I described. It's been a while since I looked at this in depth. I clearly remember NT4, W2K and XP being guilty of this. The Ricohs did too. They ran a Linux 2.2 kernel and network stack. – Tonny Aug 29 '12 at 18:40

If nothing else helps - RTFM (read the fine manual). In this case the first one was the hit.

RFC 2131 outlines DHCP-operations.

Section 1.6 states that DHCP must:

Retain DHCP client configuration across server reboots, and, whenever possible, a DHCP client should be assigned the same configuration parameters despite restarts of the DHCP mechanism,

Now the interesting question is how that design goal is being achieved on a client that has no knowledge of its past. Section 3.2 outlines:

3.2 Client-server interaction - reusing a previously allocated network address

If a client remembers and wishes to reuse a previously allocated network address, a client may choose to omit some of the steps described in the previous section. The timeline diagram in figure 4 shows the timing relationships in a typical client-server interaction for a client reusing a previously allocated network address.

  1. The client broadcasts a DHCPREQUEST message on its local subnet. The message includes the client's network address in the 'requested IP address' option. As the client has not received its network address, it MUST NOT fill in the 'ciaddr' field. BOOTP relay agents pass the message on to DHCP servers not on the same subnet. If the client used a 'client identifier' to obtain its address, the client MUST use the same 'client identifier' in the DHCPREQUEST message.

  2. Servers with knowledge of the client's configuration parameters respond with a DHCPACK message to the client. Servers SHOULD NOT check that the client's network address is already in use; the client may respond to ICMP Echo Request messages at this point.
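
To make step 1 of the quoted section concrete, here is a minimal sketch of how such a DHCPREQUEST could be assembled by hand: 'ciaddr' stays zero and the remembered address goes into the 'requested IP address' option (option 50). The function name, the MAC address and the transaction ID are assumptions for illustration; 192.168.1.123 is just the example address used elsewhere in this thread. The block only builds the bytes and does not send anything.

```python
import socket
import struct

DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])  # marks the start of the options area
OPT_REQUESTED_IP = 50
OPT_MESSAGE_TYPE = 53
OPT_END = 255
DHCPREQUEST = 3


def build_init_reboot_request(mac: bytes, requested_ip: str, xid: int) -> bytes:
    """Build a DHCPREQUEST for a client that wants to reuse an old address."""
    zero_ip = socket.inet_aton("0.0.0.0")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,        # op: BOOTREQUEST
        1,        # htype: Ethernet
        6,        # hlen: length of a MAC address
        0,        # hops
        xid,      # transaction ID chosen by the client
        0,        # secs
        0,        # flags (broadcast-reply bit left unset in this sketch)
        zero_ip,  # ciaddr: left empty, as the quoted text requires
        zero_ip,  # yiaddr
        zero_ip,  # siaddr
        zero_ip,  # giaddr: a relay agent would fill this in
        mac,      # chaddr: struct pads it to 16 bytes with zeros
        b"",      # sname (unused)
        b"",      # file (unused)
    )
    options = (
        bytes([OPT_MESSAGE_TYPE, 1, DHCPREQUEST])
        + bytes([OPT_REQUESTED_IP, 4]) + socket.inet_aton(requested_ip)
        + bytes([OPT_END])
    )
    return header + DHCP_MAGIC_COOKIE + options


# Hypothetical MAC and transaction ID.
packet = build_init_reboot_request(bytes.fromhex("001122334455"),
                                   "192.168.1.123", 0x12345678)
print(len(packet), packet[240:].hex())
```

A server that still holds the lease for this MAC can answer such a message directly with a DHCPACK, which is the shortcut discussed here.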

So a DHCP server holding an active lease gets precedence through a shortcut in the protocol.

  1. Client: DHCPREQUEST (MAC address, broadcast; transmitted in the local broadcast domain, here the local VLAN, and via the IP helper to the Windows DHCP server)
  2. Laptop-DHCP-Server: DHCPOFFER
  3. Windows-DHCP-Server: Hey - I already know you - DHCPACK
  4. Client: Oh - I got two responses. One is from a server that already knows me. Cool, I will take that one.

From then on the Laptop-DHCP-Server is being ignored by the Client.

So the solution in our case will probably be (I will update this when we actually test it):

  1. Make sure the client is off
  2. Turn off the DHCP server on the laptop, spoof the client's MAC on the laptop, and send a DHCP request
  3. Release the IP
  4. Restore the laptop's original IP and MAC, and turn its DHCP server back on
  5. Turn on the client and do a PXE boot...
Nils

The new question should probably be asked as a separate question - the title of this one doesn't fit at all with most of its body.

In any case, with regard to how a client chooses which offer to go with, in the case where it has no current lease: it's up to the client, but in every DHCP client implementation that I'm aware of, it's a simple race.

RFC 2131 covers this:

DHCP clients are free to use any strategy in selecting a DHCP server among those from which the client receives a DHCPOFFER message.

There's an IETF draft, apparently dead, that would have added configurability to the selection process. It also mentions the lackluster client implementations (of over a decade ago, but not much has changed):

In practice, most vendor's implementation of policy here is very basic (e.g., first offer received or first acceptable offer received) and is "hard-coded" (i.e., non-configurable).

Having two DHCP servers providing service to the same network with different configuration just results in races, which is not desirable from a reliability or predictability perspective. There's really no reason you can't get your single DHCP server to provide what you need.
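
The "first offer" and "first acceptable offer" policies mentioned above can be sketched as follows. This is a hedged illustration, not how any particular client is implemented: the `Offer` record, the addresses, the timings and the `has_bootfile` predicate (checking for the bootfile-name option, code 67) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Offer:
    server: str        # hypothetical server label
    yiaddr: str        # address being offered
    arrival_ms: float  # hypothetical arrival time at the client
    options: Dict[int, bytes] = field(default_factory=dict)


def first_offer(offers: List[Offer]) -> Optional[Offer]:
    """Simple race: whichever DHCPOFFER arrives first wins."""
    return min(offers, key=lambda o: o.arrival_ms, default=None)


def first_acceptable_offer(
    offers: List[Offer], acceptable: Callable[[Offer], bool]
) -> Optional[Offer]:
    """Earliest offer that passes a client-defined acceptance check."""
    candidates = [o for o in offers if acceptable(o)]
    return min(candidates, key=lambda o: o.arrival_ms, default=None)


def has_bootfile(offer: Offer) -> bool:
    """Hypothetical acceptance rule: the offer carries option 67 (bootfile name)."""
    return 67 in offer.options


# Made-up timings in which the relayed server still wins the plain race.
offers = [
    Offer("windows", "192.168.1.123", 0.8),
    Offer("laptop", "192.168.1.200", 1.5, {67: b"pxelinux.0"}),
]

print(first_offer(offers).server)
print(first_acceptable_offer(offers, has_bootfile).server)
```

Which server wins depends entirely on arrival order and on whatever acceptance rule the client hard-codes, which is why two competing servers give unpredictable results unless the timing is actually measured.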

Shane Madden
  • You think that the "acceptable" offer is vendor-specific on the dhcp-client side? Since in our case it is not the "first" offer it must be something else - the behaviour is quite deterministic though, so I still think there is a common standard behind this. – Nils Aug 27 '12 at 21:25
  • @Nils Are you absolutely certain that the Windows server isn't getting its response to the client before the laptop in the same room? It intuitively seems like the laptop should win that race, but that might not be what's happening. – Shane Madden Aug 27 '12 at 22:27
  • I guess I will have to trace this on network level (with wireshark) to actually see what is happening there. Probably on a mirror-port of that client... – Nils Aug 28 '12 at 20:07