I'm experiencing weird behaviour in our local network. We have two devices that get the same IP address. After investigating, I found that the IP address is outside the range configured on our primary DHCP server on the router.

I scanned the network with nmap for open UDP port 67 and found 9 additional DHCP servers. Most of them are on my colleagues' Windows computers and some are on embedded Linux devices we have here. My guess is that not all of them are actually assigning IP addresses and causing problems, but at least one must be.

What is the most effective way to debug this situation?

Journeyman Geek
  • 133,878
Thugmek
  • 95

2 Answers

On the DHCP clients that get duplicate IPs, start by looking at the options given with the DHCP offer. There's a non-zero chance that each DHCP server will reply with something pointing at itself, as Router/Gateway or Nameserver or something else.

If that doesn't help, pick a time when you can have a maintenance window, probably out of hours because this will interrupt normal services.
On your approved DHCP servers, set the lease time to something low like 5 to 10 minutes now so it's easier to flush things.
Then shut down your proper DHCP servers, and use a test client to do a DHCP request. In another window run a packet sniffer like Wireshark, or ideally tcpdump:

# sudo tcpdump -i any -v -nn port 67 or port 68

10:00:18.982020 eth0 Out IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328) 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 66:ed:6e:c6:9d:56, length 300, xid 0x38e3094d, Flags [none] Client-Ethernet-Address 66:ed:6e:c6:9d:56 Vendor-rfc1048 Extensions DHCP-Message (53), length 1: Request Parameter-Request (55), length 13: Subnet-Mask (1), BR (28), Time-Zone (2), Default-Gateway (3)...

10:00:18.984436 eth0 In IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 334) 10.28.1.1.67 > 10.28.100.109.68: BOOTP/DHCP, Reply, length 306, xid 0x38e3094d, Flags [none] Your-IP 10.28.100.109 Client-Ethernet-Address 66:ed:6e:c6:9d:56 Vendor-rfc1048 Extensions DHCP-Message (53), length 1: ACK

          Server-ID (54), length 4: 10.28.1.1
          Lease-Time (51), length 4: 7200
          Subnet-Mask (1), length 4: 255.255.0.0
          Default-Gateway (3), length 4: 10.28.1.1
          Domain-Name (15), length 14: "criggie.org.nz"
          Domain-Name-Server (6), length 8: 10.28.1.1,10.28.1.1
          Netbios-Name-Server (44), length 4: 10.28.1.2
          NTP (42), length 4: 10.28.1.1

In this case, 66:ed:6e:c6:9d:56 is the MAC address of the client, and the DHCP reply came from 10.28.1.1. The same IP also appears as the gateway, DNS server and NTP server.
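If the capture is long, picking out the Server-ID lines by eye gets tedious. Here is a minimal Python sketch that does it for you, assuming the capture was saved as text and that tcpdump prints option 54 in the same form as the output above. The function names and the APPROVED_SERVERS list are made up for illustration; put your own server addresses in there.

```python
import re

# Hypothetical allow-list: the Server-IDs of the DHCP servers you actually run.
APPROVED_SERVERS = {"10.28.1.1"}

# Assumes tcpdump -v prints option 54 as "Server-ID (54), length 4: <ip>",
# as in the capture shown above.
SERVER_ID_RE = re.compile(
    r"Server-ID \(54\), length 4: (\d{1,3}(?:\.\d{1,3}){3})"
)


def find_server_ids(capture_text):
    """Return the set of DHCP Server-ID addresses found in the capture text."""
    return set(SERVER_ID_RE.findall(capture_text))


def find_rogue_servers(capture_text, approved=APPROVED_SERVERS):
    """Return the Server-IDs that are not on the approved list, sorted."""
    return sorted(find_server_ids(capture_text) - set(approved))
```

Read the saved capture into a string and pass it to find_rogue_servers(); anything it returns is a DHCP server you did not expect to answer.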

Once you have an IP address, quickly check the test-host's ARP table and get a MAC address before it times out.

# arp -an
? (10.28.2.9) at a0:36:9f:b9:ef:52 [ether] on eth0
? (10.28.1.1) at 00:0d:b9:35:29:c4 [ether] on eth0

So now you know 00:0d:b9:35:29:c4 is the hardware MAC address for the IP that gave a DHCP reply.
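Since the ARP entry can expire quickly, it may help to grab the whole table in one go and look the address up afterwards. A rough Python sketch, assuming the Linux `arp -an` output format shown above (parse_arp_table is a name I made up):

```python
import re

# Matches lines such as "? (10.28.1.1) at 00:0d:b9:35:29:c4 [ether] on eth0".
# Entries without a resolved MAC (e.g. "<incomplete>") are skipped.
ARP_LINE_RE = re.compile(
    r"\((\d{1,3}(?:\.\d{1,3}){3})\) at ([0-9A-Fa-f:]{17})\b"
)


def parse_arp_table(arp_output):
    """Map each IP address in `arp -an` output to its lowercase MAC address."""
    table = {}
    for line in arp_output.splitlines():
        match = ARP_LINE_RE.search(line)
        if match:
            table[match.group(1)] = match.group(2).lower()
    return table
```

Feed it the saved output of `arp -an` and index the result with the Server-ID you saw in the DHCP reply.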

Next, work out where on your network that IP/MAC address is. A simple hostname lookup on the IP might tell you something useful:

# host 10.28.1.1
1.1.28.10.in-addr.arpa domain name pointer pfsense.criggie.org.nz.

or

$ dig -x 10.28.1.1
;; ANSWER SECTION:
1.1.28.10.in-addr.arpa. 1    IN   PTR   pfsense.criggie.org.nz.

Or if that's meaningless to you, you'll have to track the MAC address back with an OUI lookup like https://www.wireshark.org/tools/oui-lookup.html
For me it returns "00:0D:B9 PC Engines GmbH" so I know it's an APU or ALIX single board PC. You might get a result that means something to you, or you might get something broad like "hp inc".
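The OUI is just the first three octets of the MAC address, which is the part a lookup tool actually keys on. A small Python sketch for pulling it out of a MAC written in any common notation (oui_prefix is a hypothetical helper, not part of any lookup tool):

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def oui_prefix(mac):
    """Return the first three octets of a MAC address as 'XX:XX:XX'."""
    # Drop separators such as ':', '-' or '.', keeping only the hex digits.
    digits = "".join(ch for ch in mac if ch in HEX_DIGITS)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    digits = digits.upper()
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))
```

For the MAC in this answer, oui_prefix("00:0d:b9:35:29:c4") gives "00:0D:B9", which is the prefix the lookup page reports.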


Your next step is to look through your switch MAC address tables to work out which physical port this MAC address connects to. This presumes you have at least one managed Ethernet switch in your network and can log into it.

This totally depends on what kind of switch you have and is going to be wildly different between ecosystems. Here's a Juniper showing the device is on gigabit port 19:

root@sw1-poe> show ethernet-switching table brief | grep 00:0d:b9:35:29:c4 
  default           00:0d:b9:35:29:c4 Learn          0 ge-0/0/19.0

Same command on a Hasivo switch, showing that device is on port te1 (this one needs MAC in uppercase)

switch0# show mac address-table 00:0D:B9:35:29:C4
 VID | MAC Address       | Type    | Ports          
-----+-------------------+---------+----------
   1 | 00:0D:B9:35:29:C4 | Dynamic | te1 
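Because switches differ in how they want the MAC typed, it can save some fumbling to generate the common notations up front. A minimal Python sketch; the style names are my own, and which notation a given switch accepts is something you need to check on that switch:

```python
def format_mac(mac, style="colon-lower"):
    """Rewrite a MAC address in one of a few common notations."""
    # Keep only the hex digits so any input separator is accepted.
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    pairs = [digits[i:i + 2] for i in range(0, 12, 2)]
    if style == "colon-lower":
        return ":".join(pairs)              # 00:0d:b9:35:29:c4
    if style == "colon-upper":
        return ":".join(pairs).upper()      # 00:0D:B9:35:29:C4
    if style == "dotted":
        # Groups of four hex digits separated by dots: 000d.b935.29c4
        return ".".join(digits[i:i + 4] for i in (0, 4, 8))
    raise ValueError(f"unknown style: {style!r}")
```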

If you only have unmanaged switches, your only option is to ping the IP repeatedly while unplugging parts of your LAN until the ping stops. If the remote host doesn't respond to ICMP pings, use arping, which requires root:

~$ sudo arping  10.28.1.1
ARPING 10.28.1.1
60 bytes from 00:0d:b9:35:29:c4 (10.28.1.1): index=0 time=182.578 usec
60 bytes from 00:0d:b9:35:29:c4 (10.28.1.1): index=1 time=156.675 usec
60 bytes from 00:0d:b9:35:29:c4 (10.28.1.1): index=2 time=198.373 usec
^C
--- 10.28.1.1 statistics ---
3 packets transmitted, 3 packets received,   0% unanswered (0 extra)
rtt min/avg/max/std-dev = 0.157/0.179/0.198/0.017 ms

Finally once you DO find the mysterious DHCP server, explain kindly to whoever did it what the problems are and why it has caused you pain.

I ran two independent DHCP servers for years on the same network without issue. The solution was for the sysadmin (me) to make sure both were serving the same set of options with correct values, and for each to have its own dynamic pool that didn't overlap with the other's. I also set many DHCP reservations so a lot of hosts would always get the same "static" IP, but I could still change options like the nameserver in just two DHCP servers.
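If you go the two-server route, it's worth checking the pools mechanically rather than by eye. A small Python sketch using the standard ipaddress module; the function names are mine, and a pool is assumed to be a (first address, last address) pair:

```python
from ipaddress import ip_address


def pools_overlap(pool_a, pool_b):
    """Return True if two inclusive (start, end) address pools share any IP."""
    a_start, a_end = (ip_address(x) for x in pool_a)
    b_start, b_end = (ip_address(x) for x in pool_b)
    return a_start <= b_end and b_start <= a_end


def in_pool(address, pool):
    """Return True if the address lies inside the inclusive (start, end) pool."""
    start, end = (ip_address(x) for x in pool)
    return start <= ip_address(address) <= end
```

in_pool() is also a quick way to confirm that an address a client received is outside every pool you configured, which is how the problem in the question was noticed.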

You can do this - let us know how you get on.

Criggie
  • 2,580

On a client that receives an IP address from the rogue DHCP server, type ipconfig /all in cmd. There is a line labelled DHCP Server. Start looking from there.
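If you need to check several clients, you can save the ipconfig /all output from each and pull that line out with a script. A rough Python sketch, assuming English-language Windows output where the line looks like "DHCP Server . . . . . . . . . . . : 192.168.1.1" (the function name is made up, and localized Windows versions use different labels):

```python
import re

# Matches the "DHCP Server . . . : <address>" line of ipconfig /all output.
DHCP_SERVER_RE = re.compile(
    r"^[ \t]*DHCP Server[ .]*:[ \t]*(\S+)[ \t]*\r?$", re.MULTILINE
)


def dhcp_servers_from_ipconfig(ipconfig_output):
    """Return every DHCP Server address listed, one per adapter that has one."""
    return DHCP_SERVER_RE.findall(ipconfig_output)
```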

vasin1987
  • 139