On the DHCP clients that get duplicate IPs, start by looking at the options given with the DHCP offer. There's a non-zero chance that each DHCP server will reply with something pointing at itself, as Router/Gateway or Nameserver or something else.
If that doesn't help, pick a time when you can have a maintenance window, probably out of hours because this will interrupt normal services.
On your approved DHCP servers, set the lease time to something low like 5 to 10 minutes now, so it's easier to flush things when the window arrives.
Then shut down your proper DHCP servers, and use a test client to do a DHCP request.
In another window, run a packet sniffer like Wireshark, or ideally tcpdump:
# sudo tcpdump -i any -v -nn port 67 or port 68
10:00:18.982020 eth0 Out IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 66:ed:6e:c6:9d:56, length 300, xid 0x38e3094d, Flags [none]
Client-Ethernet-Address 66:ed:6e:c6:9d:56
Vendor-rfc1048 Extensions
DHCP-Message (53), length 1: Request
Parameter-Request (55), length 13:
Subnet-Mask (1), BR (28), Time-Zone (2), Default-Gateway (3)...
10:00:18.984436 eth0 In IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 334)
10.28.1.1.67 > 10.28.100.109.68: BOOTP/DHCP, Reply, length 306, xid 0x38e3094d, Flags [none]
Your-IP 10.28.100.109
Client-Ethernet-Address 66:ed:6e:c6:9d:56
Vendor-rfc1048 Extensions
DHCP-Message (53), length 1: ACK
Server-ID (54), length 4: 10.28.1.1
Lease-Time (51), length 4: 7200
Subnet-Mask (1), length 4: 255.255.0.0
Default-Gateway (3), length 4: 10.28.1.1
Domain-Name (15), length 14: "criggie.org.nz"
Domain-Name-Server (6), length 8: 10.28.1.1,10.28.1.1
Netbios-Name-Server (44), length 4: 10.28.1.2
NTP (42), length 4: 10.28.1.1
In this case, 66:ed:6e:c6:9d:56 is the MAC address of the client, and the DHCP reply came from 10.28.1.1. The same IP also features as the default gateway, DNS server and NTP server.
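If the capture is long, it can help to pull out just the Server-ID lines. Here's a rough sketch, assuming you've saved the tcpdump -v text to a file. The names dhcp_server_ids and capture.txt are made up for illustration, and it relies on the option line looking exactly like the output above (other tcpdump versions may print it differently).

```shell
# Hypothetical helper: read `tcpdump -v` text on stdin and print each
# distinct DHCP Server-ID (option 54) address once.
dhcp_server_ids() {
    awk '/Server-ID \(54\)/ { print $NF }' | sort -u
}

# Example, with capture.txt being a saved copy of the tcpdump output:
#   dhcp_server_ids < capture.txt
```

With your proper servers shut down, any address this prints belongs to a server you didn't approve.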
Once you have an IP address, quickly check the test-host's ARP table and get a MAC address before it times out.
# arp -an
? (10.28.2.9) at a0:36:9f:b9:ef:52 [ether] on eth0
? (10.28.1.1) at 00:0d:b9:35:29:c4 [ether] on eth0
So now you know 00:0d:b9:35:29:c4 is the hardware MAC address for the IP that gave a DHCP reply.
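If the ARP table is busy, a small filter saves squinting. This is only a sketch, assuming the Linux arp -an layout shown above (IP in parentheses in the second field, MAC in the fourth); mac_for_ip is a name I've invented.

```shell
# Hypothetical helper: read `arp -an` style lines on stdin and print the
# MAC address recorded for the IP given as the first argument.
mac_for_ip() {
    awk -v ip="($1)" '$2 == ip { print $4 }'
}

# Example:
#   arp -an | mac_for_ip 10.28.1.1
```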
Next, work out where on your network that IP/MAC address is. A simple hostname lookup on the IP might tell you something useful:
# host 10.28.1.1
1.1.28.10.in-addr.arpa domain name pointer pfsense.criggie.org.nz.
or
$ dig -x 10.28.1.1
;; ANSWER SECTION:
1.1.28.10.in-addr.arpa. 1 IN PTR pfsense.criggie.org.nz.
Or if that's meaningless to you, you'll have to track the MAC address back with an OUI lookup such as https://www.wireshark.org/tools/oui-lookup.html
For me it returns "00:0D:B9 PC Engines GmbH" so I know it's an APU or ALIX single-board PC. You might get a result that means something to you, or you might get something broad like "hp inc".
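The OUI is just the first three octets of the MAC, so you can cut it out before pasting it into a lookup tool. A minimal sketch, assuming a colon-separated MAC; oui_prefix is a made-up name.

```shell
# Hypothetical helper: print the vendor prefix (first three octets) of a
# colon-separated MAC address, uppercased as lookup tools usually show it.
oui_prefix() {
    printf '%s\n' "$1" | cut -d: -f1-3 | tr 'a-f' 'A-F'
}

# Example:
#   oui_prefix 00:0d:b9:35:29:c4
```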
Your next step is to look through your switch MAC address tables to work out what physical port this MAC address connects to. This presumes you have at least one managed Ethernet switch in your network and can log into it.
This totally depends on what kind of switch you have and is going to be wildly different between ecosystems. Here's a Juniper showing the device is on gigabit port 19:
root@sw1-poe> show ethernet-switching table brief | grep 00:0d:b9:35:29:c4
default 00:0d:b9:35:29:c4 Learn 0 ge-0/0/19.0
The equivalent command on a Hasivo switch, showing that the device is on port te1 (this one needs the MAC in uppercase):
switch0# show mac address-table 00:0D:B9:35:29:C4
VID | MAC Address | Type | Ports
-----+-------------------+---------+----------
1 | 00:0D:B9:35:29:C4 | Dynamic | te1
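Since switches differ on which case they accept, it can be handy to convert the MAC before pasting it in. This is a small sketch with invented names (mac_upper, mac_lower). I only know from the above that the Hasivo wants uppercase; whether your own switch cares is something to check.

```shell
# Hypothetical helpers: convert the hex digits of a MAC address to
# uppercase or lowercase, leaving the separators alone.
mac_upper() {
    printf '%s\n' "$1" | tr 'a-f' 'A-F'
}

mac_lower() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f'
}

# Example:
#   mac_upper 00:0d:b9:35:29:c4
```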
If you only have unmanaged switches, your only option is to ping the IP repeatedly while unplugging parts of your LAN till the ping stops. If the remote host doesn't respond to ICMP pings, use arping which requires root:
~$ sudo arping 10.28.1.1
ARPING 10.28.1.1
60 bytes from 00:0d:b9:35:29:c4 (10.28.1.1): index=0 time=182.578 usec
60 bytes from 00:0d:b9:35:29:c4 (10.28.1.1): index=1 time=156.675 usec
60 bytes from 00:0d:b9:35:29:c4 (10.28.1.1): index=2 time=198.373 usec
^C
--- 10.28.1.1 statistics ---
3 packets transmitted, 3 packets received, 0% unanswered (0 extra)
rtt min/avg/max/std-dev = 0.157/0.179/0.198/0.017 ms
Finally, once you DO find the mysterious DHCP server, explain kindly to whoever set it up what the problems are and why it has caused you pain.
I ran two independent DHCP servers for years on the same network without issue. The solution was for the sysadmin (me) to make sure both were serving the same set of options with correct values, and for each to have its own separate range as a dynamic pool that didn't overlap. I also set many DHCP reservations so a lot of hosts would always get the same "static" IP, but I could still change options like nameserver in just two DHCP servers.
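If you go the two-server route, it's worth double-checking that the dynamic pools really don't overlap. Here's a rough sketch of that check, assuming plain dotted-quad IPv4 addresses with no leading zeros. The function names and the example ranges are mine, not anything from a real config.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( $a * 16777216 + $b * 65536 + $c * 256 + $d ))
}

# Hypothetical helper: succeed (exit 0) if pool START1-END1 overlaps
# pool START2-END2, fail otherwise.
pools_overlap() {
    s1=$(ip_to_int "$1"); e1=$(ip_to_int "$2")
    s2=$(ip_to_int "$3"); e2=$(ip_to_int "$4")
    [ "$s1" -le "$e2" ] && [ "$s2" -le "$e1" ]
}

# Example with made-up ranges:
#   pools_overlap 10.28.100.1 10.28.100.255 10.28.101.1 10.28.101.255 \
#       && echo "pools overlap, fix one of them"
```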
You can do this - let us know how you get on.