I need to create iptables rules for the following scenario:

  • Different hosts send UDP data to host A. The target port is 1234.
  • Host A (8.2.3.4) redirects the received UDP data to hosts B1 (7.2.3.1), B2 (22.93.12.3), ... Bn (12.42.1.3); the IP addresses are for illustration only.
  • This is not load balancing: every host B1, B2, ... Bn must receive all packets, so host A has to duplicate them.
  • The forwarded packets must have the correct target IP (B1, B2, ... Bn) and source IP (host A)
  • I cannot change anything on the initial hosts which send data to host A
  • I cannot change anything on the target hosts B1, B2, ... Bn
  • The hosts B1, B2, ... Bn do not have to be able to answer back

I've tried to solve this with PREROUTING / mangle:

HOST_A=8.2.3.4
HOST_B1=7.2.3.1
HOST_B2=22.93.12.3
...
HOST_BN=12.42.1.3

iptables -F -t mangle
iptables -t mangle -A PREROUTING -d $HOST_A -p udp --dport 1234 -j TEE --gateway $HOST_B1
iptables -t mangle -A PREROUTING -d $HOST_A -p udp --dport 1234 -j TEE --gateway $HOST_B2
...
iptables -t mangle -A PREROUTING -d $HOST_A -p udp --dport 1234 -j TEE --gateway $HOST_BN

iptables -L -t mangle

The hosts B1, B2, ... Bn do not seem to receive the data. Does anyone know what is wrong? Debugging is actually quite tricky (I have not really found a way to do it).

Thanks

1 Answer

Samplicator

There's a tool that matches all the criteria, samplicator:

UDP Samplicator

This small program receives UDP datagrams on a given port, and resends those datagrams to a specified set of receivers. In addition, a sampling divisor N may be specified individually for each receiver, which will then only receive one in N of the received packets.

An example command to run on host A that would match OP's examples for destinations B1 B2 and Bn:

samplicate -p 1234 7.2.3.1/1234 22.93.12.3/1234 12.42.1.3/1234

That's probably the more reasonable thing to do.


nftables

For an unreasonable approach, this could be done in-kernel: either with iptables, with difficulty (example for only two duplicates in my answer in this UL SE Q/A), or with nftables, which can do stateless NAT and so avoids some of the complexity related to conntrack zones.

nftables has an equivalent of iptables' TEE target: the dup statement.

DUP STATEMENT

The dup statement is used to duplicate a packet and send the copy to a different destination.

dup to device
dup to address device device

Note: the first syntax can't be used in the ip family, only in the netdev family. That would add multiple issues, one of them being having to guess the correct MAC addresses (in the case of an Ethernet interface), which would also need to be altered, so it is less practical.

dup has the same behaviour as TEE: the duplicate must immediately be "evacuated" to a gateway on an interface, probably to avoid it being processed abnormally by the rest of the routing stack. Here, gateway means an IPv4 address that is used only to resolve the destination MAC address with ARP (where that is relevant for the type of interface); it does not appear in the IP packet itself. The final destination therefore cannot be used as the gateway address, unless it happens to be in the same LAN.
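
Whether an address could serve directly as a TEE/dup gateway can be checked by asking the kernel how it would reach it. The following is only a sketch: it assumes the usual output format of ip route get, where a destination behind a router shows a "via" field, and the helper name is_onlink is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of iproute2): reads the output of
# "ip route get ADDR" on stdin.
# Returns 0 if the route has no next hop ("via"), i.e. ADDR is on-link,
# 1 if ADDR is reached through a router, 2 if there was no input at all.
is_onlink() {
    route=$(cat)
    [ -n "$route" ] || return 2
    case $route in
        *" via "*) return 1 ;;
        *)         return 0 ;;
    esac
}

# Typical use on host A (commented out so the snippet has no side effects):
#   ip route get 7.2.3.1 | is_onlink || echo "7.2.3.1 cannot be used as gateway"
```

In the question's setup B1 ... Bn are presumably behind a router, so such a check would return 1 for them, which is consistent with TEE --gateway $HOST_B1 not delivering anything.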

Here, a special routing setup injects the packets, already statelessly SNATed (to A's own address) and DNATed (to the final destinations), back into the host itself, so that it routes them on to their destinations as if it had emitted them itself. This requires accepting incoming packets whose source is one of the host's own local IP addresses, and disabling reverse path filtering on the receiving veth interface (which requires disabling it in the all setting as well). It can still be re-enabled explicitly on eth0 if needed.

Setup on host A in addition to its initial running network configuration (with its single interface eth0):

HOST_A=8.2.3.4

ip link add name vethinj up type veth peer name vethgw
ip link set vethgw up
sysctl -w net.ipv4.conf.vethgw.forwarding=1
sysctl -w net.ipv4.conf.vethgw.accept_local=1
sysctl -w net.ipv4.conf.vethgw.rp_filter=0
sysctl -w net.ipv4.conf.all.rp_filter=0
ip route add $HOST_A/32 dev vethinj

There's no additional IP address to assign. Because Linux uses the weak host model, the IP address on eth0 is also available on vethgw, so it is reachable as a gateway through vethinj once the route is added.
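
Since a wrong sysctl value makes the injected packets disappear silently, it can help to read the four settings back. This is a minimal sketch, not part of the original setup: the function names and the SYSCTL_ROOT variable are hypothetical, and SYSCTL_ROOT exists only so the check can be pointed at a directory other than /proc/sys.

```shell
#!/bin/sh
# Sketch: read back the sysctl values required by the veth setup.
# SYSCTL_ROOT is a hypothetical override; on a real host it stays /proc/sys.
SYSCTL_ROOT=${SYSCTL_ROOT:-/proc/sys}

# $1 = key relative to net/ipv4/conf (e.g. vethgw/rp_filter), $2 = expected value
check_sysctl() {
    actual=$(cat "$SYSCTL_ROOT/net/ipv4/conf/$1" 2>/dev/null)
    if [ "$actual" = "$2" ]; then
        echo "ok   $1 = $actual"
    else
        echo "FAIL $1 = ${actual:-unreadable} (expected $2)"
        return 1
    fi
}

# Checks all four settings and returns non-zero if any of them is wrong.
check_veth_sysctls() {
    rc=0
    check_sysctl vethgw/forwarding   1 || rc=1
    check_sysctl vethgw/accept_local 1 || rc=1
    check_sysctl vethgw/rp_filter    0 || rc=1
    check_sysctl all/rp_filter       0 || rc=1
    return $rc
}
```

Running check_veth_sysctls on host A after the setup should print four "ok" lines; any "FAIL" line names the setting to fix.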

The nftables rules in the file multiply.nft below, to be loaded with nft -f multiply.nft, will:

  • match the packet to multiply
  • do a stateless SNAT (ip saddr set) to A's own address. This doesn't involve conntrack's NAT.
  • for each destination Bx, do a stateless DNAT (ip daddr set) to destination HOST_Bx and duplicate the packet to the injection side of the veth pair, using A's own address as gateway.
  • drop the remaining altered original because it's of no use anymore.

multiply.nft:

define HOST_A=8.2.3.4
define HOST_B1=7.2.3.1
define HOST_B2=22.93.12.3
define HOST_Bn=12.42.1.3

table ip multiply
delete table ip multiply

table ip multiply {
    chain c {
            type filter hook prerouting priority -300; policy accept;
            iif != vethgw ip daddr $HOST_A udp dport 1234 ip saddr set $HOST_A goto cmultiply
    }

    chain cmultiply {
            jump cdnatdup
            drop
    }

    chain cdnatdup {
            ip daddr set $HOST_B1 dup to $HOST_A device vethinj
            ip daddr set $HOST_B2 dup to $HOST_A device vethinj

            ip daddr set $HOST_Bn dup to $HOST_A device vethinj
    }

}

To add a new destination 192.0.2.2 after this, one can do manually:

nft add rule ip multiply cdnatdup ip daddr set 192.0.2.2 dup to 8.2.3.4 device vethinj
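
For a longer list of destinations, the same command can be generated in a loop rather than typed by hand. A minimal sketch, assuming the table, chain and interface names used above; the function name gen_dup_rules is made up, and it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print one "nft add rule" command per destination
# address given as argument. Nothing is applied; the commands are only printed.
HOST_A=8.2.3.4

gen_dup_rules() {
    for dst in "$@"; do
        printf 'nft add rule ip multiply cdnatdup ip daddr set %s dup to %s device vethinj\n' \
            "$dst" "$HOST_A"
    done
}

# Example with two documentation addresses:
gen_dup_rules 192.0.2.2 192.0.2.3
```

Once the output looks right, it can be piped to sh as root to apply the rules.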

Below, a text schematic for a single packet coming from S2 and duplicated to B1, B2 and Bn:

S1 →┄┄┄┄┄╮                        ╭┄┄┄┄┄┄┄┄→ B1
S2 →┄┄┄╮ ┊                        ┊ ╭┄┄┄┄┄┄→ B2
...    ┊ ┊                        ┊ ┊       ...
Sn →┄╮ ┊ ┊                        ┊ ┊ ╭┄┄┄┄→ Bn
     ┊ ┊ ╰┄┄  ┄      ┌──────┐↗┄┄┄┄╯ ┊ ┊
     ┊ ╰┄┄┄┄┄┄┄┄┄┄┄┄→│ eth0 │→┄┄┄┄┄┄╯ ┊
     ╰┄┄┄┄┄┄ ┄       └──────┘↘┄┄┄┄┄┄┄┄╯
                    ↙      ↖↖↖
             snat  ↯        ↑↑↑  internal
  nftables:  dnat  ⇊    A   ↑↑↑  routing
            & dup ↓↓↓       ↑↑↑
               ┌───────┐  ┌───────┐
               │vethinj│  │vethgw │
               └───────┘  └───────┘
                    ╰─── ⇶ ───╯ 
                   virtual wire

Note: as this NAT isn't handled by conntrack, there is no way for B1, B2, ... Bn to answer back to the original source, but OP waived this requirement.

A.B