
For IPv4, it is easy to create a rule that only accepts connections from hosts of the same subnet, for example (assuming my computer is 192.168.42.2, and the incoming connection is 192.168.42.20):

table ip firewall {
    chain incoming {
        type filter hook input priority 0; policy drop;
        ip saddr 192.168.42.0/24 tcp dport 8080 accept
    }
}

How does one do this for IPv6? I know there's always the link-local address and theoretically this rule should work:

ip6 saddr fe80::/64 tcp dport 8080 accept

The problem now is that I have mDNS set up, and the address it returns is the globally-routable address, something like 2001:db8::1234. Because of that, the packets I receive from other hosts (despite being on the same subnet) all have an ip6 saddr with the 2001:db8 prefix, which gets blocked by the firewall.

I cannot simply add a rule that matches 2001:db8::/64, because that prefix comes from the ISP and changes from time to time. Setting up a ULA so that I get a predictable prefix doesn't seem possible either, since the router is ISP-mandated and the configuration interface it has for IPv6 is painfully hollow.

So... this is why I am looking for something that is akin to this:

ip6 saddr & ffff:ffff:ffff:ffff:: == ip6 daddr & ffff:ffff:ffff:ffff:: tcp dport 8080 accept

But nftables doesn't seem to accept that. Is there something I can do to work around this, or am I missing something?

Haden

3 Answers


The feature doesn't exist ... yet?

Currently, nftables can only use one register (in its virtual state machine): it applies bitwise operations on the left hand side (LHS) to compare the result with a constant or a set on the right hand side (RHS). It cannot use two variable operands (meaning: both from packet) in LHS and RHS.

There is work in progress to improve this, in patch series for nf-next, libnftnl and nftables (not accepted yet):

Currently bitwise boolean operations (AND, OR and XOR) can only have one variable operand. [...] We add support for evaluating these operations directly in kernel space on one register and either an immediate value or a second register.

This will probably still take additional iterations and some time before it is made available (the idea has been floating around since 2019 and maybe earlier, but it's still not available). Once done, one can imagine that OP's precise rule:

ip6 saddr & ffff:ffff:ffff:ffff:: == ip6 daddr & ffff:ffff:ffff:ffff:: tcp dport 8080 accept

would work as expected.
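For contrast, masking the left-hand side and comparing it with a constant is the form that is already expressible today. The sketch below only prints such a rule for a network address passed as an argument; `masked_rule` is a hypothetical helper name, and port 8080 is taken from OP's example. The network still has to be supplied from outside, which is exactly the limitation described above.

```shell
#!/bin/sh
# masked_rule NETWORK: print an nftables rule that masks the source address
# to its upper 64 bits and compares the result with a constant.
# The helper name is illustrative and not part of nftables.
masked_rule() {
    printf 'ip6 saddr & ffff:ffff:ffff:ffff:: == %s tcp dport 8080 accept\n' "$1"
}

# Example: masked_rule 2001:db8::
```

Since the right-hand side is a constant, this is equivalent to matching ip6 saddr against the /64 directly, so it does not help when the prefix changes.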


Workaround

That said, what could be done today?

One can use an external tool that reacts to the host changing address and updates a set so it has the host's IPv6 network/netmask as content. A set is fine to use as RHS. In the end it's not dynamic as in variable operands but it's still dynamic enough for the need.

Add a set to OP's ruleset. I won't give a full ruleset, as that is beyond the scope of the question. Just remember that a ct state related,established accept rule and a rule allowing the loopback interface should usually be present, and that a table of family inet rather than ip6 could merge some rules for IPv4 and IPv6 when relevant.

ip6firewall.nft:

table ip6 firewall        #for idempotence
delete table ip6 firewall #for idempotence

table ip6 firewall {
    set myip6net {
        typeof ip6 saddr
        flags interval
    }

    chain acceptmyip6netsrc {
        ip6 saddr @myip6net counter accept
    }
}

This could be called from a base input chain with:

tcp dport 8080 jump acceptmyip6netsrc

I'll assume the network interface name is eth0. The script below uses a very simple event loop with ip monitor and keeps running: use it as a service, not in crontab. It triggers whenever an address event happens (most of the time needlessly, when a Router Advertisement merely refreshes timeouts and changes nothing). ip monitor's output isn't easy to parse, so just ignore it and use ip -json addr to retrieve the actual values. The script has room for improvement but does the job.

Requires tools which are usually available in distributions:

  • jq for efficient JSON parsing
  • netmask (handles correctly any abbreviation of an IPv6 address, so 2001:db8::4:5:6:7:8/64 is correctly transformed into 2001:db8:0:4::/64).
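If netmask is not installed, the network part can be approximated in plain shell for the common case where the prefix length is a multiple of 16. This is only a sketch under that assumption: `ip6_network` and `count_groups` are hypothetical helper names, other prefix lengths are rejected, and the output is not guaranteed to be in canonical form.

```shell
#!/bin/sh
# count_groups STRING: print the number of colon-separated groups in STRING.
# Runs in a subshell so the IFS change does not leak.
count_groups() (
    IFS=:
    set -- $1
    echo "$#"
)

# ip6_network ADDR/LEN: print the network of ADDR for a prefix length that
# is a multiple of 16 between 16 and 112. Fails with status 1 otherwise.
ip6_network() (
    addr=${1%/*}
    len=${1##*/}
    case $len in
        16|32|48|64|80|96|112) ;;
        *) exit 1 ;;
    esac

    # Expand a "::" abbreviation into the zero groups it stands for.
    case $addr in
        *::*)
            head=${addr%%::*}
            tail=${addr#*::}
            missing=$((8 - $(count_groups "$head") - $(count_groups "$tail")))
            zeros=
            while [ "$missing" -gt 0 ]; do
                zeros="$zeros:0"
                missing=$((missing - 1))
            done
            addr="$head$zeros:$tail"
            addr=${addr#:}
            addr=${addr%:}
            ;;
    esac

    # Keep only the leading groups covered by the prefix length.
    keep=$((len / 16))
    IFS=:
    set -- $addr
    net=
    while [ "$keep" -gt 0 ]; do
        net="$net$1:"
        shift
        keep=$((keep - 1))
    done
    printf '%s:/%s\n' "$net" "$len"
)
```

For example, `ip6_network 2001:db8::4:5:6:7:8/64` prints 2001:db8:0:4::/64, the same network as in the netmask example above.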

updatemyip6net.sh:

#!/bin/sh

{ echo init; ip -6 -o monitor address dev eth0; } |
    while read dummy; do
        myip6addr=$(ip -json -6 addr show dev eth0 scope global |
            jq -j '.[].addr_info[] | if .local then .local,"/",.prefixlen,"\n", halt else empty end'
        )
        myip6net=$(netmask $myip6addr)

        nft -f - <<EOF
    flush set ip6 firewall myip6net
    add element ip6 firewall myip6net { $myip6net }
EOF
    done

Above,

ip -json -6 addr show dev eth0 scope global | jq -j '.[].addr_info[] | if .local then .local,"/",.prefixlen,"\n", halt else empty end'

is quite long, but replacing it with the simpler:

ip -j -6 route get 2001:4860:4860::8888 | jq -r '.[].prefsrc'

doesn't get the netmask, and hardcoding /64 everywhere should be avoided.

A.B

There are a few ways to do this.

systemd-networkd

When using systemd-networkd it is possible to avoid hardcoding or using wrapper scripts to detect the network subnet prefix given by IPv6 Prefix Delegation.

The NFTSet= option from systemd.network(5) can populate nftables sets with the network prefixes of a connection. The Arch Linux wiki shows how to use such dynamic named sets with systemd-networkd.

The example given is to create a config file for systemd-networkd like the following.

/etc/systemd/network:


[DHCPv4]
NFTSet=prefix:inet:my_table:eth_ipv4_prefix
NFTSet=ifindex:inet:my_table:eth_ifindex

[DHCPv6]
NFTSet=prefix:inet:my_table:eth_ipv6_prefix
NFTSet=ifindex:inet:my_table:eth_ifindex

[IPv6AcceptRA]
NFTSet=prefix:inet:my_table:eth_ipv6_prefix
NFTSet=ifindex:inet:my_table:eth_ifindex

Then, create the sets in /etc/nftables.conf or in a /etc/nftables.d/NN-some-file.nft drop-in:

table inet my_table {
    set eth_ipv4_prefix {
        type ipv4_addr
        flags interval
        comment "Populated by systemd-networkd"
    }
    set eth_ipv6_prefix {
        type ipv6_addr
        flags interval
        comment "Populated by systemd-networkd"

        elements = { fe80::/10 }
    }
    set eth_ifindex {
        type iface_index
        comment "Populated by systemd-networkd"
    }

    chain my_input {
        type filter hook input priority filter; policy drop;

        iif @eth_ifindex ip6 saddr @eth_ipv6_prefix jump my_input_lan comment "Connections from LAN"
        iif @eth_ifindex ip saddr @eth_ipv4_prefix jump my_input_lan comment "Connections from LAN"
    }
}

While systemd-networkd may appeal when configuring servers or machines that already run it, it is not always the right fit and may not be easy to set up. Another common alternative is NetworkManager.

NetworkManager

NetworkManager is usually a better fit for desktop Linux machines, and supports hook scripts.

The NetworkManager-dispatcher(8) service runs scripts from the /{etc,usr/lib}/NetworkManager/dispatcher.d directories (or subdirectories) in alphabetical order in response to network events.

To use this method, create a script in /etc/NetworkManager/dispatcher.d:

/etc/NetworkManager/dispatcher.d/01-nftables-cidr-update.sh:

#!/bin/bash

echo "$0: $*"

# Fall back to the interface name passed as first argument
if [ -z "${DEVICE_IP_IFACE}" ] && [ -n "$1" ]; then
  DEVICE_IP_IFACE=$1
fi

case $2 in
  dhcp6-change)
    if [ -n "${DEVICE_IP_IFACE}" ]; then
      mapfile -t IPv6_PREFIXES < <(
        ip -json -6 addr show dev "${DEVICE_IP_IFACE}" scope global |
          jq -r '.[].addr_info[] | select(.family == "inet6" and .scope == "global" and .prefixlen < 128) | if .local then "\(.local)/\(.prefixlen)" else empty end'
      )

      echo "Update nftables set with new IPv6 prefix(es):"
      for prefix in "${IPv6_PREFIXES[@]}"; do
        echo "$prefix"
        nft add element ip6 filter my_ipv6_prefixes "{ $prefix }"
      done
    fi
    ;;
  dhcp4-change)
    if [ -n "${DEVICE_IP_IFACE}" ]; then
      mapfile -t IPv4_CIDRS < <(
        ip -json -4 addr show dev "${DEVICE_IP_IFACE}" scope global |
          jq -r '.[].addr_info[] | select(.family == "inet" and .scope == "global" and .prefixlen < 32) | if .local then "\(.local)/\(.prefixlen)" else empty end'
      )

      echo "Update nftables set with new IPv4 CIDR(s):"
      for cidr in "${IPv4_CIDRS[@]}"; do
        echo "$cidr"
        nft add element ip filter my_ipv4_cidrs "{ $cidr }"
      done
    fi
    ;;
esac

# debug passed env vars
env

Then, similarly create some NFTables sets to hold the IPv6 prefix(es) and IPv4 network CIDR(s):

/etc/nftables.d/02-input-set-lan-cidrs.nft:

#!/usr/bin/nft -f
# vim:set ts=2 sw=2 et:

# Idempotence: destroy if already existing,
# but don't fail if it does not exist.
# (useful for /etc/nftables.d/ drop-in scripts that usually get re-included when restarting nftables.service)
destroy set ip6 filter my_ipv6_prefixes

# Create an IPv6 set to track dynamic delegated prefixes
add set ip6 filter my_ipv6_prefixes {
  type ipv6_addr
  flags interval
  size 65536
  auto-merge
  comment "Populated by NetworkManager-dispatcher script"
}

# again... idempotence
destroy set ip filter my_ipv4_cidrs

# Create an IPv4 set to track dynamic DHCP CIDR
add set ip filter my_ipv4_cidrs {
  type ipv4_addr
  flags interval
  size 65536
  auto-merge
  comment "Populated by NetworkManager-dispatcher script"
}

This script will then automatically add the discovered networks triggered by dhcp4-change and dhcp6-change events from NetworkManager.

NOTE:

  • The jq .prefixlen check above skips any /128 IPv6 or /32 IPv4 address assigned to the interface, because these always fall within the netmask / CIDR. Adding them to an NFTables set with the auto-merge option would simply merge them into the same CIDR prefix and have no measurable effect.
  • The NetworkManager-dispatcher script on its own does not remove old CIDRs or IPv6 network prefixes. You might want to detect the old ones when the interface goes down and remove them from the set. This needs a pre-down script, because the old CIDRs and IPv6 prefixes are no longer available when the down hook runs. Such a pre-down script might be:

/etc/NetworkManager/dispatcher.d/pre-down.d/01-nftables-dhcp-cidr-remove.sh:

#!/bin/bash

echo "$0: $*"

# Fall back to the interface name passed as first argument
if [ -z "${DEVICE_IP_IFACE}" ] && [ -n "$1" ]; then
  DEVICE_IP_IFACE=$1
fi

case $2 in
  pre-down)
    if [ -n "${DEVICE_IP_IFACE}" ]; then
      mapfile -t IPv6_PREFIXES < <(
        ip -json -6 addr show dev "${DEVICE_IP_IFACE}" scope global |
          jq -r '.[].addr_info[] | select(.family == "inet6" and .scope == "global" and .prefixlen < 128) | if .local then "\(.local)/\(.prefixlen)" else empty end'
      )

      echo "Delete OLD IPv6 prefix(es) from nftables set:"
      for prefix in "${IPv6_PREFIXES[@]}"; do
        echo "$prefix"
        nft delete element ip6 filter my_ipv6_prefixes "{ $prefix }"
      done
    fi
    if [ -n "${DEVICE_IP_IFACE}" ]; then
      mapfile -t IPv4_CIDRS < <(
        ip -json -4 addr show dev "${DEVICE_IP_IFACE}" scope global |
          jq -r '.[].addr_info[] | select(.family == "inet" and .scope == "global" and .prefixlen < 32) | if .local then "\(.local)/\(.prefixlen)" else empty end'
      )

      echo "Delete OLD IPv4 CIDR(s) from nftables set:"
      for cidr in "${IPv4_CIDRS[@]}"; do
        echo "$cidr"
        nft delete element ip filter my_ipv4_cidrs "{ $cidr }"
      done
    fi
    ;;
esac

# debug passed env vars
env
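Another way to avoid stale entries, borrowed from the first answer's script, is to flush the set and refill it in a single nft -f run whenever the addresses change, since nft applies a file as one transaction. The sketch below only generates the commands; `nft_refill_cmds` is a hypothetical helper name, and the table and set names in the usage line are the ones used in this answer.

```shell
#!/bin/sh
# nft_refill_cmds FAMILY TABLE SET [PREFIX...]: print nft commands that
# empty the set and then add the given prefixes again.
nft_refill_cmds() {
    family=$1
    table=$2
    name=$3
    shift 3
    printf 'flush set %s %s %s\n' "$family" "$table" "$name"
    # With no prefixes left, only flush, so no empty element list is emitted.
    [ "$#" -gt 0 ] || return 0
    elements=$1
    shift
    for prefix in "$@"; do
        elements="$elements, $prefix"
    done
    printf 'add element %s %s %s { %s }\n' "$family" "$table" "$name" "$elements"
}
```

In the dispatcher script this could be used as `nft_refill_cmds ip6 filter my_ipv6_prefixes "${IPv6_PREFIXES[@]}" | nft -f -`, at the cost of dropping prefixes that belong to other interfaces if several share the same set.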

Now those sets can be used in normal NFTables rules:

table ip filter {
  chain input {
    type filter hook input priority filter; policy drop;

    ip saddr @my_ipv4_cidrs jump my_input_lan comment "Connections from IPv4 LAN CIDR"
  }
}

table ip6 filter {
  chain input {
    type filter hook input priority filter; policy drop;

    ip6 saddr @my_ipv6_prefixes jump my_input_lan comment "Connections from IPv6 GUA prefix"
  }
}

Hope this helps some folks with integrating NFTables with dynamic IPv6 prefix delegation and IPv4 DHCP LAN networks!


@TrinitronX offered an excellent solution to my query regarding "how to maintain dynamic IPv6 prefix(es) updated on the firewall".

I just encountered a minor issue:

What occurs if, for any reason, I am required to restart the firewall? Those dynamic prefix entries will be eliminated (flushed) during the restart procedure.

Consequently, I took the following steps:

I modified the Unit section of systemd-networkd as shown below:

[Unit]
PartOf=nftables.service

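The purpose of the "PartOf" directive is to ensure that systemd-networkd.service is restarted whenever nftables.service is restarted. Note that PartOf= does not by itself order the two units; if systemd-networkd must start only once the ruleset is loaded, an After=nftables.service line is needed as well.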

This way, all dynamic prefix entries are restored to the firewall.
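One way to apply this without editing the packaged unit file is a drop-in. The sketch below writes one into a directory passed as an argument; the helper name `write_partof_dropin` and the file name 10-nftables.conf are assumptions, and the commented-out After= line is an optional addition for the case where ordering matters.

```shell
#!/bin/sh
# write_partof_dropin DIR: write a drop-in into DIR that ties the unit
# owning DIR to nftables.service. Helper and file names are hypothetical.
write_partof_dropin() {
    mkdir -p "$1" || return 1
    cat > "$1/10-nftables.conf" <<'EOF'
[Unit]
PartOf=nftables.service
# Assumption: uncomment if networkd must start only after the ruleset
# has been loaded; PartOf= alone does not order the units.
#After=nftables.service
EOF
}
```

As root, this could be called as `write_partof_dropin /etc/systemd/system/systemd-networkd.service.d`, followed by `systemctl daemon-reload`.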

Deny