3

I am trying to get a server running with 2 network cards: 1GB (enp4s0) and 10GB (enp5s0). One card will have a dynamic IP (DHCP) and the other will have a static IP, 192.168.0.24.

My current fresh OS installation:

oven@oven-f1:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:    22.04
Codename:   jammy

This netplan is the default one that comes with a fresh OS install using default network configs:

oven@oven-f1:~$ sudo cat /etc/netplan/50-cloud-init.yaml
# This file is generated from information provided by the datasource.  Changes
# to it will not persist across an instance reboot.  To disable cloud-init's
# network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    ethernets:
        enp4s0:
            dhcp4: true
    version: 2
    wifis: {}

status of network cards with this netplan:

oven@oven-f1:~$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether d8:43:ae:90:b8:2e brd ff:ff:ff:ff:ff:ff
3: enp5s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 74:fe:ce:ea:db:b5 brd ff:ff:ff:ff:ff:ff

default routes with this netplan config:

oven@oven-f1:~$ ip route
default via 192.168.0.1 dev enp4s0 proto dhcp src 192.168.0.27 metric 100 
192.168.0.0/24 dev enp4s0 proto kernel scope link src 192.168.0.27 metric 100 
192.168.0.1 dev enp4s0 proto dhcp scope link src 192.168.0.27 metric 100 

Below is the new netplan config I'm trying to implement:

network:
  version: 2
  renderer: networkd
  ethernets:
    enp4s0:
      dhcp4: true
    enp5s0:
      dhcp4: true
      addresses:
        - 192.168.0.24/24
      routes:
        - to: 0.0.0.0/0
          via: 192.168.0.1
      nameservers:
        addresses: [8.8.8.8, 8.8.4.4]

The problem starts once I run sudo netplan --debug apply with the new config:

oven@oven-f1:~$ sudo netplan --debug apply
    ** (generate:1778): DEBUG: 07:51:08.869: starting new processing pass
** (generate:1778): DEBUG: 07:51:08.869: enp5s0: adding new route
** (generate:1778): DEBUG: 07:51:08.869: starting new processing pass
** (generate:1778): DEBUG: 07:51:08.869: We have some netdefs, pass them through a final round of validation
** (generate:1778): DEBUG: 07:51:08.869: enp4s0: setting default backend to 1
** (generate:1778): DEBUG: 07:51:08.869: Configuration is valid
** (generate:1778): DEBUG: 07:51:08.869: enp5s0: setting default backend to 1
** (generate:1778): DEBUG: 07:51:08.869: Configuration is valid
** (generate:1778): DEBUG: 07:51:08.869: Generating output files..
** (generate:1778): DEBUG: 07:51:08.869: Open vSwitch: definition enp4s0 is not for us (backend 1)
** (generate:1778): DEBUG: 07:51:08.869: NetworkManager: definition enp4s0 is not for us (backend 1)
** (generate:1778): DEBUG: 07:51:08.869: Open vSwitch: definition enp5s0 is not for us (backend 1)
** (generate:1778): DEBUG: 07:51:08.869: NetworkManager: definition enp5s0 is not for us (backend 1)
** (process:1776): DEBUG: 07:51:09.042: starting new processing pass
** (process:1776): DEBUG: 07:51:09.042: enp5s0: adding new route
** (process:1776): DEBUG: 07:51:09.042: starting new processing pass
** (process:1776): DEBUG: 07:51:09.042: We have some netdefs, pass them through a final round of validation
** (process:1776): DEBUG: 07:51:09.042: enp4s0: setting default backend to 1
** (process:1776): DEBUG: 07:51:09.042: Configuration is valid
** (process:1776): DEBUG: 07:51:09.042: enp5s0: setting default backend to 1
** (process:1776): DEBUG: 07:51:09.042: Configuration is valid
** (process:1776): DEBUG: 07:51:09.128: starting new processing pass
** (process:1776): DEBUG: 07:51:09.128: enp5s0: adding new route
** (process:1776): DEBUG: 07:51:09.128: starting new processing pass
** (process:1776): DEBUG: 07:51:09.128: We have some netdefs, pass them through a final round of validation
** (process:1776): DEBUG: 07:51:09.128: enp4s0: setting default backend to 1
** (process:1776): DEBUG: 07:51:09.128: Configuration is valid
** (process:1776): DEBUG: 07:51:09.128: enp5s0: setting default backend to 1
** (process:1776): DEBUG: 07:51:09.128: Configuration is valid
** (process:1776): DEBUG: 07:51:09.128: starting new processing pass
** (process:1776): DEBUG: 07:51:09.128: enp5s0: adding new route
** (process:1776): DEBUG: 07:51:09.128: starting new processing pass
** (process:1776): DEBUG: 07:51:09.128: We have some netdefs, pass them through a final round of validation
** (process:1776): DEBUG: 07:51:09.128: enp4s0: setting default backend to 1
** (process:1776): DEBUG: 07:51:09.128: Configuration is valid
** (process:1776): DEBUG: 07:51:09.128: enp5s0: setting default backend to 1
** (process:1776): DEBUG: 07:51:09.128: Configuration is valid

status of network cards with this new netplan:

oven@oven-f1:~$ ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether d8:43:ae:90:b8:2e brd ff:ff:ff:ff:ff:ff
3: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 74:fe:ce:ea:db:b5 brd ff:ff:ff:ff:ff:ff

default routes with this new netplan config:

oven@oven-f1:~$ ip route
default via 192.168.0.1 dev enp5s0 proto static 
default via 192.168.0.1 dev enp4s0 proto dhcp src 192.168.0.27 metric 100 
192.168.0.0/24 dev enp5s0 proto kernel scope link src 192.168.0.24 
192.168.0.0/24 dev enp4s0 proto kernel scope link src 192.168.0.27 metric 100 
192.168.0.1 dev enp4s0 proto dhcp scope link src 192.168.0.27 metric 100 

There are no errors with my config, but I lose SSH access to the server. I can still access the internet from the server and SSH to other machines, but I cannot SSH into the server from my laptop.

I can't ping the server but I can still see its address:

s@M1 ~ % nslookup oven-f1
Server:     2001:8003:d44e:7600::1
Address:    2001:8003:d44e:7600::1#53

Name:    oven-f1.modem
Address: 192.168.0.27

s@M1 ~ % ping oven-f1
PING oven-f1.modem (192.168.0.27): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
^C
--- oven-f1.modem ping statistics ---
5 packets transmitted, 0 packets received, 100.0% packet loss

I'm not sure why I cannot SSH into the server after enabling 2 network cards. Any help would be greatly appreciated, as I'm quite stuck.

Edit: answer update

Below is the working netplan config. I simply moved the 10GB cards and 10GB switch into a different subnet, 192.168.1.0/24, and kept the 1GB cards and switch on 192.168.0.0/24.

network:
  version: 2
  renderer: networkd
  ethernets:
    enp4s0:
      dhcp4: true
    enp5s0:
      dhcp4: false
      addresses:
        - 192.168.1.24/24

I also updated the hosts file on the servers to map hosts on the 192.168.1.0/24 subnet.

Justin S
  • 133

2 Answers

4

The first issue here is that you have two independent subnets that use the same address numbering.
Don't do that. If the 10Gbit[1] switch has no connection to the main 1Gbit LAN switch, then they should use different network numbers – e.g. 192.168.1.0/24 for one and 192.168.10.0/24 for the other – regardless of whether it's the root cause of your problem or not.

This is because operating systems primarily route their local subnets as a single unit, and also because they route responses independently from the requests (with some exceptions). So when you try to contact 192.168.0.26, but you have two network cards connected to two different 192.168.0.0/24 subnets, the OS will not try to determine which network has such an address (with Windows possibly being an exception) – instead the entire /24 route through card A is set to have higher priority than the identical route through card B.

For example, when your Linux server has these routes:

192.168.0.0/24 dev enp5s0 proto kernel scope link src 192.168.0.24 
192.168.0.0/24 dev enp4s0 proto kernel scope link src 192.168.0.27 metric 100 

and it needs to respond to your PC's SSH connection request, it will send that response through dev enp5s0 (your 10Gbit card) because that route has the same prefix length (/24) and a smaller metric (lower cost); and since the 10Gbit card has no connection to the 1Gbit subnet, that packet will never reach the PC.
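The selection rule described above can be sketched in a few lines of Python. This is only a simplified model, not the kernel's actual lookup: it assumes that a route with no explicit metric counts as metric 0, and the laptop address 192.168.0.50 is a hypothetical stand-in for a PC on the 1Gbit LAN.

```python
import ipaddress

# Simplified copy of the two connected routes shown above.
# Assumption: a route printed without a metric is treated as metric 0.
ROUTES = [
    ("192.168.0.0/24", "enp5s0", 0),
    ("192.168.0.0/24", "enp4s0", 100),
]

def pick_route(dest, routes):
    """Pick the longest matching prefix, then the lowest metric."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), dev, metric)
        for prefix, dev, metric in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    net, dev, metric = min(matches, key=lambda r: (-r[0].prefixlen, r[2]))
    return (str(net), dev, metric)

# Hypothetical laptop address on the 1Gbit LAN.
print(pick_route("192.168.0.50", ROUTES))  # ('192.168.0.0/24', 'enp5s0', 0)
```

Both routes match and both are /24, so the metric breaks the tie and the reply leaves through enp5s0, which is the interface that cannot reach the laptop.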


Connecting the 10Gbit switch to the 1Gbit switch would be the easiest way to make it work by joining them both into a single network (without any loss in performance) – although it would make the 1Gbit card unnecessary to begin with.

But since you have no more ports free on the 10Gbit switch and have to keep the subnets separate out of necessity, then you must renumber one of them – and that alone should make everything work, at least for now (as long as you only have two subnets in general).
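One way to sanity-check a renumbering plan is to test the interface subnets for overlap. The sketch below uses Python's standard ipaddress module; the helper name overlapping_pairs is made up for illustration, and the "after" addresses assume the 10Gbit side is moved to 192.168.1.0/24, as in the asker's update.

```python
import ipaddress

def overlapping_pairs(subnets):
    """Return every pair of interfaces whose subnets overlap."""
    nets = {
        dev: ipaddress.ip_network(cidr, strict=False)  # strict=False accepts host addresses
        for dev, cidr in subnets.items()
    }
    devs = sorted(nets)
    return [
        (a, b)
        for i, a in enumerate(devs)
        for b in devs[i + 1:]
        if nets[a].overlaps(nets[b])
    ]

# Addresses from the question (before) and from the asker's update (after).
before = {"enp4s0": "192.168.0.27/24", "enp5s0": "192.168.0.24/24"}
after = {"enp4s0": "192.168.0.27/24", "enp5s0": "192.168.1.24/24"}

print(overlapping_pairs(before))  # [('enp4s0', 'enp5s0')]
print(overlapping_pairs(after))   # []
```

An empty result means each destination address belongs to at most one of the attached subnets, so the routing table no longer has two competing /24 entries.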

As a side note (once you have the subnet numbering tidied up), it is technically enough to have the 2nd card on just one server, which then may act as a router between the "1Gbit" and "10Gbit" subnets – although if you have enough ports and enough cards, then installing the 2nd card on every server will certainly make the configuration simpler.


Another issue is that you have a default route on enp5s0, even though you say that this goes to a switch not connected to any router – meaning that 192.168.0.1 probably doesn't exist on this network, and even if it does, it won't have Internet access anyway. This is likely to lead to another instance of the earlier problem, i.e. the useless "0.0.0.0/0 through enp5s0" route takes priority over the working "0.0.0.0/0 through enp4s0" route.

Don't configure a default gateway if the network doesn't have one. The default route is not required for the interface to work. (Same-subnet communications happen without using a gateway, by definition!)
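To illustrate why same-subnet traffic needs no gateway, here is a minimal sketch of the on-link test a host effectively performs. The function name needs_gateway and the peer address 192.168.1.30 are hypothetical, and the interface address assumes the 10Gbit card has been renumbered to 192.168.1.24/24.

```python
import ipaddress

def needs_gateway(local_cidr, dest):
    """True if dest lies outside the interface's own subnet."""
    iface = ipaddress.ip_interface(local_cidr)
    return ipaddress.ip_address(dest) not in iface.network

# Hypothetical peer on the same 10Gbit subnet: delivered directly, no gateway.
print(needs_gateway("192.168.1.24/24", "192.168.1.30"))  # False
# An Internet address is off-link, so it needs a route via some gateway.
print(needs_gateway("192.168.1.24/24", "8.8.8.8"))       # True
```

Only the second kind of destination ever consults a default route, which is why the 10Gbit interface works fine with just an address and prefix length.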



[1] Note that "GB" (uppercase) means "gigabyte", which is an 8x–10x difference from "Gb" (gigabit).

grawity
  • 501,077
2

This is happening because you have 2 NICs on the same broadcast subnet, so packets will only be sent on one of the NICs.

You are attempting to access the server (SSH) from the same subnet, so the default route will not be a factor; only the 192.168.0.0/24 routes matter.

On the static interface you have the route: 192.168.0.0/24 dev enp5s0 proto kernel scope link src 192.168.0.24

On the DHCP you have: 192.168.0.0/24 dev enp4s0 proto kernel scope link src 192.168.0.27 metric 100

As the metric on the DHCP route is 100, and the static route's is 0 because no metric was assigned, the metric-0 route is the one that will be used. That means all traffic for 192.168.0.0/24 will be sent via enp5s0. This causes issues, as the MAC address that the client receives replies from is different from the one it sent to; also, unless you have enabled route_localnet, the two interfaces cannot route traffic between them.

If you wish to have the two NICs on the same subnet, you have many options, some of which depend on what type of switch you are using. Here are two options that will work no matter which switch you are using.

  1. Change one of the interfaces to have a subnet mask of 255.255.255.255 (/32). I would assume you would want the 1Gbit card to have this.

ip a add 192.168.0.27/32 dev enp4s0
ip a del 192.168.0.27/24 dev enp4s0

You will need to update your netplan config to apply the new IP address and disable DHCP on that interface; otherwise it will get the /24 subnet again.

You will also need to enable route_localnet:

sysctl -w net.ipv4.conf.all.route_localnet=1

To make that persistent across reboots

echo net.ipv4.conf.all.route_localnet=1 >> /etc/sysctl.conf

This will result in any packets addressed to 192.168.0.27 being sent from this interface. The metric won't matter, as the longer (more specific) prefix will always be used; the metric is only a factor when the prefix lengths are the same. Using a /32 address will also disable broadcasting on that interface, which will stop broadcast looping.

  2. Bridging the interfaces. The easiest way to do this is to install bridge-utils.

On Ubuntu: apt install bridge-utils

brctl addbr br0
brctl addif br0 enp4s0
brctl addif br0 enp5s0

ip link set enp4s0 promisc on
ip link set enp5s0 promisc on

ip link set br0 up
dhclient br0
ip a del 192.168.0.27/24 dev enp4s0
ip a del 192.168.0.24/24 dev enp5s0

This will give the bridge interface a new IP address.

This setup may cause a broadcast loop. If it does, try enabling STP:

brctl stp br0 on

Make sure you can access the terminal either locally or through out-of-band management such as iDRAC or iLO, since switching to a bridge may cause connectivity to drop while it is being set up.

Note: I do not recommend this setup. You should not have 2 NICs on the same broadcast subnet (VLANs are one way to create separate broadcast subnets). Link aggregation would be the best solution.

If the network interfaces are connected to different physical segments of the network, then the bridge option is the way to go and is recommended. A bridge is another term for a switch, so your server will act as a switch between the two physical segments, joining them into one logical network.