
I need to inject AER errors on a SUSE machine. I've loaded the aer_inject module with modprobe without any problems, and I compiled the aer-inject tool from kernel.org.

Whenever I run it, I get the following error:

Error: Failed to write, No such device

This happens even though my device exists according to lspci -vvv and I'm running with root permissions.

Here's the file I'm passing to aer-inject:

AER
PCI_ID 18:00.0
COR_STATUS BAD_TLP
HEADER_LOG 0 1 2 3

On my machine, 18:00.0 corresponds to:

18:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]

This device has the Advanced Error Reporting capability according to lspci -vvv.

Why am I getting this error? Am I using the tool correctly? What should I put for the PCI_ID field if not what I see in lspci?

Raven

1 Answer

I just hit this same "Failed to write, No such device" issue on openSUSE Leap 15.2 running on a Dell T30 server. It turns out that AER handling has an owner, and the aer_inject module fails to find devices when AER handling appears to be owned elsewhere (possibly assigned by the BIOS via ACPI?). Regardless, I got aer-inject to work by appending pcie_ports=native to the kernel command line and rebooting.

FWIW I used yast2 to append the pcie_ports=native option: yast2 -> System -> Boot Loader -> Kernel Parameters -> Optional Kernel Command Line Parameter
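After rebooting, it may be worth confirming that the option actually reached the running kernel before retrying aer-inject. The snippet below is only a minimal sketch: `has_native_pcie_ports` is a hypothetical helper name made up for illustration, and the check simply looks for the exact string pcie_ports=native in /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper (not part of aer-inject): succeeds if the given
# kernel command line contains the exact parameter pcie_ports=native.
has_native_pcie_ports() {
    # Pad with spaces so the parameter is matched as a whole word.
    case " $1 " in
        *" pcie_ports=native "*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the command line the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if has_native_pcie_ports "$(cat /proc/cmdline)"; then
        echo "pcie_ports=native is set"
    else
        echo "pcie_ports=native is not set"
    fi
fi
```

If it reports that the option is not set, the boot loader change probably did not take effect, and aer-inject would be expected to fail the same way as before.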

CaldeiraG
Allan