I installed Slurm on a workstation and it seems to work: I can use the Slurm commands, and srun works too.
But when I try to launch a job from a script using `sbatch test.sh`, I get the following error: `Batch job submission failed: I/O error writing script/environment to file`. This happens even with the simplest script, like:
#!/bin/bash
srun hostname
jeb
Daoud El Kadiri
- Is `slurmd` running as root? – damienfrancois Feb 09 '21 at 09:44
- Yes, it is running as root. – Daoud El Kadiri Feb 09 '21 at 10:03
- Slurm seems to complain that it cannot write to the location defined by `SlurmdSpoolDir`. Could it be a faulty or read-only filesystem? – damienfrancois Feb 09 '21 at 15:07
- Yes, it turned out to be a permission problem: I had rw permission for the root user, but in my conf file the user was set to slurm. I changed it to root and it worked. – Daoud El Kadiri Feb 11 '21 at 06:37
- Was it the `SlurmUser` or the `SlurmdUser` that you set to root to get things working again? – winni2k Feb 04 '22 at 08:06
- In my case it just happens from time to time and I don't understand why. – 3r1c Jan 09 '23 at 17:31
1 Answer
Make sure `slurmd` is running as root. See the `SlurmdUser` parameter in `slurm.conf`; its default value is root, and it should stay that way.
Note that this is different from the `SlurmUser` parameter, which defines the user that runs the controller processes; that one should preferably not be root.
If the configuration is correct, then you might have a faulty filesystem at the location referred to by the `SlurmdSpoolDir` parameter, where `slurmd` writes the submission script and environment for jobs assigned to the node.
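The checks described above could be scripted roughly as follows. This is only a sketch, not an official Slurm tool: the helper names `conf_value` and `check_spool` are made up for illustration, the fallback values (`root` and `/var/spool/slurmd`) are assumed to match the documented defaults, and the parsing is deliberately naive (it ignores `Include` files and any substitutions in the path). The writability test applies to whoever runs the script, so it is only meaningful when run as the user named in `SlurmdUser`.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Slurm) to inspect a slurm.conf file.

# Print the value of a slurm.conf parameter, or a default if it is not set.
# Usage: conf_value <file> <key> <default>
conf_value() {
    local file=$1 key=$2 default=$3 value
    # Keys are matched case-insensitively; the last occurrence wins.
    # Trailing comments and whitespace are stripped from the value.
    value=$(grep -i "^[[:space:]]*${key}[[:space:]]*=" "$file" \
        | tail -n 1 | cut -d= -f2- | sed 's/#.*//' | tr -d '[:space:]')
    echo "${value:-$default}"
}

# Report SlurmdUser and SlurmdSpoolDir, then test whether the spool
# directory exists and is writable by the user running this script.
# Usage: check_spool <file>
check_spool() {
    local conf=$1 user dir
    user=$(conf_value "$conf" SlurmdUser root)                 # assumed default
    dir=$(conf_value "$conf" SlurmdSpoolDir /var/spool/slurmd) # assumed default
    echo "SlurmdUser=$user SlurmdSpoolDir=$dir"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "$dir is writable by $(id -un)"
    else
        echo "$dir is missing or NOT writable by $(id -un)"
        return 1
    fi
}
```

For example, after sourcing the file, `sudo -u root bash -c 'source ./check.sh; check_spool /etc/slurm/slurm.conf'` would run the check as root; both the script name `check.sh` and the configuration path are assumptions and may differ on your system. On a running node, `scontrol show config` also lists the values Slurm actually loaded.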
damienfrancois