Subscribe to the news feed (RSS)
Something is broken or hung in our network. We are working on it.
EDIT 11:19 - I still don't know precisely what happened, but every login node, visu node and compute node needs a reboot.
EDIT 11:40 - @login OK, @visu OK
Reboot of compute nodes is ongoing; all running jobs will be lost.
EDIT 14:30 - @compute is restarting
EDIT 15:00 - @compute is fully restarted
Stay tuned to this newsfeed.
We may have to take extreme measures in the next few weeks or months, including:
This is due to electricity prices and possible power outages.
The situation is still very fluid; no decisions have been made yet.
“Hope for the best, but prepare for the worst.”
Stay tuned to this newsfeed.
A storage node for /scratch/Cral is having issues. This results in a partial unavailability (not loss) of data.
Stay tuned to this newsfeed.
Some updates to our Slurm documentation, see:
A Power Distribution Unit (and a fan) unexpectedly died yesterday evening. Cascade group s92node[67-78] was abruptly shut down; running jobs have been lost.
Stay tuned to this newsfeed.