20221027 / prepare for reboot...

There is a ghost network glitch that makes compute nodes unavailable.

“Connection reset by peer”, “launch failed requeued held”, “JobHeldAdmin” → all the same: the node has gone rogue.

E5, Lake and Cascade partitions are in “DRAIN” mode, awaiting a general reboot of compute nodes.

E5-GPU, Epyc and Cascade are already OK, as are the login and visualization nodes.

DO NOT WRITE DIRECTLY TO PSMN STAFF, USE THE WEB FORMS: Formulaires du PSMN

Stay tuned with this newsfeed.

EDIT: never mind, Cascade needs a reboot too…

newsfeed/20221027.txt · Last modified: 2022/10/27 14:04 by ltaulell