====== 20221027 / prepare for reboot... ======

There's a ghost network glitch that makes compute nodes unavailable.

"connection reset by peer", "launch failed requeued held", "JobHeldAdmin" -> all the same cause: the node has gone rogue.

E5, Lake and Cascade partitions are in "DRAIN" mode, awaiting a general reboot of compute nodes.

E5-GPU, Epyc <del>and Cascade</del> are already OK, as are the login and visualization nodes.

<note important>
DO NOT WRITE DIRECTLY TO PSMN STAFF, USE THE WEB FORMS:
[[contact:forms:accueil|PSMN forms]]

Stay tuned to this newsfeed.
</note>

EDIT: never mind, Cascade needs a reboot too...

{{tag>}}