Subscribe to the news feed (RSS)
Following the unidentified problem on Friday the 10th, most nodes (including @comp & @visu) have been rebooted and are working properly.
Nodes still running jobs started before the problem are queued for reboot. Things should be fixed within the next few days.
Statuses like “launch_failed_requeued_held”, “JobHeldAdmin”, etc. will be cleaned up along the way.
Stay tuned to this newsfeed.
I don't know what the problem is yet; I'm looking into it.
At first sight, it might need a general reboot :/
EDIT 13:00: OK, general reboot of @comp, @visu… @nodes will follow shortly…
c8220node1, along with a bunch of c8220 nodes, will be powered off today (02/02/2023; expected restart next Monday, 06/02/2023).
Please use another login node; see https://meso-centres-lyon.pages.in2p3.fr/psmn-rtd/clusters_usage/login_nodes.html
Because some configuration changes require a reboot, a large number of nodes are in draining mode, which reduces the available computing capacity.
Stay tuned to this newsfeed.
Due to:
a lot of E5 nodes are in draining mode, reducing available computing capacity.
Stay tuned to this newsfeed.
“resilver in progress” since a dead disk was replaced today.
This impacts all bio accounts (lbmc, rdp, igfl, ciri…)