PSMN will resume normal operation shortly.
EDIT 15:00: Call me paranoid. Temperatures are rising again. Clusters are at minimum frequencies, partitions half-open.
The cooling system is down, and it is worse than last time. Idle nodes have been drained and powered off.
Working nodes might follow if the situation doesn't improve.
EDIT 16h30: all partitions DOWN, 230+ nodes powered OFF. All running jobs continue, at minimum frequencies. No new jobs will start.
EDIT 21h30: Temperatures are back to normal under the current reduced load. PSMN stays in DRAIN mode until further notice.
EDIT morning: I forgot → half of the login nodes and visu nodes are powered OFF (no need for residual heat).
Bad news: scratch/Lake is dead, for good (5 of its 12 disks failed during my vacation). The data is lost; a rebuild is impossible.
Fortunately, the new hardware we plan to put into production in October has been delivered. A new scratch/Lake, with better performance, will be available in a few weeks.
Good news: the UPS is back online and the heatwave is over. Frequencies will resume tonight (minimum frequency by day, maximum frequency at night).
Our main electrical protection (Uninterruptible Power System) is down (bypass mode) due to failures.
All nodes are running at minimum frequency (energy-saving mode) until further notice.
LT from vacation.
No accounts will be created during the administrative closure period of ENSL.