====== 20200710 / Post-Mortem ======

The correct assumption is "Shitstorm hit the fan." (I stand corrected)

We are not done yet:

  * ssh.psmn is under attack from a botnet, which is why you get "maximum authentication attempts exceeded",
  * The master LDAP server is down. We are running from slave1 (backup from yesterday),
  * All scratch filesystems are almost back (except on some E5 and X5 nodes),
  * /homes and /Xnfs should be OK everywhere ("should", as in "remount is ongoing").

**EDIT 13:00**: The master LDAP server is back online \o/ !

**EDIT 13:50**: Cluster X5 is fully up & running.

**EDIT 14:05**: Clusters E5 and Lake are up & running.

{{tag> shitstorm }}