

20180109 / Reboot next Monday

WE ARE IN TESTING MODE. REDUCED PRODUCTION. FULL ACCESS.

  • /scratch on E5 and x55 is in a full FUBAR split-brain state. Its data will be erased next Monday (2018/01/15).
  • The E5 cluster is working (except for /scratch).
  • Software is being rebuilt.
  • Documentation is being updated (with your help).

== Security upgrades

Two security vulnerabilities were announced last week: Spectre and Meltdown.

Each server must be rebooted after the upgrade, because the fix is a kernel upgrade.

== /scratch on E5 and x55 clusters

The GlusterFS filesystem (which serves /scratch on the clusters) is beyond repair. There were so many misuses, issues, and errors inherited from the previous system that the auto-heal/auto-repair included in the new software version cannot do much.

After analyzing ~6 485 000 000 files (yep, billions), it found more than 440 000 files in error or in split-brain that can be neither auto-healed nor auto-repaired.

As a consequence, the E5 cluster will be restarted with an empty /scratch next Monday.

All data in both /scratch filesystems (E5 and x55) will be lost. If you need this data and can still access it, copy it elsewhere before next Monday.
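
For those who want to script the copy, here is a minimal Python sketch, not an official tool. It assumes that files in split-brain may fail with an I/O error when read, so it copies file by file and records the failures instead of stopping at the first one. The function name `rescue_tree` and any paths you pass to it are hypothetical and only for illustration.

```python
import os
import shutil
from pathlib import Path


def rescue_tree(src, dst):
    """Copy every readable file under src into dst.

    Returns two lists of paths: (copied, failed).
    """
    src, dst = Path(src), Path(dst)
    copied, failed = [], []

    # onerror records directories that cannot be listed instead of skipping them silently.
    def note_unreadable_dir(error):
        failed.append(str(error.filename))

    for root, _dirs, files in os.walk(src, onerror=note_unreadable_dir):
        # Recreate the same sub-directory layout under dst.
        target_dir = dst / Path(root).relative_to(src)
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in files:
            source_file = Path(root) / name
            try:
                # copy2 also tries to preserve timestamps and permissions.
                shutil.copy2(source_file, target_dir / name)
                copied.append(str(source_file))
            except OSError:
                # Assumption: damaged files raise an OSError on read; record and move on.
                failed.append(str(source_file))
    return copied, failed
```

Call it with your own directory under /scratch as `src` and a destination outside /scratch as `dst`, then check the `failed` list to see which files could not be saved.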

newsfeed/20180109.1515518145.txt.gz · Last modified: 2020/08/25 15:58 (external edit)