The explanation will be long: grab a coffee, a tea, or quit ;o)
Last week, we experienced a threshold effect (again) while adding new nodes to the E5 cluster. It led us to reboot a large part of the Debian nodes. Doing this opened a Murphy's box (you know the law? It's the same, with a ribbon on it).
- an electrical problem appeared where we added the new nodes, on the same distribution unit as a block of x5570 nodes, including their scratch servers. → unexpected reboots while we were trying to figure out what was going on.
to upgrade switches
And now? Upgrades are (mostly) OK, and scratch has been checked and is OK. Some E5 nodes are not OK and have been removed from the queuing system.