Differences

Below are the differences between two revisions of the page.

newsfeed:20141007 [2014/10/07 12:39] ltaulell newsfeed:20141007 [2020/08/25 17:58] (current version) – external edit 127.0.0.1
Line 3: Line 3:
 Explanations will be long, take a coffee, a tea, or quit ;o) Explanations will be long, take a coffee, a tea, or quit ;o)
  
-Last week, we experienced a threshold effect (again) while adding new nodes  +Last week, we experienced a threshold effect (again) while adding new nodes to E5 cluster. It lead us to reboot a big part of debian nodes. Doing this open a Murphy's box (you know the law ? It's the same, with a ribbon on it).
-to E5 cluster. It lead us to reboot a big part of debian nodes. Doing this  +
-open a Murphy's box (you know the law ? It's the same, with a ribbon on it).+
  
-an electrical problem appear where we add the new nodes, on the same  +  * an electrical problem appear where we add the new nodes, on the same distribution unit of a block of x5570, including their scratch servers.
-  distribution unit of a block of x5570, including their scratch servers.+
   -> unexpected reboots while trying to figure out what's going on.   -> unexpected reboots while trying to figure out what's going on.
  
-new E5 nodes cannot connect to "old" infiniband switchs (firmware problem)+  * new E5 nodes cannot connect to "old" infiniband switchs (firmware problem)
   -> nodes need a new infiniband card firmware, and reboot   -> nodes need a new infiniband card firmware, and reboot
   -> OS kernel need an upgrade, which need to be propagated   -> OS kernel need an upgrade, which need to be propagated
   -> some infiniband cables wont work anymore (wires need an upgrade, maybe)   -> some infiniband cables wont work anymore (wires need an upgrade, maybe)
  
-new E5 infiniband switch cannot connect to "old" infiniband switchs (firmware +  * new E5 infiniband switch cannot connect to "old" infiniband switchs (firmware problem) 
-  problem) +
   -> old switchs need a new firmware, and reboot   -> old switchs need a new firmware, and reboot
-  -> a special machine, with a very early OS, with experimental libs is needed  +  -> a special machine, with a very early OS, with experimental libs is needed to upgrade switchs
-     to upgrade switchs+
  
-old E5 nodes bios is incompatible with new infiniband card firmware...+  * old E5 nodes bios is incompatible with new infiniband card firmware...
   -> all E5 nodes need to be bios upgraded, and reboot   -> all E5 nodes need to be bios upgraded, and reboot
  
-with all theses, scratch has become incoherent, checkfs needed...+  * with all theses, scratch has become incoherent, checkfs needed...
  
-And now ? Upgrades are (mostly) OK, scratch has been checked and is OK. +And now ? Upgrades are (mostly) OK, scratch has been checked and is OK. Some E5 nodes are not OK, and have been pushed away from queuing system.
-Some E5 nodes are not OK, and have been pushed away from queuing system.+
  
  
newsfeed/20141007.1412678361.txt.gz · Last modified: 2020/08/25 17:58 (external edit)