Below are the differences between two revisions of the page.
newsfeed:20141007 [2014/10/07 10:39] – ltaulell | newsfeed:20141007 [2014/10/07 10:42] – ltaulell | ||
---|---|---|---|
Line 3: | Line 3: | ||
Explanations will be long, take a coffee, a tea, or quit ;o) | Explanations will be long, take a coffee, a tea, or quit ;o) | ||
- | Last week, we experienced a threshold effect (again) while adding new nodes | + | Last week, we experienced a threshold effect (again) while adding new nodes to the E5 cluster. It led us to reboot a large part of the Debian nodes. Doing this opened a Murphy' |
- | to the E5 cluster. It led us to reboot a large part of the Debian nodes. Doing this | + | |
- | opened a Murphy' | + | |
- | - an electrical problem appeared where we added the new nodes, on the same | + | * an electrical problem appeared where we added the new nodes, on the same distribution unit of a block of x5570, including their scratch servers. |
- | | + | |
-> unexpected reboots while trying to figure out what's going on. | -> unexpected reboots while trying to figure out what's going on. | ||
- | - new E5 nodes cannot connect to " | + | * new E5 nodes cannot connect to " |
-> nodes need a new infiniband card firmware, and reboot | -> nodes need a new infiniband card firmware, and reboot | ||
-> OS kernel needs an upgrade, which needs to be propagated | -> OS kernel needs an upgrade, which needs to be propagated | ||
-> some infiniband cables won't work anymore (wires need an upgrade, maybe) | -> some infiniband cables won't work anymore (wires need an upgrade, maybe) | ||
- | - new E5 infiniband switch cannot connect to " | + | * new E5 infiniband switch cannot connect to " |
- | | + | |
-> old switches need a new firmware, and reboot | -> old switches need a new firmware, and reboot | ||
- | -> a special machine, with a very early OS, with experimental libs is needed | + | -> a special machine, with a very early OS, with experimental libs is needed to upgrade switches |
- | to upgrade switches | + | |
- | - the BIOS of old E5 nodes is incompatible with the new infiniband card firmware... | + | * the BIOS of old E5 nodes is incompatible with the new infiniband card firmware... |
-> all E5 nodes need a BIOS upgrade, and reboot | -> all E5 nodes need a BIOS upgrade, and reboot | ||
- | - with all of this, scratch has become inconsistent, checkfs needed... | + | * with all of this, scratch has become inconsistent, checkfs needed... |
- | And now? Upgrades are (mostly) OK, scratch has been checked and is OK. | + | And now? Upgrades are (mostly) OK, scratch has been checked and is OK. Some E5 nodes are not OK, and have been removed from the queuing system. |
- | Some E5 nodes are not OK, and have been removed from the queuing system. | + | |