News feed

20190513 / Shutdown of E5-2670comp1 and E5-2670comp2

Hello, the compilation servers E5-2670comp1 and E5-2670comp2 will be temporarily shut down on Tuesday 14/05/2019 so that they can be moved. The shutdown will last at most 2 hours.

Edit 2019/05/14 11:30
E5-2670comp1 and E5-2670comp2 are OK

2019/05/13 13:48 · gilquin

20190509 / Power supply problem

A power supply interruption occurred yesterday, Wednesday 8 May 2019, at around 13:00, and caused a large number of servers to shut down. Restoring normal operating conditions will take at least the whole morning.

Hervé Gilquin, for the staff.

2019/05/09 05:17 · gilquin

20190424 / It's alive!!

Scratch E5 is back from the dead, as fresh as a newborn (hence empty).

Reminder, new scratch hierarchy:

/scratch/
     ├── E5/        (existing E5 scratch, available to E5 cluster)
     ├── nvme/      (local to some servers)
     ├── ssd/       (local to some servers)
     ├── project_name (local to some servers, with dedicated hardware)
     ...
     └── X5/        (existing X5 scratch, available to X5 cluster)

2019/04/24 14:55 · ltaulell

20190420 / Restart

We are back up, except that some nodes are not mounting homes and the E5 scratch is FUBAR.

Maintenance operations will occur next week.

2019/04/20 13:59 · ltaulell

20190405 / Next power outage Saturday, April 20th 2019.

Hi all,

FYI, the next planned main power outage for ENS Lyon (Monod site) is scheduled for Saturday, April 20th 2019.

Planned stops for PSMN are:

  1. stop of all queues on Friday, April 19th (before 10 AM)
  2. stop of allo-psmn *at 10 AM* on Friday, April 19th

The restart schedule will depend on maintenance operations by both PSMN and ENS (ASAP, once the DC is back to “fully operational”).

On this occasion, there will be an OS upgrade on compute servers (Debian 9.5 → 9.8), hence possible software updates.

Request For Comments:


  • CUDA upgrade

We would like to upgrade the NVIDIA driver and the CUDA devkit from 8 to '9.0 + 9.2' on all compute servers (including the visualization ones).

  • Unified scratch

We propose a new scratch hierarchy that makes it easy to include upcoming hardware.

/scratch/
     ├── E5/        (existing E5 scratch, available to E5 cluster)
     ├── nvme/      (local to some servers)
     ├── ssd/       (local to some servers)
     ├── project_name (local to some servers, with dedicated hardware)
     ...
     └── X5/        (existing X5 scratch, available to X5 cluster)

Main documentation will be updated to include these changes. You will need to change your scripts accordingly.
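As an illustration of what such a script change might look like, here is a minimal Python sketch that picks the first usable scratch directory under the new hierarchy. Only the top-level names (E5, X5, nvme, ssd) come from the tree above. The helper names `pick_scratch` and `job_dir`, the preference order, and the per-user subdirectory layout are assumptions made for the example, not PSMN conventions.

```python
import os
from pathlib import Path

# Root of the hierarchy announced above.
SCRATCH_ROOT = Path("/scratch")


def pick_scratch(candidates, root=SCRATCH_ROOT):
    """Return the first existing, writable directory among root/<name>, or None.

    `candidates` is an ordered list of top-level names, e.g. ["nvme", "ssd", "E5"].
    """
    for name in candidates:
        path = Path(root) / name
        if path.is_dir() and os.access(path, os.W_OK):
            return path
    return None


def job_dir(candidates, user, root=SCRATCH_ROOT):
    """Create and return a working directory on the first usable scratch.

    The <scratch>/<user> layout is a hypothetical convention for this sketch.
    """
    base = pick_scratch(candidates, root)
    if base is None:
        raise RuntimeError(f"no usable scratch among {list(candidates)} under {root}")
    workdir = base / user
    workdir.mkdir(parents=True, exist_ok=True)
    return workdir
```

A job script could then call, for instance, `job_dir(["nvme", "ssd", "E5"], getpass.getuser())` so that it prefers a local scratch where one exists and falls back to the cluster-wide one otherwise.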

  • Cleanup of E5 scratch

E5 scratch is in bad shape and needs a thorough cleanup. We propose to erase it and start again from zero.

If you have any comments, please send them to staff.psmn.

2019/04/08 14:14 · ltaulell
news/blog.txt · Last modified: 2020/08/25 15:58 by 127.0.0.1