News feed

20230711 / Urgent upgrades

We have to perform urgent hardware upgrades on a few fileservers. Each one needs a reboot afterward, so fileservers will be unavailable for a few minutes at most. Expect lags and delays during these upgrades.

Impacted services: $HOME and /Xnfs shares (almost everyone)

2023/07/11 08:55 · ltaulell

20230710 / Orange heatwave alert (Alerte Orange Canicule)

We are powering down half of each main cluster (E5, Lake, Cascade) to:

  • Ease the load on the cooling system on hot days,
  • Save energy (and cost)…

Nodes will be powered back up “on-demand” (when jobs stay more than one day in PENDING state).

2023/07/10 07:47 · ltaulell

20230705 / infiniband Cascade (S01E02)

I'm taking down the InfiniBand network on the Cascade cluster (including scratches) for maintenance and debugging purposes.

EDIT 10:15: everything on Cascade is back online (QM8700 ↔ QM8790…).

FIXME: /scratch/Cascade is 98% full, please clean up, or I'll do it…

Kind reminder about scratch usage → http://www.ens-lyon.fr/PSMN/Documentation/filesystems/scratch.html
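To see how full a filesystem is before starting a job, something like the following minimal sketch may help. It only uses Python's standard `shutil.disk_usage`; the function name `percent_used` is an illustration, not a PSMN tool, and the figure can differ slightly from what `df` reports because reserved blocks are counted differently.

```python
import shutil


def percent_used(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


# Example: report usage of the filesystem holding the current directory.
print(f"{percent_used('.'):.1f}% used")
```

Point it at your scratch directory instead of `.` to check the volume discussed above.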

2023/07/05 07:10 · ltaulell

20230704 / infiniband Cascade (scratch, MPI...)

We are “hot”-modifying the InfiniBand network of the Cascade cluster. You may experience some temporary access problems on Cascade scratches.

/scratch/Cascade is 99% full, hence performing as badly as possible (it's not even performing at all, for now). You should have cleaned up while you could, because now I'm in charge…: erasing all files & directories older than 180 days (that's six months) on /scratch/Cascade. Do NOT complain, the documentation is crystal clear (http://www.ens-lyon.fr/PSMN/Documentation/filesystems/scratch.html)

EDIT 17:00: due to a misconfiguration we have not found yet, half of the scratch nodes (Cascade and Cral) are no longer connected.
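To find out which of your own files would fall under the 180-day threshold, a minimal sketch along these lines can list them. It assumes the age is measured on the modification time (mtime); the announcement does not say which timestamp the actual cleanup uses. The name `find_old_files` is hypothetical, not an existing PSMN script, and the sketch only lists files, it deletes nothing.

```python
import os
import time
from pathlib import Path

MAX_AGE_DAYS = 180  # threshold quoted in the announcement


def find_old_files(root, max_age_days=MAX_AGE_DAYS, now=None):
    """Return the files under `root` whose mtime is older than `max_age_days`.

    Assumption: age is based on mtime; the real cleanup may use another
    timestamp (atime or ctime).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    old = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # lstat: do not follow symlinks, judge the link itself
                if path.lstat().st_mtime < cutoff:
                    old.append(path)
            except OSError:
                # File vanished or is unreadable; skip it
                continue
    return sorted(old)
```

Call it on your own directory, for example `find_old_files("/path/to/your/scratch/dir")` (placeholder path), and review the list before removing anything.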

2023/07/04 12:13 · ltaulell

20230615 / volume highenergy (S01E02)

The /Xnfs/highenergy volume is back online. Go easy on it: a scan/rebuild is still ongoing…

2023/06/15 16:32 · ltaulell
news/blog.txt · Last modified: 2020/08/25 15:58 by 127.0.0.1