
News feed

20220613 / Acoustic measurements

“Sous réserve de conditions météorologiques favorables, la réalisation des mesures acoustiques MLE / SING est programmée dans la nuit du jeudi 16 au vendredi 17 juin 2022.”

Under favorable weather conditions, acoustic measurements (yes, again) will take place during the night of Thursday, June 16 to Friday, June 17, 2022.

We need to shut down all compute nodes and part of the storage to meet the power/cooling conditions required by the measurements.

Queues and partitions will be blocked from tomorrow (Tuesday 14/06/2022) until Friday.

All compute nodes will be powered off at the end of the afternoon on Thursday the 16th.

EDIT:

A few servers will be upgraded and rebooted on Friday, before the restart:

  • allo-psmn and ssh.psmn
  • data10 (homes chimie)
  • data13 (/Xnfs/cosmos)
2022/06/13 12:08 · ltaulell

20220511 / Debian11 crashed

While doing a big software install on the Debian 11 master system image, I crashed it… (long story short: a vicious 'apt-get -y upgrade' was not commented out…)
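For what it's worth, here is a minimal sketch of the kind of guard that would have helped (a hypothetical helper script, not our actual image-build tooling): simulate the upgrade first, and only apply it behind an explicit flag.

  #!/bin/bash
  # Hypothetical guard for package upgrades on a master image:
  # dry-run first, apply only when explicitly asked to.
  set -euo pipefail

  apt-get update

  # -s (--simulate) prints the planned changes without touching the system.
  apt-get -s upgrade

  if [[ "${1:-}" == "--apply" ]]; then
      apt-get -y upgrade
  else
      echo "Dry run only. Re-run with --apply to really upgrade." >&2
  fi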

All Debian 11 nodes are impacted; most have crashed or are unavailable.

I'll finish the software update before cleaning my mess. Sorry.

UPDATE: upgrade done, cleanup done. Nodes OK, Slurm OK.

2022/05/11 14:02 · ltaulell

20220506 / Migration News, scratches

  • deb9-deb11 migration: Plans are made to be changed…

The Debian 11/Slurm upgrade was never planned to be a one-day operation:

  • The E5 cluster is being shut down, piece by piece, to make room for the Cascade extension (started yesterday)
  • The Cascade extension will be powered up gradually (mostly during June 2022)

then summer holidays…

In the meantime:

  • E5 and Lake test nodes *are available* for test and migration purposes
    • E5: c82gluster1 is the login node (for now)
    • Lake: c6420node171 is the login node (for now)
    • Cascade: s92node01 is the login node.

Please do test and prepare your Slurm scripts…
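If you need a starting point, here is a minimal sketch of a Slurm batch script; the partition name and the resource values are placeholders (assumptions, not actual cluster settings), so adjust them to the test cluster you use:

  #!/bin/bash
  #SBATCH --job-name=migration_test   # name shown by squeue
  #SBATCH --partition=Lake            # placeholder: pick your test cluster's partition
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=4
  #SBATCH --time=00:10:00             # short walltime, this is only a test
  #SBATCH --output=%x-%j.out          # %x = job name, %j = job id

  # load your usual modules/environment here, then run something small
  srun hostname

Submit it with 'sbatch test.sbatch' and watch it with 'squeue -u $USER'.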

Be aware that homes and group/team storage (/Xnfs) are shared between systems.

Then, in September, we'll see (final migrations of the E5 and Lake clusters, scratch migrations).

  • Scratches

You are doing it wrong (mostly).

DO NOT store scripts, SGE/Slurm logs, small files, source code, or binaries on the scratches: it degrades overall performance VERY fast, for everyone.

Scratches are meant for large temporary files and large I/O operations, WITHIN a job. That's all.

DO clean up!!! Every time a job finishes, your scratch space should be cleaned up (with an exception for long workflows).

DO VERIFY your cleanup operations!!
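As an illustration, here is a hedged sketch of the intended pattern inside a job script; the scratch root and the /Xnfs group paths below are assumptions, replace them with your actual mounts:

  #!/bin/bash
  #SBATCH --job-name=scratch_demo

  # Assumed scratch root; use your cluster's real scratch mount point.
  SCRATCH_DIR="/scratch/E5N/$USER/$SLURM_JOB_ID"
  mkdir -p "$SCRATCH_DIR"

  # Remove the scratch area even if the job fails or is killed.
  trap 'rm -rf "$SCRATCH_DIR"' EXIT

  # Stage large inputs onto scratch and do the heavy I/O there...
  cp /Xnfs/mygroup/big_input.dat "$SCRATCH_DIR/"
  cd "$SCRATCH_DIR"
  # ... run your computation here ...

  # Copy back only the results worth keeping, to permanent storage.
  cp results.dat /Xnfs/mygroup/

Verifying the cleanup can be as simple as listing your scratch directory once the job has ended.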

General purpose scratches (E5N/, Lake/) are full again. We will erase files older than 90 days next week (blind shot).
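For reference, such a sweep usually boils down to something like the following (illustrative only, with an assumed mount point, not the exact command we will run):

  # Illustrative only: remove files not modified for more than 90 days.
  find /scratch/E5N -type f -mtime +90 -delete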

2022/05/06 09:05 · ltaulell

20220505 / E5 cluster, partial poweroff

Queues E5-2670deb128A to D are now disabled and will be powered off permanently next week.

The sliding block puzzle is starting…

2022/05/05 09:24 · ltaulell

20220428 / E5 cluster migration

We will stop parts of the E5 cluster (the older nodes), beginning the week of 2–6 May 2022.

The E5 scratch, the visu nodes, and the 'newer' E5 nodes will stay on the deb9/SGE system until further notice.

2022/04/28 08:26 · ltaulell