Computing resources
===================

Our clusters are grouped by CPU generation, **available** :term:`RAM` **size** and InfiniBand network. They are then sliced into *partitions* (see :doc:`partitions_overview`).

Big picture
-----------

.. WARNING::

   **E5 End Of Life**: the E5 cluster was powered off at the end of April 2025.

**Hardware specifications per node:**

+-----------+---------------+----------+----------+---------+------------------+----------------------------+
| Clusters  | CPU family    | nb cores | RAM (GB) | Network | main Scratch     | **Best use case**          |
+===========+===============+==========+==========+=========+==================+============================+
|           | E5 + GPU      | 8        | 124      |         |                  | sequential, small parallel,|
|           |               |          |          |         |                  | GPU computing              |
+           +---------------+----------+----------+         +                  +----------------------------+
| Lake      | Sky Lake      | 32       | 94, 124, | 56Gb/s  | /scratch/Lake    | medium parallel,           |
+           +---------------+          + 188, 378 +         +                  + sequential                 +
|           | Cascade Lake  |          |          |         |                  |                            |
+           +---------------+----------+----------+---------+                  +----------------------------+
|           | AMD Epyc      | 128      | 503      | 100Gb/s |                  | large parallel             |
+-----------+---------------+----------+----------+---------+------------------+----------------------------+
| Cascade   | Cascade Lake  | 96       | 377      | 100Gb/s | /scratch/Cascade | large parallel,            |
+           +---------------+----------+----------+         +                  + memory intensive           +
|           | AMD Genoa     | 64       | 756      |         |                  |                            |
+           +---------------+----------+----------+         +                  +                            +
|           | Emerald 8562Y+| 64       | 504      |         |                  |                            |
+           +---------------+----------+----------+         +                  +                            +
|           | Emerald 8592+ | 128      | 1008     |         |                  |                            |
+-----------+---------------+----------+----------+---------+------------------+----------------------------+

See :doc:`partitions_overview` for more hardware details and partition slicing.

**Available** :term:`RAM` **size** may vary a little (not all RAM is available for computing, GB vs GiB, etc.).

GPU Specifications
------------------

PSMN offers two types of NVIDIA GPUs, available on the E5-GPU and Cascade-GPU partitions.

**GPU specifications per cluster:**

+---------------+---------------+-----------------------+--------------+----------------+----------------------+
| Partition     | login nodes   | CPU Model             | GPU          | CUDA support   | Compute Cap. Version |
+===============+===============+=======================+==============+================+======================+
| E5-GPU        | r730gpu01     | E5-2637v3 @ 3.5GHz    | 2x RTX2080Ti | 11.7 -> 12.2   | 7.5                  |
+---------------+---------------+-----------------------+--------------+----------------+----------------------+
| Cascade-GPU   |               | Platinum 9242 @ 2.3GHz| 1x L4        | 11.7 -> 12.2   | 8.9                  |
+---------------+---------------+-----------------------+--------------+----------------+----------------------+

**Hardware specifications per GPU Type:**

+---------------------------------+--------------+--------------+
| Specification                   | RTX2080Ti    | L4           |
+=================================+==============+==============+
| Architecture                    | Turing       | Ada Lovelace |
+---------------------------------+--------------+--------------+
| Cores                           | 4352         | 7424         |
+---------------------------------+--------------+--------------+
| FP64 (DP, TFLOPS)               | 0.42         | 0.47         |
+---------------------------------+--------------+--------------+
| FP32 (SP, TFLOPS)               | 13.4         | 16.4         |
+---------------------------------+--------------+--------------+
| FP16 (HP, TFLOPS)               | 26.9         | 30.3         |
+---------------------------------+--------------+--------------+
| Tensor Cores                    | 544          | 240          |
+---------------------------------+--------------+--------------+
| SM Count                        | 68           | 60           |
+---------------------------------+--------------+--------------+
| Boost clock speed (MHz)         | 1545         | 2040         |
+---------------------------------+--------------+--------------+
| Core clock speed (MHz)          | 1350         | 795          |
+---------------------------------+--------------+--------------+
| GPU Memory (GB)                 | 11           | 24           |
+---------------------------------+--------------+--------------+
| GPU Memory Bandwidth (GB/s)     | 616          | 300          |
+---------------------------------+--------------+--------------+
| Max Thermal Design Power (W)    | 260          | 72           |
+---------------------------------+--------------+--------------+
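
To use one of these GPUs, a job must target the corresponding partition and request the device. The script below is a minimal sketch, assuming the generic Slurm ``--gres=gpu:1`` syntax is active on these partitions; the job name, CPU, memory and time values are illustrative, only the partition names come from the tables above:

.. code-block:: bash

    #!/bin/bash
    # Minimal GPU job sketch (assumption: a generic "gpu" GRES name).
    #SBATCH --job-name=gpu_example
    #SBATCH --partition=Cascade-GPU      # or E5-GPU, see the table above
    #SBATCH --gres=gpu:1                 # request one GPU on the node
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=4
    #SBATCH --mem-per-cpu=4096M
    #SBATCH --time=01:00:00

    # nvidia-smi shows which GPU has been allocated to the job
    nvidia-smi
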

Available resources
-------------------

Use the ``sinfo`` [#sinfo]_ command to get a dynamic view of the partitions (the default one is marked with a '*'; see also ``sinfo -l``, ``sinfo -lNe`` and ``sinfo --summarize``):

.. code-block:: bash

    $ sinfo
    PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
    E5*           up 8-00:00:00      4   idle  c82gluster[1-4]
    Cascade       up 8-00:00:00     77   idle  s92node[02-78]

Or the state of a particular partition:

.. code-block:: bash

    $ sinfo -p Epyc
    PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
    Epyc          up 8-00:00:00      1    mix  c6525node002
    Epyc          up 8-00:00:00     12  alloc  c6525node[001,003-006,008-014]
    Epyc          up 8-00:00:00      1   idle  c6525node007

To see more information (CPUs and CPU organization, :term:`RAM` size [in MiB], state/availability), use one of these:

.. code-block:: bash

    $ sinfo --exact --format="%9P %.8z %.8X %.8Y %.8c %.7m %.5D %N"
    PARTITION     S:C:T  SOCKETS    CORES     CPUS  MEMORY NODES NODELIST
    E5*           2:8:1        2        8       16  128872     4 c82gpgpu[31-34]
    E5*           2:8:1        2        8       16   64328     3 c82gluster[2-4]
    E5-GPU        2:4:1        2        4        8  128829     1 r730gpu20
    Lake         2:16:1        2       16       32  385582     3 c6420node[172-174]
    Cascade      2:48:1        2       48       96  385606    77 s92node[02-78]

    $ sinfo --exact --format="%9P %.8c %.7m %.5D %.14F %N"
    PARTITION     CPUS  MEMORY NODES NODES(A/I/O/T) NODELIST
    E5*             16  128872     4        3/1/0/4 c82gpgpu[31-34]
    E5*             16   64328     3        3/0/0/3 c82gluster[2-4]
    E5-GPU           8  128829     1        0/1/0/1 r730gpu20
    Lake            32  385582     3        1/2/0/3 c6420node[172-174]
    Cascade         96  385606    77     47/26/4/77 s92node[02-78]

    $ sinfo --exact --format="%9P %.8c %.7m %.20C %.5D %25f" --partition E5,E5-GPU
    PARTITION     CPUS  MEMORY        CPUS(A/I/O/T) NODES AVAIL_FEATURES
    E5*             16  256000       248/120/16/384    24 local_scratch
    E5*             16  128828         354/30/0/384    24 (null)
    E5*             16  257852          384/0/0/384    24 (null)
    E5*             32  257843          384/0/0/384    12 (null)
    E5*             16   64328            48/0/0/48     3 (null)
    E5*             16  128872            64/0/0/64     4 (null)
    E5-GPU           8  127000         32/128/0/160    20 gpu

``A/I/O/T`` stands for ``Allocated/Idle/Other/Total``, in CPU terms.

.. code-block:: bash

    $ sinfo -lN | less
    NODELIST     NODES PARTITION  STATE  CPUS  S:C:T  MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
    [...]
    c82gluster4      1 E5*         idle    16  2:8:1   64328        0      1 (null)   none
    s92node02        1 Cascade     idle    96 2:48:1  385606        0      1 (null)   none
    [...]

.. important::

   * HyperThreading [#ht]_ is activated on all Intel nodes, but is not available as a computing resource (*real cores vs logical cores*).
   * :term:`RAM` size is in MiB, and you cannot reserve more than 94% of it, per node.

Basic defaults
--------------

Unless the corresponding option is given in the submission script, a job gets the following defaults (the sketch after this list overrides them explicitly):

* default partition: Lake-short
* default time: 10 minutes
* default cpu(s): 1 core
* default memory size: 4GiB / core
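
As a minimal sketch, the script below overrides all of these defaults explicitly (the ``Lake`` partition name comes from the ``sinfo`` outputs above; ``./my_program`` and the resource values are illustrative assumptions):

.. code-block:: bash

    #!/bin/bash
    # Sketch: override every default listed above.
    #SBATCH --job-name=defaults_example
    #SBATCH --partition=Lake             # instead of the default Lake-short
    #SBATCH --time=04:00:00              # instead of the default 10 minutes
    #SBATCH --cpus-per-task=8            # instead of the default 1 core
    #SBATCH --mem-per-cpu=8192M          # instead of the default 4GiB per core

    # ./my_program is a placeholder for your own executable
    srun ./my_program
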

Features
--------

Some nodes have *features* [#features]_ (``gpu``, ``local_scratch``, etc.). To request a feature/constraint, you must add the following line to your submission script: ``#SBATCH --constraint=``. Example:

.. code-block:: bash

    #!/bin/bash
    #SBATCH --job-name=my_job_needs_local_scratch
    #SBATCH --time=02:00:00
    #SBATCH --ntasks=8
    #SBATCH --mem-per-cpu=4096M
    #SBATCH --constraint=local_scratch

Only nodes whose features match the job constraints will be used to satisfy the request.

Maximums
--------

Here are some maximums of usable resources **per job**:

* **maximum** wall-time: **8 days** ('8-0:0:0', as 'days-hours:minutes:seconds')
* **maximum** nodes per job and/or **maximum** cores per job (a job sketch staying within these limits is given at the end of this page):

+-------------+-------+-------+-----+
| Partition   | nodes | cores | gpu |
+=============+=======+=======+=====+
| E5-GPU      | 18    | 144   | 18  |
+-------------+-------+-------+-----+
| Lake        | 24    | 768   |     |
+-------------+-------+-------+-----+
| Epyc        | 14    | 1792  |     |
+-------------+-------+-------+-----+
| Cascade     | 76    | 7296  |     |
+-------------+-------+-------+-----+
| Cascade-GPU | 12    | 1152  | 12  |
+-------------+-------+-------+-----+
| Genoa       | 10    | 640   |     |
+-------------+-------+-------+-----+
| Emerald     | 4     | 256   |     |
+-------------+-------+-------+-----+

Anything more **must be justified using** `our contact forms `_.

.. [#sinfo] You can get the complete list of parameters by referring to the ``sinfo`` manual page (``man sinfo``).

.. [#ht] `See HyperThreading `_

.. [#features] See the ``sbatch`` manual page (``man sbatch``, -C, --constraint=).
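
As announced in the Maximums section, here is a sketch of a large parallel job staying within the per-job limits (the ``Cascade`` partition and its 96 cores per node come from this page; the node count, memory and ``./my_mpi_program`` are illustrative assumptions):

.. code-block:: bash

    #!/bin/bash
    # Sketch: 4 Cascade nodes x 96 cores = 384 tasks, well below the
    # per-job maximum of 76 nodes / 7296 cores on the Cascade partition.
    #SBATCH --job-name=large_parallel_example
    #SBATCH --partition=Cascade
    #SBATCH --nodes=4
    #SBATCH --ntasks-per-node=96
    #SBATCH --mem-per-cpu=3500M          # 96 x 3500 MiB stays below 94% of node RAM
    #SBATCH --time=2-00:00:00            # 2 days, below the 8-day maximum

    # ./my_mpi_program is a placeholder for your own MPI executable
    srun ./my_mpi_program
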