# nf-core/configs: LRZ CM4 Configuration
## About
All nf-core pipelines have been successfully configured for use on the CoolMUC-4 cluster provided by the Leibniz Rechenzentrum (LRZ) of the Bavarian Academy of Sciences, located in Garching, Germany.
NB: You will need an account to use the LRZ Linux cluster.
## General usage
NB: Running nextflow on a login node is not permitted.
NB: It is not possible to run nextflow with the SLURM executor from within a job, because compute nodes cannot submit jobs.
Instead of having nextflow run on a login node and submit jobs to the SLURM scheduler, the nextflow head job coordinating the workflow has to run inside a SLURM job, and job scheduling is done 'inside' that SLURM job using the flux or local executor. This approach is outlined here and implemented in `-profile lrz_cm4`. The profile detects whether a flux instance has been started and switches the executor accordingly. Example sbatch scripts are provided below.
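The detection works by checking the `FLUX_URI` environment variable, the same check used in the config file at the end of this page. If you want to see which executor will be selected, an optional check inside the job could look like the sketch below (an illustration, not part of the profile itself):

```bash
# Optional: show which executor the lrz_cm4 profile will pick.
# FLUX_URI is set by `flux start`, so inside workflow.sh the profile selects
# 'flux'; in a plain SLURM job it is unset and the profile falls back to 'local'.
if [ -n "${FLUX_URI:-}" ]; then
  echo "FLUX_URI is set -> flux executor"
else
  echo "FLUX_URI is not set -> local executor"
fi
```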
## Setup
To use, run the pipeline with -profile lrz_cm4. This will download and launch the lrz_cm4.config.
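The launch command itself is the usual nextflow invocation with the profile added (shown here with nf-core/rnaseq and its test profile purely for illustration). As noted above, it must be executed inside a SLURM job, e.g. via the example sbatch scripts further down:

```bash
# Illustrative launch command; run this inside a SLURM job, not on a login node
nextflow run nf-core/rnaseq -profile test,lrz_cm4
```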
We recommend using nextflow >= 25.04.2 with apptainer (>=1.3.4) for containerization.
These are available as modules (please confirm the module name using module avail):
```bash
## Load Nextflow, apptainer, and flux environment modules
module load nextflow/25.04.2 apptainer/1.3.4 flux
```

In case additional flexibility or other versions are needed, a conda environment containing the required packages is also an option.
This could be done as follows for a temporary environment on SCRATCH_DSS:
```bash
module load micromamba
export ENV_PATH=$SCRATCH_DSS/env_nfcore # Adjust path as desired
micromamba create \
  -p $ENV_PATH \
  -c conda-forge \
  -c bioconda \
  nextflow nf-core apptainer flux-core flux-sched
micromamba activate $ENV_PATH
```

For a more persistent environment in `$HOME` consider:
```bash
module load micromamba
micromamba create \
  -n nf-env \
  -c conda-forge \
  -c bioconda \
  nextflow nf-core apptainer flux-core flux-sched
micromamba activate nf-env
```

## Considerations
While testing can be done with partial nodes or interactive jobs, we recommend requesting at least one full node for production runs. Both the local and flux executors can be used for single-node runs; multi-node runs must use flux to make use of the additional resources. Please note that during testing, we observed that the same test run of nf-core/rnaseq took around 11 h with the local executor and 8 h with the flux executor, which we largely attribute to more efficient scheduling.
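For such small-scale tests, an interactive session is one option. The sketch below is only an illustration: the partition, QOS, and core count are assumptions borrowed from the partial-node example further down, so please check the LRZ documentation for settings that actually permit interactive jobs:

```bash
# Hypothetical interactive test session; partition/QOS/core count are
# assumptions taken from the partial-node example below.
srun -M cm4 -p cm4_tiny --qos=cm4_tiny -c 24 --hint=multithread \
  --time=02:00:00 --pty bash -i
# Inside the interactive shell on the compute node, load the modules
# (or activate the conda environment) and run nextflow as in the
# "Partial node" example.
```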
### Test setup
The test was performed using the `test_full` profile of nf-core/rnaseq, with a customized samplesheet containing a total of 24 samples.
We compared the performance of local and flux on a single node, and the scaling of flux across 1, 2, or 4 nodes.
| Executor | # Nodes | Wall time (hh:mm:ss) |
|---|---|---|
| local | 1 | 11:06:57 |
| flux | 1 | 08:15:11 |
| flux | 2 | 04:45:39 |
| flux | 4 | 03:36:09 |
This is a short summary of a more extensive test, kindly conducted by Martin Ohlerich at LRZ. If you would like to learn more, please take a look here.
## Examples
### Full node(s)
When running a nextflow pipeline on one or more full nodes, we advise using flux.
Some specific settings are required to make flux use all available logical processing units when running inside a SLURM job; these are set correctly in the example script. The script below requests 4 nodes.
```bash
#!/bin/bash
#SBATCH -D .
#SBATCH -o log.%x.%j
#SBATCH -J nf_flux_hwt_4N
#SBATCH --get-user-env
#SBATCH -M cm4
#SBATCH -p cm4_std
#SBATCH --qos=cm4_std
#SBATCH --nodes=4-4 # 1-1, 2-2, 3-3, or 4-4
#SBATCH --ntasks-per-node=2
#SBATCH -c 112
#SBATCH --hint=multithread
#SBATCH --export=none
#SBATCH --time=1-00:00:00 # Max of 2 days

# Please use either modules OR a conda environment (see Setup above)
module load nextflow/25.04.2 apptainer/1.3.4 flux
# conda activate nf-env

# Write commands to a heredoc to pass to flux start.
# For runs that are not using the test profile, modify accordingly.
# The complete nextflow command needs to be here.
cat > workflow.sh << EOT
#!/bin/bash
nextflow run nf-core/rnaseq \
  -profile test,lrz_cm4
EOT

# Make file executable
chmod u+x workflow.sh

# Start flux via srun
srun --export=all --mpi=none flux start ./workflow.sh
```

#### Usage of logical CPU with flux
By default, flux discovers physical CPUs. To make use of the logical CPUs available, the following settings are required:

- SLURM has to use multithreading
- two flux brokers are required per node, each serving half of the logical CPUs (there are two logical per physical CPU); for this reason, we start with `--ntasks-per-node=2`
- flux must not be cpu-bound by SLURM/srun
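As an optional sanity check (not part of the original setup), you could add a `flux resource list` call at the top of workflow.sh to confirm that the flux instance sees all logical CPUs of the allocation:

```bash
# Optional check inside workflow.sh, before the nextflow command:
# lists the nodes and cores visible to this flux instance. With the settings
# above, the core count should match the logical CPUs across all nodes.
flux resource list
```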
### Partial node
Run nextflow inside a SLURM job using either local or flux for job scheduling within the SLURM allocation. We recommend runs on less than one full node only for testing purposes.
In case the cm4_tiny partition of the cm4 cluster, the serial partition of the serial cluster, or the teramem partition of the inter cluster is to be used (i.e. if the job requires less than one full node), please prepare a script similar to the one below:
NB: This config assumes that memory is not requested explicitly and computes the memory resourceLimit as 4.5 GB * number of CPUs (e.g. 24 CPUs * 4.5 GB = 108 GB for the script below).
```bash
#!/bin/bash
#SBATCH -D .
#SBATCH -o log.%x.%j
#SBATCH -J nf_partial_node
#SBATCH --get-user-env
#SBATCH -M cm4
#SBATCH -p cm4_tiny
#SBATCH --qos=cm4_tiny
#SBATCH -c 24
#SBATCH --hint=multithread
#SBATCH --export=none
#SBATCH --time=1-00:00:00 # Max of 2 days

# Load the required tools (or activate your conda environment, see Setup above)
module load nextflow/25.04.2 apptainer/1.3.4

nextflow run nf-core/rnaseq \
  -profile test,lrz_cm4
```

## Config file
```nextflow
/* ----------------------------------------------------
 * Nextflow config file for the LRZ cm4 cluster
 * ----------------------------------------------------
 */

manifest {
  name        = 'LRZ CM4 Configuration'
  author      = 'Niklas Schandry, Amit Fenn, Frederik Dröst'
  homePage    = 'plantmicrobe.de'
  description = 'Configuration for LRZ CM4 cluster'
}

params {
  // Configuration metadata
  config_profile_name        = 'LRZ CM4'
  config_profile_description = 'LRZ CM4 configuration'
  config_profile_contact     = 'Niklas Schandry (@nschan), Amit Fenn (@amitfenn)'
  config_profile_url         = 'https://doku.lrz.de/job-processing-on-the-linux-cluster-10745970.html/'
  config_version             = '1.0.0'

  // Default output directory (relative to launch directory)
  outdir = 'results'
}

apptainer {
  enabled    = true
  autoMounts = true
}

process {
  executor =
    System.getenv("FLUX_URI") ? // If this is set we are in a flux-in-slurm situation
      'flux' :                  // Since we only support flux and local approaches, the alternative is local
      'local'
  resourceLimits = [
    cpus: System.getenv("SLURM_CPUS_ON_NODE") ?         // for <1 node, we use slurm and can use this var
      System.getenv("SLURM_CPUS_ON_NODE").toInteger() : // for >1 node, we use flux; the maximum we can allocate to one job is 112 CPUs
      112,
    memory: System.getenv("SLURM_CPUS_ON_NODE") ?       // if we are in a slurm job, we assume that MEM-per-CPU is 4.5GB
      (System.getenv("SLURM_CPUS_ON_NODE").toInteger() * 4500.MB) :
      480.GB // if we are not in a slurm job, we are in a node-spanning flux job, and one job can request up to 480.GB (488GB available per node)
  ]
}

trace {
  enabled   = true
  overwrite = true
}

report {
  enabled   = true
  overwrite = true
}

timeline {
  enabled   = true
  overwrite = true
}

dag {
  enabled   = true
  overwrite = true
}
```