Portable Batch System (PBS) is a widely used workload management system for High-Performance Computing (HPC) environments. It manages and schedules jobs on computing clusters by distributing them across available resources such as CPUs and GPUs. This article introduces the PBS directives and commands that are most useful when working on the Aziz supercomputer, so that users can submit, monitor, and manage computational tasks efficiently.
1. Basic PBS Script Structure
A typical PBS script, preferably saved with a .pbs extension, consists of the following sections:
- PBS directives: Lines beginning with #PBS that tell the PBS system what the job needs, such as its name, queue, compute nodes, and time limit. They are placed at the top of the script, before the first command.
- Job script: This section contains the actual commands that you want to execute on the computing cluster.
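The two sections can be seen in a minimal sketch. The file name minimal_job.pbs, the job name, and the five-minute wall time are illustrative assumptions, not Aziz defaults; $PBS_O_WORKDIR is the environment variable PBS sets to the directory from which the job was submitted.

```shell
#!/bin/bash
# Write a minimal two-part PBS script to a file (hypothetical name: minimal_job.pbs).
cat > minimal_job.pbs <<'EOF'
#!/bin/bash
# Part 1: PBS directives (read by the scheduler, treated as comments by the shell)
#PBS -N minimal_job
#PBS -l walltime=00:05:00

# Part 2: job script (ordinary shell commands that run on the compute node)
cd "$PBS_O_WORKDIR"
echo "Running on $(hostname)"
EOF

# Count the directive lines to confirm the file was written as intended.
directive_count=$(grep -c '^#PBS ' minimal_job.pbs)
echo "directives: $directive_count"
```

A file like this would typically be submitted with qsub minimal_job.pbs and followed with qstat; the queue and resource requests that Aziz requires still need to be added.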
2. Common PBS Directives
1. #PBS -N jobname
- Specifies the job name.
2. #PBS -q queue_name
- Specifies the queue to which the job should be submitted.
3. #PBS -l select=number:ncpus=cores_per_node
- Requests a specific number of nodes with a specified number of cores per node.
4. #PBS -l walltime=hours:minutes:seconds
- Sets the maximum wall-clock time for the job.
5. #PBS -o output_file
- Redirects standard output to a specified file.
6. #PBS -e error_file
- Redirects standard error to a specified file.
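As a rough illustration of how the six directives look with concrete values, the sketch below prints them from a small helper. The function names to_walltime and print_directives, the job name test_run, the placeholder queue_name, and the figures of 2 nodes with 24 cores each are all assumptions made for the example, not Aziz settings.

```shell
#!/bin/bash
# Convert a duration in seconds to the hours:minutes:seconds form used by walltime.
to_walltime() {
    local total=$1
    printf '%02d:%02d:%02d' $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

# Print the six common directives for the given values (all arguments are illustrative).
print_directives() {
    local name=$1 queue=$2 nodes=$3 cores=$4 seconds=$5
    echo "#PBS -N ${name}"
    echo "#PBS -q ${queue}"
    echo "#PBS -l select=${nodes}:ncpus=${cores}"
    echo "#PBS -l walltime=$(to_walltime "$seconds")"
    echo "#PBS -o ${name}.out"
    echo "#PBS -e ${name}.err"
}

# Example: 2 nodes, 24 cores per node, 5400 seconds (1 hour 30 minutes).
print_directives test_run queue_name 2 24 5400
```

The printed lines can be pasted at the top of a PBS script once queue_name is replaced with a real queue on the system.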
3. Example PBS Script
The Aziz supercomputer is equipped with different types of compute nodes, including both CPU and GPU nodes. The following is a general PBS example. For specific PBS scripts to run jobs on CPUs or GPUs, please refer to the corresponding sections.
Bash
#!/bin/bash
# Job name
#PBS -N my_job
# Queue for the target compute nodes: GPU (A100) or CPU (thin or fat)
#PBS -q queue_name
# Number of nodes and CPU cores per node
#PBS -l select=number_of_nodes:ncpus=cpu_cores_per_node
# Maximum wall time of 2 hours (a reasonable limit for a first run)
#PBS -l walltime=02:00:00
# Files for standard output and standard error
#PBS -o output.txt
#PBS -e error.txt
# Change to the directory that contains your code
cd /path/to/my_files/
# Load necessary modules
module load module_name
# Execute your script (use the line that matches your program)
python code_file.py    # run command for Python files
./code_file.exe        # run command for compiled C executables
# Or execute a parallel program using MPI; set the number of MPI processes
# (mpiprocs) and OpenMP threads (ompthreads) in the select statement above
mpirun ./my_parallel_program
# Or execute your script with timing information
time python code_file.py
# unload modules after execution
module purge
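For the MPI case mentioned in the script comments, the sketch below shows how mpiprocs and ompthreads might be written into the select statement and how the total number of MPI processes follows from it. The layout (2 nodes, 24 cores per node, 4 MPI processes per node) is a hypothetical example, and the -np option assumes a common mpirun implementation; the launcher and core counts on Aziz may differ.

```shell
#!/bin/bash
# Hypothetical layout: 2 nodes, 24 cores each, 4 MPI processes per node.
nodes=2
ncpus=24
mpiprocs=4

# Give each MPI process an equal share of the cores as OpenMP threads.
ompthreads=$((ncpus / mpiprocs))

# Build the select directive and the matching launch command.
select_line="#PBS -l select=${nodes}:ncpus=${ncpus}:mpiprocs=${mpiprocs}:ompthreads=${ompthreads}"
total_ranks=$((nodes * mpiprocs))

echo "$select_line"
echo "mpirun -np ${total_ranks} ./my_parallel_program"
```

Keeping mpiprocs multiplied by ompthreads equal to ncpus is a simple way to avoid oversubscribing the cores requested on each node.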
Remember: Always check the knowledge base or consult the support team for the specific PBS options and configurations available on Aziz. By understanding and using PBS scripts effectively, you can optimize your computational workloads and make the most of the Aziz supercomputer's resources.