Job Submission – Hello World Examples for NU HPC System

This section provides working examples of SLURM job submission scripts for serial, MPI, and GPU "Hello World" tasks. Each example can be directly copied and run on the NU HPC system.

Single-Core Serial Job Example

Step 1: Creating the Python Script

First, create a Python script called hello_serial.py. This program prints "Hello World".

print("Hello World from Serial!")

Step 2: Creating the SLURM Batch Script

Next, create the SLURM batch script called submit_serial.sh to submit the serial job. This script specifies job parameters and runs the Python program:

#!/bin/bash
#SBATCH --job-name=hello_serial        # Job name
#SBATCH --ntasks=1                     # Use a single core
#SBATCH --time=0-00:05:00              # Time limit (5 minutes)
#SBATCH --output=serial_output_%j.log  # Standard output log
#SBATCH --error=serial_error_%j.log    # Standard error log
#SBATCH --mem=1G                       # Memory limit
#SBATCH --partition=CPU                # Partition to submit to

# Load Python module (adjust as necessary)
module load python/3.8.5

# Run the Python script
python3 hello_serial.py
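
Once both files are in place, submit the batch script with sbatch and check the log after the job finishes. A minimal session is sketched below; the job ID 12345 is only illustrative, and %j in the log file name is replaced with the job ID SLURM actually assigns.

# Submit the job; sbatch prints the assigned job ID
sbatch submit_serial.sh

# Check whether the job is still pending or running
squeue -u $USER

# Once the job has finished, view the output log (replace 12345 with your job ID)
cat serial_output_12345.log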

MPI Job Example

Step 1: Creating the MPI Program

Create a simple MPI program in C called hello_mpi.c in which each process prints "Hello World" along with its rank.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    printf("Hello World from process %d!\n", world_rank);
    MPI_Finalize();
    return 0;
}
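
If you want a quick sanity check before going through SLURM, the program can also be compiled and run interactively, assuming the OpenMPI module is available on the node you are working on; the batch script below performs the same steps inside the job.

# Load the MPI compiler and runtime
module load OpenMPI

# Compile the MPI program
mpicc -o hello_mpi hello_mpi.c

# Run 4 processes; each rank prints its own line
mpirun -np 4 ./hello_mpi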

Step 2: Creating the SLURM Batch Script

Now, create the SLURM batch script called submit_mpi.sh to submit the MPI job:

#!/bin/bash
#SBATCH --job-name=hello_mpi            # Job name
#SBATCH --nodes=2                        # Number of nodes
#SBATCH --ntasks=4                       # Total number of tasks (CPUs)
#SBATCH --time=0-00:05:00                # Time limit (5 minutes)
#SBATCH --mem=2G                         # Memory limit
#SBATCH --partition=CPU                  # Partition to submit to
#SBATCH --output=mpi_output_%j.log      # Standard output log
#SBATCH --error=mpi_error_%j.log        # Standard error log
# Load MPI module
module load OpenMPI

# Compile the MPI program
mpicc -o hello_mpi hello_mpi.c

# Run the MPI program
mpirun -np 4 ./hello_mpi
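
Submit the job with sbatch submit_mpi.sh and, once it completes, check mpi_output_<jobid>.log. Because the script requests 4 tasks, the log should contain one line per rank, in no particular order, for example:

Hello World from process 0!
Hello World from process 2!
Hello World from process 1!
Hello World from process 3!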

GPU Job Example

Step 1: Creating the CUDA Program

Create a simple CUDA program called hello_gpu.cu that prints "Hello World".

#include <stdio.h>

__global__ void helloFromGPU() {
    printf("Hello World from GPU!\n");
}

int main() {
    helloFromGPU<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}

Step 2: Creating the SLURM Batch Script

Next, create the SLURM batch script called submit_gpu.sh to submit the GPU job:

#!/bin/bash
#SBATCH --job-name=hello_gpu           # Job name
#SBATCH --output=gpu_output_%j.log     # Standard output log
#SBATCH --error=gpu_error_%j.log       # Standard error log
#SBATCH --nodes=1                       # Number of nodes
#SBATCH --ntasks=1                      # Number of tasks (CPUs)
#SBATCH --time=0-00:05:00               # Time limit (5 minutes)
#SBATCH --partition=NVIDIA              # Partition to submit to
#SBATCH --mem=1G                        # Memory limit
# Load CUDA module
module load CUDA

# Compile the CUDA program
nvcc -o hello_gpu hello_gpu.cu

# Run the CUDA program
./hello_gpu
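
As with the previous examples, submit the script and check the log once the job completes; the job ID 12345 is only illustrative.

# Submit the GPU job
sbatch submit_gpu.sh

# Monitor the job while it is queued or running
squeue -u $USER

# After completion, the log should contain "Hello World from GPU!"
cat gpu_output_12345.log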

Useful SLURM Commands

Command                    Description
sbatch <file>              Submit a job script
squeue                     View the current job queue
scancel <jobid>            Cancel a job by job ID
sinfo                      Show information about available partitions
scontrol show job <jobid>  Display detailed information about a specific job
sacct                      View historical job accounting information
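
As an illustration, a typical workflow using these commands might look like the following; the job ID 12345 is only an example.

# Show available partitions and their state
sinfo

# Submit a job script and note the job ID reported by sbatch
sbatch submit_serial.sh

# List your own jobs in the queue
squeue -u $USER

# Show detailed information about a specific job
scontrol show job 12345

# After the job has finished, review its accounting record
sacct -j 12345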

By following these steps, you can quickly test serial, MPI, and GPU jobs on the NU HPC system. Just copy the scripts, adjust as needed, and submit them using SLURM.