Software 2: Difference between revisions

From NU HPC Wiki
m (Admin moved page Software to Software 2 without leaving a redirect)
 
(34 intermediate revisions by the same user not shown)
Line 1: Line 1:
This page provides an overview of the main software used in Shabyt, along with guidelines on their operational workflows. It is essential to note that while this resource covers a significant amount of the software used in Shabyt, it is not a full list.
This page gives an overview of the software available in NU HPC facilities and explains how to use it.    


Additionally, this resource incorporates practical insights and instructions on navigating and utilizing these software tools effectively. The emphasis is on presenting the information in a manner that is accessible to individuals new to these platforms and valuable to those seeking a deeper academic understanding.
== Software Installation ==
Software installation on the Shabyt system follows specific criteria to ensure compatibility and effective utilization of resources. Users can request the installation of new software if it meets the following conditions:


Overall, this page serves as a valuable resource for anyone interested in gaining knowledge about the primary software landscape within Shabyt and learning how to engage with these tools efficiently.
* '''Availability and Licensing:''' The software must be freely available or covered by a site license held by NU.
* '''Compatibility:''' It should be compatible with the existing operating system environment on Shabyt to ensure seamless integration and functionality.
* '''Resource Utilization:''' The software should be able to effectively utilize the resources available on Shabyt, optimizing performance and efficiency.


For guidance or support regarding the installation of new software packages, users should contact the Shabyt system administrators at hpcadmin@nu.edu.kz.
Additionally, software are installed in accordance with priorities.
* '''Priority 1:''' Software that can be installed using the EasyBuild application is given first priority. A list of supported EasyBuild software can be found [https://docs.easybuild.io/version-specific/supported-software/#arcashla here].
* '''Priority 2:''' Applications which can't be installed through EasyBuild, but essential for multiple User Groups are prioritized next.
* '''Priority 3:''' Application which can't be installed through EasyBuild, but essential for individual users.
It's important to know that this isn't a complete list of all the software in Shabyt system.
== Environment Modules ==
== Environment Modules ==
Shabyt uses Environment modules to dynamically set up environments for different software. Module commands set, change, or delete environment variables that are needed for a particular software. The ‘<code>module load</code>‘ command will set ''PATH'', ''LD_LIBRARY_PATH'' and other environment variables such that user may choose a desired version of applications or libraries more easily. More details can be found [https://lmod.readthedocs.io/en/latest/ here].
In linux environment variables are values that can change and impact how programs behave on a computer system. They are name-value pairs that all processes can access within a particular user environment or shell session. These variables provide a flexible and convenient method for managing system-wide settings, configuring applications, and customizing system behavior.
 
Shabyt uses Environment modules (also know as LMOD) to dynamically set up environment variables for different software. Module commands set, change, or delete environment variables that are needed for a particular software. The ‘<code>module load</code>‘ command will set ''PATH'', ''LD_LIBRARY_PATH'' and other environment variables such that user may choose a desired version of applications or libraries more easily. More details can be found [https://lmod.readthedocs.io/en/latest/ here].
{| class="wikitable"
{| class="wikitable"
|+Environment module commands
|+Environment module commands
Line 44: Line 58:


== Anaconda ==
== Anaconda ==
[[File:Conda logo.svg.png|centre|535x535px]]
'''Description:''' Anaconda, also known as "conda," is a tool for managing Python packages. It helps you create virtual environments for different Python and package versions. You can use Anaconda to install, remove, and update packages within your project environments. For instance you can create virtual environment for game development which requires Pygame with version of Python and you can create environment for machine learning which requires Pytorch with new version of Python. 


'''Description:''' "Anaconda" (shortly "conda"), a Python package management, which helps you create an environment for many different versions of Python and package versions. Anaconda is also used to install, remove, and upgrade packages in your project environments.  
'''Usage:''' module load Anaconda3/2022.05 


'''Usage:''' module load Anaconda3/2022.05 
'''Working with Anaconda environments'''


=== Working with Anaconda environments ===
Below is a list of main commands you should use in order to start working with Anaconda.
# Check out available environments: <code>conda env list</code>
# To Check available environments, please type: <code>conda env list</code>
# View a list of packages in an environment
# View a list of packages in an environment
#* If the environment is not activated: <code>conda list -n tensorflow-gpu</code>
#* If the environment is not activated, please type: <code>conda list -n virtualenv</code>
#* If the environment is activated: <code>conda list</code>
#* If the environment is activated, then type: <code>conda list</code>
# Create Conda environment
# Create Conda environment
#* Create an environment: <code>conda create -n virtualenv</code>
#* Create an environment: <code>conda create -n virtualenv</code>
Line 64: Line 78:
<code>conda remove -n virtualenv --all</code> or <code>conda env remove -n virtualenv</code>
<code>conda remove -n virtualenv --all</code> or <code>conda env remove -n virtualenv</code>


=== Working with packages ===
'''Working with packages'''
 
Install packages into ''virtualenv'' environment
Install packages into ''virtualenv'' environment


* If the environment is not activated : <code>conda --name virtualenv install PACKAGENAME</code>
* If the environment is not activated, please type: <code>conda --name virtualenv install PACKAGENAME</code>
* If the environment is activated: <code>conda install PACKAGENAME</code>
* If the environment is activated, please type: <code>conda install PACKAGENAME</code>
* Install multiple packages at once: <code>conda install pkg1 pkg2 pkg3</code>
* If you want to install multiple packages at once: <code>conda install pkg1 pkg2 pkg3</code>
* Install package with specific version: <code>conda install numpy=1.15.2</code>
* If you need to install package with specific version: <code>conda install numpy=1.15.2</code>


'''External links'''   
'''External links'''   
Line 81: Line 96:


[https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf Conda Cheat Sheet]
[https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf Conda Cheat Sheet]
== Ansys ==
'''Description:''' The ANSYS suite of tools can be used to numerically simulate a wide range of structural and fluid dynamics issues encountered in several engineering, physics, medical, aerospace, and automotive sector applications.
'''Usage:''' Loading the ANSYS module module load ansys/2022r1 Launching the workbench is accomplished by: runwb2 The workbench provides access to Fluent, CFX, ICEM, Mechanical APDL/model, and many other languages and models. The appropriate GUIs can be launched outside of the workbench using fluent, cfx5pre, icemcfd, and launcher.


== CUDA ==
== CUDA ==
'''Description:''' Nvidia created the parallel computing platform and programming model known as CUDA for use with its GPUs for general computing (graphics processing units). By utilizing the capability of GPUs for the parallelizable portion of the calculation, CUDA enables developers to accelerate computationally heavy applications.
CUDA (Compute Unified Device Architecture) is a parallel computing platform and application programming interface model created by NVIDIA. It allows to use NVIDIA graphics processing units (GPUs) for general purpose processing.  


'''Usage:''' module load CUDA/11.4.1 To check if CUDA has been loaded, type: <code>nvcc --version</code>
'''Usage:''' module load cuda/11.4.1


To check GPUs on GPU nodes: <code>nvidia-smi --list-gpus</code>
== GCC ==
== GCC ==
The GNU Compiler Collection, commonly known as GCC, is a set of compilers and development tools available for Linux, Windows, various BSDs, and a wide assortment of other operating systems. It includes support primarily for C and C++ and includes Objective-C, Ada, Go, Fortran, and D. The Free Software Foundation (FSF) wrote GCC and released it as completely free (as in libre) software.
The GNU Compiler Collection, commonly known as GCC, is a set of compilers and development tools available for Linux, Windows, various BSDs, and a wide assortment of other operating systems. It includes support primarily for C and C++ and includes Objective-C, Ada, Go, Fortran, and D. The Free Software Foundation (FSF) wrote GCC and released it as completely free (as in libre) software.
Line 99: Line 110:
When you run GCC on a source code file, it first uses a preprocessor to include header files and discard comments. Next, it tokenizes the code, expands macros, detects any compile-time issues, then prepares it for compilation. It is then sent to the compiler, which creates syntax trees of the program’s objects and control flow and uses those to generate assembly code. The assembler then converts this code into the binary executable format of the system. Finally, the linker includes references to any external libraries as needed. The finished product is then executable on the target system.
When you run GCC on a source code file, it first uses a preprocessor to include header files and discard comments. Next, it tokenizes the code, expands macros, detects any compile-time issues, then prepares it for compilation. It is then sent to the compiler, which creates syntax trees of the program’s objects and control flow and uses those to generate assembly code. The assembler then converts this code into the binary executable format of the system. Finally, the linker includes references to any external libraries as needed. The finished product is then executable on the target system.


=== '''GCC examples''' ===
'''GCC examples'''
Compiling a program with GCC can be a straightforward matter/<syntaxhighlight lang="bash">
gcc hello.c -o hello
</syntaxhighlight>Running this command processes the hello.c file and generates a binary called “hello”. Additional parameters can be passed.<syntaxhighlight lang="bash">
gcc hello.c -O3 -o hello
</syntaxhighlight>In this example, the optimization parameter is set to 3, leading to more optimized code generation.


Additional libraries can be included as well.<syntaxhighlight lang="bash">
Compiling a program with GCC can be a straightforward matter
gcc hello.c -lncurses -o hello
</syntaxhighlight>This example includes the ncurses library.


More complex compilations are managed by ''Makefiles'' and are invoked with the “make” command.
<code>gcc hello.c -o hello</code>
 
Running this command processes the hello.c file and generates a binary called “hello”.
 
Additional parameters can be passed.


'''External link'''
<code>gcc hello.c -O3 -o hello</code>


GNU Compiler Collection official page: https://gcc.gnu.org/\
In this example, the optimization parameter is set to 3, leading to more optimized code generation.


== GROMACS ==
More complex compilations are managed by ''Makefiles'' and are invoked with the “make” command.
'''Description:''' GROMACS is a flexible package for performing molecular dynamics, simulating the Newtonian equations of motion for systems containing hundreds of thousands to millions of particles. It is intended for biochemical molecules, such as proteins, lipids, and nucleic acids, with complex bonded interactions. However, GROMACS is fast at calculating nonbonded interactions, so many groups use it for non-biological systems, like polymers.


'''Usage:''' To load GROMACS software: module load GROMACS/2021.5-foss-2021b-CUDA-11.4.1 The GROMACS executable is either gmx or gmx mpi if an OpenMPI module is used. When you type gmx help commands, a list of gmx commands and their functions will be displayed.
'''External link'''


'''Batch jobs:''' Users are encouraged to create their own scripts for batch submissions. Below are examples of batch submission scripts.
[https://gcc.gnu.org/ Official Page ]


Parallel MPI

#!/bin/bash
#SBATCH --job-name=gromacs
#SBATCH --mail-user=<YOUR_NU_ID>@nu.edu.kz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --output=gmx-%j.out
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=4
#SBATCH --ntasks-per-socket=1
#SBATCH --time=24:00:00
#SBATCH --mem-per-cpu=1gb
module purge
module load OpenMPI/4.1.1-GCC-11.2.0
module load GROMACS/2021.5-foss-2021b-CUDA-11.4.1
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun --mpi=pmix_v3 gmx mdrun -ntomp ${SLURM_CPUS_PER_TASK} -s topol.tpr
== Apptainer ==
Singularity is an open-source application for creating and running software containers, designed primarily for high-performance computing on shared Linux-based computing clusters like CARC systems.
Singularity containers provide a '''custom user space''' and enable portable, reproducible, stable, and secure software environments on Linux systems. A Singularity container bundles a primary application and all of its dependencies into a single image file, which can also include data, scripts, and other files if desired. In addition, Singularity containers have direct access to the Linux kernel on the host system (e.g., Discovery or Endeavour compute nodes), so there is no substantial performance penalty when using a container compared to using natively installed software on the host system.


With Singularity, you can:


== Software Installation ==
* Install anything you want (based on any Linux operating system)
Software installation on the Shabyt system follows specific criteria to ensure compatibility and effective utilization of resources. Users can request the installation of new software provided it meets the following conditions:
* Ease installation issues by using pre-built container images
* Ensure the same software stack is used among a research group
* Use the same software stack across Linux systems (e.g., any HPC center or cloud computing service)


* '''Availability and Licensing:''' The software must be freely available or covered by a site license held by NU.
== Ansys ==
* '''Compatibility:''' It should be compatible with the existing operating system environment on Shabyt to ensure seamless integration and functionality.
'''Description:''' The ANSYS suite of tools can be used to numerically simulate a wide range of structural and fluid dynamics issues encountered in several engineering, physics, medical, aerospace, and automotive sector applications.
* '''Resource Utilization:''' The software should be able to effectively utilize the resources available on Shabyt, optimizing performance and efficiency.


For guidance or support regarding the installation of new software packages, users can contact the Shabyt system administrators at hpcadmin@nu.edu.kz.
'''Usage:''' Loading the ANSYS module module load ansys/2022r1 Launching the workbench is accomplished by: runwb2 The workbench provides access to Fluent, CFX, ICEM, Mechanical APDL/model, and many other languages and models. The appropriate GUIs can be launched outside of the workbench using fluent, cfx5pre, icemcfd, and launcher.


Additionally, software installations are prioritized as follows:
== GROMACS ==
'''Description:''' GROningen MAchine for Chemical Simulations (GROMACS) is a free, open-source, molecular dynamics package. GROMACS can simulate the Newtonian equations of motion for systems with hundreds to millions of particles. GROMACS is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations), many groups are also using it for research on non-biological systems, e.g. polymers.


* '''EasyBuild Applications:''' Software that can be installed using the EasyBuild application is given first priority. A list of supported EasyBuild software can be found [https://docs.easybuild.io/version-specific/supported-software/#arcashla here].
'''Usage:''' To load GROMACS software: module load GROMACS/2021.5-foss-2021b-CUDA-11.4.1 The GROMACS executable is either gmx or gmx mpi if an OpenMPI module is used. When you type gmx help commands, a list of gmx commands and their functions will be displayed.
* '''Critical User Group Applications:''' Applications essential for User Groups but not installable through EasyBuild are prioritized next, ensuring their availability and functionality for specific needs.
* '''Individual User Requests:''' Requests from individual users are processed after addressing the priorities mentioned above, ensuring a systematic approach to software installations based on need and compatibility.

Latest revision as of 04:37, 3 July 2024

This page gives an overview of the software available in NU HPC facilities and explains how to use it.

Software Installation

Software installation on the Shabyt system follows specific criteria to ensure compatibility and effective utilization of resources. Users can request the installation of new software if it meets the following conditions:

  • Availability and Licensing: The software must be freely available or covered by a site license held by NU.
  • Compatibility: It should be compatible with the existing operating system environment on Shabyt to ensure seamless integration and functionality.
  • Resource Utilization: The software should be able to effectively utilize the resources available on Shabyt, optimizing performance and efficiency.

For guidance or support regarding the installation of new software packages, users should contact the Shabyt system administrators at hpcadmin@nu.edu.kz.

Additionally, software is installed in the following order of priority:

  • Priority 1: Software that can be installed using the EasyBuild application is given first priority. A list of supported EasyBuild software can be found at https://docs.easybuild.io/version-specific/supported-software/#arcashla.
  • Priority 2: Applications that cannot be installed through EasyBuild but are essential for multiple user groups are prioritized next.
  • Priority 3: Applications that cannot be installed through EasyBuild but are essential for individual users are given third priority.

Note that this page is not a complete list of all the software installed on the Shabyt system.

Environment Modules

In Linux, environment variables are named values that can change and that affect how programs behave on a computer system. They are name-value pairs that processes can access within a particular user environment or shell session. These variables provide a flexible and convenient method for managing system-wide settings, configuring applications, and customizing system behavior.

Shabyt uses Environment Modules (provided by Lmod) to dynamically set up environment variables for different software. Module commands set, change, or delete the environment variables that a particular piece of software needs. The 'module load' command sets PATH, LD_LIBRARY_PATH and other environment variables so that users can select the desired version of an application or library more easily. More details can be found in the Lmod documentation at https://lmod.readthedocs.io/en/latest/.
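As a rough illustration of what loading a module does behind the scenes, the sketch below sets and exports the same kind of variables by hand. The directory $HOME/apps/mytool-1.0 and the variable MYTOOL_HOME are hypothetical names chosen for this example; on Shabyt the equivalent changes are made for you by 'module load'.

```shell
#!/bin/sh
# Hypothetical install prefix; a real modulefile would point at the actual software directory.
MYTOOL_HOME="$HOME/apps/mytool-1.0"

# A plain assignment is visible only in the current shell.
# export places the variable in the environment inherited by child processes.
export MYTOOL_HOME

# Prepend the tool's directories so that its binaries and libraries are found first.
export PATH="$MYTOOL_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$MYTOOL_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# A child process sees the exported value.
child_view=$(sh -c 'printf "%s" "$MYTOOL_HOME"')
echo "Child process sees MYTOOL_HOME=$child_view"

# The first PATH entry is now the tool's bin directory.
first_path_entry=${PATH%%:*}
echo "First PATH entry: $first_path_entry"
```

Unloading a module reverses these changes, which is why modules are more convenient than editing PATH manually.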

Environment module commands

Command                                 Description
module avail                            List the available software modules
module keyword [word]                   Search for available modules matching the keyword
module spider [word]                    Show the details of any modules matching the keyword
module whatis [module]                  Show a short description of the module
module load [package1] [package2]       Load the environment for the default version of each modulefile
module load [package]/[version]         Load the environment for the specified version of the module
module unload [package1] [package2]     Unload previously loaded packages
module swap [moduleA] [moduleB]         Unload modulefile A and load modulefile B
module list                             List any currently loaded module(s)
module purge                            Unload all currently loaded modules
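A typical session might combine these commands as in the sketch below. The module name Anaconda3/2022.05 is the one used elsewhere on this page; whether it is installed should be confirmed with 'module avail'. The check at the top is an addition for this example so that the script also runs cleanly on machines where the module command is not defined.

```shell
#!/bin/bash
# Sketch of a typical module session; adjust module names to what `module avail` reports.
if type module >/dev/null 2>&1; then
    module purge                     # start from a clean environment
    module avail Anaconda3           # list modules whose name matches "Anaconda3"
    module load Anaconda3/2022.05    # load a specific version
    module list                      # confirm what is currently loaded
    module_status="available"
else
    # `module` is normally defined as a shell function at login on the cluster.
    echo "The module command is not available in this shell." >&2
    module_status="unavailable"
fi
echo "Module command: $module_status"
```

Starting with 'module purge' keeps earlier sessions from leaving conflicting versions loaded.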

Anaconda

Description: Anaconda is a Python distribution that includes conda, a tool for managing Python packages and environments. It helps you create virtual environments with different Python and package versions. You can use conda to install, remove, and update packages within your project environments. For instance, you can create one virtual environment for game development that requires Pygame with one version of Python, and another for machine learning that requires PyTorch with a newer version of Python.

Usage: module load Anaconda3/2022.05

Working with Anaconda environments

The main commands for working with Anaconda environments are listed below.

  1. List the available environments: conda env list
  2. View a list of packages in an environment
    • If the environment is not activated: conda list -n virtualenv
    • If the environment is activated: conda list
  3. Create a Conda environment
    • Create an environment: conda create -n virtualenv
    • Create an environment with a specific Python version: conda create -n virtualenv python=3.12
    • Create an environment in a target directory: conda create -p /shared/home/{username}/.conda/envs/virtualenv
  4. Activate an environment: source activate virtualenv
  5. Deactivate an environment: conda deactivate
  6. Remove an environment: conda remove -n virtualenv --all or conda env remove -n virtualenv

Working with packages

Install packages into the virtualenv environment

  • If the environment is not activated: conda install --name virtualenv PACKAGENAME
  • If the environment is activated: conda install PACKAGENAME
  • To install multiple packages at once: conda install pkg1 pkg2 pkg3
  • To install a package with a specific version: conda install numpy=1.15.2
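The environment and package commands above can be combined into one short session. The sketch below is an illustration only: it creates the environment under a temporary directory with -p (a hypothetical throwaway location, so nothing is left in your home directory), installs numpy as an example package, and removes the environment again. It assumes the Anaconda module is already loaded and that the node can reach a package channel.

```shell
#!/bin/bash
# Throwaway environment location (hypothetical); use -n virtualenv for a named environment instead.
env_dir="$(mktemp -d)/demo-env"

if command -v conda >/dev/null 2>&1; then
    conda create -p "$env_dir" -y python=3.12   # create the environment with a chosen Python version
    conda install -p "$env_dir" -y numpy        # install a package without activating the environment
    conda list -p "$env_dir" numpy              # show the installed numpy entry
    conda env remove -p "$env_dir" -y           # clean up
    conda_status="attempted"
else
    echo "conda not found; load the Anaconda module first." >&2
    conda_status="conda not found"
fi
echo "Conda session: $conda_status"
```

Passing -y answers the confirmation prompts automatically, which is convenient in scripts and batch jobs.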

External links

Documentation

User Guide

Video

Conda Cheat Sheet

CUDA

CUDA (Compute Unified Device Architecture) is a parallel computing platform and application programming interface model created by NVIDIA. It allows NVIDIA graphics processing units (GPUs) to be used for general-purpose processing.

Usage: module load cuda/11.4.1

To check GPUs on GPU nodes: nvidia-smi --list-gpus
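A quick way to verify the CUDA setup after loading the module is sketched below. It checks for the nvcc compiler and lists the visible GPUs; the guards are additions for this example, since nvidia-smi typically reports GPUs only on GPU nodes.

```shell
#!/bin/bash
# Check that the CUDA compiler is on PATH (it should be after loading the CUDA module).
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
    cuda_toolkit="found"
else
    echo "nvcc not found; load the CUDA module first." >&2
    cuda_toolkit="not found"
fi

# List the GPUs visible on this node; on a node without GPUs this reports none.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_list=$(nvidia-smi --list-gpus 2>/dev/null) || gpu_list="no GPUs visible"
else
    gpu_list="nvidia-smi not found"
fi
echo "CUDA toolkit: $cuda_toolkit"
echo "GPUs: $gpu_list"
```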

GCC

The GNU Compiler Collection, commonly known as GCC, is a set of compilers and development tools available for Linux, Windows, various BSDs, and a wide assortment of other operating systems. It primarily supports C and C++, and also includes front ends for Objective-C, Ada, Go, Fortran, and D. GCC is developed by the GNU Project and released by the Free Software Foundation (FSF) as completely free (as in libre) software.

GCC is a toolchain that compiles code to assembly, assembles it into object code, links it with any library dependencies, and produces executable files. It follows the standard UNIX design philosophy of using simple tools that perform individual tasks well. The GCC development suite utilizes these discrete tools to compile software.

When you run GCC on a source code file, the preprocessor first includes header files, expands macros, and discards comments. The compiler then tokenizes and parses the code, reports any compile-time issues, builds syntax trees of the program's objects and control flow, and uses those to generate assembly code. The assembler converts this assembly into object code in the binary format of the system. Finally, the linker resolves references to any external libraries as needed and produces the finished program, which is then executable on the target system.
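The individual stages can be run one at a time, which makes the pipeline easier to see. The sketch below writes a small hello.c into a temporary directory (the file names and the message are illustrative choices) and then runs each stage separately using the standard -E, -S, and -c options.

```shell
#!/bin/bash
# Work in a temporary directory so no files are left behind in the current one.
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("Hello from GCC"); return 0; }
EOF

if command -v gcc >/dev/null 2>&1; then
    gcc -E "$workdir/hello.c" -o "$workdir/hello.i"   # 1. preprocess: headers included, macros expanded
    gcc -S "$workdir/hello.i" -o "$workdir/hello.s"   # 2. compile: preprocessed C to assembly
    gcc -c "$workdir/hello.s" -o "$workdir/hello.o"   # 3. assemble: assembly to object code
    gcc "$workdir/hello.o" -o "$workdir/hello"        # 4. link: object code to executable
    stage_output=$("$workdir/hello")
else
    stage_output="gcc not available"
fi
echo "$stage_output"
```

In everyday use a single gcc command performs all four stages; the intermediate files are mainly useful for debugging or for inspecting the generated assembly.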

GCC examples

Compiling a program with GCC can be a straightforward matter:

gcc hello.c -o hello

Running this command processes the hello.c file and generates a binary called “hello”.

Additional parameters can be passed.

gcc hello.c -O3 -o hello

In this example, the optimization level is set to 3 (-O3), which asks the compiler to optimize the generated code more aggressively.

More complex compilations are managed by Makefiles and are invoked with the “make” command.
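As a minimal sketch of a Makefile-driven build, the script below generates a one-target Makefile for the same hello.c example and invokes make on it. The file names, variables, and temporary directory are illustrative assumptions; a real project would keep the Makefile next to its sources.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) { puts("Hello from make"); return 0; }
EOF

# Recipe lines in a Makefile must begin with a tab, so printf with \t is used here.
printf 'CC = gcc\nCFLAGS = -O3\n\nhello: hello.c\n\t$(CC) $(CFLAGS) hello.c -o hello\n' > "$workdir/Makefile"

if command -v make >/dev/null 2>&1 && command -v gcc >/dev/null 2>&1; then
    make -C "$workdir"                 # builds the "hello" target; it is rebuilt only when hello.c changes
    make_output=$("$workdir/hello")
else
    make_output="make or gcc not available"
fi
echo "$make_output"
```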

External link

Official Page

Apptainer

Apptainer (formerly known as Singularity) is an open-source application for creating and running software containers, designed primarily for high-performance computing on shared Linux-based computing clusters. Apptainer containers provide a custom user space and enable portable, reproducible, stable, and secure software environments on Linux systems. An Apptainer container bundles a primary application and all of its dependencies into a single image file, which can also include data, scripts, and other files if desired. In addition, Apptainer containers have direct access to the Linux kernel on the host system (e.g., on the compute nodes), so there is no substantial performance penalty when using a container compared to using natively installed software on the host system.

With Apptainer, you can:

  • Install anything you want (based on any Linux operating system)
  • Ease installation issues by using pre-built container images
  • Ensure the same software stack is used among a research group
  • Use the same software stack across Linux systems (e.g., any HPC center or cloud computing service)
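For example, a pre-built image can be pulled from a public registry and used to run a command inside the container. The sketch below is an illustration under assumptions: the ubuntu:22.04 image and the temporary image path are arbitrary choices, and it presumes that the apptainer command is on your PATH and that the node has network access.

```shell
#!/bin/bash
# Store the image file in a temporary directory (hypothetical location).
image="$(mktemp -d)/ubuntu.sif"

if command -v apptainer >/dev/null 2>&1; then
    apptainer pull "$image" docker://ubuntu:22.04   # convert a Docker Hub image into a single .sif file
    apptainer exec "$image" cat /etc/os-release     # run a command inside the container
    container_status="attempted"
else
    echo "apptainer not found on this system." >&2
    container_status="apptainer not found"
fi
echo "Container run: $container_status"
```

The resulting .sif file can be copied to another Linux system and run there with the same apptainer exec command.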

Ansys

Description: The ANSYS suite of tools can be used to numerically simulate a wide range of structural and fluid dynamics problems encountered in engineering, physics, medical, aerospace, and automotive applications.

Usage: Load the ANSYS module:

module load ansys/2022r1

Launch the Workbench with:

runwb2

The Workbench provides access to Fluent, CFX, ICEM, Mechanical APDL/model, and many other tools and models. The corresponding GUIs can also be launched outside of the Workbench using fluent, cfx5pre, icemcfd, and launcher.

GROMACS

Description: GROningen MAchine for Chemical Simulations (GROMACS) is a free, open-source, molecular dynamics package. GROMACS can simulate the Newtonian equations of motion for systems with hundreds to millions of particles. GROMACS is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations), many groups are also using it for research on non-biological systems, e.g. polymers.

Usage: To load the GROMACS software:

module load GROMACS/2021.5-foss-2021b-CUDA-11.4.1

The GROMACS executable is gmx, or gmx_mpi if an OpenMPI module is used. Typing gmx help commands displays a list of gmx commands and their functions.
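GROMACS runs are normally submitted as batch jobs. The sketch below writes a Slurm job script and submits it if sbatch is available. It follows the parallel MPI example recorded in the revision history of this page; the file name gromacs_job.sh is hypothetical, the resource values are examples to adapt, and topol.tpr is assumed to be an existing run input file in the submission directory.

```shell
#!/bin/bash
# Write the job script to a temporary location (hypothetical file name).
job_script="$(mktemp -d)/gromacs_job.sh"

cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=gromacs
#SBATCH --mail-user=<YOUR_NU_ID>@nu.edu.kz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --output=gmx-%j.out
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=4
#SBATCH --ntasks-per-socket=1
#SBATCH --time=24:00:00
#SBATCH --mem-per-cpu=1gb

module purge
module load OpenMPI/4.1.1-GCC-11.2.0
module load GROMACS/2021.5-foss-2021b-CUDA-11.4.1

# One OpenMP thread per CPU assigned to each MPI task.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun --mpi=pmix_v3 gmx mdrun -ntomp ${SLURM_CPUS_PER_TASK} -s topol.tpr
EOF

# Submit only where Slurm is present; otherwise just report where the script was written.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job_script"
else
    echo "sbatch not found; job script written to $job_script"
fi
```

Replace <YOUR_NU_ID> with your own ID before submitting, and check which executable name (gmx or gmx_mpi) the loaded GROMACS module provides.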