Software
The Shabyt HPC cluster provides a large and diverse set of software, including compilers, programming languages, libraries, and applications. Users can also install software of their choice in their home directories.
The following guides provide instructions for finding and loading software, building and installing software, as well as using popular applications.
Software Installation
Software installation on the Shabyt system follows specific criteria to ensure compatibility and effective utilization of resources. Users can request the installation of new software if it meets the following conditions:
- Availability and Licensing: The software must be freely available or covered by a site license held by NU.
- Compatibility: It should be compatible with the existing operating system environment on Shabyt to ensure seamless integration and functionality.
- Resource Utilization: The software should be able to effectively utilize the resources available on Shabyt, optimizing performance and efficiency.
For guidance or support regarding the installation of new software packages, users should contact the Shabyt system administrators at hpcadmin@nu.edu.kz.
Additionally, software is installed according to the following priorities:
- Priority 1: Software that can be installed using the EasyBuild application is given first priority. A list of supported EasyBuild software can be found here.
- Priority 2: Applications that cannot be installed through EasyBuild but are essential for multiple user groups are prioritized next.
- Priority 3: Applications that cannot be installed through EasyBuild but are essential for individual users.
Note that this is not a complete list of all the software available on the Shabyt system.
Environment Modules
In Linux, environment variables are named values that can change and that affect how programs behave. They are name-value pairs that all processes within a particular user environment or shell session can access. These variables provide a flexible and convenient way to manage system-wide settings, configure applications, and customize system behavior.
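The difference between a plain shell variable and an environment variable can be checked directly in the shell. The following is a minimal sketch; the names greeting and DEMO_TOOL_HOME and the path /tmp/demo_tool are made up for illustration and have no meaning on Shabyt.

```shell
# A plain shell variable: visible only inside the current shell.
greeting="hello"

# An exported (environment) variable: inherited by child processes.
# DEMO_TOOL_HOME and its value are hypothetical.
export DEMO_TOOL_HOME="/tmp/demo_tool"

# Ask a child shell what it can see.
child_plain=$(sh -c 'echo "${greeting:-unset}"')
child_exported=$(sh -c 'echo "${DEMO_TOOL_HOME:-unset}"')

echo "plain variable seen by child:    $child_plain"
echo "exported variable seen by child: $child_exported"
```

The child shell reports unset for the plain variable and /tmp/demo_tool for the exported one, which is why software settings are passed through exported variables.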
Shabyt uses environment modules, implemented by Lmod, to dynamically set up environment variables for different software. Module commands set, change, or delete the environment variables that a particular piece of software needs. The module load command sets PATH, LD_LIBRARY_PATH, and other environment variables so that users can more easily choose the desired version of an application or library. More details can be found here.
Command | Description |
---|---|
module avail | List of available software |
module keyword [word] | Search for available modules matching the keyword |
module spider [word] | Show the details of any modules matching the keyword |
module whatis [module] | Show the short description about module |
module load [package1] [package2] | Load the environment for the default version of the modulefile |
module load [package]/[version] | Load the environment for the specified version of module |
module unload [package1] [package2] | Unload previously loaded packages |
module swap [moduleA] [moduleB] | Unload modulefile A and load modulefile B |
module list | List any currently loaded module(s) |
module purge | Unload all currently loaded modules |
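What module load does to the environment can be imitated by hand. The sketch below prepends the directories of a hypothetical installation under /tmp/demo_app/1.0 (not a real Shabyt path) to PATH and LD_LIBRARY_PATH. Lmod makes equivalent changes for you and reverses them again on module unload.

```shell
# Hypothetical install prefix, used only for illustration.
prefix="/tmp/demo_app/1.0"

# Prepend the application's directories, keeping any existing value.
export PATH="$prefix/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# The first entry is searched first, so this version now takes precedence.
echo "first PATH entry:            ${PATH%%:*}"
echo "first LD_LIBRARY_PATH entry: ${LD_LIBRARY_PATH%%:*}"
```

Because directories are searched from left to right, whichever version was prepended last is the one the shell finds, which is how switching module versions changes the program that runs.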
Using module avail
To list all software available on the system, run module avail:
[hpcadmin@ln01 ~]$ module avail
----------------------------------------------------------------------------- /shared/opt/easybuild/modules/all -----------------------------------------------------------------------------
AMD-uProf/3.5.671 PLUMED/2.7.3-foss-2021b help2man/1.49.3-GCCcore-12.3.0
AOCC/4.0.0-GCCcore-12.2.0 PMIx/3.1.5-GCCcore-9.3.0 help2man/1.49.3-GCCcore-13.2.0 (D)
ATK/2.38.0-GCCcore-11.3.0 PMIx/3.2.3-GCCcore-10.3.0 hwloc/2.2.0-GCCcore-9.3.0
Anaconda3/2022.05 PMIx/4.1.0-GCCcore-11.2.0 hwloc/2.2.0-GCCcore-10.2.0
Autoconf/2.69-GCCcore-9.3.0 PMIx/4.1.2-GCCcore-11.3.0 hwloc/2.4.1-GCCcore-10.3.0
Autoconf/2.69-GCCcore-10.2.0 PMIx/4.2.2-GCCcore-12.2.0 hwloc/2.5.0-GCCcore-11.2.0
Autoconf/2.71-GCCcore-10.3.0 PMIx/4.2.6-GCCcore-13.2.0 (L,D) hwloc/2.7.1-GCCcore-11.3.0
Autoconf/2.71-GCCcore-11.2.0 PROJ/8.0.1-GCCcore-10.3.0 hwloc/2.8.0-GCCcore-12.2.0
Autoconf/2.71-GCCcore-11.3.0 PROJ/9.0.0-GCCcore-11.3.0 (D) hwloc/2.9.2-GCCcore-13.2.0 (L,D)
Autoconf/2.71-GCCcore-12.2.0 Pango/1.48.5-GCCcore-10.3.0 hypothesis/6.13.1-GCCcore-10.3.0
Autoconf/2.71-GCCcore-12.3.0 Pango/1.48.8-GCCcore-11.2.0 hypothesis/6.14.6-GCCcore-11.2.0
Autoconf/2.71-GCCcore-13.2.0 (D) Pango/1.50.7-GCCcore-11.3.0 (D) hypothesis/6.46.7-GCCcore-11.3.0 (D)
To unload all currently loaded modules, run module purge. Afterwards, module list will return No modules loaded.
Using module spider
If you know the name of a software package, you can use the module spider command to find out whether it is available and how to load it.
[hpcadmin@ln01 ~]$ module spider python
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Python:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Description:
Python is a programming language that lets you work more quickly and integrate your systems more effectively.
Versions:
Python/2.7.18-GCCcore-11.3.0-bare
Python/3.8.6-GCCcore-10.2.0
Python/3.9.5-GCCcore-10.3.0-bare
Python/3.9.5-GCCcore-10.3.0
Python/3.9.6-GCCcore-11.2.0-bare
Python/3.9.6-GCCcore-11.2.0
Python/3.10.4-GCCcore-11.3.0-bare
Python/3.10.4-GCCcore-11.3.0
Python/3.11.3-GCCcore-12.3.0
Python/3.11.5-GCCcore-13.2.0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
For detailed information about a specific "Python" package (including how to load the modules) use the module's full name.
Note that names that have a trailing (E) are extensions provided by other modules.
For example:
$ module spider Python/3.11.5-GCCcore-13.2.0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
This shows that there are multiple versions of Python available.
For more specific information, add the version to your command as given in the example:
[hpcadmin@ln01 ~]$ module spider Python/3.11.5-GCCcore-13.2.0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Python: Python/3.11.5-GCCcore-13.2.0
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Description:
Python is a programming language that lets you work more quickly and integrate your systems more effectively.
This module can be loaded directly: module load Python/3.11.5-GCCcore-13.2.0
Help:
Description
===========
Python is a programming language that lets you work more quickly and integrate your systems
more effectively.
More information
================
- Homepage: https://python.org/
Included extensions
===================
flit_core-3.9.0, packaging-23.2, pip-23.2.1, setuptools-68.2.2, setuptools-
scm-8.0.4, tomli-2.0.1, typing_extensions-4.8.0, wheel-0.41.2
Using module keyword
Alternatively, use the module keyword command to search module names and descriptions for a word.
[hpcadmin@ln01 ~]$ module keyword python
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
The following modules match your search criteria: "python"
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Anaconda3: Anaconda3/2022.05
Built to complement the rich, open source Python community, the Anaconda platform provides an enterprise-ready data analytics platform that empowers companies to adopt a modern
open data science analytics architecture.
Python: Python/2.7.18-GCCcore-11.3.0-bare, Python/3.8.6-GCCcore-10.2.0, Python/3.9.5-GCCcore-10.3.0-bare, Python/3.9.5-GCCcore-10.3.0, Python/3.9.6-GCCcore-11.2.0-bare, ...
Python is a programming language that lets you work more quickly and integrate your systems more effectively.
Loading and unloading software
Typically, loading modules is as simple as entering module load <software_name>. The <software_name> must be visible when you run module avail. Lmod will set your environment so that the software specified in <software_name> is placed in your PATH, and then you can run the commands associated with <software_name>. If there are multiple versions of <software_name>, you can specify a version. For example:
[hpcadmin@ln01 ~]$ module load GCC
[hpcadmin@ln01 ~]$ module list
Currently Loaded Modules:
1) GCCcore/13.2.0 2) zlib/1.2.13-GCCcore-13.2.0 3) binutils/2.40-GCCcore-13.2.0 4) GCC/13.2.0
[hpcadmin@ln01 ~]$ module load GCC/11.3.0
The following have been reloaded with a version change:
1) GCC/13.2.0 => GCC/11.3.0 3) binutils/2.40-GCCcore-13.2.0 => binutils/2.38-GCCcore-11.3.0
2) GCCcore/13.2.0 => GCCcore/11.3.0 4) zlib/1.2.13-GCCcore-13.2.0 => zlib/1.2.12-GCCcore-11.3.0
[hpcadmin@ln01 ~]$ module list
Currently Loaded Modules:
1) GCCcore/11.3.0 2) zlib/1.2.12-GCCcore-11.3.0 3) binutils/2.38-GCCcore-11.3.0 4) GCC/11.3.0
If you do not specify a version, Lmod loads the default one, which has (D) next to it in the output of module avail.
To unload a specific module, enter:
[hpcadmin@ln01 ~]$ module unload GCC/11.3.0
Anaconda
Description: Anaconda is a Python distribution built around conda, a tool for managing packages and environments. It helps you create virtual environments for different Python and package versions. You can use conda to install, remove, and update packages within your project environments. For instance, you can create one environment for game development that requires Pygame with one version of Python, and another environment for machine learning that requires PyTorch with a newer version of Python.
Usage: module load Anaconda3/2022.05
Distributions
Quite often, a package manager is not distributed on its own, but together with the set of packages it needs to work, or even with additional packages that most applications require. For instance, the conda package manager is distributed with the Miniconda and Anaconda distributions. Miniconda contains the bare minimum of packages that conda needs to work, while Anaconda contains many commonly used packages and a graphical user interface. The relation between these distributions and the package manager is depicted in the following diagram.
Channels
Conda channels are the locations where packages are stored. There are multiple channels, some of the important ones being:
- defaults, the default channel,
- anaconda, a mirror of the default channel,
- bioconda, a distribution of bioinformatics software,
- conda-forge, a community-led collection of recipes, build infrastructure, and distributions for the conda package manager.
One of the most useful channels is conda-forge. Note that the Anaconda and Miniconda distributions come preconfigured with the defaults channel, so conda-forge may need to be added explicitly. Channels are usually hosted on the official Anaconda page, but on rare occasions custom channels may be used. For instance, the default channel is hosted independently from the official Anaconda page. Many channels also maintain web pages with documentation both for their usage and for the packages they distribute.
Working with Anaconda environments
Below is a list of the main commands you need to start working with Anaconda.
- Check available environments:
  conda env list
- View a list of packages in an environment
  - If the environment is not activated:
    conda list -n virtualenv
  - If the environment is activated:
    conda list
- Create a Conda environment
  - Create an environment:
    conda create -n virtualenv
  - Create an environment with a specific Python version:
    conda create -n virtualenv python=3.12
  - Create an environment in a target directory:
    conda create -p /shared/home/{username}/.conda/envs/virtualenv
- Activate an environment:
  source activate virtualenv
- Deactivate an environment:
  conda deactivate
- Remove an environment:
  conda remove -n virtualenv --all
  or conda env remove -n virtualenv
Working with packages
Install packages into the virtualenv environment:
- If the environment is not activated:
  conda install --name virtualenv PACKAGENAME
- If the environment is activated:
  conda install PACKAGENAME
- To install multiple packages at once:
  conda install pkg1 pkg2 pkg3
- To install a specific version of a package:
  conda install numpy=1.15.2
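The environment and package commands above can be combined into one small helper. This is only a sketch: setup_env is a hypothetical function name, not something provided on Shabyt, and it assumes that the Anaconda3 module has already been loaded so that conda is on the PATH.

```shell
# setup_env NAME PYTHON_VERSION [PACKAGE...]
# Hypothetical helper: creates a conda environment and installs packages into it.
setup_env() {
    if [ "$#" -lt 2 ]; then
        echo "usage: setup_env NAME PYTHON_VERSION [PACKAGE...]" >&2
        return 2
    fi
    env_name="$1"
    python_version="$2"
    shift 2

    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found; run 'module load Anaconda3/2022.05' first" >&2
        return 1
    fi

    # -y answers the confirmation prompts automatically.
    conda create -y -n "$env_name" "python=$python_version" || return 1

    # Install extra packages only if some were requested.
    if [ "$#" -gt 0 ]; then
        conda install -y -n "$env_name" "$@" || return 1
    fi

    # Show what ended up in the environment.
    conda list -n "$env_name"
}
```

For example, setup_env virtualenv 3.12 numpy would create the virtualenv environment with Python 3.12 and then install NumPy into it without activating it first.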
GCC
The GNU Compiler Collection, commonly known as GCC, is a set of compilers and development tools available for Linux, Windows, various BSDs, and a wide assortment of other operating systems. It primarily supports C and C++, and also includes front ends for Objective-C, Ada, Go, Fortran, and D. GCC is developed by the GNU Project and distributed by the Free Software Foundation (FSF) as completely free (as in libre) software.
GCC is a toolchain that preprocesses source code, compiles it to assembly, assembles it into object code, and links it with any library dependencies to produce executable files. It follows the standard UNIX design philosophy of using simple tools that perform individual tasks well. The GCC development suite utilizes these discrete tools to compile software.
When you run GCC on a source file, the preprocessor first includes header files, expands macros, and discards comments. The compiler then tokenizes and parses the code, reports any compile-time issues, builds syntax trees of the program’s objects and control flow, and uses those to generate assembly code. The assembler converts this assembly into object code in the binary format of the system. Finally, the linker combines the object code with any external libraries it references. The finished product is then executable on the target system.
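The stages described above can be run one at a time to see the intermediate files. The sketch below assumes that a gcc command is available (for example after module load GCC); the program and the file names are illustrative only.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# A tiny C program (illustrative only).
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
#define GREETING "Hello from GCC"

int main(void) {
    /* this comment is removed by the preprocessor */
    puts(GREETING);
    return 0;
}
EOF

if command -v gcc >/dev/null 2>&1; then
    gcc -E "$workdir/hello.c" -o "$workdir/hello.i"  # 1. preprocess
    gcc -S "$workdir/hello.i" -o "$workdir/hello.s"  # 2. compile to assembly
    gcc -c "$workdir/hello.s" -o "$workdir/hello.o"  # 3. assemble to object code
    gcc "$workdir/hello.o" -o "$workdir/hello"       # 4. link into an executable
    "$workdir/hello"
else
    echo "gcc not found; load a GCC module first" >&2
fi
```

In everyday use a single gcc invocation performs all four steps; the -E, -S, and -c options simply stop the pipeline after the corresponding stage.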
GCC examples
Compiling a program with GCC can be straightforward:
gcc hello.c -o hello
Running this command processes the hello.c file and generates a binary called “hello”.
Additional options can be passed:
gcc hello.c -O3 -o hello
In this example, the optimization level is set to 3, which asks GCC to apply more aggressive optimizations to the generated code.
More complex compilations are managed by Makefiles and are invoked with the “make” command.
Git
This section is under construction
Description: Git is an open-source version control system primarily used for software development. It has many appealing features, including seamless branching and merging, fast performance, and easy-to-learn workflows.
Git is installed system-wide through the dnf package manager, so users can run git without loading a module. If you need a specific version of Git, load it with module load git/{git version}
GROMACS
This section is under construction
Description: GROMACS: GROningen MAchine for Chemical Simulations
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
CUDA
Description: CUDA (Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) model created by NVIDIA. It allows software developers to use a CUDA-enabled graphics processing unit (GPU) for general purpose processing, an approach known as General Purpose GPU (GPGPU) computing.
Usage: module load cuda/11.4.1
Executables
nvcc
nvidia-smi
Monitoring GPU
You can check the available GPUs, their current usage, the installed version of the NVIDIA drivers, and more with the nvidia-smi command. Either in an interactive job or after connecting with ssh to a node running your job, the nvidia-smi output should look something like this:
[hpcadmin@gn01 ~]$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... On | 00000000:02:00.0 Off | N/A |
| 23% 34C P8 9W / 250W | 1MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
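For scripts and logs, a compact one-line-per-GPU summary is often easier to handle than the full table. The sketch below uses the query options of nvidia-smi; the choice of fields is just an example, and the fallback message is an assumption about how you might want to handle nodes without the NVIDIA tools.

```shell
# Print one CSV line per GPU: index, model, utilization, and memory use.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_report=$(nvidia-smi \
        --query-gpu=index,name,utilization.gpu,memory.used,memory.total \
        --format=csv,noheader 2>&1)
else
    # No NVIDIA driver tools on this node.
    gpu_report="nvidia-smi not found on $(uname -n)"
fi
echo "$gpu_report"
```

Running nvidia-smi --help-query-gpu lists the other fields that can be requested in the same way.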
MATLAB
This section is under construction
Description: MATLAB is a programming environment for algorithm development, data analysis, visualization, and numerical computation. Using MATLAB, you can solve technical computing problems faster than with traditional programming languages, such as C, C++, and Fortran.
You can use MATLAB in various applications, including signal and image processing, communications, control design, test and measurement, financial modeling and analysis, and computational biology.
Usage: module load matlab/r2022a
Environment variables
PATH /shared/opt/matlab/r2022a:$PATH
LD_LIBRARY_PATH /shared/opt/matlab/r2022a/extern/bin/glnxa64
MLM_LICENSE_FILE 27000@ln01
Licensing
LAMMPS
This section is under construction