The GFI computing system
Revision as of 09:36, 17 March 2026
The Geophysical Institute uses a department compute server cluster named "cyclone". Cyclone consists of three separate compute nodes organized in a cluster, all accessing the same centralized storage. Two of the nodes are for regular computational tasks, and one is set up with a GPU to enable machine learning (ML) applications. The new cyclone replaces the older system "cyclone.hpc.uib.no", acquired in Summer 2018, which at the time of writing this documentation (14.03.2026) is still accessible as well.
The server cluster is maintained by UiB ITA, and the compute resources are located in NREC as virtual servers, currently running Linux (Rocky 8).
All users of the cyclones should subscribe to the mailing list [linux@gfi.uib.no] to receive updates about maintenance, downtime, etc.
Please contact UiB support at https://hjelp.uib.no in case of problems or questions. You may add a reference to "Group of scientific computing" in your support request.
In the following, you will find:
Users and intended use
The intended users at GFI for the cyclones are:
- MSc students for course and thesis work.
- PhD students and PostDocs for scientific work.
- All other researchers and technical staff for data analysis, data storage, and routine computations.
Requirements
Users should be able to use the system interactively (point and click, typing) and in batch mode (using shell scripts). Typical tasks are data analysis and plotting. GFI will also run routine data processing of observations and forecasts on this system.
Therefore, we required the system to provide:
- low threshold access (easy to use)
- interactive use
- safe usage (against unintentional overuse)
- optimized for serial I/O (as is typical for post-processing tasks)
- single-node parallelisation jobs
- lifetime of 5-7 years
System characteristics and good user practice
The system has been created as a low-threshold, multi-user system. Resources are limited and shared among all users. This implies that users need to keep track of which resources they are using.
The system characteristics, and the good user practice that follows along with these, are as follows:
- There is (currently) no queue system on any of the "cyclones". Computational tasks are started immediately.
- Computational tasks will be terminated when users log out. To keep jobs running while logged out, start them in the background using the Linux commands 'tmux', 'screen' or 'nohup'.
- There are resource limits per user. Each session can use up to 8 of the 64 CPU cores and allocate up to 120 GB of memory. This is meant to prevent individual users from accidentally bringing down the system.
- Software packages are activated using the '''module load''' command. Use 'module spider <software>' to find available software. The available software stack is identical on all cyclones.
- Users have access to a home directory, a work storage, and long-term project and shared resource storage. The same storage environment is connected to all of the cyclones. Use the work storage for all non-permanent input/output; files on the work storage are deleted after 60 days. Use NIRD for long-term storage of larger output.
- The system is maintained by experts from UiB's HPC group. Bi-weekly maintenance is planned, typically on Wednesdays; sometimes this involves a reboot of the system. Plan longer jobs so that they do not get interrupted or interfere with maintenance scheduling.
- Subscribe to the mailing list [linux@gfi.uib.no] to receive updates about problems and maintenance.
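To illustrate the practices above, here is a minimal sketch of keeping a job alive after logout with 'nohup'; the interactive 'tmux' alternative and the module commands are shown as comments, since they only make sense in a login session on the cyclones:

```shell
# Keep a job running after logout: 'nohup' makes it ignore the hangup
# signal, output is redirected to a log file, and '&' backgrounds it.
nohup sh -c 'echo "processing done"' > job.log 2>&1 &
wait              # for this demo only: wait for the background job
cat job.log       # the log contains "processing done"

# Alternative: a terminal multiplexer session (run interactively, not here):
#   tmux new -s analysis      # start a named session; detach with Ctrl-b d
#   tmux attach -t analysis   # re-attach after logging in again

# Finding and loading software (run on the cyclones):
#   module spider <software>  # search the software stack
#   module load <software>    # activate it for the current session
```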
Storage environment
GFI & SKD have at their disposal approx. 450 TB of network storage quota from UiB ITA (as of May 2024). This storage is organized via ITA's standard NetApp solution and distributed to the cyclone servers and to GFI & SKD Windows, macOS, and Linux clients. The storage is also available to GFI & SKD users connecting via the UiB VPN.
- The GFI & SKD storage is organized into larger shared areas for model data, more limited group and project areas, areas for individual users, and common areas for short-term storage. The different storage areas are in general maintained via ITA backup procedures and governed by group or individual storage quotas.
- GFI & SKD storage is organized in two main folders:
- Linux: /Data/gfi (Win11/Mac: \\klient.uib.no\felles\matnat\gfi)
- Linux: /Data/skd (Win11/Mac: \\klient.uib.no\felles\matnat\skd)
Computational resources
The compute system is separated into three nodes. Two nodes (cyclone1 and cyclone2) are dedicated to computing tasks without GPU use, and a third node (cyclone3) is dedicated to GPU applications. In detail, the specifications of the three compute nodes are:
Cyclone 1 - CPU applications (cyclone1.gfi.uib.no):
AMD EPYC Processor, 64 physical cores and 512 GB memory per node, 2.3 GHz - no GPU
Cyclone 2 - CPU applications (cyclone2.gfi.uib.no):
AMD EPYC Processor, 64 physical cores and 512 GB memory per node, 2.3 GHz - no GPU
Cyclone 3 - GPU applications (cyclone3.gfi.uib.no):
Intel CPU, 16 physical cores, 128 GB memory
1 NVIDIA GPU, L40S-24Q, 24 GB GPU memory
GPU acceleration
One of the cyclones (cyclone3) is equipped with an advanced Graphics Processing Unit (GPU) of type NVIDIA L40S-24Q and is reserved for GPU applications only. The GPU is shared between all users on the system.
The GPU usage can be monitored using the command
'''nvidia-smi'''
Programming on this GPU is done using CUDA, and modules are available for programming with CUDA. Python users can load the PyTorch module.
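A minimal sketch of checking GPU availability from Python, assuming the PyTorch module has been loaded first (check 'module spider' for the exact module name). The script falls back to the CPU so it also runs on cyclone1/cyclone2:

```python
# Verify that PyTorch can see the GPU on cyclone3 and run a small
# computation on it; falls back to CPU on the CPU-only nodes.
import torch

# Pick the GPU if one is visible, otherwise use the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move a small tensor to the selected device and compute on it.
x = torch.ones(3, 3, device=device)
result = (x * 2).sum().item()
print(f"device={device.type}, result={result}")  # result is 18.0
```

The same `device` object can then be passed to models and tensors throughout a training script, so the code runs unchanged on both the GPU and CPU nodes.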
