Presentation

Tolosa (TOols Library for unstructured Ocean models and Surge Applications) is a free and open-source computational framework for simulating ocean and river dynamics on unstructured meshes using MPI parallelization and original numerical schemes. It is developed at IMT Toulouse (INSA/CNRS) in collaboration with SHOM, ICJ Lyon, LAMA Chambéry, and INRAE Grenoble.

Mathematical Models

Phase-resolved ocean wave modeling via a novel Boussinesq-type model (hyperbolized Green-Naghdi equations) and an original wave breaking model based on enstrophy

Numerical Methods

Schemes on unstructured meshes for complex shorelines; global stability by entropy dissipation; compromise between accuracy, robustness, and performance
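To illustrate what "stability by entropy dissipation" means in practice, here is a minimal sketch on a deliberately simplified problem: the 1D Saint-Venant equations over a flat bottom, on a uniform periodic grid, with the classical Rusanov (local Lax-Friedrichs) flux. This is not Tolosa's scheme, which works on unstructured meshes and is not detailed on this page. All function names are hypothetical. For Saint-Venant, the total energy plays the role of the mathematical entropy, so a dissipative scheme should conserve mass while letting total energy decrease across shocks.

```python
import numpy as np

G = 9.81  # gravitational acceleration (m/s^2)


def physical_flux(h, q):
    """Saint-Venant flux for the state (h, q = h*u) over a flat bottom."""
    return np.array([q, q * q / h + 0.5 * G * h * h])


def wave_speed(h, q):
    """Largest characteristic speed |u| + sqrt(g h) in each cell."""
    return np.abs(q / h) + np.sqrt(G * h)


def rusanov_step(h, q, dx, cfl=0.4):
    """Advance one explicit Euler step with the Rusanov flux (periodic grid)."""
    hr, qr = np.roll(h, -1), np.roll(q, -1)            # right neighbour of each cell
    a = np.maximum(wave_speed(h, q), wave_speed(hr, qr))  # speed bound at interface i+1/2
    dt = cfl * dx / a.max()
    left, right = np.array([h, q]), np.array([hr, qr])
    # Centred flux plus numerical diffusion proportional to the jump.
    flux = 0.5 * (physical_flux(h, q) + physical_flux(hr, qr)) - 0.5 * a * (right - left)
    new = left - dt / dx * (flux - np.roll(flux, 1, axis=1))
    return new[0], new[1]


def total_mass(h, dx):
    return h.sum() * dx


def total_energy(h, q, dx):
    """Kinetic plus potential energy: the entropy of the Saint-Venant system."""
    return (0.5 * q * q / h + 0.5 * G * h * h).sum() * dx


def run_dam_break(n_cells=200, n_steps=200):
    """Run a dam break (h = 2 on the left, h = 1 on the right, fluid at rest)."""
    dx = 1.0 / n_cells
    x = (np.arange(n_cells) + 0.5) * dx
    h = np.where(x < 0.5, 2.0, 1.0)
    q = np.zeros(n_cells)
    m0, e0 = total_mass(h, dx), total_energy(h, q, dx)
    for _ in range(n_steps):
        h, q = rusanov_step(h, q, dx)
    return m0, total_mass(h, dx), e0, total_energy(h, q, dx)


if __name__ == "__main__":
    m0, m1, e0, e1 = run_dam_break()
    print(f"mass drift: {abs(m1 - m0):.2e}")
    print(f"energy: {e0:.4f} -> {e1:.4f}")
```

The conservative flux-difference form keeps mass constant up to round-off, while the diffusion term of the Rusanov flux removes energy at the discontinuities. The Rusanov flux is quite diffusive, which is one side of the accuracy/robustness trade-off mentioned above.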

High Performance Computing

Written in modern Fortran 2008 with OOP features; CPU and GPU with MPI parallelization; GPU speedup up to ~11× over a CPU node (AMD MI300A, Adastra)

Tolosa follows the KISS principle (Keep It Simple, Stupid) and relies on the object-oriented features of the Fortran 2003 and 2008 standards. The codebase aims to stay lightweight and maintainable without sacrificing computational efficiency.

All Tolosa simulation codes are built on the shared Tolosa-lib library, which provides reusable structures and tools for mesh handling, MPI parallelization, I/O, and numerical utilities.

Operational Marine Flooding

SHOM and Météo-France deploy Tolosa-sw for marine flooding warnings, coupled with atmospheric pressure forecasts. A 5-day Atlantic prediction on a 2.6 M-cell mesh runs in 5 minutes on 640 CPU cores.
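To put these figures in perspective, the short sketch below derives three quantities from them: the ratio of simulated time to wall-clock time, the core-hours consumed, and the average number of cells per core. The function name and the returned keys are hypothetical, and the inputs are the rounded values quoted above, so the results are orders of magnitude rather than measurements.

```python
def forecast_cost(forecast_days, wall_minutes, cores, cells):
    """Derive simple cost indicators from rounded operational figures."""
    simulated_minutes = forecast_days * 24 * 60
    return {
        # How many times faster than real time the forecast runs.
        "real_time_ratio": simulated_minutes / wall_minutes,
        # Total compute budget of one forecast.
        "core_hours": cores * wall_minutes / 60,
        # Average mesh load per core, assuming an even partition.
        "cells_per_core": cells / cores,
    }


cost = forecast_cost(forecast_days=5, wall_minutes=5, cores=640, cells=2.6e6)
for key, value in cost.items():
    print(f"{key}: {value:.1f}")
# real_time_ratio: 1440.0
# core_hours: 53.3
# cells_per_core: 4062.5
```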

Coastal Wave Modeling

Tolosa-lct handles large-scale phase-resolved simulations: Île de Ré (~11 M cells) and Saint-Malo (~16 M cells at 1 m coastal resolution) with JONSWAP spectral input.
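For readers unfamiliar with JONSWAP input, the sketch below evaluates the standard parametric JONSWAP frequency spectrum and the significant wave height it implies. How Tolosa-lct actually reads or discretizes the spectrum is not described here, so the function names, the default values (`alpha = 0.0081`, `gamma = 3.3`, both common textbook choices) and the example peak frequency are assumptions made for illustration only.

```python
import numpy as np

G = 9.81  # gravitational acceleration (m/s^2)


def jonswap(f, fp, alpha=0.0081, gamma=3.3):
    """JONSWAP variance density S(f) in m^2/Hz for frequencies f > 0 (Hz).

    fp is the peak frequency, alpha the Phillips-type scale factor and
    gamma the peak enhancement factor (gamma = 1 gives Pierson-Moskowitz).
    """
    f = np.asarray(f, dtype=float)
    # Peak width: narrower below the peak than above it.
    sigma = np.where(f <= fp, 0.07, 0.09)
    # Pierson-Moskowitz shape.
    pm = alpha * G**2 * (2 * np.pi) ** -4 * f**-5 * np.exp(-1.25 * (fp / f) ** 4)
    # Peak enhancement exponent, equal to 1 at f = fp.
    r = np.exp(-((f - fp) ** 2) / (2 * sigma**2 * fp**2))
    return pm * gamma**r


def significant_wave_height(f, spectrum):
    """Hs = 4 * sqrt(m0), with m0 the area under S(f) (trapezoidal rule)."""
    m0 = np.sum(0.5 * (spectrum[1:] + spectrum[:-1]) * np.diff(f))
    return 4.0 * np.sqrt(m0)


if __name__ == "__main__":
    fp = 0.1  # hypothetical 10 s peak period
    f = np.linspace(0.03, 0.5, 2000)
    s = jonswap(f, fp)
    print(f"peak at {f[np.argmax(s)]:.3f} Hz")
    print(f"Hs = {significant_wave_height(f, s):.2f} m")
```

In a phase-resolved model, a spectrum like this is typically sampled into wave components with random phases to build the incoming wave signal at the offshore boundary.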

Academic Research

Soliton gases (LEGI Grenoble), turbulence models for hydraulics (G. L. Richard, J.-P. Vila), surface tension in Saint-Venant (D. Bresch, C. Ruyer-Quil), wave-structure interactions.

Benchmark on the Saint-Venant model (36 M cells, ~204 000 time steps, Jean-Zay and Adastra supercomputers):

HPC absolute performance — wall-clock time vs. number of nodes/GPUs

| Platform | Configuration | Time (s) |
|---|---|---|
| CPU: Jean-Zay (40 cores/node) | 2 nodes | 5726 |
| CPU: Jean-Zay (40 cores/node) | 4 nodes | 2845 |
| CPU: Jean-Zay (40 cores/node) | 8 nodes | 1558 |
| CPU: Jean-Zay (40 cores/node) | 16 nodes | 687 |
| GPU: NVIDIA A100 (Jean-Zay, 8 GPUs/node) | 1 GPU | 1555 |
| GPU: NVIDIA A100 (Jean-Zay, 8 GPUs/node) | 2 GPUs | 795 |
| GPU: NVIDIA A100 (Jean-Zay, 8 GPUs/node) | 4 GPUs | 410 |
| GPU: NVIDIA A100 (Jean-Zay, 8 GPUs/node) | 8 GPUs | 221 |
| GPU: AMD MI300A (Adastra, 4 GPUs/node) | 1 GPU | 1011 |
| GPU: AMD MI300A (Adastra, 4 GPUs/node) | 2 GPUs | 508 |
| GPU: AMD MI300A (Adastra, 4 GPUs/node) | 4 GPUs | 273 |
| GPU: AMD MI300A (Adastra, 4 GPUs/node) | 8 GPUs | 149 |

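Both CPU and GPU architectures show good MPI parallel scaling. For an 8-fold increase in resources, the run time drops by about 8.3× on CPU (2 to 16 nodes), 7.0× on A100 and 6.8× on MI300A (1 to 8 GPUs). A single A100 is ~3.7× faster than 2 CPU nodes, and a single MI300A is ~5.7× faster than 2 CPU nodes.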
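The ratios above can be recomputed directly from the benchmark timings. The sketch below is only a convenience for checking them: it computes, for each platform, the speedup and parallel efficiency relative to its smallest configuration, then the single-GPU speedup over 2 CPU nodes. The dictionary layout and function names are hypothetical. No single-node CPU run is reported, so a "per CPU node" comparison would require assuming perfect scaling from 2 nodes down to 1.

```python
# Wall-clock times in seconds, keyed by number of nodes (CPU) or GPUs.
TIMES = {
    "CPU Jean-Zay (nodes)": {2: 5726, 4: 2845, 8: 1558, 16: 687},
    "A100 Jean-Zay (GPUs)": {1: 1555, 2: 795, 4: 410, 8: 221},
    "MI300A Adastra (GPUs)": {1: 1011, 2: 508, 4: 273, 8: 149},
}


def scaling(times):
    """Return {n: (speedup, efficiency)} relative to the smallest configuration.

    Efficiency is the measured speedup divided by the ideal one (n / n0).
    """
    n0 = min(times)
    t0 = times[n0]
    return {n: (t0 / t, (t0 / t) / (n / n0)) for n, t in sorted(times.items())}


def cross_speedup(reference_time, time):
    """How many times faster `time` is than `reference_time`."""
    return reference_time / time


if __name__ == "__main__":
    for name, times in TIMES.items():
        print(name)
        for n, (speedup, efficiency) in scaling(times).items():
            print(f"  {n:>2}: speedup {speedup:5.2f}, efficiency {efficiency:5.1%}")

    two_cpu_nodes = TIMES["CPU Jean-Zay (nodes)"][2]
    for gpu in ("A100 Jean-Zay (GPUs)", "MI300A Adastra (GPUs)"):
        ratio = cross_speedup(two_cpu_nodes, TIMES[gpu][1])
        print(f"1 x {gpu} vs 2 CPU nodes: {ratio:.1f}x")
```

An efficiency slightly above 100% on CPU at 16 nodes is not an error in the arithmetic: superlinear scaling of this kind is commonly attributed to better cache use as the per-core working set shrinks, although the source does not discuss the cause.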