Olivier Aumage

Many computer applications of high societal, economical, scientific values require vast amounts of processing power to solve problems of interest. The High-Performance Computing (HPC) research domain is twofold. On the hardware side, it is in charge of building supercomputers able to supply the necessary processing power. On the software side, it aims at proposing programming models, methods, environments and tools to let applications harness this power. The focus of my research works is on this software side.

Collaborative Projects

Inria Associate Team Maelstrom

Begin: 2022
End: 2025
Duration: 36 months
Participants:
- Inria Team STORM
- SIMULA HPC Department
Personal role: co-Principal Investigator

Maelstrom associate team aims to build on the potential for synergy between STORM and SIMULA to extend the effectiveness of FEniCS on heterogeneous, accelerated supercomputers, while preserving its friendliness for scientific programmers, and to readily make the broad range of applications on top of FEniCS benefit from Maelstrom’s results.

EuroHPC (JU) MICROCARD

Begin: 2021
End: 2025
Duration: 42 months
Participants:
- University of Bordeaux
- Inria
- University of Strasbourg
- Karlsruhe Institute of Technology
- Simula
- Zuse Institute Berlin
- Megware
- Numericor
- OROBIX
- Università di Pavia
- Università della Svizzera Italiana
Personal role: Participant

MICROCARD is a European research project to build software that can simulate cardiac electrophysiology using whole-heart models with sub-cellular resolution, on future exascale supercomputers. It is funded by EuroHPC call Towards Extreme Scale Technologies and Applications.

Full List of Projects

International Projects

EuroHPC (JU) MICROCARD

Begin: 2021
End: 2025
Duration: 42 months
Participants:
- University of Bordeaux
- Inria
- University of Strasbourg
- Karlsruhe Institute of Technology
- Simula
- Zuse Institute Berlin
- Megware
- Numericor
- OROBIX
- Università di Pavia
- Università della Svizzera Italiana
Personal role: Participant

H2020 EXA2PRO

Begin: 2018
End: 2021
Duration: 39 months
Participants:
- ICCS
- Linköping University
- CERTH
- Inria
- Jülich
- Maxeler
- CNRS
- University of Macedonia
Personal role: Participant

The vision of EXA2PRO is to develop a programming environment that will enable the productive deployment of highly parallel applications in exascale computing systems. EXA2PRO programming environment will integrate tools that will address significant exascale challenges. It will support a wide range of scientific applications, provide tools for improving source code quality, enable efficient exploitation of exascale systems’ heterogeneity and integrate tools for data and memory management optimization Additionally, it will provide fault-tolerance mechanisms, both user-exposed and at runtime system level and performance monitoring features. EXA2PRO will be evaluated using 4 applications from 3 different domains, which will be deployed in JUELICH supercomputing centre: High energy physics, materials and supercapacitors.

PRACE-5IP

Begin: 2017
End: 2019
Duration: 28 months
Personal role: Participant

Project PRACE 5th Implementation Phase’’ gathers 25 european entities (supercomputing centers, research institutes, universities) for the implementation and the development of the ecosystem enabling the effective exploitation of the capacity of national supercomputing centers, through the supporting software offer, through trainings and development of strategies and good practices. The contribution of Team Storm aims at the evolution of the KStar OpenMP compiler and OpenMP standard development for the transparent management of accelerators and distributed memory nodes.

H2020 INTERTWinE

Begin: 2015
End: 2018
Duration: 36 months
Participants:
- EPCC/University of Edinburgh
- Barcelona Supercomputing Center
- KTH
- Inria
- Fraunhofer Institute
- DLR
- T-Systems Software
- University Jaume 1
- University of Manchester.
Personal role: Local Contact

The objective of Project INTERTWinE is to implement, develop, and promote interoperability between programming interfaces, runtime systems, numerical libraries, in order to enable their joint use with high performance computing applications. On the one side, the contribution of the team STORM focusses on enabling the StarPU task-based runtime system to dynamically adapt its usage of computing resources (computing cores, accelerators) in accordance with instant requirements, and to lend/reclaim resources in concertation with other runtime systems. On the other side, it also focusses on the evolution of the OpenMP parallel language specification, in order to integrate elements of sharing and negociation of resources.

FP7 Mont-Blanc 2

Begin: 2013
End: 2017
Duration: 42 months
Participants:
- Barcelona Supercomputing Center
- Bull SAS
- STMicroelectronics SAS
- ARM
- Juelich Supercomputing Center
- LRZ
- Univ. Stuttgart
- CINECA
- CNRS
- Inria
- CEA
- Univ. Bristol
- Allinea Software
Personal role: Participant, Local Contact (2015/12–2017/03)

Project Mont-Blanc 2 aims at developing the software stack of the Mont-Blanc experimental platform, exploring ARM-based computing node alternatives for supercomputers, especially from the performance point of view The development of the analysis and reassembly engine of the MAQAO software (Initially from Team Runtime, later from Team STORM) has been developed within this context.

FP7 MCA HPC-GA

Begin: 2012
End: 2014
Duration: 36 months
Participants:
- Brazil
- Mexico
- Spain
- France
Personal role: Local Contact

Designing and porting Geophysics applications on contemporaneous supercomputers necessitates a strong expertise, in order to master their heterogeneous architectures made of multi-core processors and accelerating board. The goal of Project HPC-GA is to study and extend the runtime supports for an efficient exploitation of these machines within geophysics simulation applications. The contribution of Team Runtime has been the extension of the StarPU and ForestGOMP software components, and there use in the project’s applications.

ANR JST FP3C

Begin: 2010
End: 2013
Duration: 36 months
Participants:
- Japan
- France
Personal role: Participant

The goal of this project is to contribute to establish the programming models, languages and software technologies to explore high performance computing beyond petascale, on the way to exascale on French and Japanese supercomputers The contribution of Team Runtime has mainly focussed on interfacing our StarPU and NewMadeleine software components with software components of our Japanese peers.

Egide PHC Pessoa MAE

Begin: 2008
End: 2010
Duration: 24 months
Participants:
- University of Lisbon
- University of Evora
- Inria Team RUNTIME
Personal role: Participant

This collaboration focusses on the design of constraint solving engines (such as GNU Prolog) for parallel architectures, on top of the runtime systems designed within the Runtime team. On of the goals is notably to exploit heterogeneous computing hardware such as the IBM Cell processor, which shows an interesting potential for this specific use. The collaboration effort has mainly been directed towards the exploitation of the Marcel thread library within the constraint solvers designed by the Portuguese Teams.

NSF/Inria – UNH

Begin: 1998
End: 2001
Duration: 48 months
Participants:
- UNH CS Dept.
- Inria Team REMAP
Personal role: Participant

The first collaboration project (1998-99) between the Group of Luc Bougé (LIP ENS Lyon) and the Group of Phil Hatcher (University of New Hampshire) has been jointly funded by Inria and the NSF. It aimed at studying the use of distributed multithreading for supporting parallel applications developed with high level languages. During this first phase, I have made a 3-month visit at the computer science laboratory of the UNH, funded by the project, on the topic of parallel, multithreaded file servers. The success of this collaboration led to the implementation of a second collaborative project (2000-01) centered on the study of a system support for Java programs on clusters of workstations, for which Madeleine 2 was used as the communication support.

International Inria Projects

Inria Associate Team Maelstrom

Begin: 2022
End: 2025
Duration: 36 months
Participants:
- Inria Team STORM
- SIMULA HPC Department
Personal role: co-Principal Investigator

National Projects

ANR SOLHARIS

Begin: 2019
End: 2023
Duration: 48 months
Participants:
- CNRS-IRIT
- Inria Bordeaux
- Inria Lyon
- CEA CESTA
- Airbus Group Innovations
Personal role: Participant

SOLHARIS aims at achieving strong and weak scalability (i.e., the ability to solve problems of increasingly large size while making an effective use of the available computational resources) of sparse, direct solvers on large scale, distributed memory, heterogeneous computers. These solvers will rely on asynchronous task-based parallelism, rather than traditional and widely adopted message-passing and multithreading techniques; this paradigm will be implemented by means of modern runtime systems which have proven to be good tools for the development of scientific computing applications.

PIA ELCI

Begin: 2015
End: 2018
Duration: 36 months
Participants:
- Bull SAS
- CEA
- Inria
- SAFRAN
- CERFACS
- CORIA
- CENAERO
- ONERA
- IFPEN
- UVSQ
- Kitware
- AlgoTech
Personal role: Participant

Project ELCI (Environnement Logiciel pour le Calcul Intensif, Software Environment for HPC) aims at developing a new generation of software stacks for controlling supercomputers, of numerical solvers, of pre-/post-/co-processing software, of programming and execution environments, and to validate these developments by demonstrating their ability at offering a better level of scalability, of resilience, of safety, of modularity, of abstraction, and of interactivity on applicative testcases One of the contributions of Team STORM has been the study, in partnership with Inria Team AVALON, of component models for task-based programming environments such as StarPU. A second contribution of Team STORM has been the interfacing of StarPU with CEA’s MPC environment.

ANR SOLHAR

Begin: 2013
End: 2017
Duration: 48 months
Participants:
- Inria Bordeaux
- Inria Lyon
- CNRS-IRIT
- CEA CESTA
- Airbus Group Innovations
Personal role: Participant

This project aims at studying and designing algorithms and direct methods for solving sparse linear systems on platforms equipped with accelerators, through the intermediation of a runtime system. The StarPU runtime system has been one of the runtime systems studied, extended, and used as the foundation for the software developments of the project

ANR ProHMPT

Begin: 2009
End: 2012
Duration: 42 months
Participants:
- Inria Runtime
- CAPS Entreprise
- CEA INAC
- UVSQ PRiSM
- Inria Grenoble
- Bull SAS
- CEA CESTA
Personal role: Principal Investigator

The goal of Project ANR ProHMPT is the exploration of architectures equipped with accelerators (such as the graphical processors, or GPUs), and the evolution of the software toolchain at the compiler, runtime and analysis levels, to enable high performance computing on such platforms This new generation of tools for heterogeneous platforms aims at addressing the needs in terms of performances from nanosimulations and aerodynamics applications. Project ProHMPT has in particular resulted in the design of the StarPU runtime system

ANR NUMASIS

Begin: 2006
End: 2009
Duration: 36 months
Participants:
- ID-IMAG
- BRGM
- Bull SAS
- CEA
- IRISA
- LaBRI
- LMA
- TOTAL SA
Personal role: Participant

Computer architectures are evolving towards the use of hierarchical memories, organised with a non-uniform access of processors to memory banks. In order to extract the highest hardware performances, parallel applications need software components to enable the accurate distribution of computing processes and data, to avoid expensive non-local memory accesses. Project ANR NUMASIS aims at studying and extending the available mechanisms at the level of operating systems and middlewares for managing the smart distribution of computations and data on such platforms. The targeted field, seismology, is particularly representative of the requirements from high performance scientific applications. The contributions from Team Runtime have been on the development and interfacing of the Marcel multithread library and the ForestGOMP OpenMP runtime system with the applications of the project

ANR LEGO

Begin: 2006
End: 2009
Duration: 36 months
Participants:
- LIP ENS-Lyon
- IRISA
- Inria Futurs/LaBRI
- IRIT
- CERFACS
- CRAL
Personal role: Local Contact

The goal of Project ANR LEGO is to propose and implement a multi-paradigm programming model (components, data sharing, master/slave, and workflow) constituting the state of the art for computing grid programming. It relies on efficient scheduling, deployment and communication technologies. The LEGO Model aims at supporting three kinds of classical high performance computing applications: climate modelling, astronomical simulation, and linear algebra. The contribution of Team Runtime has been the development and the interfacing of the NewMadeleine communication library with the applications of the project

ACI Grid 5000

Begin: 2003
End: 2006
Duration: 36 months
Personal role: Participant

The Grid 5000 Program has taken place within the context of the French ACI Grid Action It aimed at promoting the creation of an experimental platform for computer science research made of a large scale computation and data grid deployed on the territory of France The contribution of Team Runtime has been the development of the Madeleine 2 and the NewMadeleine communication libraries in a grid-computing context on the Grid 5000 platform"

ACI Grid RMI

Begin: 2001
End: 2002
Duration: 24 months
Personal role: Participant

This French Action coordinated by Christian Perez (IRISA Laboratory) has focussed on the theme of computing resources and data globalisation. One of the objective of this project was to design an object-oriented distributed platform to exploit multiple networking technologies in a transparent manner. This platform is built on extensions of the Marcel multithreading library and the Madeleine 2 communication library.

RNRT VTHD

Begin: 2000
End: 2001
Duration: 24 months
Personal role: Participant

This collaborative project grouped several French research teams with the purpose of exploiting the VTHD (\textit{very high bandwidth}) 2.5 Gb/s network of the RENATER national entity, linking several research centers from Inria and from France Telecom. Research efforts have been devoted primarily on protocols, middlewares and applications. I have been more specifically involved in the designed of a communication interface tailored for this network, based on the Madeleine 2 communication library.

National Inria Projects

Inria IPL HAC SPECIS

Begin: 2015
End: 2019
Duration: 48 months
Participants:
- Teams AVALON and POLARIS (Inria Rhône-Alpes)
- Teams MYRIADS and SUMO (Inria Bretagne Atlantique)
- Teams HIEPACS and STORM (Inria Bordeaux - Sud-Ouest)
- Team MEXICO (Inria Île-de-France)
- Team VERIDIS (Inria Grand Est)
Personal role: Participant

The goal of the HAC SPECIS (High-performance Application and Computers: Studying PErformance and Correctness In Simulation) project is to answer methodological needs of HPC application and runtime developers and to allow to study real HPC systems both from the correctness and performance point of view. To this end, we gather experts from the HPC, formal verification and performance evaluation community.

Inria IPL C2S@Exa

Begin: 2013
End: 2015
Duration: 24 months
Participants:
- ANDRA
- CEA IRFM
- 10 Inria Teams
Personal role: Local Contact

This multidisciplinary IPL action from Inria (Inria Project Lab) aims primarily at developing a continuum of skills to efficiently exploit the computing capacity of future petaflops machines and beyond, for running complex, large scale numerical simulations The contribution of Team Runtime, and later of Team STORM has been the exploitation and the evolution of StarPU within linear algebra libraries and numerical simulation software, and the evolution of the KStar OpenMP compiler.

Local Projects

Cluster SysNum, IDEX Bordeaux

Begin: 2018
End: 2019
Duration: 12 months
Personal role: Participant

The SysNum cluster aims to develop top level research on the next generation of digital interconnected systems, going from sensor design to the decision processes. The contribution of Team STORM / SATANAS from LaBRI laboratory, in partnership with Team BKB, was to study the port of the StarPU runtime system as an alternative support for data analysis applications built on top of the Apache Spark environment. This contribution took place as part of Task T3.2: \emph{Convergence of High Performance Computing and Big Data}.

GENCI CVT

Begin: 2017
End: 2017
Duration: 12 months
Personal role: Principal Investigator

The Technical Watch Cell (\emph{Cellule de Veille Technologique}, CVT) for the French supercomputing centers’ supervising entity GENCI aims at anticipatively exploring emerging computing architectures, to prepare future hardware upgrades at these national centers. In this context, I have conducted a study on behalf of Team STORM about the adaptation of StarPU on two emerging platforms: a cluster of Intel Xeon Phi KNL nodes installed at CINES computing center and a cluster of IBM Power 8+ nodes equipped with accelerators installed at IDRIS.

Cluster CPU, IDEX Bordeaux

Begin: 2015
End: 2016
Duration: 24 months
Participants:
- Team STORM
- IMS laboratory
Personal role: Participant

The objectives of Cluster CPU, part of Bordeaux’s initiative of excellence (IDEX) are to federate the main public actors from Bordeaux related to numerical sciences, in order to develop them up to the level enabling their use as certification tools, stimulation tools for innovation, competitiveness, visibility, attractiveness of the CPU community in terms of research, education, promotion, especially from the international point of view The contribution of Team STORM, in partnership with the IMS Lab from the University of Bordeaux has been the design and development of AFF3CT (initially named P-EDGE during the first few months of the project), a software suite for the simulation and the analysis of algorithms for Error Correction Codes used in wireless communications.

Aquitaine Regional Council / CEA

Begin: 2013
End: 2016
Duration: 36 months
Participants:
- CEA CESTA
- Inria Bordeaux
Personal role: Participant

Running task-based applications on distributed architectures made of thousands of heterogeneous nodes remains a challenge for the community. In the general case, it is necessary to widen the scope of the runtime system to the entire machine. This is the purpose of this project about the design of a task-based runtime system able to schedule large task graphs on architectures with a large number of nodes. This project has enabled the co-funding of the PhD Thesis of Marc Sergent, for which I have been one of the advisors. One of the major outcome of this project and of M. Sergent’s PhD Thesis is the extension of the programming model of StarPU to support large scale distributed and heterogeneous platforms.

Alta

Begin: 2004
End: 2004
Duration: 12 months
Participants:
- Inria Teams Paris (Rennes) and Runtime
- Team AND from LIFC Laboratory
Personal role: Participant

Study on the potential performance gains to be obtained in selectively disabling the reliable message retransmission mechanisms for asynchronous, iterative distributed computations with robustness properties against specific message losses. The study has been conducted on the Madeleine 2 communication library.

Local Inria Projects

Inria AFF3CT

Begin: 2019
End: 2020
Duration: 12 months
Personal role: co-Principal Investigator

AFF3CT is a toolchain for designing, validation and experimentation of new Error Correcting codes. This toolchain is written in C++, and this constitutes a difficulty for many industrial users, who are mostly electronics engineers. The goal of this ADT is to widen the number of possible users by designing a Matlab and Python interface for AFF3CT, in collaboration with existing users, and proposing a parallel framework in OpenMP.

Inria ADT KStar

Begin: 2013
End: 2015
Duration: 24 months
Participants:
- Inria Team MOAIS (Inria Grenoble)
- Inria Team Runtime
Personal role: co-Principal Investigator

The goal of this action is to enable C/C++ application developers to exploit the power of heterogeneous architectures (more CPUs, GPUs, accelerators) through a high level of abstraction using programming directives (pragmas) compliant with the OpenMP standard. The KStar OpenMP compiler designed within the context of this action allows to automatically translate an OpenMP compliant application into an application targeting one of the runtime systems developed by Team MOAIS and Runtime respectively: Kaapi for Team MOAIS, and StarPU for Team Runtime. This action notably resulted in the design of the KStar OpenMP compiler and the KaSTORS benchmark suite.

Inria ADT VisiMar

Begin: 2009
End: 2011
Duration: 24 months
Participants: Inria Team Runtime
Personal role: Principal Investigator

Marcel is a multithread library providing a high performance scheduler able to adapt itself to multi-core, multi-processor, and NUMA parallel architectures. This library is a foundation for the software components designed within the Runtime team, and in the context of collaborations. This action aims at enhancing the diffusion and durability of the Marcel Library by extending it with an API compliant with the POSIX Threads specification.

Inria ADT AMPLI

Begin: 2009
End: 2011
Duration: 24 months
Participants:
- Inria Team Runtime
- Inria Team Concha (Pau)
Personal role: Participant

The aim of this project is to develop tools for the direct simulation of tridimensional turbulent flows in simple geometries on parallel machines within the CONCHA framework developed by Team Concha (Inria antenna from Pau). The collaboration has mainly focussed on the interfacing of CONCHA on top of the Marcel multithreading library.

ResCapA Inria Action

Begin: 1998
End: 2099
Duration: 24 months
Personal role: Participant

This action aimed at uniting the efforts of several research teams around the topic of network technologies with addressing capabilities, such as the Scalable Coherent Interface (SCI) One of the result of this action was the initial design of the Madeleine 2 communication library and its port on the SCI network technology.

Industrial Projects

CodePlay

Begin: 2015
End: 2016
Duration: 12 months
Personal role: Participant

Samsung GRO grant

Begin: 2013
End: 2013
Duration: 12 months
Personal role: Participant

TOTAL SA, CIFRE PhD grant

Begin: 2012
End: 2015
Duration: 36 months
Personal role: Participant

Alcatel

Begin: 2001
End: 2002
Duration: 24 months
Personal role: Participant

Microsoft

Begin: 2000
End: 2001
Duration: 24 months
Personal role: Participant

Research Software

AFF3CT

High-Performance Fast Forward Error Correction Toolbox for building numeric communication chains in Software Defined Radio (SDR).

Numerical communications rely on an encoding process that involves integrating specific redundancy patterns in the communication flow prior to transmission. On the receiving side a matching decoding process uses knowledge of the encoded redundancy patterns to reconstruct the source signal that may have been altered or distorted by the environment during the transmission. Algorithms in charge of encoding and decoding such redundancy patterns are named Error Correction Code (ECC) encoders and decoders. They are ubiquitous in numerical communications, including WiFi and cellular phone communications, and are computationally expensive, especially the decoders.

AFF3CT offers a C++ library of high performance building blocks for designing efficient software transmission and reception chains. The elementary blocks leverage the MIPP C++ SIMD wrapper to take advantage of streaming instruction sets from modern processors. Whole chains may also be parallelized for increased transmission rates on multicore and multiprocessor architectures.

The AFF3CT distribution also comes with a simulator application to validate the error correction properties of algorithms and to experiments with a range of algorithm variants and parameters. The simulator compute residual Bit Error Rates (BER) and Frame Error Rates (FER) on a generated numerical communication altered with a controlled noise level.

MIPP

MIPP (MyIntrinsics++) is a portable and Open-source wrapper (MIT license) for vector intrinsic functions (SIMD) written in C++11.

The MIPP wrapper works for SSE, AVX, AVX-512 and ARM NEON (32-bit and 64-bit) instructions, as well as preliminary support for ARM SVE. It supports simple/double precision floating-point numbers and signed integer arithmetic (64-bit, 32-bit, 16-bit and 8-bit).

STARPU

Task-based parallel and distributed runtime system for heterogeneous clusters.

Supercomputer architectures are inherently diverse (see the Top500 list for instance) due to being tailored for distinct kinds of workloads, and as a result of the various paths explored to design high performance computing platforms. This technological diversity makes it difficult to efficiently port an application from one supercomputer to another one. Task-based runtime systems offer a means to simplify this process by enforcing good practices in application design, such as the separation of concerns between applicative algorithmics, kernels optimization and hardware resource management, and by delegating part of the optimization process to some third-party software.

StarPU is a task-based runtime system based on the Sequential Task Flow (STF) programming model. From a sequential flow of tasks submitted by the application, StarPU drives a parallel execution taking advantage of the computing units available on the platform nodes, including specialized hardware accelerators, such as GPUs (Graphical Processing Units) and FPGAs (Fully Programmable Gate Arrays). StarPU transparently manages data transfers, replication and replicates consistency between the computing units. It builds a performance model of task kernels on the heterogeneous units to dynamically determine the most suitable mapping of the kernels according to the units performance and availability.

Other Active Software Projects

NewMadeleine

Communication library for high performance networks.

Discontinued Software Projects

ForestGOMP

GNU libgomp ABI-compatible library on top of the Marcel multithread library for hierarchical NUMA scheduling.

Kastors

Benchmark suite of OpenMP dependent tasks kernels.

KSTAR

Source-to-source OpenMP compiler to transform C/C++ programs into StarPU programs.

Madeleine

Communication library for high performance networks, superseded by NewMadeleine.

MAQAO

Framework for binary code analysis and modification (private project).

MARCEL

Multithread library for multicore, NUMA platforms.

NetIbis

Composable communication stack for the Ibis grid computing environment.

STARPU Resource Manager

StarPU Resource Manager (now included in StarPU).

	Olivier Aumage
	Inria, 200 avenue de la vieille tour, F-33405 Talence Cedex, France
	(+33) 5 24 57 41 19
	olivier.aumage@inria.fr
	LinkedIn profile

On-going Courses

Former Courses

Academic Levels