GEPS: Boosting Generalization in Parametric PDE Neural Solvers through Adaptive Conditioning

Armand Kassaï Koupaï¹, Jorge Mifsut-Benet¹, Yuan Yin², Jean-Noël Vittaut³, Patrick Gallinari¹,⁴

¹Sorbonne Université ISIR, ²Valeo.ai, ³Sorbonne Université LIP6, ⁴Criteo AI Lab
NeurIPS 2024

The multi-environment setting chosen for GEPS, illustrated on the Kolmogorov PDE. GEPS leverages information from multiple environments, each defined by specific PDE parameter values, to boost the generalization of neural solvers for parametric PDEs.

Motivating the adaptive conditioning approach

One goal of this paper is to demonstrate the relevance of a multi-environment setting compared to the more traditional ERM (empirical risk minimization) approach, which advocates the use of large datasets to boost generalization. We thus compare the two when scaling the number of training trajectories and environments. The models are trained on a range of environments, each corresponding to different coefficients of the underlying PDE, and evaluated on the same environments with different initial conditions.

Comparison of ERM approaches (shades of blue) and Poseidon (green) with our adaptive conditioning approach GEPS (red). We evaluate the methods when increasing the number of training trajectories (top) and the number of environments (bottom) for two PDEs: Gray-Scott (left) and Burgers (right).

We show experimentally that traditional ERM methods fail to capture the diversity of behaviors: their performance stagnates as the number of environments and the number of training trajectories per environment increase. Poseidon is able to capture such diversity, but it performs worse than GEPS despite being pretrained on very large datasets. GEPS benefits from being trained on a large number of environments and training trajectories, as its performance scales with both.

GEPS framework

We introduce our framework for learning to adapt neural PDE solvers to unseen environments. It leverages a first-order adaptation rule for rapid, low-rank adaptation to a new PDE instance. We consider two common settings for learning PDE solvers: the first relies on purely data-driven approaches, and the second leverages prior information to learn hybrid neural PDE solvers.

Our adaptation framework for a hybrid model. For the agnostic model, the physics component (bottom left) is removed. Blocks in blue refer to the physical (bottom left) and data-driven (bottom right) modules. Blocks in pink refer to the trainable modules. The green block describes the adaptation mechanism for the physical (top left) and data-driven (top right) components. c^e is the context used to adapt the two components.
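To make the idea of adapting through a context c^e concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: this page does not spell out the adaptation rule, so the sketch assumes a low-rank update of the form W^e = W + A diag(c^e) B, in which W, A and B are shared across environments and only the context vector c^e is fitted for a new environment. A single linear layer stands in for the neural solver, and the names adapted_weight and adapt_context are hypothetical.

```python
import numpy as np

def adapted_weight(W, A, B, c):
    """Shared weight plus an assumed low-rank, context-dependent update: W + A diag(c) B."""
    # A * c scales column k of A by c[k], which equals A @ np.diag(c).
    return W + (A * c) @ B

def adapt_context(W, A, B, x, y, steps=100, lr=0.5):
    """Fit only the context vector c by gradient descent on the mean squared error.

    W, A and B stay frozen, so adapting to a new environment touches only
    r = A.shape[1] parameters.
    """
    c = np.zeros(A.shape[1])
    n = x.shape[0]
    for _ in range(steps):
        pred = x @ adapted_weight(W, A, B, c).T   # shape (n, d_out)
        resid = pred - y                          # shape (n, d_out)
        # dL/dc_k = (2/n) * sum_i (resid_i . A[:, k]) * (B[k, :] . x_i)
        grad = (2.0 / n) * np.sum((resid @ A) * (x @ B.T), axis=0)
        c = c - lr * grad
    return c

# Toy data (hypothetical): a new environment generated with context c_true.
W = 0.1 * np.ones((4, 3))          # shared weight, d_out=4, d_in=3
A = np.eye(4)[:, :2]               # shared factor, shape (4, 2)
B = np.eye(2, 3)                   # shared factor, shape (2, 3)
c_true = np.array([0.7, -0.3])
x = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
y = x @ adapted_weight(W, A, B, c_true).T

c_hat = adapt_context(W, A, B, x, y)
```

In this toy setup the adapted layer reproduces the new environment's data after fitting only two numbers, which is the appeal of a low-rank conditioning scheme: the shared parameters carry what is common to all environments and the context carries what is specific to one.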

Results

We provide quantitative results for our method compared to other adaptive conditioning approaches.

In-distribution and out-of-distribution results on 32 new test trajectories per environment. For out-of-distribution, models are fine-tuned on 1 trajectory per environment. The metric is the relative L2 error. '--' indicates that inference diverged.
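For readers unfamiliar with the metric, the sketch below shows one common way to compute a relative L2 error. The exact convention used for the table is not stated on this page, so this version is an assumption: the norm of the prediction error is divided by the norm of the ground truth for each trajectory, and the ratios are averaged over trajectories. The function name relative_l2 and the eps safeguard are hypothetical.

```python
import numpy as np

def relative_l2(pred, target, eps=1e-12):
    """Mean over trajectories of ||pred - target||_2 / ||target||_2.

    The first axis indexes trajectories; all remaining axes (time, space,
    channels) are flattened before taking the norm.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    p = pred.reshape(pred.shape[0], -1)
    t = target.reshape(target.shape[0], -1)
    num = np.linalg.norm(p - t, axis=1)
    den = np.linalg.norm(t, axis=1)
    # eps only guards against division by zero for an all-zero target.
    return float(np.mean(num / (den + eps)))

# Toy check (hypothetical data): a uniform 10% overshoot on every value.
target = np.ones((2, 4))
pred = 1.1 * target
err = relative_l2(pred, target)
```

Because the error is normalized by the magnitude of the ground truth, values are comparable across PDEs whose solutions live on different scales.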

We provide qualitative results for the Burgers equation. For more details, please refer to our paper.