- The paper presents a physics-informed Transformer model that leverages MLX to accurately simulate 2D heat conduction under complex Dirichlet boundary conditions.
- It employs two prediction modes: block prediction, which exploits the Transformer's capacity for long-term dependencies, and autoregressive stepwise forecasting, which tests stability over iterative rollouts.
- The numerical results show high fidelity between predictions and ground truth across challenging scenarios, highlighting the model's adaptability.
This paper presents an intriguing application of Transformer models to problems in engineering physics. The focus is on adapting Transformer architectures, via the MLX machine learning framework, to a classical problem: heat conduction in a two-dimensional plate with Dirichlet boundary conditions.
Context and Contribution
The research is motivated by the contrast between the widespread adoption of Transformer models in NLP and their limited uptake in physics and engineering. The contribution is a demonstration that Transformers can solve the heat conduction problem both adaptably and efficiently, using the MLX framework to exploit the unified memory architecture of Apple M-series processors.
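As a concrete illustration of that advantage, the minimal snippet below (restricted to the public mlx.core API) shows that MLX arrays need no explicit device transfers; it is a sketch of the framework's programming model, not code from the paper.

```python
import mlx.core as mx

# MLX arrays live in unified memory shared by CPU and GPU,
# so there is no .to(device) step as in other frameworks.
a = mx.random.uniform(shape=(64, 64))
b = mx.random.uniform(shape=(64, 64))
c = a @ b    # recorded lazily
mx.eval(c)   # force evaluation on the default device
```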
Methodological Approach
The authors employ a physics-informed Transformer model to predict the evolution of the temperature field in a 2D plate. Training, validation, and test datasets are generated by solving the heat equation with central finite differences; each sample has randomly initialized Dirichlet boundary conditions, a random initial interior temperature distribution, and a variable thermal diffusivity. Notably, the advanced configurations introduce additional complexity in the boundary conditions, stressing the model's ability to generalize. A minimal sketch of such a data generator follows.
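The sketch below shows one way to produce a single trajectory with an explicit (FTCS) central-difference scheme; the grid size, diffusivity range, and function name are assumptions, since the paper does not publish its generator.

```python
import numpy as np

def generate_sample(n=32, steps=100, rng=np.random.default_rng()):
    """One trajectory: FTCS central-difference solution of
    u_t = alpha * (u_xx + u_yy) with random Dirichlet boundaries."""
    dx = 1.0 / (n - 1)
    alpha = rng.uniform(0.5, 1.5)           # random thermal diffusivity
    dt = 0.2 * dx**2 / alpha                # FTCS stability: alpha*dt/dx^2 <= 1/4

    u = rng.uniform(0.0, 1.0, size=(n, n))  # random initial interior field
    # Random fixed Dirichlet values on the four edges
    u[0, :], u[-1, :] = rng.uniform(0, 1), rng.uniform(0, 1)
    u[:, 0], u[:, -1] = rng.uniform(0, 1), rng.uniform(0, 1)

    frames = [u.copy()]
    for _ in range(steps):
        lap = (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
               - 4.0 * u[1:-1, 1:-1]) / dx**2
        u[1:-1, 1:-1] += alpha * dt * lap   # explicit update; boundaries stay fixed
        frames.append(u.copy())
    return np.stack(frames)                 # (steps+1, n, n) sequence for the Transformer
```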
The paper explores two key predictive modes, sketched in code after the list:
- Block Prediction Mode: Here, the model is trained to predict an entire sequence of temperature evolution from a set of initial frames, capitalizing on the Transformer's ability to model long-term dependencies.
- Autoregressive Stepwise Mode: This mode trains the model to predict subsequent frames iteratively, using its own predictions as input, thereby testing its predictive stability over extended sequences.
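A minimal sketch of the two inference styles follows, assuming a hypothetical `model` callable that returns one predicted frame per input frame and, in block mode, accepts a `horizon` argument; neither interface is specified in the paper.

```python
import mlx.core as mx

def predict_block(model, init_frames, horizon):
    """Block mode: one forward pass emits the whole future sequence.
    The `horizon` keyword is a hypothetical placeholder."""
    return model(mx.stack(init_frames)[None], horizon=horizon)[0]

def predict_autoregressive(model, init_frames, horizon):
    """Stepwise mode: feed the model's own predictions back as context."""
    context = list(init_frames)
    for _ in range(horizon):
        window = mx.stack(context[-len(init_frames):])[None]  # sliding context window
        next_frame = model(window)[:, -1]   # take the prediction for the last frame
        context.append(next_frame[0])
    return mx.stack(context[len(init_frames):])
```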
The model demonstrates excellent performance across various configurations:
- In the base case, the model achieves low test losses in both block and autoregressive modes, indicating close agreement between predictions and ground truth.
- For more complex boundary conditions, the model is adapted by tuning hyperparameters such as the number of encoder layers and the learning-rate schedule, and it still maintains competitive performance; a hedged sketch of such a schedule follows the list.
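The paper does not publish its exact schedule; the sketch below shows a common warmup-plus-cosine-decay pattern built from MLX's schedulers, with all step counts and rates as assumed placeholders.

```python
import mlx.optimizers as optim

# Hypothetical values: warm up linearly for 500 steps, then cosine-decay.
warmup = optim.linear_schedule(0.0, 1e-3, 500)
decay = optim.cosine_decay(1e-3, 20_000)
schedule = optim.join_schedules([warmup, decay], [500])
optimizer = optim.Adam(learning_rate=schedule)  # MLX accepts a schedule callable
```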
Heatmaps of the final projection layer reveal how the model captures spatial dependencies, with particular weight concentrated in regions of high boundary-driven variability.
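Such a diagnostic can be reproduced roughly as below; `model.out_proj` is a hypothetical attribute name for the final linear layer mapping hidden states back to the n-by-n grid, since the paper does not name its modules.

```python
import matplotlib.pyplot as plt
import numpy as np

# MLX arrays convert to NumPy via np.array; weight shape is (n*n, d_model).
w = np.array(model.out_proj.weight)
n = int(np.sqrt(w.shape[0]))
plt.imshow(np.abs(w).mean(axis=1).reshape(n, n), cmap="viridis")
plt.colorbar(label="mean |weight|")
plt.title("Final projection layer: per-pixel weight magnitude")
plt.show()
```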
Implications and Future Directions
The successful application of Transformers to a PDE-governed problem underscores their potential beyond traditional NLP tasks. The work lays methodological groundwork for applying Transformers to more complex engineering scenarios, such as turbulence modeling and multi-phase flows, where capturing intricate spatial-temporal relationships is crucial.
Future work could explore distributed computing, optimize model configurations for different computational environments, and compare the efficiency of Transformer-based models against other neural network architectures, such as CNNs and RNNs, in physics-based simulations.
Conclusion
The paper exemplifies how Transformers can be adapted for complex, physics-informed modeling tasks, offering insights into model architecture choices, training strategies, and performance diagnostics. This pivot from language to physics highlights a promising avenue for expanding the applicability of advanced machine learning models within scientific computing and engineering disciplines.