- The paper demonstrates that reverse-mode automatic differentiation is mathematically equivalent to the adjoint-state method, enabling flexible seismic inversion frameworks.
- The methodology uses deep learning platforms to compute gradients automatically, so the same simulation code runs efficiently on GPUs and scales to distributed systems.
- The approach efficiently reconstructs complex subsurface models and earthquake parameters, paving the way for advanced, scalable geophysical simulations.
A General Approach to Seismic Inversion with Automatic Differentiation
The paper presents a framework for seismic inversion that leverages reverse-mode automatic differentiation (AD) as a computational tool, offering an alternative to the adjoint-state method traditionally used in seismology. The work is underpinned by the insight that reverse-mode AD and the adjoint-state method are mathematically equivalent, facilitating the development of a flexible framework for seismic inverse problems, termed ADSeismic. This framework capitalizes on the infrastructure provided by deep learning platforms, such as TensorFlow and PyTorch, allowing it to efficiently compute gradients on CPU, GPU, and TPU architectures.
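The stated equivalence can be made concrete for a generic time-stepping simulation. The notation below (state $u_n$, step map $F_n$, model parameters $m$, per-step misfit $r_n$, adjoint variable $\lambda_n$) is introduced here for illustration and is not taken from the paper.

```latex
% Forward simulation and objective
u_{n+1} = F_n(u_n, m), \quad n = 0, \dots, N-1, \qquad
J(m) = \sum_{n=1}^{N} r_n(u_n)

% Backward (adjoint) recursion, with \lambda_n = dJ/du_n
\lambda_N = \frac{\partial r_N}{\partial u_N}, \qquad
\lambda_n = \left(\frac{\partial F_n}{\partial u_n}\right)^{T} \lambda_{n+1}
          + \frac{\partial r_n}{\partial u_n}, \quad n = N-1, \dots, 1

% Gradient with respect to the model
\frac{dJ}{dm} = \sum_{n=0}^{N-1} \left(\frac{\partial F_n}{\partial m}\right)^{T} \lambda_{n+1}
```

Reverse-mode AD evaluates exactly these transposed-Jacobian products, in this reverse order, when it backpropagates through the recorded steps. The discrete adjoint-state method arrives at the same recursion by introducing $\lambda_n$ as Lagrange multipliers for the constraints $u_{n+1} = F_n(u_n, m)$.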
Methodology
The core innovation lies in adapting AD, which is used predominantly for training deep neural networks, to seismic inverse problems. The sequences of linear and nonlinear operations that AD differentiates map naturally onto the discretized numerical simulations of partial differential equations (PDEs) commonly used in seismology. With reverse-mode AD, the framework computes derivatives automatically, without deriving adjoint equations by hand for each problem. The seismic simulations are expressed and executed with ADCME, a deep-learning-based library.
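To illustrate how a discretized wave simulation lines up with reverse-mode differentiation, here is a minimal sketch in plain Python. It is not ADSeismic or ADCME code. It assumes a 1D scalar wave equation with fixed boundaries, a leapfrog scheme, a Ricker source, and a least-squares misfit at one receiver. All function names are hypothetical. The reverse sweep is written by hand to show what an AD tool would generate from the forward loop, and it is checked against a finite difference.

```python
import math

def ricker(nt, dt, f0):
    """Ricker wavelet sampled at nt time steps (peak delayed by 1/f0)."""
    out = []
    for n in range(nt):
        a = (math.pi * f0 * (n * dt - 1.0 / f0)) ** 2
        out.append((1.0 - 2.0 * a) * math.exp(-a))
    return out

def forward(m, src, isrc, k):
    """Leapfrog time stepping; m[i] is squared velocity, k = dt^2 / dx^2."""
    n = len(m)
    hist = [[0.0] * n, [0.0] * n]            # the two initial time levels
    for s in src:
        u_prev, u = hist[-2], hist[-1]
        u_next = [0.0] * n                    # boundary values stay at zero
        for i in range(1, n - 1):
            lap = u[i + 1] - 2.0 * u[i] + u[i - 1]
            u_next[i] = 2.0 * u[i] - u_prev[i] + k * m[i] * lap
        u_next[isrc] += s                     # point-source injection
        hist.append(u_next)
    return hist

def misfit(m, src, data, isrc, irec, k):
    """Least-squares misfit between simulated and recorded receiver traces."""
    hist = forward(m, src, isrc, k)
    return 0.5 * sum((hist[t + 2][irec] - d) ** 2 for t, d in enumerate(data))

def gradient(m, src, data, isrc, irec, k):
    """Reverse sweep over the stored forward states (hand-written reverse mode)."""
    n, nt = len(m), len(src)
    hist = forward(m, src, isrc, k)
    adj = [[0.0] * n for _ in range(nt + 2)]  # adj[j] holds dJ/d hist[j]
    grad = [0.0] * n
    for t in reversed(range(nt)):
        adj[t + 2][irec] += hist[t + 2][irec] - data[t]   # inject data residual
        u, a_next, a_u, a_prev = hist[t + 1], adj[t + 2], adj[t + 1], adj[t]
        for i in range(1, n - 1):
            g = a_next[i]
            grad[i] += g * k * (u[i + 1] - 2.0 * u[i] + u[i - 1])
            a_u[i] += g * (2.0 - 2.0 * k * m[i])
            a_u[i + 1] += g * k * m[i]
            a_u[i - 1] += g * k * m[i]
            a_prev[i] -= g
    return grad

# Synthetic test: data from a model with a fast zone, gradient at a uniform model.
n, nt, dx, dt = 21, 80, 1.0, 0.5
k = dt ** 2 / dx ** 2
isrc, irec = 5, 15
src = ricker(nt, dt, f0=0.1)
m_true = [1.0] * n
m_true[9:12] = [1.5, 1.5, 1.5]
data = [h[irec] for h in forward(m_true, src, isrc, k)[2:]]
m0 = [1.0] * n
grad = gradient(m0, src, data, isrc, irec, k)

# Central finite difference on one model cell for comparison.
j, eps = 10, 1e-6
m_plus, m_minus = list(m0), list(m0)
m_plus[j] += eps
m_minus[j] -= eps
fd = (misfit(m_plus, src, data, isrc, irec, k)
      - misfit(m_minus, src, data, isrc, irec, k)) / (2.0 * eps)
print("reverse sweep:", grad[j], "finite difference:", fd)
```

The backward loop is itself a leapfrog recursion run in reverse time and driven by the data residuals, which is the discrete adjoint wave equation. With an AD framework only `forward` and `misfit` would need to be written; the contents of `gradient` would be produced automatically.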
The paper reports significant speedups when moving from CPUs to GPUs: 20-fold for acoustic wave equations and 60-fold for elastic wave equations. These gains are attributed to the parallelism of GPU architectures, exploited through TensorFlow's automatic parallelization. The framework also supports distributed computing across multiple GPUs, which accommodates larger problems by distributing source functions across devices and aggregating their outputs.
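The source-distribution pattern works because the total misfit is a sum over sources, so its gradient is the sum of per-source gradients that can be computed independently. The sketch below shows only that aggregation step. It makes two loud assumptions: worker threads stand in for GPUs, and a toy linear forward model (a small matrix `G` per source) stands in for the wave simulation. The names `source_gradient` and `total_gradient` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def source_gradient(task):
    """Gradient of 0.5 * ||G m - d||^2 for one source (toy linear forward model)."""
    G, d, m = task
    residual = [sum(g_ij * m_j for g_ij, m_j in zip(row, m)) - d_i
                for row, d_i in zip(G, d)]
    # G^T * residual
    return [sum(G[i][j] * residual[i] for i in range(len(G)))
            for j in range(len(m))]

def total_gradient(sources, m, workers=2):
    """Evaluate per-source gradients in parallel workers, then sum them."""
    tasks = [(G, d, m) for G, d in sources]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_source = list(pool.map(source_gradient, tasks))
    return [sum(column) for column in zip(*per_source)]

# Three toy "sources", each with its own operator and data.
sources = [
    ([[1.0, 0.0], [0.0, 2.0]], [1.0, 1.0]),
    ([[1.0, 1.0]], [0.0]),
    ([[2.0, -1.0], [0.5, 0.5]], [1.0, 2.0]),
]
m = [1.0, 2.0]
parallel = total_gradient(sources, m)
serial = [sum(column) for column in
          zip(*(source_gradient((G, d, m)) for G, d in sources))]
print(parallel, serial)
```

Each worker needs only the current model and its own source data, and the only communication is the final sum, which is why this layout scales to larger numbers of sources.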
Applications and Results
ADSeismic has been tested across a broad spectrum of inverse seismic applications, demonstrating its versatility and robustness:
- Velocity Model Estimation: The framework successfully reconstructs complex subsurface models like the Marmousi model, showcasing its efficacy in capturing intricate earth structures.
- Earthquake Location and Source Time Function Retrieval: The approach is extended to estimate earthquake hypocenter locations and source time functions, producing accurate fits to observed seismic data. ADSeismic's flexible parameterization makes it possible to handle the discontinuities inherent in delta-function sources.
- Rupture Imaging: In modeling seismic rupture processes, the framework captures spatiotemporal phenomena with minimal changes to forward simulation code. This is crucial for understanding dynamic earthquake slip.
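For the earthquake-location item above, the difficulty with a delta-function source on a grid is that its position enters through an integer index, which has no useful derivative. The summary does not say which parameterization ADSeismic uses. The sketch below assumes one common workaround, a narrow Gaussian centered at a continuous location `x0`, and all names are hypothetical.

```python
import math

def nearest_index_source(x0, dx):
    """Grid delta: the location enters only through a rounded index."""
    return round(x0 / dx)

def gaussian_source(x0, xs, dx, sigma):
    """Smooth spatial weights approximating a unit point source at x0."""
    norm = dx / (sigma * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-0.5 * ((x - x0) / sigma) ** 2) for x in xs]

def gaussian_source_dx0(x0, xs, dx, sigma):
    """Analytic derivative of each weight with respect to the location x0."""
    w = gaussian_source(x0, xs, dx, sigma)
    return [w_i * (x - x0) / sigma ** 2 for w_i, x in zip(w, xs)]

dx, sigma, x0 = 1.0, 1.5, 40.3
xs = [i * dx for i in range(101)]

weights = gaussian_source(x0, xs, dx, sigma)
dweights = gaussian_source_dx0(x0, xs, dx, sigma)

# Finite-difference check of the location derivative.
eps = 1e-5
w_plus = gaussian_source(x0 + eps, xs, dx, sigma)
w_minus = gaussian_source(x0 - eps, xs, dx, sigma)
fd = [(p - q) / (2.0 * eps) for p, q in zip(w_plus, w_minus)]
max_err = max(abs(a - b) for a, b in zip(dweights, fd))

# A small shift leaves the rounded index unchanged, so it carries no gradient.
same_index = nearest_index_source(x0, dx) == nearest_index_source(x0 + 1e-3, dx)
print(sum(weights), max_err, same_index)
```

Because the weights vary smoothly with `x0`, a gradient-based optimizer can move the source location continuously. The width `sigma` is a tuning choice that trades sharpness of the source against smoothness of the misfit.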
Implications and Future Directions
Beyond computational efficiency, the generic design of ADSeismic lets it adapt as the software and hardware developed for deep learning evolve. As AD techniques advance and specialized hardware continues to develop, ADSeismic's capabilities are expected to grow, lowering the barrier to experimenting with novel seismological models.
Limitations and Challenges
The framework is not without challenges. Reverse-mode AD is memory-intensive because it stores intermediate states of the forward simulation, so memory becomes a bottleneck for larger models. Techniques such as checkpointing reduce memory use at the cost of extra recomputation, so a trade-off between memory and run time remains. The framework also inherits the ill-posedness of inverse problems, particularly cycle-skipping, which calls for a good initial model or regularization.
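The memory trade-off behind checkpointing can be shown on a toy recurrence. The sketch below is a simple fixed-interval scheme, not the strategy of any particular library. It assumes a scalar state, a made-up step function `step`, and the objective J = 0.5 * x_N^2. It compares storing every state with storing every tenth state and recomputing each segment during the backward pass.

```python
import math

DT = 0.1

def step(x, m):
    """One toy time step (stands in for a wave-equation update)."""
    return x + DT * m * math.sin(x)

def step_vjp(x, m, a):
    """Pull the adjoint a back through one step: returns (a*dstep/dx, a*dstep/dm)."""
    return a * (1.0 + DT * m * math.cos(x)), a * DT * math.sin(x)

def grad_store_all(x0, m, nt):
    """Reverse mode with every forward state kept in memory."""
    xs = [x0]
    for _ in range(nt):
        xs.append(step(xs[-1], m))
    a, g = xs[-1], 0.0                      # dJ/dx_N for J = 0.5 * x_N^2
    for n in reversed(range(nt)):
        a, gm = step_vjp(xs[n], m, a)
        g += gm
    return g, len(xs)

def grad_checkpointed(x0, m, nt, every):
    """Keep one state per segment; recompute the rest of each segment on demand."""
    ckpt, x = {}, x0
    for n in range(nt):
        if n % every == 0:
            ckpt[n] = x
        x = step(x, m)
    a, g, peak = x, 0.0, len(ckpt)
    for start in sorted(ckpt, reverse=True):
        stop = min(start + every, nt)
        seg = [ckpt[start]]                 # recompute states start..stop-1
        for _ in range(stop - start - 1):
            seg.append(step(seg[-1], m))
        peak = max(peak, len(ckpt) + len(seg))
        for x_n in reversed(seg):
            a, gm = step_vjp(x_n, m, a)
            g += gm
    return g, peak

g_full, mem_full = grad_store_all(0.5, 0.8, 100)
g_ckpt, mem_ckpt = grad_checkpointed(0.5, 0.8, 100, every=10)
print(g_full, g_ckpt, mem_full, mem_ckpt)
```

Both routines return the same gradient. The checkpointed version holds far fewer states at once but runs nearly every forward step twice, which is the memory-versus-run-time balance described above.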
In conclusion, the paper makes a sound case for AD as a tool for seismic inversion: it broadens the computational toolkit available to geophysicists and offers a general approach to problems that have traditionally required complex, case-specific treatment.