
An AD based library for Efficient Hessian and Hessian-Vector Product Computation on GPU (2410.22575v1)

Published 29 Oct 2024 in cs.DC

Abstract: Hessian-vector product computation appears in many scientific applications, such as optimization and finite element modeling. Often, Hessian-vector products must be computed at many data points concurrently. We propose an automatic differentiation (AD) based method, CHESSFAD (Chunked HESSian using Forward-mode AD), that is designed for efficient parallel computation of Hessians and Hessian-vector products. CHESSFAD computes second-order derivatives using forward mode and exposes parallelism at different levels that can be exploited on accelerators such as NVIDIA GPUs. In the CHESSFAD approach, the computation of a row of the Hessian matrix is independent of the computation of the other rows, so rows of the Hessian matrix can be computed concurrently. A second level of parallelism is exposed because the CHESSFAD approach partitions the computation of a Hessian row into chunks, and different chunks can be computed concurrently. CHESSFAD is implemented as a lightweight header-based C++ library that works on both CPUs and GPUs. We evaluate the performance of CHESSFAD on a large number of independent Hessian-vector products over a set of standard test functions and compare it with other existing header-based C++ libraries such as autodiff. Our results show that CHESSFAD outperforms autodiff on all of these functions, with average improvements ranging from 5% to 50%.
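
The row independence described in the abstract can be illustrated with a small forward-mode sketch. This is not CHESSFAD's code or API: the `HyperDual` type, the helper names `constant`, `hvp_row`, and `hvp`, and the choice of the 2D Rosenbrock function as a test function are all assumptions made for illustration, and the sketch does not implement chunking. It uses a second-order dual number seeded with the unit vector e_i in one direction and the vector v in the other, so that the mixed coefficient of f(x) equals entry i of the Hessian-vector product.

```cpp
#include <array>
#include <cstddef>

// Hypothetical second-order dual number:
// val + d1*eps1 + d2*eps2 + d12*eps1*eps2, with eps1^2 = eps2^2 = 0.
struct HyperDual {
    double val, d1, d2, d12;
};

inline HyperDual constant(double c) { return {c, 0.0, 0.0, 0.0}; }

inline HyperDual operator+(HyperDual a, HyperDual b) {
    return {a.val + b.val, a.d1 + b.d1, a.d2 + b.d2, a.d12 + b.d12};
}

inline HyperDual operator-(HyperDual a, HyperDual b) {
    return {a.val - b.val, a.d1 - b.d1, a.d2 - b.d2, a.d12 - b.d12};
}

inline HyperDual operator*(HyperDual a, HyperDual b) {
    return {a.val * b.val,
            a.d1 * b.val + a.val * b.d1,
            a.d2 * b.val + a.val * b.d2,
            a.d12 * b.val + a.d1 * b.d2 + a.d2 * b.d1 + a.val * b.d12};
}

// Illustrative test function (an assumption, not taken from the paper's list):
// f(x, y) = (1 - x)^2 + 100 * (y - x^2)^2.
inline HyperDual rosenbrock(const std::array<HyperDual, 2>& x) {
    HyperDual a = constant(1.0) - x[0];
    HyperDual b = x[1] - x[0] * x[0];
    return a * a + constant(100.0) * b * b;
}

// Entry i of H(x) * v. The eps1 direction is seeded with e_i and the eps2
// direction with v, so the eps1*eps2 coefficient of f is e_i^T H v.
// Each call touches only its own local data, so different rows could be
// evaluated by different threads.
template <std::size_t N, class F>
double hvp_row(F f, const std::array<double, N>& x,
               const std::array<double, N>& v, std::size_t i) {
    std::array<HyperDual, N> xs{};
    for (std::size_t j = 0; j < N; ++j) {
        xs[j] = {x[j], (j == i) ? 1.0 : 0.0, v[j], 0.0};
    }
    return f(xs).d12;
}

// Full Hessian-vector product, assembled row by row (serially here).
template <std::size_t N, class F>
std::array<double, N> hvp(F f, const std::array<double, N>& x,
                          const std::array<double, N>& v) {
    std::array<double, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = hvp_row(f, x, v, i);
    }
    return out;
}
```

Calling `hvp(rosenbrock, x, v)` with `x = {1, 2}` and `v = {1, 1}` evaluates the function twice, once per row. Because the loop iterations in `hvp` share no state, the loop is the natural place to map rows onto parallel workers; splitting a row further into chunks, as the abstract describes, is left out of this sketch.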
