Using Rewrite Strategies for Efficient Functional Automatic Differentiation (2307.02447v2)

Published 5 Jul 2023 in cs.PL

Abstract: Automatic Differentiation (AD) has become a dominant technique in ML. AD frameworks were first implemented for imperative languages using tapes. Meanwhile, functional implementations of AD have been developed, often based on dual numbers, which are close to the formal specification of differentiation and hence easier to prove correct. But this line of work has focussed on correctness rather than efficiency. Recently, it was shown how an approach using dual numbers could be made efficient through the right optimizations. The effect of optimizations depends heavily on the order in which they are applied, as one optimization can enable another. It can therefore be useful to have fine-grained control over the scheduling of optimizations. One method expresses compiler optimizations as rewrite rules, whose application can be combined and controlled using strategy languages. Previous work describes the use of term rewriting and strategies to generate high-performance code in a compiler for a functional language. In this work, we implement dual numbers AD in a functional array programming language using rewrite rules and strategy combinators for optimization. We aim to combine the elegance of differentiation using dual numbers with a succinct expression of the optimization schedule using a strategy language. We give preliminary evidence suggesting the viability of the approach on a micro-benchmark.
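The two ingredients named in the abstract, dual numbers AD and optimization through rewrite rules scheduled by strategy combinators, can be illustrated with a minimal sketch. This is not the paper's implementation: it uses Python rather than a functional array language, and every name in it (`Const`, `Var`, `Add`, `Mul`, `dual`, `add_zero`, `mul_zero`, `mul_one`, `choice`, `repeat`, `bottomup`, `evaluate`) is a hypothetical choice made for illustration. The `mul_zero` rule also assumes NaN and infinity can be ignored.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple, Union

# Tiny scalar expression language (hypothetical, for illustration only).
@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"

Expr = Union[Const, Var, Add, Mul]
# A rule or strategy returns the rewritten term, or None when it does not apply.
Strategy = Callable[[Expr], Optional[Expr]]

def dual(e: Expr, wrt: str) -> Tuple[Expr, Expr]:
    """Dual-number transform: pair each term with its tangent term."""
    if isinstance(e, Const):
        return e, Const(0.0)
    if isinstance(e, Var):
        return e, Const(1.0 if e.name == wrt else 0.0)
    (a, da), (b, db) = dual(e.left, wrt), dual(e.right, wrt)
    if isinstance(e, Add):
        return Add(a, b), Add(da, db)
    return Mul(a, b), Add(Mul(da, b), Mul(a, db))  # product rule

# Rewrite rules: each one handles a single algebraic identity.
def add_zero(e: Expr) -> Optional[Expr]:
    if isinstance(e, Add):
        if e.left == Const(0.0):
            return e.right
        if e.right == Const(0.0):
            return e.left
    return None

def mul_zero(e: Expr) -> Optional[Expr]:
    # Assumption: NaN and infinity are ignored, so 0 * x rewrites to 0.
    if isinstance(e, Mul) and Const(0.0) in (e.left, e.right):
        return Const(0.0)
    return None

def mul_one(e: Expr) -> Optional[Expr]:
    if isinstance(e, Mul):
        if e.left == Const(1.0):
            return e.right
        if e.right == Const(1.0):
            return e.left
    return None

# Strategy combinators: they decide where and how often rules are tried.
def choice(*rules: Strategy) -> Strategy:
    """Apply the first rule that succeeds; fail if none does."""
    def run(e: Expr) -> Optional[Expr]:
        for rule in rules:
            out = rule(e)
            if out is not None:
                return out
        return None
    return run

def repeat(s: Strategy) -> Callable[[Expr], Expr]:
    """Apply s until it fails; always succeeds with the last term."""
    def run(e: Expr) -> Expr:
        while True:
            out = s(e)
            if out is None:
                return e
            e = out
    return run

def bottomup(s: Callable[[Expr], Expr]) -> Callable[[Expr], Expr]:
    """Rewrite the children first, then the node itself (s must not fail)."""
    def run(e: Expr) -> Expr:
        if isinstance(e, (Add, Mul)):
            e = type(e)(run(e.left), run(e.right))
        return s(e)
    return run

def evaluate(e: Expr, env: dict) -> float:
    if isinstance(e, Const):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    l, r = evaluate(e.left, env), evaluate(e.right, env)
    return l + r if isinstance(e, Add) else l * r

# The optimization schedule is a single strategy expression.
simplify = bottomup(repeat(choice(add_zero, mul_zero, mul_one)))

# f(x) = x*x + 3*x
f = Add(Mul(Var("x"), Var("x")), Mul(Const(3.0), Var("x")))
primal, tangent = dual(f, "x")
optimized = simplify(tangent)
```

In this sketch the raw tangent term contains multiplications by 0 and 1 that the dual-number transform introduces mechanically, and the schedule removes them. Because the rules and the traversal are separate values, the schedule can be changed (a different traversal, a different rule order) without touching the rules themselves.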
