Generalizing Automatic Differentiation to Automatic Sparsity, Uncertainty, Stability, and Parallelism


Automatic differentiation is a “compiler trick” whereby a code that calculates f(x) is transformed into a code that calculates f'(x). This trick and its two forms, forward and reverse mode automatic differentiation, have become the pervasive backbone behind all of the machine learning libraries. If you ask what PyTorch or Flux.jl is doing that’s special, the answer is really that it’s doing automatic differentiation over some functions.

What I want to dig into in this blog post is a simple question: what is the trick behind automatic differentiation, why is it always differentiation, and are there other mathematical problems we can be focusing this trick towards? While very technical discussions on this can be found in our recent paper titled “ModelingToolkit: A Composable Graph Transformation System For Equation-Based Modeling” and descriptions of methods like intrusive uncertainty quantification, I want … READ MORE