Why Julia’s GPU Accelerated ODE Solvers are 20x-100x Faster than Jax and PyTorch


You may have seen the benchmark results and thought, “how the heck are the Julia ODE solvers on GPUs orders of magnitude faster than the GPU-accelerated Python libraries, that can’t be true?” In this talk I will go into detail about the architectural differences between the Julia approaches to generating GPU-accelerated solvers vs the standard ML library approach to GPU usage. By the end of the talk you’ll have a good enough understanding of models of GPU acceleration to understand why this performance difference exists, and the many applications that can take advantage of this performance improvement.
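
To make the architectural contrast concrete, here is a minimal sketch of the kernel-generation approach using DiffEqGPU.jl, where the whole solver for an ensemble is compiled into a single GPU kernel rather than launching one kernel per array operation as ML libraries do. The Lorenz setup and parameter values here are just illustrative, not the talk's benchmark:

```julia
using OrdinaryDiffEq, DiffEqGPU, CUDA, StaticArrays

# Out-of-place RHS on static arrays so the whole solve can live in one kernel
function lorenz(u, p, t)
    du1 = p[1] * (u[2] - u[1])
    du2 = u[1] * (p[2] - u[3]) - u[2]
    du3 = u[1] * u[2] - p[3] * u[3]
    SVector{3}(du1, du2, du3)
end

u0 = @SVector [1.0f0, 0.0f0, 0.0f0]
p = @SVector [10.0f0, 28.0f0, 8.0f0 / 3.0f0]
prob = ODEProblem{false}(lorenz, u0, (0.0f0, 10.0f0), p)
eprob = EnsembleProblem(prob, safetycopy = false)

# One fused kernel integrates all 10,000 trajectories simultaneously
sol = solve(eprob, GPUTsit5(), EnsembleGPUKernel(CUDA.CUDABackend());
            trajectories = 10_000)
```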

Claude Code sucks but is still useful: experiences maintaining Julia’s SciML scientific computing infrastructure


So it’s pretty public that for about a month now I’ve had 32 processes set up on one of the 64-core, 128GB RAM servers: I just ssh in, tmux to a window, and tell one to slam on some things non-stop. And it has been really successful!… with the right definition of success. Let me explain.

This is a repost of the long post in the Julia Discourse.

* How is Claude being used, and how useful has it been?

— j-bowhay (post 1, topic 131009)

I think the first will answer the others. Basically, Claude is really not smart at all. There is no extensive algorithm implementation that has come from AI. I know some GSoCers and SciML Small Grants applicants have used AI (many without disclosure) but no wholesale usage has … READ MORE

Implicit ODE Solvers Are Not Universally More Robust than Explicit ODE Solvers, Or Why No ODE Solver is Best


A very common adage in ODE solvers is that if you run into trouble with an explicit method, usually some explicit Runge-Kutta method like RK4, then you should try an implicit method. Implicit methods, because they do more work, solving an implicit system via a Newton method and thus having “better” stability, should be the thing you go to on the “hard” problems.

This is at least what I heard at first, and then I learned about edge cases. Specifically, you hear people say “but for hyperbolic PDEs you need to use explicit methods”. You might even intuit from this “PDEs can have special properties, so sometimes special things can happen with PDEs… but ODEs, that should use implicit methods if you need more robustness”. This turns out to not be true, and really understanding the ODEs will help us understand better … READ MORE
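
As a baseline for the adage the post then picks apart, here is a quick sketch (my own, not from the post) of the classic Robertson stiffness test in DifferentialEquations.jl, where the standard advice works exactly as advertised:

```julia
using OrdinaryDiffEq

# Robertson chemical kinetics: a classic stiff test problem
function rober!(du, u, p, t)
    y1, y2, y3 = u
    du[1] = -0.04y1 + 1e4 * y2 * y3
    du[2] = 0.04y1 - 1e4 * y2 * y3 - 3e7 * y2^2
    du[3] = 3e7 * y2^2
end
prob = ODEProblem(rober!, [1.0, 0.0, 0.0], (0.0, 1e5))

# Explicit RK is forced into tiny steps by stability, not accuracy;
# the implicit Rosenbrock method steps right through the stiffness.
sol_explicit = solve(prob, Tsit5())
sol_implicit = solve(prob, Rosenbrock23())
```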

Machine learning with hard constraints: Neural Differential-Algebraic Equations (DAEs) as a general formalism


We recently released a new manuscript, Semi-Explicit Neural DAEs: Learning Long-Horizon Dynamical Systems with Algebraic Constraints, where we showed a way to develop neural networks in which any arbitrary constraint function can be directly imposed throughout the evolution equation to near floating point accuracy. However, in true academic form, the paper focuses directly on getting to the point about the architecture. Here I want to elaborate on the mathematical structures that surround the object, particularly the differential-algebraic equation (DAE): how its various formulations lead to the various architectures (such as stabilized neural ODEs), and how you would build the other related architectures which haven’t had a paper yet (and in what circumstances they would make sense).

READ MORE
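
For readers who want the DAE formalism in concrete code first, here is a minimal mass-matrix sketch of a semi-explicit DAE in DifferentialEquations.jl, using the Robertson system’s conservation law as the algebraic constraint. In the neural versions discussed in the manuscript, parts of the right-hand side would be learned; this plain example is mine, not the paper’s architecture:

```julia
using OrdinaryDiffEq, LinearAlgebra

# Semi-explicit DAE in mass-matrix form: M u' = f(u, p, t),
# where a zero row of M marks an algebraic equation g(u) = 0.
function rober_dae!(du, u, p, t)
    y1, y2, y3 = u
    du[1] = -0.04y1 + 1e4 * y2 * y3
    du[2] = 0.04y1 - 1e4 * y2 * y3 - 3e7 * y2^2
    du[3] = y1 + y2 + y3 - 1.0   # algebraic constraint: conservation of mass
end
M = Diagonal([1.0, 1.0, 0.0])    # singular mass matrix, third row algebraic
f = ODEFunction(rober_dae!, mass_matrix = M)
prob = ODEProblem(f, [1.0, 0.0, 0.0], (0.0, 1e5))
sol = solve(prob, Rodas5P())     # a mass-matrix-capable stiff solver
```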

How chaotic is chaos? How some AI for Science / SciML papers are overstating accuracy claims


Just how chaotic are chaotic systems? Many of you may have heard of “the butterfly effect” but don’t quite know the mathematics behind such systems. What I want to demonstrate is the “sensitive dependence on initial conditions” property of chaotic systems and just how sensitive these systems are. The reason this has come up is that I have seen some AI papers claiming to be able to predict the time series of a chaotic system (many more can be found online too, I am just highlighting a few random ones). What I want to bring to the forefront is an examination of what is really being claimed: just how hard is it to actually forecast a chaotic system? And if these papers aren’t doing that, what have they done instead?
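
A quick way to see sensitive dependence on initial conditions for yourself (a minimal sketch, not from the post): perturb the Lorenz initial condition in the tenth decimal place and watch the trajectories fully decorrelate.

```julia
using OrdinaryDiffEq

function lorenz!(du, u, p, t)
    du[1] = 10.0 * (u[2] - u[1])
    du[2] = u[1] * (28.0 - u[3]) - u[2]
    du[3] = u[1] * u[2] - (8 / 3) * u[3]
end

prob1 = ODEProblem(lorenz!, [1.0, 0.0, 0.0], (0.0, 40.0))
prob2 = remake(prob1, u0 = [1.0 + 1e-10, 0.0, 0.0])  # perturb by 1e-10
sol1 = solve(prob1, Vern9(), abstol = 1e-12, reltol = 1e-12)
sol2 = solve(prob2, Vern9(), abstol = 1e-12, reltol = 1e-12)

# After a few dozen time units the two trajectories are completely different
@show sol1(40.0) - sol2(40.0)
```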

Quick Understanding of Chaos: Sensitive Dependence and the Shadowing Lemma

First of … READ MORE

Open Source Component-Based Modeling with ModelingToolkit


Component-based modeling systems such as Simulink and Dymola allow for building scientific models in a way that can be composed. For example, Bob can build a model of an engine, and Alice can build a model of a drive shaft, and you can then connect the two models and have a model of a car. These kinds of tools are used throughout industrial modeling and simulation to enable “separation of concerns”: experts engineer their own domain, and the final digital twins are composed from reusable scientific modules. But what about open source? In this talk we will introduce ModelingToolkit, an open source component-based modeling framework that allows for composing pre-built models and scales to large high-fidelity digital twins.

PyData is … READ MORE
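
For a flavor of what component composition looks like in code, here is the canonical RC-circuit sketch using ModelingToolkitStandardLibrary. API details shift between ModelingToolkit versions, so treat this as illustrative rather than definitive:

```julia
using ModelingToolkit, OrdinaryDiffEq
using ModelingToolkitStandardLibrary.Electrical
using ModelingToolkitStandardLibrary.Blocks: Constant
using ModelingToolkit: t_nounits as t

# Components built independently, then wired together via connections
@named resistor = Resistor(R = 1.0)
@named capacitor = Capacitor(C = 1.0, v = 0.0)
@named source = Voltage()
@named constant = Constant(k = 1.0)
@named ground = Ground()

rc_eqs = [connect(constant.output, source.V)
          connect(source.p, resistor.p)
          connect(resistor.n, capacitor.p)
          connect(capacitor.n, source.n, ground.g)]

@named rc_model = ODESystem(rc_eqs, t;
    systems = [resistor, capacitor, constant, source, ground])
sys = structural_simplify(rc_model)  # eliminate redundant connection equations
prob = ODEProblem(sys, Pair[], (0.0, 10.0))
sol = solve(prob, Tsit5())
```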

The Numerical Analysis of Differentiable Simulation: Automatic Differentiation Can Be Incorrect


ISCL Seminar Series

The Numerical Analysis of Differentiable Simulation: How Automatic Differentiation of Physics Can Give Incorrect Derivatives

Scientific machine learning (SciML) relies heavily on automatic differentiation (AD), the process of constructing gradients of programs, here models which integrate machine learning into mechanistic models, for the purpose of gradient-based optimization. While these differentiable programming approaches pitch an idea of “simply put the simulator into a loss function and use AD”, it turns out there are a lot more subtle details to consider in practice. In this talk we will dive into the numerical analysis of differentiable simulation and ask the question: how numerically stable and robust is AD? We will use examples from the Python-based Jax (diffrax) and PyTorch (torchdiffeq) libraries in order to demonstrate how canonical … READ MORE
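
As a Julia-flavored analogue of the setup (my sketch; the talk’s demonstrations use Jax and PyTorch), this is what “put the simulator into a loss function and use AD” looks like with forward-mode dual numbers pushed straight through an adaptive solver, where the derivatives flow through the stepping and error-control logic, which is exactly where the subtleties arise:

```julia
using OrdinaryDiffEq, ForwardDiff

# Discrete forward-mode AD straight through an adaptive solver:
# dual numbers propagate through every step, including the step-size
# controller, which is where derivative accuracy can quietly degrade.
function final_state(p)
    prob = ODEProblem((u, p, t) -> p[1] * u, 1.0, (0.0, 1.0), p)
    solve(prob, Tsit5(), abstol = 1e-8, reltol = 1e-8)[end]
end

g = ForwardDiff.gradient(final_state, [-0.5])
# Analytical check: u(1) = exp(p), so du(1)/dp = exp(-0.5) at p = -0.5
@show g[1] - exp(-0.5)
```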

JuliaSim: Building a Product which improves Open Source Sustainability


January 26 2025 in Differential Equations, HPC, Julia, Scientific ML | Tags: | Author: Christopher Rackauckas

How do you build products that support open source communities? In this non-technical talk with OpenTeams I discuss how the MIT Julia Lab, PumasAI, and JuliaHub have all been essential pillars of the Julia open source community in its goal to achieve sustainable open science. If you’ve ever been curious about the difference between the Julia Lab and JuliaHub, the evolution of these groups, and what kinds of different contributions they make to the open source community, in this talk I go through as many details as I could!

Differences Between Methods for Solving Stiff ODEs


April 6 2024 in Differential Equations, Mathematics | Tags: | Author: Christopher Rackauckas

I found these notes from August 2018 and thought they might be useful so I am posting them verbatim.

A stiff ordinary differential equation is a difficult problem to integrate. However, many of the ODE solver suites offer quite a few different choices for this kind of problem; DifferentialEquations.jl offers almost 200 different choices, for example. In this article we will dig into what the differences between these integrators really are so that you can more easily find which one will be most efficient for your problem.

Quick Overview (tl;dr)

  1. BDF, Rosenbrock, and ESDIRK methods are the standard choices (compared in the sketch after this list)
  2. For small equations, Rosenbrock methods have performance advantages
  3. For very stiff systems, Rosenbrock and Rosenbrock-W methods do not require convergence of Newton’s method and thus can take larger steps, being more efficient
  4. BDF integrators are only L-stable (and A-stable) to order 2, so if the problem is … READ MORE
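
A minimal sketch (mine, using the stiff Van der Pol benchmark setup) of how you would pit these method families against each other on a single problem:

```julia
using OrdinaryDiffEq

# Stiff Van der Pol oscillator (μ = 1e6), a standard stiff benchmark
function vdp!(du, u, p, t)
    du[1] = u[2]
    du[2] = p[1] * ((1 - u[1]^2) * u[2] - u[1])
end
prob = ODEProblem(vdp!, [2.0, 0.0], (0.0, 6.3), [1e6])

sol_ros    = solve(prob, Rodas5P())  # Rosenbrock: strong on small systems
sol_bdf    = solve(prob, FBDF())     # BDF multistep: A/L-stable only to order 2
sol_esdirk = solve(prob, KenCarp4()) # ESDIRK: singly diagonally implicit RK

# Compare the work done (stats field names can vary across versions)
@show sol_ros.stats.nf sol_bdf.stats.nf sol_esdirk.stats.nf
```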

Direct Automatic Differentiation of (Differential Equation) Solvers vs Analytical Adjoints: Which is Better?


Automatic differentiation of a “solver” is a subject with many details for doing it in the most effective form. For this reason, there are a lot of talks and courses that go into lots of depth on the topic. I recently gave a talk on some of the latest stuff in differentiable simulation with the American Statistical Association, and have some detailed notes on such adjoint derivations as part of the 18.337 Parallel Computing and Scientific Machine Learning graduate course at MIT. And there are entire organizations like my SciML Open Source Software Organization which work day-in and day-out on the development of new differentiable solvers.

I’ll give a brief summary of all of my materials below.

Continuous vs Discrete Differentiation of Solvers

AD of a solver can be done in essentially two different ways: either directly performing automatic … READ MORE
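
In SciMLSensitivity.jl, the continuous-vs-discrete choice is literally a keyword argument. Here is a minimal sketch (my example, assuming the sensealg choices below) contrasting a continuous adjoint with discrete reverse-mode AD of the solver’s own steps:

```julia
using OrdinaryDiffEq, SciMLSensitivity, Zygote

f!(du, u, p, t) = (du[1] = p[1] * u[1])
prob = ODEProblem(f!, [1.0], (0.0, 1.0), [-0.5])

# Continuous differentiation: derive and solve the adjoint ODE backwards
loss_cont(p) = sum(solve(prob, Tsit5(); p = p, abstol = 1e-8, reltol = 1e-8,
                         sensealg = InterpolatingAdjoint()))
# Discrete differentiation: reverse-mode AD through the solver's steps
loss_disc(p) = sum(solve(prob, Tsit5(); p = p, abstol = 1e-8, reltol = 1e-8,
                         sensealg = ReverseDiffAdjoint()))

@show Zygote.gradient(loss_cont, [-0.5])[1]
@show Zygote.gradient(loss_disc, [-0.5])[1]
```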