Automatic differentiation (autodiff) is built on two transformations: Jacobian-vector products (JVPs) and vector-Jacobian products (VJPs).

We can see this at the last layer before a scalar-valued loss: the gradient of the loss with respect to that layer's input $x$ is

$$\nabla_x L(f(x)) = \partial f(x)^\top \, \nabla L(f(x)),$$

which is a vector-Jacobian product, with the vector being the gradient of the loss with respect to the layer's output.
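
A minimal sketch in JAX, assuming a made-up scalar loss `loss`; it shows that `jax.grad` is exactly this vector-Jacobian product with the cotangent set to one:

```python
import jax
import jax.numpy as jnp

def loss(w):
    # hypothetical scalar-valued loss, just for illustration
    return jnp.sum(jnp.tanh(w) ** 2)

w = jnp.array([0.1, -0.5, 2.0])

# jax.vjp returns the primal output and a pullback (the VJP closure)
y, pullback = jax.vjp(loss, w)

# pulling back the cotangent 1.0 gives the gradient of the scalar loss
grad_via_vjp, = pullback(jnp.ones_like(y))

assert jnp.allclose(grad_via_vjp, jax.grad(loss)(w))
```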

JVPs and VJPs also compose: for a composition of functions, the JVP of the composite is the composition of the individual JVPs (and likewise for VJPs, applied in reverse order), as the sketch below illustrates.
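
A quick sketch of this composition property with two hypothetical layers `h` and `g`: the JVP of `g ∘ h` in a single call agrees with chaining the layers' individual JVPs.

```python
import jax
import jax.numpy as jnp

# two made-up layers and their composition
h = jnp.sin
g = lambda u: u ** 2
f = lambda x: g(h(x))

x = jnp.array([0.3, 1.2])
v = jnp.array([1.0, 0.0])

# JVP of the composition in a single call
y, tangent = jax.jvp(f, (x,), (v,))

# chain rule: push the tangent through h, then through g
u, tu = jax.jvp(h, (x,), (v,))
y2, tangent2 = jax.jvp(g, (u,), (tu,))

assert jnp.allclose(y, y2) and jnp.allclose(tangent, tangent2)
```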

Jacobian-vector product

  • The right-hand side of $f(x + v) \approx f(x) + \partial f(x)\, v$ is the first two terms of the Taylor series of $f$ around $x$, so the JVP $\partial f(x)\, v$ tells us what happens to the output under a nudge in the direction of the vector $v$.
  • Evaluates the Jacobian one column at a time: the JVP with the $i$-th standard basis vector is the $i$-th column of $\partial f(x)$ (see the sketch after this list).
  • jax.jvp
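
A minimal sketch with a made-up function `f` from R^3 to R^2: one JVP per standard basis vector recovers one Jacobian column, matching `jax.jacfwd`.

```python
import jax
import jax.numpy as jnp

def f(x):
    # hypothetical function from R^3 to R^2
    return jnp.array([x[0] * x[1], jnp.sin(x[2])])

x = jnp.array([1.0, 2.0, 3.0])

# one JVP per input basis vector -> one Jacobian column each
cols = [jax.jvp(f, (x,), (e,))[1] for e in jnp.eye(3)]
J = jnp.stack(cols, axis=1)

assert jnp.allclose(J, jax.jacfwd(f)(x))
```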

vector-Jacobian product

  • Evaluates the Jacobian one row at a time: the VJP with the $i$-th output basis vector is the $i$-th row of $\partial f(x)$ (see the sketch after this list).
  • jax.vjp
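
And the mirror-image sketch with the same made-up `f`: one VJP per output basis vector recovers one Jacobian row, matching `jax.jacrev`.

```python
import jax
import jax.numpy as jnp

def f(x):
    # same hypothetical function from R^3 to R^2 as above
    return jnp.array([x[0] * x[1], jnp.sin(x[2])])

x = jnp.array([1.0, 2.0, 3.0])
y, pullback = jax.vjp(f, x)

# one VJP per output basis vector -> one Jacobian row each
rows = [pullback(e)[0] for e in jnp.eye(y.shape[0])]
J = jnp.stack(rows, axis=0)

assert jnp.allclose(J, jax.jacrev(f)(x))
```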