Adjoint Matching (Domingo-Enrich et al., 2025) is a recently proposed technique
in the generative modeling literature that fine-tunes a flow-matching
model to maximize a reward function under a standard KL regularization constraint. Normally, generating a sample
from a flow model involves sampling noise and then solving an ordinary differential equation (ODE) via numerical integration.
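The two-step sampling procedure can be sketched on a toy problem. The snippet below is a minimal illustration rather than a real flow model: it assumes a 1-D Gaussian data distribution N(M, S^2), standard normal noise, and the linear path X_t = (1 - t) X_0 + t X_1, for which the marginal velocity field has a closed form. The names `velocity`, `sample_ode`, `M`, and `S` are hypothetical.

```python
import random

# Hypothetical 1-D setup: noise N(0, 1), data N(M, S^2), linear path
# X_t = (1 - t) * X_0 + t * X_1. These choices are illustrative assumptions.
M, S = 2.0, 0.5


def velocity(x, t):
    """Closed-form marginal velocity E[X_1 - X_0 | X_t = x] for this Gaussian toy."""
    var_t = (1.0 - t) ** 2 + (t * S) ** 2   # Var[X_t]
    cov = t * S ** 2 - (1.0 - t)            # Cov[X_1 - X_0, X_t]
    return M + cov / var_t * (x - t * M)


def sample_ode(x0, num_steps=1000):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with forward Euler."""
    dt = 1.0 / num_steps
    x = x0
    for k in range(num_steps):
        x += dt * velocity(x, k * dt)
    return x


rng = random.Random(0)
x0 = rng.gauss(0.0, 1.0)   # step 1: draw noise
x1 = sample_ode(x0)        # step 2: solve the ODE numerically
```

In this toy the exact ODE solution is x1 = M + S * x0, so the generated sample is a deterministic function of the noise. In practice the velocity field would be a learned network and a higher-order integrator may be preferable.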
One may also convert this ODE into
a stochastic differential equation (SDE) that admits the same path marginals for any noise schedule (σ), using the score function:
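For reference, a standard form of this conversion is sketched below. The notation is assumed here and may differ from the original presentation: $v$ is the velocity field, $p_t$ is the marginal density of $X_t$, $\sigma_t$ is the noise schedule, and $B_t$ is a Brownian motion.

```latex
\begin{align*}
% ODE sampler: X_0 is noise, X_1 is the generated sample.
\mathrm{d}X_t &= v(X_t, t)\,\mathrm{d}t, \\
% SDE with the same marginals p_t for any noise schedule \sigma_t \ge 0.
\mathrm{d}X_t &= \Big( v(X_t, t) + \tfrac{\sigma_t^2}{2}\,\nabla \log p_t(X_t) \Big)\,\mathrm{d}t
                 + \sigma_t\,\mathrm{d}B_t .
\end{align*}
```

The extra drift term exactly cancels the diffusion introduced by the noise in the Fokker-Planck equation, which is why the marginals are unchanged.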
Since the optimal velocity field and the score have a simple relationship (e.g., see Eq. 4.79 in the flow-matching tutorial),
we can simplify the ODE-to-SDE conversion further. Here, we assume the flow velocity field reconstructs the flow defined by
:
Under a memoryless noise schedule (e.g.,
), the conversion can be further
simplified into
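As a worked instance, the chain of simplifications can be written out under one specific assumption: the linear interpolation $X_t = (1 - t) X_0 + t X_1$ with $X_0 \sim \mathcal{N}(0, I)$ as the noise. The original formulation may use a more general schedule, so the expressions below are a sketch under that assumption rather than a transcription.

```latex
\begin{align*}
% Score expressed through the optimal velocity field (linear path).
\nabla \log p_t(x) &= \frac{t\,v(x, t) - x}{1 - t}, \\
% Substituting the score into the marginal-preserving SDE.
\mathrm{d}X_t &= \Big( v(X_t, t) + \frac{\sigma_t^2}{2(1 - t)}\big( t\,v(X_t, t) - X_t \big) \Big)\mathrm{d}t
                 + \sigma_t\,\mathrm{d}B_t, \\
% Memoryless schedule and the resulting SDE.
\sigma_t = \sqrt{\frac{2(1 - t)}{t}}
  \;&\Longrightarrow\;
  \mathrm{d}X_t = \Big( 2\,v(X_t, t) - \frac{X_t}{t} \Big)\mathrm{d}t + \sigma_t\,\mathrm{d}B_t .
\end{align*}
```

With this schedule, $\sigma_t^2 / (2(1 - t)) = 1/t$, so the score term contributes $v(X_t, t) - X_t / t$ and the drift collapses to $2v - X_t/t$.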
The memoryless schedule ensures that the noise is independent of the output of the SDE:
.
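The memoryless property can be checked numerically on a toy problem. The sketch below assumes a 1-D Gaussian data distribution N(M, S^2), the linear path X_t = (1 - t) X_0 + t X_1, and the schedule sigma(t) = sqrt(2 (1 - t) / t), under which the SDE reads dX = (2 v - X / t) dt + sigma(t) dB. All names are hypothetical, and starting the integration at t = dt is a simplification made here to sidestep the singularity at t = 0.

```python
import math
import random

M, S = 2.0, 0.5  # hypothetical data distribution N(M, S^2); the noise is N(0, 1)


def velocity(x, t):
    """Closed-form marginal velocity for the linear path in this Gaussian toy."""
    var_t = (1.0 - t) ** 2 + (t * S) ** 2
    return M + (t * S ** 2 - (1.0 - t)) / var_t * (x - t * M)


def sigma(t):
    """Memoryless noise schedule assumed for the linear path."""
    return math.sqrt(2.0 * (1.0 - t) / t)


def sample_memoryless_sde(x0, rng, num_steps=200):
    """Euler-Maruyama for dX = (2 v(X, t) - X / t) dt + sigma(t) dB."""
    dt = 1.0 / num_steps
    x = x0
    # Start at t = dt instead of t = 0, where sigma(t) and X / t are singular.
    for k in range(1, num_steps):
        t = k * dt
        drift = 2.0 * velocity(x, t) - x / t
        x += drift * dt + sigma(t) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x


def correlation(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
outputs = [sample_memoryless_sde(x0, rng) for x0 in noise]
noise_output_corr = correlation(noise, outputs)
```

The outputs should be distributed approximately as N(M, S^2) while being nearly uncorrelated with the initial noise. The deterministic ODE dx/dt = v(x, t) in the same toy would instead map each noise value to exactly one output.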
This allows Domingo-Enrich et al. (2025) to apply the following stochastic optimal control (SOC) objective to fine-tune any flow model against a reward function.
At the optimum, the fine-tuned velocity field generates the optimal KL-constrained distribution (an exponential tilt of the base flow model):
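A sketch of the SOC objective in a common form follows. The notation is assumed here: $b$ is the drift of the base memoryless SDE, $u$ is the control, $r$ is the reward, and any reward scaling or temperature is taken to be absorbed into $r$.

```latex
\begin{align*}
\min_{u}\;& \mathbb{E}\Big[ \int_0^1 \tfrac{1}{2}\,\|u(X_t, t)\|^2\,\mathrm{d}t \;-\; r(X_1) \Big] \\
\text{s.t.}\;& \mathrm{d}X_t = \big( b(X_t, t) + \sigma_t\,u(X_t, t) \big)\,\mathrm{d}t + \sigma_t\,\mathrm{d}B_t .
\end{align*}
```

Under this parameterization the control corresponds to the change in velocity field: if the base and fine-tuned drifts differ only through $2 v^{\text{base}}$ versus $2 v^{\text{finetune}}$, then $u = 2\,(v^{\text{finetune}} - v^{\text{base}}) / \sigma_t$.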
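The exponential tilt is easy to see on a discrete toy distribution. The sketch below assumes the KL-regularized objective E_p[r] - beta * KL(p || p_base), whose maximizer is p*(x) proportional to p_base(x) * exp(r(x) / beta). The coefficient `beta` and the function names are hypothetical and stand in for whatever regularization strength is used.

```python
import math


def exponential_tilt(base_probs, rewards, beta=1.0):
    """Return p*(x) proportional to base_probs[x] * exp(rewards[x] / beta)."""
    max_r = max(rewards)  # subtract the max for numerical stability
    weights = [p * math.exp((r - max_r) / beta) for p, r in zip(base_probs, rewards)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_regularized_objective(probs, base_probs, rewards, beta=1.0):
    """E_p[r] - beta * KL(p || p_base); terms with p = 0 contribute nothing."""
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    kl = sum(p * math.log(p / q) for p, q in zip(probs, base_probs) if p > 0)
    return expected_reward - beta * kl


base = [0.5, 0.3, 0.2]
rewards = [0.0, 1.0, 2.0]
tilted = exponential_tilt(base, rewards, beta=1.0)
```

The tilted distribution shifts mass toward high-reward outcomes while staying close to the base. A smaller `beta` sharpens the tilt, and a larger one keeps the result closer to the base distribution.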
Direct optimization of the SOC objective above requires backpropagation through time (BPTT).
In continuous time, this backward pass can be equivalently represented as an
ODE for the adjoint state (the gradient of the loss function
with respect to the noisy sample), integrated backward from t=1 to t=0.
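A sketch of the adjoint ODE in a common form is given below. The notation is assumed here: $a(t)$ is the adjoint state along a trajectory $X$, $b$ is the base drift, $u$ is the control induced by the fine-tuned velocity field, and the terminal cost is taken to be $-r(X_1)$.

```latex
\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}t}\,a(t)
  &= -\Big[ \nabla_x \big( b(X_t, t) + \sigma_t\,u(X_t, t) \big)^{\top} a(t)
          + \nabla_x \tfrac{1}{2}\,\|u(X_t, t)\|^2 \Big], \\
a(1) &= -\nabla r(X_1).
\end{align*}
```

Both bracketed terms involve $u$, so the Jacobian of the fine-tuned model enters the backward integration directly.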
Notice how the ODE depends on the fine-tuned velocity field.
This means that any ill-conditioning in the fine-tuned flow model
may cause the adjoint ODE to blow up, making the optimization unstable.
Adjoint matching goes a step further by observing that
many terms in the adjoint state have no effect on the optimal solution.
This allows us to construct a "lean" adjoint state instead:
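A sketch of the lean adjoint ODE in a common form follows. The notation is assumed here: $\tilde a(t)$ is the lean adjoint state, $b$ is the drift of the base memoryless SDE, and the terminal cost is taken to be $-r(X_1)$.

```latex
\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}t}\,\tilde a(t) &= -\nabla_x b(X_t, t)^{\top}\,\tilde a(t), \\
\tilde a(1) &= -\nabla r(X_1).
\end{align*}
```

Compared with the full adjoint ODE, every term involving the control $u$ has been dropped, leaving only the Jacobian of the base drift.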
Now, the adjoint state ODE no longer depends on the fine-tuned velocity field;
it depends only on the base velocity field. This means that
ill-conditioning in the fine-tuned flow model should not blow up the adjoint state ODE, making
the optimization more stable. The "lean" adjoint state has the same boundary condition (at t=1)
as the original adjoint state. With this "lean" adjoint state, we can write the adjoint matching
objective as follows:
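To make the objective concrete, the sketch below computes a lean adjoint and the per-time regression term on a 1-D toy. It assumes the loss takes the form of the expected time integral of $|2 (v^{\text{finetune}} - v^{\text{base}}) / \sigma_t + \sigma_t \tilde a_t|^2$, the linear path with sigma(t) = sqrt(2 (1 - t) / t), a Gaussian data distribution N(M, S^2) so that the base drift is linear in x, and the reward r(x) = x. All names and these modeling choices are hypothetical.

```python
import math

M, S = 2.0, 0.5  # hypothetical Gaussian toy: data N(M, S^2), noise N(0, 1)


def base_drift_grad(t):
    """d/dx of the base drift b(x, t) = 2 v_base(x, t) - x / t (linear in x here)."""
    var_t = (1.0 - t) ** 2 + (t * S) ** 2
    slope_v = (t * S ** 2 - (1.0 - t)) / var_t   # d v_base / dx
    return 2.0 * slope_v - 1.0 / t


def lean_adjoint(terminal_grad, t_stop, num_steps=2000):
    """Integrate d a/dt = -base_drift_grad(t) * a backward from t=1 to t=t_stop."""
    dt = (1.0 - t_stop) / num_steps
    a, t = terminal_grad, 1.0
    for _ in range(num_steps):
        a += dt * base_drift_grad(t) * a   # a(t - dt) = a(t) + dt * (db/dx) * a(t)
        t -= dt
    return a


def adjoint_matching_term(v_finetune, v_base, sigma_t, lean_a):
    """Per-time integrand |2 (v_ft - v_base) / sigma + sigma * lean_a|^2."""
    control = 2.0 * (v_finetune - v_base) / sigma_t
    return (control + sigma_t * lean_a) ** 2


t = 0.5
sigma_t = math.sqrt(2.0 * (1.0 - t) / t)
# Reward r(x) = x, so the terminal condition is a(1) = -r'(x) = -1.
lean_a = lean_adjoint(terminal_grad=-1.0, t_stop=t)
v_base_value = 0.3                                     # placeholder base velocity
v_target = v_base_value - 0.5 * sigma_t ** 2 * lean_a  # velocity that zeroes the term
```

The lean adjoint never touches the fine-tuned model: only the base drift appears in `lean_adjoint`. The regression target pushes the fine-tuned velocity in the direction of increasing reward, scaled by the noise level. With a neural velocity field, `base_drift_grad` would be replaced by a vector-Jacobian product evaluated along a sampled trajectory.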
That is it! While we hope this provides a minimal summary of adjoint matching for our work,
the original paper contains more detail
and provides a more general framework.