The paper builds on important research contributions by other mathematicians and statisticians at the University of Bristol. The algorithm studied is the conditional particle filter of Andrieu, Doucet & Holenstein (2010), with a crucial "backward sampling" modification proposed by Nick Whiteley in the discussion of that paper.
Fifteen years later, it has finally been established rigorously why this modification is so beneficial. The conditional particle filter is often used to analyze streams of data associated with an unobserved hidden process: the world evolves according to some dynamics, we see partial and noisy information at certain times, and the data is a record of T points. The computational complexity of this Markov chain algorithm can be captured as the product of the mixing time of the Markov chain and the cost of simulating each step of the chain; the paper shows that this is O(T log T) for suitable models, whereas the conditional particle filter without backward sampling has a computational complexity of O(T^2).
To give some history, the conditional particle filter arose from a truly remarkable paper by Andrieu, Doucet & Holenstein which studied a kind of evolutionary system of particles with a selection mechanism favouring certain types of particles, known as Sequential Monte Carlo. This system of particles was previously known to produce accurate approximations of certain integrals that arise in statistics. However, it was not known that the system could also be used to define a Markov chain with a certain stationary distribution: a key property of another branch of numerical integration techniques known as Markov chain Monte Carlo. Hence, that paper brought two popular and very effective branches of Monte Carlo methodology together. This combination has been used to great effect in several application domains, e.g. fitting models for epidemics such as COVID-19 and Ebola, for economic data and for phenomena in the natural sciences.
The way the original conditional particle filter works is that, given a current trajectory, one simulates a particle system in which that trajectory serves as a reference: it can have offspring and in doing so affects the other particles in the system. After simulating the system, a new trajectory is selected, which can share some of the reference's points. In fact, it is known that unless the number of particles N in the system scales at least linearly with T, a kind of particle degeneracy occurs in which a large early chunk of the reference is very likely to be part of the new trajectory. Scaling N with T, however, leads to an O(1) mixing time but an O(T^2) overall complexity, since the cost of simulating each step of the Markov chain is O(NT).
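The mechanics above can be sketched in a few lines of Python for a toy random-walk state-space model with Gaussian observations. The model, function name and parameters here are illustrative assumptions, not taken from the paper; the reference trajectory is pinned as particle 0, and the new trajectory is selected by tracing ancestor indices back from a final particle.

```python
import numpy as np

def conditional_pf(ref, y, N, sigma_x=1.0, sigma_y=1.0, rng=None):
    """One step of a (sketched) conditional particle filter for the toy model
    x_t ~ N(x_{t-1}, sigma_x^2), y_t ~ N(x_t, sigma_y^2).

    ref: reference trajectory, shape (T,); y: observations, shape (T,).
    Returns a new trajectory selected from the particle system.
    """
    rng = np.random.default_rng() if rng is None else rng
    T = len(ref)
    x = np.empty((T, N))
    anc = np.empty((T, N), dtype=int)  # anc[t, i]: index of particle i's parent
    # initial particles; particle 0 is pinned to the reference
    x[0] = rng.normal(0.0, sigma_x, N)
    x[0, 0] = ref[0]
    w = np.exp(-0.5 * (y[0] - x[0]) ** 2 / sigma_y**2)
    w /= w.sum()
    for t in range(1, T):
        # resample ancestors; the reference keeps its own lineage
        anc[t] = rng.choice(N, size=N, p=w)
        anc[t, 0] = 0
        x[t] = rng.normal(x[t - 1, anc[t]], sigma_x)
        x[t, 0] = ref[t]
        w = np.exp(-0.5 * (y[t] - x[t]) ** 2 / sigma_y**2)
        w /= w.sum()
    # select a trajectory by tracing the ancestral line of a final particle
    k = rng.choice(N, p=w)
    traj = np.empty(T)
    for t in range(T - 1, -1, -1):
        traj[t] = x[t, k]
        if t > 0:
            k = anc[t, k]
    return traj
```

Because a whole ancestral line is selected, the resampling steps collapse the early part of the system onto few ancestors, which is exactly the degeneracy described above.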
The critical idea of Whiteley was to avoid this degeneracy not by selecting a whole trajectory from the particle system, but by randomly selecting points from different trajectories in a particular way to construct a more diverse path. This can have some points in common with the reference, but typically not many, and this essentially leads to mixing over several steps even when the number of particles is a constant independent of T. Empirically, adding this backward sampling step to the conditional particle filter was known to provide greatly improved performance, but a theoretical result was elusive.
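Under the same toy model, the backward sampling selection can be sketched as follows: rather than tracing one ancestral line, the state at each time is drawn afresh, weighting each particle by its filtering weight times the transition density to the state already chosen at the next time. Again, this is a hedged sketch with assumed names and model.

```python
import numpy as np

def backward_sample(x, logw, sigma_x=1.0, rng=None):
    """Backward-sampling trajectory selection, sketched for the random-walk
    transition x_t ~ N(x_{t-1}, sigma_x^2).

    x: particle values, shape (T, N); logw: log filtering weights, shape (T, N).
    Returns a trajectory that may mix points from different ancestral lines.
    """
    rng = np.random.default_rng() if rng is None else rng
    T, N = x.shape
    traj = np.empty(T)
    # final state: sample from the final filtering weights
    p = np.exp(logw[-1] - logw[-1].max())
    k = rng.choice(N, p=p / p.sum())
    traj[-1] = x[-1, k]
    for t in range(T - 2, -1, -1):
        # reweight by the transition density to the chosen state at time t+1
        lw = logw[t] - 0.5 * (traj[t + 1] - x[t]) ** 2 / sigma_x**2
        p = np.exp(lw - lw.max())
        k = rng.choice(N, p=p / p.sum())
        traj[t] = x[t, k]
    return traj
```

The selected path can hop between lineages at every time step, which is why it typically shares few points with the reference.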
The key to the proof is the construction of a probabilistic coupling of two backward-sampling conditional particle filters, such that, with quite a bit of care, it can be shown that the two chains coincide after O(log T) steps. The coupling construction also lends itself nicely to parallelizable, unbiased estimation methods, as recently investigated by Pierre Jacob and several of his collaborators.
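To give a flavour of the coupling idea, here is a minimal sketch of a maximal coupling of two discrete distributions, a standard building block in such arguments (this is a generic illustration, not the construction in the paper): the two draws are made jointly so that they coincide with the largest possible probability, sum_i min(p_i, q_i).

```python
import numpy as np

def maximal_coupling(p, q, rng=None):
    """Sample (X, Y) with X ~ p and Y ~ q, maximizing P(X = Y).

    p, q: probability vectors of equal length.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = np.minimum(p, q)
    alpha = m.sum()  # the maximal meeting probability
    if rng.random() < alpha:
        i = rng.choice(len(p), p=m / alpha)
        return i, i  # the two draws meet
    # otherwise, sample X and Y independently from the residual mass
    rp = (p - m) / (1 - alpha)
    rq = (q - m) / (1 - alpha)
    return rng.choice(len(p), p=rp), rng.choice(len(q), p=rq)
```

Coupling the random choices inside two particle filters this way is what allows one to argue that the two chains eventually produce identical trajectories, from which mixing-time bounds follow.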
Huge congratulations to Anthony on this achievement.