Scalable Bayesian Inference Using Stochastic Gradient Markov Chain Monte Carlo

Putcha, Srshti and Fearnhead, Paul and Nemeth, Christopher (2024) Scalable Bayesian Inference Using Stochastic Gradient Markov Chain Monte Carlo. PhD thesis, Lancaster University.

Abstract

Bayesian inference offers a flexible framework to account for uncertainty across all unobserved quantities in a model. Markov chain Monte Carlo (MCMC) is a class of sampling algorithms that simulate from the Bayesian posterior distribution. These methods are generally regarded as the go-to computational technique for practical Bayesian modelling. MCMC is well understood, offers (asymptotically) exact inference, and can be implemented intuitively. Samplers built upon the Metropolis-Hastings algorithm benefit from strong theoretical guarantees under reasonable conditions. Derived from discrete-time approximations of Itô diffusions, gradient-based samplers (Roberts and Rosenthal, 1998; Neal, 2011) leverage local gradient information in their proposals, allowing for efficient exploration of the posterior. The most championed of the diffusion processes are the overdamped Langevin diffusion and Hamiltonian dynamics. In large data settings, standard MCMC can falter: the per-iteration cost of calculating the log-likelihood in the Metropolis-Hastings acceptance step scales with dataset size. Gradient-based samplers are doubly afflicted in this scenario, given that a full-data gradient is also computed at each iteration. These issues have prompted considerable interest in developing approaches for scalable Bayesian inference.
This thesis makes novel contributions to stochastic gradient MCMC (Welling and Teh, 2011; Ma et al., 2015; Nemeth and Fearnhead, 2021). Stochastic gradient MCMC utilises data subsampling to construct a noisy, unbiased estimate of the gradient of the log-posterior. The first two chapters review key background from the literature. Chapter 3 presents our first paper contribution, in which we extend stochastic gradient MCMC to time series via non-linear, non-Gaussian state space models. Chapter 4 presents the second paper contribution of this thesis. Here, we examine the use of a preferential subsampling distribution to reweight the stochastic gradient and improve variance control. Chapter 5 evaluates the feasibility of using determinantal point processes (Kulesza and Taskar, 2012) for data subsampling in stochastic gradient Langevin dynamics (SGLD). We conclude and propose directions for future work in Chapter 6.
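
As a rough illustration of the subsampled gradient estimate described in the abstract, the following is a minimal sketch of stochastic gradient Langevin dynamics with uniform subsampling, in the form given by Welling and Teh (2011). It is not taken from the thesis: the toy model (a Gaussian mean with unit observation variance and a N(0, 10) prior), the function names, the step size and the batch size are all assumptions chosen only to keep the example short and checkable.

```python
import numpy as np

# Assumed toy model (not from the thesis):
#   theta ~ N(0, prior_var),  x_i | theta ~ N(theta, 1),  i = 1..n

def grad_log_prior(theta, prior_var=10.0):
    # Gradient of the log N(0, prior_var) density with respect to theta.
    return -theta / prior_var

def grad_log_lik(theta, x):
    # Per-observation gradients of log N(x_i | theta, 1) with respect to theta.
    return x - theta

def full_gradient(theta, data):
    # Full-data gradient of the log-posterior; cost scales with len(data).
    return grad_log_prior(theta) + grad_log_lik(theta, data).sum()

def stochastic_gradient(theta, data, batch_size, rng):
    # Uniform subsample without replacement; rescaling by n / batch_size
    # makes the estimate unbiased for the full-data gradient.
    n = data.shape[0]
    idx = rng.choice(n, size=batch_size, replace=False)
    return grad_log_prior(theta) + (n / batch_size) * grad_log_lik(theta, data[idx]).sum()

def sgld(data, n_iters, step_size, batch_size, rng, theta0=0.0):
    # SGLD update: half a step along the noisy gradient plus Gaussian
    # noise with variance equal to the step size. No accept/reject step.
    theta = theta0
    samples = np.empty(n_iters)
    for t in range(n_iters):
        g = stochastic_gradient(theta, data, batch_size, rng)
        theta = theta + 0.5 * step_size * g + np.sqrt(step_size) * rng.standard_normal()
        samples[t] = theta
    return samples

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(2.0, 1.0, size=1000)  # synthetic data, assumed for illustration
    draws = sgld(data, n_iters=5000, step_size=1e-4, batch_size=100, rng=rng)
    print("SGLD posterior mean estimate:", draws[1000:].mean())
    print("Exact posterior mean:", data.sum() / (data.size + 1.0 / 10.0))
```

Each iteration touches only batch_size observations rather than the full dataset, which is the source of the scalability. The price is extra gradient noise and, with a fixed step size, a biased stationary distribution in general.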

Item Type: Thesis (PhD)
Uncontrolled Keywords: Research Output Funding/yes_externally_funded

Departments: Faculty of Science and Technology > Mathematics and Statistics
ID Code: 215949
Deposited By: ep_importer_pure
Deposited On: 12 Mar 2024 13:40

Published?: Published
Last Modified: 24 Jun 2025 01:15
URI: https://eprints.lancs.ac.uk/id/eprint/215949