Friday October 26, 2pm, Carslaw 173
University of Sydney, School of Mathematics and Statistics
Metropolis-Hastings MCMC with dual mini-batches
For many decades Markov chain Monte Carlo (MCMC) methods have been the main workhorse of Bayesian inference. However, traditional MCMC algorithms are computationally intensive. In particular, the Metropolis-Hastings (MH) algorithm requires a pass over the entire dataset to evaluate the likelihood ratio at each iteration. We propose a general framework for performing MH-MCMC using two mini-batches (MHDB) drawn from the full dataset at each iteration, and show that this gives rise to an approximately tempered stationary distribution. We prove that MHDB preserves the modes of the original target distribution and derive an error bound on the approximation for a general class of models, including mixtures of exponential family distributions, linear binary classification and regression. To further extend the utility of the algorithm to high-dimensional settings, we construct a proposal with forward and reverse moves using stochastic gradients and show that this construction leads to reasonable acceptance probabilities. We demonstrate the performance of our algorithm in neural network applications and show that, compared with popular optimisation methods, our method is more robust to the choice of learning rate and improves test accuracy.
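To give a flavour of the general idea, the sketch below shows a single Metropolis-Hastings step in which the full-data log-likelihood ratio is replaced by a rescaled mini-batch estimate. This is a generic subsampled-MH illustration on a toy Gaussian-mean model, not the exact dual-mini-batch MHDB construction or the stochastic-gradient proposal from the talk; the model, batch size and step size are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N observations from a unit-variance Gaussian with unknown mean.
N = 10_000
data = rng.normal(loc=2.0, scale=1.0, size=N)

def log_lik(theta, x):
    """Per-observation Gaussian log-likelihood (up to a constant)."""
    return -0.5 * (x - theta) ** 2

def minibatch_mh_step(theta, data, batch_size=500, step=0.05):
    """One MH step using a mini-batch estimate of the log-likelihood ratio.

    The full-data ratio is approximated by rescaling the mini-batch sum
    by N / batch_size. The extra noise this injects is what produces the
    tempering-like effect discussed in the abstract.
    """
    proposal = theta + step * rng.normal()          # symmetric random-walk proposal
    batch = rng.choice(data, size=batch_size, replace=False)
    scale = len(data) / batch_size
    log_ratio = scale * np.sum(log_lik(proposal, batch) - log_lik(theta, batch))
    if np.log(rng.uniform()) < log_ratio:
        return proposal
    return theta

# Run a short chain; it should concentrate near the data mean (about 2.0).
theta = 0.0
samples = []
for _ in range(2000):
    theta = minibatch_mh_step(theta, data)
    samples.append(theta)

print(float(np.mean(samples[500:])))
```

Because the mini-batch estimate of the log-likelihood ratio is noisy, the chain's stationary distribution is flatter than the true posterior, which is consistent with the tempering behaviour the abstract describes.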
Rachel Wang is currently a lecturer and DECRA fellow in the School of Mathematics and Statistics. She received her PhD in Statistics from UC Berkeley in 2015 and subsequently spent two years as a Stein Fellow / Lecturer in the Department of Statistics at Stanford University. Her research interests include statistical network theory, statistical machine learning, and their applications to complex genomic datasets.