Events
DMS Statistics and Data Science Seminar 
Time: Nov 15, 2023 (02:00 PM) 
Location: 354 Parker Hall 
Details: Speaker: Dr. Raghu Pasupathy (Purdue University)
Title: Batching as an Uncertainty Quantification Device
Abstract: Consider the context of a statistician, simulationist, or an optimizer seeking to assess the quality of \(\theta_n\), an estimator of an unknown object \(\theta \in \mathbb{R}^d\), constructed using data \((Y_1, Y_2,\ldots, Y_n)\) gathered from a source such as a dataset, a simulation, or an optimization routine. The unknown object \(\theta\) is assumed to be a statistical function of the probability measure that generates the stationary time series \((Y_1, Y_2,\ldots, Y_n)\). In such contexts, resampling methods such as the bootstrap or subsampling have been the classical answer to the question of how to approximate the sampling distribution of the error \(\theta_n - \theta\). In this talk, we propose a simple alternative called batching. Batching works by appropriately grouping the data \((Y_1, Y_2,\ldots, Y_n)\) into contiguous and possibly overlapping batches, each of which is then used to construct an estimate of \(\theta\). These batch estimates, along with the original estimate \(\theta_n\), are then combined and scaled appropriately to approximate any functional such as the bias or the mean-squared error of the error \(\theta_n - \theta\), or to construct a \((1-\alpha)\)-confidence region on \(\theta\). We show that batching, like bootstrapping, enjoys strong consistency and high-order accuracy properties. Furthermore, we show that the weak asymptotics of batched studentized statistics are not necessarily normal but characterizable. In particular, using large overlapping batches when constructing confidence regions delivers consistently favorable performance. A number of theoretical and practical questions about batching are open.
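To make the batching idea concrete, the following is a minimal sketch (not the speaker's method) for the simplest case where \(\theta\) is a mean: the data are grouped into contiguous overlapping batches, a batch estimate is computed on each, and the spread of the batch estimates around the full-sample estimate \(\theta_n\) is scaled to approximate the variance of \(\theta_n\) and form a confidence interval. The batch length `b` and the use of a normal critical value are simplifying assumptions for illustration; as the abstract notes, batched studentized statistics are not necessarily normal, so the exact critical values used in practice differ.

```python
import numpy as np

def batch_estimates(y, b, overlap=True):
    """Estimate the mean on each contiguous batch of length b.
    (The sample mean stands in for a generic estimator theta_n.)"""
    n = len(y)
    step = 1 if overlap else b
    return np.array([y[i:i + b].mean() for i in range(0, n - b + 1, step)])

def batching_ci(y, b, alpha=0.05):
    """Approximate a (1 - alpha) confidence interval for the mean using
    overlapping batch means -- a simplified illustration of batching."""
    n = len(y)
    theta_n = y.mean()                 # full-sample estimate
    thetas = batch_estimates(y, b)     # one estimate per batch
    m = len(thetas)
    # Scale the spread of batch estimates around theta_n to
    # approximate Var(theta_n); the factor b/n accounts for the
    # smaller sample size within each batch.
    var_hat = (b / n) * np.sum((thetas - theta_n) ** 2) / m
    # Normal critical value used here purely for illustration.
    half = 1.96 * np.sqrt(var_hat)
    return theta_n - half, theta_n + half

# Illustration on synthetic i.i.d. data with known mean 2.0.
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=10_000)
lo, hi = batching_ci(y, b=200)
```

For dependent (stationary) data, contiguous batches are essential: each batch preserves the serial dependence within it, which is why batching, unlike naive i.i.d. resampling, remains valid for time series.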
