Suppose that we want to compute $\operatorname{Var}[f(X, Y)]$, where $X$ and $Y$ are independent random variables. Imagine that, for a single sample $X = x$, we have an efficient method to estimate $\operatorname{Var}[f(x, Y)]$. For example, if $Y$ is a testing set and $f(x, Y)$ is the mean performance on the testing set, then this variance can be estimated efficiently with the bootstrap. Throughout, lower-case letters denote fixed values and upper-case letters denote random variables.
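As a concrete illustration of that building block, here is a minimal sketch of a bootstrap variance estimator for the mean test-set performance; the function name `bootstrap_variance` and the default of 1000 resamples are illustrative assumptions, not from the original text.

```python
import numpy as np

def bootstrap_variance(scores, n_boot=1000, rng=None):
    """Bootstrap estimate of Var[f(x, Y)] for a fixed x.

    `scores` holds the per-example performance f(x, y_i) on the test
    set. Each bootstrap replicate resamples the test set with
    replacement and recomputes the mean; the variance across replicate
    means estimates the variance of the mean test-set performance.
    """
    rng = np.random.default_rng(rng)
    scores = np.asarray(scores)
    n = len(scores)
    replicate_means = np.array([
        scores[rng.integers(0, n, size=n)].mean()
        for _ in range(n_boot)
    ])
    return replicate_means.var(ddof=1)
```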

Let $M(x)$ and $S(x)$ denote the mean and variance that can be computed efficiently:

$$M(x) = \mathbb{E}[f(x, Y)], \qquad S(x) = \operatorname{Var}[f(x, Y)].$$

The question, then, is: how can we compute $\operatorname{Var}[f(X, Y)]$ in terms of $M(X)$ and $S(X)$?

First, let’s expand the variance that we can compute easily:

$$S(x) = \operatorname{Var}[f(x, Y)] = \mathbb{E}[f(x, Y)^2] - M(x)^2,$$

which rearranges to $\mathbb{E}[f(x, Y)^2] = S(x) + M(x)^2$.

Now, let’s examine the joint mean and variance. Since $X$ and $Y$ are independent, conditioning on $X = x$ gives $\mathbb{E}[f(X, Y) \mid X = x] = M(x)$, so

$$\mathbb{E}[f(X, Y)] = \mathbb{E}[M(X)],$$

and

$$\operatorname{Var}[f(X, Y)] = \mathbb{E}[f(X, Y)^2] - \mathbb{E}[M(X)]^2.$$

The first term of the variance is

$$\mathbb{E}[f(X, Y)^2] = \mathbb{E}\big[\mathbb{E}[f(X, Y)^2 \mid X]\big] = \mathbb{E}[S(X) + M(X)^2].$$

Hence, the joint variance can be computed entirely in terms of $M(X)$ and $S(X)$:

$$\operatorname{Var}[f(X, Y)] = \mathbb{E}[S(X)] + \mathbb{E}[M(X)^2] - \mathbb{E}[M(X)]^2 = \mathbb{E}[S(X)] + \operatorname{Var}[M(X)].$$

This is the mean of the variances plus the variance of the means, and can alternatively be written as the law of total variance:

$$\operatorname{Var}[f(X, Y)] = \mathbb{E}\big[\operatorname{Var}[f(X, Y) \mid X]\big] + \operatorname{Var}\big[\mathbb{E}[f(X, Y) \mid X]\big].$$
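A minimal Monte Carlo sketch of the identity, assuming a toy setting not in the original text: $X$ and $Y$ are standard normal and $f(x, y) = x + y$, so $M(x) = x$, $S(x) = 1$, and the true value is $\operatorname{Var}[f(X, Y)] = 2$. The sample sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, y):
    return x + y  # toy choice: M(x) = x, S(x) = 1, Var[f(X, Y)] = 2

# For each sampled x, estimate M(x) and S(x) from fresh samples of Y.
xs = rng.standard_normal(2000)
M = np.empty_like(xs)
S = np.empty_like(xs)
for i, x in enumerate(xs):
    ys = rng.standard_normal(500)
    vals = f(x, ys)
    M[i] = vals.mean()
    S[i] = vals.var(ddof=1)

# Mean of the variances plus the variance of the means.
estimate = S.mean() + M.var(ddof=1)
print(estimate)  # ~2.0, matching Var[X] + Var[Y]
```

Note that because each $M(x)$ is itself estimated, its sampling noise slightly inflates the variance-of-the-means term; with enough inner samples (or bootstrap replicates) per $x$, this bias is negligible.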