Transforming random variables

In certain cases, it is useful to transform a random variable \Theta into another random variable \Psi via a transformation \Psi=f(\Theta). For instance, this situation typically arises when one wants to sample from the probability density function p_{\scriptscriptstyle{\Theta}}(\theta) of a positively-valued random variable \Theta. Markov chain Monte Carlo (MCMC) algorithms are conventionally designed to draw correlated samples from a desired target density p_{\scriptscriptstyle{\Psi}}(\psi) of a random variable \Psi taking values over the real line. So, if the target of interest p_{\scriptscriptstyle{\Theta}} has support over the positive real line while the MCMC algorithm samples from a density p_{\scriptscriptstyle{\Psi}} with support over the whole real line, a random variable transformation, such as \Psi=\log{(\Theta)}, can resolve the mismatch. In particular, the transformation makes it possible to sample from p_{\scriptscriptstyle{\Psi}} via the MCMC method of choice, after which the inverse transformation \Theta=\exp{(\Psi)} converts the simulated Markov chain into a set of sample points from p_{\scriptscriptstyle{\Theta}}. Of course, one first needs to find the form of the target density p_{\scriptscriptstyle{\Psi}} to which MCMC will be applied.
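
As a quick illustration of the idea (a minimal sketch, not part of the derivation below): if \Psi is standard normal and \Theta=\exp{(\Psi)}, the resulting samples of \Theta are positive and follow a log-normal distribution, which can be checked numerically.

# Minimal sketch: samples of a real-valued Psi map to positive samples of
# Theta = exp(Psi). With Psi ~ N(0, 1), Theta is log-normally distributed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
psi = rng.standard_normal(100_000)   # samples of Psi over the real line
theta = np.exp(psi)                  # samples of Theta over (0, infinity)

# Compare the empirical mean with the log-normal mean exp(1/2).
print(theta.mean(), stats.lognorm(s=1).mean())  # both close to ~1.6487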

Although such random variable transformations are common practice, one may need to look up the formula for passing from the original density p_{\scriptscriptstyle{\Theta}} to the transformed density p_{\scriptscriptstyle{\Psi}}. The main source of confusion is whether the formula involves the Jacobian of the transformation f or of the inverse transformation f^{-1}.

There is a way to retrieve the formula intuitively via a geometric argument, rather than trying to recall it mnemonically. The main argument is that of preservation of probability mass, which in the univariate case amounts to preservation of area. It suffices to realize that, for a small displacement, the area below the curves of the two densities over corresponding intervals d\theta and d\psi is the same, which means that

p_{\scriptscriptstyle{\Psi}}(\psi)d\psi=p_{\scriptscriptstyle{\Theta}}(\theta)d\theta.

This realization suffices to recover the remaining steps. Dividing both sides by d\psi, it follows that

p_{\scriptscriptstyle{\Psi}}(\psi)=p_{\scriptscriptstyle{\Theta}}(\theta)\frac{d\theta}{d\psi}.

Notice that

\Theta \overset{f}{\underset{f^{-1}}\rightleftarrows} \Psi ,

which, after substituting \theta=f^{-1}(\psi) and taking the absolute value of the derivative, gives the transformed density

p_{\scriptscriptstyle{\Psi}}(\psi)=p_{\scriptscriptstyle{\Theta}}(f^{-1}(\psi))\left|\frac{df^{-1}(\psi)}{d\psi}\right|.

The Jacobian in the univariate case is the derivative \frac{df^{-1}(\psi)}{d\psi}, associated with the inverse transformation f^{-1}. Taking the absolute value of the derivative ensures that the density p_{\scriptscriptstyle{\Psi}} is non-negative.
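
A quick numerical check of the univariate formula (a sketch with an assumed example, not taken from the text above: \Theta uniform on (0,1) and f(\theta)=\theta^{2}, so f^{-1}(\psi)=\sqrt{\psi} and p_{\scriptscriptstyle{\Psi}}(\psi)=1/(2\sqrt{\psi})):

# Sketch: verify p_Psi(psi) = p_Theta(f_inv(psi)) * |d f_inv / d psi| for
# an assumed example: Theta ~ Uniform(0, 1), f(theta) = theta**2, so
# f_inv(psi) = sqrt(psi) and p_Psi(psi) = 1 / (2 * sqrt(psi)).
import numpy as np

rng = np.random.default_rng(0)
theta = rng.uniform(size=500_000)
psi = theta**2                       # samples of Psi = f(Theta)

# Empirical density of Psi over a small test interval vs. the formula.
a, b = 0.25, 0.36
empirical = np.mean((psi > a) & (psi < b)) / (b - a)
formula = 1.0 / (2.0 * np.sqrt(0.5 * (a + b)))   # density at the midpoint
print(empirical, formula)            # both approximately 0.9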

Once the univariate case is understood, the multivariate scenario follows straightforwardly as

p_{\scriptscriptstyle{\boldsymbol\Psi}}(\boldsymbol\psi)=p_{\scriptscriptstyle{\boldsymbol\Theta}}(f^{-1}(\boldsymbol\psi))\left|\det\left(\frac{\partial f^{-1}_{i}(\boldsymbol\psi)}{\partial \psi_{j}}\right)\right|,

where \frac{\partial f^{-1}_{i}(\boldsymbol\psi)}{\partial \psi_{j}} denotes the (i,j) entry of the Jacobian matrix of f^{-1}, and the absolute value of its determinant again ensures that the density is non-negative.
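
The multivariate Jacobian determinant can also be checked numerically. The sketch below uses an assumed elementwise transformation f=\log applied to a bivariate \boldsymbol\Theta, so f^{-1}=\exp componentwise, the Jacobian of f^{-1} is diagonal, and its determinant is \exp{(\psi_{1}+\psi_{2})}.

# Sketch: for an assumed componentwise transformation f = log (so f_inv = exp
# applied componentwise), the Jacobian of f_inv is diag(exp(psi_1), exp(psi_2))
# and its determinant is exp(psi_1 + psi_2). Checked here by finite differences.
import numpy as np

def f_inv(psi):
    return np.exp(psi)               # componentwise inverse transformation

def jacobian_fd(g, x, h=1e-6):
    """Central finite-difference Jacobian matrix of g at x."""
    n = x.size
    J = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (g(x + e) - g(x - e)) / (2 * h)
    return J

psi = np.array([0.3, -1.2])
det_fd = np.linalg.det(jacobian_fd(f_inv, psi))
det_analytic = np.exp(psi.sum())
print(det_fd, det_analytic)          # both approximately exp(-0.9) ~ 0.4066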

To follow through with the example

\Theta \overset{\log}{\underset{\exp}\rightleftarrows} \Psi ,

notice that f=\log and f^{-1}=\exp. So, the derivative \frac{df^{-1}(\psi)}{d\psi} becomes

\displaystyle\frac{df^{-1}(\psi)}{d\psi}=\frac{d \exp (\psi)}{d\psi}=\exp{(\psi)},

whence, given that \exp{(\psi)}>0 and the absolute value can therefore be dropped,

p_{\scriptscriptstyle{\Psi}}(\psi)=p_{\scriptscriptstyle{\Theta}}(\exp{(\psi)}) \exp{(\psi)}.

The target log-density for MCMC is thus

\log{(p_{\scriptscriptstyle{\Psi}}(\psi))}=\log{(p_{\scriptscriptstyle{\Theta}}(\exp{(\psi)}))} + \psi.
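
To make the pipeline concrete, the sketch below picks an assumed target not specified above, \Theta\sim\textrm{Gamma}(\textrm{shape}=2,\textrm{rate}=1), runs a random-walk Metropolis sampler on the log-density of \Psi=\log{(\Theta)}, and maps the chain back via \exp.

# Sketch of the full pipeline with an assumed target density: Theta follows
# a Gamma(shape=2, rate=1) distribution, whose log-density is
# log(theta) - theta up to an additive constant.
import numpy as np

def log_p_theta(theta):
    return np.log(theta) - theta     # Gamma(2, 1) log-density, up to a constant

def log_p_psi(psi):
    return log_p_theta(np.exp(psi)) + psi   # change of variables, as derived above

rng = np.random.default_rng(0)
n_iter, step = 50_000, 1.0
psi_chain = np.empty(n_iter)
psi = 0.0                            # initial state of Psi = log(Theta)
for i in range(n_iter):
    proposal = psi + step * rng.standard_normal()     # random-walk proposal
    if np.log(rng.uniform()) < log_p_psi(proposal) - log_p_psi(psi):
        psi = proposal               # Metropolis acceptance step
    psi_chain[i] = psi

theta_chain = np.exp(psi_chain)      # map the chain back to the original variable
print(theta_chain.mean())            # close to the Gamma(2, 1) mean of 2

A practical benefit of working in \psi-space is visible here: the sampler never proposes invalid negative values of \theta, so no proposals need to be rejected on support grounds.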