Queens University at Kingston



E. Random Variables and Matrix Notation

To derive the statistical properties of OLS estimates, we must define the notion of a random vector along with the notion of the mean and variance of a random vector.

  1. Random matrix

     A random matrix is a matrix of random variables. For example, the vector of disturbance terms is a random vector, since each of its elements is a random variable. A constant matrix is, of course, a special type of random matrix, in the same sense that a constant is a special case of a random variable: one that does not vary across points in the sample space.

  2. Expectation of a random matrix V (m x r)

     The expectation of a random matrix is the matrix of expectations of each random variable in the matrix. That is, let V be a m x r random matrix, and let $v_{ij}$ denote the random variable in the ith row and jth column of V. Then
     $$E[V] \equiv \pmatrix{ E[v_{11}]&E[v_{12}]&\dots&E[v_{1r}]\cr E[v_{21}]&E[v_{22}]&\dots&E[v_{2r}]\cr \vdots&\vdots& &\vdots\cr E[v_{m1}]&E[v_{m2}]&\dots&E[v_{mr}] }$$
     Note that expectation is still a linear operator when defined this way. That is, if A is a n x m matrix of constants and C is a n x r matrix of constants, then
     $$E[AV + C] = AE[V] + C$$

     Notice that m does not have to equal n. That is, the transformation matrix A may change the dimensions of the random matrix: AV + C is a n x r random matrix, not m x r. To prove this result, write out a typical element of the matrix AV + C, apply the expectation operator to it, and observe that the result is exactly the corresponding element of AE[V] + C.

  3. Variance of a random vector

     Let v be m x 1. Then
     $$Var(v) \equiv E[ (v-E[v])(v-E[v])' ]$$
     This generalizes the notion of covariance, which itself generalizes the notion of variance. Notice that (v-E[v])(v-E[v])' is a m x m matrix (when v is m x 1), so Var(v) is always a square matrix. If we multiply (v-E[v])(v-E[v])' out, we can see that its ith row, jth column contains $(v_i-E[v_i])(v_j-E[v_j])$. The expectation of this scalar random variable is by definition
     $$E[ (v_i-E[v_i])(v_j-E[v_j]) ] = Cov(v_i,v_j)$$
     So each element of Var(v) is a covariance. The diagonal elements of Var(v) are simply the variances of the corresponding elements of the v vector. Since Cov(r,s) = Cov(s,r), the matrix Var(v) is symmetric. Summing this up,
     $$Var(v) = \pmatrix{ Var(v_1) & Cov(v_1,v_2) & Cov(v_1,v_3) &\dots &Cov(v_1,v_m)\cr Cov(v_1,v_2) & Var(v_2) & Cov(v_2,v_3) &\dots &Cov(v_2,v_m)\cr Cov(v_1,v_3) & Cov(v_2,v_3) & Var(v_3) &\dots &Cov(v_3,v_m)\cr \vdots &\vdots &\vdots & \vdots &\vdots\cr Cov(v_1,v_m) & Cov(v_2,v_m) & Cov(v_3,v_m) &\dots &Var(v_m)} $$
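The linearity property E[AV + C] = AE[V] + C can be checked numerically. Here is a minimal sketch in Python with NumPy; the matrices A and C and the two realizations of V are made-up numbers. Giving V a two-point distribution makes the expectations exact averages, so the identity holds exactly rather than approximately:

```python
import numpy as np

# Two equally likely realizations of a 3 x 2 random matrix V (made-up numbers).
V1 = np.array([[1., 2.], [3., 4.], [5., 6.]])
V2 = np.array([[0., 1.], [1., 0.], [2., 2.]])
EV = 0.5 * (V1 + V2)                    # E[V]: element-by-element expectation

A = np.array([[1., 0., 2.],             # 2 x 3 constants: n = 2, m = 3, so A
              [0., 1., -1.]])           # changes the row dimension of V
C = np.full((2, 2), 7.0)                # n x r = 2 x 2 matrix of constants

# E[AV + C]: average the transformed realizations ...
lhs = 0.5 * ((A @ V1 + C) + (A @ V2 + C))
rhs = A @ EV + C                        # ... versus A E[V] + C
assert np.allclose(lhs, rhs)            # linearity holds exactly
```

Note that lhs and rhs are both n x r (2 x 2), illustrating that A changed the dimensions of the random matrix.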

Variance of a linear transformation

Let v be m x 1 random vector and A a n x m matrix of constants. Then
$$Var[Av] = E[ (Av-AE[v])(Av-AE[v])' ] = E[ A(v-E[v])(v-E[v])'A'] = AVar[v]A'$$
This result generalizes the simpler scalar result that multiplying a random variable by a constant multiplies its variance by the square of the constant. In fact, if m = n = 1, then we simply get back the old formula $Var(av) = a^2 Var(v)$.

Here's an example. Let $v= \pmatrix{v_1\cr v_2}$ with
$$Var[v] = \pmatrix{\sigma^2_1&0\cr 0 &\sigma^2_2}$$
This means that v contains two random variables that have zero covariance. (They might be statistically independent of each other as well, since independence implies zero covariance). Let A = (2 -2). Then
$$w = Av = 2v_1 - 2v_2$$
is the random variable that is simply twice the difference of the two random variables in the vector v. From the result above,
$$\eqalign{Var[Av] = &AVar[v]A'\cr = &\pmatrix{2 & -2}\pmatrix{\sigma^2_1&0\cr 0 &\sigma^2_2}\pmatrix{2\cr -2}\cr = &\pmatrix{2\sigma^2_1 & -2\sigma^2_2}\pmatrix{2\cr -2}\cr = &4\sigma^2_1 + 4\sigma^2_2\cr}$$
This is exactly the formula we would have used without the matrix notation, since the variance of a weighted sum of zero-covariance random variables is the sum of the variances multiplied by the squared weights. Keeping track of variances and covariances in matrices is very convenient.
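The same arithmetic can be verified numerically. Below is a minimal sketch in Python with NumPy; the values of $\sigma^2_1$ and $\sigma^2_2$ are hypothetical, chosen just to check the sandwich formula against the hand-derived answer $4\sigma^2_1 + 4\sigma^2_2$:

```python
import numpy as np

sig1_sq, sig2_sq = 3.0, 5.0                 # hypothetical sigma_1^2, sigma_2^2
Var_v = np.diag([sig1_sq, sig2_sq])         # zero covariance between v_1 and v_2
A = np.array([[2.0, -2.0]])                 # 1 x 2, so Av = 2 v_1 - 2 v_2

Var_Av = A @ Var_v @ A.T                    # sandwich formula A Var[v] A'
assert np.isclose(Var_Av[0, 0], 4 * sig1_sq + 4 * sig2_sq)   # 4*3 + 4*5 = 32
```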

Assumptions of the LRM in Matrix Notation


  • Assumption A3

    The matrix X is fixed (non-random) across samples.

  • Assumption A4

E[u] = 0, where 0 is the N x 1 vector of zeros.

  • Assumption A5-A6

    Var[u] = $\sigma^2$ I

    In words: the variance matrix for u is diagonal (0 covariance across observations) and has equal elements on the diagonal (equal variance).

  • Assumption A7

    $u \sim N(0,\sigma^2 I)$

    In words: the u vector is a multivariate normal vector, so in particular each element of u is normally distributed. Notice that the only change between simple and multiple regression is the change in the dimension of X. So if we had started out with matrix notation we would have been able to use these forms of the assumptions from the start. That is, A4 above is the same as the earlier A4, just written in different notation. The same is true of A3 through A7, the only difference being that A5 and A6 can be stated together.

    Now we can start analyzing the statistical properties of the OLS estimates $\hat\beta^{OLS}$ and $\hat\sigma^2$.
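    A small simulation illustrates assumptions A4 through A7 together. This is a sketch in Python with NumPy; the dimension N and the value of $\sigma^2$ are made-up. Drawing many samples of $u \sim N(0,\sigma^2 I)$, the sample mean should be near the zero vector (A4) and the sample variance matrix near $\sigma^2 I$ (A5-A6):

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma2 = 4, 2.0                          # made-up dimension and variance
# 100,000 draws of u ~ N(0, sigma^2 I); each row is one disturbance vector (A7)
U = rng.normal(0.0, np.sqrt(sigma2), size=(100_000, N))

u_bar = U.mean(axis=0)                      # should be near the zero vector (A4)
S = np.cov(U, rowvar=False)                 # N x N sample variance matrix of u

assert np.allclose(u_bar, np.zeros(N), atol=0.05)
# near-zero off-diagonals (A5: no covariance across observations) and
# diagonal near sigma^2 (A6: equal variance)
assert np.allclose(S, sigma2 * np.eye(N), atol=0.05)
```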


    This document was created using HTX, a (HTML/TeX) interlacing program written by Chris Ferrall.
    Document Last revised: 1997/1/5