Queen's University at Kingston

HyperMetricsNotes

regress File 3


  • Assumptions about Observed Terms
  • Assumption A2 about the LRM

    The sample information is generated from the PRE.

    That is, the linear relationship between X and Y is the actual relationship between the variables in the data set. This is a critical assumption. If we do not think we can even write down the form of the relationship between X and Y, then it is difficult to say anything about that relationship.
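
    To make A2 concrete, here is a minimal simulation sketch in Python that generates one sample exactly as the PRE describes: each $Y_i$ is built from the linear part $\beta_1 + \beta_2 X_i$ plus a random disturbance $u_i$. The parameter values and the particular X values are purely hypothetical, chosen only for illustration.

        import numpy as np

        rng = np.random.default_rng(0)

        # Hypothetical population parameters -- not values from the notes.
        beta1, beta2, sigma = 2.0, 0.5, 1.0

        # Exogenous values of X, treated as fixed across samples (see A3 below).
        X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

        # One sample generated from the PRE (A2): Y_i = beta1 + beta2*X_i + u_i
        u = rng.normal(0.0, sigma, size=X.size)
        Y = beta1 + beta2 * X + u
        print(Y)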

    Assumption A3 about the LRM

    The variable X is non-random across samples.

    That is, our analysis will presume that if another sample were taken from the PRE then: the value $X_1$ would be the same as before (but $Y_1$ would differ because of a different draw of $u_1$); the value of $X_2$ would be the same as before (but $Y_2$ would differ because of a different draw of $u_2$); etc. We are not assuming that X is constant within a sample, only that it is fixed across samples, and we are explicitly assuming that the disturbances, and hence the values of Y, do change from sample to sample.

    When the sample comes from a real experiment, the values of X are under the control of the researcher. In these cases, A3 is not difficult to maintain. For example, recall the example of sampling young Canadians and asking about their sex (obviously exogenous) and the number of cigarettes they smoke (obviously endogenous). Suppose person 1 in the sample is male ($X_1$ = 0) and person 2 is female ($X_2$ = 1). If we think about taking a second sample of 600 people, then A3 implies that $X_1$ would again equal 0 and $X_2$ would again equal 1. We are not assuming that person 1 is the same person in each sample, but simply the same sex in each sample. Therefore, the endogenous variable, $Y_1$ , would differ across samples, but according to A3 the exogenous variable is assumed to be the same.
  • With most economic data, another sample of data would most likely involve different values of X. For example, if we re-studied the issue of minimum wages five years later and used data from the intervening years, we would most likely have new values of X that did not appear in the earlier sample.

    A3 would then appear to be difficult to take seriously. But all of our results can be derived from a much weaker assumption. In particular, the actual assumption that we need to make is

    Assumption A3*

    $Cov[X_i,u_i] = 0.$

    See covariance. That is, the important part of A3 is not that $X_i$ is the same across samples. If it is the same, then $X_i$ is non-random across samples, and a non-random variable (i.e. a constant) always has zero covariance with any other random variable, in this case the disturbance term $u_i$. So A3 implies A3*, but the reverse is not true. A3* is the assumption that really matters for our purposes, but it is much simpler to start with A3 and think of X as non-random.
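
    The fact that a constant has zero covariance with any random variable can be checked numerically. The sketch below (again with purely hypothetical values) holds $X_1$ fixed while drawing a fresh $u_1$ in each of many repeated samples; the sample covariance between the two comes out as zero, which is all that A3* requires.

        import numpy as np

        rng = np.random.default_rng(1)

        R = 5000        # number of repeated samples
        X1 = 3.0        # value of X_1, held fixed across samples (A3)
        sigma = 1.0     # hypothetical standard deviation of the disturbance

        # Draw u_1 afresh in every repeated sample; X_1 never changes.
        u1_draws = rng.normal(0.0, sigma, size=R)
        X1_draws = np.full(R, X1)

        # Sample covariance between X_1 and u_1 across the repeated samples.
        # Because X_1 is constant, its deviations from its own mean are all zero,
        # so the covariance is exactly zero -- the content of A3*.
        print(np.cov(X1_draws, u1_draws)[0, 1])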
  • Assumptions about the Unobserved Term $u$
  • The final component of the LRM is the disturbance term u. It captures all factors that influence the value of Y beyond the value of X. Since u is assumed to be a random variable, these assumptions concern the distribution of u.

    Assumption A4

    $E[ u_i ] = 0$ for $i=1,2,\dots,N$
    (The disturbance term has a mean of 0.)

    Assumption A5

    $Cov[u_i,u_j] = 0$ for $i\ne j$
    (The disturbance terms in any two observations have no covariance, and hence are uncorrelated.)

    Assumption A6

    $Var[u_i] = \sigma^2$ for $i=1,2,\dots,N$
    (Each disturbance term has the same variance, an unknown population parameter $\sigma^2$.)

    Assumption A7

    $u_i \sim N(0,\sigma^2)$ for $i=1,2,\dots,N$
    (That is, the disturbance follows the normal distribution for each observation in the sample.)

    These assumptions together determine what we need to know about the distribution of the $u_i$'s to ensure that OLS estimates are "good". A4 is really a "technical" assumption. It says that, on average, the disturbance terms are 0. We already have an intercept term $\beta_1$ that is freely estimated. The disturbance term u is therefore a deviation from $\beta_1$, and u can have a mean of 0 without imposing any restriction on the model. A4 is called a normalization, because if we dropped A4 we could change some other part of the model (for example, set $\beta_1 = 0$ and let u have a non-zero mean) and get back to an equivalent model.

    A5 and A6 are more important than A4. A5 says that the different observations in the data set do not have statistically related disturbance terms. Notice that we did not say "independent," because independence is a stronger assumption than zero covariance. In fact, independence between two random variables is very hard to test, but testing whether the disturbance terms have zero covariance is not difficult.

    A6 says that each of the disturbance terms has the same variance, $\sigma^2$. It assumes that each observation has equal variance around the PRE. This means each observation provides the same information (in a sense made clear later) about where the PRE is located. If, instead, some observations had lower variances than others, then the low-variance observations would tend to be closer to the PRE. When trying to find the PRE (by estimating $\beta_1$ and $\beta_2$) we would want to put more weight on the low-variance observations. The term ordinary least squares really means equally weighted least squares, and Assumption A6 about the distribution of $u_i$ therefore relates to the performance of OLS.
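
    As a rough illustration of how A4-A7 feed into OLS, the sketch below draws disturbances that satisfy the assumptions by construction (independent normal draws with mean 0 and a common variance $\sigma^2$) and then computes the OLS slope and intercept with the textbook formulas, in which every observation enters with equal weight. The parameter values are hypothetical.

        import numpy as np

        rng = np.random.default_rng(2)

        # Hypothetical population values.
        beta1, beta2, sigma, N = 2.0, 0.5, 1.0, 200

        X = rng.uniform(0.0, 10.0, size=N)    # drawn once, then treated as fixed
        u = rng.normal(0.0, sigma, size=N)    # satisfies A4-A7 by construction
        Y = beta1 + beta2 * X + u             # generated from the PRE (A2)

        # Textbook OLS formulas: every observation enters the sums with equal weight.
        b2 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
        b1 = Y.mean() - b2 * X.mean()
        print(b1, b2)    # estimates should land near 2.0 and 0.5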

  • Interpreting the slope coefficient $\beta_2$
  • When first introduced to the LRM, people are usually not comfortable with a statistical relationship like the PRE. Before we go any further, let us consider the non-statistical relationship underneath the PRE. The way that randomness is removed is to integrate out variation across the sample space. In other words, we eliminate randomness by taking expectations.


    $$\begin{aligned} E[Y_i|X_i] &= E[\beta_1 + \beta_2 X_i + u_i] && \text{(using A2)}\\ &= \beta_1 + \beta_2 X_i + E[u_i]\\ E[Y_i|X_i] &= \beta_1 + \beta_2 X_i && \text{(using A4)} \qquad \text{(E7)} \end{aligned}$$
    Notice that we are taking the conditional expectation of $Y_i$ , conditional upon knowing the value of $X_i$ . Equation E7 is another way to think of the LRM. It says that we are assuming a linear relationship between the exogenous variable $X_i$ and the expected value of the random variable $Y_i$ . This is an ordinary linear relationship; there are no random variables left in E7 because $E[Y_i]$ is a population parameter. Taking expectations wipes out the influence of the random factor $u_i$ .

    It is then perfectly legal and reasonable to take derivatives in E7:
    $${\partial E[Y_i|X_i] \over \partial X_i} = \beta_2$$
    We interpret the slope coefficient $\beta_2$ as a derivative, the rate of change in $E[Y_i]$ as the exogenous variable X changes. In our minimum wage example, $\beta_2$ would determine the rate at which expected employment (across provinces) changes as the minimum wage changes. Notice that the disturbance term u has no effect on this interpretation. In other words, the level of $Y_i$ is moved around by the disturbance terms, but $\beta_2$ measures how the expected value of $Y_i$ would change with $X_i$.
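
    The derivative interpretation can also be checked numerically. In the sketch below (hypothetical values once more), Y is simulated many times at two fixed values of X; the difference in the average value of Y, divided by the difference in X, settles down near $\beta_2$, even though each individual $Y_i$ is pushed around by its own disturbance.

        import numpy as np

        rng = np.random.default_rng(3)

        # Hypothetical population values.
        beta1, beta2, sigma, R = 2.0, 0.5, 1.0, 100_000

        Xa, Xb = 4.0, 5.0    # two fixed values of the exogenous variable

        # Many draws of Y at each value of X: Y = beta1 + beta2*X + u, with E[u] = 0 (A4).
        Ya = beta1 + beta2 * Xa + rng.normal(0.0, sigma, size=R)
        Yb = beta1 + beta2 * Xb + rng.normal(0.0, sigma, size=R)

        # The change in the average Y per unit change in X approximates beta2 = 0.5.
        print((Yb.mean() - Ya.mean()) / (Xb - Xa))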




    This document was created using HTX, a (HTML/TeX) interlacing program written by Chris Ferrall.
    Document Last revised: 1997/1/5