
A3 would then appear difficult to take seriously. But all of our
results can be derived from a much weaker assumption. In particular,
the actual assumption that we need to make is A3*: $\mathrm{Cov}(X_i, u_i) = 0$.
Under A3, if we drew a second sample, the value of $X_1$
would be the same as before (but $Y_1$
would differ because of a different draw of $u_1$);
the value of $X_2$
would be the same as before
(but $Y_2$
would differ because of a different draw of $u_2$);
etc.
We are not assuming that X is constant within a sample,
but across samples. And we are explicitly assuming
that X varies within a sample: suppose person 1 is male ($X_1 = 0$)
and person 2 is female ($X_2 = 1$).
If we think about taking a second sample of 600 people, then A3
implies that $X_1$ would again equal 0 and $X_2$ would again equal 1.
We are not assuming that person 1 is the same person
in each sample, but simply the same sex in each sample. Therefore,
the endogenous variable, Y, would differ across samples, but according
to A3 the exogenous variable, X, is assumed to be the same.
Under A3, $X_i$ is the same across
samples. If it is the same, then $X_i$
is non-random across samples, and a
non-random variable (i.e. a constant) always has zero covariance with any
random variable, in this case the disturbance term $u_i$.
So A3 implies
A3*, but the reverse is not true. A3* is the
assumption that really matters for our purposes, but it is much simpler
to start with A3 and think of X as non-random.
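The distinction between A3 and A3* can be illustrated with a small simulation. This is only a sketch: the parameter values `beta0` and `beta1`, the normal disturbances, and the alternating male/female pattern are all hypothetical choices made for illustration, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 600                      # sample size used in the text's example
beta0, beta1 = 2.0, 0.5      # hypothetical population parameters

# Under A3 the X values are fixed across samples: person 1 is always
# male (0), person 2 is always female (1), and so on (hypothetical pattern).
x = np.tile([0.0, 1.0], n // 2)

def draw_sample(x, rng):
    """Draw one sample of Y for fixed X; only the disturbances are redrawn."""
    u = rng.normal(0.0, 1.0, size=x.shape)
    return beta0 + beta1 * x + u

y_first = draw_sample(x, rng)
y_second = draw_sample(x, rng)

# X is identical in both samples, but Y differs because u was redrawn.
y_differs = not np.allclose(y_first, y_second)

# Across repeated samples, the fixed value X_2 never changes, so its
# sample covariance with the draws of u_2 is exactly zero (A3 implies A3*).
draws = 1000
x2_across = np.full(draws, x[1])
u2_across = rng.normal(0.0, 1.0, size=draws)
cov_x2_u2 = np.cov(x2_across, u2_across)[0, 1]

print(y_differs, cov_x2_u2)
```

The reverse direction fails because a random X can still have zero covariance with u, which is why A3* is the weaker assumption.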
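A4: $E[u_i] = 0$ for all $i$
A5: $\mathrm{Cov}(u_i, u_j) = 0$ for all $i \neq j$
A6: $\mathrm{Var}(u_i) = \sigma^2$ for all $i$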
The remaining assumptions place restrictions on the u's to ensure that OLS estimates are "good". A4
is really a "technical" assumption. It says that on average the disturbance
terms are 0. We already have an intercept term
that is freely
estimated. The disturbance term u is therefore a deviation from it,
and u can have a mean of 0 without imposing any restriction on the model.
A4 is called a normalization, because if we dropped A4 we could
change some other assumption (set the intercept to 0) and get back to
the same model.
A5 and A6 are more important than A4. A5 says that the disturbance terms of different observations in the data set are not linearly related: they have 0 covariance. Notice that we did not say "independent," because independence is a stronger assumption than 0 covariance. In fact, independence between two random variables is very hard to test, but testing whether the disturbance terms have 0 covariance is not difficult.
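The gap between zero covariance and independence can be seen in a short numerical sketch. The pair `u` and `v` below is a hypothetical construction chosen for illustration (it is not the pair of disturbance terms from the model): `v` is fully determined by `u`, yet the two have covariance close to 0.

```python
import numpy as np

rng = np.random.default_rng(1)

u = rng.normal(0.0, 1.0, size=200_000)
v = u**2 - 1.0   # v is a function of u, so the two are clearly not independent

# The covariance is near 0 because E[u^3] = 0 for a symmetric distribution.
cov_uv = np.cov(u, v)[0, 1]

# The dependence only shows up in a nonlinear relationship:
# v is exactly linear in u^2, so this correlation is essentially 1.
corr_u2_v = np.corrcoef(u**2, v)[0, 1]

print(cov_uv, corr_u2_v)
```

A covariance check only needs the first line of this comparison, whereas detecting the dependence required guessing the right nonlinear transformation, which is one way to see why independence is harder to test.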
A6 says that each of the disturbance terms has the same variance, $\sigma^2$.
It assumes that each observation has equal variance around the PRE. This
means each observation provides the same information (in a sense made clear
later) about where the PRE is located. If, instead, some
observations had lower variances than others,
then the low-variance observations would tend to be closer to the PRE. When
trying to find the PRE (by estimating its intercept and slope)
we would want to put more weight on the low-variance observations.
The term ordinary least squares really
means equally weighted least squares, and Assumption A6
about the distribution of u
therefore relates to the performance of OLS.
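The claim that ordinary least squares is equally weighted least squares can be checked numerically. The sketch below uses a hypothetical helper, `weighted_least_squares`, and simulated data with made-up parameter values; it is an illustration under those assumptions, not the estimator as developed in these notes.

```python
import numpy as np

def weighted_least_squares(x, y, w):
    """Return (intercept, slope) minimizing sum_i w_i * (y_i - b0 - b1*x_i)**2."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    # Scaling each row by sqrt(w_i) turns the weighted problem into an
    # ordinary least squares problem.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1]

rng = np.random.default_rng(2)
n = 200
x = rng.uniform(0.0, 10.0, size=n)
# Hypothetical data with equal disturbance variances, as A6 assumes.
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)

slope_ols, intercept_ols = np.polyfit(x, y, deg=1)   # highest power first

# Equal weights of any size reproduce the OLS fit.
wls_ones = weighted_least_squares(x, y, np.ones(n))
wls_sevens = weighted_least_squares(x, y, np.full(n, 7.0))

print(intercept_ols, slope_ols, wls_ones, wls_sevens)
```

If A6 failed, one would pass larger weights for the low-variance observations instead of a constant vector.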
Until introduced to the LRM, people are usually not comfortable with
a statistical relationship like the PRE. Before we go
any further, let us consider the non-statistical
relationship underneath the PRE.
The way that randomness is removed is to integrate out variations
across the sample space. In other words, we eliminate randomness
by taking expectations.
Notice that we are taking the conditional
expectation of Y, conditional upon knowing the value of X.
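The step described above can be written out explicitly. What follows is a sketch, assuming the PRE has the form $Y = \beta_0 + \beta_1 X + u$; the symbols $\beta_0$ and $\beta_1$ for the intercept and slope are assumed notation. The last line uses A3 (X is non-random) together with A4 (the disturbance has mean 0), and appears to be the relationship the text calls E7:

```latex
\begin{aligned}
E[Y \mid X] &= E[\beta_0 + \beta_1 X + u \mid X] \\
            &= \beta_0 + \beta_1 X + E[u \mid X] \\
            &= \beta_0 + \beta_1 X .
\end{aligned}
```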
Equation E7 is another way to think of the LRM. It says that we
are assuming a linear relationship between the exogenous
variable X and the expected value $E[Y|X]$ of the random
variable Y. This is an ordinary linear relationship;
there are no random variables left in E7 because $E[Y|X]$
is a population
parameter. Taking expectations wipes out the influence of the random
factor u.
It is then perfectly legal and reasonable to take the derivative of $E[Y|X]$ with respect to X in E7.
We interpret the slope coefficient
as that derivative, the rate of change in $E[Y|X]$
as the exogenous variable X
changes. In our minimum wage example, the slope coefficient
would determine the rate
at which expected employment (across provinces) changes as the minimum wage changes.
Notice that the disturbance term u has no effect on this interpretation. In
other words, the level of Y
is moved around by the disturbance terms, but the slope coefficient
measures how $E[Y|X]$
would change with X.
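This interpretation can be illustrated by simulation: average Y over many draws of u at each fixed X, then look at how those averages change with X. The numbers below (`beta0`, `beta1`, the three X levels, and the disturbance standard deviation) are hypothetical and are not taken from the minimum wage example.

```python
import numpy as np

rng = np.random.default_rng(3)

beta0, beta1 = 10.0, -1.5               # hypothetical intercept and slope
x_values = np.array([4.0, 5.0, 6.0])    # hypothetical levels of X
draws = 100_000                         # repeated draws of u at each X

# Approximate E[Y|X] by averaging Y over many disturbance draws at each X.
cond_means = np.array([
    np.mean(beta0 + beta1 * x + rng.normal(0.0, 2.0, size=draws))
    for x in x_values
])

# Change in the conditional mean per unit change in X.
slopes = np.diff(cond_means) / np.diff(x_values)

print(cond_means, slopes)
```

Individual values of Y scatter widely around each conditional mean, but the averages change with X at a rate close to the slope coefficient.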