Queen's University at Kingston

HyperMetricsNotes

glossary File 6


F. Statistical Estimation

Set Up

Consider a population described by a random variable Y. Suppose that the distribution of Y is known except for one or more population parameters, denoted $a$.

Example

Suppose Y is the number of cigarettes a person smokes, and we know (or assume) that Y is normally distributed, but we don't know Y's mean $\mu$. In this case, the unknown parameter $a$ equals $\mu$.
We observe N observations from the population, called a sample:
Sample = $(Y_1, Y_2, \ldots, Y_N)$
Each observation in the sample can be thought of as a different random variable. The actual value in the data set for observation 10 is a realization of the corresponding random variable $Y_{10}$.
We can think of the sample space here as all possible outcomes in a data set:
$(Y_1,Y_2,...,Y_N)$ .
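As a rough illustration of the idea that a data set is one realization of the random variables $(Y_1, \ldots, Y_N)$, the sketch below draws two samples from the same normal population. The values of `mu`, `sigma` and `n`, the seed, and the helper name `draw_sample` are all hypothetical choices made for this example, not part of the notes.

```python
import random

def draw_sample(n, mu, sigma, rng):
    """Return one realization (y_1, ..., y_n) of n independent N(mu, sigma^2) draws."""
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(0)          # fixed seed so the run is repeatable
mu, sigma, n = 10.0, 3.0, 5     # hypothetical population values

# Two data sets from the same population: same random variables,
# different realizations.
sample_a = draw_sample(n, mu, sigma, rng)
sample_b = draw_sample(n, mu, sigma, rng)
print(sample_a)
print(sample_b)
```

Each call returns a different point in the sample space, even though the population has not changed.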

Estimation from a Sample

An estimator is, in general terms, simply
a function defined on the sample space
Basic Properties that all Estimators Share
Because an estimator is a function defined on the sample space, it is a random variable.
Because an estimator is a random variable, it has a distribution, a mean, a variance, etc.
Because estimators have distributions, they have statistical properties.
Because estimators are functions of sample information, they are usually represented by formulas or procedures.
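The points above can be sketched with a small simulation. Any function of the sample is an estimator in this general sense, so the example uses two arbitrary ones, the sample median and the midrange; these are illustrative choices and are not discussed in the notes. The population values and number of repetitions are also assumptions.

```python
import random
import statistics

def sample_median(sample):
    """One possible estimator: the middle value of the sample."""
    return statistics.median(sample)

def midrange(sample):
    """Another possible estimator: halfway between the smallest and largest value."""
    return (min(sample) + max(sample)) / 2

rng = random.Random(1)
mu, sigma, n, n_samples = 10.0, 3.0, 25, 2000   # hypothetical settings

# Apply each estimator to many independent samples. The collected values
# approximate the estimator's own distribution.
estimates = {"median": [], "midrange": []}
for _ in range(n_samples):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    estimates["median"].append(sample_median(sample))
    estimates["midrange"].append(midrange(sample))

for name, values in estimates.items():
    print(name, statistics.mean(values), statistics.variance(values))
```

Each estimator is a formula, its value changes from sample to sample, and the collection of values has a mean and a variance of its own.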
Desirable Properties that Estimators May or May Not Have
Let $\hat{a}$ be an estimator of some population parameter $a$.
  1. Unbiased
    $E[\hat{a}] = a$
    In words: Across samples, average value of $\hat{a}$ equals the value of the population parameter $a$ .
  2. Asymptotically Unbiased (with sample size $n$):
    $\lim_{n\to\infty} E[\hat a(n)]=a$
    In words: With "enough" data, the bias of the estimator disappears.
  3. Asymptotically Normal
    $\lim_{n\to\infty} \hat{a}(n) \sim N(E[\hat{a}(\infty)],Var[\hat{a}(\infty)])$
    In words: With "enough" data, the distribution of the estimator approaches a normal distribution.
  4. Consistent - requires two things of $\hat{a}$ :
    1. Asymptotically Unbiased
    2. Variance of the estimator goes to 0 asymptotically: $\lim_{n\to \infty} Var[\hat a(n)] = 0$
    In words: With enough data the distribution of the estimator collapses onto the true value.
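To make the difference between "unbiased" and "asymptotically unbiased" concrete, here is a minimal sketch using an estimator chosen for illustration only: the variance formula with divisor $n$ instead of $n-1$. Its expected value is $(n-1)\sigma^2/n$, so it is biased in small samples, but the bias and the variance both shrink as $n$ grows. The population (normal with mean 0), `sigma2`, the sample sizes and the number of repetitions are assumptions.

```python
import random
import statistics

def var_divide_by_n(sample):
    """Variance estimator with divisor n: biased, but the bias vanishes as n grows."""
    m = sum(sample) / len(sample)
    return sum((y - m) ** 2 for y in sample) / len(sample)

rng = random.Random(2)
sigma2 = 4.0          # hypothetical true population variance
n_samples = 4000      # number of simulated samples per sample size

results = {}
for n in (2, 10, 100):
    draws = [
        var_divide_by_n([rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n)])
        for _ in range(n_samples)
    ]
    # Mean across samples approximates E[a_hat(n)];
    # variance across samples approximates Var[a_hat(n)].
    results[n] = (statistics.mean(draws), statistics.variance(draws))
    print(n, results[n])
```

As $n$ increases, the simulated mean moves toward `sigma2` and the simulated variance moves toward 0, which is the pattern the two consistency conditions describe.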

Example: Estimation of the population mean

The usual estimator for the population mean is the sample mean (let's call it $\hat{\mu}$ ):

$$\hat\mu = {1\over N}\sum_{i=1}^N Y_i$$
Notice $\hat{\mu}$ is a function (formula).
Since $\hat{\mu}$ is a function of random variables, it is a random variable itself. Its value is not determined until the sample is drawn from the population.
The usual estimator for the population variance is
$$\hat\sigma^2 = {1\over N-1}\sum_{i=1}^N (Y_i-\hat\mu)^2$$
For a given sample (realization of all the $Y_i$'s), $\hat{\mu}$ takes on a particular value. But the sample mean, as a concept, is a function and a random variable. It is not conceptually a number.
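The two formulas can be written directly as functions. In the sketch below the data values in `ys` are made up for illustration, and the function names are hypothetical; the result is compared with Python's `statistics.variance`, which also uses the $N-1$ divisor.

```python
import statistics

def sample_mean(ys):
    """mu_hat = (1/N) * sum of Y_i."""
    return sum(ys) / len(ys)

def sample_variance(ys):
    """sigma2_hat = (1/(N-1)) * sum of (Y_i - mu_hat)^2."""
    m = sample_mean(ys)
    return sum((y - m) ** 2 for y in ys) / (len(ys) - 1)

# One realized sample (made-up numbers).
ys = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu_hat = sample_mean(ys)          # a particular number for this sample
sigma2_hat = sample_variance(ys)  # likewise
print(mu_hat, sigma2_hat)
print(statistics.mean(ys), statistics.variance(ys))
```

The functions are the estimators; `mu_hat` and `sigma2_hat` are only the values they take on this one sample.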
Statistical Properties of $\hat{\mu}$ and $\hat\sigma^2$
  1. $E[\hat{\mu}] = \mu$
    In words: The mean of the sample mean is the population mean. It is an unbiased estimator.
  2. $Var[\hat{\mu}] = \sigma^2 / N$
    In words: The variance of the sample mean is the variance of the underlying random variable divided by the sample size. It follows directly that $\lim_{N\to \infty}Var[\hat{\mu}]=0$ .
  3. $\hat{\mu}$ is a consistent estimator of $\mu$. This follows directly from the first two properties.
  4. $E[\hat\sigma^2] = \sigma^2$
    In words: the sample variance is an unbiased estimator of $\sigma^2$ .
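  5. The Central Limit Theorem: $\sqrt{N}\,(\hat\mu-\mu) \to N(0,\sigma^2)$ in distribution as $N\to\infty$, so for large $N$, $\hat\mu$ is approximately distributed $N(\mu,\sigma^2/N)$
    In words: The distribution of the sample mean, for any population with finite variance, approaches a normal distribution as the sample size approaches infinity.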
  6. When Y is normally distributed, $\hat{\mu}$ and $\hat\sigma^2$ are independently distributed. This implies that
    $$E[(\hat{\mu}-\mu)({1\over\hat\sigma^2})] = 0$$
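A short derivation of properties 1 and 2 may help. It assumes, which the notes do not state explicitly, that the observations $Y_1,\ldots,Y_N$ are independent draws, each with mean $\mu$ and variance $\sigma^2$. Independence is only needed for the variance step, where it makes the variance of the sum equal to the sum of the variances.

$$E[\hat\mu] = {1\over N}\sum_{i=1}^N E[Y_i] = {1\over N}(N\mu) = \mu$$

$$Var[\hat\mu] = {1\over N^2}\sum_{i=1}^N Var[Y_i] = {1\over N^2}(N\sigma^2) = {\sigma^2\over N}$$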
Note that the variance of $\hat{\mu}$ depends upon the variance of the random variable itself, $\sigma^2$.
The Estimated Variance of the sample mean
$\hat{Var}[\hat{\mu}] \equiv \hat\sigma^2 / N$
The Estimated Standard Error of the sample mean
$\hat{se}[\hat{\mu}] \equiv \sqrt{\hat{Var}[\hat{\mu}]}$
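As a small worked example of these two definitions, the sketch below computes the estimated variance and estimated standard error of the sample mean for one sample. The data values and the function name `standard_error_of_mean` are hypothetical.

```python
import math

def standard_error_of_mean(ys):
    """Estimated standard error of the sample mean: sqrt(sigma2_hat / N)."""
    n = len(ys)
    mu_hat = sum(ys) / n
    sigma2_hat = sum((y - mu_hat) ** 2 for y in ys) / (n - 1)
    var_hat = sigma2_hat / n     # estimated variance of the sample mean
    return math.sqrt(var_hat)

# Made-up sample of N = 5 observations.
ys = [12.0, 15.0, 9.0, 20.0, 14.0]
se = standard_error_of_mean(ys)
print(se)
```

For this sample, $\hat\mu = 14$, $\hat\sigma^2 = 66/4 = 16.5$, and the estimated variance of the sample mean is $16.5/5 = 3.3$, so the standard error is $\sqrt{3.3}$.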

You can see the Central Limit Theorem in action by running the tutorial for week 2.
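The tutorial itself is not reproduced here. As a stand-in, the following sketch simulates sample means from a clearly non-normal population and tracks how their skewness shrinks as the sample size grows. The Exponential(1) population (mean 1, variance 1), the sample sizes, the number of repetitions, and the use of skewness as a rough measure of non-normality are all assumptions made for this example.

```python
import math
import random
import statistics

def standardized_means(n, n_samples, rng):
    """Simulate sqrt(n) * (sample mean - mu) / sigma for an Exponential(1) population."""
    mu, sigma = 1.0, 1.0   # mean and standard deviation of Exponential(1)
    out = []
    for _ in range(n_samples):
        m = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append(math.sqrt(n) * (m - mu) / sigma)
    return out

def skewness(xs):
    """Sample skewness: average cubed standardized deviation (0 for a symmetric distribution)."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

rng = random.Random(3)
skew_by_n = {}
for n in (2, 20, 200):
    z = standardized_means(n, 4000, rng)
    skew_by_n[n] = skewness(z)
    print(n, round(skew_by_n[n], 3))
```

A normal distribution has skewness 0, so skewness values that fall toward 0 as $n$ increases are consistent with the sample mean becoming more nearly normal. This checks only one feature of the distribution; a histogram of `z` would give a fuller picture.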




This document was created using HTX, a (HTML/TeX) interlacing program written by Chris Ferrall.
Document Last revised: 1997/1/5