## D. Examples of Computing and Interpreting OLS Estimates of the MLRM

#### Purely Numerical Example

``````

. list y x1 x2 x3

y         x1         x2         x3
1. -6.545413          1          1          0
2. -1.589974          2          4   .6931472
3.  1.504102          3          9   1.098612
4. -5.820597          4         16   1.386294
5. -10.82546          5         25   1.609438
6.   -6.8942          6         36   1.791759
7. -13.92935          7         49    1.94591
8. -19.82583          8         64   2.079442
9. -17.99812          9         81   2.197225
10. -16.14051         10        100   2.302585

* Note that x2 = x1*x1, which is okay because x2 is not LINEARLY dependent
* on x1

. regress y x1 x2 x3

Source |       SS       df       MS                  Number of obs =      10
---------+------------------------------               F(  3,     6) =   14.91
Model |   402.13325     3  134.044417               Prob > F      =  0.0035
Residual |  53.9568585     6  8.99280976               R-square      =  0.8817
---------+------------------------------               Adj R-square  =  0.8225
Total |  456.090108     9  50.6766787               Root MSE      =  2.9988
------------------------------------------------------------------------------
y |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
x1 |  -18.23062   6.387585     -2.854   0.029      -33.86047   -2.600757
x2 |   .7900252   .3449069      2.291   0.062      -.0539316    1.633982
x3 |   32.77203   11.73369      2.793   0.031       4.060729    61.48333
_cons |   10.54565   5.461673      1.931   0.102       -2.81858    23.90989
------------------------------------------------------------------------------

* These Stata matrix commands will compute the OLS estimates as well
*  See "help matrix"

*  The accum subcommand computes X'X using the variables you list
. matrix accum XpX = x1 x2 x3
. matrix list XpX, nohalf
x1         x2         x3      _cons
x1        385       3025  102.08283         55
x2       3025      25333  776.24766        385
x3  102.08283  776.24766  27.650244  15.104413
_cons         55        385  15.104413         10

*  The inv() function computes the matrix inverse
. matrix Xi = inv(XpX)
. matrix list Xi, nohalf
x1          x2          x3       _cons
x1   4.5370967  -.24149327  -8.1097843   -3.407188
x2  -.24149327   .01322843   .41656987   .18971409
x3  -8.1097843   .41656987   15.309946   5.4410995
_cons   -3.407188   .18971409   5.4410995   3.3170804

* the vecaccum subcommand treats the first listed variable as the y vector
* and computes X'y
. matrix vecaccum Xy = y x1 x2 x3
. matrix Xy = Xy'
. matrix list Xy, nohalf
y
x1  -703.48823
x2  -5634.6157
x3  -182.33711
_cons  -98.065355

. matrix beta = Xi * Xy
. matrix list beta, nohalf
y
x1  -18.230615
x2   .79002516
x3    32.77203
_cons   10.545653

* Note that the vector beta is identical to the "coef" column in the regress
* output table

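*  This computes the estimated variance matrix s^2 * inv(X'X), discussed later.
*  8.99280976 is the Residual MS from the regress table above; the square
*  roots of the diagonal of VCM reproduce the Std. Err. column.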
. matrix VCM = 8.99280976 * Xi
. matrix l VCM, nohalf
x1          x2          x3       _cons
x1   40.801248  -2.1717031  -72.929747  -30.640194
x2  -2.1717031   .11896076   3.7461336   1.7060627
x3  -72.929747   3.7461336   137.67943   48.930773
_cons  -30.640194   1.7060627   48.930773   29.829873

``````
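As a cross-check outside Stata, the same arithmetic can be sketched in plain Python. This is only an illustrative sketch: the data are typed in from the `list` output above (so they carry its rounding), the helper names `transpose`, `matmul`, and `inverse` are made up for this example, and the inverse uses ordinary Gauss-Jordan elimination rather than whatever routine Stata uses internally.

```python
# Data copied from the `list y x1 x2 x3` output above (rounded as printed).
y = [-6.545413, -1.589974, 1.504102, -5.820597, -10.82546,
     -6.8942, -13.92935, -19.82583, -17.99812, -16.14051]
x1 = [float(i) for i in range(1, 11)]
x2 = [v * v for v in x1]
x3 = [0.0, 0.6931472, 1.098612, 1.386294, 1.609438,
      1.791759, 1.94591, 2.079442, 2.197225, 2.302585]

# Design matrix with columns in Stata's order: x1, x2, x3, _cons.
X = [[a, b, c, 1.0] for a, b, c in zip(x1, x2, x3)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augment A with the identity matrix: [A | I].
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]


Xt = transpose(X)
XpX = matmul(Xt, X)                     # X'X, as in "matrix accum"
Xy = matmul(Xt, [[v] for v in y])       # X'y, as in "matrix vecaccum"
Xi = inverse(XpX)                       # inverse of X'X
beta = [row[0] for row in matmul(Xi, Xy)]

# Residual mean square s^2 = RSS / (n - k), then standard errors.
fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
s2 = rss / (len(y) - len(beta))
std_err = [(s2 * Xi[j][j]) ** 0.5 for j in range(len(beta))]

for name, b, se in zip(["x1", "x2", "x3", "_cons"], beta, std_err):
    print(f"{name:>6} {b:12.6f} {se:12.6f}")
```

Because the inputs are the rounded values shown in the listing, the printed coefficients and standard errors should agree with the `regress` table only up to roughly the displayed precision.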

#### Substantive Example

Does a person's religious background relate to their sexual behavior? Or, more to the point, was Billy Joel right when he sang "Catholic girls start much too late"? We can actually study this question "scientifically" using the NLSY data used in one or more homework assignments. Here is an annotated log file for the analysis, including data cleaning, summary statistics, and estimates. Uses of the results (hypothesis tests, predictions, and conclusions) come later. (Sound familiar??)
``````

. keep relig fage* sex
. *  to simplify matters, look only at women
. drop if sex==1
(6403 observations deleted)
. *  missing values of relig are coded as negative numbers
. drop if relig<0
(20 observations deleted)
. *  religion is coded into 10 categories, will collapse to 4
. tab relig
IN WHAT|
RELIGION WAS|
R RAISED -|      Freq.     Percent        Cum.
------------+-----------------------------------
0 |        221        3.53        3.53
1 |        261        4.17        7.70
2 |       1796       28.68       36.37
3 |         95        1.52       37.89
4 |        320        5.11       43.00
5 |        523        8.35       51.35
6 |        167        2.67       54.02
7 |       2159       34.47       88.49
8 |         58        0.93       89.41
9 |        663       10.59      100.00
------------+-----------------------------------
Total |       6263      100.00

. gen none = cond(relig==0,1,0)
. gen prot = cond(relig<7&relig>0,1,0)
. gen cath = cond(relig==7,1,0)
. *  Apologize for grouping everyone else into one bundle
. gen oth = cond(relig>7,1,0)
. * Hey, why not have a recoded religion variable (4 values)?
. gen relig2 = none + 2*prot + 3*cath + 4*oth
. tab relig2
relig2|      Freq.     Percent        Cum.
------------+-----------------------------------
1 |        221        3.53        3.53
2 |       3162       50.49       54.02
3 |       2159       34.47       88.49
4 |        721       11.51      100.00
------------+-----------------------------------
Total |       6263      100.00

. * Hey, who the heck is raised a 2?  We need labels!!
. label define religions 1 "none" 2 "protest" 3 "catholic" 4 "non_chrst"
. label values relig2 religions
. tab relig2
relig2|      Freq.     Percent        Cum.
------------+-----------------------------------
none |        221        3.53        3.53
protest |       3162       50.49       54.02
catholic |       2159       34.47       88.49
non_chrs |        721       11.51      100.00
------------+-----------------------------------
Total |       6263      100.00

. descr fage*

3. fage83       byte   8.0g                F - AGE @FIRST SEXUAL INTERCOUR
4. fage84       byte   8.0g                F - AGE FIRST SEXUAL INTERCOURS
5. fage85       byte   8.0g                F AGE 1ST HAD SEXUAL INTERCOURS

. *  first the person is asked if they have ever had
. *  sexual intercourse. IF YES for the first time, then the person is
. *  asked age at first intercourse.  IF NO or YES in earlier year then
. *  fage is coded as negative.
. gen byte age = fage83 if fage83>0
(1259 missing values generated)
. * this adds people who said yes for first time in 1984
. replace age = fage84 if fage84>0
(2522 real changes made)
. * now 1985
. replace age = fage85 if fage85>0
(245 real changes made)
. * The variable age is now the age of first intercourse of
. * all people who have had intercourse by 1985
. tab age
age|      Freq.     Percent        Cum.
------------+-----------------------------------
2 |          1        0.02        0.02
3 |          1        0.02        0.04
8 |          3        0.05        0.09
9 |          4        0.07        0.16
10 |         11        0.20        0.36
11 |         11        0.20        0.55
12 |         29        0.52        1.07
13 |         95        1.70        2.77
14 |        236        4.22        6.99
15 |        521        9.31       16.30
16 |       1020       18.23       34.54
17 |       1059       18.93       53.47
18 |       1134       20.27       73.74
19 |        620       11.08       84.82
20 |        370        6.61       91.44
21 |        231        4.13       95.57
22 |        124        2.22       97.78
23 |         62        1.11       98.89
24 |         32        0.57       99.46
25 |         18        0.32       99.79
26 |          9        0.16       99.95
27 |          3        0.05      100.00
------------+-----------------------------------
Total |       5594      100.00

. regress age none cath oth

Source |       SS       df       MS                  Number of obs =    5594
---------+------------------------------               F(  3,  5590) =   31.10
Model |  460.623661     3   153.54122               Prob > F      =  0.0000
Residual |  27598.6357  5590  4.93714414               R-squared     =  0.0164
---------+------------------------------               Adj R-squared =  0.0159
Total |  28059.2594  5593   5.0168531               Root MSE      =   2.222

------------------------------------------------------------------------------
age |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
none |  -.4101829    .160575     -2.554   0.011      -.7249723   -.0953935
cath |    .566983   .0659714      8.594   0.000       .4376534    .6963127
oth |   .3732208   .0980449      3.807   0.000       .1810148    .5654268
_cons |    17.2053   .0412396    417.204   0.000       17.12446    17.28615
------------------------------------------------------------------------------

. * significant t values indicate that age differs
. * significantly for that religion compared to
. * protestant christians (the omitted category)
. * If you ever want to see hatVar(beta_hat), use the matrix "get(VCE)" command
. matrix  eVbhat = get(VCE)
. matrix l eVbhat

symmetric eVbhat[4,4]
none       cath        oth      _cons
none  .02578433
cath   .0017007  .00435223
oth   .0017007   .0017007  .00961279
_cons  -.0017007  -.0017007  -.0017007   .0017007

. * Why don't we include dummy variable for each religion?
. regress age none cath oth prot

Source |       SS       df       MS                  Number of obs =    5594
---------+------------------------------               F(  3,  5590) =   31.10
Model |  460.623661     3   153.54122               Prob > F      =  0.0000
Residual |  27598.6357  5590  4.93714414               R-squared     =  0.0164
---------+------------------------------               Adj R-squared =  0.0159
Total |  28059.2594  5593   5.0168531               Root MSE      =   2.222

------------------------------------------------------------------------------
age |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
none |  -.7834037   .1788735     -4.380   0.000      -1.134065   -.4327422
cath |   .1937622   .1027795      1.885   0.059      -.0077254    .3952499
oth |  (dropped)
prot |  -.3732208   .0980449     -3.807   0.000      -.5654268   -.1810148
_cons |   17.57853   .0889499    197.623   0.000       17.40415     17.7529
------------------------------------------------------------------------------

. * Answer:  because X'X is not invertible!!!!  The four dummies sum to
. * the constant, so Stata drops one of them (here oth).
``````
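Two points from the log can be checked with a few lines of arithmetic. The sketch below is illustrative only. First it takes the coefficients printed by the first regression, converts them to implied group means, and re-expresses them relative to `oth`, the category Stata dropped in the second run. Then it uses a small toy design matrix (hypothetical rows, not the NLSY data) and a made-up helper `rank` to show that the four dummies plus a constant are linearly dependent, which is what makes X'X singular.

```python
from fractions import Fraction

# Coefficients printed by the first regression above (protestant omitted).
first = {"none": -0.4101829, "cath": 0.566983, "oth": 0.3732208, "_cons": 17.2053}

# With only dummy regressors, the constant is the mean for the omitted group
# and each coefficient is a difference from that mean.
means = {
    "prot": first["_cons"],
    "none": first["_cons"] + first["none"],
    "cath": first["_cons"] + first["cath"],
    "oth": first["_cons"] + first["oth"],
}

# Re-express everything relative to "oth", the category dropped in the second run.
rebased = {g: means[g] - means["oth"] for g in ("none", "cath", "prot")}
rebased["_cons"] = means["oth"]


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


# Hypothetical toy sample: one row per person,
# columns are none, cath, oth, prot, _cons.
toy_all = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
toy_three = [row[:3] + row[4:] for row in toy_all]  # drop the prot column

print(rebased)
print(rank(toy_all), "independent columns out of", len(toy_all[0]))
print(rank(toy_three), "independent columns out of", len(toy_three[0]))
```

The rebased values should match the second regression table up to the rounding of the printed coefficients. Since the rank of X'X equals the rank of X, a design matrix with fewer independent columns than columns cannot have an invertible X'X.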


This document was created using HTX, a (HTML/TeX) interlacing program written by Chris Ferrall.
Document Last revised: 1997/1/5