Queen's University at Kingston

HyperMetricsNotes

multiple File 4


D. Examples of Computing and Interpreting OLS Estimates of the MLRM

  1. Purely Numerical Example
    
    . list y x1 x2 x3
    
                 y         x1         x2         x3
      1. -6.545413          1          1          0
      2. -1.589974          2          4   .6931472
      3.  1.504102          3          9   1.098612
      4. -5.820597          4         16   1.386294
      5. -10.82546          5         25   1.609438
      6.   -6.8942          6         36   1.791759
      7. -13.92935          7         49    1.94591
      8. -19.82583          8         64   2.079442
      9. -17.99812          9         81   2.197225
     10. -16.14051         10        100   2.302585
    
    * Note that x2 = x1*x1 and x3 = ln(x1).  This is okay because neither
    * variable is LINEARLY dependent on x1 and the constant
    
    . regress y x1 x2 x3
    
      Source |       SS       df       MS                  Number of obs =      10
    ---------+------------------------------               F(  3,     6) =   14.91
       Model |   402.13325     3  134.044417               Prob > F      =  0.0035
    Residual |  53.9568585     6  8.99280976               R-square      =  0.8817
    ---------+------------------------------               Adj R-square  =  0.8225
       Total |  456.090108     9  50.6766787               Root MSE      =  2.9988
    ------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
          x1 |  -18.23062   6.387585     -2.854   0.029      -33.86047   -2.600757
          x2 |   .7900252   .3449069      2.291   0.062      -.0539316    1.633982
          x3 |   32.77203   11.73369      2.793   0.031       4.060729    61.48333
       _cons |   10.54565   5.461673      1.931   0.102       -2.81858    23.90989
    ------------------------------------------------------------------------------
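The summary statistics in the header of the regress output follow from the sums of squares and degrees of freedom in the ANOVA table. As a rough check, here is a minimal sketch in Python, used purely as a calculator and not part of the original Stata session; the variable names are illustrative.

```python
import math

# Values copied from the ANOVA table in the regress output above.
ss_model, df_model = 402.13325, 3
ss_resid, df_resid = 53.9568585, 6
ss_total, df_total = 456.090108, 9

ms_model = ss_model / df_model      # mean square of the model
ms_resid = ss_resid / df_resid      # mean square of the residuals (s^2)
ms_total = ss_total / df_total      # mean square of y around its mean

r2 = ss_model / ss_total            # R-square
adj_r2 = 1 - ms_resid / ms_total    # Adj R-square
f_stat = ms_model / ms_resid        # F(3, 6)
root_mse = math.sqrt(ms_resid)      # Root MSE

print(round(r2, 4), round(adj_r2, 4), round(f_stat, 2), round(root_mse, 4))
```

The four printed numbers should agree with R-square, Adj R-square, F and Root MSE in the table, up to rounding.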
    
    * These Stata matrix commands will compute the OLS estimates as well
    *  See "help matrix"  
    
    *  The accum subcommand computes X'X using the variables you list
    . matrix accum XpX = x1 x2 x3
    . matrix list XpX, nohalf
                  x1         x2         x3      _cons
       x1        385       3025  102.08283         55
       x2       3025      25333  776.24766        385
       x3  102.08283  776.24766  27.650244  15.104413
    _cons         55        385  15.104413         10
    
    *  The inv() function computes the matrix inverse
    . matrix Xi = inv(XpX)
    . matrix list Xi, nohalf
                   x1          x2          x3       _cons
       x1   4.5370967  -.24149327  -8.1097843   -3.407188
       x2  -.24149327   .01322843   .41656987   .18971409
       x3  -8.1097843   .41656987   15.309946   5.4410995
    _cons   -3.407188   .18971409   5.4410995   3.3170804
    
    * the vecaccum subcommand treats the first listed variable as the y vector
    * and computes X'y
    . matrix vecaccum Xy = y x1 x2 x3
    . matrix Xy = Xy'
    . matrix list Xy, nohalf
                    y
       x1  -703.48823
       x2  -5634.6157
       x3  -182.33711
    _cons  -98.065355
    
    . matrix beta = Xi * Xy
    . matrix list beta, nohalf
                    y
       x1  -18.230615
       x2   .79002516
       x3    32.77203
    _cons   10.545653
    
    * Note that the vector beta is identical to the "coef" column in the regress
    * output table
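The same calculation can be repeated outside Stata. The sketch below is in Python using only the standard library, which is an assumption of this illustration and not part of the original session. It rebuilds X from the ten listed observations, forms X'X and X'y, and solves the normal equations by Gaussian elimination instead of forming the inverse explicitly. Because the listing shows y to only about seven digits, the results should agree with the Stata output to several decimals but not exactly.

```python
import math

# Data copied from the listing above; x2 and x3 are rebuilt from x1.
y = [-6.545413, -1.589974, 1.504102, -5.820597, -10.82546,
     -6.8942, -13.92935, -19.82583, -17.99812, -16.14051]
x1 = list(range(1, 11))
# Columns in Stata's order: x1, x2 = x1^2, x3 = ln(x1), _cons.
X = [[float(v), float(v * v), math.log(v), 1.0] for v in x1]
k = 4

# Normal equations: (X'X) beta = X'y.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]      # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]                   # move the pivot row up
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):                  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

beta = solve(XtX, Xty)
for name, b in zip(["x1", "x2", "x3", "_cons"], beta):
    print(f"{name:>6} {b:12.6f}")
```

Solving the linear system directly is generally preferred to multiplying by an explicit inverse, since X'X here is poorly conditioned (x1, x2 and x3 are all functions of the same variable).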
    
    *  This computes the estimated variance matrix s^2 * inv(X'X), discussed
    *  later.  The scalar 8.99280976 is the Residual MS from the regress table.
    . matrix VCM = 8.99280976 * Xi
    . matrix l VCM, nohalf
                   x1          x2          x3       _cons
       x1   40.801248  -2.1717031  -72.929747  -30.640194
       x2  -2.1717031   .11896076   3.7461336   1.7060627
       x3  -72.929747   3.7461336   137.67943   48.930773
    _cons  -30.640194   1.7060627   48.930773   29.829873
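The link between VCM and the regress table is that each standard error is the square root of the corresponding diagonal element of VCM, and each t-ratio is the coefficient divided by its standard error. A small Python sketch of that arithmetic (Python is an assumption of this illustration, not part of the Stata session):

```python
import math

# Coefficients from the regress table and the diagonal of VCM listed above.
coef = {"x1": -18.23062, "x2": 0.7900252, "x3": 32.77203, "_cons": 10.54565}
var = {"x1": 40.801248, "x2": 0.11896076, "x3": 137.67943, "_cons": 29.829873}

se = {name: math.sqrt(v) for name, v in var.items()}   # standard errors
t = {name: coef[name] / se[name] for name in coef}     # t-ratios

for name in coef:
    print(f"{name:>6}  se = {se[name]:.6f}  t = {t[name]:.3f}")
```

The printed values should match the Std. Err. and t columns of the regress output up to rounding.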
    
    
    

  2. Substantive Example

     Does a person's religious background relate to their sexual behavior? Or, more to the point, was Billy Joel right when he sang "Catholic girls start much too late"? We can actually study this question "scientifically" using the NLSY data used in one or more homework assignments. Here is an annotated log file for the analysis, including data cleaning, summary statistics, and estimates. Uses of the results (hypothesis tests, predictions, and conclusions) come later. (Sound familiar??)
    
    
    
    . keep relig fage* sex
    . *  to simplify matters, look only at women
    . drop if sex==1
    (6403 observations deleted)
    . *  missing values of relig are coded as negative numbers
    . drop if relig<0
    (20 observations deleted)
    . *  religion is coded into 10 categories, will collapse to 4
    . tab relig
         IN WHAT|
    RELIGION WAS|
      R RAISED -|      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |        221        3.53        3.53
              1 |        261        4.17        7.70
              2 |       1796       28.68       36.37
              3 |         95        1.52       37.89
              4 |        320        5.11       43.00
              5 |        523        8.35       51.35
              6 |        167        2.67       54.02
              7 |       2159       34.47       88.49
              8 |         58        0.93       89.41
              9 |        663       10.59      100.00
    ------------+-----------------------------------
          Total |       6263      100.00
    
    . gen none = cond(relig==0,1,0)
    . gen prot = cond(relig<7&relig>0,1,0)
    . gen cath = cond(relig==7,1,0)
    . *  Apologize for grouping everyone else into one bundle
    . gen oth = cond(relig>7,1,0)
    . * Hey, why not have a recoded religion variable (4 values)?
    . gen relig2 = none + 2*prot + 3*cath + 4*oth
    . tab relig2
          relig2|      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |        221        3.53        3.53
              2 |       3162       50.49       54.02
              3 |       2159       34.47       88.49
              4 |        721       11.51      100.00
    ------------+-----------------------------------
          Total |       6263      100.00
    
    . * Hey, who the heck is raised a 2?  We need labels!!
    . label define religions 1 "none" 2 "protest" 3 "catholic" 4 "non_chrst"
    . label values relig2 religions
    . tab relig2 
          relig2|      Freq.     Percent        Cum.
    ------------+-----------------------------------
           none |        221        3.53        3.53
        protest |       3162       50.49       54.02
       catholic |       2159       34.47       88.49
       non_chrs |        721       11.51      100.00
    ------------+-----------------------------------
          Total |       6263      100.00
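The recoding of relig into four categories can be checked against the two frequency tables. The following Python sketch is an illustration only and not part of the Stata session; the function name recode is hypothetical. It applies the same rules as the gen commands to the frequencies from the first table and re-tabulates.

```python
# Frequencies of the original 10-category variable, copied from "tab relig".
relig_freq = {0: 221, 1: 261, 2: 1796, 3: 95, 4: 320,
              5: 523, 6: 167, 7: 2159, 8: 58, 9: 663}

def recode(relig):
    """Map an original code (0-9) to the four-value relig2 code."""
    none = 1 if relig == 0 else 0
    prot = 1 if 0 < relig < 7 else 0
    cath = 1 if relig == 7 else 0
    oth = 1 if relig > 7 else 0
    # Exactly one dummy equals 1, so the sum picks out the category number.
    return none + 2 * prot + 3 * cath + 4 * oth

labels = {1: "none", 2: "protest", 3: "catholic", 4: "non_chrst"}

relig2_freq = {}
for code, n in relig_freq.items():
    key = labels[recode(code)]
    relig2_freq[key] = relig2_freq.get(key, 0) + n

print(relig2_freq)
```

The collapsed counts should reproduce the "tab relig2" table, with codes 1 to 6 pooled as protestant and codes 8 and 9 pooled as other.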
    
    . descr fage*
    
      3. fage83       byte   8.0g                F - AGE @FIRST SEXUAL INTERCOUR
      4. fage84       byte   8.0g                F - AGE FIRST SEXUAL INTERCOURS
      5. fage85       byte   8.0g                F AGE 1ST HAD SEXUAL INTERCOURS
    
    . *  first the person is asked if they have ever had
    . *  sexual intercourse.  IF YES for the first time, then the person is
    . *  asked age at first intercourse.  IF NO, or YES in an earlier year, then
    . *  fage is coded as negative.
    . gen byte age = fage83 if fage83>0
    (1259 missing values generated)
    . * this adds people who said yes for first time in 1984
    . replace age = fage84 if fage84>0
    (2522 real changes made)
    . * now 1985
    . replace age = fage85 if fage85>0
    (245 real changes made)
    . * The variable age is now the age of first intercourse of
    . * all people who have had intercourse by 1985
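The generate/replace sequence above amounts to a simple rule: take the positive value from whichever survey year reports one, with later years overwriting earlier ones. A minimal Python sketch of that rule follows. The function name first_age, the use of None for Stata's missing value, and the example respondents with a negative code of -4 are all hypothetical and are not taken from the NLSY file.

```python
def first_age(fage83, fage84, fage85):
    """Return age at first intercourse, or None if no wave reports one.

    Mirrors the generate/replace sequence: a later wave overwrites an
    earlier one whenever its value is positive.
    """
    age = fage83 if fage83 > 0 else None   # gen byte age = fage83 if fage83>0
    if fage84 > 0:                         # replace age = fage84 if fage84>0
        age = fage84
    if fage85 > 0:                         # replace age = fage85 if fage85>0
        age = fage85
    return age

# Hypothetical respondents (negative values stand for "not asked / no").
print(first_age(17, -4, -4))   # reported in 1983
print(first_age(-4, 18, -4))   # first reported in 1984
print(first_age(-4, -4, -4))   # nothing reported by 1985
```

Respondents for whom the function returns None correspond to the observations that stay missing in age and therefore drop out of the regression sample.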
    . tab age
             age|      Freq.     Percent        Cum.
    ------------+-----------------------------------
              2 |          1        0.02        0.02
              3 |          1        0.02        0.04
              8 |          3        0.05        0.09
              9 |          4        0.07        0.16
             10 |         11        0.20        0.36
             11 |         11        0.20        0.55
             12 |         29        0.52        1.07
             13 |         95        1.70        2.77
             14 |        236        4.22        6.99
             15 |        521        9.31       16.30
             16 |       1020       18.23       34.54
             17 |       1059       18.93       53.47
             18 |       1134       20.27       73.74
             19 |        620       11.08       84.82
             20 |        370        6.61       91.44
             21 |        231        4.13       95.57
             22 |        124        2.22       97.78
             23 |         62        1.11       98.89
             24 |         32        0.57       99.46
             25 |         18        0.32       99.79
             26 |          9        0.16       99.95
             27 |          3        0.05      100.00
    ------------+-----------------------------------
          Total |       5594      100.00
    
    . regress age none cath oth
    
      Source |       SS       df       MS                  Number of obs =    5594
    ---------+------------------------------               F(  3,  5590) =   31.10
       Model |  460.623661     3   153.54122               Prob > F      =  0.0000
    Residual |  27598.6357  5590  4.93714414               R-squared     =  0.0164
    ---------+------------------------------               Adj R-squared =  0.0159
       Total |  28059.2594  5593   5.0168531               Root MSE      =   2.222
    
    ------------------------------------------------------------------------------
         age |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
        none |  -.4101829    .160575     -2.554   0.011      -.7249723   -.0953935
        cath |    .566983   .0659714      8.594   0.000       .4376534    .6963127
         oth |   .3732208   .0980449      3.807   0.000       .1810148    .5654268
       _cons |    17.2053   .0412396    417.204   0.000       17.12446    17.28615
    ------------------------------------------------------------------------------
    
    . * significant t values indicate that mean age at first intercourse
    . * for that religion differs significantly from that of
    . * protestant christians (the omitted category)
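Because the only regressors are group dummies, the fitted value for each group is its sample mean: the intercept is the mean for the omitted group (protestants), and each coefficient is the difference between that group's mean and the protestant mean. A short Python sketch of the implied means (Python is an assumption of this illustration, not part of the Stata session):

```python
# Coefficients copied from the regress table above.
b = {"none": -0.4101829, "cath": 0.566983, "oth": 0.3732208, "_cons": 17.2053}

# Protestants are the omitted group, so their mean is the intercept.
mean_age = {"prot": b["_cons"]}
for g in ("none", "cath", "oth"):
    mean_age[g] = b["_cons"] + b[g]   # intercept plus the group's coefficient

for g, m in mean_age.items():
    print(f"{g:>5} {m:.4f}")
```

On this reading the Catholic women in the sample report a mean age a little over half a year higher than the Protestant women, which is the comparison the Billy Joel question asks about.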
    . * If you ever want to see hatVar(beta_hat), use the matrix "get(VCE)" command
    . matrix  eVbhat = get(VCE)
    . matrix l eVbhat
    
    symmetric eVbhat[4,4]
                none       cath        oth      _cons
     none  .02578433
     cath   .0017007  .00435223
      oth   .0017007   .0017007  .00961279
    _cons  -.0017007  -.0017007  -.0017007   .0017007
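The repeated value .0017007 in this matrix is not a coincidence. In a regression on group dummies only, Var(_cons) = s^2/n_prot and Var(b_g) = s^2*(1/n_g + 1/n_prot), where s^2 is the Residual MS and n_g is the number of observations in group g. Under that standard result, the group sizes in the estimation sample can be backed out from the printed matrix. The Python sketch below is illustrative only, and the recovered sizes are accurate only up to the rounding of the printed values.

```python
s2 = 4.93714414       # Residual MS from the regress output
v_cons = 0.0017007    # estimated variance of _cons (also every covariance, up to sign)
v = {"none": 0.02578433, "cath": 0.00435223, "oth": 0.00961279}

n = {"prot": s2 / v_cons}            # Var(_cons) = s2 / n_prot
for g, vg in v.items():
    n[g] = s2 / (vg - v_cons)        # Var(b_g) = s2 * (1/n_g + 1/n_prot)

sizes = {g: round(x) for g, x in n.items()}
print(sizes, sum(sizes.values()))
```

The implied sizes should add up to the 5594 observations used in the regression, which is a useful consistency check on the interpretation of the matrix.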
    
    . * Why don't we include a dummy variable for each religion?
    . regress age none cath oth prot
    
      Source |       SS       df       MS                  Number of obs =    5594
    ---------+------------------------------               F(  3,  5590) =   31.10
       Model |  460.623661     3   153.54122               Prob > F      =  0.0000
    Residual |  27598.6357  5590  4.93714414               R-squared     =  0.0164
    ---------+------------------------------               Adj R-squared =  0.0159
       Total |  28059.2594  5593   5.0168531               Root MSE      =   2.222
    
    ------------------------------------------------------------------------------
         age |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
        none |  -.7834037   .1788735     -4.380   0.000      -1.134065   -.4327422
        cath |   .1937622   .1027795      1.885   0.059      -.0077254    .3952499
         oth |  (dropped)
        prot |  -.3732208   .0980449     -3.807   0.000      -.5654268   -.1810148
       _cons |   17.57853   .0889499    197.623   0.000       17.40415     17.7529
    ------------------------------------------------------------------------------
    
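    . * Answer:  because X'X is not invertible!!!!  The four dummies sum to 1,
    . * which is exactly the _cons column, so Stata had to drop one of them (oth).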
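Both points can be illustrated with a short Python sketch (not part of the Stata session). The first part uses a hypothetical six-person sample and exact fractions to show that X'X has determinant zero when all four dummies and a constant are included, and a nonzero determinant once one dummy is dropped. The second part shows that the second regression is just the first one with a different baseline group: since Stata dropped oth, every coefficient is now measured relative to oth rather than prot.

```python
from fractions import Fraction

# Hypothetical mini-sample: religion group of each of 6 respondents.
groups = ["none", "prot", "prot", "cath", "cath", "oth"]
names = ["none", "cath", "oth", "prot"]

# Design matrix with all four dummies plus the constant (last column).
X = [[Fraction(int(g == d)) for d in names] + [Fraction(1)] for g in groups]
k = len(X[0])
XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]

def det(A):
    """Exact determinant by fraction-based Gaussian elimination."""
    A = [row[:] for row in A]
    n, d = len(A), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if A[r][c] != 0), None)
        if p is None:
            return Fraction(0)          # no pivot: the matrix is singular
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d                      # a row swap flips the sign
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n):
                A[r][j] -= f * A[c][j]
    return d

# Every row satisfies none + cath + oth + prot = _cons, so X'X is singular.
print(det(XtX))                         # exactly 0
# Dropping one dummy (here prot) restores invertibility.
X3 = [r[:3] + r[4:] for r in X]
XtX3 = [[sum(r[i] * r[j] for r in X3) for j in range(4)] for i in range(4)]
print(det(XtX3))                        # nonzero (4 for this toy sample)

# Rebase the first regression from the prot baseline to the oth baseline.
b = {"none": -0.4101829, "cath": 0.566983, "oth": 0.3732208, "_cons": 17.2053}
rebased = {"none": b["none"] - b["oth"], "cath": b["cath"] - b["oth"],
           "prot": -b["oth"], "_cons": b["_cons"] + b["oth"]}
print(rebased)
```

The rebased values should match the coefficients in the second regress table up to rounding, which is why the two regressions have identical sums of squares, R-squared and F statistics.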
    


This document was created using HTX, a (HTML/TeX) interlacing program written by Chris Ferrall.
Document Last revised: 1997/1/5