H. D. Vinod, "Review of GAUSS for Windows, Including Its Numerical Accuracy", Journal of Applied Econometrics, Vol. 15, No. 2, 2000, pp. 211-220.

Dear Reader:

Thank you for your interest in using these programs to test the numerical accuracy of GAUSS. The programs used in this software review may be found in the file vinod-progs.zip. Since this file was created on a Windows system, Unix users should use "unzip -a" when unzipping it.

OUTLINE of the contents of this file:

  A] PURPOSE.
  B] TARGET AUDIENCE.
  C] SUPPORT.
  D] DISCLAIMER.
  E] REFERENCES.
  F] FILE NAMING CONVENTIONS.
  G] RECOMMENDED STEPS.
  H] DETAILED DESCRIPTION OF RUNS WITHIN DIRECTORIES:
     [H.1] Univariate, [H.2] ANOVA, [H.3] OLS, and [H.4] Nonlinear least squares.
  I] BRIEF DESCRIPTION OF Procs.
  J] CONCLUSION.

A] PURPOSE:

These programs were used to create the tables appearing in Vinod (2000, Journal of Applied Econometrics). They show how to test GAUSS by using the NIST (1999) reference datasets and certified answers, available on the Web at the URL given in the list of references below.

The directory structure is designed for the convenience of an occasional user who may want to run only a few test problems, not the entire set. The directory for a problem contains the problem itself, its step-by-step solution by GAUSS on my platform, the certified values, etc.

Any updates to the published information will be posted here. We hope to give the most up-to-date results as new versions of GAUSS are released. Cooperation by readers will be appreciated, and your help will be acknowledged by name.

B] TARGET AUDIENCE:

Use of these programs assumes a good working knowledge of GAUSS syntax and protocol. They were written by a nonprofessional programmer (namely me, H. Vinod) between Jan. and Sept. 1999. I am a professor of economics at Fordham Univ. in New York.

C] SUPPORT:

If you have any questions, please send e-mail to Vinod. Please do not ask questions before making a reasonable attempt at self-help. Please be sure to follow the steps indicated below.
I am only familiar with my particular platform and will not be able to help you with any platform-related problems. I have a PC with a 500 MHz Pentium III and 128 MB of RAM. My e-mail is vinod@fordham.edu

D] DISCLAIMER:

The software is offered "as is" with no guarantees, explicit or implicit. Neither the author nor his employer is in any way responsible for any losses, financial or otherwise, from the use of the software. It may be copied and used for nonprofit and personal use with proper acknowledgment to Vinod. The software may be run on the DOS or the Windows version of GAUSS. You may need separate sets of programs for the two, even though there is a lot of similarity.

E] REFERENCES:

  Vinod, H. D. (2000), 'Review of GAUSS for Windows, Including Its Numerical Accuracy,' Journal of Applied Econometrics, 15(2), 211-220.

  McCullough, B. D. and H. D. Vinod (1999), 'The Numerical Reliability of Econometric Software,' Journal of Economic Literature, 37, 633-665.

  NIST (1999), National Institute of Standards and Technology, US Dept. of Commerce, Technology Administration, Gaithersburg, MD 20899-001, Statistical Reference Data Sets (StRD). Download at: www.itl.nist.gov/div898/strd/general/dataarchive.html

F] FILE NAMING CONVENTIONS:

All files are in plain ASCII text format even though the filename extension is not txt. There are five directories in the zip file:

  root   mainly contains commonly needed GAUSS procedures. Each *.g (procedure) file explains exactly what it does.
  univ   has nine univariate problems.
  anova  has eleven analysis of variance problems, with one subdirectory for each NIST-StRD problem of this set. Let pnam = "problem name"; the directory name is close to the name used in NIST-StRD (sometimes abridged).
  ols    has ordinary least squares problems, with 11 subdirectories, one for each NIST-StRD problem of this set.
  nls    has nonlinear least squares problems, with 27 subdirectories, one for each NIST-StRD problem of this set.

G] RECOMMENDED STEPS:

1) Choose your root directory and copy the programs and subdirectories you are interested in. If you are using the DOS version, check the file called gogauss.bat.

2) Run all procedure (*.g) files on your platform and make sure they are accessible. This is a very important step. It is platform dependent, and I can help only if your platform resembles mine.
   To run the procedures in the Windows version, simply edit and run the vinodjae.src and vinodols.src files. These files collect the *.g files of the DOS version together. vinodols.src comes from the OLS directory and is needed there when one uses the QR algorithm to do the OLS estimation; the QR algorithm is known to have better numerical accuracy than solving the normal equations directly. You must run these files first to make sure everything is OK.
   My Windows version is in C:\gausswin, and I have c:\gausswin\lib\user.lcg. A copy of this file, named user-lcg.txt, is included here for your convenience. You may want your current user.lcg to include these statements. Be sure to check the paths to the files.

3) Run the specific problem files in the proper sequence. Recall that pnam = "problem name" is close to the name in NIST-StRD (sometimes abridged) and is also the subdirectory name.

H] DETAILED DESCRIPTION OF RUNS WITHIN DIRECTORIES

H.1] UNIV: The UNIVARIATE set of NIST has 9 problems. Let "problem name" = pnam for the purpose of the following description. The subdirectory is c:\jae\univ. There are no further subdirectories here.

  pnam       (no extension) GAUSS program
  pnam.out   output file
  pnam.dat   data file downloaded from NIST-StRD
  *.g        procedure files

A new kind of improved mean and standard deviation is implemented, which does improve the NAD (number of accurate digits) score. See McCullough-Vinod (1999) for the theory of improved mean algorithms.

H.2] ANOVA: The Analysis of Variance set has 11 problems and as many subdirectories.
The subdirectory is c:\jae\anova. See the files named anov.g and anov2.g in the root directory c:\jae. Each proc file explains exactly what it does. With pnam = "problem name", each of the 11 problem subdirectories contains:

  pnam.dat   data
  pnam.txt   GAUSS program
  pnam.out   output file (text format)
  pnam.ori   original information downloaded from NIST-StRD

A new kind of improved mean and standard deviation is implemented, which does improve the NAD (number of accurate digits) score. anov2.g does this work.

H.3] OLS: Ordinary least squares. The subdirectory is c:\jae\ols, within which there are 11 subdirectories, one for each OLS problem. For my own procedures used here, see the files named *.g in the root directory c:\jae. Each proc file tries to explain exactly what it does. With "problem name" = pnam, each problem subdirectory contains:

  pnam.dat   data
  pnam.txt   GAUSS program
  pnam.out   output file
  pnam.ori   original information downloaded from NIST-StRD

In some cases the pnam.ori file is absent, but all the information as downloaded is included in the pnam.txt file itself, with generous use of comment symbols.

H.4] NLS: Nonlinear least squares. The subdirectory is c:\jae\nls, within which there are 27 subdirectories, one per problem; as before, pnam denotes the problem name.

We test the GAUSS program called CO (Constrained Optimization) using the NIST-StRD data sets. The CO manual (p. 48) suggests the 'Newton' algorithm (among 5 choices), the 'Brent' line search (4 choices), and central differences for numerical gradients (2 choices). Among the 5x4x2=40 ways of solving each nonlinear regression problem, we expect a typical researcher to use the choice in the manual. Since different choices yield different NADs, one may compare the final values of the minimands. Accordingly, for some examples, we also consider the 'bfgs' algorithm with the 'stepbt' line search and pick the solution with the smallest minimand. Future CO versions should automate ranking and selection among the many choices.
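
The selection rule just described (run several algorithm and line-search choices, keep the one with the smallest minimand, then score it against the certified values) can be sketched as follows. This is an illustrative Python sketch, not the GAUSS code in this archive: the function names, the dictionary layout, and all numbers are made up for the example. The NAD is taken here as -log10 of the relative error, which is an assumption in the spirit of the McCullough (1998) definition cited under numacc.g; clamping the score to the range 0..15 is also an assumption.

```python
import math

def nad(q, c, max_digits=15.0):
    """Number of accurate digits of estimate q against certified value c.

    Assumed definition: -log10(|q - c| / |c|), falling back to the
    absolute error when c is zero, clamped to [0, max_digits].
    """
    if q == c:
        return max_digits
    err = abs(q) if c == 0.0 else abs(q - c) / abs(c)
    return max(0.0, min(max_digits, -math.log10(err)))

def min_nad(estimates, certified):
    """Smallest NAD over all coefficients (the worst coefficient decides)."""
    return min(nad(q, c) for q, c in zip(estimates, certified))

def pick_best(runs):
    """Return the run whose final minimand (objective value) is smallest."""
    return min(runs, key=lambda r: r["minimand"])

# Hypothetical results of two solver configurations on one problem.
certified = [1.0, 2.0]
runs = [
    {"algorithm": "newton", "line_search": "brent",
     "minimand": 3.2e-4, "estimates": [1.001, 2.0]},
    {"algorithm": "bfgs", "line_search": "stepbt",
     "minimand": 1.5e-6, "estimates": [1.000001, 2.0000002]},
]

best = pick_best(runs)
print(best["algorithm"], best["line_search"],
      round(min_nad(best["estimates"], certified), 2))
```

Note that the selection uses only the minimand, which a researcher can observe without knowing the certified answer; the NAD is computed afterwards purely for scoring.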

There are 27 problems in the NLS set, and each problem has its own subdirectory containing:

  pnam.ori   original data
  pnam.1     first-stage run, which generates bh.fmt, a vector of estimates
  pnam.2     second-stage run, to be run only after pnam.1 has been run
  pnam.ou1   output of pnam.1
  pnam.ou2   output of pnam.2

Two stages are necessary to cover the possibility of nonconvergence of the first-stage algorithm. The run times in the output files are for a PC with a 500 MHz Pentium III and 128 MB of RAM.

I] BRIEF DESCRIPTION OF Procs

anov.g
  Proc that computes the analysis of variance ratio
  F statistic = (mean square between group means)/(mean square within groups).
  It calls ssd. Output is within, between, and F.

anov2.g
  Proc similar to anov.g, except that it calls ssd2.

autocor.g
  Proc that computes specified autocorrelations for each column of a matrix. The data are assumed to have 0 mean. Use x = x-meanc(x)' if necessary.

gogauss.bat
  A batch file useful for getting GAUSS to find the needed files, proc files, etc. The way I use it is to type gogauss before anything else, then type gauss, and then edit.

meanc2.g
  Proc that computes numerically more accurate column means.
  Author: H. D. Vinod, March 9, 1999.
  Reference: R. F. Ling, JASA, vol. 69, p. 859.

nistols.g
  Proc to compute the OLS tests using NIST-StRD datasets. DO NOT include a column of ones for the intercept; it is automatic.
  Certified values passed in:
    cb     certified values of b, the regression coefficients
    cstdb  certified standard errors for b
    cr2    certified R-square
  Output: printout of NAD results.
  Needs numacc.g (NAD calculation for scalars) and numaccv2.g (NAD as the minimum from a vector of NADs).

nistolsf.g
  The f means that the regression
  line is forced through the origin; otherwise similar to nistols.g.

nistuni2.g
  Second version of nistuniv.g, using the numerically superior meanc2.g and stdc3.g procs to compute the mean, standard deviation, and autocorrelation coefficient as done by NIST-StRD, and to find the number of accurate digits (NADs).
  Input:
    y      data column vector
    cybar  certified mean
    csig   certified standard deviation
    crho1  certified 1st-order autocorrelation coefficient using the NIST formula
  Output: printout of NADs for the mean, standard deviation, and autocorrelation coefficient.
  Needs numacc.g (NAD calculation for scalars).

nistuniv.g
  Proc similar to nistuni2.g described above. This is the first version, using GAUSS's meanc(x) to compute means.

numacc.g
  Proc to compute the number of accurate digits (NAD), as described in the text of Vinod (2000). See also McCullough, The American Statistician, Nov. 1998, eq. (2), page 360, for the definition of NAD.
  Input:
    q  observed (or estimated) value
    c  certified value
  Output: NAD, the number of accurate digits.

numaccv.g
  Vector version of numacc.g described above. It returns the smallest NAD after computing the NAD for each vector element.

numaccv2.g
  Second vector version of numacc.g. Like numaccv.g, it returns the smallest NAD after computing the NAD for each vector element; in addition, it prints details such as the estimated value, the certified value, and the absolute difference |q-c|.

reshape2.g
  An improved version of GAUSS's reshape, where one does not have to input the number of rows in the data.

ssd.g
  Proc that computes the sum of squared deviations (ssd) from the column means, using GAUSS's meanc.

ssd2.g
  Proc similar to ssd.g above, except that it uses meanc2.g to compute the column means.

J] CONCLUSION:

Again, thanks for your interest. Please help me in updating this. I hope this is useful and eventually leads to better software and more reproducible econometrics.

These routines and the associated files may be used freely for non-commercial purposes, provided that proper attribution is made. Please cite the paper Vinod, H.
D. (2000) "Review of GAUSS for Windows, Including Its Numerical Accuracy", Journal of Applied Econometrics, Vol. 15, No. 2, pp. 211-220.

The routines and files may not be incorporated into any book or computer program without the express, written consent of the author.

H. D. Vinod
vinod@fordham.edu

Updated on: March 27, 2000
Modified by JGM: April 7, 2000