Volume I: Cross-Sectional and Panel Regression Methods

Preface to the Second Edition | xvii
Preface to the First Edition | xix

1 Stata basics | 1
1.1 | 1
1.2 | 2
1.3 Command syntax and operators | 5
1.4 Do-files and log files | 14
1.5 | 19
1.6 Using results from Stata commands | 20
1.7 Global and local macros | 23
1.8 | 26
1.9 Mata and Python in Stata | 29
1.10 Some useful commands | 29
1.11 | 30
1.12 Community-contributed commands | 30
1.13 Additional resources | 31
1.14 | 31

2 Data management and graphics | 33
2.1 | 33
2.2 | 33
2.3 | 36
2.4 | 43
2.5 Manipulating datasets | 60
2.6 Graphical display of data | 67
2.7 | 83
2.8 | 83

3 Linear regression basics | 85
3.1 | 85
3.2 Data and data summary | 85
3.3 Transformation of data before regression | 94
3.4 | 96
3.5 Basic regression analysis | 102
3.6 Specification analysis | 123
3.7 | 132
3.8 | 140
3.9 | 145
3.10 Additional resources | 147
3.11 | 147

4 Linear regression extensions | 149
4.1 | 149
4.2 | 149
4.3 Out-of-sample prediction | 157
4.4 | 161
4.5 | 175
4.6 Regression decomposition analysis | 186
4.7 Shapley decomposition of relative regressor importance | 193
4.8 Difference-in-differences estimators | 195
4.9 | 204
4.10 | 204

5 Simulation | 207
5.1 | 207
5.2 Pseudorandom-number generators | 208
5.3 Distribution of the sample mean | 214
5.4 Pseudorandom-number generators: Further details | 220
5.5 | 227
5.6 Simulation for regression: Introduction | 232
5.7 | 242
5.8 | 242

6 Linear regression with correlated errors | 245
6.1 | 245
6.2 Generalized least-squares and FGLS regression | 246
6.3 Modeling heteroskedastic data | 250
6.4 OLS for clustered data | 256
6.5 FGLS estimators for clustered data | 265
6.6 Fixed-effects estimator for clustered data | 269
6.7 Linear mixed models for clustered data | 277
6.8 Systems of linear regressions | 286
6.9 Survey data: Weighting, clustering, and stratification | 295
6.10 Additional resources | 301
6.11 | 302

7 Linear instrumental-variables regression | 305
7.1 | 305
7.2 Simultaneous equations model | 306
7.3 Instrumental-variables regression | 310
7.4 Instrumental-variables example | 316
7.5 | 330
7.6 Diagnostics and tests for weak instruments | 339
7.7 Inference with weak instruments | 353
7.8 Finite-sample inference with weak instruments | 362
7.9 | 363
7.10 Three-stage least-squares systems estimation | 367
7.11 Additional resources | 368
7.12 | 369

8 Linear panel-data models: Basics | 373
8.1 | 373
8.2 Panel-data methods overview | 373
8.3 Summary of panel data | 379
8.4 Pooled or population-averaged estimators | 394
8.5 Fixed-effects or within estimator | 397
8.6 | 401
8.7 Random-effects estimator | 402
8.8 Comparison of estimators | 406
8.9 First-difference estimator | 412
8.10 Panel-data management | 414
8.11 Additional resources | 418
8.12 | 419

9 Linear panel-data models: Extensions | 421
9.1 | 421
9.2 Panel instrumental-variables estimation | 421
9.3 Hausman-Taylor estimator | 425
9.4 Arellano-Bond estimator | 428
9.5 | 445
9.6 | 456
9.7 | 456

10 Introduction to nonlinear regression | 459
10.1 | 459
10.2 Binary outcome models | 459
10.3 | 462
10.4 MEs and coefficient interpretation | 466
10.5 | 472
10.6 Nonlinear least squares | 474
10.7 Other nonlinear estimators | 476
10.8 Additional resources | 477
10.9 | 477

11 Tests of hypotheses and model specification | 479
11.1 | 479
11.2 Critical values and p-values | 480
11.3 Wald tests and confidence intervals | 485
11.4 Likelihood-ratio tests | 498
11.5 Lagrange multiplier test (or score test) | 502
11.6 | 505
11.7 | 512
11.8 The power onemean command for multiple regression | 519
11.9 | 529
11.10 Permutation tests and randomization tests | 532
11.11 Additional resources | 534
11.12 | 534

12 Bootstrap methods | 537
12.1 | 537
12.2 | 537
12.3 Bootstrap pairs using the vce(bootstrap) option | 539
12.4 Bootstrap pairs using the bootstrap command | 547
12.5 Percentile-t bootstraps with asymptotic refinement | 555
12.6 Wild bootstrap with asymptotic refinement | 560
12.7 Bootstrap pairs using bsample and simulate | 569
12.8 Alternative resampling schemes | 570
12.9 | 575
12.10 Additional resources | 576
12.11 | 577

13 Nonlinear regression methods | 579
13.1 | 579
13.2 Nonlinear example: Doctor visits | 580
13.3 Nonlinear regression methods | 582
13.4 Different estimates of the VCE | 597
13.5 | 604
13.6 | 609
13.7 | 612
13.8 | 629
13.9 | 632
13.10 Additional resources | 640
13.11 | 640

14 Flexible regression: Finite mixtures and nonparametric | 643
14.1 | 643
14.2 Models based on finite mixtures | 644
14.3 FMM example: Earnings of doctors | 650
14.4 | 665
14.5 | 668
14.6 Nonparametric regression | 675
14.7 Partially parametric regression | 680
14.8 Additional resources | 681
14.9 | 681

15 Quantile regression | 683
15.1 | 683
15.2 Conditional quantile regression | 684
15.3 CQR for medical expenditures data | 688
15.4 CQR for generated heteroskedastic data | 699
15.5 Quantile treatment effects for a binary treatment | 703
15.6 Additional resources | 706
15.7 | 707

Appendix A | 709
A.1 Stata matrix commands | 709
A.2 | 716
A.3 | 722
A.4 | 725

Appendix B | 727
B.1 | 727
B.2 | 729
B.3 | 738
B.4 | 740

Appendix C | 741
C.1 Mata moptimize() function | 741
C.2 Mata optimize() function | 751
C.3 | 754

Glossary of abbreviations | 755
References | 761
Author Index | 777
Subject Index | 783

Volume II: Nonlinear Models and Causal Inference Methods

16 Nonlinear optimization methods | 819
16.1 | 819
16.2 Newton-Raphson method | 819
16.3 | 824
16.4 Overview of ml, moptimize(), and optimize() | 829
16.5 The ml command: lf method | 831
16.6 Checking the program | 837
16.7 The ml command: lf0-lf2, d0-d2, and gf0 methods | 844
16.8 Nonlinear instrumental-variables (GMM) example | 851
16.9 Additional resources | 854
16.10 | 854

17 Binary outcome models | 857
17.1 | 857
17.2 Some parametric models | 858
17.3 | 860
17.4 | 862
17.5 Goodness of fit and prediction | 869
17.6 | 877
17.7 | 880
17.8 | 881
17.9 Endogenous regressors | 887
17.10 Grouped and fractional data | 895
17.11 Additional resources | 898
17.12 | 898

18 Multinomial models | 901
18.1 | 901
18.2 Multinomial models overview | 901
18.3 Multinomial example: Choice of fishing mode | 905
18.4 Multinomial logit model | 908
18.5 Alternative-specific conditional logit model | 914
18.6 | 922
18.7 Multinomial probit model | 929
18.8 Alternative-specific random-parameters logit | 934
18.9 Ordered outcome models | 938
18.10 | 942
18.11 Multivariate outcomes | 943
18.12 Additional resources | 946
18.13 | 946

19 Tobit and selection models | 949
19.1 | 949
19.2 | 950
19.3 | 953
19.4 Tobit for lognormal data | 961
19.5 Two-part model in logs | 970
19.6 | 974
19.7 Nonnormal models of selection | 982
19.8 Prediction from models with outcome in logs | 986
19.9 Endogenous regressors | 989
19.10 | 991
19.11 | 995
19.12 Additional resources | 1019
19.13 | 1019

20 Count-data models | 1021
20.1 | 1021
20.2 Modeling strategies for count data | 1022
20.3 Poisson and negative binomial models | 1026
20.4 | 1044
20.5 Finite-mixture models | 1050
20.6 Zero-inflated models | 1069
20.7 Endogenous regressors | 1079
20.8 | 1089
20.9 Quantile regression for count data | 1090
20.10 Additional resources | 1096
20.11 | 1096

21 Survival analysis for duration data | 1099
21.1 | 1099
21.2 Data and data summary | 1100
21.3 Survivor and hazard functions | 1104
21.4 Semiparametric regression model | 1109
21.5 Fully parametric regression models | 1118
21.6 Multiple-records data | 1129
21.7 Discrete-time hazards logit model | 1132
21.8 Time-varying regressors | 1135
21.9 | 1136
21.10 Additional resources | 1137
21.11 | 1137

22 Nonlinear panel models | 1139
22.1 | 1139
22.2 Nonlinear panel-data overview | 1139
22.3 Nonlinear panel-data example | 1145
22.4 Binary outcome and ordered outcome models | 1148
22.5 Tobit and interval-data models | 1167
22.6 | 1172
22.7 Panel quantile regression | 1184
22.8 Endogenous regressors in nonlinear panel models | 1187
22.9 Additional resources | 1188
22.10 | 1188

23 Parametric models for heterogeneity and endogeneity | 1191
23.1 | 1191
23.2 Finite mixtures and unobserved heterogeneity | 1192
23.3 Empirical examples of FMMs | 1195
23.4 Nonlinear mixed-effects models | 1224
23.5 Linear structural equation models | 1231
23.6 Generalized structural equation models | 1251
23.7 ERM commands for endogeneity and selection | 1261
23.8 Additional resources | 1266
23.9 | 1266

24 Randomized control trials and exogenous treatment effects | 1269
24.1 | 1269
24.2 | 1271
24.3 Randomized control trials | 1272
24.4 Regression in an RCT | 1282
24.5 Treatment evaluation with exogenous treatment | 1290
24.6 Treatment evaluation methods and estimators | 1292
24.7 Stata commands for treatment evaluation | 1302
24.8 Oregon Health Insurance Experiment example | 1305
24.9 Treatment-effect estimates using the OHIE data | 1312
24.10 Multilevel treatment effects | 1323
24.11 Conditional quantile TEs | 1332
24.12 Additional resources | 1334
24.13 | 1335

25 Endogenous treatment effects | 1337
25.1 | 1337
25.2 Parametric methods for endogenous treatment | 1338
25.3 ERM commands for endogenous treatment | 1341
25.4 ET commands for binary endogenous treatment | 1348
25.5 The LATE estimator for heterogeneous effects | 1356
25.6 Difference-in-differences and synthetic control | 1363
25.7 Regression discontinuity design | 1369
25.8 Conditional quantile regression with endogenous regressors | 1388
25.9 Unconditional quantiles | 1394
25.10 Additional resources | 1401
25.11 | 1402

26 Spatial regression | 1405
26.1 | 1405
26.2 Overview of spatial regression models | 1406
26.3 | 1407
26.4 The spatial weighting matrix | 1411
26.5 OLS regression and test for spatial correlation | 1413
26.6 Spatial dependence in the error | 1414
26.7 Spatial autocorrelation regression models | 1417
26.8 Spatial instrumental variables | 1427
26.9 Spatial panel-data models | 1428
26.10 Additional resources | 1429
26.11 | 1430

27 Semiparametric regression | 1433
27.1 | 1433
27.2 | 1434
27.3 | 1438
27.4 Nonparametric single regressor example | 1440
27.5 Nonparametric multiple regressor example | 1450
27.6 Partial linear model | 1453
27.7 | 1456
27.8 Generalized additive models | 1458
27.9 Additional resources | 1461
27.10 | 1462

28 Machine learning for prediction and inference | 1465
28.1 | 1465
28.2 Measuring the predictive ability of a model | 1466
28.3 Shrinkage estimators | 1477
28.4 Prediction using lasso, ridge, and elasticnet | 1482
28.5 | 1493
28.6 Machine learning methods for prediction | 1496
28.7 Prediction application | 1501
28.8 Machine learning for inference in partial linear model | 1507
28.9 Machine learning for inference in other models | 1516
28.10 Additional resources | 1523
28.11 | 1524

29 Bayesian methods: Basics | 1527
29.1 | 1527
29.2 Bayesian introductory example | 1528
29.3 Bayesian methods overview | 1532
29.4 | 1538
29.5 | 1549
29.6 A linear regression example | 1552
29.7 Modifying the MH algorithm | 1560
29.8 | 1562
29.9 Bayesian model selection | 1567
29.10 Bayesian prediction | 1569
29.11 | 1572
29.12 Additional resources | 1576
29.13 | 1576

30 Bayesian methods: Markov chain Monte Carlo algorithms | 1579
30.1 | 1579
30.2 User-provided log likelihood | 1579
30.3 MH algorithm in Mata | 1584
30.4 Data augmentation and the Gibbs sampler in Mata | 1589
30.5 | 1595
30.6 Multiple-imputation example | 1599
30.7 Additional resources | 1608
30.8 | 1608

Glossary of abbreviations | 1611
References | 1617
Author Index | 1635
Subject Index | 1641