Preface | xiii | |
Contributors | xv | |
| 1 | (12) |
| 1 | (2) |
1.2 The structure of this book | 3 | (8) |
| 11 | (2) |
| 11 | (2) |
2 On secondary analysis of datasets that cannot be linked without errors | 13 | (26) |
| 13 | (3) |
| 14 | (1) |
2.1.2 Outline of investigation | 15 | (1) |
2.2 The linkage data structure | 16 | (4) |
| 17 | (1) |
2.2.2 Agreement partition of match space | 18 | (2) |
2.3 On maximum likelihood estimation | 20 | (2) |
2.4 On analysis under the comparison data model | 22 | (8) |
2.4.1 Linear regression under the linkage model | 22 | (2) |
2.4.2 Linear regression under the comparison data model | 24 | (1) |
2.4.3 Comparison data modelling (I) | 25 | (2) |
2.4.4 Comparison data modelling (II) | 27 | (3) |
2.5 On link subset analysis | 30 | (4) |
2.5.1 Non-informative balanced selection | 30 | (3) |
2.5.2 Illustration for the C-PR data | 33 | (1) |
| 34 | (5) |
| 35 | (4) |
3 Capture-recapture methods in the presence of linkage errors | 39 | (34) |
| 39 | (1) |
3.2 The capture-recapture model: short formalization and notation | 40 | (2) |
3.3 The linkage models and the linkage errors | 42 | (5) |
3.3.1 The Fellegi and Sunter linkage model | 42 | (2) |
3.3.2 Definition and estimation of linkage errors | 44 | (1) |
3.3.3 Bayesian approaches to record linkage | 45 | (2) |
3.4 The DSE in the presence of linkage errors | 47 | (10) |
3.4.1 The Ding and Fienberg estimator | 47 | (1) |
3.4.2 The modified Ding and Fienberg estimator | 48 | (1) |
| 49 | (3) |
| 52 | (5) |
3.5 Linkage-error adjustments in the case of multiple lists | 57 | (8) |
3.5.1 Log-linear model-based estimators | 57 | (3) |
3.5.2 An alternative modelling approach | 60 | (1) |
3.5.3 A Bayesian proposal | 61 | (1) |
| 62 | (3) |
| 65 | (8) |
| 66 | (7) |
4 An overview on uncertainty and estimation in statistical matching | 73 | (28) |
| 73 | (2) |
4.2 Statistical matching problem: notations and technicalities | 75 | (2) |
4.3 The joint distribution of variables not jointly observed: estimation and uncertainty | 77 | (10) |
| 81 | (2) |
4.3.2 Bounding the matching error via measures of uncertainty | 83 | (4) |
4.4 Statistical matching for complex sample surveys | 87 | (7) |
4.4.1 Technical assumptions on the sample designs | 88 | (2) |
4.4.2 A proposal for choosing a matching distribution | 90 | (1) |
4.4.3 Reliability of the matching distribution | 91 | (2) |
4.4.4 Evaluation of the matching reliability as a hypothesis problem | 93 | (1) |
4.5 Conclusions and pending issues: relationship between the statistical matching problem and ecological inference | 94 | (7) |
| 96 | (5) |
5 Auxiliary variable selection in a statistical matching problem | 101 | (20) |
| 101 | (2) |
5.2 Choice of the matching variables | 103 | (8) |
5.2.1 Traditional methods based on association | 104 | (1) |
5.2.2 Choosing the matching variables by uncertainty reduction | 105 | (1) |
5.2.3 An illustrative example | 106 | (3) |
5.2.4 The penalised uncertainty measure | 109 | (2) |
5.3 Simulations with European Social Survey data | 111 | (6) |
| 117 | (4) |
| 117 | (4) |
6 Minimal inference from incomplete 2 × 2-tables | 121 | (16) |
| 121 | (4) |
| 125 | (2) |
6.3 Maximum corroboration set | 127 | (3) |
6.4 High assurance estimation of $\Theta_0$ | 130 | (1) |
| 131 | (1) |
6.6 Application: missing OCBGT data | 132 | (5) |
| 133 | (4) |
7 Dual- and multiple-system estimation with fully and partially observed covariates | 137 | (32) |
Peter G. M. van der Heijden |
| 138 | (2) |
7.2 Theory concerning invariant population-size estimates | 140 | (6) |
7.2.1 Terminology and properties | 140 | (2) |
| 142 | (2) |
7.2.3 Graphical representation of log-linear models | 144 | (1) |
| 145 | (1) |
7.3 Applications of invariant population-size estimation | 146 | (2) |
7.3.1 Modelling strategies with active and passive covariates | 146 | (1) |
7.3.2 Working with invariant population-size estimates | 147 | (1) |
7.4 Dealing with partially observed covariates | 148 | (6) |
7.4.1 Framework for population-size estimation with partially observed covariates | 148 | (2) |
| 150 | (2) |
7.4.3 Interaction graphs for models with incomplete covariates | 152 | (1) |
7.4.4 Results of model fitting | 152 | (2) |
7.5 Precision and sensitivity | 154 | (3) |
| 154 | (2) |
| 156 | (1) |
7.5.3 Comparison of the EM algorithm with the classical model | 157 | (1) |
7.6 An application when the same variable is measured differently in both registers | 157 | (4) |
7.6.1 Example: Injuries in road accidents in the Netherlands | 158 | (2) |
7.6.2 More detailed breakdown of transport mode in accidents | 160 | (1) |
| 161 | (8) |
7.7.1 Alternative approaches | 161 | (3) |
| 164 | (1) |
| 165 | (4) |
8 Estimating population size in multiple record systems with uncertainty of state identification | 169 | (28) |
| 169 | (3) |
8.2 A latent class model for capture-recapture | 172 | (9) |
8.2.1 Decomposable models | 174 | (2) |
| 176 | (1) |
| 176 | (2) |
| 178 | (1) |
8.2.5 A mixture of different components | 178 | (1) |
| 179 | (2) |
8.3 Observed heterogeneity of capture probabilities | 181 | (5) |
| 181 | (1) |
| 182 | (4) |
8.4 Evaluating the interpretation of the latent classes | 186 | (1) |
| 187 | (10) |
| 189 | (2) |
8.5.2 Simulations results | 191 | (1) |
| 192 | (5) |
9 Log-linear models of erroneous list data | 197 | (22) |
| 197 | (2) |
9.2 Log-linear models of incomplete contingency tables | 199 | (1) |
9.3 Modelling marginally classified list errors | 200 | (6) |
| 200 | (3) |
9.3.2 Maximum likelihood estimation | 203 | (1) |
9.3.3 Estimation based on list-survey data | 204 | (2) |
9.4 Model selection with zero degree of freedom | 206 | (6) |
9.4.1 Latent likelihood ratio criterion | 206 | (3) |
| 209 | (3) |
9.5 Homelessness data in the Netherlands | 212 | (7) |
9.5.1 Data and previous study | 212 | (1) |
9.5.2 Analysis allowing for erroneous enumeration | 213 | (4) |
| 217 | (2) |
10 Sampling design and analysis using geo-referenced data | 219 | (28) |
| 219 | (2) |
10.2 Geo-referenced data and potential locational errors | 221 | (1) |
10.3 A brief review of spatially balanced sampling methods | 222 | (4) |
10.3.1 Local pivotal methods | 223 | (1) |
10.3.2 Spatially correlated Poisson sampling | 224 | (1) |
10.3.3 Balanced sampling through the cube method | 225 | (1) |
| 225 | (1) |
10.4 Spatial sampling for estimation of under-coverage rate | 226 | (6) |
10.5 Business surveys in the presence of locational errors | 232 | (7) |
| 239 | (8) |
| 240 | (7) |
Index | 247 | |