List of Figures | xvii
List of Tables | xxii
Foreword | xxvi
1 Introduction and objectives | 1 | (14)
1.1 Why write this book? Who might find it useful? Why five volumes? | 2 | (1)
1.1.1 Why write this series? Who might find it useful? | 2 | (1)
| 2 | (1)
1.2 Features you'll find in this book and others in this series | 3 | (4)
| 3 | (1)
1.2.2 The lighter side (humour) | 3 | (1)
| 3 | (1)
| 4 | (1)
1.2.5 Discussions and explanations with a mathematical slant for Formula-philes | 5 | (1)
1.2.6 Discussions and explanations without a mathematical slant for Formula-phobes | 5 | (1)
| 6 | (1)
| 6 | (1)
1.2.9 Useful Microsoft Excel functions and facilities | 6 | (1)
1.2.10 References to authoritative sources | 7 | (1)
| 7 | (1)
1.3 Overview of chapters in this volume | 7 | (1)
1.4 Elsewhere in the 'Working Guide to Estimating & Forecasting' series | 8 | (5)
1.4.1 Volume I: Principles, Process and Practice of Professional Number Juggling | 9 | (1)
1.4.2 Volume II: Probability, Statistics and Other Frightening Stuff | 10 | (1)
1.4.3 Volume III: Best Fit Lines and Curves, and Some Mathe-Magical Transformations | 10 | (1)
1.4.4 Volume IV: Learning, Unlearning and Re-Learning Curves | 11 | (1)
1.4.5 Volume V: Risk, Opportunity, Uncertainty and Other Random Models | 12 | (1)
1.5 Final thoughts and musings on this volume and series | 13 | (1)
| 14 | (1)
2 Measures of Central Tendency: Means, Modes, Medians | 15 | (52)
2.1 'S' is for shivers, statistics and spin | 15 | (2)
2.1.1 Cutting through the mumbo-jumbo: What is or are statistics? | 16 | (1)
2.1.2 Are there any types of statistics that are not 'Descriptive'? | 17 | (1)
2.1.3 Samples, populations and the dreaded statistical bias | 17 | (1)
2.2 Measures of Central Tendency | 17 | (2)
2.2.1 What do we mean by 'Mean'? | 18 | (1)
2.2.2 Can we take the average of an average? | 19 | (1)
2.3 Arithmetic Mean - the Simple Average | 19 | (11)
2.3.1 Properties of Arithmetic Means: A potentially unachievable value! | 21 | (2)
2.3.2 Properties of Arithmetic Means: An unbiased representative value of the whole | 23 | (2)
2.3.3 Why would we not want to use the Arithmetic Mean? | 25 | (1)
2.3.4 Is an Arithmetic Mean useful where there is an upward or downward trend? | 26 | (1)
2.3.5 Average of averages: Can we take the Arithmetic Mean of an Arithmetic Mean? | 27 | (3)
| 30 | (11)
2.4.1 Basic rules and properties of a Geometric Mean | 30 | (1)
2.4.2 When might we want to use a Geometric Mean? | 31 | (2)
2.4.3 Finding a steady state rate of growth or decay with a Geometric Mean | 33 | (6)
2.4.4 Using a Geometric Mean as a Cross-Driver Comparator | 39 | (1)
2.4.5 Using a Geometric Mean with certain Non-Linear Regressions | 39 | (1)
2.4.6 Average of averages: Can we take the Geometric Mean of a Geometric Mean? | 40 | (1)
| 41 | (7)
2.5.1 Surely estimators would never use the Harmonic Mean? | 42 | (3)
2.5.2 Cases where the Harmonic Mean and the Arithmetic Mean are both inappropriate | 45 | (1)
2.5.3 Average of averages: Can we take the Harmonic Mean of a Harmonic Mean? | 45 | (3)
2.6 Quadratic Mean: Root Mean Square | 48 | (3)
2.6.1 When would we ever use a Quadratic Mean? | 48 | (3)
2.7 Comparison of Arithmetic, Geometric, Harmonic and Quadratic Means | 51 | (1)
| 52 | (8)
2.8.1 When would we use the Mode instead of the Arithmetic Mean? | 54 | (1)
2.8.2 What does it mean if we observe more than one Mode? | 54 | (1)
2.8.3 What if we have two modes that occur at adjacent values? | 55 | (1)
2.8.4 Approximating the theoretical Mode when there is no real observable Mode! | 56 | (4)
| 60 | (2)
2.9.1 Primary use of the Median | 61 | (1)
| 61 | (1)
2.10 Choosing a representative value: The 5-Ms | 62 | (3)
2.10.1 Some properties of the 5-Ms | 63 | (2)
| 65 | (1)
| 66 | (1)
3 Measures of Dispersion and Shape | 67 | (58)
3.1 Measures of Dispersion or scatter around a central value | 67 | (1)
3.2 Minimum, Maximum and Range | 68 | (2)
| 70 | (9)
3.3.1 Mean or Average Absolute Deviation (AAD) | 71 | (2)
3.3.2 Median Absolute Deviation (MAD) | 73 | (4)
3.3.3 Is there a Mode Absolute Deviation? | 77 | (1)
3.3.4 When would we use an Absolute Deviation? | 77 | (2)
3.4 Variance and Standard Deviation | 79 | (20)
3.4.1 Variance and Standard Deviation - compensating for small samples | 84 | (7)
3.4.2 Coefficient of Variation | 91 | (2)
3.4.3 The Range Rule - is it myth or magic? | 93 | (6)
3.5 Comparison of deviation-based Measures of Dispersion | 99 | (2)
3.6 Confidence Levels, Limits and Intervals | 101 | (5)
3.6.1 Open and Closed Confidence Level Ranges | 104 | (2)
3.7 Quantiles: Quartiles, Quintiles, Deciles and Percentiles | 106 | (9)
3.7.1 A few more words about Quartiles | 109 | (3)
3.7.2 A few thoughts about Quintiles | 112 | (1)
3.7.3 And a few words about Deciles | 113 | (1)
3.7.4 Finally, a few words about Percentiles | 114 | (1)
3.8 Other Measures of Shape: Skewness and Peakedness | 115 | (8)
3.8.1 Measures of Skewness | 116 | (4)
3.8.2 Measures of Peakedness or Flatness - Kurtosis | 120 | (3)
| 123 | (1)
| 124 | (1)
4 Probability Distributions | 125 | (130)
| 126 | (12)
4.1.1 Discrete Distributions | 127 | (4)
4.1.2 Continuous Distributions | 131 | (6)
4.1.3 Bounding Distributions | 137 | (1)
| 138 | (9)
4.2.1 What is a Normal Distribution? | 138 | (1)
4.2.2 Key properties of a Normal Distribution | 139 | (4)
4.2.3 Where is the Normal Distribution observed? When can, or should, it be used? | 143 | (2)
4.2.4 Probability Density Function and Cumulative Distribution Function | 145 | (1)
4.2.5 Key stats and facts about the Normal Distribution | 146 | (1)
4.3 Uniform Distributions | 147 | (8)
4.3.1 Discrete Uniform Distributions | 147 | (2)
4.3.2 Continuous Uniform Distributions | 149 | (1)
4.3.3 Key properties of a Uniform Distribution | 150 | (3)
4.3.4 Where is the Uniform Distribution observed? When can, or should, it be used? | 153 | (1)
4.3.5 Key stats and facts about the Uniform Distribution | 154 | (1)
4.4 Binomial and Bernoulli Distributions | 155 | (7)
4.4.1 What is a Binomial Distribution? | 155 | (1)
4.4.2 What is a Bernoulli Distribution? | 156 | (1)
4.4.3 Probability Mass Function and Cumulative Distribution Function | 157 | (2)
4.4.4 Key properties of a Binomial Distribution | 159 | (2)
4.4.5 Where is the Binomial Distribution observed? When can, or should, it be used? | 161 | (1)
4.4.6 Key stats and facts about the Binomial Distribution | 161 | (1)
| 162 | (14)
4.5.1 What is a Beta Distribution? | 162 | (2)
4.5.2 Probability Density Function and Cumulative Distribution Function | 164 | (3)
4.5.3 Key properties of a Beta Distribution | 167 | (2)
4.5.4 PERT-Beta or Project Beta Distributions | 169 | (5)
4.5.5 Where is the Beta Distribution observed? When can, or should, it be used? | 174 | (1)
4.5.6 Key stats and facts about the Beta Distribution | 175 | (1)
4.6 Triangular Distributions | 176 | (10)
4.6.1 What is a Triangular Distribution? | 176 | (1)
4.6.2 Probability Density Function and Cumulative Distribution Function | 176 | (2)
4.6.3 Key properties of a Triangular Distribution | 178 | (7)
4.6.4 Where is the Triangular Distribution observed? When can, or should, it be used? | 185 | (1)
4.6.5 Key stats and facts about the Triangular Distribution | 185 | (1)
4.7 Lognormal Distributions | 186 | (9)
4.7.1 What is a Lognormal Distribution? | 186 | (3)
4.7.2 Probability Density Function and Cumulative Distribution Function | 189 | (1)
4.7.3 Key properties of a Lognormal Distribution | 190 | (3)
4.7.4 Where is the Lognormal Distribution observed? When can, or should, it be used? | 193 | (1)
4.7.5 Key stats and facts about the Lognormal Distribution | 194 | (1)
4.8 Weibull Distributions | 195 | (12)
4.8.1 What is a Weibull Distribution? | 195 | (1)
4.8.2 Probability Density Function and Cumulative Distribution Function | 196 | (2)
4.8.3 Key properties of a Weibull Distribution | 198 | (4)
4.8.4 Where is the Weibull Distribution observed? When can, or should, it be used? | 202 | (3)
4.8.5 Key stats and facts about the Weibull Distribution | 205 | (2)
4.9 Poisson Distributions | 207 | (10)
4.9.1 What is a Poisson Distribution? | 207 | (3)
4.9.2 Probability Mass Function and Cumulative Distribution Function | 210 | (1)
4.9.3 Key properties of a Poisson Distribution | 210 | (4)
4.9.4 Where is the Poisson Distribution observed? When can, or should, it be used? | 214 | (2)
4.9.5 Key stats and facts about the Poisson Distribution | 216 | (1)
4.10 Gamma and Chi-Squared Distributions | 217 | (12)
4.10.1 What is a Gamma Distribution? | 217 | (3)
4.10.2 What is a Chi-Squared Distribution? | 220 | (1)
4.10.3 Probability Density Function and Cumulative Distribution Function | 220 | (3)
4.10.4 Key properties of Gamma and Chi-Squared Distributions | 223 | (3)
4.10.5 Where are the Gamma and Chi-Squared Distributions used? | 226 | (2)
4.10.6 Key stats and facts about the Gamma and Chi-Squared Distributions | 228 | (1)
4.11 Exponential Distributions | 229 | (6)
4.11.1 What is an Exponential Distribution? | 229 | (1)
4.11.2 Probability Density Function and Cumulative Distribution Function | 229 | (1)
4.11.3 Key properties of an Exponential Distribution | 230 | (3)
4.11.4 Where is the Exponential Distribution observed? When can, or should, it be used? | 233 | (1)
4.11.5 Key stats and facts about the Exponential Distribution | 234 | (1)
4.12 Pareto Distributions | 235 | (15)
4.12.1 What is a Pareto Distribution? | 235 | (1)
4.12.2 Probability Density Function and Cumulative Distribution Function | 235 | (2)
4.12.3 The Pareto Principle: How does it fit in with the Pareto Distribution? | 237 | (4)
4.12.4 Key properties of a Pareto Distribution | 241 | (5)
4.12.5 Where is the Pareto Distribution observed? When can, or should, it be used? | 246 | (3)
4.12.6 Key stats and facts about the Pareto Distribution | 249 | (1)
4.13 Choosing an appropriate distribution | 250 | (3)
| 253 | (1)
| 253 | (2)
5 Measures of Linearity, Dependence and Correlation | 255 | (85)
| 257 | (7)
5.2 Linear Correlation or Measures of Linear Dependence | 264 | (20)
5.2.1 Pearson's Correlation Coefficient | 264 | (6)
5.2.2 Pearson's Correlation Coefficient - key properties and limitations | 270 | (9)
5.2.3 Correlation is not causation | 279 | (2)
5.2.4 Partial Correlation: Time for some Correlation Chicken | 281 | (1)
5.2.5 Coefficient of Determination | 282 | (2)
| 284 | (27)
5.3.1 Spearman's Rank Correlation Coefficient | 286 | (9)
5.3.2 If Spearman's Rank Correlation is so much trouble, why bother? | 295 | (2)
5.3.3 Interpreting Spearman's Rank Correlation Coefficient | 297 | (4)
5.3.4 Kendall's Tau Rank Correlation Coefficient | 301 | (6)
5.3.5 If Kendall's Tau Rank Correlation is so much trouble, why bother? | 307 | (4)
5.4 Correlation: What if you want to 'Push' it not 'Pull' it? | 311 | (25)
5.4.1 The Pushy Pythagorean Technique or restricting the scatter around a straight line | 312 | (5)
5.4.2 'Controlling Partner' Technique | 317 | (5)
5.4.3 Equivalence of the Pushy Pythagorean and Controlling Partner Techniques | 322 | (1)
5.4.4 'Equal Partners' Technique | 323 | (5)
| 328 | (8)
| 336 | (3)
| 339 | (1)
6 Tails of the unexpected (1): Hypothesis Testing | 340 | (52)
| 341 | (3)
6.1.1 Tails of the unexpected | 342 | (2)
| 344 | (12)
| 345 | (5)
6.2.2 Example: Z-Testing the Mean value of a Normal Distribution | 350 | (2)
6.2.3 Example: Z-Testing the Median value of a Beta Distribution | 352 | (4)
6.3 Student's t-Distribution and t-Tests | 356 | (11)
6.3.1 Student's t-Distribution | 356 | (3)
| 359 | (2)
6.3.3 Performing a t-Test in Microsoft Excel on a single sample | 361 | (3)
6.3.4 Performing a t-Test in Microsoft Excel to compare two samples | 364 | (3)
| 367 | (4)
6.5 Chi-Squared Tests or χ²-Tests | 371 | (4)
6.5.1 Chi-Squared Distribution revisited | 371 | (1)
| 371 | (4)
6.6 F-Distribution and F-Tests | 375 | (5)
| 375 | (2)
| 377 | (1)
6.6.3 Primary use of the F-Distribution | 377 | (3)
6.7 Checking for Normality | 380 | (10)
| 380 | (6)
6.7.2 Using a Chi-Squared Test for Normality | 386 | (3)
6.7.3 Using the Jarque-Bera Test for Normality | 389 | (1)
| 390 | (1)
| 391 | (1)
7 Tails of the unexpected (2): Outing the outliers | 392 | (51)
7.1 Outing the outliers: Detecting and dealing with outliers | 392 | (7)
7.1.1 Mitigation of Type I and Type II outlier errors | 396 | (3)
| 399 | (9)
7.2.1 Tukey Slimline Fences - for larger samples and less tolerance of outliers? | 407 | (1)
7.3 Chauvenet's Criterion | 408 | (8)
7.3.1 Variation on Chauvenet's Criterion for small sample sizes (SSS) | 413 | (1)
7.3.2 Taking a Q-Q perspective on Chauvenet's Criterion for small sample sizes (SSS) | 414 | (2)
| 416 | (3)
7.5 Iglewicz and Hoaglin's MAD Technique | 419 | (6)
| 425 | (4)
7.7 Generalised Extreme Studentised Deviate (GESD) | 429 | (1)
| 430 | (2)
7.9 Doing the JB Swing - using Skewness and Excess Kurtosis to identify outliers | 432 | (5)
7.10 Outlier tests - a comparison | 437 | (3)
| 440 | (1)
| 440 | (3)
Glossary of estimating and forecasting terms | 443 | (19)
Legend for Microsoft Excel Worked Example Tables in Greyscale | 462 | (1)
Index | 463