|
1 Data, Stats and Stories -- An Introduction |
|
|
1 | (32) |
|
1.1 From Small to Big Data |
|
|
2 | (8) |
|
1.2 Numbers, Facts and Stats |
|
|
10 | (4) |
|
1.3 A Sampled History of Statistics |
|
|
14 | (8) |
|
|
22 | (3) |
|
1.5 Asking Questions and Getting Answers |
|
|
25 | (5) |
|
1.6 Presenting Answers Visually |
|
|
30 | (3) |
|
2 Python Programming Primer |
|
|
33 | (66) |
|
|
35 | (7) |
|
2.1.1 Scripting and Interacting |
|
|
38 | (3) |
|
|
41 | (1) |
|
2.2 Starting Up with Python |
|
|
42 | (9) |
|
|
43 | (1) |
|
2.2.2 Numbers: Integers and Floats |
|
|
43 | (3) |
|
|
46 | (3) |
|
|
49 | (2) |
|
2.3 Collections in Python |
|
|
51 | (29) |
|
|
52 | (8) |
|
|
60 | (1) |
|
|
61 | (5) |
|
|
66 | (6) |
|
|
72 | (8) |
|
2.4 The Beginning of Wisdom: Logic & Control Flow |
|
|
80 | (9) |
|
2.4.1 Booleans and Logical Operators |
|
|
80 | (2) |
|
2.4.2 Conditional Statements |
|
|
82 | (3) |
|
|
85 | (2) |
|
|
87 | (2) |
|
|
89 | (5) |
|
|
94 | (5) |
|
3 Snakes, Bears & Other Numerical Beasts: NumPy, SciPy & pandas |
|
|
99 | (42) |
|
3.1 Numerical Python -- NumPy |
|
|
100 | (12) |
|
3.1.1 Matrices and Vectors |
|
|
101 | (1) |
|
3.1.2 N-Dimensional Arrays |
|
|
102 | (2) |
|
3.1.3 N-Dimensional Matrices |
|
|
104 | (3) |
|
3.1.4 Indexing and Slicing |
|
|
107 | (2) |
|
3.1.5 Descriptive Statistics |
|
|
109 | (3) |
|
3.2 Scientific Python -- SciPy |
|
|
112 | (9) |
|
|
114 | (2) |
|
3.2.2 Numerical Integration |
|
|
116 | (1) |
|
3.2.3 Numerical Optimisation |
|
|
117 | (1) |
|
|
118 | (3) |
|
|
121 | (20) |
|
3.3.1 Series and Dataframes |
|
|
122 | (2) |
|
3.3.2 Data Exploration with pandas |
|
|
124 | (1) |
|
|
125 | (1) |
|
3.3.4 Data Manipulation with pandas |
|
|
126 | (4) |
|
3.3.5 Loading Data to pandas |
|
|
130 | (6) |
|
|
136 | (5) |
|
4 The Measure of All Things -- Statistics |
|
|
141 | (38) |
|
4.1 Descriptive Statistics |
|
|
144 | (1) |
|
4.2 Measures of Central Tendency and Dispersion |
|
|
145 | (1) |
|
|
146 | (17) |
|
|
147 | (3) |
|
|
150 | (2) |
|
|
152 | (3) |
|
|
155 | (4) |
|
|
159 | (4) |
|
|
163 | (13) |
|
4.4.1 Setting the Boundaries: Range |
|
|
163 | (3) |
|
4.4.2 Splitting One's Sides: Quantiles, Quartiles, Percentiles and More |
|
|
166 | (3) |
|
|
169 | (2) |
|
4.4.4 Variance and Standard Deviation |
|
|
171 | (5) |
|
4.5 Data Description -- Descriptive Statistics Revisited |
|
|
176 | (3) |
|
5 Definitely Maybe: Probability and Distributions |
|
|
179 | (88) |
|
|
180 | (2) |
|
5.2 Random Variables and Probability Distributions |
|
|
182 | (9) |
|
|
183 | (2) |
|
5.2.2 Discrete and Continuous Distributions |
|
|
185 | (1) |
|
5.2.3 Expected Value and Variance |
|
|
186 | (5) |
|
5.3 Discrete Probability Distributions |
|
|
191 | (32) |
|
5.3.1 Uniform Distribution |
|
|
191 | (6) |
|
5.3.2 Bernoulli Distribution |
|
|
197 | (4) |
|
5.3.3 Binomial Distribution |
|
|
201 | (7) |
|
5.3.4 Hypergeometric Distribution |
|
|
208 | (8) |
|
5.3.5 Poisson Distribution |
|
|
216 | (7) |
|
5.4 Continuous Probability Distributions |
|
|
223 | (24) |
|
5.4.1 Normal or Gaussian Distribution |
|
|
224 | (11) |
|
5.4.2 Standard Normal Distribution Z |
|
|
235 | (3) |
|
5.4.3 Shape and Moments of a Distribution |
|
|
238 | (7) |
|
5.4.4 The Central Limit Theorem |
|
|
245 | (2) |
|
5.5 Hypothesis and Confidence Intervals |
|
|
247 | (20) |
|
5.5.1 Student's t Distribution |
|
|
253 | (7) |
|
5.5.2 Chi-squared Distribution |
|
|
260 | (7) |
|
6 Alluring Arguments and Ugly Facts -- Statistical Modelling and Hypothesis Testing |
|
|
267 | (116) |
|
|
268 | (11) |
|
6.1.1 Tales and Tails: One- and Two-Tailed Tests |
|
|
273 | (6) |
|
|
279 | (12) |
|
|
280 | (2) |
|
|
282 | (3) |
|
6.2.3 D'Agostino K-squared Test |
|
|
285 | (3) |
|
6.2.4 Kolmogorov-Smirnov Test |
|
|
288 | (3) |
|
|
291 | (5) |
|
|
291 | (2) |
|
|
293 | (3) |
|
6.4 Linear Correlation and Regression |
|
|
296 | (16) |
|
6.4.1 Pearson Correlation |
|
|
296 | (5) |
|
|
301 | (7) |
|
6.4.3 Spearman Correlation |
|
|
308 | (4) |
|
6.5 Hypothesis Testing with One Sample |
|
|
312 | (12) |
|
6.5.1 One-Sample t-test for the Population Mean |
|
|
312 | (4) |
|
6.5.2 One-Sample z-test for Proportions |
|
|
316 | (4) |
|
6.5.3 Wilcoxon Signed Rank with One-Sample |
|
|
320 | (4) |
|
6.6 Hypothesis Testing with Two Samples |
|
|
324 | (21) |
|
6.6.1 Two-Sample t-test -- Comparing Means, Same Variances |
|
|
325 | (5) |
|
6.6.2 Levene's Test -- Testing Homoscedasticity |
|
|
330 | (2) |
|
6.6.3 Welch's t-test -- Comparing Means, Different Variances |
|
|
332 | (2) |
|
6.6.4 Mann-Whitney Test -- Testing Non-normal Samples |
|
|
334 | (4) |
|
6.6.5 Paired Sample t-test |
|
|
338 | (4) |
|
6.6.6 Wilcoxon Matched Pairs |
|
|
342 | (3) |
|
|
345 | (31) |
|
6.7.1 One-factor or One-way ANOVA |
|
|
347 | (13) |
|
|
360 | (1) |
|
6.7.3 Repeated Measures ANOVA |
|
|
361 | (4) |
|
6.7.4 Kruskal-Wallis -- Non-parametric One-way ANOVA |
|
|
365 | (4) |
|
6.7.5 Two-factor or Two-way ANOVA |
|
|
369 | (7) |
|
6.8 Tests as Linear Models |
|
|
376 | (7) |
|
6.8.1 Pearson and Spearman Correlations |
|
|
377 | (1) |
|
6.8.2 One-sample t- and Wilcoxon Signed Rank Tests |
|
|
378 | (1) |
|
6.8.3 Two-Sample t- and Mann-Whitney Tests |
|
|
379 | (1) |
|
6.8.4 Paired Sample t- and Wilcoxon Matched Pairs Tests |
|
|
380 | (1) |
|
6.8.5 One-way ANOVA and Kruskal-Wallis Test |
|
|
380 | (3) |
|
7 Delightful Details -- Data Visualisation |
|
|
383 | (34) |
|
7.1 Presenting Statistical Quantities |
|
|
384 | (3) |
|
7.1.1 Textual Presentation |
|
|
385 | (1) |
|
7.1.2 Tabular Presentation |
|
|
385 | (1) |
|
7.1.3 Graphical Presentation |
|
|
386 | (1) |
|
7.2 Can You Draw Me a Picture? -- Data Visualisation |
|
|
387 | (7) |
|
7.3 Design and Visual Representation |
|
|
394 | (8) |
|
7.4 Plotting and Visualising: Matplotlib |
|
|
402 | (5) |
|
7.4.1 Keep It Simple: Plotting Functions |
|
|
403 | (1) |
|
7.4.2 Line Styles and Colours |
|
|
404 | (1) |
|
|
405 | (1) |
|
|
406 | (1) |
|
|
407 | (1) |
|
|
407 | (3) |
|
|
410 | (4) |
|
7.8 Data Visualisation -- Best Practices |
|
|
414 | (3) |
|
8 Dazzling Data Designs -- Creating Charts |
|
|
417 | (60) |
|
8.1 What Is the Right Visualisation for Me? |
|
|
417 | (3) |
|
8.2 Data Visualisation and Python |
|
|
420 | (10) |
|
8.2.1 Data Visualisation with Pandas |
|
|
421 | (2) |
|
|
423 | (2) |
|
|
425 | (3) |
|
|
428 | (2) |
|
|
430 | (8) |
|
|
438 | (2) |
|
|
440 | (7) |
|
|
447 | (5) |
|
|
452 | (7) |
|
|
459 | (5) |
|
|
464 | (4) |
|
|
468 | (9) |
|
A Variance: Population v Sample |
|
|
477 | (2) |
|
B Sum of First n Integers |
|
|
479 | (2) |
|
C Sum of Squares of the First n Integers |
|
|
481 | (2) |
|
D The Binomial Coefficient |
|
|
483 | (2) |
|
D.1 Some Useful Properties of the Binomial Coefficient |
|
|
484 | (1) |
|
E The Hypergeometric Distribution |
|
|
485 | (6) |
|
E.1 The Hypergeometric vs Binomial Distribution |
|
|
485 | (2) |
|
F The Poisson Distribution |
|
|
487 | (1) |
|
F.1 Derivation of the Poisson Distribution |
|
|
487 | (4) |
|
F.2 The Poisson Distribution as a Limit of the Binomial Distribution |
|
|
|
G The Normal Distribution |
|
|
491 | (4) |
|
G.1 Integrating the PDF of the Normal Distribution |
|
|
491 | (2) |
|
G.2 Maximum and Inflection Points of the Normal Distribution |
|
|
493 | (2) |
|
|
495 | (2) |
|
I Kruskal-Wallis Test -- No Ties |
|
|
497 | (4) |
Bibliography |
|
501 | (10) |
Index |
|
511 | |