
Statistics and Data Visualisation with Python [Paperback]

Jesús Rogel-Salazar (Imperial College London, UK)
  • Format: Paperback / softback, 516 pages, height x width: 235x191 mm, weight: 1040 g, 33 Tables, black and white; 4 Line drawings, color; 72 Line drawings, black and white; 1 Halftones, black and white; 4 Illustrations, color; 73 Illustrations, black and white
  • Series: Chapman & Hall/CRC The Python Series
  • Publication date: 31-Jan-2023
  • Publisher: Chapman & Hall/CRC
  • ISBN-10: 0367744511
  • ISBN-13: 9780367744519

This book is intended to serve as a bridge in statistics for graduates and business practitioners interested in using their skills in data science and analytics, as well as in statistical analysis in general. On the one hand, it is a refresher for readers who have taken some courses in statistics but have not necessarily used it in their day-to-day work. On the other hand, the material is suitable for readers encountering statistical work in Python for the first time. Statistics and Data Visualisation with Python aims to build statistical knowledge from the ground up by enabling the reader to understand the ideas behind inferential statistics and to begin formulating hypotheses that form the foundations for the applications and algorithms in statistical analysis, business analytics, machine learning and applied machine learning. The book begins with the basics of programming in Python and data analysis, to help construct a solid basis in statistical methods and hypothesis testing, which are useful in many modern applications.




1 Data, Stats and Stories -- An Introduction
1.1 From Small to Big Data
1.2 Numbers, Facts and Stats
1.3 A Sampled History of Statistics
1.4 Statistics Today
1.5 Asking Questions and Getting Answers
1.6 Presenting Answers Visually
2 Python Programming Primer
2.1 Talking to Python
2.1.1 Scripting and Interacting
2.1.2 Jupyter Notebook
2.2 Starting Up with Python
2.2.1 Types in Python
2.2.2 Numbers: Integers and Floats
2.2.3 Strings
2.2.4 Complex Numbers
2.3 Collections in Python
2.3.1 Lists
2.3.2 List Comprehension
2.3.3 Tuples
2.3.4 Dictionaries
2.3.5 Sets
2.4 The Beginning of Wisdom: Logic & Control Flow
2.4.1 Booleans and Logical Operators
2.4.2 Conditional Statements
2.4.3 While Loop
2.4.4 For Loop
2.5 Functions
2.6 Scripts and Modules
3 Snakes, Bears & Other Numerical Beasts: NumPy, SciPy & pandas
3.1 Numerical Python -- NumPy
3.1.1 Matrices and Vectors
3.1.2 N-Dimensional Arrays
3.1.3 N-Dimensional Matrices
3.1.4 Indexing and Slicing
3.1.5 Descriptive Statistics
3.2 Scientific Python -- SciPy
3.2.1 Matrix Algebra
3.2.2 Numerical Integration
3.2.3 Numerical Optimisation
3.2.4 Statistics
3.3 Panel Data = pandas
3.3.1 Series and Dataframes
3.3.2 Data Exploration with pandas
3.3.3 Pandas Data Types
3.3.4 Data Manipulation with pandas
3.3.5 Loading Data to pandas
3.3.6 Data Grouping
4 The Measure of All Things -- Statistics
4.1 Descriptive Statistics
4.2 Measures of Central Tendency and Dispersion
4.3 Central Tendency
4.3.1 Mode
4.3.2 Median
4.3.3 Arithmetic Mean
4.3.4 Geometric Mean
4.3.5 Harmonic Mean
4.4 Dispersion
4.4.1 Setting the Boundaries: Range
4.4.2 Splitting One's Sides: Quantiles, Quartiles, Percentiles and More
4.4.3 Mean Deviation
4.4.4 Variance and Standard Deviation
4.5 Data Description -- Descriptive Statistics Revisited
5 Definitely Maybe: Probability and Distributions
5.1 Probability
5.2 Random Variables and Probability Distributions
5.2.1 Random Variables
5.2.2 Discrete and Continuous Distributions
5.2.3 Expected Value and Variance
5.3 Discrete Probability Distributions
5.3.1 Uniform Distribution
5.3.2 Bernoulli Distribution
5.3.3 Binomial Distribution
5.3.4 Hypergeometric Distribution
5.3.5 Poisson Distribution
5.4 Continuous Probability Distributions
5.4.1 Normal or Gaussian Distribution
5.4.2 Standard Normal Distribution Z
5.4.3 Shape and Moments of a Distribution
5.4.4 The Central Limit Theorem
5.5 Hypothesis and Confidence Intervals
5.5.1 Student's t Distribution
5.5.2 Chi-squared Distribution
6 Alluring Arguments and Ugly Facts -- Statistical Modelling and Hypothesis Testing
6.1 Hypothesis Testing
6.1.1 Tales and Tails: One- and Two-Tailed Tests
6.2 Normality Testing
6.2.1 Q-Q Plot
6.2.2 Shapiro-Wilk Test
6.2.3 D'Agostino K-squared Test
6.2.4 Kolmogorov-Smirnov Test
6.3 Chi-square Test
6.3.1 Goodness of Fit
6.3.2 Independence
6.4 Linear Correlation and Regression
6.4.1 Pearson Correlation
6.4.2 Linear Regression
6.4.3 Spearman Correlation
6.5 Hypothesis Testing with One Sample
6.5.1 One-Sample t-test for the Population Mean
6.5.2 One-Sample z-test for Proportions
6.5.3 Wilcoxon Signed Rank with One-Sample
6.6 Hypothesis Testing with Two Samples
6.6.1 Two-Sample t-test -- Comparing Means, Same Variances
6.6.2 Levene's Test -- Testing Homoscedasticity
6.6.3 Welch's t-test -- Comparing Means, Different Variances
6.6.4 Mann-Whitney Test -- Testing Non-normal Samples
6.6.5 Paired Sample t-test
6.6.6 Wilcoxon Matched Pairs
6.7 Analysis of Variance
6.7.1 One-factor or One-way ANOVA
6.7.2 Tukey's Range Test
6.7.3 Repeated Measures ANOVA
6.7.4 Kruskal-Wallis -- Non-parametric One-way ANOVA
6.7.5 Two-factor or Two-way ANOVA
6.8 Tests as Linear Models
6.8.1 Pearson and Spearman Correlations
6.8.2 One-sample t- and Wilcoxon Signed Rank Tests
6.8.3 Two-Sample t- and Mann-Whitney Tests
6.8.4 Paired Sample t- and Wilcoxon Matched Pairs Tests
6.8.5 One-way ANOVA and Kruskal-Wallis Test
7 Delightful Details -- Data Visualisation
7.1 Presenting Statistical Quantities
7.1.1 Textual Presentation
7.1.2 Tabular Presentation
7.1.3 Graphical Presentation
7.2 Can You Draw Me a Picture? -- Data Visualisation
7.3 Design and Visual Representation
7.4 Plotting and Visualising: Matplotlib
7.4.1 Keep It Simple: Plotting Functions
7.4.2 Line Styles and Colours
7.4.3 Titles and Labels
7.4.4 Grids
7.5 Multiple Plots
7.6 Subplots
7.7 Plotting Surfaces
7.8 Data Visualisation -- Best Practices
8 Dazzling Data Designs -- Creating Charts
8.1 What Is the Right Visualisation for Me?
8.2 Data Visualisation and Python
8.2.1 Data Visualisation with Pandas
8.2.2 Seaborn
8.2.3 Bokeh
8.2.4 Plotly
8.3 Scatter Plot
8.4 Line Chart
8.5 Bar Chart
8.6 Pie Chart
8.7 Histogram
8.8 Box Plot
8.9 Area Chart
8.10 Heatmap
A Variance: Population v Sample
B Sum of First n Integers
C Sum of Squares of the First n Integers
D The Binomial Coefficient
D.1 Some Useful Properties of the Binomial Coefficient
E The Hypergeometric Distribution
E.1 The Hypergeometric vs Binomial Distribution
F The Poisson Distribution
F.1 Derivation of the Poisson Distribution
F.2 The Poisson Distribution as a Limit of the Binomial Distribution
G The Normal Distribution
G.1 Integrating the PDF of the Normal Distribution
G.2 Maximum and Inflection Points of the Normal Distribution
H Skewness and Kurtosis
I Kruskal-Wallis Test -- No Ties
Bibliography
Index
Dr. Jesús Rogel-Salazar is an accomplished technologist with over 20 years' experience in the area of data science and machine learning. He obtained his PhD in quantum atom optics at Imperial College in the group of Professor Geoff New and in collaboration with the Bose-Einstein Condensation Group in Oxford with Professor Keith Burnett. After completion of his doctorate, he worked in the Centre for Cold Matter at Imperial and moved on to the Department of Mathematics in the Applied Analysis and Computation Group. Further to his academic career, Dr. Rogel-Salazar has held positions as a data scientist with AKQA, IBM Data Science Studios, Barclays, Dow Jones, Prudential and Tympa Health Technologies. He is the author of three books published with CRC Press; the latest two are entitled Data Science and Analytics with Python and Advanced Data Science and Analytics with Python.