This textbook gives students the tools they need to analyze complex data using methods from data science, machine learning, and artificial intelligence. The authors present the methods together with applications in the programming language R, which is the gold standard for analyzing data. They cover all three main components of data science: computer science; mathematics and statistics; and domain knowledge. Methods and their implementations in R are presented side by side, which allows immediate practical application of the concepts and teaches computational thinking in a natural way. The book includes exercises, case studies, Q&A, and examples.
1. Introduction
2. Introduction to learning from data
3. Part 1: General topics
4. Prediction models
5. Error measures
6. Resampling
7. Data types
8. Part 2: Core methods
9. Maximum Likelihood & Bayesian analysis
10. Clustering
11. Dimension Reduction
12. Classification
13. Hypothesis testing
14. Linear Regression
15. Model Selection
16. Part 3: Advanced topics
17. Regularization
18. Deep neural networks
19. Multiple hypothesis testing
20. Survival analysis
21. Generalization error
22. Theoretical foundations
23. Conclusion
Frank Emmert-Streib is Professor of Data Science at Tampere University (Finland). He leads the Predictive Society and Data Analytics Lab, which pursues innovative research in deep learning and natural language processing. The Lab develops and applies high-dimensional methods in machine learning, statistics, and artificial intelligence that can be used to extract knowledge from data in the fields of biology, medicine, social media, social sciences, marketing, or business.
Salissou Moutari is Senior Lecturer at Queen's University Belfast (UK) and Interim Director of Research of the Mathematical Science Research Centre (MSRC). His research interests include mathematical modelling, optimization, machine learning, and data science, and the application of these methods to problems in traffic, transportation and distribution systems, production planning, and industrial processes.
Matthias Dehmer is Professor at UMIT (Austria) and also holds a position at the Swiss Distance University of Applied Sciences in Brig, Switzerland. His research interests are in complex networks, complexity, data science, machine learning, big data analytics, and information theory. In particular, he works on machine-learning-based methods to analyse high-dimensional data.