Fundamentals of Data Science [Hardcover]

(SP Pune Univ.), (PC College of Eng., Pune, India), (Department of Information Technology, Government College of Engineering, Karad, India)
  • Format: Hardback, 282 pages, height x width: 234x156 mm, weight: 539 g, 56 Tables, black and white; 140 Line drawings, black and white; 3 Halftones, black and white; 143 Illustrations, black and white
  • Publication date: 27-Sep-2021
  • Publisher: CRC Press
  • ISBN-10: 1138336181
  • ISBN-13: 9781138336186
Fundamentals of Data Science is designed for students, academicians and practitioners, offering a complete walkthrough from the foundational groundwork to the concepts, techniques and tools required to understand Data Science.

Data Science is an umbrella term for the non-traditional techniques and technologies that are required to collect, aggregate, process, and gain insights from massive datasets. This book covers the processes, methodologies, and steps, such as data acquisition, pre-processing, mining, prediction, and visualization, along with the tools for extracting insights from vast amounts of data using scientific methods and algorithms.

Readers will learn the steps necessary to create applications with SQL, NoSQL, Python, R, MATLAB, Octave and Tableau.

This book provides a stepwise approach to building solutions to data science applications, from understanding the fundamentals and performing data analytics to writing source code. All the concepts are discussed in simple English to help readers become data scientists without much prerequisite knowledge.

Features:

  • Simple strategies for developing statistical models that analyze data and detect patterns, trends, and relationships in data sets.
  • A complete roadmap to the Data Science approach, with dedicated sections covering Fundamentals, Methodology and Tools.
  • A focused approach to learning and practicing various Data Science tools, with sample code and examples for practice.
  • Information presented in an accessible way for students, researchers, academicians and professionals.

Preface
Authors
Part I Introduction to Data Science
1 Importance of Data Science
1.1 Need for Data Science
1.2 What Is Data Science?
1.3 Data Science Process
1.4 Business Intelligence and Data Science
1.5 Prerequisites for a Data Scientist
1.6 Components of Data Science
1.7 Tools and Skills Needed
1.8 Summary
References
2 Statistics and Probability
2.1 Data Types
2.2 Variable Types
2.3 Statistics
2.4 Sampling Techniques and Probability
2.5 Information Gain and Entropy
2.6 Probability Theory
2.7 Probability Types
2.8 Probability Distribution Functions
2.9 Bayes' Theorem
2.10 Inferential Statistics
2.11 Summary
References
3 Databases for Data Science
3.1 SQL - Tool for Data Science
3.1.1 Basic Statistics with SQL
3.1.2 Data Munging with SQL
3.1.3 Filtering, Joins, and Aggregation
3.1.4 Window Functions and Ordered Data
3.1.5 Preparing Data for Analytics Tool
3.2 Advanced NoSQL for Data Science
3.2.1 Why NoSQL
3.2.2 Document Databases for Data Science
3.2.3 Wide-Column Databases for Data Science
3.2.4 Graph Databases for Data Science
3.3 Summary
References
Part II Data Modeling and Analytics
4 Data Science Methodology
4.1 Analytics for Data Science
4.2 Examples of Data Analytics
4.3 Data Analytics Life Cycle
4.3.1 Data Discovery
4.3.2 Data Preparation
4.3.3 Model Planning
4.3.4 Model Building
4.3.5 Communicate Results
4.3.6 Operationalization
4.4 Summary
References
5 Data Science Methods and Machine Learning
5.1 Regression Analysis
5.1.1 Linear Regression
5.1.2 Logistic Regression
5.1.3 Multinomial Logistic Regression
5.1.4 Time-Series Models
5.2 Machine Learning
5.2.1 Decision Trees
5.2.2 Naive Bayes
5.2.3 Support Vector Machines
5.2.4 Nearest Neighbor learning
5.2.5 Clustering
5.2.6 Confusion Matrix
5.3 Summary
References
6 Data Analytics and Text Mining
6.1 Text Mining
6.1.1 Major Text Mining Areas
6.1.1.1 Information Retrieval
6.1.1.2 Data Mining
6.1.1.3 Natural Language Processing (NLP)
6.2 Text Analytics
6.2.1 Text Analysis Subtasks
6.2.1.1 Cleaning and Parsing
6.2.1.2 Searching and Retrieval
6.2.1.3 Text Mining
6.2.1.4 Part-of-Speech Tagging
6.2.1.5 Stemming
6.2.1.6 Lemmatization
6.2.2 Basic Text Analysis Steps
6.3 Introduction to Natural Language Processing
6.3.1 Major Components of NLP
6.3.2 Stages of NLP
6.3.3 Statistical Processing of Natural Language
6.3.3.1 Document Preprocessing
6.3.3.2 Parameterization
6.3.4 Applications of NLP
6.4 Summary
References
Part III Platforms for Data Science
7 Data Science Tool: Python
7.1 Basics of Python for Data Science
7.2 Python Libraries: DataFrame Manipulation with pandas and NumPy
7.3 Exploration Data Analysis with Python
7.4 Time Series Data
7.5 Clustering with Python
7.6 ARCH and GARCH
7.7 Dimensionality Reduction
7.8 Python for Machine ML
7.9 KNN/Decision Tree/Random Forest/SVM
7.10 Python IDEs for Data Science
7.11 Summary
References
8 Data Science Tool: R
8.1 Reading and Getting Data into R
8.1.1 Reading Data into R
8.1.2 Writing Data into Files
8.1.3 Scan() Function
8.1.4 Built-in Data Sets
8.2 Ordered and Unordered Factors
8.3 Arrays and Matrices
8.3.1 Arrays
8.3.1.1 Creating an Array
8.3.1.2 Accessing Elements in an Array
8.3.1.3 Array Manipulation
8.3.2 Matrices
8.3.2.1 Creating a Matrix
8.3.2.2 Matrix Transpose
8.3.2.3 Eigenvalues and Eigenvectors
8.3.2.4 Matrix Concatenation
8.4 Lists and Data Frames
8.4.1 Lists
8.4.1.1 Creating a List
8.4.1.2 Concatenation of Lists
8.4.2 Data Frames
8.4.2.1 Creating a Data Frame
8.4.2.2 Accessing the Data Frame
8.4.2.3 Adding Rows and Columns
8.5 Probability Distributions
8.5.1 Normal Distribution
8.6 Statistical Models in R
8.6.1 Model Fitting
8.6.2 Marginal Effects
8.7 Manipulating Objects
8.7.1 Viewing Objects
8.7.2 Modifying Objects
8.7.3 Appending Elements
8.7.4 Deleting Objects
8.8 Data Distribution
8.8.1 Visualizing Distributions
8.8.2 Statistics in Distributions
8.9 Summary
References
9 Data Science Tool: MATLAB
9.1 Data Science Workflow with MATLAB
9.2 Importing Data
9.2.1 How Data is Stored
9.2.2 How MATLAB Represents Data
9.2.3 MATLAB Data Types
9.2.4 Automating the Import Process
9.3 Visualizing and Filtering Data
9.3.1 Plotting Data Contained in Tables
9.3.2 Selecting Data from Tables
9.3.3 Accessing and Creating Table Variables
9.4 Performing Calculations
9.4.1 Basic Mathematical Operations
9.4.2 Using Vectors
9.4.3 Using Functions
9.4.4 Calculating Summary Statistics
9.4.5 Correlations between Variables
9.4.6 Accessing Subsets of Data
9.4.7 Performing Calculations by Category
9.5 Summary
References
10 GNU Octave as a Data Science Tool
10.1 Vectors and Matrices
10.2 Arithmetic Operations
10.3 Set Operations
10.4 Plotting Data
10.5 Summary
References
11 Data Visualization Using Tableau
11.1 Introduction to Data Visualization
11.2 Introduction to Tableau
11.3 Dimensions and Measures, Descriptive Statistics
11.4 Basic Charts
11.5 Dashboard Design & Principles
11.6 Special Chart Types
11.7 Integrate Tableau with Google Sheets
11.8 Summary
References
Index
Sanjeev J. Wagh, Manisha S. Bhende, Anuradha D. Thakare