E-book: Data Science with Julia

Paul D. McNicholas (McMaster University, Hamilton, ON), Peter Tait (McMaster University)
  • Length: 240 pages
  • Publication date: 02-Jan-2019
  • Publisher: CRC Press
  • Language: English
  • ISBN-13: 9781351013666
  • Format: PDF+DRM
  • Price: 70,19 €*
  • * The price is final, i.e. no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

"This book is a great way to both start learning data science through the promising Julia language and to become an efficient data scientist." - Professor Charles Bouveyron, INRIA Chair in Data Science, Université Côte d'Azur, Nice, France

Julia, an open-source programming language, was created to be as easy to use as languages such as R and Python while also being as fast as C and Fortran. An accessible, intuitive, and highly efficient base language, with speed that exceeds that of R and Python, makes Julia a formidable language for data science. Using well-known data science methods that will motivate the reader, Data Science with Julia will get readers up to speed on key features of the Julia language and illustrate its facilities for data science and machine learning work.

Features:

  • Covers the core components of Julia as well as packages relevant to the input, manipulation and representation of data.
  • Discusses several important topics in data science including supervised and unsupervised learning.
  • Reviews data visualization using the Gadfly package, which was designed to emulate the very popular ggplot2 package in R. Readers will learn how to make many common plots and how to visualize model results.
  • Presents how to optimize Julia code for performance.
  • Will be an ideal source for people who already know R and want to learn how to use Julia (though no previous knowledge of R or any other programming language is required).

The advantages of Julia for data science cannot be overstated. Besides speed and ease of use, there are already over 1,900 packages available, and Julia can interface (either directly or through packages) with libraries written in R, Python, Matlab, C, C++ or Fortran. The book is for senior undergraduates, beginning graduate students, or practicing data scientists who want to learn how to use Julia for data science.

Reviews

"The book is ideal for people who want to learn Julia through machine-learning examples and is especially relevant for R users: Chapter 7 is devoted to interacting with R from within Julia. The book contains a good balance of equations, code, algorithms written from scratch, and use of built-in machine-learning algorithms. Readers can directly use the code, which is available on GitHub, or dive deeper into how the methods work. A nice feature is the inclusion of probabilistic principal components analysis (PPCA) and mixtures of PPCA for unsupervised learning." ~The Royal Statistical Society

". . . the book is an excellent piece of work that makes a start with Julia very easy and that covers all essential aspects of the language. After making the first steps into the realm of Julia with the help of this book, the reader should be able afterwards to find the own path and to specialize into the more individual aspects of the language that no introductory textbook can cover. The same is true for the data science part. After reading the book, the reader will be able to perform the most common analyses alone and learn other, more specific methods from different sources afterwards." ~Daniel Fischer, International Statistical Review

Chapter 1 Introduction
1.1 Data Science
1.2 Big Data
1.3 Julia
1.4 Julia And R Packages
1.5 Datasets
1.5.1 Overview
1.5.2 Beer Data
1.5.3 Coffee Data
1.5.4 Leptograpsus Crabs Data
1.5.5 Food Preferences Data
1.5.6 X2 Data
1.5.7 Iris Data
1.6 Outline Of The Contents Of This Monograph
Chapter 2 Core Julia
2.1 Variable Names
2.2 Operators
2.3 Types
2.3.1 Numeric
2.3.2 Floats
2.3.3 Strings
2.3.4 Tuples
2.4 Data Structures
2.4.1 Arrays
2.4.2 Dictionaries
2.5 Control Flow
2.5.1 Compound Expressions
2.5.2 Conditional Evaluation
2.5.3 Loops
2.5.3.1 Basics
2.5.3.2 Loop Termination
2.5.3.3 Exception Handling
2.6 Functions
Chapter 3 Working With Data
3.1 Dataframes
3.2 Categorical Data
3.3 Input/Output
3.4 Useful Dataframe Functions
3.5 Split-Apply-Combine Strategy
3.6 Query.jl
Chapter 4 Visualizing Data
4.1 Gadfly.jl
4.2 Visualizing Univariate Data
4.3 Distributions
4.4 Visualizing Bivariate Data
4.5 Error Bars
4.6 Facets
4.7 Saving Plots
Chapter 5 Supervised Learning
5.1 Introduction
5.2 Cross-Validation
5.2.1 Overview
5.2.2 K-Fold Cross-Validation
5.3 Nearest Neighbours Classification
5.4 Classification And Regression Trees
5.4.1 Overview
5.4.2 Classification Trees
5.4.3 Regression Trees
5.4.4 Comments
5.5 Bootstrap
5.6 Random Forests
5.7 Gradient Boosting
5.7.1 Overview
5.7.2 Beer Data
5.7.3 Food Data
5.8 Comments
Chapter 6 Unsupervised Learning
6.1 Introduction
6.2 Principal Components Analysis
6.3 Probabilistic Principal Components Analysis
6.4 EM Algorithm For PPCA
6.4.1 Background: EM Algorithm
6.4.2 E-Step
6.4.3 M-Step
6.4.4 Woodbury Identity
6.4.5 Initialization
6.4.6 Stopping Rule
6.4.7 Implementing The EM Algorithm For PPCA
6.4.8 Comments
6.5 K-Means Clustering
6.6 Mixture Of Probabilistic Principal Components Analyzers
6.6.1 Model
6.6.2 Parameter Estimation
6.6.3 Illustrative Example: Coffee Data
6.7 Comments
Chapter 7 R Interoperability
7.1 Accessing R Datasets
7.2 Interacting With R
7.3 Example: Clustering And Data Reduction For The Coffee Data
7.3.1 Coffee Data
7.3.2 PGMM Analysis
7.3.3 VSCC Analysis
7.4 Example: Food Data
7.4.1 Overview
7.4.2 Random Forests
Appendix A Julia And R Packages Used Herein
Appendix B Variables For Food Data
Appendix C Useful Mathematical Results
C.1 Brief Overview Of Eigenvalues
C.2 Selected Linear Algebra Results
C.3 Matrix Calculus Results
Appendix D Performance Tips
D.1 Floating Point Numbers
D.1.1 Do Not Test For Equality
D.1.2 Use Logarithms For Division
D.1.3 Subtracting Two Nearly Equal Numbers
D.2 Julia Performance
D.2.1 General Tips
D.2.2 Array Processing
D.2.3 Separate Core Computations
Appendix E Linear Algebra Functions
E.1 Vector Operations
E.2 Matrix Operations

Paul D. McNicholas is the Canada Research Chair in Computational Statistics at McMaster University, where he is a Professor in the Department of Mathematics and Statistics.

Peter Tait is a Ph.D. student at the Department of Mathematics and Statistics at McMaster University. Prior to returning to academia, he worked as a data scientist in the software industry, where he gained extensive practical experience.