E-book: Julia for Data Analysis

3.75/5 (8 ratings from Goodreads)

Bogumił Kamiński

Format: 472 pages
Publication date: 14-Feb-2023
Publisher: Manning Publications
Language: eng
ISBN-13: 9781638351788

Other books on the subject:

Functional programming

Format: EPUB+DRM
Price: 51,64 €*
* The price is final, i.e. no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage: Digital rights management (DRM)

The publisher has released this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorised with the same Adobe ID).

Required software

To read on mobile devices (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is most likely already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

Master core data analysis skills using Julia. In Julia for Data Analysis, hands-on projects guide you through time series data, predictive models, popularity ranking, and more.

With this book, you will learn how to:

Read and write data in various formats
Work with tabular data, including subsetting, grouping, and transforming
Visualise your data using plots
Perform statistical analysis
Build predictive models
Create complex data processing pipelines

Julia was designed for the unique needs of data scientists: it's expressive and easy to use while also delivering fast code execution.

Julia for Data Analysis teaches you how to perform core data science tasks with this language. It is written by Bogumił Kamiński, a top contributor to Julia, #1 Julia answerer on Stack Overflow, and a lead developer of Julia's core data package DataFrames.jl.

You will learn how to write production-quality code in Julia, and utilize Julia's core features for data gathering, visualisation, and working with data frames. Plus, the engaging hands-on projects get you into the action quickly.

About the technology

Julia is a huge step forward for data science and scientific computing. It is a powerful high-performance programming language with many developer-friendly features like garbage collection, dynamic typing, just-in-time compilation, and a flexible approach to concurrent, parallel, and distributed computing. Although Julia's strong numerical programming features make it a favorite of data scientists, it is also an awesome general-purpose programming language.

About the reader

For data scientists familiar with Python or R. No experience with Julia required.

Reviews

"A brilliant guide to data analysis with Julia." Kevin Cheung, Carleton University

"One of the best structured and well-written presentations of language fundamentals and analysis concepts that I've encountered. I highly recommend this book." Maureen Metzger, Kumanu

"A solid, hands-on, and really enjoyable introduction to Julia." Sonja Krause-Harder, Elastic

Foreword
Preface
Acknowledgments
About this book
About the author
About the cover illustration

1 Introduction
1.1 What is Julia and why is it useful?
1.2 Key features of Julia from a data scientist's perspective
    Julia is fast because it is a compiled language
    Julia provides full support for interactive workflows
    Julia programs are highly reusable and easy to compose together
    Julia has a built-in state-of-the-art package manager
    It is easy to integrate existing code with Julia
1.3 Usage scenarios of tools presented in the book
1.4 Julia's drawbacks
1.5 What data analysis skills will you learn?
1.6 How can Julia be used for data analysis?

PART 1 Essential Julia skills

2 Getting started with Julia
2.1 Representing values
2.2 Defining variables
2.3 Using the most important control-flow constructs
    Computations depending on a Boolean condition
    Loops
    Compound expressions
    A first approach to calculating the winsorized mean
2.4 Defining functions
    Defining functions using the function keyword
    Positional and keyword arguments of functions
    Rules for passing arguments to functions
    Short syntax for defining simple functions
    Anonymous functions
    Do blocks
    Function-naming convention in Julia
    A simplified definition of a function computing the winsorized mean
2.5 Understanding variable scoping rules

3 Julia's support for scaling projects
3.1 Understanding Julia's type system
    A single function in Julia may have multiple methods
    Types in Julia are arranged in a hierarchy
    Finding all supertypes of a type
    Finding all subtypes of a type
    Union of types
    Deciding what type restrictions to put in method signature
3.2 Using multiple dispatch in Julia
    Rules for defining methods of a function
    Method ambiguity problem
    Improved implementation of winsorized mean
3.3 Working with packages and modules
    What is a module in Julia?
    How can packages be used in Julia?
    Using StatsBase.jl to compute the winsorized mean
3.4 Using macros

4 Working with collections in Julia
4.1 Working with arrays
    Getting the data into a matrix
    Computing basic statistics of the data stored in a matrix
    Indexing into arrays
    Performance considerations of copying vs. making a view
    Calculating correlations between variables
    Fitting a linear regression
    Plotting the Anscombe's quartet data
4.2 Mapping key-value pairs with dictionaries
4.3 Structuring your data by using named tuples
    Defining named tuples and accessing their contents
    Analyzing Anscombe's quartet data stored in a named tuple
    Understanding composite types and mutability of values in Julia

5 Advanced topics on handling collections
5.1 Vectorizing your code using broadcasting
    Understanding syntax and meaning of broadcasting in Julia
    Expanding length-1 dimensions in broadcasting
    Protecting collections from being broadcasted over
    Analyzing Anscombe's quartet data using broadcasting
5.2 Defining methods with parametric types
    Most collection types in Julia are parametric
    Rules for subtyping of parametric types
    Using subtyping rules to define the covariance function
5.3 Integrating with Python
    Preparing data for dimensionality reduction using t-SNE
    Calling Python from Julia
    Visualizing the results of the t-SNE algorithm

6 Working with strings
6.1 Getting and inspecting the data
    Downloading files from the web
    Using common techniques of string construction
    Reading the contents of a file
6.2 Splitting strings
6.3 Using regular expressions to work with strings
    Working with regular expressions
    Writing a parser of a single line of movies.dat file
6.4 Extracting a subset from a string with indexing
    UTF-8 encoding of strings in Julia
    Character vs. byte indexing of strings
    ASCII strings
    The Char type
6.5 Analyzing genre frequency in movies.dat
    Finding common movie genres
    Understanding genre popularity evolution over the years
6.6 Introducing symbols
    Creating symbols
    Using symbols
6.7 Using fixed-width string types to improve performance
    Available fixed-width strings
    Performance of fixed-width strings
6.8 Compressing vectors of strings with PooledArrays.jl
    Creating a file containing flower names
    Reading in the data to a vector and compressing it
    Understanding the internal design of PooledArray
6.9 Choosing appropriate storage for collections of strings

7 Handling time-series data and missing values
7.1 Understanding the NBP Web API
    Getting the data via a web browser
    Getting the data by using Julia
    Handling cases when an NBP Web API query fails
7.2 Working with missing data in Julia
    Definition of the missing value
    Working with missing values
7.3 Getting time-series data from the NBP Web API
    Working with dates
    Fetching data from the NBP Web API for a range of dates
7.4 Analyzing data fetched from the NBP Web API
    Computing summary statistics
    Finding which days of the week have the most missing values
    Plotting the PLN/USD exchange rate

PART 2 Toolbox for data analysis

8 First steps with data frames
8.1 Fetching, unpacking, and inspecting the data
    Downloading the file from the web
    Working with bzip2 archives
    Inspecting the CSV file
8.2 Loading the data to a data frame
    Reading a CSV file into a data frame
    Inspecting the contents of a data frame
    Saving a data frame to a CSV file
8.3 Getting a column out of a data frame
    Understanding the data frame's storage model
    Treating a data frame column as a property
    Getting a column by using data frame indexing
    Visualizing data stored in columns of a data frame
8.4 Reading and writing data frames using different formats
    Apache Arrow
    SQLite

9 Getting data from a data frame
9.1 Advanced data frame indexing
    Getting a reduced puzzles data frame
    Overview of allowed column selectors
    Overview of allowed row-subsetting values
    Making views of data frame objects
9.2 Analyzing the relationship between puzzle difficulty and popularity
    Calculating mean puzzle popularity by its rating
    Fitting LOESS regression

10 Creating data frame objects
10.1 Reviewing the most important ways to create a data frame
    Creating a data frame from a matrix
    Creating a data frame from vectors
    Creating a data frame using a Tables.jl interface
    Plotting a correlation matrix of data stored in a data frame
10.2 Creating data frames incrementally
    Vertically concatenating data frames
    Appending a table to a data frame
    Adding a new row to an existing data frame
    Storing simulation results in a data frame

11 Converting and grouping data frames
11.1 Converting a data frame to other value types
    Conversion to a matrix
    Conversion to a named tuple of vectors
    Other common conversions
11.2 Grouping data frame objects
    Preparing the source data frame
    Grouping a data frame
    Getting group keys of a grouped data frame
    Indexing a grouped data frame with a single value
    Comparing performance of indexing methods
    Indexing a grouped data frame with multiple values
    Iterating a grouped data frame

12 Mutating and transforming data frames
12.1 Getting and loading the GitHub developers data set
    Understanding graphs
    Fetching GitHub developer data from the web
    Implementing a function that extracts data from a ZIP file
    Reading the GitHub developer data into a data frame
12.2 Computing additional node features
    Creating a SimpleGraph object
    Computing features of nodes by using the Graphs.jl package
    Counting a node's web and machine learning neighbors
12.3 Using the split-apply-combine approach to predict the developer's type
    Computing summary statistics of web and machine learning developer features
    Visualizing the relationship between the number of web and machine learning neighbors of a node
    Fitting a logistic regression model predicting developer type
12.4 Reviewing data frame mutation operations
    Performing low-level API operations
    Using the insertcols! function to mutate a data frame

13 Advanced transformations of data frames
13.1 Getting and preprocessing the police stop data set
    Loading all required packages
    Introducing the @chain macro
    Getting the police stop data set
    Comparing functions that perform operations on columns
    Using short forms of operation specification syntax
13.2 Investigating the violation column
    Finding the most frequent violations
    Vectorizing functions by using the ByRow wrapper
    Flattening data frames
    Using convenience syntax to get the number of rows of a data frame
    Sorting data frames
    Using advanced functionalities of DataFramesMeta.jl
13.3 Preparing data for making predictions
    Performing initial transformation of the data
    Working with categorical data
    Joining data frames
    Reshaping data frames
    Dropping rows of a data frame that hold missing values
13.4 Building a predictive model of arrest probability
    Splitting the data into train and test data sets
    Fitting a logistic regression model
    Evaluating the quality of a model's predictions
13.5 Reviewing functionalities provided by DataFrames.jl

14 Creating web services for sharing data analysis results
14.1 Pricing financial options by using a Monte Carlo simulation
    Calculating the payoff of an Asian option definition
    Computing the value of an Asian option
    Understanding GBM
    Using a numerical approach to computing the Asian option value
14.2 Implementing the option pricing simulator
    Starting Julia with multiple-thread support
    Computing the option payoff for a single sample of stock prices
    Computing the option value
14.3 Creating a web service serving the Asian option valuation
    A general approach to building a web service
    Creating a web service using Genie.jl
    Running the web service
14.4 Using the Asian option pricing web service
    Sending a single request to the web service
    Collecting responses to multiple requests from a web service in a data frame
    Unnesting a column of a data frame
    Plotting the results of Asian option pricing

Appendix A First steps with Julia
Appendix B Solutions to exercises
Appendix C Julia packages for data science
Index

Bogumił Kamiński is one of the lead developers of DataFrames.jl, the core package for data manipulation in the Julia ecosystem. He has over 20 years of experience delivering data science projects for corporate customers. He has been teaching data science at the undergraduate and graduate levels for two decades.

Permalink: https://www.kriso.ee/db/97816383517886e.html
