Data Science for Public Policy 2021 ed. [Hardcover]

3.00/5 (2 ratings from Goodreads)

Jeffrey C. Chen, Edward A. Rubin, Gary J. Cornwall

Format: Hardback, XIV + 363 pages, height x width: 279x210 mm, weight: 1289 g, 123 illustrations (111 in color, 12 in black and white)
Series: Springer Series in the Data Sciences
Publication date: 01-Sep-2021
Publisher: Springer Nature Switzerland AG
ISBN-10: 3030713512
ISBN-13: 9783030713515

Other books on the subject:

Numerical analysis
Mathematical & statistical software
Data analysis: general
Public administration

Hardcover
Price: 62.59 €*
* the price is final, i.e. no further discounts apply
List price: 73.64 €
You save 15%
Delivery from the publisher takes approximately 2-4 weeks
Free delivery

Permalink: https://www.kriso.ee/db/9783030713515.html

This textbook presents the essential tools and core concepts of data science to public officials, policy analysts, and economists among others in order to further their application in the public sector. An expansion of the quantitative economics frameworks presented in policy and business schools, this book emphasizes the process of asking relevant questions to inform public policy. Its techniques and approaches emphasize data-driven practices, beginning with the basic programming paradigms that occupy the majority of an analyst’s time and advancing to the practical applications of statistical learning and machine learning. The text considers two divergent, competing perspectives to support its applications, incorporating techniques from both causal inference and prediction. Additionally, the book includes open-sourced data as well as live code, written in R and presented in notebook form, which readers can use and modify to practice working with data.

Preface

1 An Introduction
1.1 Why we wrote this book
1.2 What we assume
1.3 How this book is structured

2 The Case for Programming
2.1 Doing visual analytics since the 1780s
2.2 How does programming work?
2.3 Setting up R and RStudio
2.3.1 Installing R
2.3.2 Installing RStudio
2.3.3 DIY: Running your first code snippet
2.4 Making the case for open-source software

3 Elements of Programming
3.1 Data are everywhere
3.2 Data types
3.2.1 numeric
3.2.2 character
3.2.3 logical
3.2.4 factor
3.2.5 date
3.2.6 The class function
3.3 Objects in R
3.4 R's object classes
3.4.1 vector
3.4.2 matrix
3.4.3 data.frame
3.4.4 list
3.4.5 The class function, v2
3.4.6 More classes
3.5 Packages
3.5.1 Base R and the need to extend functionality
3.5.2 Installing packages
3.5.3 Loading packages
3.5.4 Package management and pacman
3.6 Data input/output
3.6.1 Directories
3.6.2 Load functions
3.6.3 Datasets
3.7 Finding help
3.7.1 Help function
3.7.2 Google and online communities
3.8 Beyond this chapter
3.8.1 Best practices
3.8.2 Further study
3.9 DIY: Loading solar energy data from the web

4 Transforming Data
4.1 Importing and assembling data
4.1.1 Loading files
4.2 Manipulating values
4.2.1 Text manipulation functions
4.2.2 Regular Expressions (RegEx)
4.2.3 DIY: Working with PII
4.2.4 Working with dates
4.3 The structure of data
4.3.1 Matrix or data frame?
4.3.2 Array indexes
4.3.3 Subsetting
4.3.4 Sorting and re-ordering
4.3.5 Aggregating data
4.3.6 Reshaping data
4.4 Control structures
4.4.1 If statement
4.4.2 For-loops
4.4.3 While
4.5 Functions
4.6 Beyond this chapter
4.6.1 Best practices
4.6.2 Further study

5 Record Linkage
5.1 Edward Kennedy, Bill de Blasio, and Bayerische Motoren Werke
5.2 How does record linkage work?
5.3 Pre-processing the data
5.4 De-duplication
5.5 Deterministic record linkage
5.6 Comparison functions
5.6.1 Edit distances
5.6.2 Phonetic algorithms
5.6.3 New tricks, same heuristics
5.7 Probabilistic record linkage
5.8 Data privacy
5.9 DIY: Matching people in the UK-UN sanction lists
5.10 Beyond this chapter
5.10.1 Best practices
5.10.2 Further study

6 Exploratory Data Analysis
6.1 Visually detecting patterns
6.2 The gist of EDA
6.3 Visualizing distributions
6.3.1 Skewed variables
6.4 Exploring missing values
6.4.1 Encodings
6.4.2 Missing value functions
6.4.3 Exploring missingness
6.4.4 Treating missingness
6.5 Analyzing time series
6.6 Finding visual correlations
6.6.1 Visual analysis on high-dimensional datasets
6.7 Beyond this chapter

7 Regression Analysis
7.1 Measuring and predicting the preferences of society
7.2 Simple linear regression
7.2.1 Mean squared error
7.2.2 Ordinary least squares
7.2.3 DIY: A simple hedonic model
7.3 Checking for linearity
7.4 Multiple regression
7.4.1 Non-linearities
7.4.2 Discrete variables
7.4.3 Discontinuities
7.4.4 Measures of model fitness
7.4.5 DIY: Choosing between models
7.4.6 DIY: Housing prices over time
7.5 Beyond this chapter

8 Framing Classification
8.1 Playing with fire
8.1.1 FireCast
8.1.2 What's a classifier?
8.2 The basics of classifiers
8.2.1 The anatomy of a classifier
8.2.2 Finding signal in classification contexts
8.2.3 Measuring accuracy
8.3 Logistic regression
8.3.1 The social science workhorse
8.3.2 Telling the story from coefficients
8.3.3 How are coefficients learned?
8.3.4 In practice
8.3.5 DIY: Expanding health care coverage
8.4 Regularized regression
8.4.1 From regularization to interpretation
8.4.2 DIY: Re-visiting health care coverage
8.5 Beyond this chapter

9 Three Quantitative Perspectives
9.1 Descriptive analysis
9.2 Causal inference
9.2.1 Potential outcomes framework
9.2.2 Regression discontinuity
9.2.3 Difference-in-differences
9.3 Prediction
9.3.1 Understanding accuracy
9.3.2 Model validation
9.4 Beyond this chapter

10 Prediction
10.1 The role of algorithms
10.2 Data science pipelines
10.3 K-Nearest Neighbors (k-NN)
10.3.1 Under the hood
10.3.2 DIY: Predicting the extent of storm damage
10.4 Tree-based learning
10.4.1 Classification and Regression Trees (CART)
10.4.2 Random forests
10.4.3 In practice
10.4.4 DIY: Wage prediction with CART and random forests
10.5 An introduction to other algorithms
10.5.1 Gradient boosting
10.5.2 Neural networks
10.6 Beyond this chapter

11 Cluster Analysis
11.1 Things closer together are more related
11.2 Foundational concepts
11.3 k-means
11.3.1 Under the hood
11.3.2 In Practice
11.3.3 DIY: Clustering for economic development
11.4 Hierarchical clustering
11.4.1 Under the hood
11.4.2 In Practice
11.4.3 DIY: Clustering time series
11.5 Beyond this chapter

12 Spatial Data
12.1 Anticipating climate impacts
12.2 Classes of spatial data
12.3 Rasters
12.3.1 Raster files
12.3.2 Rasters and math
12.3.3 DIY: Working with raster math
12.4 Vectors
12.4.1 Vector files
12.4.2 Converting points to spatial objects
12.4.3 Coordinate Reference Systems
12.4.4 DIY: Converting coordinates into point vectors
12.4.5 Reading shapefiles
12.4.6 Spatial joins
12.4.7 DIY: Analyzing spatial relationships
12.5 Beyond this chapter

13 Natural Language
13.1 Transforming text into data
13.1.1 Processing textual data
13.1.2 TF-IDF
13.1.3 Document similarities
13.1.4 DIY: Basic text processing
13.2 Sentiment Analysis
13.2.1 Sentiment lexicons
13.2.2 Calculating sentiment scores
13.2.3 DIY: Scoring text for sentiment
13.3 Topic modeling
13.3.1 A conceptual base
13.3.2 How do topic models work?
13.3.3 DIY: Finding topics in presidential speeches
13.4 Beyond this chapter
13.4.1 Best practices
13.4.2 Further study

14 The Ethics of Data Science
14.1 An emerging debate
14.2 Bias
14.2.1 Sampling bias
14.2.2 Measurement bias
14.2.3 Prejudicial bias
14.3 Fairness
14.3.1 Score-based fairness
14.3.2 Accuracy-based fairness
14.3.3 Other considerations
14.4 Transparency and Interpretability
14.4.1 Interpretability
14.4.2 Explainability
14.5 Privacy
14.5.1 An evolving landscape
14.5.2 Privacy strategies
14.6 Beyond this chapter

15 Developing Data Products
15.1 Meeting people where they are
15.2 Designing for impact
15.2.1 Identify a user need
15.2.2 Size up the situation
15.2.3 Build a lean "V1"
15.2.4 Test and evaluate its impact, then iterate
15.3 Communicating data science projects
15.3.1 Presentations
15.3.2 Written reports
15.4 Reporting dashboards
15.5 Prediction products
15.5.1 Prioritization and targeting lists
15.5.2 Scoring engines
15.6 Continuing to hone your craft
15.7 Where to next?

16 Building Data Teams
16.1 Establishing a baseline
16.2 Operating models
16.2.1 Center of excellence
16.2.2 Hack teams
16.2.3 Consultancy
16.2.4 Matrix organizations
16.3 Identifying roles
16.3.1 The manager
16.3.2 Analytics roles
16.3.3 Data product roles
16.3.4 Titles in the civil service system
16.4 The hiring process
16.4.1 Job postings and application review
16.4.2 Interviews
16.5 Final thoughts

Appendix A: Planning a Data Product
Key Questions

Appendix B: Interview Questions
Getting to know the candidate
Business acumen
Project experience
Whiteboard questions
Statistics
Causal inference
Estimation versus prediction
Machine learning
Model evaluation
Communication and visualization
Programming
Take-home questions

References
Index

Jeffrey C. Chen: Affiliated Researcher, Bennett Institute for Public Policy, University of Cambridge
Edward A. Rubin: Assistant Professor, University of Oregon (Dept. of Economics)
Gary J. Cornwall: Research Economist, U.S. Bureau of Economic Analysis
