E-book: R for Conservation and Development Projects: A Primer for Practitioners [Taylor & Francis e-book]

Nathan Whitmore (Independent Environmental Consultant, Dunedin, New Zealand)
  • Format: 390 pages, 7 Tables, black and white; 123 Line drawings, black and white; 123 Illustrations, black and white
  • Series: Chapman & Hall/CRC The R Series
  • Publication date: 22-Dec-2020
  • Publisher: Chapman & Hall/CRC
  • ISBN-13: 9780429262180
  • Taylor & Francis e-book
  • Price: 240,04 €*
  • * price that provides access for an unlimited number of simultaneous users for an unlimited time
  • Regular price: 342,91 €
  • You save 30%

This book is aimed at conservation and development practitioners who need to learn and use R in a part-time professional context. It gives people with a non-technical background a set of skills to graph, map, and model in R. It also provides background on data integration in project management and covers fundamental statistical concepts. The book aims to demystify R and give practitioners the confidence to use it.

Key Features:

• Viewing data science as part of a greater knowledge and decision making system
• Foundation sections on inference, evidence, and data integration
• Plain English explanations of R functions
• Relatable examples which are typical of activities undertaken by conservation and development organisations in the developing world
• Worked examples showing how data analysis can be incorporated into project reports

Preface
1 Introduction
1.1 What is R?
1.2 Why R?
1.3 Why this book?
1.4 What are development and conservation?
1.5 Science and decision making
1.6 Why data science is important
1.6.1 Monitoring and evaluation
1.6.2 Projects versus programmes
1.6.3 Project delivery versus research projects
1.7 The goal of this book
1.8 How this book is organised
1.9 How code is organised in this book
I Basics
2 Inference and Evidence
2.1 Inference
2.2 Study design
2.3 Evidence
2.4 What makes good data?
2.5 Recommended resources
2.6 Summary
3 Data integration in project management
3.1 Adaptive management cycles
3.2 The Deming cycle
3.2.1 Plan
3.2.1.1 Development of a project strategy and proposal
3.2.1.2 Proposal submission process
3.2.1.3 What is a logframe?
3.2.1.4 Logframe terminology
3.2.1.5 Pre-implementation planning
3.2.2 Train
3.2.3 Do
3.2.4 Check
3.2.5 Act
3.3 Challenges
3.4 Recommended resources
3.5 Summary
4 Getting started in R
4.1 Installing R
4.2 Installing RStudio
4.3 The R interface
4.3.1 The console
4.3.2 Version information
4.3.3 Writing code in the console
4.3.4 Script editors
4.3.5 Using the default script editor
4.3.6 Using RStudio
4.4 R, as a calculator
4.5 How R works
4.5.1 Objects
4.5.2 Functions
4.5.2.1 Getting help on functions
4.5.3 Packages
4.5.3.1 Getting help on packages
4.6 Writing meaningful code
4.7 Reproducibility
4.8 Recommended resources
4.9 Summary
5 Introduction to data frames
5.1 Making data frames
5.2 Importing a data frame
5.3 Saving a data frame
5.4 Investigating a data frame
5.5 Other functions to examine an R object
5.6 Subsetting using the '[' and ']' operators
5.7 Descriptive statistics
5.8 Viewing data frames
5.9 Making a reproducible example
5.9.1 Reproducible example steps
5.10 Recommended resources
5.11 Summary
6 The Waihi project
6.1 The scenario
6.1.1 Why evidence is important
6.2 The data
6.2.1 Description of condev data sets
6.3 Recommended resources
6.4 Summary
II First steps
7 ggplot2: graphing with the tidyverse
7.1 Why graph?
7.2 The tidyverse package
7.3 The data
7.4 Graphing in R
7.4.1 Making a ggplot
7.4.2 Scatter plots
7.4.3 Bar plots
7.4.4 Histograms
7.4.5 Box plots
7.4.6 Polygons
7.4.7 Other common geoms
7.5 How to save a ggplot
7.6 Recommended resources
7.7 Summary
8 Customising a ggplot
8.1 Why customise a ggplot?
8.2 The packages
8.3 The data
8.4 Families of layers
8.5 Aesthetics properties
8.5.1 Settings aesthetics
8.5.2 A quick note about colour
8.5.3 Using aesthetics to distinguish groups
8.5.4 Using faceting to distinguish groups
8.6 Improving crowded graphs
8.7 Overlaying
8.8 Labels
8.9 Using the theme() function
8.9.1 The 4 elements
8.9.2 Rotating axis text
8.9.3 Spacing between axis and graph
8.9.4 In-built themes
8.10 Controlling axes
8.10.1 Tick marks
8.10.2 Axis limits
8.10.3 Forcing a common origin
8.10.4 Flipping axes
8.10.5 Forcing a plot to be square
8.10.6 Log scales and large numbers
8.11 Controlling legends
8.12 Recommended resources
8.13 Summary
9 Data wrangling
9.1 What is data wrangling?
9.2 The packages
9.3 The data
9.4 Pipes
9.5 Tibbles versus data frame
9.6 Subsetting
9.6.1 select()
9.6.2 filter()
9.7 Transforming
9.7.1 group_by()
9.7.2 summarise()
9.7.3 mutate()
9.7.4 adorn_totals()
9.8 Tidying
9.8.1 pivot_wider()
9.8.2 pivot_longer()
9.9 Ordering
9.9.1 arrange()
9.9.2 top_n()
9.10 Joining
9.11 Recommended resources
9.12 Summary
10 Data cleaning
10.1 Cleaning is more than correcting mistakes
10.2 The packages
10.3 The data
10.4 Changing names
10.4.1 clean_names()
10.4.2 rename()
10.4.3 fct_recode()
10.4.4 str_replace_all()
10.5 Fixing missing values
10.5.1 fct_explicit_na()
10.5.2 replace_na()
10.5.3 replace()
10.5.4 drop_na()
10.5.5 Cleaning a whole data set
10.6 Adding and dropping factor levels
10.6.1 fct_drop()
10.6.2 fct_expand()
10.6.3 Keeping empty levels in ggplot
10.7 Fusing duplicate columns
10.7.1 coalesce()
10.8 Organising factor levels
10.8.1 fct_relevel()
10.8.2 fct_reorder()
10.8.3 fct_rev()
10.9 Anonymisation and pseudonymisation
10.9.1 fct_anon()
10.10 Recommended resources
10.11 Summary
11 Working with dates and time
11.1 The two questions
11.2 The packages
11.3 The data
11.4 Formatting dates
11.4.1 Formatting dates with lubridate
11.4.2 Formatting dates with base R
11.4.3 Numerical dates
11.5 Extracting dates
11.6 Time intervals
11.7 Time zones
11.7.1 The importance of time zones
11.7.2 Same times in different time zones
11.8 Replacing missing date components
11.9 Graphing: a worked example
11.9.1 Reordering a variable by a date
11.9.2 Summarising date-based data
11.9.3 Date labels with scale_x_date()
11.10 Recommended resources
11.11 Summary
12 Working with spatial data
12.1 The importance of maps
12.2 The packages
12.3 The data
12.4 What is spatial data?
12.5 Introduction to the sf package
12.5.1 Reading data: st_read()
12.5.2 Converting data: st_as_sf()
12.5.3 Polygon area: st_area()
12.5.4 Plotting maps: geom_sf()
12.5.5 Extracting coordinates st_coordinates()
12.6 Plotting a world map
12.6.1 Filtering with filter()
12.7 Coordinate reference systems
12.7.1 Finding the CRS of an object with st_crs()
12.7.2 Transform the CRS with st_transform()
12.7.3 Cropping with coord_sf()
12.8 Adding reference information
12.8.1 Adding a scale bar and north arrow
12.8.2 Positioning names with centroids
12.8.3 Adding names with geom_text()
12.9 Making a chloropleth
12.10 Random sampling
12.11 Saving with st_write()
12.12 Rasters with the raster package
12.12.1 Loading rasters
12.12.2 Raster data
12.12.3 Plotting rasters
12.12.4 Basic raster calculations
12.12.5 Sampling
12.12.6 Extracting raster data from points
12.12.7 Turning data frames into rasters
12.12.8 Calculating distances
12.12.9 Masking
12.12.10 Cropping
12.12.11 Saving
12.12.12 Changing to a data frame
12.13 Recommended resources
12.14 Summary
13 Common R code mistakes and quirks
13.1 Making mistakes
13.2 The packages
13.3 The data
13.4 Capitalisation mistakes
13.5 Forgetting brackets
13.6 Forgetting quotation marks
13.7 Forgetting commas
13.8 Forgetting '+' in a ggplot
13.9 Forgetting to call a ggplot object
13.10 Piping but not making an object
13.11 Changing a factor to a number
13.12 Strings automatically read as factors
13.13 Summary
III Modelling
14 Basic statistical concepts
14.1 Variables and statistics
14.2 The packages
14.3 The data
14.4 Describing things which are variable
14.4.1 Central tendency
14.4.1.1 Mean
14.4.1.2 Median
14.4.2 Describing variability
14.4.2.1 Range
14.4.2.2 Standard deviation
14.4.2.3 Percentile range
14.4.3 Reporting central tendency and variability
14.4.4 Precision
14.5 Introducing probability
14.6 Probability distributions
14.6.1 Binomial distribution
14.6.1.1 Bernoulli distribution
14.6.2 Poisson distribution
14.6.3 Normal distribution
14.7 Random sampling
14.7.1 Simple random sampling
14.7.2 Stratified random sampling
14.8 Modelling approaches
14.8.1 Null hypothesis testing
14.8.2 Information-theoretics
14.8.3 Bayesian approaches
14.8.4 Machine learning
14.9 Under-fitting and over-fitting
14.10 Recommended resources
14.11 Summary
15 Understanding linear models
15.1 Regression versus classification
15.2 The packages
15.3 The data
15.4 Graphing a y variable
15.5 What is a linear model?
15.5.1 How to draw a linear model from an equation
15.6 Predicting the response variable
15.7 Formulating hypotheses
15.8 Goodness-of-fit
15.8.1 Residuals
15.8.2 Correlation
15.9 Making a linear model in R
15.10 Introduction to model selection
15.10.1 Estimating the number of parameters: K
15.10.2 Goodness of fit: L
15.11 Doing model selection in R
15.11.1 Interpreting an AIC Table
15.11.1.1 Evidence ratios
15.11.1.2 Keep in mind
15.12 Understanding coefficients
15.13 Model equations and prediction
15.13.1 Dummy variables and a design matrix
15.13.2 Plotting a prediction with geom_abline()
15.13.3 Automatic prediction
15.14 Understanding a model summary
15.15 Standard errors and confidence intervals
15.15.1 Confidence intervals for model predictions
15.16 Model diagnostics
15.16.1 Still problems?
15.17 Log transformations
15.17.1 What are logarithms?
15.17.2 Logarithms and zero
15.18 Simulation
15.18.1 Making a for() loop
15.18.2 Example simulation
15.19 Reporting modelling results
15.20 Summary
16 Extensions to linear models
16.1 Building upon linear models
16.2 The packages
16.3 The data
16.4 Multiple regression
16.4.1 Additive versus interaction models
16.4.2 Visualising multiple regression
16.4.2.1 Visualising continuous variables
16.4.2.2 A visualisation trick
16.4.3 Colinearity and multicolinearity
16.5 Most statistical tests are linear models
16.5.0.1 One sample t-test
16.5.0.2 Independent t-test
16.5.0.3 One-way ANOVA
16.5.0.4 Two-way ANOVA
16.5.0.5 ANCOVA
16.6 Generalised linear models
16.6.0.1 The importance of link functions
16.6.1 Gaussian distribution (normal distribution)
16.6.2 Binomial distribution for logistic regression
16.6.2.1 Confidence intervals with link functions
16.6.3 Poisson regression for count data
16.6.3.1 Diagnostics for GLM
16.6.3.2 Under and over-dispersion
16.6.4 Chi-squared tests
16.6.4.1 Data preparation
16.6.4.2 Chi-squared goodness-of-fit test
16.6.4.3 Test of independence
16.7 Other related modelling approaches
16.7.1 Repeated measures
16.7.1.1 Pairwise t-test
16.7.2 Cumulative link models
16.7.3 Beta regression
16.7.4 Non-parametric approaches
16.7.5 Advantages of non-parametric tests
16.8 Recommended resources
16.9 Summary
17 Introduction to clustering and classification
17.1 Clustering and Classification
17.2 The packages
17.3 The data
17.4 Supervised versus unsupervised learning
17.4.1 Why learn classification and clustering
17.5 Clustering
17.5.1 Hierarchical clustering
17.5.2 Dimension reduction
17.6 Classification
17.6.1 Classification trees
17.6.2 How classification trees work
17.6.3 Model evaluation
17.6.3.1 Pruning a tree
17.6.4 k-fold cross-validation
17.6.5 Accuracy versus interpretability
17.7 Recommended resources
17.8 Summary
18 Reporting and worked examples
18.1 Writing the project report
18.2 The packages
18.3 The data
18.4 Reporting training data
18.5 Prosecution results
18.6 Fish pond study
19 Epilogue
A Appendix: step-wise statistical calculations
A.1 How to approach an equation
A.2 Data
A.3 The standard deviation
A.4 Model coefficients
A.4.1 Slope
A.4.2 Intercept
A.4.3 r² (Pearson's correlation coefficient)
A.4.4 AIC and AICc
Index
Nathan Whitmore is a conservation scientist and practitioner who worked previously as a fisheries observer, and a ranger for the New Zealand Department of Conservation. Between 2012 and 2018 he worked in Papua New Guinea where he provided scientific and project support for the Wildlife Conservation Society's Papua New Guinea programme. His professional interests include evidence-based decision making, sustainable use of wildlife, and traditional natural resource management.