| About the Author |
|
ix | |
| Preface |
|
xi | |
| Introduction |
|
xiii | |
|
How governments make decisions |
|
|
xiii | |
|
Context as the foundation |
|
|
xv | |
|
Data science as a planning tool |
|
|
xvi | |
|
The importance of spatial thinking |
|
|
xvii | |
|
|
|
xx | |
| 1 Indicators for Transit-oriented Development |
|
1 | (22) |
|
1.1 Why start with indicators? |
|
|
1 | (4) |
|
1.1.1 Mapping and scale bias in areal aggregate data |
|
|
2 | (3) |
|
|
|
5 | (10) |
|
1.2.1 Downloading and wrangling census data |
|
|
6 | (5) |
|
1.2.2 Wrangling transit open data |
|
|
11 | (1) |
|
1.2.3 Relating tracts and subway stops in space |
|
|
12 | (3) |
|
1.3 Developing TOD indicators |
|
|
15 | (3) |
|
|
|
15 | (1) |
|
1.3.2 TOD indicator tables |
|
|
16 | (1) |
|
1.3.3 TOD indicator plots |
|
|
17 | (1) |
|
1.4 Capturing three submarkets of interest |
|
|
18 | (2) |
|
1.5 Conclusion: Are Philadelphians willing to pay for TOD? |
|
|
20 | (1) |
|
1.6 Assignment - Study TOD in your city |
|
|
21 | (2) |
| 2 Expanding the Urban Growth Boundary |
|
23 | (22) |
|
2.1 Introduction - Lancaster development |
|
|
23 | (7) |
|
|
|
26 | (3) |
|
2.1.2 Set up Lancaster data |
|
|
29 | (1) |
|
2.2 Identifying areas inside and outside of the Urban Growth Area |
|
|
30 | (7) |
|
2.2.1 Associate each inside/outside buffer with its respective town |
|
|
32 | (1) |
|
2.2.2 Building density by town and by inside/outside the UGA |
|
|
33 | (1) |
|
2.2.3 Visualize buildings inside and outside the UGA |
|
|
34 | (3) |
|
2.3 Return to Lancaster's bid-rent |
|
|
37 | (4) |
|
2.4 Conclusion - On boundaries |
|
|
41 | (1) |
|
2.5 Assignment - Boundaries in your community |
|
|
41 | (4) |
| 3 Intro to Geospatial Machine Learning, Part 1 |
|
45 | (26) |
|
3.1 Machine learning as a planning tool |
|
|
45 | (3) |
|
3.1.1 Accuracy and generalizability |
|
|
46 | (1) |
|
3.1.2 The machine learning process |
|
|
46 | (1) |
|
|
|
47 | (1) |
|
3.2 Data wrangling - Home price and crime data |
|
|
48 | (8) |
|
3.2.1 Feature engineering - Measuring exposure to crime |
|
|
51 | (2) |
|
3.2.2 Exploratory analysis - Correlation |
|
|
53 | (3) |
|
3.3 Introduction to ordinary least squares regression |
|
|
56 | (7) |
|
3.3.1 Our first regression model |
|
|
58 | (2) |
|
3.3.2 More feature engineering and collinearity |
|
|
60 | (3) |
|
3.4 Cross-validation and return to goodness of fit |
|
|
63 | (5) |
|
3.4.1 Accuracy - Mean absolute error |
|
|
63 | (2) |
|
3.4.2 Generalizability - Cross-validation |
|
|
65 | (3) |
|
3.5 Conclusion - Our first model |
|
|
68 | (1) |
|
3.6 Assignment - Predict house prices |
|
|
68 | (3) |
| 4 Intro to Geospatial Machine Learning, Part 2 |
|
71 | (16) |
|
4.1 On the spatial process of home prices |
|
|
71 | (3) |
|
4.1.1 Set up and data wrangling |
|
|
72 | (2) |
|
4.2 Do prices and errors cluster? The spatial lag |
|
|
74 | (3) |
|
4.2.1 Do model errors cluster? - Moran's I |
|
|
75 | (2) |
|
4.3 Accounting for neighborhood |
|
|
77 | (8) |
|
4.3.1 Accuracy of the neighborhood model |
|
|
78 | (2) |
|
4.3.2 Spatial autocorrelation in the neighborhood model |
|
|
80 | (2) |
|
4.3.3 Generalizability of the neighborhood model |
|
|
82 | (3) |
|
4.4 Conclusion - Features at multiple scales |
|
|
85 | (2) |
| 5 Geospatial Risk Modeling - Predictive Policing |
|
87 | (42) |
|
5.1 New predictive policing tools |
|
|
87 | (4) |
|
5.1.1 Generalizability in geospatial risk models |
|
|
88 | (1) |
|
5.1.2 From broken windows theory to broken windows policing |
|
|
89 | (1) |
|
|
|
90 | (1) |
|
5.2 Data wrangling: Creating the fishnet |
|
|
91 | (7) |
|
5.2.1 Data wrangling: Joining burglaries to the fishnet |
|
|
94 | (1) |
|
5.2.2 Wrangling risk factors |
|
|
95 | (3) |
|
5.3 Feature engineering - Count of risk factors by grid cell |
|
|
98 | (6) |
|
5.3.1 Feature engineering - Nearest neighbor features |
|
|
100 | (2) |
|
5.3.2 Feature engineering - Measure distance to one point |
|
|
102 | (1) |
|
5.3.3 Feature engineering - Create the final_net |
|
|
103 | (1) |
|
5.4 Exploring the spatial process of burglary |
|
|
104 | (5) |
|
|
|
108 | (1) |
|
|
|
109 | (16) |
|
5.5.1 Cross-validated Poisson regression |
|
|
111 | (1) |
|
5.5.2 Accuracy and generalizability |
|
|
112 | (6) |
|
5.5.3 Generalizability by neighborhood context |
|
|
118 | (2) |
|
5.5.4 Does this model allocate better than traditional crime hotspots? |
|
|
120 | (5) |
|
5.6 Conclusion - Bias but useful? |
|
|
125 | (2) |
|
5.7 Assignment - Predict risk |
|
|
127 | (2) |
| 6 People-based ML Models |
|
129 | (24) |
|
|
|
129 | (2) |
|
|
|
131 | (3) |
|
|
|
134 | (3) |
|
6.3.1 Training/testing sets |
|
|
135 | (1) |
|
6.3.2 Estimate a churn model |
|
|
135 | (2) |
|
|
|
137 | (5) |
|
|
|
141 | (1) |
|
|
|
142 | (2) |
|
6.6 Generating costs and benefits |
|
|
144 | (5) |
|
6.6.1 Optimizing the cost/benefit relationship |
|
|
146 | (3) |
|
|
|
149 | (1) |
|
6.8 Assignment - Target a subsidy |
|
|
150 | (3) |
| 7 People-based ML Models: Algorithmic Fairness |
|
153 | (18) |
|
|
|
153 | (3) |
|
7.1.1 The specter of disparate impact |
|
|
154 | (1) |
|
7.1.2 Modeling judicial outcomes |
|
|
155 | (1) |
|
7.1.3 Accuracy and generalizability in recidivism algorithms |
|
|
155 | (1) |
|
7.2 Data and exploratory analysis |
|
|
156 | (3) |
|
7.3 Estimate two recidivism models |
|
|
159 | (5) |
|
7.3.1 Accuracy and generalizability |
|
|
161 | (3) |
|
7.4 What about the threshold? |
|
|
164 | (1) |
|
7.5 Optimizing 'equitable' thresholds |
|
|
165 | (3) |
|
7.6 Assignment - Memo to the mayor |
|
|
168 | (3) |
| 8 Predicting Rideshare Demand |
|
171 | (26) |
|
8.1 Introduction - Rideshare |
|
|
171 | (1) |
|
8.2 Data wrangling - Rideshare |
|
|
172 | (7) |
|
|
|
173 | (1) |
|
|
|
174 | (1) |
|
8.2.3 Subset a study area using neighborhoods |
|
|
175 | (2) |
|
8.2.4 Create the final space/time panel |
|
|
177 | (1) |
|
8.2.5 Split training and test |
|
|
178 | (1) |
|
8.2.6 What about distance features? |
|
|
179 | (1) |
|
8.3 Exploratory analysis - Rideshare |
|
|
179 | (8) |
|
8.3.1 Trip_Count serial autocorrelation |
|
|
180 | (2) |
|
8.3.2 Trip_Count spatial autocorrelation |
|
|
182 | (2) |
|
8.3.3 Space/time correlation? |
|
|
184 | (1) |
|
|
|
185 | (2) |
|
8.4 Modeling and validation using purrr::map |
|
|
187 | (7) |
|
8.4.1 A short primer on nested tibbles |
|
|
187 | (1) |
|
8.4.2 Estimate a rideshare forecast |
|
|
188 | (1) |
|
8.4.3 Validate test set by time |
|
|
189 | (3) |
|
8.4.4 Validate test set by space |
|
|
192 | (2) |
|
8.5 Conclusion - Dispatch |
|
|
194 | (1) |
|
8.6 Assignment - Predict bike share trips |
|
|
195 | (2) |
| Conclusion - Algorithmic Governance |
|
197 | (4) |
| Index |
|
201 | |