| Preface | xi | |
| | 3 | (11) |
| 1.1 Data Processing and the Research Cycle | 4 | (1) |
| 1.2 What We Do (and Don't Do) in this Book | 5 | (2) |
| 1.3 Why Focus on Data Processing? | 7 | (2) |
| 1.4 Data in Files vs. Data in Databases | 9 | (2) |
| 1.5 Target Audience, Requirements and Software | 11 | (1) |
| | 12 | (2) |
| | 14 | (9) |
| | 14 | (2) |
| 2.2 Setting Up the Project Environment for Your Work | 16 | (4) |
| 2.3 The PostgreSQL Database System | 20 | (2) |
| | 22 | (1) |
| 3 Data = Content + Structure | 23 | (16) |
| | 23 | (1) |
| 3.2 Data Content and Structure | 24 | (2) |
| 3.3 Tables, Tables, Tables | 26 | (4) |
| 3.4 The Structure of Tables Matters | 30 | (5) |
| | 35 | (4) |
| | 39 | (20) |
| 4.1 Text and Binary Files | 40 | (3) |
| 4.2 File Formats for Tabular Data | 43 | (11) |
| 4.3 Transparent and Efficient Use of Files | 54 | (3) |
| | 57 | (2) |
| 5 Managing Data in Spreadsheets | 59 | (15) |
| 5.1 Application: Spatial Inequality | 60 | (3) |
| 5.2 Spreadsheet Tables and (the Lack of) Structure | 63 | (1) |
| 5.3 Retrieving Data from a Table | 64 | (2) |
| 5.4 Changing Table Structure and Content | 66 | (1) |
| 5.5 Aggregating Data from a Table | 67 | (3) |
| 5.6 Exporting Spreadsheet Data | 70 | (1) |
| 5.7 Results: Spatial Inequality | 70 | (1) |
| | 71 | (3) |
| 6 Basic Data Management in R | 74 | (13) |
| 6.1 Application: Inequality and Economic Performance in the US | 75 | (1) |
| | 76 | (3) |
| | 79 | (3) |
| 6.4 Aggregating Data from a Table | 82 | (2) |
| 6.5 Results: Inequality and Economic Performance in the US | 84 | (1) |
| | 85 | (2) |
| | 87 | (16) |
| 7.1 Application: Global Patterns of Inequality across Regime Types | 88 | (1) |
| 7.2 A New Operator: The Pipe | 89 | (1) |
| | 90 | (2) |
| 7.4 Merging the WID and Polity IV Datasets | 92 | (1) |
| 7.5 Grouping and Aggregation | 93 | (3) |
| 7.6 Results: Global Patterns of Inequality across Regime Types | 96 | (1) |
| 7.7 Other Useful Functions in the tidyverse | 97 | (2) |
| | 99 | (4) |
| PART III DATA IN DATABASES | | |
| 8 Introduction to Relational Databases | 103 | (18) |
| 8.1 Database Servers and Clients | 105 | (3) |
| | 108 | (1) |
| 8.3 Application: Electoral Disproportionality by Country | 109 | (1) |
| 8.4 Creating a Table with National Elections | 110 | (5) |
| 8.5 Computing Electoral Disproportionality | 115 | (2) |
| 8.6 Results: Electoral Disproportionality by Country | 117 | (1) |
| | 118 | (3) |
| 9 Relational Databases and Multiple Tables | 121 | (14) |
| 9.1 Application: The Rise of Populism in Europe | 122 | (1) |
| | 123 | (2) |
| | 125 | (2) |
| 9.4 Merging Data from the PopuList | 127 | (2) |
| 9.5 Maintaining Referential Integrity | 129 | (2) |
| 9.6 Results: The Rise of Populism in Europe | 131 | (1) |
| | 132 | (3) |
| | 135 | (12) |
| 10.1 Speeding Up Data Access with Indexes | 136 | (4) |
| 10.2 Collaborative Data Management with Multiple Users | 140 | (3) |
| | 143 | (4) |
| PART IV SPECIAL TYPES OF DATA | | |
| | 147 | (19) |
| 11.1 What Is Spatial Data? | 147 | (3) |
| 11.2 Application: Patterns of Violence in the Bosnian Civil War | 150 | (1) |
| 11.3 Reading and Visualizing Spatial Data in R | 151 | (7) |
| 11.4 Spatial Data in a Relational Database | 158 | (5) |
| 11.5 Results: Patterns of Violence in the Bosnian Civil War | 163 | (1) |
| | 164 | (2) |
| | 166 | (21) |
| 12.1 What Is Textual Data? | 167 | (2) |
| 12.2 Application: References to (In)equality in UN Speeches | 169 | (1) |
| 12.3 Working with Strings in (Base) R | 170 | (5) |
| 12.4 Natural Language Processing with quanteda | 175 | (4) |
| 12.5 Using PostgreSQL to Manage Documents | 179 | (4) |
| 12.6 Results: References to (In)equality in UN Speeches | 183 | (1) |
| | 184 | (3) |
| | 187 | (22) |
| 13.1 What Is Network Data? | 187 | (3) |
| 13.2 Application: Trade and Democracy | 190 | (1) |
| 13.3 Exploring Network Data in R with igraph | 191 | (6) |
| 13.4 Network Data in a Relational Database | 197 | (7) |
| 13.5 Results: Trade and Democracy | 204 | (1) |
| | 205 | (4) |
| 14 Best Practices in Data Management | 209 | (10) |
| 14.1 Two General Recommendations | 209 | (3) |
| 14.2 Collaborative Data Management | 212 | (2) |
| 14.3 Disseminating Research Data and Code | 214 | (2) |
| | 216 | (3) |
| Bibliography | 219 | (4) |
| Index | 223 | |