Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science.
The book's introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications.
With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.
Reviews
"The volume is a well-organised collection of articles presenting the importance of modern data mining and machine learning techniques in application to analysis of astronomical data. A major strength of the volume is its very impressive collection of real examples that can be both inspirational and educational. The book is particularly successful in showing how collaboration between computer scientists and statisticians on one side and astronomers on the other is needed to search for a scientific discovery in the abundance of data generated by instrumentation and simulations." Krzysztof Podgorski, International Statistical Review, 2014
Foreword
Editors
Perspective
Contributors

Part I Foundational Issues
Chapter 1 Classification in Astronomy: Past and Present
Chapter 2 Searching the Heavens: Astronomy, Computation, Statistics, Data Mining, and Philosophy
Chapter 3 Probability and Statistics in Astronomical Machine Learning and Data Mining

Part II Astronomical Applications
Section 1 Source Identification
Chapter 4 Automated Science Processing for the Fermi Large Area Telescope
Chapter 5 Cosmic Microwave Background Data Analysis
Chapter 6 Data Mining and Machine Learning in Time-Domain Discovery and Classification
Chapter 7 Cross-Identification of Sources: Theory and Practice
Chapter 8 The Sky Pixelization for Cosmic Microwave Background Mapping
Chapter 9 Future Sky Surveys: New Discovery Frontiers
Chapter 10 Poisson Noise Removal in Spherical Multichannel Images: Application to Fermi Data
Chapter 11 Galaxy Zoo: Morphological Classification and Citizen Science
Chapter 12 The Utilization of Classifications in High-Energy Astrophysics Experiments
Chapter 13 Database-Driven Analyses of Astronomical Spectra
Chapter 14 Weak Gravitational Lensing
Chapter 15 Photometric Redshifts: 50 Years After
Chapter 16 Galaxy Clusters
Section 3 Signal Processing (Time-Series) Analysis
Chapter 17 Planet Detection: The Kepler Mission
Chapter 18 Classification of Variable Objects in Massive Sky Monitoring Surveys
Chapter 19 Gravitational Wave Astronomy
Section 4 The Largest Data Sets
Chapter 20 Virtual Observatory and Distributed Data Mining
Chapter 21 Multitree Algorithms for Large-Scale Astrostatistics

Part III Machine Learning Methods
Chapter 22 Time-Frequency Learning Machines for Nonstationarity Detection Using Surrogates
Chapter 23 Classification
Chapter 24 On the Shoulders of Gauss, Bessel, and Poisson: Links, Chunks, Spheres, and Conditional Models
Chapter 25 Data Clustering
Chapter 26 Ensemble Methods: A Review
Chapter 27 Parallel and Distributed Data Mining for Astronomy Applications
Chapter 28 Pattern Recognition in Time Series
Chapter 29 Randomized Algorithms for Matrices and Data

Index
Michael J. Way, PhD, is a research scientist at the NASA Goddard Institute for Space Studies in New York and the NASA Ames Research Center in California. He is also an adjunct professor in the Department of Physics and Astronomy at Hunter College. His research focuses on understanding the multiscale structure of our universe, modeling the atmospheres of exoplanets, and applying kernel methods to new areas in astronomy.
Jeffrey D. Scargle, PhD, is an astrophysicist in the Space Science and Astrobiology Division of the NASA Ames Research Center. His main interests encompass the variability of astronomical objects, including the Sun, sources in the Galaxy, and active galactic nuclei; cosmology; plasma astrophysics; planetary detection; and data analysis and statistical methods.
Kamal M. Ali, PhD, is a research scientist in machine learning and data mining. He has a consulting practice and is cofounder of the start-up Metric Avenue. He has carried out research at IBM Almaden, Stanford University, Vividence, Yahoo, and TiVo, where he worked on the TiVo Collaborative Filtering Engine. His current research focuses on combining machine learning in conditional random fields with linguistically rich features to make machines better at reading web pages.
Ashok N. Srivastava, PhD, is the principal scientist for Data Mining and Systems Health Management and leader of the Intelligent Data Understanding group at NASA Ames Research Center. His research includes the development of data mining algorithms for anomaly detection in massive data streams, kernel methods in machine learning, and text mining algorithms.