Big Data in Omics and Imaging: Association Analysis [Hardcover]

Momiao Xiong (University of Texas School of Public Health, USA)
  • Format: Hardback, 668 pages, height x width: 254x178 mm, weight: 1610 g, 26 Tables, black and white; 60 Illustrations, color; 3 Illustrations, black and white
  • Series: Chapman & Hall/CRC Computational Biology Series
  • Publication date: 13-Dec-2017
  • Publisher: Chapman & Hall/CRC
  • ISBN-10: 1498725783
  • ISBN-13: 9781498725781
Big Data in Omics and Imaging: Association Analysis addresses the recent development of association analysis and machine learning for both population and family genomic data in the sequencing era. It is unique in that it presents both hypothesis-testing and data-mining approaches to holistically dissecting the genetic structure of complex traits and to designing efficient strategies for precision medicine. The general frameworks for association analysis and machine learning developed in the text can be applied to genomic, epigenomic, and imaging data.

FEATURES

Bridges the gap between the traditional statistical methods and computational tools for small genetic and epigenetic data analysis and the modern advanced statistical methods for big data

Provides tools for high-dimensional data reduction

Discusses search algorithms for model and variable selection, including randomization algorithms, proximal methods, and matrix subset selection

Provides real-world examples and case studies

Will have an accompanying website with R code

The book is designed for graduate students and researchers in genomics, bioinformatics, and data science. It represents the paradigm shift in genetic studies of complex diseases from shallow to deep genomic analysis, from low-dimensional to high-dimensional data, from multivariate to functional data analysis with next-generation sequencing (NGS) data, and from homogeneous populations to heterogeneous population and pedigree data analysis. Topics covered include advanced matrix theory, convex optimization algorithms, generalized low-rank models, functional data analysis techniques, deep learning principles, and machine learning methods for modern association, interaction, pathway, and network analysis of rare and common variants, biomarker identification, and disease risk and drug response prediction.

Reviews

"This is a fantastic book intensively focusing on the mathematical underpinnings of modern genome-wide association studies (GWAS). It serves well for senior graduate students in applied mathematics, computer science, and statistics who are interested in building a solid mathematical understanding of GWAS. Backgrounds of advanced mathematics and genetics are expected. It can also be used as a handbook for professionals to quickly check mathematical contexts of GWAS approaches and tools. This book is especially helpful for the latest generation of statistical geneticists who are pursuing academic career paths." ~Journal of the American Statistical Association, Jing Su (Wake Forest School of Medicine)

Preface xxv
Author xxxi
Chapter 1 Mathematical Foundation 1(94)
1.1 Sparsity-Inducing Norms, Dual Norms, And Fenchel Conjugate
1(15)
1.1.1 "Entrywise" Norms
4(1)
1.1.1.1 L2,1 Norm
5(1)
1.1.1.2 Lp,q Norm
5(1)
1.1.2 Frobenius Norm
5(1)
1.1.2.1 l1/l2 Norm
5(1)
1.1.3 Overlapping Groups
6(2)
1.1.4 Dual Norm
8(2)
1.1.4.1 The Norm Dual to the Group Norm
9(1)
1.1.5 Fenchel Conjugate
10(3)
1.1.6 Fenchel Duality
13(3)
1.2 Subdifferential
16(10)
1.2.1 Definition of Subgradient
17(1)
1.2.2 Subgradients of Differentiable Functions
18(1)
1.2.3 Calculus of Subgradients
18(8)
1.2.3.1 Nonnegative Scaling
19(1)
1.2.3.2 Addition
19(1)
1.2.3.3 Affine Transformation of Variables
19(1)
1.2.3.4 Pointwise Maximum
19(2)
1.2.3.5 Pointwise Supremum
21(1)
1.2.3.6 Expectation
21(1)
1.2.3.7 Chain Rule
22(1)
1.2.3.8 Subdifferential of the Norm
22(1)
1.2.3.9 Optimality Conditions: Unconstrained
23(1)
1.2.3.10 Application to Sparse Regularized Convex Optimization Problems
24(2)
1.3 Proximal Methods
26(29)
1.3.1 Introduction
26(1)
1.3.2 Basics of Proximal Methods
27(1)
1.3.2.1 Definition of Proximal Operator
27(1)
1.3.3 Properties of the Proximal Operator
28(8)
1.3.3.1 Separable Sum
28(5)
1.3.3.2 Moreau-Yosida Regularization
33(3)
1.3.3.3 Gradient Algorithms for the Calculation of the Proximal Operator
36(1)
1.3.4 Proximal Algorithms
36(6)
1.3.4.1 Proximal Point Algorithm
37(1)
1.3.4.2 Proximal Gradient Method
37(1)
1.3.4.3 Accelerated Proximal Gradient Method
38(1)
1.3.4.4 Alternating Direction Method of Multipliers
39(2)
1.3.4.5 Linearized ADMM
41(1)
1.3.5 Computing the Proximal Operator
42(13)
1.3.5.1 Generic Function
42(8)
1.3.5.2 Norms
50(5)
1.4 Matrix Calculus
55(8)
1.4.1 Derivative of a Function with Respect to a Vector
55(1)
1.4.2 Derivative of a Function with Respect to a Matrix
56(1)
1.4.3 Derivative of a Matrix with Respect to a Scalar
57(1)
1.4.4 Derivative of a Matrix with Respect to a Matrix or a Vector
58(1)
1.4.5 Derivative of a Vector Function of a Vector
59(1)
1.4.6 Chain Rules
59(1)
1.4.6.1 Vector Function of Vectors
59(1)
1.4.6.2 Scalar Function of Matrices
60(1)
1.4.7 Widely Used Formulae
60(3)
1.4.7.1 Determinants
60(1)
1.4.7.2 Polynomial Functions
61(1)
1.4.7.3 Trace
61(2)
1.5 Functional Principal Component Analysis (FPCA)
63(14)
1.5.1 Principal Component Analysis (PCA)
64(4)
1.5.1.1 Least Square Formulation of PCA
64(1)
1.5.1.2 Variance-Maximization Formulation of PCA
65(3)
1.5.2 Basic Mathematical Tools for Functional Principal Component Analysis
68(3)
1.5.2.1 Calculus of Variation
68(1)
1.5.2.2 Stochastic Calculus
69(2)
1.5.3 Unsmoothed Functional Principal Component Analysis
71(2)
1.5.4 Smoothed Principal Component Analysis
73(2)
1.5.5 Computations for the Principal Component Function and the Principal Component Score
75(2)
1.6 Canonical Correlation Analysis
77(13)
1.6.1 Mathematical Formulation of Canonical Correlation Analysis
77(1)
1.6.2 Correlation Maximization Techniques for Canonical Correlation Analysis
78(4)
1.6.3 Singular Value Decomposition for Canonical Correlation Analysis
82(1)
1.6.4 Test Statistics
83(4)
1.6.5 Functional Canonical Correlation Analysis
87(3)
Appendix 1A
90(2)
Exercises
92(3)
Chapter 2 Linkage Disequilibrium 95(36)
2.1 Concepts Of Linkage Disequilibrium
95(1)
2.2 Measures Of Two-Locus Linkage Disequilibrium
96(7)
2.2.1 Linkage Disequilibrium Coefficient D
96(1)
2.2.2 Normalized Measure of Linkage Disequilibrium D'
97(1)
2.2.3 Correlation Coefficient r
97(4)
2.2.4 Composite Measure of Linkage Disequilibrium
101(1)
2.2.5 Relationship between the Measure of LD and Physical Distance
102(1)
2.3 Haplotype Reconstruction
103(2)
2.3.1 Clark's Algorithm
104(1)
2.3.2 EM algorithm
104(1)
2.3.3 Bayesian and Coalescence-Based Methods
104(1)
2.4 Multilocus Measures Of Linkage Disequilibrium
105(14)
2.4.1 Mutual Information Measure of LD
105(2)
2.4.2 Multi-Information and Multilocus Measure of LD
107(2)
2.4.3 Joint Mutual Information and a Measure of LD between a Marker and a Haplotype Block or between Two Haplotype Blocks
109(3)
2.4.4 Interaction Information
112(2)
2.4.5 Conditional Interaction Information
114(1)
2.4.6 Normalized Multi-Information
115(1)
2.4.7 Distribution of Estimated Mutual Information, Multi-Information and Interaction Information
115(4)
2.5 Canonical Correlation Analysis Measure For LD Between Two Genomic Regions
119(4)
2.5.1 Association Measure between Two Genomic Regions Based on CCA
119(3)
2.5.2 Relationship between Canonical Correlation and Joint Information
122(1)
Software Package
123(1)
Bibliographical Notes
123(1)
Appendix 2A
124(1)
Appendix 2B
125(1)
Appendix 2C
126(2)
Exercises
128(3)
Chapter 3 Association Studies for Qualitative Traits 131(80)
3.1 Population-Based Association Analysis For Common Variants
131(23)
3.1.1 Introduction
131(2)
3.1.2 The Hardy-Weinberg Equilibrium
133(3)
3.1.3 Genetic Models
136(3)
3.1.4 Odds Ratio
139(4)
3.1.5 Single Marker Association Analysis
143(7)
3.1.5.1 Contingency Tables
143(3)
3.1.5.2 Fisher's Exact Test
146(1)
3.1.5.3 The Traditional χ² Test Statistic
147(3)
3.1.6 Multimarker Association Analysis
150(4)
3.1.6.1 Generalized T² Test Statistic
151(1)
3.1.6.2 The Relationship between the Generalized T² Test and Fisher's Discriminant Analysis
152(2)
3.2 Population-Based Multivariate Association Analysis For Next-Generation Sequencing
154(24)
3.2.1 Multivariate Group Tests
155(3)
3.2.1.1 Collapsing Method
155(1)
3.2.1.2 Combined Multivariate and Collapsing Method
156(1)
3.2.1.3 Weighted Sum Method
157(1)
3.2.2 Score Tests and Logistic Regression
158(3)
3.2.2.1 Score Function
158(2)
3.2.2.2 Score Tests
160(1)
3.2.3 Application of Score Tests for Association of Rare Variants
161(6)
3.2.3.1 Weighted Function Method
161(3)
3.2.3.2 Sum Test and Adaptive Association Test
164(1)
3.2.3.3 The Sum Test
165(2)
3.2.4 Variance-Component Score Statistics and Logistic Mixed Effects Models
167(11)
3.2.4.1 Logistic Mixed Effects Models for Association Analysis
167(10)
3.2.4.2 Sequencing Kernel Association Test
177(1)
3.3 Population-Based Functional Association Analysis For Next-Generation Sequencing
178(18)
3.3.1 Introduction
179(1)
3.3.2 Functional Principal Component Analysis for Association Test
180(6)
3.3.2.1 Model and Principal Component Functions
180(2)
3.3.2.2 Computations for the Principal Component Function and the Principal Component Score
182(2)
3.3.2.3 Test Statistic
184(2)
3.3.3 Smoothed Functional Principal Component Analysis for Association Test
186(25)
3.3.3.1 A General Framework for the Smoothed Functional Principal Component Analysis
187(1)
3.3.3.2 Computations for the Smoothed Principal Component Function
188(2)
3.3.3.3 Test Statistic
190(1)
3.3.3.4 Power Comparisons
190(3)
3.3.3.5 Application to Real Data Examples
193(3)
Software Package
196(1)
Appendix 3A: Fisher Information Matrix For γ
196(2)
Appendix 3B: Variance Function v(µ)
198(1)
Appendix 3C: Derivation Of Score Function For U_τ
199(1)
Appendix 3D: Fisher Information Matrix Of PQL
200(2)
Appendix 3E: Scoring Algorithm
202(1)
Appendix 3F: Equivalence Between Iteratively Solving Linear Mixed Model And Iteratively Solving The Normal Equation
203(1)
Appendix 3G: Equation Reduction
204(3)
Exercises
207(4)
Chapter 4 Association Studies for Quantitative Traits 211(70)
4.1 Fixed Effect Model For A Single Trait
211(12)
4.1.1 Introduction
211(1)
4.1.2 Genetic Effects
211(5)
4.1.2.1 Variation Partition
211(2)
4.1.2.2 Genetic Additive and Dominance Effects
213(2)
4.1.2.3 Genetic Variance
215(1)
4.1.3 Linear Regression for a Quantitative Trait
216(4)
4.1.4 Multiple Linear Regression for a Quantitative Trait
220(3)
4.2 Gene-Based Quantitative Trait Analysis
223(10)
4.2.1 Functional Linear Model for a Quantitative Trait
223(8)
4.2.1.1 Model
223(1)
4.2.1.2 Parameter Estimation
224(5)
4.2.1.3 Test Statistics
229(2)
4.2.2 Canonical Correlation Analysis for Gene-Based Quantitative Trait Analysis
231(2)
4.2.2.1 Multivariate Canonical Correlation Analysis
231(2)
4.2.2.2 Functional Canonical Correlation Analysis
233(1)
4.3 Kernel Approach To Gene-Based Quantitative Trait Analysis
233(27)
4.3.1 Kernel and RKHS
233(11)
4.3.1.1 Kernel and Nonlinear Feature Mapping
233(4)
4.3.1.2 The Reproducing Kernel Hilbert Space
237(7)
4.3.2 Covariance Operator and Dependence Measure
244(16)
4.3.2.1 Hilbert-Schmidt Operator and Norm
244(2)
4.3.2.2 Tensor Product Space and Rank-One Operator
246(4)
4.3.2.3 Cross-Covariance Operator
250(4)
4.3.2.4 Dependence Measure and Covariance Operator
254(1)
4.3.2.5 Dependence Measure and Hilbert-Schmidt Norm of Covariance Operator
255(2)
4.3.2.6 Kernel-Based Association Tests
257(3)
4.4 Simulations And Real Data Analysis
260(4)
4.4.1 Power Evaluation
260(1)
4.4.2 Application to Real Data Examples
261(3)
Software Package
264(3)
Appendix 4A: Convergence Of The Least Square Estimator Of The Regression Coefficients
267(5)
Appendix 4B: Convergence Of Regression Coefficients In The Functional Linear Model
272(3)
Appendix 4C: Noncentrality Parameter Of The CCA Test
275(1)
Appendix 4D: Solution To The Constrained Nonlinear Covariance Optimization Problem And Dependence Measure
275(3)
Exercises
278(3)
Chapter 5 Multiple Phenotype Association Studies 281(62)
5.1 Pleiotropic Additive And Dominance Effects
281(2)
5.2 Multivariate Marginal Regression
283(21)
5.2.1 Models
283(1)
5.2.2 Estimation of Genetic Effects
284(10)
5.2.2.1 Least Square Estimation
284(5)
5.2.2.2 Maximum Likelihood Estimator
289(5)
5.2.3 Test Statistics
294(10)
5.2.3.1 Classical Null Hypothesis
294(1)
5.2.3.2 The Multivariate General Linear Hypothesis
295(1)
5.2.3.3 Estimation of the Parameter Matrix under Constraints
296(1)
5.2.3.4 Multivariate Analysis of Variance (MANOVA)
297(1)
5.2.3.5 Other Multivariate Test Statistics
298(6)
5.3 Linear Models For Multiple Phenotypes And Multiple Markers
304(7)
5.3.1 Multivariate Multiple Linear Regression Models
304(2)
5.3.2 Multivariate Functional Linear Models for Gene-Based Genetic Analysis of Multiple Phenotypes
306(5)
5.3.2.1 Parameter Estimation
307(1)
5.3.2.2 Null Hypothesis and Test Statistics
308(1)
5.3.2.3 Other Multivariate Test Statistics
309(1)
5.3.2.4 Wilks' Lambda
310(1)
5.3.2.5 F Approximation to the Distribution of Three Test Statistics
310(1)
5.4 Canonical Correlation Analysis For Gene-Based Genetic Pleiotropic Analysis
311(8)
5.4.1 Multivariate Canonical Correlation Analysis (CCA)
311(1)
5.4.2 Kernel CCA
312(2)
5.4.3 Functional CCA
314(3)
5.4.4 Quadratically Regularized Functional CCA
317(2)
5.5 Dependence Measure And Association Tests Of Multiple Traits
319(2)
5.6 Principal Component For Phenotype Dimension Reduction
321(5)
5.6.1 Principal Component Analysis
321(1)
5.6.2 Kernel Principal Component Analysis
322(3)
5.6.3 Quadratically Regularized PCA or Kernel PCA
325(1)
5.7 Other Statistics For Pleiotropic Genetic Analysis
326(4)
5.7.1 Sum of Squared Score Test
326(2)
5.7.2 Unified Score-Based Association Test (USAT)
328(1)
5.7.3 Combining Marginal Tests
329(1)
5.7.4 FPCA-Based Kernel Measure Test of Independence
329(1)
5.8 Connection Between Statistics
330(5)
5.9 Simulations And Real Data Analysis
335(2)
5.9.1 Type 1 Error Rate and Power Evaluation
335(1)
5.9.2 Application to Real Data Example
336(1)
Software Package
337(1)
Appendix 5A Optimization Formulation Of Kernel CCA
337(2)
Appendix 5B Derivation Of The Regression Coefficient Matrix In The Functional Linear Model, Sum Of Squares Due To Regression, And RFCCA Matrix
339(1)
Exercises
340(3)
Chapter 6 Family-Based Association Analysis 343(104)
6.1 Genetic Similarity And Kinship Coefficients
344(14)
6.1.1 Kinship Coefficients
344(3)
6.1.2 Identity Coefficients
347(1)
6.1.3 Relation between Identity Coefficients and Kinship Coefficients
348(2)
6.1.4 Estimation of Genetic Relations from the Data
350(8)
6.1.4.1 A General Framework for Identity by Descent
350(2)
6.1.4.2 Kinship Matrix or Genetic Relationship Matrix in the Homogeneous Population
352(1)
6.1.4.3 Kinship Matrix or Genetic Relationship Matrix in the General Population
353(4)
6.1.4.4 Coefficient of Fraternity
357(1)
6.2 Genetic Covariance Between Relatives
358(4)
6.2.1 Assumptions and Genetic Models
358(1)
6.2.2 Analysis for Genetic Covariance between Relatives
359(3)
6.3 Mixed Linear Model For A Single Trait
362(28)
6.3.1 Genetic Random Effect
362(4)
6.3.1.1 Single Random Variable
362(3)
6.3.1.2 Multiple Genetic Random Effects
365(1)
6.3.2 Mixed Linear Model for Quantitative Trait Association Analysis
366(4)
6.3.2.1 Mixed Linear Model
366(1)
6.3.2.2 Estimating Fixed and Random Effects
367(3)
6.3.3 Estimating Variance Components
370(13)
6.3.3.1 ML Estimation of Variance Components
370(3)
6.3.3.2 Restricted Maximum Likelihood Estimation
373(1)
6.3.3.3 Numerical Solutions to the ML/REML Equations
374(3)
6.3.3.4 Fisher Information Matrix for the ML Estimators
377(1)
6.3.3.5 Expectation/Maximization (EM) Algorithm for ML Estimation
378(4)
6.3.3.6 Expectation/Maximization (EM) Algorithm for REML Estimation
382(1)
6.3.3.7 Average Information Algorithms
383(1)
6.3.4 Hypothesis Test in Mixed Linear Models
383(4)
6.3.5 Mixed Linear Models for Quantitative Trait Analysis with Sequencing Data
387(3)
6.3.5.1 Sequence Kernel Association Test (SKAT)
387(3)
6.4 Mixed Functional Linear Models For Sequence-Based Quantitative Trait Analysis
390(5)
6.4.1 Mixed Functional Linear Models (Type 1)
390(3)
6.4.2 Mixed Functional Linear Models (Type 2: Functional Variance Component Models)
393(2)
6.5 Multivariate Mixed Linear Model For Multiple Traits
395(5)
6.5.1 Multivariate Mixed Linear Model
395(3)
6.5.2 Maximum Likelihood Estimate of Variance Components
398(1)
6.5.3 REML Estimate of Variance Components
399(1)
6.6 Heritability
400(10)
6.6.1 Heritability Estimation for a Single Trait
400(4)
6.6.1.1 Definition of Narrow-Sense Heritability
400(1)
6.6.1.2 Mixed Linear Model for Heritability Estimation
401(3)
6.6.2 Heritability Estimation for Multiple Traits
404(6)
6.6.2.1 Definition of Heritability Matrix for Multiple Traits
404(1)
6.6.2.2 Connection between Heritability Matrix and Multivariate Mixed Linear Models
405(1)
6.6.2.3 Another Interpretation of Heritability
406(2)
6.6.2.4 Maximizing Heritability
408(2)
6.7 Family-Based Association Analysis For Qualitative Trait
410(10)
6.7.1 The Generalized T² Test with Families and Additional Population Structures
410(4)
6.7.2 Collapsing Method
414(2)
6.7.3 CMC with Families
416(2)
6.7.4 The Functional Principal Component Analysis and Smooth Functional Principal Component Analysis with Families
418(2)
Software Package
420(1)
Appendix 6A: Genetic Relationship Matrix
420(3)
Appendix 6B: Derivation Of Equation 6.30
423(3)
Appendix 6C: Derivation Of Equation 6.33
426(2)
Appendix 6D: ML Estimation Of Variance Components
428(1)
Appendix 6E: Covariance Matrix Of The ML Estimators
429(2)
Appendix 6F: Selection Of The Matrix K In The REML
431(2)
Appendix 6G: Alternative Form Of Log-Likelihood Function For The REML
433(3)
Appendix 6H: ML Estimate Of Variance Components In The Multivariate Mixed Linear Models
436(2)
Appendix 6I: Covariance Matrix For Family-Based T² Statistic
438(2)
Appendix 6J: Family-Based Functional Principal Component Analysis
440(3)
Exercise
443(4)
Chapter 7 Interaction Analysis 447(84)
7.1 Measures Of Gene-Gene And Gene-Environment Interactions For A Qualitative Trait
448(14)
7.1.1 Binary Measure of Gene-Gene and Gene-Environment Interactions
448(5)
7.1.1.1 The Binary Measure of Gene-Gene Interaction for the Cohort Study Design
448(4)
7.1.1.2 The Binary Measure of Gene-Gene Interaction for the Case-Control Study Design
452(1)
7.1.2 Disequilibrium Measure of Gene-Gene and Gene-Environment Interactions
453(2)
7.1.3 Information Measure of Gene-Gene and Gene-Environment Interactions
455(3)
7.1.4 Measure of Interaction between a Gene and a Continuous Environment
458(4)
7.1.4.1 Multiplicative Measure of Interaction between a Gene and a Continuous Environment
458(1)
7.1.4.2 Disequilibrium Measure of Interaction between a Gene and a Continuous Environment
459(1)
7.1.4.3 Mutual Information Measure of Interaction between a Gene and a Continuous Environment
460(2)
7.2 Statistics For Testing Gene-Gene And Gene-Environment Interactions For A Qualitative Trait With Common Variants
462(24)
7.2.1 Relative Risk and Odds-Ratio-Based Statistics for Testing Interaction between a Gene and a Discrete Environment
462(2)
7.2.2 Disequilibrium-Based Statistics for Testing Gene-Gene Interaction
464(5)
7.2.2.1 Standard Disequilibrium Measure-Based Statistics
464(2)
7.2.2.2 Composite Measure of Linkage Disequilibrium for Testing Interaction between Unlinked Loci
466(3)
7.2.3 Information-Based Statistics for Testing Gene-Gene Interaction
469(3)
7.2.4 Haplotype Odds Ratio and Tests for Gene-Gene Interaction
472(8)
7.2.4.1 Genotype-Based Odds Ratio Multiplicative Interaction Measure
473(1)
7.2.4.2 Allele-Based Odds Ratio Multiplicative Interaction Measure
474(2)
7.2.4.3 Haplotype-Based Odds Ratio Multiplicative Interaction Measure
476(3)
7.2.4.4 Haplotype-Based Odds Ratio Multiplicative Interaction Measure-Based Test Statistics
479(1)
7.2.5 Multiplicative Measure-Based Statistics for Testing Interaction between a Gene and a Continuous Environment
480(1)
7.2.6 Information Measure-Based Statistics for Testing Interaction between a Gene and a Continuous Environment
481(1)
7.2.7 Real Example
481(5)
7.3 Statistics For Testing Gene-Gene And Gene-Environment Interaction For A Qualitative Trait With Next-Generation Sequencing Data
486(6)
7.3.1 Multiple Logistic Regression Model for Gene-Gene Interaction Analysis
487(1)
7.3.2 Functional Logistic Regression Model for Gene-Gene Interaction Analysis
488(4)
7.3.3 Statistics for Testing Interaction between Two Genomic Regions
492(1)
7.4 Statistics For Testing Gene-Gene And Gene-Environment Interaction For Quantitative Traits
492(24)
7.4.1 Genetic Models for Epistasis Effects of Quantitative Traits
493(5)
7.4.2 Regression Model for Interaction Analysis with Quantitative Traits
498(1)
7.4.3 Functional Regression Model for Interaction Analysis with a Quantitative Trait
499(8)
7.4.3.1 Model
499(1)
7.4.3.2 Parameter Estimation
500(3)
7.4.3.3 Test Statistics
503(1)
7.4.3.4 Simulations and Applications to Real Example
504(3)
7.4.4 Functional Regression Model for Interaction Analysis with Multiple Quantitative Traits
507(9)
7.4.4.1 Model
507(2)
7.4.4.2 Parameter Estimation
509(2)
7.4.4.3 Test Statistics
511(1)
7.4.4.4 Simulations and Real Example Applications
512(4)
7.5 Multivariate And Functional Canonical Correlation As A Unified Framework For Testing For Gene-Gene And Gene-Environment Interaction For Both Qualitative And Quantitative Traits
516(6)
7.5.1 Data Structure of CCA for Interaction Analysis
517(2)
7.5.1.1 Single Quantitative Trait
517(1)
7.5.1.2 Multiple Quantitative Traits
518(1)
7.5.1.3 A Qualitative Trait
518(1)
7.5.2 CCA and Functional CCA
519(2)
7.5.3 Kernel CCA
521(1)
Software Package
522(1)
Appendix 7A: Variance Of Logarithm Of Odds Ratio
522(2)
Appendix 7B: Haplotype Odds-Ratio Interaction Measure
524(1)
Appendix 7C: Parameter Estimation For Multivariate Functional Regression Model
525(2)
Exercise
527(4)
Chapter 8 Machine Learning, Low-Rank Models, and Their Application to Disease Risk Prediction and Precision Medicine 531(84)
8.1 Logistic Regression
532(20)
8.1.1 Two-Class Logistic Regression
532(2)
8.1.2 Multiclass Logistic Regression
534(2)
8.1.3 Parameter Estimation
536(6)
8.1.4 Test Statistics
542(1)
8.1.5 Network-Penalized Two-Class Logistic Regression
543(5)
8.1.5.1 Model
543(4)
8.1.5.2 Proximal Method for Parameter Estimation
547(1)
8.1.6 Network-Penalized Multiclass Logistic Regression
548(4)
8.1.6.1 Model
548(2)
8.1.6.2 Proximal Method for Parameter Estimation in Multiclass Logistic Regression
550(2)
8.2 Fisher's Linear Discriminant Analysis
552(10)
8.2.1 Fisher's Linear Discriminant Analysis for Two Classes
552(4)
8.2.2 Multiclass Fisher's Linear Discriminant Analysis
556(2)
8.2.3 Connections between Linear Discriminant Analysis, Optimal Scoring, and Canonical Correlation Analysis (CCA)
558(4)
8.2.3.1 Matrix Formulation of Linear Discriminant Analysis
558(3)
8.2.3.2 Optimal Scoring and Its Connection with Linear Discriminant Analysis
561(1)
8.2.3.3 Connection between LDA and CCA
561(1)
8.3 Support Vector Machine
562(18)
8.3.1 Introduction
563(1)
8.3.2 Linear Support Vector Machines
563(12)
8.3.2.1 Separable Case
563(3)
8.3.2.2 Nonseparable Case
566(2)
8.3.2.3 The Karush-Kuhn-Tucker (KKT) Conditions
568(2)
8.3.2.4 Sequential Minimal Optimization (SMO) Algorithm
570(5)
8.3.3 Nonlinear SVM
575(1)
8.3.4 Penalized SVMs
575(5)
8.4 Low-Rank Approximation
580(5)
8.4.1 Quadratically Regularized PCA
580(3)
8.4.1.1 Formulation
580(2)
8.4.1.2 Interpretation
582(1)
8.4.2 Generalized Regularization
583(2)
8.4.2.1 Formulation
583(1)
8.4.2.2 Sparse PCA
583(2)
8.5 Generalized Canonical Correlation Analysis (CCA)
585(16)
8.5.1 Quadratically Regularized Canonical Correlation Analysis
585(1)
8.5.2 Sparse Canonical Correlation Analysis
586(10)
8.5.2.1 Least Square Formulation of CCA
586(9)
8.5.2.2 CCA for Multiclass Classification
595(1)
8.5.3 Sparse Canonical Correlation Analysis via a Penalized Matrix Decomposition
596(5)
8.5.3.1 Sparse Singular Value Decomposition via Penalized Matrix Decomposition
596(3)
8.5.3.2 Sparse CCA via Direct Regularization Formulation
599(2)
8.6 Inverse Regression (IR) And Sufficient Dimension Reduction
601(10)
8.6.1 Sufficient Dimension Reduction (SDR) and Sliced Inverse Regression (SIR)
601(4)
8.6.2 Sparse SDR
605(6)
8.6.2.1 Coordinate Hypothesis
605(1)
8.6.2.2 Reformulation of SIR for SDR as an Optimization Problem
606(1)
8.6.2.3 Solve Sparse SDR by Alternating Direction Method of Multipliers
607(3)
8.6.2.4 Application to Real Data Examples
610(1)
Software Package
611(4)
Appendix 8A: Proximal Method For Parameter Estimation In Network-Penalized Two-Class Logistic Regression 615(6)
Appendix 8B: Equivalence Of Optimal Scoring And LDA 621(1)
Appendix 8C: A Distance From A Point To The Hyperplane 622(2)
Appendix 8D: Solving A Quadratically Regularized PCA Problem 624(2)
Appendix 8E: The Eckart-Young Theorem 626(4)
Appendix 8F: Poincaré Separation Theorem 630(2)
Appendix 8G: Regression For CCA 632(2)
Appendix 8H: Partition Of Global SDR For A Whole Genome Into A Number Of Small Regions 634(3)
Appendix 8I: Optimal Scoring And Alternating Direction Method Of Multipliers (ADMM) Algorithms 637(4)
Exercises 641(4)
References 645(10)
Index 655
Momiao Xiong is a professor in the Department of Biostatistics, University of Texas School of Public Health, and a regular member of the Genetics & Epigenetics (G&E) Graduate Program at The University of Texas MD Anderson Cancer Center, UTHealth Graduate School of Biomedical Science.