Preface |
|
xi | |
1 Introduction |
|
1 | (10) |
|
|
2 | (1) |
|
1.2 The Organization of this Book |
|
|
2 | (1) |
|
1.3 Sequence Fundamentals |
|
|
3 | (8) |
|
|
5 | (1) |
|
|
6 | (1) |
|
|
6 | (1) |
|
|
7 | (2) |
|
|
9 | (2) |
2 Protein/DNA/RNA Pairwise Sequence Alignment |
|
11 | (14) |
|
2.1 Sequence Alignment Fundamentals |
|
|
12 | (1) |
|
|
12 | (2) |
|
|
14 | (5) |
|
2.3.1 Needleman-Wunsch's Algorithm |
|
|
15 | (1) |
|
|
16 | (1) |
|
2.3.3 Smith-Waterman's Algorithm |
|
|
17 | (2) |
|
|
19 | (1) |
|
|
19 | (2) |
|
|
20 | (1) |
|
2.5 Searching Sequence Databases |
|
|
21 | (4) |
|
|
21 | (1) |
|
|
21 | (4) |
3 Quantifying Sequence Alignments |
|
25 | (34) |
|
3.1 Evolution and Measuring Evolution |
|
|
25 | (3) |
|
3.1.1 Jukes and Cantor's Model |
|
|
26 | (2) |
|
3.1.2 Measuring Relatedness |
|
|
28 | (1) |
|
3.2 Substitution Matrices and Scoring Matrices |
|
|
28 | (4) |
|
|
28 | (1) |
|
3.2.2 Substitution/Mutation Scores |
|
|
29 | (3) |
|
|
32 | (4) |
|
|
35 | (1) |
|
|
35 | (1) |
|
3.4 Scoring Multiple Sequence Alignments |
|
|
36 | (2) |
|
|
36 | (2) |
|
|
38 | (1) |
|
3.6 Conservation Score Schemes |
|
|
39 | (1) |
|
3.6.1 Wu and Kabat's Method |
|
|
39 | (1) |
|
|
39 | (1) |
|
3.6.3 Lockless and Ranganathan's Method |
|
|
40 | (1) |
|
3.7 Diversity Scoring Schemes |
|
|
40 | (2) |
|
|
41 | (1) |
|
|
41 | (1) |
|
3.8 Stereochemical Property Methods |
|
|
42 | (2) |
|
|
43 | (1) |
|
3.9 Hierarchical Expected Matching Probability Scoring Metric (HEP) |
|
|
44 | (15) |
|
3.9.1 Building an AACCH Scoring Tree |
|
|
44 | (2) |
|
|
46 | (1) |
|
3.9.3 Proof of Scoring Metric Correctness |
|
|
47 | (1) |
|
|
48 | (1) |
|
3.9.5 Scoring Metric and Sequence Weighting Factor |
|
|
49 | (1) |
|
3.9.6 Evaluation Data Sets |
|
|
50 | (2) |
|
|
52 | (7) |
4 Sequence Clustering |
|
59 | (10) |
|
4.1 Unweighted Pair Group Method with Arithmetic Mean - UPGMA |
|
|
60 | (1) |
|
4.2 Neighborhood-Joining Method - NJ |
|
|
61 | (4) |
|
4.3 Overlapping Sequence Clustering |
|
|
65 | (4) |
5 Multiple Sequences Alignment Algorithms |
|
69 | (34) |
|
|
70 | (1) |
|
|
70 | (1) |
|
5.2 Progressive Alignment |
|
|
71 | (5) |
|
|
73 | (1) |
|
5.2.2 PIMA: Pattern-Induced Multisequence Alignment |
|
|
73 | (1) |
|
5.2.3 PRIME: Profile-Based Randomized Iteration Method |
|
|
74 | (1) |
|
|
75 | (1) |
|
5.3 Consistency and Probabilistic MSA |
|
|
76 | (6) |
|
5.3.1 POA: Partial Order Graph Alignment |
|
|
76 | (1) |
|
|
77 | (1) |
|
5.3.3 ProbCons: Probabilistic Consistency-Based Multiple Sequence Alignment |
|
|
78 | (1) |
|
5.3.4 T-Coffee: Tree-Based Consistency Objective Function for Alignment Evaluation |
|
|
79 | (1) |
|
5.3.5 MSA Based on Fast Fourier Transform |
|
|
80 | (1) |
|
|
81 | (1) |
|
|
81 | (1) |
|
|
82 | (3) |
|
5.4.1 SAGA: Sequence Alignment by Genetic Algorithm |
|
|
83 | (1) |
|
5.4.2 GA and Self-Organizing Neural Networks |
|
|
84 | (1) |
|
|
85 | (1) |
|
5.5 New Development in Multiple Sequence Alignment Algorithms |
|
|
85 | (12) |
|
5.5.1 KB-MSA: Knowledge-Based Multiple Sequence Alignment |
|
|
85 | (9) |
|
5.5.2 PADT: Progressive Multiple Sequence Alignment Based on Dynamic Weighted Tree |
|
|
94 | (3) |
|
5.6 Test Data and Alignment Methods |
|
|
97 | (1) |
|
|
98 | (5) |
|
5.7.1 Measuring Alignment Quality |
|
|
98 | (1) |
|
|
98 | (5) |
6 Phylogeny in Multiple Sequence Alignments |
|
103 | (10) |
|
|
103 | (2) |
|
6.2 Phylogeny Construction |
|
|
105 | (7) |
|
|
106 | (1) |
|
6.2.2 Character-Based Methods |
|
|
107 | (2) |
|
6.2.3 Maximum Likelihood Methods |
|
|
109 | (1) |
|
|
110 | (1) |
|
6.2.5 Subtree Pruning and Re-grafting |
|
|
111 | (1) |
|
6.3 Inferring Phylogeny from Multiple Sequence Alignments |
|
|
112 | (1) |
7 Multiple Sequence Alignment on High-Performance Computing Models |
|
113 | (20) |
|
|
113 | (1) |
|
|
113 | (1) |
|
|
114 | (1) |
|
|
114 | (1) |
|
|
114 | (1) |
|
7.1.5 Reconfigurable Mesh |
|
|
114 | (1) |
|
7.2 Exiting Parallel Multiple Sequence Alignment |
|
|
114 | (2) |
|
7.3 Reconfigurable-Mesh Computing Models - (R-Mesh) |
|
|
116 | (2) |
|
7.4 Pairwise Dynamic Programming Algorithms |
|
|
118 | (8) |
|
7.4.1 R-Mesh Max Switches |
|
|
118 | (1) |
|
7.4.2 R-Mesh Adder/Subtractor |
|
|
118 | (2) |
|
7.4.3 Constant-Time Dynamic Programming on R-Mesh |
|
|
120 | (3) |
|
|
123 | (1) |
|
7.4.5 R-Mesh On/Off Switches |
|
|
124 | (1) |
|
7.4.6 Dynamic Programming Backtracking on R-Mesh |
|
|
125 | (1) |
|
7.5 Progressive Multiple Sequence Alignment ON R-Mesh |
|
|
126 | (7) |
|
7.5.1 Hierarchical Clustering on R-Mesh |
|
|
127 | (1) |
|
7.5.2 Constant Run-Time Sum-of-Pair Scoring Method |
|
|
128 | (1) |
|
7.5.3 Parallel Progressive MSA Algorithm and Its Complexity Analysis |
|
|
129 | (4) |
8 Sequence Analysis Services |
|
133 | (12) |
|
8.1 EMBL-EBI: European Bioinformatics Institute |
|
|
133 | (2) |
|
8.2 NCBI: National Center for Biotechnology Information |
|
|
135 | (1) |
|
8.3 GenomeNet and Data Bank of Japan |
|
|
136 | (1) |
|
8.4 Other Sequence Analysis and Alignment Web Servers |
|
|
137 | (1) |
|
8.5 SeqAna: Multiple Sequence Alignment with Quality Ranking |
|
|
138 | (2) |
|
8.6 Pairwise Sequence Alignment and Other Analysis Tools |
|
|
140 | (2) |
|
|
142 | (3) |
9 Multiple Sequence for Next-Generation Sequences |
|
145 | (16) |
|
|
145 | (2) |
|
9.2 Overview of Next Generation Sequence Alignment Algorithms |
|
|
147 | (7) |
|
9.2.1 Alignment Algorithms Based on Seeding and Hash Tables |
|
|
147 | (4) |
|
9.2.2 Alignment Algorithms Based on Suffix Tries |
|
|
151 | (3) |
|
9.3 Next-Generation Sequencing Tools |
|
|
154 | (7) |
10 Multiple Sequence Alignment for Variations Detection |
|
161 | (18) |
|
|
161 | (2) |
|
|
163 | (2) |
|
10.3 Variation Detection Methods Based on MSA |
|
|
165 | (7) |
|
10.4 Evaluation Methodology |
|
|
172 | (4) |
|
10.4.1 Performance Metrics |
|
|
172 | (2) |
|
10.4.2 Simulated Sequence Data |
|
|
174 | (1) |
|
10.4.3 Real Sequence Data |
|
|
175 | (1) |
|
10.5 Conclusion and Future Work |
|
|
176 | (3) |
11 Multiple Sequence Alignment for Structure Detection |
|
179 | (20) |
|
|
179 | (1) |
|
11.2 RNA Secondary Structure Prediction Based on MSA |
|
|
180 | (9) |
|
11.2.1 Common Information in Multiple Aligned RNA Sequences |
|
|
182 | (1) |
|
11.2.2 Review of RNA SS Prediction Methods |
|
|
183 | (4) |
|
11.2.3 Measures of Quality of RNA SS Prediction |
|
|
187 | (2) |
|
11.3 Protein Secondary Structure Prediction Based on MSA |
|
|
189 | (7) |
|
11.3.1 Review of Protein Secondary Structure Prediction Methods |
|
|
190 | (5) |
|
11.3.2 Measures of Quality of Protein SS Prediction |
|
|
195 | (1) |
|
11.4 Conclusion and Future Work |
|
|
196 | (3) |
References |
|
199 | (20) |
Index |
|
219 | |