Muutke küpsiste eelistusi

Multiple Biological Sequence Alignment: Scoring Functions, Algorithms and Evaluation [Kõva köide]

Series edited by (University of Western Australia), (Clayton State University, GA, USA), (Department of Computer Science, Georgia State University), (Oak Ridge National Lab, USA)
  • Formaat: Hardback, 256 pages, kõrgus x laius x paksus: 236x160x23 mm, kaal: 476 g
  • Sari: Wiley Series in Bioinformatics
  • Ilmumisaeg: 26-Aug-2016
  • Kirjastus: John Wiley & Sons Inc
  • ISBN-10: 1118229045
  • ISBN-13: 9781118229040
Teised raamatud teemal:
  • Formaat: Hardback, 256 pages, kõrgus x laius x paksus: 236x160x23 mm, kaal: 476 g
  • Sari: Wiley Series in Bioinformatics
  • Ilmumisaeg: 26-Aug-2016
  • Kirjastus: John Wiley & Sons Inc
  • ISBN-10: 1118229045
  • ISBN-13: 9781118229040
Teised raamatud teemal:
Covers the fundamentals and techniques of multiple biological sequence alignment and analysis, and shows readers how to choose the appropriate sequence analysis tools for their tasks

This book describes the traditional and modern approaches in biological sequence alignment and homology search.  This book contains 11 chapters, with Chapter 1 providing basic information on biological sequences. Next, Chapter 2 contains fundamentals in pair-wise sequence alignment, while Chapters 3 and 4 examine popular existing quantitative models and practical clustering techniques that have been used in multiple sequence alignment. Chapter 5 describes, characterizes and relates many multiple sequence alignment models. Chapter 6 describes how traditionally phylogenetic trees have been constructed, and available sequence knowledge bases can be used to improve the accuracy of reconstructing phylogeny trees. Chapter 7 covers the latest methods developed to improve the run-time efficiency of multiple sequence alignment.  Next, Chapter 8 covers several popular existing multiple sequence alignment server and services, and Chapter 9 examines several multiple sequence alignment techniques that have been developed to handle short sequences (reads) produced by the Next Generation Sequencing technique (NSG). Chapter 10 describes a Bioinformatics application using multiple sequence alignment of short reads or whole genomes as input. Lastly, Chapter 11 provides a review of RNA and protein secondary structure prediction using the evolution information inferred from multiple sequence alignments.

Covers the full spectrum of the field, from alignment algorithms to scoring methods, practical techniques, and alignment tools and their evaluations

Describes theories and developments of scoring functions and scoring matrices

Examines phylogeny estimation and large-scale homology search

Multiple Biological Sequence Alignment: Scoring Functions, Algorithms and Applications is a reference for researchers, engineers, graduate and post-graduate students in bioinformatics, and system biology and molecular biologists.

Ken Nguyen, PhD, is an associate professor at Clayton State University, GA, USA. He received his PhD, MSc and BSc degrees in computer science all from Georgia State University. His research interests are in databases, parallel and distribute computing and bioinformatics. He was a Molecular Basis of Disease fellow at Georgia State and is the recipient of the highest graduate honor at Georgia State, the William M. Suttles Graduate Fellowship.

Xuan Guo, PhD, is a postdoctoral associate at Oak Ridge National Lab, USA. He received his PhD degree in computer science from Georgia State University in 2015. His research interests are in bioinformatics, machine leaning, and cloud computing. He is an editorial assistant of International Journal of Bioinformatics Research and Applications.

Yi Pan, PhD, is a Regents' Professor of Computer Science and an Interim Associate Dean and Chair of Biology at Georgia State University. He received his BE and ME in computer engineering from Tsinghua University in China and his PhD in computer science from the University of Pittsburgh. Dr. Pan's research interests include parallel and distributed computing, optical networks, wireless networks and bioinformatics. He has published more than 180 journal papers with about 60 papers published in various IEEE/ACM journals. He is co-editor along with Albert Y. Zomaya of the Wiley Series in Bioinformatics.

Arvustused

"Covers the full spectrum of the field, from alignment algorithms to scoring methods, practical techniques, and alignment tools and their evaluations." (Zentralblatt MATH, 2016)

Preface xi
1 Introduction 1(10)
1.1 Motivation
2(1)
1.2 The Organization of this Book
2(1)
1.3 Sequence Fundamentals
3(8)
1.3.1 Protein
5(1)
1.3.2 DNA/RNA
6(1)
1.3.3 Sequence Formats
6(1)
1.3.4 Motifs
7(2)
1.3.5 Sequence Databases
9(2)
2 Protein/DNA/RNA Pairwise Sequence Alignment 11(14)
2.1 Sequence Alignment Fundamentals
12(1)
2.2 Dot-Plot Matrix
12(2)
2.3 Dynamic Programming
14(5)
2.3.1 Needleman-Wunsch's Algorithm
15(1)
2.3.2 Example
16(1)
2.3.3 Smith-Waterman's Algorithm
17(2)
2.3.4 Affine Gap Penalty
19(1)
2.4 Word Method
19(2)
2.4.1 Example
20(1)
2.5 Searching Sequence Databases
21(4)
2.5.1 FASTA
21(1)
2.5.2 BLAST
21(4)
3 Quantifying Sequence Alignments 25(34)
3.1 Evolution and Measuring Evolution
25(3)
3.1.1 Jukes and Cantor's Model
26(2)
3.1.2 Measuring Relatedness
28(1)
3.2 Substitution Matrices and Scoring Matrices
28(4)
3.2.1 Identity Scores
28(1)
3.2.2 Substitution/Mutation Scores
29(3)
3.3 GAPS
32(4)
3.3.1 Sequence Distances
35(1)
3.3.2 Example
35(1)
3.4 Scoring Multiple Sequence Alignments
36(2)
3.4.1 Sum-of-Pair Score
36(2)
3.5 Circular Sum Score
38(1)
3.6 Conservation Score Schemes
39(1)
3.6.1 Wu and Kabat's Method
39(1)
3.6.2 Jores's Method
39(1)
3.6.3 Lockless and Ranganathan's Method
40(1)
3.7 Diversity Scoring Schemes
40(2)
3.7.1 Background
41(1)
3.7.2 Methods
41(1)
3.8 Stereochemical Property Methods
42(2)
3.8.1 Valdar's Method
43(1)
3.9 Hierarchical Expected Matching Probability Scoring Metric (HEP)
44(15)
3.9.1 Building an AACCH Scoring Tree
44(2)
3.9.2 The Scoring Metric
46(1)
3.9.3 Proof of Scoring Metric Correctness
47(1)
3.9.4 Examples
48(1)
3.9.5 Scoring Metric and Sequence Weighting Factor
49(1)
3.9.6 Evaluation Data Sets
50(2)
3.9.7 Evaluation Results
52(7)
4 Sequence Clustering 59(10)
4.1 Unweighted Pair Group Method with Arithmetic Mean - UPGMA
60(1)
4.2 Neighborhood-Joining Method - NJ
61(4)
4.3 Overlapping Sequence Clustering
65(4)
5 Multiple Sequences Alignment Algorithms 69(34)
5.1 Dynamic Programming
70(1)
5.1.1 DCA
70(1)
5.2 Progressive Alignment
71(5)
5.2.1 Clustal Family
73(1)
5.2.2 PIMA: Pattern-Induced Multisequence Alignment
73(1)
5.2.3 PRIME: Profile-Based Randomized Iteration Method
74(1)
5.2.4 DlAlign
75(1)
5.3 Consistency and Probabilistic MSA
76(6)
5.3.1 POA: Partial Order Graph Alignment
76(1)
5.3.2 PSAlign
77(1)
5.3.3 ProbCons: Probabilistic Consistency-Based Multiple Sequence Alignment
78(1)
5.3.4 T-Coffee: Tree-Based Consistency Objective Function for Alignment Evaluation
79(1)
5.3.5 MSA Based on Fast Fourier Transform
80(1)
5.3.6 AVID
81(1)
5.3.7 Eulerian Path MSA
81(1)
5.4 Genetic Algorithms
82(3)
5.4.1 SAGA: Sequence Alignment by Genetic Algorithm
83(1)
5.4.2 GA and Self-Organizing Neural Networks
84(1)
5.4.3 FAlign
85(1)
5.5 New Development in Multiple Sequence Alignment Algorithms
85(12)
5.5.1 KB-MSA: Knowledge-Based Multiple Sequence Alignment
85(9)
5.5.2 PADT: Progressive Multiple Sequence Alignment Based on Dynamic Weighted Tree
94(3)
5.6 Test Data and Alignment Methods
97(1)
5.7 Results
98(5)
5.7.1 Measuring Alignment Quality
98(1)
5.7.2 RT-OSM Results
98(5)
6 Phylogeny in Multiple Sequence Alignments 103(10)
6.1 The Tree of Life
103(2)
6.2 Phylogeny Construction
105(7)
6.2.1 Distance Methods
106(1)
6.2.2 Character-Based Methods
107(2)
6.2.3 Maximum Likelihood Methods
109(1)
6.2.4 Bootstrapping
110(1)
6.2.5 Subtree Pruning and Re-grafting
111(1)
6.3 Inferring Phylogeny from Multiple Sequence Alignments
112(1)
7 Multiple Sequence Alignment on High-Performance Computing Models 113(20)
7.1 Parallel Systems
113(1)
7.1.1 Multiprocessor
113(1)
7.1.2 Vector
114(1)
7.1.3 GPU
114(1)
7.1.4 FPGA
114(1)
7.1.5 Reconfigurable Mesh
114(1)
7.2 Exiting Parallel Multiple Sequence Alignment
114(2)
7.3 Reconfigurable-Mesh Computing Models - (R-Mesh)
116(2)
7.4 Pairwise Dynamic Programming Algorithms
118(8)
7.4.1 R-Mesh Max Switches
118(1)
7.4.2 R-Mesh Adder/Subtractor
118(2)
7.4.3 Constant-Time Dynamic Programming on R-Mesh
120(3)
7.4.4 Affine Gap Cost
123(1)
7.4.5 R-Mesh On/Off Switches
124(1)
7.4.6 Dynamic Programming Backtracking on R-Mesh
125(1)
7.5 Progressive Multiple Sequence Alignment ON R-Mesh
126(7)
7.5.1 Hierarchical Clustering on R-Mesh
127(1)
7.5.2 Constant Run-Time Sum-of-Pair Scoring Method
128(1)
7.5.3 Parallel Progressive MSA Algorithm and Its Complexity Analysis
129(4)
8 Sequence Analysis Services 133(12)
8.1 EMBL-EBI: European Bioinformatics Institute
133(2)
8.2 NCBI: National Center for Biotechnology Information
135(1)
8.3 GenomeNet and Data Bank of Japan
136(1)
8.4 Other Sequence Analysis and Alignment Web Servers
137(1)
8.5 SeqAna: Multiple Sequence Alignment with Quality Ranking
138(2)
8.6 Pairwise Sequence Alignment and Other Analysis Tools
140(2)
8.7 Tool Evaluation
142(3)
9 Multiple Sequence for Next-Generation Sequences 145(16)
9.1 Introduction
145(2)
9.2 Overview of Next Generation Sequence Alignment Algorithms
147(7)
9.2.1 Alignment Algorithms Based on Seeding and Hash Tables
147(4)
9.2.2 Alignment Algorithms Based on Suffix Tries
151(3)
9.3 Next-Generation Sequencing Tools
154(7)
10 Multiple Sequence Alignment for Variations Detection 161(18)
10.1 Introduction
161(2)
10.2 Genetic Variants
163(2)
10.3 Variation Detection Methods Based on MSA
165(7)
10.4 Evaluation Methodology
172(4)
10.4.1 Performance Metrics
172(2)
10.4.2 Simulated Sequence Data
174(1)
10.4.3 Real Sequence Data
175(1)
10.5 Conclusion and Future Work
176(3)
11 Multiple Sequence Alignment for Structure Detection 179(20)
11.1 Introduction
179(1)
11.2 RNA Secondary Structure Prediction Based on MSA
180(9)
11.2.1 Common Information in Multiple Aligned RNA Sequences
182(1)
11.2.2 Review of RNA SS Prediction Methods
183(4)
11.2.3 Measures of Quality of RNA SS Prediction
187(2)
11.3 Protein Secondary Structure Prediction Based on MSA
189(7)
11.3.1 Review of Protein Secondary Structure Prediction Methods
190(5)
11.3.2 Measures of Quality of Protein SS Prediction
195(1)
11.4 Conclusion and Future Work
196(3)
References 199(20)
Index 219
Ken Nguyen, PhD, is an associate professor at Clayton State University, GA, USA. He received his PhD, MSc and BSc degrees in computer science all from Georgia State University. His research interests are in databases, parallel and distribute computing and bioinformatics. He was a Molecular Basis of Disease fellow at Georgia State and is the recipient of the highest graduate honor at Georgia State, the William M. Suttles Graduate Fellowship.

Xuan Guo, PhD, is a postdoctoral associate at Oak Ridge National Lab, USA. He received his PhD degree in computer science from Georgia State University in 2015. His research interests are in bioinformatics, machine leaning, and cloud computing. He is an editorial assistant of International Journal of Bioinformatics Research and Applications.

Yi Pan, PhD, is a Regents' Professor of Computer Science and an Interim Associate Dean and Chair of Biology at Georgia State University. He received his BE and ME in computer engineering from Tsinghua University in China and his PhD in computer science from the University of Pittsburgh. Dr. Pan's research interests include parallel and distributed computing, optical networks, wireless networks and bioinformatics. He has published more than 180 journal papers with about 60 papers published in various IEEE/ACM journals. He is co-editor along with Albert Y. Zomaya of the Wiley Series in Bioinformatics.