E-book: Genome Annotation [Taylor & Francis e-book]

Jung Soh (University of Calgary, Alberta, Canada), Paul M.K. Gordon (University of Calgary, Alberta, Canada), Christoph W. Sensen (University of Calgary, Alberta, Canada)
  • Taylor & Francis e-book
  • Price: 166,18 €*
  • * price that grants access for an unlimited number of simultaneous users for an unlimited time
  • Regular price: 237,40 €
  • You save 30%
The success of individualized medicine, advanced crops, and new and sustainable energy sources requires thoroughly annotated genomic information and the integration of this information into a coherent model. A thorough overview of this field, Genome Annotation explores automated genome analysis and annotation from its origins to the challenges of next-generation sequencing data analysis.

The book initially takes you through the last 16 years since the sequencing of the first complete microbial genome. It explains how current analysis strategies were developed, including sequencing strategies, statistical models, and early annotation systems. The authors then present visualization techniques for displaying integrated results as well as state-of-the-art annotation tools, including MAGPIE, Ensembl, Bluejay, and Galaxy. They also discuss the pipelines for the analysis and annotation of complex, next-generation DNA sequencing data. Each chapter includes references and pointers to relevant tools.

As very few existing genome annotation pipelines are capable of dealing with the staggering amount of DNA sequence information, new strategies must be developed to accommodate the needs of today's genome researchers. Covering this topic in detail, Genome Annotation provides you with the foundation and tools to tackle this challenging and evolving area. Suitable for both students new to the field and professionals who deal with genomic information in their work, the book offers two genome annotation systems as accompanying downloadable resources.
Preface
Authors
Contributor
Chapter 1 DNA Sequencing Strategies
1.1 The Evolution of DNA Sequencing Technologies
1.2 DNA Sequence Assembly Strategies
1.3 Next-Generation Sequencing
1.4 Sequencing Bias and Error Rates
References
Chapter 2 Coding Sequence Prediction
2.1 Introduction
2.2 Mapping Messenger RNA (mRNA)
2.3 Statistical Models
2.3.1 5' Untranslated Region
2.3.2 -35 Signal
2.3.3 B Recognition Element
2.3.4 TATA Box
2.3.5 Ribosomal Binding Site
2.3.6 Start Codon
2.3.7 Protein Coding Sequence
2.3.8 Donor Splice Site
2.3.9 Intron Sequence
2.3.10 Acceptor Splice Site
2.3.11 Stop Codon
2.3.12 3' Untranslated Region
2.3.13 Terminator
2.4 Cross-Species Methods
2.4.1 Nucleotide Homology
2.4.2 Protein Homology
2.4.3 Domain Homology
2.5 Combining Gene Predictions
2.6 Splice Variants
References
Chapter 3 Between the Genes
3.1 Introduction
3.2 Transcription Factors
3.2.1 Transcription Factor Binding Site (TFBS) Motifs
3.2.2 TFBS Location
3.2.3 TFBS Neighborhood
3.2.4 TFBS Conservation
3.3 RNA
3.3.1 Ribosomal RNA
3.3.2 Transfer RNA
3.3.3 Small Nucleolar RNA
3.3.4 Micro RNA
3.3.5 Other Types of RNAs
3.4 Pseudogenes
3.4.1 Transposable Elements
3.4.2 DNA Transposons
3.4.3 Retrotransposons
3.5 Other Repeats
References
Chapter 4 Genome-Associated Data
4.1 Introduction
4.2 Operons
4.3 Metagenomics
4.3.1 Population Statistics
4.3.2 Data Size
4.3.3 Phylogenetic Sorting
4.3.4 Assembly Quality
4.4 Individual Genomes
4.4.1 Epigenetics
4.4.1.1 DNA Methylation
4.4.1.2 Histone Modification
4.4.1.3 Nucleosome Positioning
4.4.2 Single Nucleotide Polymorphisms
4.4.2.1 Nomenclature
4.4.2.2 Effects
4.4.3 Insertions and Deletions
4.4.3.1 Nomenclature
4.4.3.2 Effects
4.4.4 Copy Number Variation
References
Chapter 5 Characterization of Gene Function through Bioinformatics: The Early Days
5.1 Overview
5.2 Stand-Alone Tools and Tools for the Early Internet
5.3 Packages
5.3.1 IBI/Pustell
5.3.2 PC/GENE
5.3.3 GCG
5.3.4 From EGCG to EMBOSS
5.3.5 The Staden Package
5.3.6 GeneSkipper
5.3.7 Sequencher
5.4 From FASTA Files to Annotated Genomes
5.4.1 ACeDB
5.4.2 One Genome Project, the Beginning of Three Genome Annotation Systems
5.5 Conclusion
References
Chapter 6 Visualization Techniques and Tools for Genomic Data
6.1 Introduction
6.2 Visualization of Sequencing Data
6.3 Visualization of Multiple Sequence Alignments
6.3.1 Pairwise Alignment Viewers
6.3.2 Multiple Alignment Viewers
6.4 Visualization of Hierarchical Structures
6.4.1 Tree Visualization Styles
6.4.2 Tree Visualization Tools
6.5 Visualization of Gene Expression Data
6.5.1 Expression Data Visualization Techniques
6.5.2 Visualization for Biological Interpretation
References
Chapter 7 Functional Annotation
7.1 Introduction
7.2 Biophysical and Biochemical Feature Prediction
7.2.1 Physical Chemistry Features
7.2.2 Sequence Motif Prediction
7.2.2.1 Protein Modification
7.2.2.2 Protein Localization
7.3 Protein Domains
7.4 Similarity Searches
7.4.1 Paralogs
7.4.2 Orthologs
7.4.3 Xenologs
7.4.4 Analogs
7.5 Pairwise Alignment Methods
7.5.1 Canonical Methods
7.5.2 Heuristic Methods
7.5.3 Scoring Matrices
7.6 Conclusion
References
Chapter 8 Automated Annotation Systems
8.1 Introduction
8.2 MAGPIE
8.2.1 Analysis Management
8.2.2 Structural Annotation
8.2.3 Functional Annotation
8.2.4 User Interface
8.3 Generic Model Organism Database
8.3.1 Analysis Management
8.3.2 Structural Annotation
8.3.3 Functional Annotation
8.3.4 User Interface
8.4 AGeS
8.4.1 Analysis Management
8.4.2 Structural Annotation
8.4.3 Functional Annotation
8.4.4 User Interface
8.5 Ensembl
8.5.1 Analysis Management
8.5.2 Structural Annotation
8.5.3 Functional Annotation
8.6 Summary
References
Chapter 9 Dynamic Annotation Systems: End-User-Driven Annotation and Visualization
9.1 Introduction
9.2 Web-Based Genome Annotation Browsers
9.2.1 University of California, Santa Cruz (UCSC) Genome Browser
9.2.2 Ensembl Genome Browser
9.2.3 NCBI Map Viewer
9.2.4 Generic Genome Browser
9.3 Stand-Alone Genome Annotation Browsers
9.3.1 Bluejay: An XML-Based Genome Visualization and Data Integration System
9.3.2 NCBI Genome Workbench
9.3.3 Integrated Genome Browser
9.3.4 Apollo
9.4 Comparative Visualization of Genomes
9.4.1 Dot Plots
9.4.2 Linear Representation
9.4.3 Circular Representation
References
Chapter 10 Web-Based Workflows
10.1 Introduction
10.2 Principles of Web-Based Workflows
10.2.1 Motivation
10.2.2 Early Workflow Environments
10.3 Galaxy
10.3.1 Interactive Analysis
10.3.2 Workflows
10.3.3 Component Repository
10.4 Taverna
10.4.1 The Design Interface
10.4.2 The Results Interface
10.4.3 Workflow Repository
10.5 Seahawk
10.5.1 Demonstration-to-Workflow
10.5.2 The Search Widget
10.5.3 Data Filters and Labels
10.5.4 Taverna Enactment of Seahawk-Generated Workflows
10.6 Conclusion
References
Chapter 11 Analysis Pipelines for Next-Generation Sequencing Data
11.1 Introduction
11.2 Genome Sequence Reconstruction
11.2.1 Alignment to the Reference Genome
11.2.2 De Novo Assembly
11.3 Analysis Pipelines: Case Studies
11.3.1 16S rRNA Analysis
11.3.2 Targeted EST Assembly
11.3.3 Gene Prediction
11.4 Next-Generation Genome Browsing
11.4.1 Integration of Different Types of Genomic Data
11.4.2 Decentralization
References
Index
Jung Soh is a research associate at the University of Calgary. He earned a Ph.D. in computer science from the University at Buffalo, The State University of New York, where he worked at the Center of Excellence for Document Analysis and Recognition (CEDAR). He also worked as a principal research scientist at the Electronics and Telecommunications Research Institute (ETRI) in Daejeon, Korea. His research interests are in bioinformatics, machine learning, and biomedical data visualization.

Paul M.K. Gordon is the bioinformatics support specialist for the Alberta Children's Hospital Research Institute at the University of Calgary. He has worked at the National Research Council of Canada's Institute for Information Technology (NRC-IIT) and Institute for Marine Biosciences (NRC-IMB). His current work focuses on developing bioinformatics techniques for personalized medicine.

Christoph W. Sensen is a professor of bioinformatics at the University of Calgary. He has previously worked as a research officer at the National Research Council of Canada's Institute for Marine Biosciences (NRC-IMB) and as a visiting scientist at the European Molecular Biology Laboratory (EMBL) in Heidelberg. His research interests are in genome research and bioinformatics.