
E-book: RNA-seq Data Analysis: A Practical Approach [Taylor & Francis e-book]

(CSC - IT Center for Science, Espoo, Finland), (RS-koulutus, Helsinki, Finland), (University of Eastern Finland, Kuopio), (SciLifeLab, Stockholm University, Sweden), (University of Helsinki, Finland)
  • Taylor & Francis e-book
  • Price: 106,17 €*
  • * price that guarantees access for an unlimited number of simultaneous users for an unlimited time
  • Regular price: 151,67 €
  • You save 30%
"RNA-seq offers unprecedented information about transcriptome, but harnessing this information with bioinformatics tools is typically a bottleneck. This self-contained guide enables researchers to examine differential expression at gene, exon, and transcript level and to discover novel genes, transcripts, and whole transcriptomes. Each chapter starts with theoretical background, followed by descriptions of relevant analysis tools. The book also provides examples using command line tools and the R statistical environment. For non-programming scientists, the same examples are covered using open source software with a graphical user interface."



The State of the Art in Transcriptome Analysis
RNA sequencing (RNA-seq) data offers unprecedented information about the transcriptome, but harnessing this information with bioinformatics tools is typically a bottleneck. RNA-seq Data Analysis: A Practical Approach enables researchers to examine differential expression at gene, exon, and transcript levels and to discover novel genes, transcripts, and whole transcriptomes.

Balanced Coverage of Theory and Practice
Each chapter starts with theoretical background, followed by descriptions of relevant analysis tools and practical examples. Accessible to both bioinformaticians and nonprogramming wet lab scientists, the examples illustrate the use of command-line tools, R, and other open source tools, such as the graphical Chipster software.

The Tools and Methods to Get Started in Your Lab
Taking readers through the whole data analysis workflow, this self-contained guide provides a detailed overview of the main RNA-seq data analysis methods and explains how to use them in practice. It is suitable for researchers from a wide variety of backgrounds, including biology, medicine, genetics, and computer science. The book can also be used in a graduate or advanced undergraduate course.

Preface
Acknowledgments
Authors
Chapter 1 Introduction to RNA-seq
1.1 Introduction
1.2 Isolation of RNAs
1.3 Quality Control of RNA
1.4 Library Preparation
1.5 Major RNA-seq Platforms
1.5.1 Illumina
1.5.2 SOLiD
1.5.3 Roche 454
1.5.4 Ion Torrent
1.5.5 Pacific Biosciences
1.5.6 Nanopore Technologies
1.6 RNA-seq Applications
1.6.1 Protein Coding Gene Structure
1.6.2 Novel Protein-Coding Genes
1.6.3 Quantifying and Comparing Gene Expression
1.6.4 Expression Quantitative Trait Loci (eQTL)
1.6.5 Single-Cell RNA-seq
1.6.6 Fusion Genes
1.6.7 Gene Variations
1.6.8 Long Noncoding RNAs
1.6.9 Small Noncoding RNAs (miRNA-seq)
1.6.10 Amplification Product Sequencing (Ampli-seq)
1.7 Choosing an RNA-seq Platform
1.7.1 Eight General Principles for Choosing an RNA-seq Platform and Mode of Sequencing
1.7.1.1 Accuracy: How Accurate Must the Sequencing Be?
1.7.1.2 Reads: How Many Do I Need?
1.7.1.3 Length: How Long Must the Reads Be?
1.7.1.4 SR or PE: Single Read or Paired End?
1.7.1.5 RNA or DNA: Am I Sequencing RNA or DNA?
1.7.1.6 Material: How Much Sample Material Do I Have?
1.7.1.7 Costs: How Much Can I Spend?
1.7.1.8 Time: When Does the Work Need to Be Completed?
1.7.2 Summary
References
Chapter 2 Introduction to RNA-seq Data Analysis
2.1 Introduction
2.2 Differential Expression Analysis Workflow
2.2.1 Step 1: Quality Control of Reads
2.2.2 Step 2: Preprocessing of Reads
2.2.3 Step 3: Aligning Reads to a Reference Genome
2.2.4 Step 4: Genome-Guided Transcriptome Assembly
2.2.5 Step 5: Calculating Expression Levels
2.2.6 Step 6: Comparing Gene Expression between Conditions
2.2.7 Step 7: Visualization of Data in Genomic Context
2.3 Downstream Analysis
2.3.1 Gene Annotation
2.3.2 Gene Set Enrichment Analysis
2.4 Automated Workflows and Pipelines
2.5 Hardware Requirements
2.6 Following the Examples in the Book
2.6.1 Using Command Line Tools and R
2.6.2 Using the Chipster Software
2.6.3 Example Data Sets
2.7 Summary
References
Chapter 3 Quality Control and Preprocessing
3.1 Introduction
3.2 Software for Quality Control and Preprocessing
3.2.1 FastQC
3.2.2 PRINSEQ
3.2.3 Trimmomatic
3.3 Read Quality Issues
3.3.1 Base Quality
3.3.1.1 Filtering
3.3.1.2 Trimming
3.3.2 Ambiguous Bases
3.3.3 Adapters
3.3.4 Read Length
3.3.5 Sequence-Specific Bias and Mismatches Caused by Random Hexamer Priming
3.3.6 GC Content
3.3.7 Duplicates
3.3.8 Sequence Contamination
3.3.9 Low-Complexity Sequences and PolyA Tails
3.4 Summary
References
Chapter 4 Aligning Reads to Reference
4.1 Introduction
4.2 Alignment Programs
4.2.1 Bowtie
4.2.2 TopHat
4.2.3 STAR
4.3 Alignment Statistics and Utilities for Manipulating Alignment Files
4.4 Visualizing Reads in Genomic Context
4.5 Summary
References
Chapter 5 Transcriptome Assembly
5.1 Introduction
5.2 Methods
5.2.1 Transcriptome Assembly Is Different from Genome Assembly
5.2.2 Complexity of Transcript Reconstruction
5.2.3 Assembly Process
5.2.4 de Bruijn Graph
5.2.5 Use of Abundance Information
5.3 Data Preprocessing
5.3.1 Read Error Correction
5.3.2 SEECER
5.4 Mapping-Based Assembly
5.4.1 Cufflinks
5.4.2 Scripture
5.5 De Novo Assembly
5.5.1 Velvet + Oases
5.5.2 Trinity
5.6 Summary
References
Chapter 6 Quantitation and Annotation-Based Quality Control
6.1 Introduction
6.2 Annotation-Based Quality Metrics
6.2.1 Tools for Annotation-Based Quality Control
6.3 Quantitation of Gene Expression
6.3.1 Counting Reads per Genes
6.3.1.1 HTSeq
6.3.2 Counting Reads per Transcripts
6.3.2.1 Cufflinks
6.3.2.2 eXpress
6.3.3 Counting Reads per Exons
6.4 Summary
References
Chapter 7 RNA-seq Analysis Framework in R and Bioconductor
7.1 Introduction
7.1.1 Installing R and Add-on Packages
7.1.2 Using R
7.2 Overview of the Bioconductor Packages
7.2.1 Software Packages
7.2.2 Annotation Packages
7.2.3 Experiment Packages
7.3 Descriptive Features of the Bioconductor Packages
7.3.1 OOP Features in R
7.4 Representing Genes and Transcripts in R
7.5 Representing Genomes in R
7.6 Representing SNPs in R
7.7 Forging New Annotation Packages
7.8 Summary
References
Chapter 8 Differential Expression Analysis
8.1 Introduction
8.2 Technical vs. Biological Replicates
8.3 Statistical Distributions in RNA-seq Data
8.3.1 Biological Replication, Count Distributions, and Choice of Software
8.4 Normalization
8.5 Software Usage Examples
8.5.1 Using Cuffdiff
8.5.2 Using Bioconductor Packages: DESeq, edgeR, limma
8.5.3 Linear Models, the Design Matrix, and the Contrast Matrix
8.5.3.1 Design Matrix
8.5.3.2 Contrast Matrix
8.5.4 Preparations Ahead of Differential Expression Analysis
8.5.4.1 Starting from BAM Files
8.5.4.2 Starting from Individual Count Files
8.5.4.3 Starting from an Existing Count Table
8.5.4.4 Independent Filtering
8.5.5 Code Example for DESeq(2)
8.5.6 Visualization
8.5.7 For Reference: Code Examples for Other Bioconductor Packages
8.5.8 Limma
8.5.9 SAMSeq (samr package)
8.5.10 edgeR
8.5.11 DESeq2 Code Example for a Multifactorial Experiment
8.5.12 For Reference: edgeR Code Example
8.5.13 Limma Code Example
8.6 Summary
References
Chapter 9 Analysis of Differential Exon Usage
9.1 Introduction
9.2 Preparing the Input Files for DEXSeq
9.3 Reading Data in to R
9.4 Accessing the ExonCountSet Object
9.5 Normalization and Estimation of the Variance
9.6 Test for Differential Exon Usage
9.7 Visualization
9.8 Summary
References
Chapter 10 Annotating the Results
10.1 Introduction
10.2 Retrieving Additional Annotations
10.2.1 Using an Organism-Specific Annotation Package to Retrieve Annotations for Genes
10.2.2 Using BioMart to Retrieve Annotations for Genes
10.3 Using Annotations for Ontological Analysis of Gene Sets
10.4 Gene Set Analysis in More Detail
10.4.1 Competitive Method Using GOstats Package
10.4.2 Self-Contained Method Using Globaltest Package
10.4.3 Length Bias Corrected Method
10.5 Summary
References
Chapter 11 Visualization
11.1 Introduction
11.1.1 Image File Types
11.1.2 Image Resolution
11.1.3 Color Models
11.2 Graphics in R
11.2.1 Heatmap
11.2.2 Volcano Plot
11.2.3 MA Plot
11.2.4 Idiogram
11.2.5 Visualizing Gene and Transcript Structures
11.3 Finalizing the Plots
11.4 Summary
References
Chapter 12 Small Noncoding RNAs
12.1 Introduction
12.2 MicroRNAs (miRNAs)
12.3 MicroRNA Off-Set RNAs (moRNAs)
12.4 Piwi-Associated RNAs (piRNAs)
12.5 Endogenous Silencing RNAs (endo-siRNAs)
12.6 Exogenous Silencing RNAs (exo-siRNAs)
12.7 Transfer RNAs (tRNAs)
12.8 Small Nucleolar RNAs (snoRNAs)
12.9 Small Nuclear RNAs (snRNAs)
12.10 Enhancer-Derived RNAs (eRNA)
12.11 Other Small Noncoding RNAs
12.12 Sequencing Methods for Discovery of Small Noncoding RNAs
12.12.1 microRNA-seq
12.12.2 CLIP-seq
12.12.3 Degradome-seq
12.12.4 Global Run-On Sequencing (GRO-seq)
12.13 Summary
References
Chapter 13 Computational Analysis of Small Noncoding RNA Sequencing Data
13.1 Introduction
13.2 Discovery of Small RNAs: miRDeep2
13.2.1 GFF Files
13.2.2 FASTA Files of Known miRNAs
13.2.3 Setting up the Run Environment
13.2.4 Running miRDeep2
13.2.4.1 miRDeep2 Output
13.3 miRanalyzer
13.3.1 Running miRanalyzer
13.4 miRNA Target Analysis
13.4.1 Computational Prediction Methods
13.4.2 Artificial Intelligence Methods
13.4.3 Experimental Support-Based Methods
13.5 miRNA-seq and mRNA-seq Data Integration
13.6 Small RNA Databases and Resources
13.6.1 RNA-seq Reads of miRNAs in miRBase
13.6.2 Expression Atlas of miRNAs
13.6.3 Database for CLIP-seq and Degradome-seq Data
13.6.4 Databases for miRNAs and Disease
13.6.5 General Databases for the Research Community and Resources
13.6.6 miRNAblog
13.7 Summary
References
Index
Eija Korpelainen, Jarno Tuimala, Panu Somervuo, Mikael Huss, Garry Wong