E-book: Genomics in the Azure Cloud

Colby T. Ford

Format: 330 pages
Publication date: 14-Nov-2022
Publisher: O'Reilly Media
Language: English
ISBN-13: 9781098139018

Format - PDF+DRM
Price: 56,15 €*
* The price is final, i.e. no further discounts apply.
This e-book is for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage: Digital rights management (DRM)

The publisher has issued this e-book in encrypted form, which means you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; not to be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

This practical guide bridges the gap between general cloud computing architecture in Microsoft Azure and scientific computing for bioinformatics and genomics. You'll get a solid understanding of the architecture patterns and services that are offered in Azure and how they might be used in your bioinformatics practice. You'll get code examples that you can reuse for your specific needs. And you'll get plenty of concrete examples to illustrate how a given service is used in a bioinformatics context.

You'll also get valuable advice on how to:

Use enterprise platform services to easily scale your bioinformatics workloads
Organize, query, and analyze genomic data at scale
Build a genomics data lake and accompanying data warehouse
Use Azure Machine Learning to scale your model training, track model performance, and deploy winning models
Orchestrate and automate processing pipelines using Azure Data Factory and Databricks
Cloudify your organization's existing bioinformatics pipelines by moving your workflows to Azure high-performance compute services
And more

Preface

1 Essentials of Cloud Architecture
Cloud Horsepower
Considerations for the Cloud
Three Benefits of the Cloud
Types of Cloud Services
Infrastructure Services
Platform Services
Software Services
Azure Environment Organization
Getting an Azure Account
Welcome to the Azure Portal
Setting Up a Resource Group
Creating Resources
Free Services
Basics of the Bioinformatics Workflow
Primary Analysis
Secondary Analysis
Tertiary Analysis
Other Analyses
Other File Formats

2 Organizing Genomics Data with Data Lakes
Organizing Your Genomics Data
Going for Bronze, Silver, and Gold
Letting Your Bioinformatics Workflow Dictate Your Data Lake Organization
Planning for -omics and Non-omics Data Together
Creating a Data Lake with Azure Storage
Blob Storage Versus Data Lake Storage
Balancing Costs Versus Performance in Data Storage
The Goldilocks Method of Storage Tiers
Genomics Data Lifecycle
Managing Access Inside the Lake
Role-Based Access Control
Access-Control Lists
Azure Open Datasets for Genomics

3 Querying Variant Data in SQL
Building a Genomics Data Warehouse
Example: Lab Results
Data Warehouse Architecture for Genomics
Azure Synapse Analytics
Creating an Azure Synapse Analytics Workspace
Registering Services in Subscriptions
Getting to Work in the Synapse Workspace
Using Open Row Sets
Creating External Tables
Did Someone Say "Pool Party"?
Connecting to More Data Sources
Azure SQL DB
Creating a Database in Azure SQL DB
Relaxing at Your Genomics Data Lakehouse
Efficient File Formats

4 Orchestrating Data Movement and Transformation
Creating Your Data Factory
Getting Started with Data Movement
Getting Data into Your Data Lake Using the Copy Data Tool
Linking to NCBI's FTP Server
Transforming Data Using Data Flows
Building and Triggering Pipelines for Automation

5 Azure Databricks (and Apache Spark)
Introduction to Apache Spark and Databricks
Setting Up an Azure Databricks Workspace
Connecting Databricks to Your Data Lake
Processing Variant Data with the Glow Package
Exploring DataFrames
Automating Variant Data Processing
Orchestrating a Databricks Notebook from Data Factory
A Brief Interlude About Distributed File Formats
Using Other Tools in Databricks
Single-Node Bioinformatics Tools
Koalas
Hail

6 Azure Machine Learning
How to Scale Machine Learning Tasks
Creating an Azure Machine Learning Workspace
Training a Drug Sensitivity Model
Creating a Compute Instance in Azure Machine Learning Studio
Datastores and Datasets
Experimenting with Cluster-Based Training
Automating Model Training with AutoML
Explainable Machine Learning
Using Azure Machine Learning Not for Machine Learning
Performing Alignment in a Notebook
Custom Docker Images for Bioinformatics

7 High-Performance Computing and Other Compute Services
Bring Your Own Pipeline (BYOP)
Why Azure for HPC?
Azure Batch
Scaling Workloads with Cromwell
Azure CycleCloud
Setting Up CycleCloud Clusters
Microsoft Genomics
Alignment and Variant Calling with the msgen Package

8 Deployment, Security, Compliance, and Potpourri
Automating the Deployment of Cloud Resources
Dev, Staging, and Prod
Lifting Your Deployment with ARMs and Biceps
Security Planning
Azure Active Directory
Role-Based Access Controls and Access-Control Lists
Compliance
HIPAA, HITECH, and HITRUST
Azure Blueprints
Cost Considerations
Azure Pricing Calculator
Retail Pricing Versus Enterprise Agreements
Budgeting Examples
Quota Problems
Please, Sir, Can I Have Some More (vCPUs)?
Getting General Support

Conclusion
Index

Dr. Colby T. Ford is a professional AI cloud architect, data scientist, and computational biologist who uses machine learning and distributed computing to solve problems in the fields of infectious diseases and human genomics. For the last 8+ years, he has been consulting for companies across industries, leading the conversation for digital transformation using artificial intelligence and cloud computing. He currently serves as the Principal of Life Sciences at BlueGranite, a top-tier Microsoft partner, and focuses on building cloud-based bioinformatics solutions in the Azure cloud. In academia, his research includes the use of large-scale machine learning architecture in the study of infectious disease genomics and rare human diseases. In addition to his consulting and academic career, Dr. Ford is a co-founder of a digital health startup that focuses on the use of wearable devices to help study neurological disorders. Given Dr. Ford's interdisciplinary education background and parallel experience in industry and academia, he has a unique viewpoint and approach to effectively solve genomics research problems with cutting-edge technologies previously only used in industry blended with methods previously only seen in academia.

Permalink: https://www.kriso.ee/db/97810981390182e.html
