E-book: Accelerating Discovery: Mining Unstructured Information for Hypothesis Generation

(IBM Research, San Jose, California, USA)
  • Format: PDF+DRM
  • Price: 58,49 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you must install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Unstructured Mining Approaches to Solve Complex Scientific Problems

As the volume of scientific data and literature increases exponentially, scientists need more powerful tools and methods to process and synthesize information and to formulate new hypotheses that are most likely to be both true and important. Accelerating Discovery: Mining Unstructured Information for Hypothesis Generation describes a novel approach to scientific research that uses unstructured data analysis as a generative tool for new hypotheses.

The author develops a systematic process for leveraging heterogeneous structured and unstructured data sources, data mining, and computational architectures to make the discovery process faster and more effective. This process accelerates human creativity by allowing scientists and inventors to more readily analyze and comprehend the space of possibilities, compare alternatives, and discover entirely new approaches.

Encompassing systematic and practical perspectives, the book provides the necessary motivation and strategies as well as a heterogeneous set of comprehensive, illustrative examples. It reveals the importance of heterogeneous data analytics in aiding scientific discoveries and furthers data science as a discipline.

Preface xvii
Acknowledgments xxi
Chapter 1 Introduction
1(8)
Chapter 2 Why Accelerate Discovery?
9(24)
Scott Spangler
Ying Chen
The Problem Of Synthesis
11(1)
The Problem Of Formulation
11(2)
What Would Darwin Do?
13(1)
The Potential For Accelerated Discovery: Using Computers To Map The Knowledge Space
14(1)
Why Accelerate Discovery: The Business Perspective
15(1)
Computational Tools That Enable Accelerated Discovery
16(4)
Search
16(1)
Business Intelligence And Data Warehousing
17(1)
Massive Parallelization
17(1)
Unstructured Information Mining
17(1)
Natural Language Processing
17(1)
Machine Learning
18(1)
Collaborative Filtering/Matrix Factorization
18(1)
Modeling And Simulation
18(1)
Service-Oriented Architectures
19(1)
Ontological Representation Schemes
19(1)
DeepQA
19(1)
Reasoning Under Uncertainty
20(1)
Accelerated Discovery From A System Perspective
20(4)
Content Curator
21(1)
Domain-Pedia
21(2)
Annotators
23(1)
Normalizers
23(1)
BigInsights Framework
23(1)
Query Services
23(1)
Analytics Services
23(1)
User Interface
23(1)
Catalogue
24(1)
Accelerated Discovery From A Data Perspective
24(4)
Initial Domain Content And Knowledge Collection
24(2)
Content Comprehension And Semantic Knowledge Extraction
26(1)
Complex And High-Level Knowledge Composition And Representation
26(1)
New Hypothesis And Discovery Creation
27(1)
Accelerated Discovery In The Organization
28(1)
Challenge (And Opportunity) Of Accelerated Discovery
29(1)
References
30(3)
Chapter 3 Form And Function
33(8)
The Process Of Accelerated Discovery
34(6)
Conclusion
40(1)
Reference
40(1)
Chapter 4 Exploring Content To Find Entities
41(20)
Searching For Relevant Content
42(1)
How Much Data Is Enough? What Is Too Much?
42(1)
How Computers Read Documents
43(1)
Extracting Features
43(3)
Editing The Feature Space
46(1)
Feature Spaces: Documents As Vectors
47(1)
Clustering
48(2)
Domain Concept Refinement
50(1)
Category Level
50(1)
Document Level
51(1)
Modeling Approaches
51(3)
Classification Approaches
52(1)
Centroid
52(1)
Decision Tree
52(1)
Naive Bayes
52(1)
Numeric Features
52(1)
Binary Features
53(1)
Rule Based
53(1)
Statistical
53(1)
Dictionaries And Normalization
54(1)
Cohesion And Distinctness
54(2)
Cohesion
55(1)
Distinctness
56(1)
Single And Multimembership Taxonomies
56(1)
Subclassing Areas Of Interest
57(1)
Generating New Queries To Find Additional Relevant Content
57(1)
Validation
58(1)
Summary
58(1)
References
58(3)
Chapter 5 Organization
61(10)
Domain-Specific Ontologies And Dictionaries
61(1)
Similarity Trees
62(3)
Using Similarity Trees To Interact With Domain Experts
65(1)
Scatter-Plot Visualizations
65(2)
Using Scatter Plots To Find Overlaps Between Nearby Entities Of Different Types
67(2)
Discovery Through Visualization Of Type Space
69(1)
References
69(2)
Chapter 6 Relationships
71(10)
What Do Relationships Look Like?
71(1)
How Can We Detect Relationships?
72(1)
Regular Expression Patterns For Extracting Relationships
72(1)
Natural Language Parsing
73(1)
Complex Relationships
74(1)
Example: p53 Phosphorylation Events
74(1)
Putting It All Together
75(1)
Example: Drug/Target/Disease Relationship Networks
75(4)
Conclusion
79(2)
Chapter 7 Inference
81(10)
Co-Occurrence Tables
81(2)
Co-Occurrence Networks
83(1)
Relationship Summarization Graphs
83(1)
Homogeneous Relationship Networks
83(3)
Heterogeneous Relationship Networks
86(1)
Network-Based Reasoning Approaches
86(1)
Graph Diffusion
87(1)
Matrix Factorization
87(1)
Conclusion
88(1)
References
89(2)
Chapter 8 Taxonomies
91(12)
Taxonomy Generation Methods
91(1)
Snippets
92(1)
Text Clustering
92(2)
Time-Based Taxonomies
94(1)
Partitions Based On The Calendar
94(1)
Partitions Based On Sample Size
95(1)
Partitions On Known Events
95(1)
Keyword Taxonomies
95(2)
Regular Expression Patterns
96(1)
Numerical Value Taxonomies
97(1)
Turning Numbers Into X-Tiles
98(1)
Employing Taxonomies
98(3)
Understanding Categories
98(1)
Feature Bar Charts
98(1)
Sorting Of Examples
99(1)
Category/Category Co-Occurrence
99(1)
Dictionary/Category Co-Occurrence
100(1)
References
101(2)
Chapter 9 Orthogonal Comparison
103(14)
Affinity
104(1)
Cotable Dimensions
105(1)
Cotable Layout And Sorting
106(1)
Feature-Based Cotables
107(2)
Cotable Applications
109(1)
Example: Microbes And Their Properties
109(2)
Orthogonal Filtering
111(3)
Conclusion
114(1)
Reference
115(2)
Chapter 10 Visualizing The Data Plane
117(12)
Entity Similarity Networks
117(2)
Using Color To Spot Potential New Hypotheses
119(4)
Visualization Of Centroids
123(2)
Example: Three Microbes
125(2)
Conclusion
127(1)
Reference
127(2)
Chapter 11 Networks
129(10)
Protein Networks
130(1)
Multiple Sclerosis And IL7R
130(4)
Example: New Drugs For Obesity
134(2)
Conclusion
136(1)
Reference
136(3)
Chapter 12 Examples And Problems
139(2)
Problem Catalogue
139(1)
Example Catalogue
140(1)
Chapter 13 Problem: Discovery Of Novel Properties Of Known Entities
141(10)
Antibiotics And Anti-Inflammatories
141(5)
SOS Pathway For Escherichia coli
146(3)
Conclusions
149(1)
References
150(1)
Chapter 14 Problem: Finding New Treatments For Orphan Diseases From Existing Drugs
151(8)
IC50:IC50
152(6)
References
158(1)
Chapter 15 Example: Target Selection Based On Protein Network Analysis
159(6)
Type 2 Diabetes Protein Analysis
159(6)
Chapter 16 Example: Gene Expression Analysis For Alternative Indications
165(10)
Scott Spangler
Ignacio Terrizzano
Jeffrey Kreulen
NCBI GEO Data
165(8)
Conclusion
173(1)
References
174(1)
Chapter 17 Example: Side Effects
175(8)
Chapter 18 Example: Protein Viscosity Analysis Using Medline Abstracts
183(12)
Discovery Of Ontologies
184(3)
Using Orthogonal Filtering To Discover Important Relationships
187(7)
Reference
194(1)
Chapter 19 Example: Finding Microbes To Clean Up Oil Spills
195(30)
Scott Spangler
Zarath Summers
Adam Usadi
Entities
196(3)
Using Cotables To Find The Right Combination Of Features
199(3)
Discovering New Species
202(3)
Organism Ranking Strategy
205(1)
Characterizing Organisms
206(10)
Respiration
209(6)
Environment
215(1)
Substrate
215(1)
Conclusion
216(9)
Chapter 20 Example: Drug Repurposing
225(6)
Compound 1: A PDE5 Inhibitor
226(2)
PPARα/γ Agonist
228(3)
Chapter 21 Example: Adverse Events
231(10)
Fenofibrate
231(1)
Process
232(5)
Conclusion
237(2)
References
239(2)
Chapter 22 Example: p53 Kinases
241(12)
An Accelerated Discovery Approach Based On Entity Similarity
243(3)
Retrospective Study
246(2)
Experimental Validation
248(2)
Conclusion
250(1)
Reference
251(2)
Chapter 23 Conclusion And Future Work
253(9)
Architecture
254(1)
Future Work
255(1)
Assigning Confidence And Probabilities To Entities, Relationships, And Inferences
255(4)
Dealing With Contradictory Evidence
259(1)
Understanding Intentionality
259(2)
Assigning Value To Hypotheses
261(1)
Tools And Techniques For Automating The Discovery Process
261(1)
Crowd Sourcing Domain Ontology Curation
262(1)
Final Words 262(1)
Reference 262(1)
Index 263
Scott Spangler is a principal data scientist, distinguished engineer, and master inventor in the Watson Innovations Group at the IBM Almaden Research Center. He has been involved with knowledge base and data mining research for the past 25 years. His recent work has applied Watson technology to help accelerate cancer research. He holds 45 patents and is the author of over 30 publications. He received a BS in mathematics from MIT and an MS in computer science from the University of Texas.