E-book: Learning in Embedded Systems

  • Format: 193 pages
  • Publication date: 20-Jun-2019
  • Publisher: MIT Press
  • ISBN-13: 9780262288507
  • Format - PDF+DRM
  • Price: 37,44 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You also need to create an Adobe ID. More information here. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android).

    To read on a PC or Mac, you must install Adobe Digital Editions (this is a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Reporting new experimental results, this book explores how to design learning into embedded systems used in mobile robots, factory process controllers, long-term software databases, and other systems that must adapt their behavior to a complex and changing external environment. Annotation copyright Book News, Inc., Portland, OR.

Learning to perform complex action strategies is an important problem in the fields of artificial intelligence, robotics, and machine learning. Filled with interesting new experimental results, Learning in Embedded Systems explores algorithms that learn efficiently from trial-and-error experience with an external world. It is the first detailed exploration of the problem of learning action strategies in the context of designing embedded systems that adapt their behavior to a complex, changing environment; such systems include mobile robots, factory process controllers, and long-term software databases.

Kaelbling investigates a rapidly expanding branch of machine learning known as reinforcement learning, including the important problems of controlled exploration of the environment, learning in highly complex environments, and learning from delayed reward. She reviews past work in this area and presents a number of significant new results. These include the interval-estimation algorithm for exploration, the use of biases to make learning more efficient in complex environments, a generate-and-test algorithm that combines symbolic and statistical processing into a flexible learning method, and some of the first reinforcement-learning experiments with a real robot.
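
The blurb above mentions the interval estimation algorithm for exploration. As a rough sketch of the general idea (not the book's exact formulation), the snippet below plays a two-armed bandit with Boolean reinforcement: each action tracks the upper limit of a confidence interval on its success probability, and the agent always takes the action with the highest upper limit, so rarely tried actions keep getting sampled until their intervals tighten. The normal-approximation bound, the 95% confidence level, and all names in the code are illustrative assumptions.

    import math
    import random

    def upper_bound(successes, trials, z=1.96):
        # Hypothetical helper: upper limit of a normal-approximation confidence
        # interval on an action's success probability (z = 1.96, roughly 95%).
        if trials == 0:
            return 1.0  # untried actions look maximally promising, forcing exploration
        p = successes / trials
        return min(1.0, p + z * math.sqrt(p * (1.0 - p) / trials))

    def choose_action(stats):
        # Interval-estimation-style selection: take the action whose
        # confidence-interval upper bound is largest.
        return max(stats, key=lambda a: upper_bound(*stats[a]))

    # Toy two-armed bandit with Boolean (0/1) reinforcement; payoff rates are made up.
    true_success = {"left": 0.3, "right": 0.7}
    stats = {action: (0, 0) for action in true_success}  # action -> (successes, trials)

    for _ in range(500):
        action = choose_action(stats)
        reward = 1 if random.random() < true_success[action] else 0
        s, n = stats[action]
        stats[action] = (s + reward, n + 1)

    print(stats)  # the better action ("right") should accumulate most of the trials

The point of the sketch is that the confidence-interval upper bound, rather than the observed mean alone, drives action choice; that is the sense in which this style of method trades off exploration against exploitation.
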
Acknowledgments xiii
Introduction 1(14)
Direct Programming 2(1)
What Is Learning? 3(1)
What to Learn? 4(1)
What to Learn From? 5(3)
Representation 8(1)
Situated Action 9(1)
Theory and Practice 10(1)
Contents 10(5)
Foundations 15(20)
Acting in a Complex World 15(9)
Modeling an Agent's Interaction with the World 16(1)
Inconsistent Worlds 17(5)
Learning Behaviors 22(2)
Performance Criteria 24(10)
Correctness 24(5)
Convergence 29(3)
Time and Space Complexity 32(2)
Related Foundational Work 34(1)
Previous Approaches 35(16)
Bandit Problems 35(2)
Learning Automata 37(4)
Early Work 37(1)
Probability-Vector Approaches 38(3)
Reinforcement-Comparison Methods 41(1)
Associative Methods 42(6)
Copying 43(1)
Linear Associators 43(2)
Error Backpropagation 45(3)
Genetic Algorithms 48(1)
Extensions to the Model 49(1)
Non-Boolean Reinforcement 49(1)
Nonstationary Environments 50(1)
Conclusions 50(1)
Interval Estimation Method 51(16)
Description of the Algorithm 51(3)
Analysis 54(3)
Regular Error 54(1)
Error Due to Sticking 55(2)
Total Regret 57(1)
Empirical Results 57(1)
Experimental Comparisons 58(6)
Algorithms and Environments 58(1)
Parameter Tuning 59(1)
Results 60(4)
Extensions 64(2)
Multiple Inputs and Actions 64(1)
Real-valued Reinforcement 64(1)
Nonstationary Environments 65(1)
Conclusion 66(1)
Divide and Conquer 67(10)
Boolean-Function Learners 67(1)
Cascade Algorithm 67(3)
Correctness and Convergence 70(3)
Correctness 70(2)
Convergence 72(1)
Empirical Results 73(3)
Complexity 73(1)
Performance 74(2)
Conclusion 76(1)
Learning Boolean Functions in k-DNF 77(12)
Background 77(1)
Learning k-DNF from Input-Output Pairs 78(1)
Combining the LARC and VALIANT Algorithms 78(1)
Interval Estimation Algorithm for k-DNF 79(3)
Empirical Comparison 82(6)
Algorithms and Environments 82(1)
Parameter Tuning 83(1)
Results 84(2)
Discussion 86(2)
Conclusion 88(1)
A Generate-and-Test Algorithm 89(24)
Introduction 89(1)
High-Level Description 90(2)
Statistics 92(1)
Evaluating Inputs 93(1)
Managing Hypotheses 94(5)
Adding Hypotheses 94(4)
Promoting Hypotheses 98(1)
Pruning Hypotheses 98(1)
Parameters of the Algorithm 99(1)
Computational Complexity 99(2)
Choosing Parameter Values 101(3)
Number of Levels 102(1)
Number of Working and Candidate Hypotheses 102(1)
Promotion Age 103(1)
Rate of Generating Hypotheses 104(1)
Maximum New Hypothesis Tries 104(1)
Empirical Results 104(6)
Sample Run 104(1)
Effects of Parameter Settings on Performance 105(1)
Comparison with Other Algorithms 105(5)
Conclusions 110(3)
Learning Action Maps with State 113(10)
Set-Reset 113(1)
Using SR in GTRL 114(3)
Hypotheses 115(1)
Statistics 116(1)
Search Heuristics 117(1)
Complexity 117(1)
Experiments with GTRL-S 117(5)
Lights and Buttons 119(2)
Many Lights and Buttons 121(1)
Conclusion 122(1)
Delayed Reinforcement 123(16)
Q-Learning 123(2)
Q-Learning and Interval Estimation 125(1)
Adaptive Heuristic Critic Method 126(3)
Other Approaches 129(2)
Complexity Issues 131(1)
Empirical Comparison 131(8)
Environments 131(3)
Algorithms 134(1)
Parameter Tuning 134(1)
Results 135(1)
Discussion 136(3)
Experiments in Complex Domains 139(10)
Simple, Large, Random Environment 139(2)
Algorithms 139(1)
Task 140(1)
Parameter Settings 140(1)
Results 140(1)
Mobile Robot Domain 141(4)
Algorithms 142(1)
Task 142(1)
Results 143(2)
Robot Domain with Delayed Reinforcement 145(4)
Algorithms 145(1)
Task 146(1)
Results 146(3)
Conclusion 149(10)
Results 149(1)
Conclusions 150(1)
Future Work 151(6)
Exploration 152(1)
Bias 152(1)
World Models and State 153(4)
Delayed Reinforcement 157(1)
Final Words 157(2)
Appendix A Statistics in GTRL 159(6)
A.1 Binomial Statistics 159(2)
A.2 Normal Statistics 161(1)
A.3 Nonparametric Statistics 162(3)
Appendix B Simplifying Boolean Expressions in GTRL 165(2)
References 167(8)
Index 175