 | vii
Preface | xiii
Contributors | xv
|
1 Specification Mining: A Concise Introduction | 1
  1.1 | 1
  1.2 | 3
  1.3 Mining Finite State Machines | 7
  1.4 Mining Value-Based Invariants | 11
  1.5 Mining Patterns and Rules | 12
  1.6 Mining Sequence Diagrams | 18
  | 19
  | 20
|
2 Mining Finite-State Automata with Annotations | 29
  2.1 | 30
  2.2 Modeling Software Systems with Annotated FSA | 31
  2.3 Introducing a Running Example | 32
  2.4 Inferring FSA Annotated with Constraints | 36
    2.4.1 Merging Similar Traces | 36
    2.4.2 Generating Constraints | 37
    2.4.3 | 37
    2.4.4 Merging Equivalent States in EFSA | 39
  2.5 Inferring FSA Annotated with Data-Flow Information | 43
    2.5.1 | 43
      2.5.1.1 Identifying Data Clusters | 43
      2.5.1.2 Global Ordering Rewriting Strategy | 44
      2.5.1.3 Instantiation Rewriting Strategy | 45
      2.5.1.4 Access Rewriting Strategy | 47
      2.5.1.5 Choosing a Rewriting Strategy | 48
    2.5.2 | 50
  2.6 Comparing the Generated Models | 52
  2.7 | 53
    2.7.1 Mining Models That Represent Ordering of Events | 53
    2.7.2 Mining Models That Represent Program Variables | 54
  | 54
  | 55
|
3 Adapting Grammar Inference Techniques to Mine State Machines | 59
  3.1 | 59
  3.2 The Conventional Reverse-Engineering Approach and Its Problems | 61
    3.2.1 State Machines and Their Languages | 61
    3.2.2 | 62
    3.2.3 Problems with This Approach | 64
  3.3 Applying Advances from the Field of Grammar Inference | 66
    3.3.1 | 67
    3.3.2 Improved State-Merging Heuristics | 67
      3.3.2.1 Evidence-Driven State Merging | 68
      3.3.2.2 Blue-Fringe Search Strategy | 70
    3.3.3 | 71
    3.3.4 Applying These Advances in Practice | 73
      3.3.4.1 The Software System as Oracle | 73
  3.4 Integrating Constraints | 75
    3.4.1 Integrating Data Constraints | 76
    3.4.2 Integrating Temporal Constraints | 76
  3.5 Conclusions and Future Work | 79
  | 80
|
4 Mining API Usage Protocols from Large Method Traces | 85
  4.1 | 86
  4.2 Mining API Usage Protocols | 88
    4.2.1 Collecting Method Traces | 89
    4.2.2 Finding Object Collaborations | 91
    4.2.3 Collaboration Transformers | 92
      4.2.3.1 Generalizing Types | 92
      4.2.3.2 | 93
      4.2.3.3 Package-Based Filtering | 93
      4.2.3.4 Dataflow Filtering | 94
    4.2.4 Extracting Collaboration Patterns | 95
      4.2.4.1 Summarizing Frequent Collaborations | 95
      4.2.4.2 Removing Infrequent Collaborations | 96
    4.2.5 Generating Finite State Machines | 96
  4.3 | 97
    4.3.1 Instrumentation and Program Run | 97
    4.3.2 Specification Inference | 98
  4.4 | 98
    4.4.1 | 99
    4.4.2 | 100
    4.4.3 Number and Size of Inferred Protocols | 101
    4.4.4 Influence of Coverage | 101
    4.4.5 Quality of Inferred Protocols | 102
    4.4.6 Performance and Scalability | 102
  4.5 Discussion and Applications | 104
    4.5.1 Dynamic versus Static Analysis | 104
    4.5.2 | 104
    4.5.3 | 105
  4.6 | 105
    4.6.1 Mining of Finite State Machines | 105
    4.6.2 Mining of Temporal Rules and Other Specifications | 107
    4.6.3 Invariant Inference | 108
  | 108
  | 108
  | 108
|
5 Static API Specification Mining: Exploiting Source Code Model Checking | 113
  5.1 | 114
  5.2 Static Trace Generation for Mining Specifications | 116
    5.2.1 | 117
    5.2.2 | 119
      5.2.2.1 Partial and Total Order | 120
      5.2.2.2 Frequent Closed Partial Orders (FCPO) | 121
      5.2.2.3 Formalizing FCPO Mining from Program Traces | 122
      5.2.2.4 | 123
      5.2.2.5 Scenario Extraction | 126
      5.2.2.6 Specification Mining | 127
    | 128
    | 129
    5.2.4 Comparison with a Dynamic Miner | 130
  5.3 Adapting Static Trace Generation for Mining API Error-Handling Specifications | 133
    5.3.1 | 135
    5.3.2 Adaptation - The Details | 140
      5.3.2.1 Error/Normal Trace Generation | 140
      5.3.2.2 Specification Mining | 144
    | 145
    | 145
  5.4 | 150
    5.4.1 | 151
    5.4.2 Scope, Limitations, and Further Research | 151
  | 152
|
6 Static Specification Mining Using Automata-Based Abstractions | 159
  6.1 | 160
  6.2 | 162
    6.2.1 Abstract-Trace Collection | 165
    6.2.2 | 165
    6.2.3 Refining Mining Results via Inspection | 166
  6.3 | 166
  6.4 Abstract Trace Collection | 167
    6.4.1 Concrete Instrumented Semantics | 167
    6.4.2 | 169
      6.4.2.1 | 169
      6.4.2.2 Abstract Semantics | 173
      6.4.2.3 History Abstractions | 174
  6.5 Summarization Using Statistical Approaches | 179
    6.5.1 | 180
    6.5.2 | 180
  6.6 Refining Mining Results Using Inspection | 181
    6.6.1 | 182
      6.6.1.1 Static Client Inspection | 182
      6.6.1.2 Static Component Inspection | 182
    6.6.2 | 182
    6.6.3 Selection of Paths for Inspection | 183
    6.6.4 Refinement Based on Abstraction Merge Points | 183
  | 186
  | 186
  | 187
  | 190
  | 191
  | 192
  | 193
  | 196
  | 196
|
7 DynaMine: Finding Usage Patterns and Their Violations by Mining Software Repositories | 201
  7.1 | 202
    7.1.1 | 203
    7.1.2 Chapter Organization | 204
  7.2 | 204
    7.2.1 | 205
    7.2.2 | 207
  7.3 Mining Usage Patterns | 208
    7.3.1 Basic Mining Algorithm | 210
    7.3.2 | 210
      7.3.2.1 Considering a Subset of Method Calls Only | 210
      7.3.2.2 Considering Small Patterns Only | 211
    7.3.3 | 212
      7.3.3.1 Standard Ranking Approaches | 212
      7.3.3.2 Corrective Ranking | 213
    7.3.4 Locating Added Method Calls | 213
  7.4 Checking Patterns at Runtime | 214
    7.4.1 Pattern Selection and Instrumentation | 214
    7.4.2 Post-Processing Dynamic Traces | 215
      7.4.2.1 Dynamic Interpretation of Patterns | 215
      7.4.2.2 Dynamic versus Static Counts | 216
      7.4.2.3 Pattern Classification | 216
  7.5 | 217
    | 218
    | 218
    | 219
    7.5.2 Discussion of the Results | 219
      7.5.2.1 Matching Method Pairs | 219
      7.5.2.2 | 222
      7.5.2.3 More Complex Patterns | 224
  7.6 | 226
    7.6.1 Static versus Dynamic Analysis | 226
    7.6.2 Amount of User Involvement | 228
    7.6.3 Granularity of Mined Information | 228
  7.7 | 229
  7.8 | 230
    7.8.1 Revision History Mining | 232
    7.8.2 | 232
  | 233
  | 234
|
8 Automatic Inference and Effective Application of Temporal Specifications | 241
  8.1 | 242
    8.1.1 Prior Work on Inferring Specifications | 243
    8.1.2 | 244
  8.2 Specification Inference | 245
    8.2.1 A Running Example: Producer-Consumer | 246
    8.2.2 | 246
    8.2.3 | 249
    8.2.4 | 249
      8.2.4.1 Property Templates | 249
      8.2.4.2 Pattern Matching Algorithm | 250
      8.2.4.3 Handling Context Information | 255
    8.2.5 Approximate Inference | 257
      8.2.5.1 | 257
      8.2.5.2 Detecting the Dominant Behavior | 258
    8.2.6 | 259
      8.2.6.1 Static Call Graph Based Heuristic | 259
      8.2.6.2 Naming Similarity Heuristic | 260
    8.2.7 | 261
      8.2.7.1 | 261
      8.2.7.2 Chaining Is NP-Complete | 263
      8.2.7.3 The Chaining Algorithm | 265
    | 267
    | 267
    | 268
  8.3 Inference Experiments | 268
    8.3.1 | 271
      8.3.1.1 Inference Results | 272
    8.3.2 JBoss Application Server | 273
      8.3.2.1 Inference Results | 273
      8.3.2.2 Comparison with JTA Specification | 274
    8.3.3 | 277
      8.3.3.1 Inference Results | 277
  8.4 Using Inferred Properties | 280
    8.4.1 Program Verification | 280
      8.4.1.1 | 281
      8.4.1.2 | 283
    8.4.2 Program Differencing | 285
      8.4.2.1 Tour Bus Simulator | 286
      8.4.2.2 | 288
  8.5 | 294
    8.5.1 | 294
    8.5.2 | 294
      8.5.2.1 Template-Based Inference | 295
      8.5.2.2 Arbitrary Model Inference | 296
    8.5.3 Use of Inferred Specifications | 297
    | 297
    | 298
  | 299
  | 301
  | 301
|
9 Path-Aware Static Program Analyses for Specification Mining | 309
  9.1 | 310
    9.1.1 Path-Aware Analyses | 310
    9.1.2 Dynamic versus Static Path Generation | 311
    9.1.3 Specification Inference | 312
    9.1.4 | 313
  9.2 Specification Inference | 314
    9.2.1 Precedence Protocols | 315
    9.2.2 Preceded-By and Followed-By Relations | 317
    9.2.3 Dataflow Predicates | 317
    9.2.4 | 321
    9.2.5 Incorporating Mining | 326
      9.2.5.1 Mining Strategies | 327
      9.2.5.2 The Structural Similarity Problem | 330
    9.2.6 | 331
    9.2.7 Experimental Results | 334
    9.2.8 Quantitative Assessment | 336
    9.2.9 Qualitative Assessment | 337
    | 339
  9.3 | 339
    9.3.1 Deriving Specifications | 342
    9.3.2 | 344
    9.3.3 | 345
    9.3.4 Quantitative Assessment | 346
    9.3.5 Qualitative Assessment | 347
    | 349
  | 350
  | 351
|
10 Mining API Usage Specifications via Searching Source Code from the Web | 357
  10.1 | 357
  10.2 | 360
  10.3 Example SE Task: Detecting Exception-Handling Defects | 361
  10.4 | 362
  10.5 | 364
    10.5.1 Resolve Object Types | 365
    10.5.2 Generate Candidates | 365
    | 369
  | 371
  | 372
  | 372
|
11 Merlin: Specification Inference for Explicit Information Flow Problems | 377
  11.1 | 379
  11.2 | 382
    11.2.1 Assumptions and Probabilistic Inference | 383
    11.2.2 Potential Sources, Sinks, and Sanitizers | 384
    11.2.3 | 384
    11.2.4 Auxiliary Constraints | 386
  11.3 | 388
  11.4 Constructing the Factor Graph | 390
    11.4.1 Computing s() and W() | 392
  11.5 Relationship between Triples and Paths | 395
  11.6 Experimental Evaluation | 398
    11.6.1 Experimental Setup | 399
    11.6.2 | 400
    11.6.3 | 404
  11.7 | 405
    11.7.1 Securing Web Applications | 405
    11.7.2 Mining Specifications | 406
  | 406
  | 407
  | 407
|
12 Lightweight Mining of Object Usage | 411
  12.1 | 411
  12.2 | 413
  12.3 Creating Function Models | 416
  12.4 Obtaining Sequential Constraints | 417
  12.5 Finding Specifications | 419
  12.6 Discovering Violations | 422
  12.7 | 423
  12.8 Scaling to Multiple Projects | 426
  12.9 Conclusion and Future Work | 429
  | 430
Index | 433