Dr. Gregory Gay has received an NSF Research award for his project, "Understanding the Role of Software Test Adequacy Criteria in Search-Based Test Generation." This work aims to improve automated software testing in industry.
Abstract
Software testing ensures that software is robust and reliable. Because testers cannot know a priori what faults exist, dozens of metrics, ranging from the measurement of structural coverage to the detection of synthetic faults, have been proposed to judge test case adequacy. In theory, if such metrics are fulfilled, tests should be adequate at detecting faults. To alleviate the high cost of testing, optimization algorithms can be used to automatically generate test suites, and these adequacy metrics are well-suited for guiding that automated test creation. However, no adequacy metric is known to universally correspond to "effective fault detection." Testers are left with a bewildering number of testing options, and there is little guidance on when to use one criterion over another. These metrics are a solid starting point for test case generation, since many faults cannot be detected until the code has been executed. However, merely executing code does not ensure adequate testing; how the code is executed also matters. Testers do not yet understand which adequacy metrics actually correspond to a high probability of fault detection, or under what situations these metrics can be applied.
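The gap between executing code and detecting a fault can be illustrated with a minimal sketch. The function safe_ratio, its intended behaviour, and the seeded fault are hypothetical and are not taken from the project itself. The first pair of checks executes every statement and both outcomes of the branch, yet passes; the fault surfaces only when the same branch is executed with a negative divisor.

```python
def safe_ratio(a, b):
    """Intended behaviour (assumed for this sketch): a / b, or 0.0 when b == 0."""
    if b > 0:  # seeded fault: the guard should be `b != 0`
        return a / b
    return 0.0


def covering_suite_passes():
    # These two calls execute every statement and both branch outcomes,
    # so statement and branch coverage are fully satisfied.
    return safe_ratio(6, 3) == 2.0 and safe_ratio(1, 0) == 0.0


def fault_revealing_test_passes():
    # Same code, same branch, different input: a negative divisor takes
    # the "zero" path and returns 0.0 instead of -2.0.
    return safe_ratio(6, -3) == -2.0


if __name__ == "__main__":
    print("coverage-adequate suite passes:", covering_suite_passes())
    print("fault-revealing test passes:", fault_revealing_test_passes())
```

Under these assumptions, a test generator guided only by branch coverage would have no reason to prefer the negative divisor over the inputs it already has.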
Improving automated test generation therefore requires gaining a better understanding of the circumstances in which particular metrics are effective, isolating the features of such metrics that correlate with fault detection in those circumstances, and establishing and evaluating guidelines for the use and combination of metrics (perhaps tied to particular system types or domains) that will result in real-world fault detection. Large-scale empirical investigations will be performed into the relationship between adequacy criteria and the probability of fault detection, in order to understand the efficacy and applicability of the criteria used to guide test creation. This work will have broader impacts on industrial practices, software engineering education, and, through dissemination to and collaborations with industrial partners and regulatory agencies, public safety and security.