
Group 2 - Test data generation




  • Mauro Baluda
  • Lajos Jenő Fülöp
  • Vilas Jagannath
  • Gordana Rakić


There is strong empirical evidence [1,2] that deficient testing of both functional and non-functional properties is one of the major sources of software and system errors. In 2002, NIST estimated the cost of software failure to the US economy at roughly $60 billion (6×10^10 dollars), about 0.6% of GDP at the time [3].

The same report found that more than one third of these costs of software failure could be eliminated by an improved testing infrastructure.

Automation of testing is a crucial concern [4]. Through automation, large-scale thorough testing can become practical and scalable. However, the automated generation of test cases presents challenges. The general problem involves finding a (partial) solution to the path sensitisation problem: the problem of finding an input that drives the software down a chosen path. Of course, the underlying path sensitisation problem is known to be undecidable, so research has focused on techniques that seek to identify near-optimal test sets in reasonable time.
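To make the path sensitisation problem concrete, consider the following sketch. The function and its names are hypothetical, invented for illustration; covering the hard-to-reach branch requires finding an input that satisfies the path condition x > 10 and y == x * 2, something random testing is very unlikely to achieve by chance:

```python
def classify(x: int, y: int) -> str:
    # Hypothetical system under test, for illustration only.
    if x > 10:
        if y == x * 2:
            return "bug"      # covering this branch requires solving
        return "deep"         # the path condition: x > 10 and y == x * 2
    return "shallow"

# An input that sensitises the path to the "bug" branch,
# e.g. x = 11, y = 22, satisfies both predicates on the path.
print(classify(11, 22))   # prints "bug"
```

For real programs the path conditions can be arbitrarily complex, which is why the general problem is undecidable and why automated techniques aim only for good approximate coverage.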

Two classes of techniques that have received much recent attention are the application of dynamic symbolic execution [5-9] and of search-based optimisation [10,11] (or both [12,13]) to the problem of code-based software test data generation.
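As a sketch of the search-based approach (the toy function and all names below are our own illustrative assumptions, not taken from the cited papers), a local search can be guided by the classic "approach level plus normalised branch distance" fitness surveyed by McMinn [10], which is zero exactly when the target branch is covered:

```python
import random

def classify(x: int, y: int) -> str:
    # Toy system under test (illustrative only); the target is the "bug" branch.
    if x > 10:
        if y == x * 2:
            return "bug"
        return "deep"
    return "shallow"

def fitness(x: int, y: int) -> float:
    # Approach level + normalised branch distance for the path to "bug":
    # score the first predicate on the target path that the input fails.
    if x <= 10:                     # approach level 1: outer predicate fails
        d = (10 - x) + 1
        return 1 + d / (d + 1)
    if y != x * 2:                  # approach level 0: inner predicate fails
        d = abs(y - x * 2)
        return d / (d + 1)
    return 0.0                      # target branch covered

def hill_climb(max_steps: int = 10_000, seed: int = 0):
    # Greedy neighbourhood search over integer inputs: move to the best
    # unit-step neighbour while the fitness strictly decreases.
    rng = random.Random(seed)
    x, y = rng.randint(-100, 100), rng.randint(-100, 100)
    for _ in range(max_steps):
        if fitness(x, y) == 0.0:
            return x, y
        nx, ny = min(
            ((x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))),
            key=lambda p: fitness(*p),
        )
        if fitness(nx, ny) >= fitness(x, y):
            # Local optimum: restart from a fresh random point.
            x, y = rng.randint(-100, 100), rng.randint(-100, 100)
        else:
            x, y = nx, ny
    return None
```

On this particular landscape the fitness decreases monotonically towards inputs such as x = 11, y = 22, so the search reliably covers the branch; a dynamic symbolic execution tool would instead obtain such an input by handing the path condition to a constraint solver.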


The goal of this working group is to investigate (i) the limitations and weaknesses of existing test data generation methods, (ii) how research in this field could favour the adoption of automated test data generation techniques by practitioners, and (iii) the main open challenges in automated test data generation.


The study will be conducted by contacting experts in this field, from both academia and industry where possible, during the ESEC/FSE conference and interviewing them to collect information that addresses this goal.

To this end, the working group should identify a set of questions to be asked, for example:

1) What do you think are the advantages and disadvantages of test data generation approaches based on dynamic symbolic execution, search-based testing, or a combination of the two?

2) What kind of automatic test data generation tools are used by your industrial partners or in your company? For example, do they use very simple models, off-the-shelf tools, or techniques developed by the research community?

3) What do you think are the main barriers to the adoption of automatic test data generation techniques and tools? For example, the manual effort still required of humans acting as test oracles, or an excessive emphasis on code rather than on a formal specification or model.

4) In which direction do you think the research community should focus its effort? For example, improving coverage, minimising test suite size, or hybridising with other techniques to improve bug finding.


[1] D. Leffingwell and D. Widrig. Managing Software Requirements: A Use Case Approach. Addison Wesley, 2003.

[2] R. L. Glass. Facts and Fallacies of Software Engineering. Addison Wesley, 2002.

[3] National Institute of Standards and Technology. The economic impacts of inadequate infrastructure for software testing, May 2002. Planning Report 02-3.

[4] A. Bertolino. Software testing research: Achievements, challenges, dreams. In Future of Software Engineering 2007 (FOSE 2007), pages 85–103. IEEE Computer Society, 2007.

[5] P. Godefroid, N. Klarlund, and K. Sen, "DART: Directed automated random testing," ACM SIGPLAN Notices, vol. 40, no. 6, pp. 213–223, Jun. 2005.

[6] K. Sen, D. Marinov, and G. Agha, "CUTE: a concolic unit testing engine for C," in ESEC/SIGSOFT FSE, 2005, pp. 263–272.

[7] N. Tillmann and J. de Halleux, "Pex - White box test generation for .NET," in TAP, 2008, pp. 134–153.

[8] R. Majumdar and K. Sen, "Hybrid concolic testing," in ICSE, 2007, pp. 416–426.

[9] K. Sen and G. Agha, "CUTE and jCUTE: Concolic unit testing and explicit path model-checking tools," in CAV, 2006, pp. 419–423.

[10] P. McMinn, "Search-based software test data generation: A survey," Software Testing, Verification and Reliability, vol. 14, no. 2, pp. 105–156, 2004.

[11] P. Tonella, "Evolutionary testing of classes," in ISSTA, 2004, pp. 119–128.

[12] K. Inkumsah and T. Xie, "Evacon: A framework for integrating evolutionary and concolic testing for object-oriented programs," in ASE, November 2007, pp. 425–428.

[13] K. Lakhotia, N. Tillmann, M. Harman, and J. de Halleux, "FloPSy - Search-based floating point constraint solving for symbolic execution," in ICTSS, 2010, pp. 142–157.