
Group 6 - Mining software repositories


  • Tao Xie (North Carolina State University, USA)



  • Goran Mausa
  • Mehdi Mirzaaghaei
  • Meng Na
  • Gabriella Kakuja-Tóth



The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects. Software repositories such as source control systems, archived communications between project personnel, and defect tracking systems are used to help manage the progress of software projects. Software practitioners and researchers are recognizing the benefits of mining this information to support the maintenance of software systems, improve software design/reuse, and empirically validate novel ideas and techniques. Research is now proceeding to uncover the ways in which mining these repositories can help to understand software development and software evolution, to support predictions about software development, and to exploit this knowledge concretely in planning future development.

The goal of this working group is to investigate:

  1. The limitations and weaknesses of software repositories and MSR techniques.
  2. How research in this field could favor the adoption of MSR techniques by practitioners.
  3. The main open challenges and future directions in MSR.
  4. The formation of the MSR research community and its interactions with other research communities (e.g., data mining, statistics, visualization, software testing and analysis, software maintenance, and requirements engineering).

The study will be conducted by interviewing experts in the field, ideally both academics and practitioners, during the ESEC/FSE conference.

For this purpose, the working group should identify a set of questions to ask academics and practitioners; the following questions provide starting points:

  • What software repositories are currently used among your industrial partners? Do any of them use MSR techniques?
  • What are the main shortcomings of current software repositories to properly support MSR? How can they be addressed?
  • How can MSR techniques be conveniently deployed in industry? What barriers to entry prevent practitioners from applying these techniques?
  • What should be the medium- to long-term goals of the MSR community?
  • What other communities do you belong to besides the MSR community? What interactions between the MSR community and other communities do you see?

An extensive set of references can be found at:

Feel free to discuss the topic on the discussion forum: