This page provides the prototype, the derivatives, and the four datasets used in the following paper:

Chengnian Sun, David Lo, Siau-Cheng Khoo, Jing Jiang: Towards More Accurate Retrieval of Duplicate Bug Reports

Program

Program (Version 2)


Derivatives of RF

OpenOffice Dataset (19M)

Eclipse Dataset (33M)

Mozilla Dataset (54M)

Large Eclipse Dataset (167M)