A polyadenine tail is found at the 3’ end of nearly every fully processed eukaryotic mRNA and has been suggested to influence virtually all aspects of mRNA metabolism. The ability to predict polyadenylation sites would allow us to define gene boundaries, predict the number of genes present at a particular locus, and perhaps better understand mRNA metabolism.
To this end, we built an Arabidopsis polyadenylation site prediction model. The model consists of four sequential steps: feature generation, feature selection, feature integration, and a cascade classifier. No other Arabidopsis polyadenylation prediction model was available until recently, when a program, PASS, was developed and published in February 2007.
However, that program was not available for comparison at the time of writing. Nevertheless, we tested our model on several public datasets and achieved sensitivity and specificity above 96% with the best combinations.
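
To make the terms above concrete, the following is a minimal Python sketch of a cascade classifier over sequence windows, together with the sensitivity and specificity calculation. It is an illustration only, not the model described here: the k-mer features, the two stages (`stage_at_rich`, `stage_aa_dinucleotide`), their thresholds, and the toy sequences and labels are all hypothetical and were chosen purely to keep the example small.

```python
from collections import Counter


def kmer_counts(seq, k=2):
    """Feature generation (toy): count overlapping k-mers in a window."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def stage_at_rich(seq, threshold=0.6):
    """Hypothetical stage 1: accept windows whose A+T fraction is high."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq) >= threshold


def stage_aa_dinucleotide(seq, min_count=2):
    """Hypothetical stage 2: accept windows with enough 'AA' dinucleotides."""
    return kmer_counts(seq, 2)["AA"] >= min_count


def cascade_predict(seq, stages):
    """Cascade: a window is positive only if every stage accepts it.

    all() stops at the first rejecting stage, so later stages
    are never evaluated for windows that fail early.
    """
    return all(stage(seq) for stage in stages)


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Toy windows with made-up labels (True = contains a poly(A) site).
windows = ["AATAAATTTA", "GGCCGGCCGC", "ATATATATAT", "TTAAATAAAG", "AAAAAGGGTT"]
labels = [True, False, False, True, False]

stages = [stage_at_rich, stage_aa_dinucleotide]
preds = [cascade_predict(w, stages) for w in windows]
sens, spec = sensitivity_specificity(labels, preds)
print(preds, sens, spec)
```

In this toy data the last window passes both stages despite being labeled negative, so specificity falls below 1 while sensitivity stays at 1. This shows why the two measures are reported together.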