FSWeight.Zip
1 March 2007
============

The source code for FSWeight.Zip, along with the VC project files, is provided
here. The program runs from the Windows 2000/XP command line. A sample batch
file and some sample input files are included.

INPUT

The software takes the following command:

    relate scheme_file annotation_file interaction_file

    scheme_file      - Function scheme file (e.g. funcat-2.0_scheme.txt)
    annotation_file  - Function annotation file (e.g. funcat-2.0_data_14032005.txt)
    interaction_file - Protein interaction file (e.g. grid_1804205.txt)

Refer to the files for their respective formats. All files are simple
delimited flat text files.

OUTPUT

The software outputs 2 files:

    predictions.txt - contains the predictions made, in this format (tab-delimited):
                      ProteinName  FunctionID  Weight
    results.txt     - contains the data for the precision vs recall graph (tab-delimited):
                      Score  Recall  Precision

OPERATION

By default the software performs leave-one-out cross validation using
FS-Weighted Averaging. Some simple comments are included in main.cpp on how to:

1) Perform X-fold cross validation.

2) Use Neighbour counting, Chi-Square and Functional Flow.
   Functional Flow is implemented based on the description in the manuscript
   (Nabieva,E. et al. (2005) Whole-proteome prediction of protein function via
   graph-theoretic analysis of interaction maps. Bioinformatics, 21 (Suppl. 1),
   i302-i310). We received comments that the description in the manuscript
   contains a mistake, but we were not given any details. It is therefore best
   to request the original software from the authors.

3) Output the distance matrix for the clustering in PRODISTIN (Brun,C. et al.
   (2003) Functional classification of proteins for the prediction of cellular
   function from a protein-protein interaction network. Genome Biol., 5, R6).
   This matrix is written to the file distmatrix.txt, which can be used as
   input for the BIONJ clustering program, also included here. BIONJ will
   output a tree as a text file.
   Name this file tree.txt, then use the function PrintProdistin() to make
   predictions from the tree.

MRF

We did not implement the MRF method (Deng,M. et al. (2003) Prediction of
protein function using protein-protein interaction data. J. Comp. Biol., 10,
947-960). Instead, we ran our software on their data and compared with their
results. I have included the formatted functional scheme, annotation and
interaction files, as well as the precision vs recall results of the MRF
method, in the folder "MRD data". The original data files used to be available
at http://www-hto.usc.edu/~msms/ProteinFunction, but no longer seem to be
available.

Credits:

The FSWeight protein function prediction program was implemented by
Hon Nian CHUA. If you use this software, please cite the following paper:

Hon Nian Chua, Wing-Kin Sung, Limsoon Wong. "Exploiting Indirect Neighbours
and Topological Weight to Predict Protein Function from Protein-Protein
Interactions". Bioinformatics, 22:1623--1630, 2006.
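
EXAMPLE: READING predictions.txt

The tab-delimited predictions.txt format described under OUTPUT
(ProteinName, FunctionID, Weight) can be read with a few lines of C++. The
sketch below is not part of the FSWeight sources. The names Prediction,
parse_prediction, AnnotationSet and precision_recall_at are hypothetical
helpers introduced only for illustration. The precision/recall function uses
the standard textbook definitions at a single weight threshold and assumes
each (protein, function) pair is predicted at most once. It may differ from
the way FSWeight itself produces results.txt.

```cpp
#include <cstddef>
#include <cstdlib>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One row of predictions.txt: ProteinName <TAB> FunctionID <TAB> Weight.
struct Prediction {
    std::string protein;
    std::string function_id;
    double weight;
};

// Parse one tab-delimited line. Returns false if a field is missing or the
// weight is not a number. A trailing '\r' (Windows line ending) is tolerated.
bool parse_prediction(std::string line, Prediction& out) {
    if (!line.empty() && line.back() == '\r') line.pop_back();
    std::istringstream in(line);
    std::string weight_text;
    if (!std::getline(in, out.protein, '\t')) return false;
    if (!std::getline(in, out.function_id, '\t')) return false;
    if (!std::getline(in, weight_text)) return false;
    char* end = nullptr;
    out.weight = std::strtod(weight_text.c_str(), &end);
    return end != weight_text.c_str() && *end == '\0';
}

struct PrecisionRecall {
    double precision;
    double recall;
};

// Known (protein, function) annotations, e.g. loaded from the annotation file.
typedef std::set<std::pair<std::string, std::string> > AnnotationSet;

// Assumption: a prediction counts as "made" when weight >= threshold.
// precision = correct / predictions made, recall = correct / known annotations.
PrecisionRecall precision_recall_at(const std::vector<Prediction>& preds,
                                    const AnnotationSet& known,
                                    double threshold) {
    std::size_t predicted = 0;
    std::size_t correct = 0;
    for (const Prediction& p : preds) {
        if (p.weight < threshold) continue;
        ++predicted;
        if (known.count(std::make_pair(p.protein, p.function_id))) ++correct;
    }
    PrecisionRecall pr;
    pr.precision = predicted ? static_cast<double>(correct) / predicted : 0.0;
    pr.recall = known.empty() ? 0.0 : static_cast<double>(correct) / known.size();
    return pr;
}
```

To use it, read predictions.txt line by line with std::getline, pass each
line to parse_prediction, collect the rows that parse successfully, and call
precision_recall_at for each threshold of interest.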