Benchmarking Research Performance in Department of Computer Science,
School of Computing, National University of Singapore

Philip M. Long         Tan Kian Lee       Joxan Jaffar
April 12, 1999




In April 1999, the Department of Computer Science at the National University of Singapore conducted a study to benchmark its research performance. The study shows, using publication counts alone, that NUS ranks between 21st and 28th relative to the top 70 CS departments in the US. In this article, we present the methodology adopted and report our findings.
 

1. Background

As part of its self-assessment effort, the Department of Computer Science at the National University of Singapore conducted a study to benchmark its research performance. The study used publication statistics to estimate where it would have been placed in an authoritative ranking of CS departments.

We chose to use statistics of conference publications rather than journal publications because, in computer science, conferences are the primary means of communicating research results; they are refereed, and some are very selective. We used papers published from 1995 to 1997; we stopped at 1997 so that the proceedings from the most recent year would be likely to be available in the library. Prior to this exercise, our department had divided conferences into three categories based on their prestige: rank 1 (the most prestigious conferences), rank 2, and rank 3. Since we felt that publications in rank 1 and rank 2 conferences are far more relevant to the standing of a department, and to save on data collection costs, we omitted rank 3 conferences from consideration. A few of the proceedings were not available in the library: in our study we used the 109 conferences of rank 2 and above whose proceedings were available. We further divided the rank 2 conferences into two groups, picking out a small collection of the better rank 2 conferences, which we will refer to as rank 2A conferences, and calling the remainder rank 2B conferences. This was done by consulting faculty in different areas and asking their opinions: they could support the case for a conference using the usual arguments, such as a small acceptance ratio, publication of prominent results, participation by famous researchers in the conference or on the program committee, and so forth.

As our "authoritative ranking" of CS departments, we used the ranking published by the National Research Council [1]. To save on data collection costs, we used only the top 70 universities in that ranking (see Table 1). We note that our estimate is obtained only from publication statistics, whereas the original ranking done by the NRC took into account other factors.
 
 

1 Stanford University  26 Purdue University  51 University of Illinois at Chicago 
2 Massachusetts Inst of Technology  27 Rutgers State Univ-New Brunswick 52 Washington University 
3 University of California-Berkeley 28 Duke University  53 Michigan State University 
4 Carnegie Mellon University  29 U of North Carolina-Chapel Hill  54 CUNY - Grad Sch & Univ Center 
5 Cornell University  30 University of Rochester  55 Pennsylvania State University
6 Princeton University  31 State U of New York-Stony Brook  56 Dartmouth College 
7 University of Texas at Austin  32 Georgia Institute of Technology  57 State Univ of New York-Buffalo 
8 U of Illinois at Urbana-Champaign  33 University of Arizona  58 University of California-Davis 
9 University of Washington  34 University of California-Irvine  59 Boston University 
10 University of Wisconsin-Madison  35 University of Virginia  60 North Carolina State University 
11 Harvard University  36 Indiana University 61 Arizona State University 
12 California Institute of Technology  37 Johns Hopkins University  62 University of Iowa 
13 Brown University  38 Northwestern University  63 Texas A&M University 
14 Yale University  39 Ohio State University  64 University of Oregon 
15 Univ of California-Los Angeles  40 University of Utah  65 University of Kentucky 
16 University of Maryland College Park  41 University of Colorado  66 Virginia Polytech Inst & State U 
17 New York University  42 Oregon Graduate Inst Sci & Tech  67 George Washington University 
18 U of Massachusetts at Amherst  43 University of Pittsburgh 68 Case Western Reserve Univ 
19 Rice University  44 Syracuse University  69 University of South Florida 
20 University of Southern California  45 University of Pennsylvania a 70 Oregon State University 
21 University of Michigan  46 University of Florida
22 Univ of California-San Diego  47 University of Minnesota 
23 Columbia University  48 Univ of California-Santa Barbara 
24 University of Pennsylvania b  49 Rensselaer Polytechnic Inst 
25 University of Chicago  50 Univ of California-Santa Cruz
Table 1. The (1995) US National Research Council Top 70 CS Departments.

 

2. Basic Method and Result

Once we had divided the conferences into three categories (rank 1 only, ranks 1+2A, and ranks 1+2A+2B), we counted the number of papers published in the selected conferences by NUS and the 70 US computer science departments, and checked how well counting the number of papers published in conferences at or above a given rank agreed with the ranking published by the NRC. To measure the degree of disagreement, we counted the number of pairs of universities for which University A was ranked above University B, but University B had a higher paper count (considering conferences at or above the given rank). We took the prestige threshold with the fewest disagreements (this turned out to be rank 1 conferences only) and looked at the ranking it produced. Using this method, NUS's estimated ranking among US universities was 26th. Counting rank 1 conference papers agreed with 80% of the relative rankings of the NRC study. Using the other prestige thresholds yielded similar results, with slightly higher rankings for NUS and slightly more disagreements with the NRC ranking.
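To make the disagreement measure concrete, the following is a minimal sketch in Python, assuming paper counts are held in a dictionary keyed by department and the NRC order is given as a list from best to worst; the department names and counts below are illustrative, not data from the study.

from itertools import combinations

def count_disagreements(nrc_order, paper_counts):
    """Count pairs (A, B) where A is ranked above B by the NRC
    but B has a strictly higher paper count."""
    disagreements = 0
    for higher, lower in combinations(nrc_order, 2):
        if paper_counts[lower] > paper_counts[higher]:
            disagreements += 1
    return disagreements

# Illustrative example: fraction of pairs on which counts agree with the order.
nrc_order = ["Dept A", "Dept B", "Dept C"]                 # best to worst
paper_counts = {"Dept A": 120, "Dept B": 90, "Dept C": 95}  # made-up counts
pairs = len(nrc_order) * (len(nrc_order) - 1) // 2
d = count_disagreements(nrc_order, paper_counts)
print(1 - d / pairs)   # agreement fraction over all pairs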
 

3. Other Methods and Results

Despite the best intentions of the members of the Department of Computer Science, it is natural to suspect that some bias might have crept into our department's rankings of conferences. To address this potential problem, we tried a variety of different methods, which balanced our prior knowledge about the prestige of conferences against information obtained by looking at where members of well-respected universities publish.
 

3.1 Choosing conferences using the NRC ranking only

In this method, we used the NRC ranking to choose the subset of conferences to use for paper counting. For this, we used "simulated annealing" [3], a standard optimization technique, to choose a subset of conferences to approximately minimize the number of disagreements with the NRC ranking. There was no prior bias toward any particular subset, and all of the conferences for which we collected data were considered for membership in the final subset. Using this method, NUS's estimated ranking among US universities was 21st. Counting publications in the chosen subset of conferences agreed with 86% of the relative rankings of the NRC study.
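The report does not give the details of the annealing procedure; the sketch below shows one standard way to set it up, with a single-conference flip as the proposal move and a geometric cooling schedule. The step count, temperatures, and the handling of ties in the cost (counting a tie as half a disagreement, so that the empty subset is not trivially optimal) are our assumptions, not details from the study.

import math, random
from itertools import combinations

def disagreement_cost(nrc_order, counts):
    """Pairs where the lower-NRC-ranked department has at least as many papers;
    ties count as half a disagreement (an assumption, see the lead-in)."""
    cost = 0.0
    for higher, lower in combinations(nrc_order, 2):
        if counts[lower] > counts[higher]:
            cost += 1.0
        elif counts[lower] == counts[higher]:
            cost += 0.5
    return cost

def anneal_subset(conferences, nrc_order, papers, steps=20000, t0=5.0, cooling=0.9995):
    """conferences is a list of conference names; papers[dept][conf] is the
    number of papers dept published in conf (assumed input format)."""
    def cost(subset):
        counts = {d: sum(papers[d].get(c, 0) for c in subset) for d in nrc_order}
        return disagreement_cost(nrc_order, counts)

    current = set(random.sample(conferences, len(conferences) // 2))
    cur_cost = cost(current)
    best, best_cost, t = set(current), cur_cost, t0
    for _ in range(steps):
        flip = random.choice(conferences)          # propose: toggle one conference
        candidate = current ^ {flip}
        cand_cost = cost(candidate)
        # accept improvements always, worse moves with Boltzmann probability
        if cand_cost <= cur_cost or random.random() < math.exp((cur_cost - cand_cost) / t):
            current, cur_cost = candidate, cand_cost
            if cur_cost < best_cost:
                best, best_cost = set(current), cur_cost
        t *= cooling                               # geometric cooling
    return best, best_cost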
 

3.2 Weighting the conferences

Instead of choosing a subset of conferences and simply counting publications in that subset, a more refined measure of the overall research output of a department can be obtained by assigning a weight to each conference that reflects how prestigious it is, and then calculating the total weight of the publications of the department. How can the NRC ranking be used to estimate appropriate weights? One reasonable goal is to find a weighting such that the relative "total publication weight" of departments agrees with their relative NRC rankings to the greatest extent possible. An iterative algorithm, the perceptron algorithm, can be proved to find a weighting that agrees with the ranking exactly if such a weighting exists (see [2]). We applied the perceptron algorithm using all of the conferences. It found a weighting that agreed with all of the relative rankings of the NRC study. Using this weighting, NUS's rank was 28th. We were concerned that the number of parameters being adjusted by this method might be too large relative to the amount of data used to set them, so we ran the experiment again using only the rank 1 conferences. Using this weighting, NUS's rank was 22nd. The agreement with the NRC ranking was 91%.
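The report cites [2] for the perceptron algorithm but does not spell out how it was applied to rankings; a natural reading, sketched below, is the classical perceptron run on difference vectors of per-conference paper counts, so that every NRC pair (A ranked above B) becomes the constraint w · x_A > w · x_B. The feature representation, pass ordering, and stopping condition here are assumptions.

import numpy as np

def perceptron_weights(pairs, features, epochs=1000):
    """Learn conference weights w so that w @ features[A] > w @ features[B]
    for every pair (A, B) with A ranked above B by the NRC.
    features[d] is d's vector of per-conference paper counts (assumed input);
    pairs is a list of (higher_ranked, lower_ranked) department pairs."""
    dim = len(next(iter(features.values())))
    w = np.zeros(dim)
    for _ in range(epochs):
        mistakes = 0
        for higher, lower in pairs:
            diff = np.asarray(features[higher]) - np.asarray(features[lower])
            if w @ diff <= 0:          # constraint violated: standard perceptron update
                w += diff
                mistakes += 1
        if mistakes == 0:              # all pairwise constraints satisfied
            break
    return w

# A department's total publication weight is then w @ features[dept],
# and departments are re-ranked by this score.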
 

4. Validating the Methods

To assess the quality of the different methods that we considered, we used a variant of a standard technique called "cross validation". Note that each method can be viewed as using the NRC ranking to estimate a weighting on conferences, and then using the weighting to rank the departments again (the methods that choose a subset of conferences can be viewed as assigning a weight of 0 to conferences that are not chosen, and 1 to those that are). We performed the following experiment to estimate the quality of the different weighting methods. First, we applied the method under consideration to choose a weighting of the conferences using only the departments with odd-numbered ranks in the NRC ranking. Then we took the resulting weights and counted the number of disagreements that they had with pairs of departments with even-numbered ranks. We found that the method based on our department's conference rankings agreed with 80% of the pairs of even-numbered-ranked universities. The method which used simulated annealing to choose a subset of the conferences, when applied to the odd-numbered-ranked universities, found a subset of conferences whose paper count also agreed with 80% of the pairs of even-numbered-ranked universities. The perceptron algorithm, applied to all the conferences and the odd-numbered-ranked universities, yielded a weighting that agreed with 67% of the pairs of even-numbered-ranked universities. However, when the perceptron algorithm was restricted to rank 1 conferences, its performance improved to 90%.
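As a sketch of this validation protocol: split the NRC departments by the parity of their rank, fit the weighting on the odd half, and measure pairwise agreement on the even half. The helper names fit_weights and score below are hypothetical placeholders for whichever weighting method is being validated.

from itertools import combinations

def split_by_parity(nrc_order):
    """nrc_order is best-to-worst, so position 0 holds rank 1 (odd), etc."""
    odd  = [d for i, d in enumerate(nrc_order) if (i + 1) % 2 == 1]
    even = [d for i, d in enumerate(nrc_order) if (i + 1) % 2 == 0]
    return odd, even

def pairwise_agreement(order, scores):
    """Fraction of pairs (A above B in 'order') with scores[A] > scores[B]."""
    pairs = list(combinations(order, 2))
    good = sum(1 for a, b in pairs if scores[a] > scores[b])
    return good / len(pairs)

# Protocol sketch (fit_weights and score are hypothetical stand-ins for the
# method being validated, e.g. the annealing or perceptron sketches above):
#   odd, even = split_by_parity(nrc_order)
#   w = fit_weights(odd, papers)                # train on odd-ranked departments
#   scores = {d: score(w, papers[d]) for d in even}
#   print(pairwise_agreement(even, scores))     # held-out pairwise agreement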
 
 

5. Discussion

Looking at the details of the weightings computed by these methods, and at where the disagreements occurred, uncovered some possible biases that one should keep in mind when interpreting these results. First, the methods which chose weightings purely on the basis of the NRC ranking appeared to be strongly biased in favor of conferences held outside the US. For example, the simulated annealing method left out the two best conferences in theoretical computer science (STOC and FOCS), but included theory conferences held outside the US (ICALP and ASIAN) which are widely regarded as much less prestigious. One possible explanation is that members of universities near the top of the NRC ranking are more likely to have sufficient research funding to afford travelling overseas to attend conferences (note that the NRC ranking only includes universities in the US). Since members of NUS are more likely to publish in conferences held in Asia than members of US universities, this bias favors NUS. Second, counting the total research output resulted in a bias toward large departments. This was evident when looking at where disagreements with the NRC ranking occurred. Since our department is large, this bias also favors NUS.
 

6. Conclusion

Using a variety of different methods, we have estimated that the Computer Science Department of NUS would be ranked somewhere in the 20s had it been included in the US NRC ranking of CS departments. Table 2 summarizes our findings under the different methods.
 
 
Method                        NUS's rank
Threshold-dept-ranks              26
Best-subset                       21
Perceptron-all-conference         28
Perceptron-rank-1                 22
Table 2. Estimates of where the Department of Computer Science would fall in the NRC ranking of US departments, using different methods.

References

[1]  National Research Council. Research-Doctorate Programs in the United States: Continuity and Change. Computer Science rankings and other information available on the web at http://www.cra.org/statistics/nrcstudy2/home.html.
[2]  J. A. Hertz, A. Krogh, and R. Palmer. Introduction to the Theory of Neural Computation. Addison-Wesley, 1991.
[3]  D. Karaboga and D. T. Pham. Intelligent Optimization Techniques: Genetic Algorithms, Tabu Search, Simulated Annealing and Neural Networks. Springer Verlag, 1999.