Benchmarking Research Performance in Department of Computer Science,
School of Computing, National University of Singapore

Philip M. Long, Tan Kian Lee, Joxan Jaffar
April 12, 1999

In April 1999, the Department of Computer Science at the National University of Singapore conducted a study to benchmark its research performance. Using publication counts alone, the study shows that NUS ranks between 21st and 28th when compared against the top 70 CS departments in the US. In this article, we present the methodology adopted and report our findings.

1. Background

As part of its self-assessment effort, the Department of Computer Science at the National University of Singapore conducted a study to benchmark its research performance. The study used publication statistics to estimate where it would have been placed in an authoritative ranking of CS departments.

We chose to use statistics of conference publications instead of journal publications because, in computer science, conferences are the primary means of communicating research results; they are refereed, and some are very selective. We used papers published from 1995 to 1997; we stopped at 1997 so that the proceedings from the most recent year would be likely to be available in the library. Prior to this exercise, our department had divided conferences into three categories based on their prestige level: rank 1 (the most prestigious conferences), rank 2, and rank 3. Since we felt publications in rank 1 and rank 2 conferences are far more relevant to the standing of a department, and to save on data collection costs, we omitted rank 3 conferences from consideration. A few of the remaining proceedings were not available in the library, so our study used the 109 conferences of rank 2 and above whose proceedings were available.

We divided the rank 2 conferences into two groups, picking out a small collection of the better rank 2 conferences, which we will refer to as rank 2A conferences, with the remainder as rank 2B conferences. This was done by consulting faculty in different areas and asking their opinions: they could support their case for a conference using the usual arguments, such as a small acceptance ratio, publication of prominent results, or participation by famous researchers in the conference or on its program committee.

As our "authoritative ranking" of CS departments, we used the ranking published by the National Research Council [1]. To save on data collection costs, we used only the top 70 universities in that ranking (see Table 1). We note that our estimate is obtained only from publication statistics, whereas the original ranking done by the NRC took into account other factors.

**Table 1. The (1995) US National Research Council Top 70 CS Departments.**
1 Stanford University	26 Purdue University	51 University of Illinois at Chicago
2 Massachusetts Inst of Technology	27 Rutgers State Univ-New Brunswick	52 Washington University
3 University of California-Berkeley	28 Duke University	53 Michigan State University
4 Carnegie Mellon University	29 U of North Carolina-Chapel Hill	54 CUNY - Grad Sch & Univ Center
5 Cornell University	30 University of Rochester	55 Pennsylvania State University
6 Princeton University	31 State U of New York-Stony Brook	56 Dartmouth College
7 University of Texas at Austin	32 Georgia Institute of Technology	57 State Univ of New York-Buffalo
8 U of Illinois at Urbana-Champaign	33 University of Arizona	58 University of California-Davis
9 University of Washington	34 University of California-Irvine	59 Boston University
10 University of Wisconsin-Madison	35 University of Virginia	60 North Carolina State University
11 Harvard University	36 Indiana University	61 Arizona State University
12 California Institute Technology	37 Johns Hopkins University	62 University of Iowa
13 Brown University	38 Northwestern University	63 Texas A&M University
14 Yale University	39 Ohio State University	64 University of Oregon
15 Univ of California-Los Angeles	40 University of Utah	65 University of Kentucky
16 University of Maryland College Park	41 University of Colorado	66 Virginia Polytech Inst & State U
17 New York University	42 Oregon Graduate Inst Sci & Tech	67 George Washington University
18 U of Massachusetts at Amherst	43 University of Pittsburgh	68 Case Western Reserve Univ
19 Rice University	44 Syracuse University	69 University of South Florida
20 University of Southern California	45 University of Pennsylvania a	70 Oregon State University B.
21 University of Michigan	46 University of Florida
22 Univ of California-San Diego	47 University of Minnesota
23 Columbia University	48 Univ of California-Santa Barbara
24 University of Pennsylvania b	49 Rensselaer Polytechnic Inst
25 University of Chicago	50 Univ of California-Santa Cruz

2. Basic Method and Result

Once we had divided the conferences into three nested categories (rank 1 only, ranks 1+2A, and ranks 1+2A+2B), we counted the number of papers published in the selected conferences by NUS and by the 70 US computer science departments. For each category, we then checked how well the resulting paper counts agreed with the ranking published by the NRC. To measure the degree of disagreement, we counted the number of pairs of universities for which University A was ranked above University B by the NRC, but University B had the higher paper count. We took the prestige threshold with the fewest disagreements (this turned out to be rank 1 conferences only) and looked at where NUS fell in the ranking it induced. Using this method, NUS's estimated ranking among US universities was 26th. Counting rank 1 conference papers agreed with 80% of the relative rankings of the NRC study. Using the other prestige thresholds yielded similar results, with slightly higher rankings for NUS and slightly more disagreements with the NRC ranking.
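The pairwise disagreement measure described above can be sketched in a few lines of Python. This is only an illustration: the function names and the paper counts are invented, and the rule used to place an outside department (one plus the number of ranked departments with strictly more papers) is an assumption, since the text does not spell out how NUS's position was read off.

```python
from itertools import combinations

def count_disagreements(paper_counts):
    """paper_counts[i] is the paper count of the department ranked i+1 by the NRC.

    A pair (A, B) with A ranked above B disagrees when B has strictly more papers.
    """
    return sum(
        1
        for a, b in combinations(range(len(paper_counts)), 2)
        if paper_counts[b] > paper_counts[a]
    )

def agreement_rate(paper_counts):
    """Fraction of department pairs on which paper counts agree with the ranking."""
    n = len(paper_counts)
    total_pairs = n * (n - 1) // 2
    return 1 - count_disagreements(paper_counts) / total_pairs

def estimated_rank(paper_counts, outsider_count):
    """Assumed placement rule: 1 + number of ranked departments with more papers."""
    return 1 + sum(1 for c in paper_counts if c > outsider_count)

# Invented counts for six departments, listed in NRC rank order.
counts = [50, 42, 45, 30, 31, 12]
print(count_disagreements(counts))        # 2  (42 vs 45, and 30 vs 31)
print(round(agreement_rate(counts), 3))   # 0.867
print(estimated_rank(counts, 33))         # 4
```

Ties are not counted as disagreements here, which follows the wording "had the higher paper count".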

3. Other Methods and Results

Despite the best intentions of the members of the computer science department, it is natural to suspect that some bias might have crept into our department's ranking of conferences. To address this potential problem, we tried a variety of different methods, which balanced our prior knowledge about the prestige of conferences with information obtained by looking at where members of well-respected universities published.

3.1 Choosing conferences using the NRC ranking only

In this method, we used the NRC ranking to choose the subset of conferences to use for paper counting. For this, we used "simulated annealing" [3], a standard optimization technique, to choose a subset of conferences to approximately minimize the number of disagreements with the NRC ranking. There was no prior bias toward any particular subset, and all of the conferences for which we collected data were considered for membership in the final subset. Using this method, NUS's estimated ranking among US universities was 21st. Counting publications in the chosen subset of conferences agreed with 86% of the relative rankings of the NRC study.
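A minimal sketch of how simulated annealing can search for such a subset is given below. It is not the configuration used in the study: the move (toggling one conference in or out), the geometric cooling schedule, the step count, and the toy data are all assumptions made for illustration. The sketch also refuses the empty subset, which would trivially produce zero disagreements because every department would be tied at zero papers.

```python
import math
import random
from itertools import combinations

def disagreements(counts, subset):
    """counts[d][c] = papers of department d (in NRC rank order) at conference c."""
    totals = [sum(row[c] for c in subset) for row in counts]
    return sum(
        1 for a, b in combinations(range(len(totals)), 2) if totals[b] > totals[a]
    )

def anneal_subset(counts, steps=2000, t_start=2.0, t_end=0.01, seed=0):
    """Search for a non-empty conference subset with few disagreements."""
    rng = random.Random(seed)
    n_conf = len(counts[0])
    current = set(range(n_conf))            # start from all conferences
    cur_cost = disagreements(counts, current)
    best, best_cost = set(current), cur_cost
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / steps)
        candidate = current ^ {rng.randrange(n_conf)}   # toggle one conference
        if not candidate:
            continue                        # skip the degenerate empty subset
        delta = disagreements(counts, candidate) - cur_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = candidate, cur_cost + delta
            if cur_cost < best_cost:
                best, best_cost = set(current), cur_cost
    return best, best_cost

# Toy data: conference 0 tracks the ranking, conference 1 runs against it.
toy_counts = [[5, 0], [4, 1], [3, 2], [2, 9], [1, 20]]
best, cost = anneal_subset(toy_counts)
print(sorted(best), cost)
```

On the toy data the only non-empty subset with no disagreements is the one containing conference 0 alone, so that is what the search should settle on.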

3.2 Weighting the conferences

Instead of choosing a subset of conferences and simply counting publications in that subset, a more refined measure of the overall research output of a department could be obtained by assigning a weight to each conference that reflects how prestigious it is, and then calculating the total weight of the publications of the department. How can the NRC ranking be used to estimate appropriate weights? One reasonable goal is to find a weighting such that the relative "total publication weight" of departments agrees with their relative NRC rankings to the greatest extent possible. An iterative algorithm, called the perceptron algorithm, can be proved to find a weighting which agrees with the ranking exactly if there is one (see [2]). We applied the perceptron algorithm using all of the conferences. It found a weighting that agreed with all of the relative rankings of the NRC study. Using this weighting, NUS's rank was 28th. We were concerned that the number of parameters being adjusted by this method might be too large relative to the amount of data used to set them, so we ran the experiment again using only the rank 1 conferences. Using this weighting, NUS's rank was 22nd. The agreement with the NRC ranking was 91%.
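One way to apply the perceptron algorithm to a ranking is to treat each pair of departments as a training example, asking that the higher-ranked department receive the larger weighted total. The sketch below takes that pairwise reduction as an assumption; the text does not say exactly how the algorithm was set up, and the function names, epoch limit, and toy counts are invented. Note that nothing here stops a conference from receiving a negative weight.

```python
from itertools import combinations

def rank_perceptron(counts, max_epochs=1000):
    """counts[d][c] = papers of department d (in NRC rank order) at conference c.

    Returns (weights, converged). converged is True when a full pass over all
    pairs makes no update, i.e. the weighting agrees with every pairwise ranking.
    """
    w = [0.0] * len(counts[0])
    for _ in range(max_epochs):
        mistakes = 0
        for a, b in combinations(range(len(counts)), 2):
            # Department a is ranked above b, so we want w . (x_a - x_b) > 0.
            diff = [xa - xb for xa, xb in zip(counts[a], counts[b])]
            if sum(wi * di for wi, di in zip(w, diff)) <= 0:
                w = [wi + di for wi, di in zip(w, diff)]   # perceptron update
                mistakes += 1
        if mistakes == 0:
            return w, True
    return w, False

def weighted_totals(counts, w):
    """Total publication weight of each department under weighting w."""
    return [sum(wi * xi for wi, xi in zip(w, row)) for row in counts]

# Toy data: conference 0 tracks the ranking, conference 1 runs against it.
toy_counts = [[5, 0], [4, 1], [3, 2], [2, 9], [1, 20]]
weights, converged = rank_perceptron(toy_counts)
print(weights, converged)                    # [1.0, -1.0] True
print(weighted_totals(toy_counts, weights))  # [5.0, 3.0, 1.0, -7.0, -19.0]
```

If no exact weighting exists, the loop simply stops after `max_epochs` passes and reports `converged` as False; the agreement rate of the returned weights would then have to be measured separately.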

4. Validating the Methods

To assess the quality of the different methods that we considered, we used a variant of a standard technique called "cross validation". Note that each method can be viewed as using the NRC ranking to estimate a weighting on conferences, and then using the weighting to rank the departments again (the methods which choose a subset of conferences can be viewed as assigning a weight of 0 to conferences that are not chosen, and 1 to those that are). We performed the following experiment to estimate the quality of the different weighting methods. First, we applied the method under test to choose a weighting of the conferences using only the departments with odd-numbered ranks in the NRC ranking. Then, we took the resulting weights and counted the number of disagreements that they had with pairs of departments with even-numbered ranks. We found that the threshold method based on our department's conference rankings agreed with 80% of the pairs of even-numbered-ranked universities. The algorithm which used simulated annealing to choose a subset of the conferences, when applied to the odd-numbered-ranked universities, found a subset of conferences whose paper count also agreed with 80% of the pairs of even-numbered-ranked universities. The perceptron algorithm, applied to all the conferences and the odd-numbered-ranked universities, yielded a weighting which agreed with 67% of the pairs of even-numbered-ranked universities. However, when the perceptron algorithm was restricted to rank 1 conferences, its performance improved to 90%.
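The odd/even hold-out experiment can be sketched as follows. The helper names are hypothetical, the counts are invented, and `fit_best_single` is only a stand-in fitting method (it keeps the single conference that agrees best with the training departments); any of the methods discussed earlier could be passed in its place as long as it returns one weight per conference.

```python
from itertools import combinations

def agreement(counts, weights):
    """Fraction of pairs whose weighted totals do not contradict the rank order."""
    totals = [sum(w * x for w, x in zip(weights, row)) for row in counts]
    pairs = list(combinations(range(len(totals)), 2))
    disagree = sum(1 for a, b in pairs if totals[b] > totals[a])
    return 1 - disagree / len(pairs)

def odd_even_validation(counts, fit):
    """Fit weights on odd-ranked departments, score them on even-ranked ones.

    counts is listed in NRC rank order, so index 0 is rank 1 (odd).
    """
    train = counts[0::2]   # ranks 1, 3, 5, ...
    held_out = counts[1::2]  # ranks 2, 4, 6, ...
    weights = fit(train)
    return agreement(held_out, weights)

def fit_best_single(train):
    """Stand-in method: weight 1 on the single best conference, 0 elsewhere."""
    n_conf = len(train[0])
    def one_hot(c):
        return [1.0 if i == c else 0.0 for i in range(n_conf)]
    best = max(range(n_conf), key=lambda c: agreement(train, one_hot(c)))
    return one_hot(best)

def fit_all_ones(train):
    """Baseline: plain paper counting over every conference."""
    return [1.0] * len(train[0])

# Invented counts for six departments (rank order) at two conferences.
toy_counts = [[9, 0], [8, 3], [7, 1], [5, 4], [6, 2], [1, 12]]
print(odd_even_validation(toy_counts, fit_best_single))          # 1.0
print(round(odd_even_validation(toy_counts, fit_all_ones), 3))   # 0.333
```

Because the weights are chosen without seeing the even-ranked departments, the held-out agreement gives a less optimistic picture than the agreement measured on the data used for fitting, which is the point of the 67% versus 100% contrast for the perceptron on all conferences.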

5. Discussion

Looking at the details of the weightings computed by these methods, and at where the disagreements occurred, uncovered some possible biases that one should keep in mind when interpreting these results. First, the methods which chose weightings purely based on the NRC ranking appeared to be strongly biased in favor of conferences held outside the US. For example, the simulated annealing method left out the two best conferences in theoretical computer science (STOC and FOCS), but included theory conferences held outside the US (ICALP and ASIAN) which are widely regarded to be much less prestigious. One possible explanation for this is that members of universities near the top of the NRC ranking are more likely to have enough research funding to afford overseas travel to conferences (note that the NRC ranking only includes universities in the US). Since members of NUS are more likely to publish in conferences held in Asia than members of US universities, this bias favors NUS. Second, counting the total research output resulted in a bias toward large departments. This was evident when looking at where disagreements with the NRC ranking occurred. Since our department is large, this bias also favors NUS.

6. Conclusion

Using a variety of different methods, we have estimated that the Computer Science Department of NUS would be ranked somewhere in the 20s had it been included in the US NRC ranking of CS departments. Table 2 summarizes our findings under the different methods.

**Table 2. Estimates of where the Department of Computer Science would fall in the NRC ranking of US departments using different methods.**
Method	NUS's rank
Threshold-dept-ranks	26
Best-subset	21
Perceptron-all-conference	28
Perceptron-rank-1	22

References

[1] National Research Council. Research-Doctorate Programs in the United States: Continuity and Change. Computer Science rankings and other information available on the web at http://www.cra.org/statistics/nrcstudy2/home.html.
[2] J. A. Hertz, A. Krogh, and R. Palmer. Introduction to the Theory of Neural Computation. Addison-Wesley, 1991.
[3] D. Karaboga and D. T. Pham. Intelligent Optimization Techniques: Genetic Algorithms, Tabu Search, Simulated Annealing and Neural Networks. Springer Verlag, 1999.