CS 16 Student Guide
Semester II, Spring 1997

Lubomir Bourdev - Mark Handy - Michael Horn - Roberto Tamassia


Course Overview

Welcome to CS 16!

Welcome to Computer Science 16: Introduction to Algorithms and Data Structures. This course covers fundamental techniques for solving problems with computers, techniques which are relevant to most areas of theoretical and applied computer science. You will learn how to design, analyze, and implement algorithms and data structures for sorting, searching, graph problems, and geometric problems. You will also be introduced to basic concepts of parallel computation, a key component of future computer systems.

CS 16 uses state-of-the-art hardware and software technology for computer-aided instruction. You will have a computer account to work on color Sun SPARCstations in the Unix/Solaris environment. Slides will be broadcast on the computers during lectures and will also be available for individual study. Algorithm animations will help you visualize the algorithms and their computations. Your understanding of the course material will be strengthened and tested by means of homeworks, exams, and programming assignments. Expect to do a lot of work, but also to learn a lot.


Outline of Topics

CS 16 covers fundamental algorithms and data structures. Specific topics include:

Elementary Data Structures: 
containers, locators, sequences, trees.

Analysis of Algorithms: 
asymptotic notation, recurrence relations, worst-case time complexity.

Sorting: 
insertion sort, selection sort, heap sort, merge sort, quicksort, radix sort.

Searching: 
binary search, binary search trees, AVL trees, radix search tries, hashing, extendible hashing, string searching (Rabin-Karp), file compression (Huffman encoding).

Geometric Algorithms: 
convex hull, intersection of segments, closest pair of points, sweep-line algorithms.

Graph Algorithms: 
depth-first search, breadth-first search, connected components, biconnected components, shortest path (Dijkstra, all-pairs), minimum spanning tree (Prim, Kruskal), topological sorting, transitive closure, maximum flow (Ford-Fulkerson), bipartite matching, stable marriages.

Parallel Algorithms: 
PRAM models, prefix operations, list ranking, sorting, CREW searching, CRCW maximum finding, network models, systolic arrays.



Preliminary Requirements

The prerequisite for this course is either CS 15 or both CS 4 and CS 5. If you did not take CS 15 or CS 5 in Semester I '96-'97, you must contact Roberto to explore the possibility of a special arrangement. Programming experience in Java and some familiarity with elementary data structures, such as arrays, linked lists, stacks, and queues, are important for your success in the course. We assume you know Java syntax, basic principles of object-oriented programming, and something about object-oriented design. In addition, knowledge of basic binary arithmetic and of elementary properties of polynomials, logarithms, and exponentials is assumed. If you have any questions about the knowledge required for the course, see Roberto or a TA.

Computer Accounts

To enroll in CS 16 you must request a CS computer account that will allow you to use the SPARCstations in the Sun Lab. If you were not present on the first day of class, see a TA or access the CS 16 home page for information on setting up an account. Before you are given an account, you will have to sign the CS 16 student agreement. You may read the student agreement on the course home page.


Teaching Staff

Professor Tamassia, known to all as Roberto, has been at Brown since 1988. He received the master's degree in electrical engineering from the University of Rome in 1984 and the Ph.D. in electrical and computer engineering from the University of Illinois at Urbana-Champaign in 1988. Roberto's research interests are in algorithms and data structures, especially for graph and geometric problems, so he is very knowledgeable and enthusiastic about the material in this course. Roberto encourages active participation in class and would like to receive your feedback on the lectures and organization of the course.

Roberto and all the TAs will hold regular office hours. The schedule of office hours are available on line. They may also be contacted by e-mail; their addresses are below. If you need help using Unix e-mail, see a TA or the on-line documentation. To e-mail us from a SPARC, just type the address below in the To: field. To e-mail us from outside the department, use the address below followed by @cs.brown.edu

Instructor:
Roberto Tamassia (rt)

Graduate TA:
Srikanta Tirthapura (snt)

Head TAs:
Lubomir Bourdev (ldb), Mike Horn (msh)

Undergraduate TAs:
Elaine Chen (cs016003), Mark Handy (mdh), Sam Hazlehurst (seh), Benoit Hudson (bh), Dave Jackson (dej), Andy Miller (cs016008), David Powell (dep), Marcin Romaszewicz (mr)



Meetings

The class meets on Monday, Wednesday, and Friday, 11-12 (D hour), in room 143 of the CIT -- the Sun Lab. The schedule of the lectures is available on-line. Attending lectures is very important. Administrative announcements, information about programming assignments, and answers to questions about course material will all be available in class. You are strongly encouraged to actively participate in class discussions and to ask questions during lectures, since a portion of your grade depends on class participation.

Help Sessions

We will schedule additional meetings to discuss homework and programming assignments. Attendance is optional, but to your advantage. The time and place will be announced in class and posted on line.



Reading Material

The required reading material consists of a set of class notes, portions of which you may buy at the Metcalf Copy Center at intervals throughout the semester.

The purpose of the lecture notes is to make it easier for you to follow the lectures. These notes do not take the place of the lecture; they simply relieve you from hectic copying of complex figures or pseudo-code from the computer screen. You can instead focus on what Roberto is saying, and write your notes on an exact duplicate of what is displayed in class. (The notes are also available on line for your perusal.)

Optional reading material can be found in the following textbook, which is an excellent reference on algorithms (and also the text for CS 157, Design and Analysis of Algorithms).



On-Line Resources

Home Page

On-line versions of most handouts, plus homework solutions, programming help, environment documentation, and a link to the newsgroup (see below) are on the CS 16 home page, http://www.cs.brown.edu/courses/cs016/. Some material may be available only from a machine on the CS department network, so if you can't get a link to work from home or from a cluster, try again on a Sun.

Newsgroup

In addition to lectures, office hours, and help sessions, you will be able to ask questions to TAs at any time through the CS 16 newsgroup, brown.cs.cs016. The newsgroup can be accessed through Netscape, which starts up with your account. Simply click on the link to the newsgroup and you will be able to read the most recent information on the course. To ask questions on the newsgroup all you need to do is post to the newsgroup from Netscape. Of course, you may use other news-readers, not just Netscape, if you prefer.

Posts to the newsgroup from the TAs will include administrative announcements, answers to previously posted questions, and other information of general interest. Students may answer each others' posts, provided that the collaboration policy is not violated. Remember that the newsgroup can be read by everyone, so you shouldn't post any of your code or anything too specific. If you have doubts about whether or not posting something is legal, don't do it. Ask a TA in person, instead. Remember, too, that the purpose of the newsgroup is to allow people quick access to questions and answers. Please do not clutter the newsgroup with jokes, flames, or general discussions about the course. If you have problems or want to make suggestions, send e-mail to (or see in person) a TA, a head TA, or Roberto.

Newsgroup FAQ

During the semester the newsgroup might have a tendency to become filled with a large amount of questions, answers, and extraneous material that is not vital to your participation in the course. The newsgroup FAQ (frequently asked questions) is a document available off of the home page which contains all important articles, frequently asked questions, and TA announcements from the newsgroup. If you get behind on the newsgroup make sure to at least read the newsgroup FAQ regularly. It's got what you need to make it through your day.

System MOTD

The System Message of the Day will come up when you start your account. It contains important department-wide information, like the hours of the Sun Lab.


Graded Work

For more information about how grades are calculated, see Chapter gif.

Homework

Weekly homeworks consist of short questions and problems on the material presented in class and in the notes. Homeworks will be handed out in class and will also be available in the CS 16 file drawer, located in CIT 242, just beyond the TA room. The homeworks should be handed in to the wooden CS 16 handin bin, in the same room. Each homework is graded on a 0-10 scale.

Programming Assignments

There will be five programming assignments on the implementation of fundamental algorithms. Programs must be written in Java and must follow the specifications in the assignments. Programming assignments will be handed out in class, and will also be available in the CS 16 file bin and on line. Programming assignments can be done in the same room as the lectures (143 CIT), using Sun SPARCstations. Programs should be handed in electronically by means of the handin button on the Wave control panel. Each program is graded on a 0-10 scale.

Reference material for Java and our programming environment can be found later in this document, in a separate handout on Wave, and on line, from the course home page.

Exams

There will be two in-class exams. Review sessions will be given during the lectures preceding the exams. The exams are closed-book, but a single sheet of notes, written as densely as you wish, is allowed. Absence from an exam can be excused only for medical or emergency reasons.


Academic Code and Collaboration Policy

Except as noted below for homeworks, all assignments should be your own original work. We will strictly enforce the rules of Brown's Academic Code on this issue.

For homeworks, you may work with other CS 16 students to learn the material necessary to complete the assignment. You may not divide the assignment into parts and merely copy the parts prepared by others. When you turn in the assignment, you should be prepared to discuss your answers with a TA or with Roberto.

Programming assignments should be entirely your own work, from design, through coding, to testing and debugging. It is permissible, however, to discuss with another student high-level algorithmic and programming concepts that are unrelated to the specific assignments. That is, you can ask, ``How do you pass an integer by reference?'' and ``What happens when there are multiple keys in a priority queue?'' but not, ``Should I have a parent reference in Program 2?'' or ``Does the final program require multiple keys?''

Also, although you should not examine another student's code, it is permissible to give help on purely syntactical issues such as the format of System.out.println in Java.

Finally, you may discuss with other students any problems with the CS 16 programming environment or support code.

For the two exams, no collaboration is allowed.

Use your best judgment and consult Roberto or a TA in case of doubt as to what is allowed. If a student is found to have violated this policy, proper action will be taken. This means anything from a 0 on the assignment to disciplinary action by the University.


Getting (and Giving) Help

If you do not understand a topic covered in class or have difficulties in doing homeworks or programs, we are here to help you. The regular TA hours (posted on-line) are your first resort. If there are no TAs on hours when you need help, you can post to the CS 16 newsgroup. (You may not, however, ask questions to TAs who are not on hours. Remember that the TAs have to get their work done, too.)

Anyone on the CS 16 staff can assist you -- TAs, head TAs, the graduate TA, or Roberto. Don't be afraid to walk into Roberto's office during his hours. In fact, it is a good idea to see him at his office hours at least once in the semester. Even if you do not have specific questions, stop by to introduce yourself. It is important that you demonstrate your interest in the course, by asking questions during lectures, by participating in class discussions, and by seeing Roberto during his office hours.

Finally, your feedback is valuable to us. Please do not hesitate to provide comments and constructive criticism on the course organization.


Grading

Homework and Exam Grades

The homeworks collectively make up 25% of your grade for the semester. Your lowest homework grade will be dropped. The two exams together make up another 25%. The homeworks and exams consist of short questions and problems on the material presented in class and in the class notes. Always justify your answers. When describing algorithms and data structures, use preferably English or pseudo-code (possibly accompanied by diagrams) instead of Java code.



Program Grades

The five programs make up 50% of your final grade for the course. Your grade on each programming assignment will be based on four categories: correctness, efficiency, comprehensibility, and extensibility/reusability. These four categories are described below. See Chapter gif for a description of recommended programming practices. Following these recommendations will probably help you get good grades on your programs.

Each of the four categories will be graded on a 0-10 scale, and then the final grade will be computed by giving to the four categories the following (approximate) weights: correctness, 40%; efficiency, 30%; comprehensibility, 15%; and extensibility/reusability, 15%. The grade for the entire program may be adjusted slightly for particularly good or bad features, especially regarding design choices. Some of the assignments may deviate from this apportioning of weights; check the assignment handouts for more information.

Each program you hand in will be returned with a grading sheet attached. This sheet will identify the TA and head TA who have graded your program. If you do not agree with your grade you may go to the TA or head TA during office hours to discuss it (remember that you should do this within two weeks of receiving your grade).

Correctness

Not surprisingly, correctness describes how well your program works. A program that does not compile gets a 0. You should be absolutely clear on this: a program that does not compile gets a 0. That's a zero. If it compiles and nothing works, you will get a 0 on correctness, but you may get points for something else.

During the semester you will be implementing many algorithms that have special cases. A '10' program handles all of these cases (in addition to the non-special cases, of course). The more obscure the special case, the less that will be taken off your grade if you do not handle it. No more than 5 of the 10 points for correctness will be taken off for failure to handle special cases.

The remaining 5 correctness points are awarded for getting the general cases to work correctly. It is to your advantage to complete some parts of the program. Perhaps the algorithm has an insertion and a deletion part, and then a lookup. Devote time first to insertion, and then to lookup, and then to deletion. Instead of handing in a program with three incomplete sections, have something to show for your time spent in the lab. In many cases, the division of the algorithm will be mirrored in the separation of the interface into methods, so work on getting one method working at a time.

Try to make your program as ``bulletproof'' as possible. If your program repeatedly crashes, it won't be judged to work as well as one that prints out an error message ``I could not figure out how to handle this case, but I know that it exists and I check for it.''

Efficiency

Of the 10 points possible for efficiency, 8 are awarded for asymptotic efficiency and 2 for constant-factor efficiency. These concepts are among the first taken up by CS 16, so don't worry if you don't understand what the terms mean. Your first assignment will not have many opportunities for increasing efficiency, so you will not be penalized much for errors. In later assignments, efficiency will be very important. See Chapter gif for more on efficiency.

Comprehensibility

More than anything, commenting is a matter of personal taste. Chapter gif gives our rules for naming conventions and API comments. We intend for these requirements to be as small a burden as possible. Meeting them guarantees you 4 of the 10 points for comprehensibility.

We also want you to write and comment your code in such a way that it is easy to understand. Chapter gif explains our one-reading rule for comprehensibility. If you meet that requirement, you will receive the remaining 6 points for comprehensibility. Less understandable code will receive fewer points.

Extensibility and Reusability

The designs we recommend in CS 16 involve nodes that hold references to other nodes. This style of design is very flexible and makes it easy to design data structures with subtly different characteristics. However, the wrapper around the nodes makes the structure inaccessible to clients of the data structure, so extending an existing class is difficult. For this reason, you should be careful to put any functionality that clients might reasonably want to change in the wrapper; don't hard-wire it into the nodes.

See Chapter gif for more on extendibility and reusability.

Late Policy

As with most classes in the CS department, there is an official late policy for CS 16. Fully aware of the pressures of being a student, we allow programs to be handed in late.

Each program has a regular deadline which is specified on the program handout. All programs handed in after the regular deadline are considered to be late. However, each student will be allotted a total of 120 ``late credits'' towards submitting programs late. Each late credit can be used to extend by one hour the deadline for a program. You may allocate your late credits towards your programs in any way you choose, and you may rearrange the allocation at any time during the semester.

At the end of the semester, when the final grades are computed, a penalty will be applied to late programs according to the following rules, where C denotes the number of late credits applied to a program.

Additional late credits will be given for medical or emergency reasons. See Roberto or a Head TA to request late credits.

Homeworks have only one deadline. They cannot be turned in late since we make the answers available after the deadline. However, remember that the lowest score on a homework is not considered towards the final grade.


Final Grade

The final grade will be determined as follows: the programs count for 50% of the grade (10% each), the homeworks for 25%, and the exams for 25%. Your lowest homework score will be dropped. Grades on homeworks, programs and exams cannot be changed after two weeks from the date they are returned to you. Roberto will determine your final letter grade by considering your grand total of points on a percentage scale, and by taking into account active participation in class and special interest shown in the course.



Problems

How to Get an A

Here are some hints on how to do well in CS 16.


Computing Environment

This chapter describes the operating environment you will use for the programming assignments.

Sun SPARCstations and the Sun Lab

The computers you will be using are the Sun SPARCstations in the Sun Lab. These machines are connected to a central file server where your files will be stored; you will never need to buy or carry a disk. We have over sixty SPARCs in the Sun Lab, and although occasionally there is a wait list, the lab has more than suited our needs in the past. All programming assignments must compile, execute, and be handed in on these machines. We cannot accept programs written for other computer platforms, although you may work on another computer for editing, debugging, etc. (Information on logging in remotely is available from the CS department's systems page, http://www.cs.brown.edu/system/)

The hours of the Sun Lab (CIT 143) vary and can be found posted on the door. Occasionally classes like CS 16 use the lab for lectures, closing it to other students. These times will also be posted or can be found in the System MOTD, which comes up when you start your account.

Whenever the lab is open a student consultant will be sitting at the workstation nearest the door. This student is there to administer the lab and help with any problems that may come up. However, do not ask the consultant questions specific to CS 16, even if the consultant happens to be a CS 16 TA. If there is a problem with your account that the consultant cannot fix, send e-mail to a CS 16 TA from your account.

Please use common sense and follow the general rules of the lab below:


Ergonomics

Please take time to read the ergonomics information document off of the home page. This is for your own good. We are all working long hours in an environment that can be very stressful at times. Practicing good ergonomic habits can help you prevent painful and disabling injuries which might be a problem in your life sooner than you think.



CS 16 Environment

When you sit down at a SPARC, turn the screen on by moving the mouse or pressing the return key. Make sure the mouse is over the small gray window, and then enter your account number (cs016xxx). When requested, enter your password. After the computer thinks about this for a while, your account will appear. Its components are described in more detail below.

If you need to leave the lab only briefly, you can lock your machine without logging out. Choose ``xlock'' from the menu under the left mouse button. (That is, move the mouse so that it is not over any window, then press the left button, then drag down to the word ``xlock'' and release the button.) Cool graphics will take over your screen, and you will have to hit return and enter your password to continue working. Please do not xlock when there is a wait list or if you are using one of the experimental ergonomic workstations.

When you leave the lab for good, log out by choosing ``Log out'' from the menu under the right mouse button.

Your Environment

The following programs are run automatically by your account when it starts up. Feel free to ask the TAs if you need help in using them.

Window Manager:
Our window manager (fvwm) provides you with a virtual desktop that is larger than the physical workstation screen. The square in the upper right hand corner of your screen represents the virtual desktop. The section you are currently in will be highlighted. You can switch among screens by hitting a function key (F1, etc.) or by clicking in one of the squares of the miniature display. The colored rectangles on the virtual desktop represent the windows on that virtual screen. Dragging these with the mouse (middle button) moves them, even onto a different virtual screen.

The window manager controls other aspects of how your environment work, too. These can be changed by editing your dot files (invisible files starting with a period), particularly .fvwmrc. However, you are responsible for any changes you make. If you mess things up, the TAs might not be able to help you unmess them.

Wave:
The latest version of Wave will start up when your account does. However, your account does not depend on it, so you can quit Wave without logging out. Wave is also independent of Netscape and XEmacs (see below), although it uses connections with them. If you need to restart Wave, type wave & in a shell, or just choose Wave from the menu under the left mouse button. After it starts, you will need to re-establish a connection with XEmacs by holding the Meta key (the one with the diamond on it) and typing the x key. Then type wave-server-run. XEmacs will then seek a connection with Wave. (The connection between Netscape and Wave is made without your intervention.) Further documentation of Wave is available in a separate document that is linked to the CS 16 home page and will be available on paper.

Netscape:
If you need help using Netscape, please see a TA immediately, as it is an integral part of Wave, our development environment. In addition to using it as a class browser, of course, you may use it to look at the course home page, the course newsgroup, and web pages and newsgroups in general. You may also use it to get and send e-mail, if you'd like. (See the home page for other e-mail options.)

Xterms:
The Xterms (also called shells) are text-only windows that allow you to use Unix commands directly. Unix is the variety of operating system that the Suns run. Some of the most useful Unix commands are documented below.

XEmacs:
XEmacs is a text editor that you can use to write your programs. More information on how to use XEmacs is given in the CS 16 on-line documentation.

System MOTD:
The System Message of the Day gives important department-wide information, like the hours of the Sun Lab.

Changing your password

At the account sign-up session, you will be given a password that is a random collection of characters. Immediately after you log on for the first time, you should change your password to something that you can remember and that cannot be easily guessed. Do not use dictionary words or easily guessed names, especially yours. The system will only accept the first eight characters of your password; the rest will be ignored. Your password should contain a number or a punctuation character. Use the yppasswd (yellow pages password) command to change your password. Simply type yppasswd in a shell and follow the instructions it gives you. Your password will not be echoed on the screen, so type it carefully.

In addition, always wait to be prompted for your password before you begin typing it. If you start typing too soon, the password might be echoed on the screen.

Solaris

The SPARCstations run a version of the UNIX operating system called SunOS 5.3. We call the entire hardware, software and operating system environment Solaris. There are several basic UNIX concepts and commands you will need to know.

You will store your programs in named files in directories. Directories may contain other directories, so they are arranged in a tree-like structure (you'll know enough about trees in a few weeks!). When you log in you are automatically put into your home directory. This directory is owned by you. You must keep all of your files in this directory, or in sub-directories which you make inside this directory. Your home directory is a small part of a very large tree of directories, arranged in a distributed file system, meaning that you cannot tell the physical location of a file or directory by its logical location in the tree. The name of your home directory is /u/cs016xxx. Note that you use forward slashes to separate the names of directories.

Your program files will have names with two parts separated by a dot. The first part is the file name and the second part is the file type. Your source code will be stored in .java files. They will have names like NodeBinaryTree.java or MyHeap.java. When a source file is compiled, a .class file is created: NodeBinaryTree.class or MyHeap.class. It is the .class file that is executed by the Java interpreter. Finally, XEmacs makes a backup copy of the old version of your source-code file when you save a new version. This backup file is named with a tilde at the end: NodeBinaryTree.java~ or MyHeap.java~.

As you read through this section try typing the commands in your window. Some of the output you would see on your screen will be described below.

Some Useful UNIX Commands

CS 16 Environment Commands

 

We have customized the CS 16 environment to make your computing as convenient as possible. The following are some of the things you can do with your new account.

The cs16_credit and cs16_debit commands

In accordance with our system of allotting each student credits which can be applied to late programs, we have two commands to allow you to change your credit allocations. Type cs16_credit asgn_num credits to allocate a certain number of credits to the given assignment number. For example, if you type cs16_credit 1 10 in your shell, you will allocate 10 of your remaining credits to program 1. Typing this command again allocates another 10 credits towards program 1. You may also deduct credits from a program with the cs16_debit command, which takes the same command-line arguments. At any time, you may view your current credit allocations by typing either cs16_credit or cs16_debit.

The Grades Database

In CS 16 we keep all grades and other course records on-line in a database. You can check your grades through the left-mouse-button menu. This will pop up a window with the status of all of your assignments. A sample grade report is shown in Table gif. Notice that this display will tell you who graded your work and whether or not we consider your work to be late. Note that even if you apply credits to a late assignment, it will still be marked as late. At the end of the semester, you will be asked to submit your final credit allocations, and then final grades will take these into account.

 

 ___
|___| cs016 Grade Report                 cs016099 : Sample Student


Asgn          Grade Avg  Handed In        Status               Graded by
------------- ----- ---- ---------------- -------------------- -----------------
hw1           10.0  7.8  in on time       graded               Mark Handy
hw2           9.0   7.6  in on time       graded               Benoit Hudson
hw3           9.0   7.8  in on time       graded               Marcin Romaszewicz
hw4           2.0   8.2  in on time       graded               Elaine Chen
hw5           8.0   7.2  in on time       graded               David Powell
hw6           9.0   8.6  in on time       graded               Sam Hazlehurst
hw7           9.5   9.0  in on time       graded               Mark Handy
hw8          10.0   8.0  in on time       graded               Andy Miller
hw9           7.5   8.1  in on time       graded               David Jackson
hw10          6.5   7.2  in on time       graded               Mark Handy
asgn1         2.5   7.9  in on time       graded               Marcin Romaszewics
asgn2        10.0   7.1  in late          graded               Elaine Chen
asgn3        10.0   8.3  in on time       graded               David Powell
asgn4         9.0   5.8  in on time       graded               Sam Hazlehurst
asgn5         8.0   6.4  in late          graded               Andy Miller
exam1        28.0  27.5  in on time       graded               Roberto Tamassia
exam2        25.0  14.9  in on time       graded               Roberto Tamassia

final_grade   A                                                Roberto Tamassia

 
Table: Sample grade report

Menus

Three sets of menus can be accessed through the mouse buttons. (Put the mouse over the background of your account and hold down the appropriate button.) The left mouse button includes Grades, mentioned above, and other useful programs. Start Wave from this menu.

The middle mouse button has standard window operations. Many of these can be performed directly on windows, using the mouse in the title bar or the menu in the upper left corner. Always attempt to Delete a window before you Kill it. Only if Delete doesn't work should you choose Kill.

The right mouse button has other miscellaneous commands pertinent to CS 16 (including logging out).

Demos

We will make available a demo version of each programming assignment to give you an idea of what we expect from you. Note that the demos are only examples of good programs. You do not need to imitate them in all respects. In addition, they are not necessarily an example of a perfect program. Use them as a starting point for designing your program, and feel free to improve on the design.

The demos can be started from one of the course pages, which also contains links to other algorithm animations and instructions for running them.


Programming

A large part of your work in CS 16 will be devoted to implementing algorithms and data structures in Java. This work serves two purposes. First, it makes you more intimately familiar with the course material, and, second, it helps you improve your basic programming skills: designing, coding, testing, and debugging. The present chapter treats the second purpose by recommending (not requiring) certain programming practices. A sample Java program that illustrates our recommended programming practices is given in Appendix gif.

Priorities

Your programming priorities should be the following, in decreasing order of importance:

Correctness,
both general and special cases. This is crucial for the algorithms and data structures we will study in CS 16. Even what seems like a rare special case in the context of designing an algorithm could be called for thousands of times in the context of some applications.

Asymptotic efficiency.
This is one of the first ideas that CS 16 takes up, so don't worry if you don't understand the term. Intuitively, asymptotic efficiency refers to how expensive (in memory or time) a solution gets as the problem it solves gets larger and larger. For most of your programs, the asymptotic efficiencies will be specified, and strong hints about the necessary algorithms and data structures will be given, so your goal for this area will be very clear.

Extensibility.
We have tried to make sure that the interfaces to which you will have to conform are generally useful. Programming for extensibility, therefore, will mainly involve satisfying the interfaces correctly, and secondarily should involve considering what parts of your algorithm someone else might reasonably want to change, and making those parts easily changeable.

Comprehensibility.
Making your code understandable is an important part of making it maintainable and extensible. This requires careful thought about what parts of your program will be confusing to someone who is capable of understanding most of it easily, then careful writing to document those parts. Consider also documenting thoroughly those parts that people might want to change, for easy extensibility.

Constant-factor efficiency.
This is the less structural counterpart to asymptotic efficiency. In implementations of algorithms, constant-factor efficiency can mean halving or tripling the execution time of the algorithm. Needless to say, this is a crucial consideration for real-world performance. However, constant-factor efficiency depends on many things besides the structure of the algorithm and the size of the problem -- processor speed and architecture, hard-drive speed, memory speed, implementation language, specialized coding practices, etc. -- and therefore it is of secondary importance in CS 16.

This order of priorities does not imply anything about the temporal order in which you should work. If you kept only correctness in mind as you started to design your Dictionary, you would use a linear data structure, because that makes implementation easy. Then, when you turned to asymptotic efficiency, you would have to throw away the linear data structure, along with all of your work thus far, and choose a different data structure. Clearly, you want instead to focus on all the priorities at once as you do your design. The order of priorities does mean, however, that you should expect to spend most of your time on ensuring correctness, which means a lot of time considering special cases before you begin coding, and a lot of time testing and debugging after you have some code.


Design Principles


Testing and Debugging Techniques

Careful consideration of how to test your programs is essential to assuring correctness. You need to think about the problem and your solution enough to discover the special cases and make sure you take care of them. Then, when you find you have not done so, you need to have a methodical way of finding and fixing the bug. In the absence of a reliable debugger for Java, you will need to resort to more primitive techniques for finding bugs.



Further Reading

We recommend Design Patterns: Elements of Reusable Object-Oriented Software, by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison-Wesley, 1995). It documents many patterns that you may find useful in writing your programs, in understanding the support code, in programming in general, and in passing CS 32, for which it is the textbook. Some of the patterns were used in CS 15.

We also recommend to you Writing Solid Code, by Steve Maguire (Microsoft Press, 1995), as a source of techniques for keeping bugs from happening, and finding them easily once they do happen. The book is based on procedural C, not on object-oriented C++ or Java, so some of its recommendations do not apply directly to your work in this course. In particular, its use of assert to warn of incorrect assumptions is not supported by native Java. (It is emulated by the class gjt.Assert. See Chapter gif for more information.)



Questions to Ask Yourself

When you start to design a data structure for CS 16, you should ask yourself the following questions. Some comments (and answers) are given for the Sequence, your first assignment.

  1. Why do I care about this data structure? What can it be used for? How do its anticipated uses affect the specification of methods, locators, exceptions, and complexities? How do its anticipated uses affect the details of my design?
  2.   Can I use another data structure and modify it a little bit? If the answer is yes, you save a lot of time. For the Sequence, since we haven't told you about any other data structures, and since the whole point is to learn how to write data structures, the answer is no.
  3.   How do the required time complexities affect the design of the data structure? You already know about a couple of linear data structures: arrays and linked lists. The time complexities that we require for the Sequence eliminate arrays from consideration. Make sure you understand why, and think about what modifications to a linked list are required to meet the spec.
  4.   Can any functionality be implemented in terms of other functionality, instead of being written from scratch? For the Sequence particularly, there are a bazillion methods, many of which can be derived easily from other methods. You will have to write real code for far fewer methods if you can figure out which methods can be expressed with calls to other methods. Remember, however, that some methods can be correct if written using other methods but would be more efficient if written from scratch.
  5.   How will I implement the locator features? Since a locator can be any object that implements the Locator interface, you have a lot of freedom in answering this question. What you have to do is make sure the locator follows the element as long as the element is in the data structure, and is invalidated once the element is no longer in the data structure. This requires some ingenuity. In our VectorSequence (see Appendix gif), for instance, a locator is a specially defined object that holds both an index into the underlying Vector and a reference to the user's data object. In some nodes-and-pointers data structures, the Locator interface can be implemented by the nodes of the data structures instead of separate objects. Which design should you use for your sequence? Are there other options that we haven't thought of?
  6. When are exceptional conditions encountered, and how do I detect them, and how do I deal with them? The exceptions specified by the interfaces we require will give you major hints about the answer to this question. (And do the specifications miss any possible exceptional conditions? We hope not.) For the Sequence, we have written a VectorSequence, with different time complexities from yours but similar exceptional conditions (see Appendix gif). You should study it to learn how exceptions work and how you should use them.
  7.   How do my methods need to work in the general case? We suggest drawing a picture of your data structure, then running through the methods you are going to be implementing from scratch (see question gif) a step at a time, updating the drawing as you go. This hand-simulation will suggest a class breakdown and a repertoire of methods that your objects will need.
  8. Given a design that will work in general, what special cases do I need to detect and deal with? Special cases are related to exceptional conditions, the terrible difference being that you have to deal with special cases yourself, rather than throwing an exception and forcing the client code to deal with them. In your general-case drawing, look for places where your algorithm won't work. These will probably be at ends, edges, boundaries, transition zones, etc.
  9. Finally (as in, at the end), what do all the important methods look like, in pseudo code that can be quickly translated to real Java code? To be fair, the writing of pseudo code can happen in stages, after various parts of your design are clear, rather than when the entire design is clear. But you should be wary of sitting down and writing pseudo code (much less Java code) before you understand thoroughly the main ideas and details emphasized above.
  10. Can I write some actual code now?

On the first pass through your design, you can pretty much answer the above questions in order. As you refine it, you will need to revisit some of the questions, as you think of new uses (and new abuses) of your data structure, new special cases, and new optimizations. However, questions 2, 3, 5, and 7 will be difficult to revisit once you have settled on an answer, since changing the answer will involve redoing most of your work. For that reason, you should spend a lot of time, early on, thinking about them.

  1. Is it a coincidence that the questions that affect my code most deeply are numbered with primes? As you ponder your answer, consider the numbering of this question.


Support Code

CS 16 will give you a lot of objects to help you test and debug your data structures and algorithms. This chapter documents those objects.


Compiling and Running in a Shell

You may compile and execute Java programs from a command line, rather than through Wave, if you wish. This may save you a little time, or allow you to do two compilations at once, or allow you to run one thing while Wave is compiling another. If you decide to do this, we recommend that you create new classes and do the first and last compilations all in Wave, so that the Wave documentation will stay up to date with your work.

Use the Java compiler, javac, to compile. It takes a file name or file names. Here are some examples of legal invocations of the compiler:

javac MyClass.java (compile a single file)
javac *.java (compile all the .java files in current directory)
javac /u/cs016000/wave/src/*.java (compile all the .java files in the specified directory

Use the Java interpreter, java, to run classes. For a class to run it must have a method with the signature public static void main (String args[ ]). The script run_project, documented below, just runs a class that, at this writing, is named cs16Viz. If you wanted to execute it yourself, instead of running the script, this is how you would do it:

java Visualizing.cs16Viz

Notice two important differences from the compiler invocation. First, the interpreter takes a name qualified by packages, not by directories. Second, the name does not end with .java. These are side effects of the fact that the interpreter executes a class, whereas the compiler compiles a file.

Writing a main(.) may seem daunting, but it is actually very simple. Furthermore, main(.) methods are very useful for testing the functionality of classes at early stages of development. The VectorSequence that the TAs are providing as an example uses a simple main(.) to make sure that the basic methods work correctly. You may want to study it to find out how to write your own testing programs. If you type the following, the VectorSequence's main will be executed:

java VectorSequence.VectorSequence

The String args[ ] that main takes as a parameter is an array of the arguments that were passed on the command line when the interpreter was invoked. You will most likely want to ignore it. Writing a main(.) generally involves instantiating one or more things, hooking them together if appropriate, and calling one or more methods on them. We recommend you use small main programs to test your data structures, starting early and doing it often.


Visualizing Data Structures

You can use container visualizers to see the contents of the data structures you will be writing for the first three programming assignments. A visualizer takes a container that implements one of the container interfaces. When you request an operation on the container (using the GUI described below) the visualizer calls methods on that interface in order to display its contents on the screen. Since there is no good way to visualize locators, locators are represented by their elements. This means that different locators with the same element will appear the same on the screen.

We take a picture of your container before an operation on it begins, and we take another picture when the operation returns. In between, within your code you can request snapshots during execution of the operation. If a complicated method has a bug, you can break it into as many snapshots as you want, so you freeze-frame through the method.

How Visualization Works

It is important that you understand something about how our visualizers work, so that you will know which strange behaviors come from our code and which come from yours.

Do not forget that we visualize by calling methods on your data structure. For instance, for the Sequence we call first() to get the starting locator, and then we call next(.) repeatedly to get the rest of the locators, and then we catch the BoundaryViolationException that next(.) throws to find out when we have reached the end. There are therefore two requirements for visualization, and it is crucial that you meet them.

We have designed the visualization algorithms to use as few methods as possible of your containers, so you don't have to write and debug much code before you can use the visualizers. The methods used for visualization will be listed for each assignment. Be sure to find that list in the assignment handout and learn it by heart.

When we describe the GUI for the visualizers, we will describe a time-line that lets you choose among the snapshots that we and you have taken of your data structure. For now, just understand that such a mechanism exists. The way we store the multiple snapshots is by making one copy of your data structure for every snapshot request (whether the request comes from us or from you). We make a new instance of your container by calling newContainer() on it. Then we use the minimal methods described above to copy the contents of the old one to the new one. The new one becomes the data structure that is visualized for that event on the time-line.

All the snapshots are created when your method is called. Only after the method returns do you get an opportunity to see the snapshots.

This is what happens when a snapshot of your data structure is requested:

  1. first, a new, empty instance of your container is created with newContainer();
  2. the old instance is traversed using minimal methods of the container interface (like first() and next(.)); during the traversal, the contents of the old container are also inserted into the new container;
  3. the old copy is released and the new copy becomes the container that will be visualized when that snapshot is viewed.
when you ask to see a particular snapshot, the container copy is traversed again (using the same minimal methods), and the locators' elements are drawn on the screen.

We will now tell you how you create one of these visualizers, and how you use it.

Instantiating a Visualizer

In a shell, type run_project>


Transfer interrupted!