0 Introduction

You are advised to review the material covered in chapters 1 through 12.

This is an ungraded lab assignment, but you are encouraged to complete at least Exercise 1. Exercise 2 is a challenger.

This lab assignment concerns program efficiency. While the study of efficiency is a topic of a follow-up module (CS1102), usually under the subject of asymptotic analysis, complexity analysis or analysis of algorithms where concepts such as big-O notation are introduced, here we shall contend ourselves with empirical results. For those who are interested in the theoretical aspects of asymptotic analysis of algorithms, there are reading materials abound on the Internet. For group CS1101X, I will cover this non-examinable topic at an introductory level if I have spare time, to whet your appetite a little.

Do not use the following in your programs for this lab assignment. (If you are in doubt whether you can use certain features not mentioned below and are not yet covered in class, please consult your discussion leader or lecturer first):

• Recursion
• Inheritance and polymorphism
• Advanced data structures (such as, but not limited to, trees, heaps, graphs) that are not covered in CS1101.

You may assume that all input data are correct.

Test your programs thoroughly with your own input data before you submit to CourseMarker. Do not use CourseMarker as a debugging tool. For this lab, since it will not be graded, the number of submissions is set to 30.

1 Exercise 1: Basic Sort

1.1 Learning objectives

• File input/output
• Random number
• Writing a basic sort
• Timing a program

In preparation for Exercise 1 in this assignment, you need to create a text file numbers that contains many integers. The text file should contain a positive value n on its first line. This is to be followed by n lines of random non-negative integers in the range [0, 20000].

Write a program GenerateIntegerList.java to create the above text file numbers, Your program should preferably take an inline argument for n, as shown below:

```\$ java GenerateIntegerList 5
```

instead of requesting for n during the run:

```\$ java GenerateIntegerList
```

The above examples should create the text file numbers that contains 6 lines of integers, as shown below (the first line contains 5, while the subsequent lines contain the 5 random integers generated):

```5
7843
13918
3607
16886
17699
```

Refer to Chapter 12 File Input and Output on how to create a text file, or you may use the output redirection operator '>' if you are using UNIX.

The 3 basic sorting techniques are Bubble Sort, Selection Sort, and Insertion Sort. Bubble Sort and Selection Sort have been covered in class; the algorithm of Insertion Sort on an n-element array A is as follows: on pass i (1 ≤ i < n), insert A[i] into the correct position in the sorted region to its left: A[0] ... A[i - 1]. At the end of pass n - 1, the array would be sorted.

Write a program BasicSort.java to implement one of the 3 basic sorts. The array should be an integer array, and your program should read the values from the text file numbers created earlier. You may use the Scanner class for reading from a text file. Two demonstration programs DemoLab8v1.java and DemoLab8v2.java show how you may read data from a text file numbers that contains some integers are given.

We usually assess the running time of a sort algorithm by counting the number of comparisons or the number of swaps between elements. In the worst case all the 3 basic sorts have to make roughly n × (n-1) / 2 comparisons, or O(n2) comparisons in big-O notation. Faster sorting techniques, such as heap sort, merge sort, quick sort, just to name a few, have fewer comparisons, and they are covered in CS1102 and other advanced modules.

To find out how fast your sort algorithm works, time your program. (Refer to section 6.10 Estimating the Execution Time on page 336 in your textbook.) You should run your program on different arrays of various sizes, for example, 2000, 4000, 6000 elements, etc. Plot a graph of n (array size) against the running time.

1.3 Sample run

Sample run (output shown in bold purple):

```\$ javac BasicSort.java
\$ java BasicSort
Array size  : 30000
Elapsed time: 3.309 sec.
```

1.4 Submission

Submit your program BasicSort.java through CourseMarker.

1.5 Important notes

• As the basic sorts are slow, you will find that they take a long time for large arrays.
• There are plenty of reading materials on asymptotic analysis and big-O notation. You may read up Big O notation from Wikipedia.

2 Exercise 2: Pixels (Challenger)

2.1 Learning objectives

• Writing efficient program

The most common way of representing pixels (picture elements) is to use the RGB (red, green and blue) values, which range from 0 to 255.

Given a list of pixels represented by their RGB values, we may define similarity as follows: A list is of similarity measure D if for all pixels in this list, A × (r - rmin) + B × (g - gmin) + C × (b - bmin) ≤ D where A, B, C and D are integer constants, r, g, and b are the RGB values for the pixel, and rmin, gmin, and bmin are the minimum R, G, and B values among all the pixels in this list. Note that more than one pixel may have the same RGB value.

Given the RGB values of n pixels, and the integer constants A, B, C, and D, we want to find the size of the largest sublist of pixels with similarity D. Write a program Pixels.java to solve this problem.

The first line of the input contains 5 integers: n (where 1 ≤ n ≤ 200), A, B, C, and D (where 1 ≤ A, B, C ≤ 10000, and 1 ≤ D ≤ 10000000).

The following n lines each contains three integers r, g and b (where 0 ≤ r, g, b ≤ 255) representing the RGB values of a pixel.

Your program should output the size of the largest sublist of these pixels of similarity measure D.

For example, given a list of 5 pixels:

```3 7 5
4 8 6
5 9 7
4 9 7
8 9 6
```

and given also that A = B = C = 1, and D = 5, if we pick a sublist that contains the first, second and fourth pixels {(3,7,5), (4,8,6), (4,9,7)}, then rmin = 3, gmin = 7, and bmin = 5. For the first pixel (3,7,5), the measure is 1×(3-3) + 1×(7-7) + 1×(5-5) = 0. For the second pixel (4,8,6), the measure is 1×(4-3) + 1×(8-7) + 1×(6-5) = 3. For the fourth pixel (4,9,7), the measure is 1×(4-3) + 1×(9-7) + 1×(7-5) = 5.

Since all the measures are less than or equal to D (5), this sublist of 3 pixels satisfies the similarity requirement. However, this is not the largest sublist possible. The largest sublist that satisfies the similarity requirement consists of the second, third, fourth and fifth pixels. (Verify this yourself.) Since the size of this sublist is 4, and we cannot find a larger sublist that satisfies the similarity requirement, the answer is 4.

This problem is a challenger because an inefficient algorithm would take too long. To have an indication of how fast your algorithm runs, you should time your program. A fast algorithm should be able to produce the result within 1 second for the most complicated data set on which your program will be tested.

Four test data sets used by CourseMarker for this task are included here for your testing. Study these data sets and their answers so that you know exactly what the question demands. Your program will be tested with 10 data sets.

Input: set1 | set2 | set3 | set9
Output (with timing on sunfire): set1 | set2 | set3 | set9
Output (with timing on Pentium IV): set1 | set2 | set3 | set9

(The timing information in the output above is only for your information, so that you have an idea how fast your algorithm works. Your program should only generate one line of output, containing a single integer.)

If your are able to clear these three test data sets, you may send your program directly to me (tantc @ comp.nus.edu.sg) before 1 November, 2007, 1pm for further testing. I will test your program in sunfire on more test data sets only after 1 November (therefore, you should ensure that your program works, instead of asking me to check whether it works). The first 3 students (from any CS1101 lecture group) who are able to pass all my test data sets within the stipulated running time of 1 second will each get a very small prize as a token of encouragement. (Selection is based on the first 3 successful submissions, and not on the speed of the programs.)

Please note that if you search hard enough in the Internet, you might be able to find the solution. However, I trust that you would attempt this on your own without external help.

This problem can be solved using what you have learned in CS1101 so far. Programs that use advanced data structures (such as, but not limited to, trees, heaps and graphs) will not be accepted.

2.3 Sample run

Sample run (user's input shown in blue; output shown in bold purple):

```\$ javac Pixels.java
\$ java Pixels
5 1 1 1 5
3 7 5
4 8 6
5 9 7
4 9 7
8 9 6
4
```

2.4 Submission

Submit your program Pixels.java through CourseMarker. If it passes the four test data sets, you mail email the program to Aaron (tantc @ comp.nus.edu.sg).

2.5 Important notes

• An inefficient algorithm will probably not be able to compute the answer within the time limit.
• The 1-second time limit should be viewed as a guide. I am not going to distinguish a program that takes 0.9 second from another that takes 0.5 second. An efficient algorithm will solve all test data sets within 1 second on most machines, or maybe slightly longer on the slowest machine. On the other hand, an inefficient algorithm will take exceedingly long, probably many seconds or even minutes, to run.