NUS HomeIntro
Home | Up


Last updated on: 10 November 2008 07:11:01 PM

1. What is Algorithm?
2. 1st basic skill: Ability to quickly identify problem types

3. 2nd basic skill: Ability to analyze your algorithm
4. 3rd basic skill: Making the most of your programming language
5. 4th basic skill: The art of testing your code
6. Practice makes perfect
7. Summary: Producing Winning Solution


1. What is Algorithm?

Tell me what should you do if you are inside a room with one locked door and you want to get out from that room. The room's key is on top of your table.

I believe you will answer:
1. Get the key from the table,
2. Go to the locked door,
3. Open it, and
4. Get out...

And not this:
1. Go to the locked door,
2. Pray that it will magically open itself,
3. Go out if it happens.

That is algorithm, a step-by-step sequence of instructions to solve a problem. In the realm of Computer Science, we create algorithms for the computer to follow and eventually solve our problems.

Every good and correct algorithm must have these characteristics:
1. Must terminate [otherwise: Time Limit/Memory Limit/Output Limit Exceeded]
2. When it terminate, it must produce a correct output [otherwise: Wrong Answer]
3. It should be as efficient as possible [otherwise: Time Limit Exceeded]

There is a good analogy of algorithm in [CLR chapter 1.4]: "A good algorithm is like a sharp knife - it does exactly what it is supposed to do with a minimum amount of applied effort. Using the wrong algorithm to solve a problem is trying to cut a steak with a screwdriver: you may eventually get a digestible result, but you will expend considerable more effort than necessary, and the result is unlikely to be aesthetically pleasing."

In Computer Science, there are already a lot of well known algorithms available for us. Knowing them will make our life with Computers much easier rather than reinventing the wheel all over again.

Let's take a look at the following computational problem:

You are given a list of integers (array 'arr'):  10, 8, 7, 9, 12, 5, 7.
You want to sort 'arr' to get the sorted list:   5, 7, 7, 8, 9, 10, 12.

If you know sorting algorithms, you can just call:

sort(arr,arr+SIZE); // use C++ STL sort, where SIZE = 7 for this case

Rather than spending time to figure out how to do this.

References:
-. CLRS Chapter 1

2. What is Data Structure?

Taken from USACO Training Gateway section 1.2.2 with some modification, addition & simplification

In every algorithm, there is a need to store data. Ranging from storing a single value in a single variable, to more complex data structures. In programming contests, there are several aspect of data structures to consider when selecting the proper way to represent the data for a problem. This chapter will give you some guidelines and list some basic data structures to start with.

Will it work?

If the data structures won't work, it's not helpful at all. Ask yourself what questions the algorithm will need to be able to ask the data structure, and make sure the data structure can handle it. If not, then either more data must be added to the structure, or you need to find a different representation.

Can I code it?

If you don't know or can't remember how to code a given data structure, pick a different one. Make sure that you have a good idea how each of the operations will affect the structure of the data.

Another consideration here is memory. Will the data structure fit in the available memory? If not, compact it or pick a new one. Otherwise, it is already clear from the beginning that it won't work.

Can I code it in time? or has my programming language support it yet?

As this is a timed contest, you have three to five programs to write in, say, five hours. If it'll take you an hour and a half to code just the data structure for the first problem, then you're almost certainly looking at the wrong structure.

Another very very important consideration is, whether your programming language has actually provide you the required data structure. For C++ programmers, STL has a wide options of a very good built-in data structure, what you need to do is to master them, rather than trying to built your own data structure.

In contest time, this will be one of the winning strategy. Consider this scenario. All teams are given a set of problems. Some of them required the usage of 'stack'. The best programmer from team A directly implement the best known stack implementation, he need 10 minutes to do so. Surprisingly for them, other teams type in only two lines: "#include <stack>" and "std::stack<int> s;", can you see who have 10 minutes advantage now?

Can I debug it?

It is easy to forget this particular aspect of data structure selection. Remember that a program is useless unless it works. Don't forget that debugging time is a large portion of the contest time, so include its consideration in calculating coding time.

What makes a data structure easy to debug? That is basically determined by the following two properties.

1. State is easy to examine. The smaller, more compact the representation, in general, the easier it is to examine. Also, statically allocated arrays are much easier to examine than linked lists or even dynamically allocated arrays.

2. State can be displayed easily. For the more complex data structures, the easiest way to examine them is to write a small routine to output the data. Unfortunately, given time constraints, you'll probably want to limit yourself to text output. This means that structures like trees and graphs are going to be difficult to examine.

Is it fast?

The usage of more sophisticated data structure will reduce the amount of overall algorithm complexity. It will be better if you know some advanced data structures.

Conclusion

In general, remember the KISS principle: 'Keep It Simple, Stupid / Keep It Short & Simple.' Sometimes more complexity is very helpful, but make sure you're getting your money's worth. Remember that taking the time to make sure you've got the correct data structure at the start is a lot less expensive than having to replace a data structure later.

Things to Avoid (unless absolutely necessary): Dynamic Memory

In general, you should avoid dynamic memory in programming contests, because:

1. It is too easy to make mistakes using dynamic memory. Overwriting past allocated memory, not freeing memory, and not allocating memory are only some of the mistakes that are introduced when dynamic memory is used. In addition, the failure modes for these errors are such that it's hard to tell where the error occurred, as it's likely to be at a (potentially much later) memory operation.

2. It is too hard to examine the data structure's contents. The interactive development environments available don't handle dynamic memory well, especially for C. Consider parallel arrays as an alternative to dynamic memory. One way to do a linked list, where instead of keeping a next point, you keep a second array, which has the index of the next element. Sometimes you may have to dynamically allocate these, but as it should only be done once, it's much easier to get right than allocating and freeing the memory for each insert and delete.

All of this notwithstanding, sometimes dynamic memory is the way to go, especially for large data structures where the size of the structure is not known until you have the input.

Things to Avoid: Coolness Factor

Try not to fall into the 'coolness' trap. You may have just seen the neatest data structure, but remember:

1. Cool ideas that don't work aren't.
2.
Cool ideas that'll take forever to code aren't, either

It's much more important that your data structure and program work than how impressive your data structure is.

Basic Data Structures

There are five basic data structures: arrays, linked lists, stacks, queues, and deque (pronounced deck, a double ended queue). It will not be discussed in this section, go for their respective section.

More Advanced Data Structures

This include binary search tree, hash table, heaps, B-Trees, Binomial heaps, Fibonacci heaps, etc.

We will start examining basic data structures step by step. Advanced data structures will be explained later.

References:
-. USACO training gateway
-. Any algorithm books will include these standard data structures

3. Data Structure + Algorithm = Program

Say something.

2. 1st basic skill: Ability to quickly identify problem types

In all programming contests, there are only three types of problems:
1. I have not see this one before
2. I have seen this type before, but have not or cannot solve it
3. I have solve this type before

In programming contests, you will be dealing with a set of problems, not only one problem. The ability to quickly identify problems into the above mentioned contest-classifications (have not see, have seen, have solved) will be one of key factor to do well in programming contests.

There are a long list of problem types and problem solving techniques according to algorists. Here is the mixed list, beware that this list is not exhaustive...

  1. Math // General math problems
    Math (Prime Number)
    Math (Big Integer)
    Math (Base Number)
    Math (Permutation)
    Math (Number Theory)
    Math (Factorial)
    Math (Fibonacci)
    Math (Sequences)
    Math (Modulus)

  2. DP // General Dynamic Programming problems
    DP (LCS) // Longest Common Subsequence
    DP (LIS) // Longest Increasing Subsequence
    DP (Edit Distance)
    DP (0-1 Knapsack)
    DP (Coin Change)
    DP (MCM) // Matrix Chain Multiplication
    DP (Max Interval Sum)

  3. Greedy Algorithms

  4. Graph Traversal // standard graph problem that involves traversal (DFS/BFS)
    Graph (Flood Fill)
    Graph (Floyd Warshall) // All-pairs shortest path
    Graph (MST)
    Graph (Maximum Bipartite Matching)
    Graph (Network Flow)
    Graph (Articulation Point)
    Set (Union-Find)

  5. Sorting

  6. Searching: Backtracking, Complete Search, also well known as Brute Force

  7. Simulation Problems

  8. Divide & Conquer

  9. String Processing
    String Matching

  10. Computational Geometry

  11. Ad Hoc Problems

Sometimes, the algorithm may be 'nested' inside a loop of another algorithm. Such as binary search inside a DP algorithm, making the problem type identification not so trivial.

If you want to be able to compete well in programming contests, you must be able to know all that I listed above, with some precautions to ad-hoc problems.

'Ad Hoc' problems are those whose algorithms do not fall into standard categories with well-studied solutions. Each Ad Hoc problem is different... No specific or general techniques exist to solve them. This makes the problems the 'fun' ones (and sometimes frustrating), since each one presents a new challenge.

The solutions might require a novel data structure or an unusual set of loops or conditionals. Sometimes they require special combinations that are rare or at least rarely encountered. It usually require careful problem description reading and usually yield to an attack that revolves around carefully sequencing the instructions given in the problem. Ad Hoc problems can still require reasonable optimizations and at least a degree of analysis that enables one to avoid loops nested five deep, for example.

Please note that there are more Ad Hoc problems appear in this world than any other kind of problem. Always be ready for an Ad Hoc problem if you can not classify a problem as one of the other standard types. Let your problem solving creativity blossoms.

References:
-. USACO Training Gateway Section 1

3. 2nd basic skill: Ability to analyze your algorithm

People said that there are many ways to Rome...
Although not all ways will go to Rome
But among those which are going to rome, which one is the shortest?

You have identified your problem. You think you know how to solve it. The question that you must ask now is simple: Given the maximum input bound (usually given in problem description), can my algorithm, with the complexity that I can compute, pass the time limit given in the programming contest.

Usually, there are more than one way to solve a problem. However, some of them may be incorrect and some of them is not fast enough... However, the rule of thumb is: Brainstorm many possible algorithms - then pick the stupidest that works!

Lets have a gist of it.

For example, the maximum size of input n is 1.000.000, or 10^6, and your algorithm is of order O(n^2). Your common sense told you that 1.000.000^2 is an extremely big number... 10^12. So, you'll try to devise a better algorithm (but correct) to solve the problem, say of order O(n log n). Now 10^6 log2 (10^6) is just 2*10^7... Since computer nowadays are quite fast and can process up to order 10^6 in seconds, your common sense told you that this one likely able to pass the time limit.

Now how about this. You can only devise an algorithm of order O(n^4). Seems pretty bad right? But when you look at the maximum input, it's written n <= 10... then you are fortunate. Just directly implement your O(n^4) algorithm since 10^4 is just 10.000, and very likely solvable using small computation time.

So, by analyzing your proposed algorithm and doing some mathematics with the given input bound and stated time limit, you can do a better judging whether you should gamble and try coding your algorithm (which will take your time, especially in time-constrained programming contests), or attempt to improve your algorithm first or switch to other problems in the problem set.

I will not write the whole concept of algorithm analysis here, but please check other reference books and make sure you understand how to do:

1. Proof of algorithm correctness (especially for Greedy algorithms)
2. Time/Space complexity analysis for non recursive algorithms.
3. For recursive algorithms, the knowledge of computing recurrence relations and analyze them: iterative method, substitution method, recursion tree method and finally, Master Theorem
4. Analysis of randomized algorithm which involves probabilistic knowledge, e.g: Random variable, Expectation, etc. (CLRS chapter 5)
5. Amortized analysis (CLRS chapter 17)
6. Output-sensitive analysis, to analyze algorithm which depends on output size, example: O(n log k) LIS algorithm, which depends on k, which is output size not input size.

I just want to say that this skill is very important. Many novice programmers usually avoid this phase and always tempted to straightforwardly code the first algorithm (usually the naive version) that they can think of, after that they ended up realizing that their algorithm is not fast enough (or totally wrong). My advice is: only do the coding until you are sure your algorithm is both correct and fast.

The table below can also be found in many algorithms book. I put one here as a quick reference for you. The second version of the table will be more interesting for competitive programmers, as it roughly told you what is the maximum size of input solvable within 60 seconds (usual judge time limit, you may want to adjust this value accordingly).

Table 1: Time comparison of different order of growth
We assume our computer can compute 1000 elements in 1 seconds (1000 ms)

Order of Growth

n

Time (ms)

Comment

O(1)

1000

1

Excellent, almost impossible for most cases

O(log n)

1000

9.96

Very good, example: Binary Search

O(n)

1000

1000

Normal, Linear time algorithm

O(n log n)

1000

9960

Average, this is usually found in sorting algorithm such as Quick sort

O(n^2)

1000

1000000

Slow

O(n^3)

1000

10^9

Slow, btw, All Pairs Shortest Paths algorithm: Floyd Warshall, is O(N^3)

O(2^n)

1000

2^1000

Poor, exponential growth... try to avoid this. Use Dynamic Programming if you can.

O(n!)

1000

uncountable

Typical NP-Complete problems... unless the problem size is very small, there is no hope...

Table 2: Limit of maximum input size under 60 seconds time limit (with assumptions)
Another way to view this order of growth can be seen in the following table: You are given 1 minute (typical time limit for programming contest), what is the maximum size of n that your algorithm can solve using the same assumption as table above?

Order of Growth

Time (ms)

Max Possible n

Comment

O(1)

60.000 ms

Virtually infinite

... yeah...

O(log n)

60.000 ms

6^18.000

A very very large number

O(n)

60.000 ms

60.000

Still a very big number

O(n log n)

60.000 ms

~ 5.000

Moderate, average real life size

O(n^2)

60.000 ms

244

small

O(n^3)

60.000 ms

39

very small

O(2^n)

60.000 ms

16

avoid if you can

O(n!)

60.000 ms

8

extremely too small...

From this table, we have seen the importance of having good algorithm with lower order of growth. It may be useful to memorize the following ordering for quickly determine which algorithm perform better asymptotically: constant < log n < n < n log n < n^2 < n^3 < 2^n < n!

Some rules of thumb:

  1. 2^10 ~~ 10^3
    2^20 ~~ 10^6
    Max 32-bit integer: 2^31-1 ~~ 2*10^9 (roughly can handle up to 9 digits numbers)
    Biggest built in data structure "long long" is 2^63-1: 9*10^18 (up to 18 digits)

  2. If you have k nested loops running about n iterations each, the program has O(n*k) complexity

  3. If your program is recursive with b recursive calls per level and has l levels, the program O(b*l) complexity

  4. Bear in mind that there are n! permutations and 2^n subsets or combinations of n elements when dealing with those kinds of algorithms

  5. The best times for sorting n elements are W(n log n)

  6. DP algorithms which involves filling in a matrix usually in O(n^3)

  7. In contest, most of the time O(n log n) algorithms will be sufficient :)

References:
-. CLRS Chapter 1-5, basic algorithm analysis
-. CLRS Chapter 17, amortized analysis
-. Any books related to probabilistic theory to analyze algorithm which involves probability.

4. 3rd basic skill: Making the most of your programming language

Wise saying: A man, before going to a war, will sharpen his sword.

Usually, there are many programming languages allowed in the programming contests. Usually C/C++, Java and Pascal are among the list. The question is, do you know your preferred programming language in and out?

Brief programming language comparison table:

C/C++

Java

Pascal

Assignment

a = 7;

a = 7;

a := 7;

Comparison

if (a == b)

if (a == b)

if a=b then

For loop

for (int i=0; i<=7; i++) {
  // very flexible
}

for (int i=0; i<=7; i++) {
  // very flexible
}

for i=0 to 7 do begin
  { not flexible }
end;

While loop

while (i < 7) {
  // do something
}

while (i < 7) {
  // do something
}

while (ct1=7) do begin
  // do something
end;

Block
statement

{
    statements;
}

{
    statements;
}

begin
statements;
end;

Program Speed

Fastest

Slowest

Medium

EXE or
class size

Big

Medium

Smallest

Popularity

Still popular

Increasing popularity

Decreasing popularity

Shortcuts

Many, a++,b--;a+=b; etc

Many, a++,b--;a+=b; etc

None, only Inc(a); Dec(b);

Boolean Algebra

a && b, a || b, !a

a && b, a || b, !a

a and b, a or b, not a

Library

#include <stdio.h>

import java.io.*;

uses crt;

OOP

No, only in C++

Yes

No

Input/Output

scanf,printf,cin,cout

System.out.println

readln,writeln

Array

int arr[7];

int[] arr = new int[7];

var arr:array[1..7] of integer;

My preference

The best language for programming contest in my opinion. Have built in STL, which I found very useful.

Not good for programming contest because it is 'slow'...

No longer used, too old... don't have many usable libraries...

I won't write much here. The key idea is for you to do something in programming contest, coding using your preferred language, should not be your bottleneck at all, that is, once you figure out the algorithm, you are supposed to be able to translate it into code and you can do it fast. :)

References:
-. Your programming language manual. C, C++, Java, Pascal, Visual Basic, C#, you name it...

5. 4th basic skill: The art of testing your code

WA, TLE, MLE, RTE -_-', we all have these...

You've done it. Identifying the problem, designing the best possible algorithm, you have calculate using time/space complexity, that it will be within time and memory limit given., and you have code it so well. But, is it 100% correct?

Depends on the programming contest's type, you may or may not get credit by solving the problem partially. In ACM ICPC, you will only get credit if your team's code solve all the test cases, that's it, you'll get either Accepted or Not Accepted (Wrong Answer, Time Limit Exceeded, etc). In IOI, there exist partial credit system, in which you will get score: number of correct/total number of test cases for each code that you submit.

In either case, you will need to be able to design a good, educated, tricky test cases. Sample input-output given in problem description is by default too trivial and therefore not a good way to measure your code's correctness for all input instances.

Rather than wasting your submission (and gaining time or points penalty) by getting wrong answer, you may want to design some tricky test cases first, test it in your own machine, and ensure your code is able to solve it correctly (otherwise, there is no point submitting your solution right?).

Some team coaches sometime ask their students to compete with each other by designing test cases. If student A's test cases can break other student's code, then A will get bonus point. You may want to try this in your school team training too :).

Here is some guidelines in designing good test cases:
1. Must include sample input, the most trivial one, you even have the answer given...
2. Must include boundary cases, what is the maximum n,x,y, or other input variables, try varying their values to test for out of bound errors...
3. For multiple input test case, try using two identical test case consecutively. Both must output the same result. This is to check whether you forgot to initialize some variables, which will be easily identified if the first instance produce correct output but the second one doesn't.
4. Increase the size of input. Sometimes your program works for small input size, but behave wrongly when input size increases. Check for overflow, out of bounds, etc.
5. Tricky test cases, analyze the problem description and identify parts that are tricky, test them to your code.
6. Don't assume input will always nicely formatted if the problem description didn't say so. Try inserting white spaces (space, tabs) in your input, check whether your code is able to read in the values correctly
7. Finally, do random test cases, try random input and check your code's correctness

References:
-. Software engineering books, chapters regarding rigorous software testing...

6. Practice makes perfect

I encountered interesting fact, that usually lecturers or researchers can't really code well (this sentence is not meant to offence anyone). They know a lot of algorithms, they know a lot of programming languages and all its methodologies. But when it comes to practice, the competitive programmers, say the school's programming team, usually outperform them. Reason? Because competitive programmers train themselves often.

Here is some list of websites that can help you improve your problem solving skill.

Spanish University of Valladolid Online Judge

University of Valladolid (from Spain) Online Judge contains past years ACM contest problems (either regional or finals) plus problems from another source, including their own contest problems. You can solve these problems and submit your solution to this Online Judge. The correctness of your program will be reported as soon as possible. Try solving these problems and see your name on the top-500 authors rank list someday :-)

USACO Training Gateway

USA Computing Olympiad has a very useful website for you to learn about programming contest. Go straight to their website, register your account, and train yourself.

References:
-. UVa Online Judge URL: http://acm.uva.es/p
-. USACO Training Gateway website: http://ace.delos.com/usacogate
-. Other online judges around the world, the number of online judges increases over time...

7. Summary: Producing Winning Solution

A recipe for winning is to prepare well :)

A good way to get a competitive edge is to write down a game plan for what you're going to do in a contest round. This will help you script out your actions, in terms of what to do both when things go right and when things go wrong. This way you can spend your thinking time in the round figuring out programming problems and not trying to figure out what the heck you should do next... it's sort of like Pre-Computing your reactions to most situations.

Game plan for a contest:

  1. Read through all the problems first, don't directly attempt one problem since you may missed easier problem.

  2. Order the problems: shortest job first, in terms of your effort (shortest to longest: done it before, easy, unfamiliar, hard)

  3. Sketch the algorithms, complexity, the numbers, data structures, tricky details...

  4. Brainstorm other possible algorithms (if any) - then pick the stupidest that works!

  5. Do the Math! (space & time complexity & plug-in actual expected & worst case numbers)

  6. Code it of course, as fast as possible, and it must be correct

  7. Try to break the algorithm - use special (degenerate?) test cases

Coding a problem:

  1. Only coding after you finalize your algorithm

  2. Create test data for tricky cases

  3. Code the input routine and test it (write extra output routines to show data)

  4. Code the output routine and test it

  5. Write data structures needed

  6. Stepwise refinement: write comments outlining the program logic

  7. Fill in code and debug one section at a time

  8. Get it working & verify correctness (use trivial test cases)

  9. Try to break the code - use special cases for code correctness

  10. Optimize progressively - only as much as needed, and keep all versions (use hard test cases to figure out actual runtime)

Time management strategy and "damage control" scenarios

Have a plan for what to do when various (foreseeable!) things go wrong; imagine problems you might have and figure out how you want to react. The central question is: "When do you spend more time debugging a program, and when do you cut your losses and move on?". Consider these issues:

  1. How long have you spent debugging it already?

  2. What type of bug do you seem to have?

  3. Is your algorithm wrong?

  4. Do you data structures need to be changed?

  5. Do you have any clue about what's going wrong?

  6. A short amount (20 mins) of debugging is better than switching to anything else; but you might be able to solve another from scratch in 45 mins.

  7. When do you go back to a problem you've abandoned previously?

  8. When do you spend more time optimizing a program, and when do you switch?

  9. Consider from here out - forget prior effort, focus on the future: how can you get the most points in the next hour with what you have?

Checklist before you submit your solution:

  1. Turn all assertions off

  2. Turn off all debugging output

  3. Turn on all optimizations

  4. Make sure input and output are to correct filenames (if use file I/O)

  5. Make sure the input and output formats are correct (check for white spaces)

  6. Recompile and test once more

  7. Copy files to correct locations with correct names (or do online submission)

Tips & tricks for contests:

  1. Brute force when you can, Brute force algorithm tends to be the easiest to implement

  2. KISS: Simple is smart! (Keep It Simple, Stupid !!! / Keep It Short & Simple)

  3. Hint: focus on limits (specified in problem statement)

  4. Waste memory when it makes your life easier (trade memory space for speed)

  5. Don't delete your extra debugging output, comment it out

  6. Optimize progressively, and only as much as needed

  7. Keep all working versions!

  8. Code to debug:

    1. white space is good,

    2. use meaningful variable names,

    3. don't reuse variables, (we are not doing software engineering here)

    4. stepwise refinement,

    5. Comment before code.

  9. Avoid pointers if you can.

  10. Avoid dynamic memory like the plague: statically allocate everything. (yeah yeah)

  11. Try not to use floating point; if you have to, put tolerances in everywhere (never test equality)

  12. Comments on comments:

    1. Not long prose, just brief notes

    2. Explain high-level functionality: ++i; /* increase the value of i by */ is worse than useless

    3. Explain code trickery

    4. Delimit & document functional sections

    5. As if to someone intelligent who knows the problem, but not the code

    6. Anything you had to think about

    7. Anything you looked at even once saying, "now what does that do again?"

    8. Always comment order of array indices

  13. Keep a log of your performance in each contest: successes, mistakes, and what you could have done better; use this to rewrite and improve your game plan!

References:
-. The above section is taken from USACO Training Gateway section 1. (Taken without permission actually, I hope this doesn't matter. Their suggestion on producing winning solution is very nice that it will be a good closing summary for this chapter.)

What's next?

In the next chapters, we will be looking step by step of what are written here.
Stay tuned :D


This document, 1_introduction.html, has been accessed 12859 times since 26-Jul-04 18:17:09 SGT. This is the 4th time it has been accessed today.

A total of 7718 different hosts have accessed this document in the last 1594 days; your host, 38.103.63.56, has accessed it 2 times.

If you're interested, complete statistics for this document are also available, including breakdowns by top-level domain, host name, and date.