CS1101C Practical Exam

Session 4 (1000 - 1145 hours)

Anagram Dictionary

Your C program file must be named adict.c; files with any other name will not be marked.

The deadline for this lab is Wednesday 18 April 2007, 11:45:59 hours. Strictly no submissions will be accepted after the deadline.

Background

A proper anagram is the rearrangement of the letters in a word in increasing alphabetical order.

Here are some words and their corresponding proper anagrams:

  1. CANOE = ACENO.
  2. OCEAN = ACENO.
  3. FUNNY = FNNUY.
  4. ALMOST = ALMOST.

Note that “CANOE” and “OCEAN” have the same proper anagram, “ACENO”.
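As an illustration of the definition above, here is a minimal sketch that rearranges the letters of a word in place using insertion sort. The function name proper_anagram is hypothetical, and the sketch assumes the word is already in a single case, since upper-case letters sort before lower-case ones in ASCII.

```c
#include <string.h>

/* Hypothetical helper: rearranges the letters of word in place into
   increasing alphabetical order, using insertion sort on the characters.
   Assumes all letters are in the same case. */
void proper_anagram(char word[])
{
    int n = (int)strlen(word);
    int i, j;
    char key;

    for (i = 1; i < n; i++) {
        key = word[i];
        j = i - 1;
        /* Shift larger letters one place to the right. */
        while (j >= 0 && word[j] > key) {
            word[j + 1] = word[j];
            j--;
        }
        word[j + 1] = key;
    }
}
```

Calling proper_anagram on an array holding "CANOE" leaves "ACENO" in that array. Any other sorting algorithm would serve equally well here.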

Anagram Dictionary

An anagram dictionary is an alphabetical listing of proper anagrams. In addition, we would like to keep track of the number of times each proper anagram occurs in the text file.
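One possible way to hold such a dictionary without structures is a pair of parallel arrays: one of strings and one of counts. The sketch below is an assumption, not a required design. The names dict_add, MAX_ANAGRAMS and MAX_LETTERS are hypothetical, and the sizes are taken from the limits stated in the Text File section of this handout.

```c
#include <string.h>

#define MAX_ANAGRAMS 1000  /* assumed limit on distinct proper anagrams */
#define MAX_LETTERS 30     /* assumed limit on letters per word */

/* Hypothetical helper: records one occurrence of anagram.
   dict[i] holds the i-th proper anagram and freq[i] its count.
   Returns the number of entries after the call; the entry is silently
   dropped if the table is already full.
   Assumes anagram has at most MAX_LETTERS letters. */
int dict_add(char dict[][MAX_LETTERS + 1], int freq[], int count,
             const char anagram[])
{
    int i;

    /* Linear search: if the anagram is already present, bump its count. */
    for (i = 0; i < count; i++) {
        if (strcmp(dict[i], anagram) == 0) {
            freq[i]++;
            return count;
        }
    }

    /* Otherwise append it as a new entry with a count of 1. */
    if (count < MAX_ANAGRAMS) {
        strcpy(dict[count], anagram);
        freq[count] = 1;
        count++;
    }
    return count;
}
```

The caller starts with a count of 0 and passes the returned value back in on the next call. This version leaves the entries in order of first appearance, so they still need to be sorted before printing.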

Text File

A text file called “doc1.txt” has been copied into your directory by the pesetup command. This text file contains an unknown number of lines and an unknown number of words on each line. The sample text file contains the following lines:

===== Begin Sample Text File =====
tips oCeAn SPIT tips CANoe

Gastronomy from the Alps to the Mediterranean In France you will find
a wide variety of excellent regional food
===== End Sample Text File =====

The words consist of a mix of upper-case and lower-case letters. There are no punctuation characters or digits in the file. The file consists of only letters, spaces, and newlines.

You may assume that each word has at most 30 letters, and that there are at most 1000 distinct proper anagrams.
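Because the file contains only letters, spaces, and newlines, words can be read one at a time with fscanf, which skips any amount of white space (including blank lines) between words. The following is a sketch under that assumption. The name read_word is hypothetical, and the width 30 in the format string mirrors the stated limit of 30 letters per word.

```c
#include <stdio.h>
#include <ctype.h>

#define MAX_LETTERS 30  /* assumed limit on letters per word */

/* Hypothetical helper: reads the next word from fp into word and
   converts it to lower case.
   Returns 1 if a word was read and 0 at end of file. */
int read_word(FILE *fp, char word[MAX_LETTERS + 1])
{
    int i;

    /* %30s reads at most 30 non-space characters, then adds '\0'. */
    if (fscanf(fp, "%30s", word) != 1)
        return 0;

    for (i = 0; word[i] != '\0'; i++)
        word[i] = (char)tolower((unsigned char)word[i]);
    return 1;
}
```

A typical caller opens the file with fopen("doc1.txt", "r"), checks that the result is not NULL, and then calls read_word in a loop until it returns 0.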

Task

Your task is listed below:
  1. Read the words from the text file and convert all letters to lower case.
  2. Convert each word to its proper anagram (you may use any sorting algorithm), and insert it into the anagram dictionary.
  3. Finally, display the list of proper anagrams (and their corresponding frequencies) in alphabetical order. You may use any sorting algorithm.
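For step 3, one thing to watch is that each frequency must stay with its anagram while the list is being sorted. The sketch below uses selection sort on two parallel arrays. The names dict_sort, dict, freq and MAX_LETTERS are hypothetical, and the parallel-array layout is an assumption rather than a requirement.

```c
#include <string.h>

#define MAX_LETTERS 30  /* assumed limit on letters per word */

/* Hypothetical helper: sorts the first count entries of dict into
   alphabetical order with selection sort, swapping freq[] in step so
   that each frequency stays beside its anagram. */
void dict_sort(char dict[][MAX_LETTERS + 1], int freq[], int count)
{
    int i, j, min, tmp_freq;
    char tmp[MAX_LETTERS + 1];

    for (i = 0; i < count - 1; i++) {
        /* Find the alphabetically smallest entry in dict[i..count-1]. */
        min = i;
        for (j = i + 1; j < count; j++) {
            if (strcmp(dict[j], dict[min]) < 0)
                min = j;
        }
        /* Swap both the string and its frequency into position i. */
        if (min != i) {
            strcpy(tmp, dict[i]);
            strcpy(dict[i], dict[min]);
            strcpy(dict[min], tmp);
            tmp_freq = freq[i];
            freq[i] = freq[min];
            freq[min] = tmp_freq;
        }
    }
}
```

Since all letters have been converted to lower case in step 1, strcmp gives plain alphabetical order, and a shorter string sorts before a longer one that starts with the same letters.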

Sample Output

The following is the sample output using the above sample text file. Remember that we will test your program with other input files. Do not print the lines “Begin Sample Output” and “End Sample Output”.
===== Begin Sample Output =====
Anagram Dictionary:

Word #1 and frequency: a, 1.
Word #2 and frequency: aadeeeimnnrrt, 1.
Word #3 and frequency: acefnr, 1.
Word #4 and frequency: aceno, 2.
Word #5 and frequency: aegilnor, 1.
Word #6 and frequency: aeirtvy, 1.
Word #7 and frequency: agmnoorsty, 1.
Word #8 and frequency: alps, 1.
Word #9 and frequency: ceeellntx, 1.
Word #10 and frequency: deiw, 1.
Word #11 and frequency: dfin, 1.
Word #12 and frequency: dfoo, 1.
Word #13 and frequency: eht, 2.
Word #14 and frequency: fmor, 1.
Word #15 and frequency: fo, 1.
Word #16 and frequency: illw, 1.
Word #17 and frequency: in, 1.
Word #18 and frequency: ipst, 3.
Word #19 and frequency: ot, 1.
Word #20 and frequency: ouy, 1.
===== End Sample Output =====

Note that the words “tips” (which appears twice in the sample text file) and “SPIT” have the same proper anagram, “ipst”. Thus, the proper anagram “ipst” occurs three times.

Print the word number as shown in the sample output.

You are reminded to follow the sample output exactly; otherwise marks will be deducted.
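To make the expected layout of one output line concrete, here is a small sketch that builds a line in a buffer with snprintf. The name format_entry and its parameters are hypothetical. The format string is copied from the sample output, with word numbers starting at 1.

```c
#include <stdio.h>

/* Hypothetical helper: writes one dictionary line into buf, without
   the trailing newline, truncating if the line does not fit in size
   bytes. number is the 1-based position of the entry in the list. */
void format_entry(char buf[], size_t size, int number,
                  const char anagram[], int freq)
{
    snprintf(buf, size, "Word #%d and frequency: %s, %d.",
             number, anagram, freq);
}
```

A buffer of 80 characters is enough for a 30-letter anagram, and the line can then be printed with puts(buf). Calling printf directly with the same format string and a trailing "\n" works just as well.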

We will test your programs with other (more complicated) text input files.

Hints

An array of strings is simply a two-dimensional array of characters. If there are at most 100 strings, each having at most 20 characters, the two-dimensional array of characters should be declared as:

char x[100][21];

So there are 100 rows with each row containing a null-terminated string, and there are 21 columns; the one extra column is needed to store the null-terminator for any string with twenty non-null characters.

To access the first string in the array, use x[0]. To access the second string, use x[1], and so on.

The first character of the first string is at x[0][0], the second character of the first string is at x[0][1], and so on.
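The sketch below shows one way such an array might be passed to a function and filled in, row by row. The names store_string, ROWS and COLS are hypothetical. Truncating to 20 characters is a defensive choice made for this example, not something the hint requires.

```c
#include <string.h>

#define ROWS 100
#define COLS 21   /* 20 characters plus the null terminator */

/* Hypothetical helper: copies at most 20 characters of s into row r
   of x and always null-terminates the row. */
void store_string(char x[ROWS][COLS], int r, const char s[])
{
    strncpy(x[r], s, COLS - 1);
    x[r][COLS - 1] = '\0';  /* strncpy does not add this if s is long */
}
```

After store_string(x, 0, "canoe"), x[0] is the string "canoe", x[0][0] is 'c', and x[0][1] is 'a'.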

Notes

Do not use any structures or any form of dynamic memory allocation (such as malloc or calloc) in your program; otherwise no credit will be given.

Remember to submit your program frequently using the submit adict.c command, and check your submission using the check command.

All the best!

UNIX commands

Some useful UNIX commands (in case you forgot what you did in Lab 0):

  1. “dir”: lists all the files in the directory.
  2. “cp a.txt b.txt”: copies a.txt to b.txt.
  3. “mv a.txt b.txt”: moves / renames a.txt to b.txt.
  4. “cat a.txt”: shows the contents of a.txt.