Some dictionaries have compiled a list of the top 100 most frequently occurring words in the English language. Being the curious sort, you would like to see how often these 100 words appear in a normal document.
You are given a text file called common.txt, which has been placed in your directory by the pesetup program. It contains a list of the top 100 most frequently occurring words, sorted in alphabetical order, with each word on a separate line. The first ten words in the text file are:
a
about
all
an
and
are
as
at
be
been
... and so on.
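Reading the word list could be sketched as follows. This is only a minimal sketch under stated assumptions, not a required design: the names load_words, MAX_WORDS and MAX_WORD_LEN are illustrative and do not come from the problem statement, and the file is assumed to hold one word per line with no blank entries.

```c
#include <stdio.h>

#define MAX_WORDS 100    /* common.txt lists 100 words */
#define MAX_WORD_LEN 30  /* no word exceeds 30 characters */

/* Hypothetical helper: reads up to MAX_WORDS whitespace-separated words
   from the file at path into words[].
   Returns the number of words read, or -1 if the file cannot be opened. */
int load_words(const char *path, char words[][MAX_WORD_LEN + 1])
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int n = 0;
    /* "%30s" skips leading whitespace (including newlines) and stores at
       most 30 characters plus the terminating '\0'; the width must be
       kept in step with MAX_WORD_LEN. */
    while (n < MAX_WORDS && fscanf(fp, "%30s", words[n]) == 1)
        n++;

    fclose(fp);
    return n;
}
```

A caller would declare char words[MAX_WORDS][MAX_WORD_LEN + 1] and pass "common.txt" as the path; the return value tells it how many entries were filled.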
Your program must read a document text file called document.txt and record the frequency of each of the 100 words found in common.txt. You can assume that each line in document.txt will never exceed 100 characters, and that none of the words in common.txt and document.txt will exceed 30 characters. Assume both common.txt and document.txt always exist. The file document.txt may contain multiple lines. Suppose document.txt contains:
Duplicate bridge is a game for people who are seeking intelligent diversions in a social setting. It is both fast-paced and mentally challenging. While each hand takes only 5 or 10 minutes to play, each hand presents a new mystery to be solved. If you have never played bridge before, we can help you get started.
Then the frequencies of the first ten words in common.txt are:
a 3
about 0
all 0
an 0
and 1
are 1
as 0
at 0
be 1
been 0
This is because the word a occurs three times, the word and occurs one time, the word are occurs one time, and the word be occurs one time, while the rest of the words (about, all, an, as, at, been) do not occur in document.txt.
Your program must read in common.txt, then read in document.txt (ignoring the case of the words), count the frequency of each of the 100 words, and write the output to the text file freqcount.txt. Assume that the list of delimiters is given by ",.?! :;()-" (do not forget the space). Overwrite the output file if it already exists. The first ten lines of the output file freqcount.txt (based on the example document.txt given) are as follows:
a 3
about 0
all 0
an 0
and 1
are 1
as 0
at 0
be 1
been 0
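The counting step described above might be sketched like this. It is one possible approach rather than the expected solution: count_line, find_word and the constants are illustrative names, the words in common.txt are assumed to be stored in lowercase, and the newline characters are added to the delimiter list on the assumption that each line is read with fgets, which keeps the trailing newline.

```c
#include <ctype.h>
#include <string.h>

#define MAX_WORD_LEN 30  /* no word exceeds 30 characters */

/* Delimiters from the problem statement, plus "\r\n" (an assumption,
   needed only if lines still carry their line ending). */
static const char DELIMS[] = ",.?! :;()-\r\n";

/* Hypothetical helper: returns the index of word in words[0..n-1],
   or -1 if it is not in the list. A linear scan is enough for 100 words. */
int find_word(char words[][MAX_WORD_LEN + 1], int n, const char *word)
{
    for (int i = 0; i < n; i++)
        if (strcmp(words[i], word) == 0)
            return i;
    return -1;
}

/* Hypothetical helper: lowercases line in place, splits it on DELIMS,
   and increments counts[i] for every token equal to words[i].
   Note that strtok modifies the buffer it is given. */
void count_line(char *line, char words[][MAX_WORD_LEN + 1], int n, int counts[])
{
    for (char *p = line; *p != '\0'; p++)
        *p = (char)tolower((unsigned char)*p);

    for (char *tok = strtok(line, DELIMS); tok != NULL; tok = strtok(NULL, DELIMS)) {
        int i = find_word(words, n, tok);
        if (i >= 0)
            counts[i]++;
    }
}
```

Calling count_line once per line of document.txt, with counts initialised to zero, accumulates the totals; each word and its count can then be written to freqcount.txt with fprintf after opening the file in "w" mode, which overwrites any existing file. Because common.txt is sorted alphabetically, a binary search could replace the linear scan, though the gain is negligible for 100 words.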
Sample common.txt and document.txt files have been placed in your directory by the pesetup program.
We will test your program with other common.txt files and other document.txt files.
All the best!
Some useful UNIX commands (in case you forgot what you did in Lab 0):