CSC 143 - Homework #6

20 points

(taken from the UW)

Warm-up (5 points): building a helper class LetterInventory

In this part, you will practice using arrays and classes.  You are to implement a class called LetterInventory that can be used to keep track of an inventory of letters of the alphabet.  The constructor for the class takes a String and computes how many of each letter are in the String.  This is the information the object keeps track of (how many a’s, how many b’s, etc).  It ignores the case of the letters and ignores anything that is not an alphabetic character (e.g., it ignores punctuation characters, digits and anything else that is not a letter).

 

Your class should have the following public methods.

 

Method

Description

LetterInventory(String data)

Construct an inventory (a count) of the alphabetic letters in the given string, ignoring the case of letters and ignoring any non-alphabetic characters.

int get(char letter)

Return a count of how many of this letter are in the inventory.  Letter might be lowercase or uppercase (your method shouldn’t care).  If a nonalphabetic character is passed, your method should throw an IllegalArgumentException.

void set(char letter, int value)

Set the count for the given letter to the given value.  Letter might be lowercase or uppercase.  If a nonalphabetic character is passed or if value is negative, your method should throw an IllegalArgumentException

int size()

Return the sum of all of the counts in this inventory.  This operation should be “fast” in that it should store the size rather than having to compute it each time this method is called.

boolean isEmpty()

Return true if this inventory is empty (all counts are 0).  This operation should be fast in that it should not need to examine each of the 26 counts when it is called.

String toString()

Return a String representation of the inventory with the letters all in lowercase and in sorted order and surrounded by square brackets.  The number of occurrences of each letter should match its count in the inventory.  For example, an inventory of 4 a’s, 1 b, 1 l and 1 m would be represented as “[aaaablm]”.

LetterInventory add(LetterInventory other)

Construct and returns a new LetterInventory object that represents the sum of this letter inventory and the other given LetterInventory.  The counts for each letter should be added together.  The two LetterInventory objects being added together (this and other) should not be changed by this method

LetterInventory subtract(LetterInventory other)

Construct and returns a new LetterInventory object that represents the result of subtracting the other inventory from this inventory (i.e., subtracting the counts in the other inventory from this object’s counts).  If any resulting count would be negative, your method should return null.  The two LetterInventory objects being subtracted (this and other) should not be changed by this method

 

You should implement this class with an array of 26 counters (one for each letter) along with any other data fields you find that you need.

 

You will need to know certain things about the properties of letters and type char.  You might want to look at the Character class for useful methods (e.g., there is a toLowerCase method, but be sure to read the documentation carefully to be sure any methods you use do exactly what you need).  You can compare different values of type char using less-than and greater-than tests.  All of the lowercase letters appear grouped together in type char (‘a’ is followed by ‘b’ followed by ‘c’, and so on) and all of the uppercase letters appear grouped together in type char (‘A’ followed by ‘B’ followed by ‘C’ and so on).  Because of this, you can compute a letter’s displacement (or distance) from the letter “a” with an expression like the following (this expression assumes the letter is a lowercase letter):

letter - 'a'

 

Going in the other direction, if you know a character’s integer equivalent, you can cast the result to char to get the character.  For example, suppose that you want to get the letter that is 8 away from ‘a’.  You could ask for:

(char) ('a' + 8)

 

This expression evaluates to the letter ‘i’.

 

Use the following unit test file (turn it in with your homework): LetterInventoryTest.java

 

Main part: (15 points)

This assignment will give you practice with recursive backtracking.  You are to create a class called AnagramSolver that uses a dictionary to find all combinations of words that have the same letters as a given phrase.  You might want to first look at the sample log of execution at the end of this handout.

Your class must include the following public methods.

Method

Description

AnagramSolver(List<String> list)

Construct an anagram solver that will use the given word list as its dictionary.  The list must not be modified.

void print(String s, int max)

Print to System.out all combinations of words from its dictionary that are anagrams of the String s and that include at most max words (or unlimited number of words if max is 0).  Throw an IllegalArgumentException if max is less than 0.

 

Your print method must produce the anagrams in the same format as in the sample log.  The easiest way to do this is to build up your answer a word at a time in an ArrayList or LinkedList.  Then you can simply “println” the list and it will have the appropriate format.

You are required to solve this problem by using recursive backtracking.  In particular, you must write a recursive method that builds up an answer one word at a time.  On each recursive call, you must search the dictionary from beginning to end and to explore each word that is a match for the current set of letters.  The possible solutions must be explored in dictionary order.  For example, in deciding what word might come first, you must examine the words in the same order in which they appear in the dictionary.

The solution to the 8 queens problem provides a good example to follow for your own code.  The primary difference in the recursion is that in 8 queens, we stopped as soon as we found an answer, whereas in the anagrams program you should to produce all possible answers.

An important aspect of the 8 queens solution was the separation of the recursive code (Queens.java) from the supporting code that kept track of low-level details of the problem (Board.java).  You must follow a similar strategy here.  The low-level details for the anagram problem involve keeping track of various letters and figuring out when one group of letters can be formed from another group of letters.  It turns out that the LetterInventory class provides exactly the low-level support we need.

For any given word or phrase, what matters to us is how many of each letter there are.  This is exactly what the LetterInventory keeps track of.  In addition, the “subtract” method of the LetterInventory is the key to solving this problem.  For example, if you have a LetterInventory for the phrase “george bush” and ask whether or not you can subtract the LetterInventory for “bee”, the answer is yes.  Every letter in the “bee” inventory is also in the “george bush” inventory.  That means that you need to explore this possibility.  Of course, the word “bee” alone is not enough to account for all of the letters of “george bush”, which is why you’d want to work with the new inventory formed by subtracting the letters from “bee” as you continue the exploration.

Part of your grade will be based on the efficiency of your solution.  Recursive backtracking is, in general, highly inefficient because it is a brute force technique that checks every possibility, but there are still things you can do to make sure that your solution is as efficient as it can be.  Be careful not to compute something twice if you don’t need to.  And don’t continue to explore branches that you know will never be printed.  And you are required to implement the following two optimizations:

In terms of correctness, your class must provide all of the functionality described above.  In terms of style, we will be grading on your use of comments, good variable names, consistent indentation and good coding style to implement these operations.  Remember that you will lose points if you declare variables as data fields that can instead be declared as local variables.  You should also be making correct use of generics and you should declare fields and variables using interfaces when possible (e.g., using List<String> for your variable even if the object is of type ArrayList<String>).  You should also avoid extraneous cases (e.g., don’t make something into a special case if it doesn’t have to be) and should be careful to make your code as efficient as possible.

The constructor for your class is passed a reference to a dictionary stored as a List of String objects.  You can use this dictionary for your own object as long as you don’t change it (that is expressly forbidden in the specification).  In other words, you don’t need to make your own independent copy of the dictionary as long as you don’t modify the one that is passed to you in the constructor.

Some people try to use their Map in place of their dictionary.  There is some space efficiency to this in that the Map will store all of the dictionary words as keys of the Map.  But this isn’t really an appropriate use of the Map.  The dictionary might have a different order than what the Map uses and you are required to explore the possibilities in dictionary order.

Don’t make this problem harder than it needs to be.  You are doing a fairly exhaustive search of the possibilities.  You have to avoid dead ends and you have to implement the optimizations listed above, but otherwise you are exploring every possibility.  For example, in the sample log you will see that one solution for “Barbara Bush” is [abash, bar, rub].  Because this is found as a solution, you know that every other permutation of these words will also be included ([abash, rub, bar], [bar, abash, rub], [bar, rub, abash], and so on).  But you don’t have to write any special code to make that work.  That is a natural result of the exhaustive nature of the search.  It will locate each of these possibilities and print them out when they are found.  Similarly, you don’t need any special cases for words that have already been used.  If someone asks you for the anagrams of “bar bar bar”, you should include [bar, bar, bar].

You should name your file AnagramSolver.java. To run your program use this class: AnagramMain.java (turn it in with your other files). Four dictionary files are also provided. They are called dict1.txt (short dictionary of 56 words that has lots of matches for the Bush family—appropriate for testing), dict2.txt (medium sized dictionary of approximately 4 thousand words), dict3.txt (large dictionary of approximately 20 thousand words) and dict4.txt (large dictionary in a different order). 

You will see that AnagramMain has two public fields a PrintStream out and a Scanner console. They are meant for testing the program with the JUnit test file (see below). In your file AnagramSolver, use AnagramMain.out.print(ln) in place of any System.out.print(ln) statement. To test your program use the file AnagramTest.java (turn it in with your other files), and the text files testWithdict1.txt, testWithdict2.txt, testWithdict3.txt, and testWithdict4.txt.

If you are interested in working with even larger dictionaries, check out the files available for download from the following site:

http://www.puzzlers.org/dokuwiki/doku.php?id=solving:wordlists:start

Your program must produce exactly the same output in exactly the same order as in the sample logs of execution.

Code Style

Use iterators and/or Java 5’s for each statement (for (variable : collection) …) where appropriate instead of indexed for-loops and the get(i)  method.  Recall that, as a matter of style, we want to use more general interface types like List instead of specific implementations like ArrayList for variables and parameters unless we have some reason to restrict ourselves to a particular implementation.  For a general collection type like List, iterators, which are also used by the for each statement, are guaranteed to give efficient O(1) access to successive elements, even for implementations like linked lists that do not implement direct, indexed access to elements in constant time.

Although it turns out that the word lists in the main program for this assignment and, probably, the ones in your code will be implemented by an ArrayList instead of a LinkedList or other collection, it is better practice to use iterators and/or for each loops, particularly when a parameter or variable is an interface type like List that might, or might not, support efficient indexed access. 

Sample execution (short file)

Welcome to the csc143 anagram solver.

 

What is the name of the dictionary file? dict1.txt

 

phrase to scramble (return to quit)? Barbara Bush

Max words to include (0 for no max)? 0

[abash, bar, rub]

[abash, rub, bar]

[bar, abash, rub]

[bar, rub, abash]

[rub, abash, bar]

[rub, bar, abash]

 

phrase to scramble (return to quit)? George BUSH

Max words to include (0 for no max)? 0

[bee, go, shrug]

[bee, shrug, go]

[bog, he, surge]

[bog, surge, he]

[bogus, erg, he]

[bogus, he, erg]

[bug, erg, hose]

[bug, goes, her]

[bug, her, goes]

[bug, hose, erg]

[bugs, ego, her]

[bugs, go, here]

[bugs, her, ego]

[bugs, here, go]

[bus, erg, go, he]

[bus, erg, he, go]

[bus, go, erg, he]

[bus, go, he, erg]

[bus, gorge, he]

[bus, he, erg, go]

[bus, he, go, erg]

[bus, he, gorge]

[egg, hose, rub]

[egg, rub, hose]

[ego, bugs, her]

[ego, grub, she]

[ego, her, bugs]

[ego, she, grub]

[erg, bogus, he]

[erg, bug, hose]

[erg, bus, go, he]

[erg, bus, he, go]

[erg, go, bus, he]

[erg, go, he, bus]

[erg, go, he, sub]

[erg, go, sub, he]

[erg, goes, hub]

[erg, he, bogus]

[erg, he, bus, go]

[erg, he, go, bus]

[erg, he, go, sub]

[erg, he, sub, go]

[erg, hose, bug]

[erg, hub, goes]

[erg, sub, go, he]

[erg, sub, he, go]

[go, bee, shrug]

[go, bugs, here]

[go, bus, erg, he]

[go, bus, he, erg]

[go, erg, bus, he]

[go, erg, he, bus]

[go, erg, he, sub]

[go, erg, sub, he]

[go, he, bus, erg]

[go, he, erg, bus]

[go, he, erg, sub]

[go, he, sub, erg]

[go, here, bugs]

[go, shrug, bee]

[go, sub, erg, he]

[go, sub, he, erg]

[goes, bug, her]

[goes, erg, hub]

[goes, grub, he]

[goes, he, grub]

[goes, her, bug]

[goes, hub, erg]

[gorge, bus, he]

[gorge, he, bus]

[gorge, he, sub]

[gorge, sub, he]

[grub, ego, she]

[grub, goes, he]

[grub, he, goes]

[grub, she, ego]

[he, bog, surge]

[he, bogus, erg]

[he, bus, erg, go]

[he, bus, go, erg]

[he, bus, gorge]

[he, erg, bogus]

[he, erg, bus, go]

[he, erg, go, bus]

[he, erg, go, sub]

[he, erg, sub, go]

[he, go, bus, erg]

[he, go, erg, bus]

[he, go, erg, sub]

[he, go, sub, erg]

[he, goes, grub]

[he, gorge, bus]

[he, gorge, sub]

[he, grub, goes]

[he, sub, erg, go]

[he, sub, go, erg]

[he, sub, gorge]

[he, surge, bog]

[her, bug, goes]

[her, bugs, ego]

[her, ego, bugs]

[her, goes, bug]

[here, bugs, go]

[here, go, bugs]

[hose, bug, erg]

[hose, egg, rub]

[hose, erg, bug]

[hose, rub, egg]

[hub, erg, goes]

[hub, goes, erg]

[rub, egg, hose]

[rub, hose, egg]

[she, ego, grub]

[she, grub, ego]

[shrug, bee, go]

[shrug, go, bee]

[sub, erg, go, he]

[sub, erg, he, go]

[sub, go, erg, he]

[sub, go, he, erg]

[sub, gorge, he]

[sub, he, erg, go]

[sub, he, go, erg]

[sub, he, gorge]

[surge, bog, he]

[surge, he, bog]

 

phrase to scramble (return to quit)? George Bush

Max words to include (0 for no max)? 2

 

phrase to scramble (return to quit)? GeOrGe

Max words to include (0 for no max)? 2

[ego, erg]

[erg, ego]

 

phrase to scramble (return to quit)? GEORGES

Max words to include (0 for no max)? 2

[erg, goes]

[goes, erg]

 

phrase to scramble (return to quit)?

Sample execution (long file)

Welcome to the csc143 anagram solver.

 

What is the name of the dictionary file? dict3.txt

 

phrase to scramble (return to quit)? Howard Dean

Max words to include (0 for no max)? 2

[ahead, drown]

[don, warhead]

[drown, ahead]

[hadron, wade]

[head, onward]

[nod, warhead]

[onward, head]

[wade, hadron]

[warhead, don]

[warhead, nod]

 

phrase to scramble (return to quit)? Wesley Clark

Max words to include (0 for no max)? 2

[creaky, swell]

[seller, wacky]

[swell, creaky]

[wacky, seller]

 

phrase to scramble (return to quit)?

Sample execution (dict4 file)

Welcome to the csc143 anagram solver.

 

What is the name of the dictionary file? dict4.txt

 

phrase to scramble (return to quit)? hugo chavez

Max words to include (0 for no max)? 0

[huh, zag, cove]

[huh, cove, zag]

[zag, huh, cove]

[zag, cove, huh]

[cove, huh, zag]

[cove, zag, huh]

 

phrase to scramble (return to quit)? howard dean

Max words to include (0 for no max)? 2

[don, warhead]

[nod, warhead]

[head, onward]

[wade, hadron]

[ahead, drown]

[drown, ahead]

[hadron, wade]

[onward, head]

[warhead, don]

[warhead, nod]

 

phrase to scramble (return to quit)?