Question

For this week's lab, you will use two of the classes in the Java Collection Framework: HashSet and TreeSet. You will use these classes to implement a spell checker.

Set Methods

For this lab, you will need to use some of the methods that are defined in the Set interface. Recall that if set is a Set, then the following methods are defined:

set.size() -- Returns the number of items in the set.

set.add(item) -- Adds the item to the set, if it is not already there.

set.contains(item) -- Check whether the set contains the item.

set.isEmpty() -- Check whether the set is empty.

You will also need to be able to traverse a set, using either an iterator or a for-each loop.
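As a quick refresher, the methods listed above and both styles of traversal can be sketched as follows. This is only an illustration: the class name SetBasics, its helper methods, and the sample words are made up for this example and are not part of the lab.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class SetBasics {
    // Builds a small sample set; the second add of "apple" has no effect.
    static Set<String> buildSample() {
        Set<String> set = new HashSet<>();
        set.add("apple");
        set.add("pear");
        set.add("apple"); // already present, so the set is unchanged
        return set;
    }

    // Traverses the set with an explicit iterator and counts the items.
    static int countWithIterator(Set<String> set) {
        int count = 0;
        Iterator<String> it = set.iterator();
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    // Traverses the set with a for-each loop and sums the item lengths.
    static int totalLength(Set<String> set) {
        int total = 0;
        for (String item : set) {
            total += item.length();
        }
        return total;
    }

    public static void main(String[] args) {
        Set<String> set = buildSample();
        System.out.println("size: " + set.size());
        System.out.println("contains pear: " + set.contains("pear"));
        System.out.println("isEmpty: " + set.isEmpty());
        System.out.println("counted: " + countWithIterator(set));
        System.out.println("total length: " + totalLength(set));
    }
}
```

Note that a HashSet makes no promise about the order in which a traversal visits the items.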

Reading a Dictionary

The file /classes/s09/cs225/words.txt (also available at http://math.hws.edu/eck/cs225/s09/lab9/words.txt) contains a list of English words, with one word on each line. You will look up words in this list to check whether they are correctly spelled. To make the list easy to use, you can store the words in a set. Since there is no need to have the words stored in order, you can use a HashSet for maximum efficiency.

Use a Scanner to read the file. Recall that you can create a scanner, filein, for reading from a file with a statement such as:

          filein = new Scanner(new File("/classes/s09/cs225/words.txt"));

and that a file can be processed, token by token, in a loop such as:

          while (filein.hasNext()) {
              String tk = filein.next();
              process(tk); // do something with the token
          }

(For the wordlist file, a token is simply a word.)

Start your main program by reading the words from /classes/s09/cs225/words.txt and storing them in a HashSet<String>. For the purposes of this program, convert all words to lower case before putting them in the set. To make sure that you've read all the words, check the size of the set. (It should be 73845.) You could also use the contains method to check for the presence of some common word in the set.
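The reading step described above might look something like the following sketch. The class name DictionaryLoader and the method readWords are hypothetical, and a Scanner over a short string stands in for the real file so that the example runs anywhere; in the lab you would pass new Scanner(new File(...)) instead.

```java
import java.util.HashSet;
import java.util.Scanner;

class DictionaryLoader {
    // Reads whitespace-separated tokens and stores each one in lower case.
    static HashSet<String> readWords(Scanner in) {
        HashSet<String> words = new HashSet<>();
        while (in.hasNext()) {
            words.add(in.next().toLowerCase());
        }
        return words;
    }

    public static void main(String[] args) {
        // Stand-in for the word list: one word per line.
        Scanner in = new Scanner("Apple\nbanana\nCherry\napple\n");
        HashSet<String> dictionary = readWords(in);
        System.out.println("words read: " + dictionary.size());
        System.out.println("contains cherry: " + dictionary.contains("cherry"));
    }
}
```

In this made-up input, "Apple" and "apple" become the same string after conversion to lower case, so the set ends up with three words rather than four.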

Checking the Words in a File

Once you have the list of words in a set, it's easy to read the words from a file and check whether each word is in the set. Start by letting the user select a file. You can either let the user type the name of the file or you can use the following method:

         /**
          * Lets the user select an input file using a standard file
          * selection dialog box. If the user cancels the dialog without
          * selecting a file, the return value is null.
          */
         static File getInputFileNameFromUser() {
            JFileChooser fileDialog = new JFileChooser();
            fileDialog.setDialogTitle("Select File for Input");
            int option = fileDialog.showOpenDialog(null);
            if (option != JFileChooser.APPROVE_OPTION)
               return null;
            else
               return fileDialog.getSelectedFile();
         }

Use a Scanner to read the words from the selected file. In order to skip over any non-letter characters in the file, you can use the following command just after creating the scanner (where in is the variable name for the scanner):

          in.useDelimiter("[^a-zA-Z]+");

(In this statement, "[^a-zA-Z]+" is a regular expression that matches any sequence of one or more non-letter characters. This essentially makes the scanner treat any non-letter the way it would ordinarily treat a space.)

You can then go through the file, read each word (converting it to lower case) and check whether the set contains the word. At this point, just print out any word that you find that is not in the dictionary.
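One possible shape for this checking loop is sketched below. The names WordChecker and findUnknownWords are invented for the example, the dictionary is a tiny hand-made set, and the scanner reads from a string rather than from the user's file. The sketch collects the unknown words in a list instead of printing them directly, which is just a choice made here to keep the example easy to check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;

class WordChecker {
    // Returns every word from the scanner (lower-cased) that is not in the dictionary.
    static List<String> findUnknownWords(Scanner in, Set<String> dictionary) {
        // Treat any run of non-letter characters as a separator between words.
        in.useDelimiter("[^a-zA-Z]+");
        List<String> unknown = new ArrayList<>();
        while (in.hasNext()) {
            String word = in.next().toLowerCase();
            if (!dictionary.contains(word)) {
                unknown.add(word);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("the", "cat", "sat"));
        Scanner in = new Scanner("The cat, sat; teh cat!");
        for (String word : findUnknownWords(in, dictionary)) {
            System.out.println(word);
        }
    }
}
```

The delimiter is set once, right after the scanner is created and before the first call to hasNext(), as the text above suggests.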

Providing a List of Possible Correct Spellings

A spell checker shouldn't just tell you what words are misspelled -- it should also give you a list of possible correct spellings for that word. Write a method

          static TreeSet<String> corrections(String badWord, HashSet<String> dictionary)

that creates and returns a TreeSet<String> containing variations on badWord that are contained in the dictionary. In your main program, when you find a word that is not in the set of legal words, pass that word to this method (along with the set). Take the return value and output any words that it contains; these are the suggested correct spellings of the misspelled word. Here for example is part of the output from my program when it was run with the HTML source of this page as input:

   html: (no suggestions)
   cpsc: (no suggestions)
   hashset: hash set
   treeset: tree set
   cvs: cs, vs
   isempty: is empty
   href: ref
   txt: tat, tet, text, tit, tot, tut
   filein: file in
   pre: are, ere, ire, ore, pare, pee, per, pie, poe, pore, prep, pres, prey, pro, pry, pure, pyre, re
   hasnext: has next
   wordlist: word list
   getinputfilenamefromuser: (no suggestions)
   jfilechooser: (no suggestions)
   filedialog: file dialog
   setdialogtitle: (no suggestions)
   int: ant, dint, hint, in, ina, inc, ind, ink, inn, ins, inti, into, it, lint, mint, nit, pint, tint

Note that I have written my program so that it will not output the same misspelled word more than once. (I do this by keeping a set of misspelled words that have been output.) If my corrections() method returns an empty set, I output the message "(no suggestions)". Since the corrections are stored in a tree set, they are automatically printed out in alphabetical order with no repeats.
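The reporting behavior described above could be sketched roughly as follows. This is not the instructor's program: the class ReportFormat and the methods shouldReport and formatLine are hypothetical names chosen for illustration. The sketch relies on the fact that Set.add returns false when the item is already in the set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class ReportFormat {
    // Returns true only the first time a given misspelled word is seen.
    static boolean shouldReport(Set<String> alreadyReported, String badWord) {
        return alreadyReported.add(badWord);
    }

    // Builds one output line; a TreeSet iterates in alphabetical order.
    static String formatLine(String badWord, TreeSet<String> suggestions) {
        if (suggestions.isEmpty()) {
            return badWord + ": (no suggestions)";
        }
        return badWord + ": " + String.join(", ", suggestions);
    }

    public static void main(String[] args) {
        Set<String> alreadyReported = new HashSet<>();
        TreeSet<String> forCvs = new TreeSet<>(Arrays.asList("vs", "cs"));
        String[] misspelled = { "cvs", "html", "cvs" };
        for (String word : misspelled) {
            if (shouldReport(alreadyReported, word)) {
                TreeSet<String> suggestions = word.equals("cvs") ? forCvs : new TreeSet<>();
                System.out.println(formatLine(word, suggestions));
            }
        }
    }
}
```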

The possible corrections that I consider are as follows:

Delete any one of the letters from the misspelled word.

Change any letter in the misspelled word to any other letter.

Insert any letter at any point in the misspelled word.

Swap any two neighboring characters in the misspelled word.

Insert a space at any point in the misspelled word (and check that both of the words that are produced are in the dictionary)

For constructing the possible corrections, you will have to make extensive use of substrings. If w is a string, then w.substring(0,i) is the string consisting of the first i characters in w (not including the character in position i, which would be character number i+1). And w.substring(i) consists of the characters of w from position i through the end of the string. For example, if ch is a character, then you can change the i-th character of w to ch with the statement:

          String s = w.substring(0,i) + ch + w.substring(i+1);

Also, you will find it convenient to use a for loop in which the loop control variable is a char:

          for (char ch = 'a'; ch <= 'z'; ch++) { ...
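Putting the five kinds of corrections together with the substring hints, one way the method might be organized is sketched below. Treat it as a rough outline under assumptions rather than the expected solution: the class name CorrectionsSketch and the tiny dictionary in main are invented for the example, and the word is assumed to be in lower case already.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.TreeSet;

class CorrectionsSketch {
    static TreeSet<String> corrections(String badWord, HashSet<String> dictionary) {
        TreeSet<String> found = new TreeSet<>();
        int n = badWord.length();
        for (int i = 0; i < n; i++) {
            // Delete the character at position i.
            String deleted = badWord.substring(0, i) + badWord.substring(i + 1);
            if (dictionary.contains(deleted)) found.add(deleted);
            // Change the character at position i to every other letter.
            for (char ch = 'a'; ch <= 'z'; ch++) {
                if (ch == badWord.charAt(i)) continue;
                String changed = badWord.substring(0, i) + ch + badWord.substring(i + 1);
                if (dictionary.contains(changed)) found.add(changed);
            }
        }
        // Insert every letter at every position, including after the last character.
        for (int i = 0; i <= n; i++) {
            for (char ch = 'a'; ch <= 'z'; ch++) {
                String inserted = badWord.substring(0, i) + ch + badWord.substring(i);
                if (dictionary.contains(inserted)) found.add(inserted);
            }
        }
        // Swap each pair of neighboring characters.
        for (int i = 0; i < n - 1; i++) {
            String swapped = badWord.substring(0, i) + badWord.charAt(i + 1)
                    + badWord.charAt(i) + badWord.substring(i + 2);
            if (dictionary.contains(swapped)) found.add(swapped);
        }
        // Insert a space at each interior position; both halves must be words.
        for (int i = 1; i < n; i++) {
            String left = badWord.substring(0, i);
            String right = badWord.substring(i);
            if (dictionary.contains(left) && dictionary.contains(right)) {
                found.add(left + " " + right);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        HashSet<String> dictionary = new HashSet<>(Arrays.asList(
                "cs", "vs", "text", "tat", "is", "empty", "ref", "the"));
        String[] samples = { "cvs", "txt", "isempty", "href", "teh", "html" };
        for (String word : samples) {
            System.out.println(word + ": " + corrections(word, dictionary));
        }
    }
}
```

With the real word list the suggestions will of course differ from what this toy dictionary produces.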

*******Please do not forget to post the output and do not copy and paste someone else's code or from any other website on the internet. Thanks*****

Answer #1

Hello,

I've completed the below methods by using HashSet and TreeSet:

1) Reading and storing the words from the dictionary file (words.txt). You can change the file path of words.txt to suit your setup.

Note that you will not get the stated count of 73845. Since the file contains some duplicates (for example 'Sherry' and 'Afghan'), the set holds only 72875 words once the duplicates are removed.

2) Asking the user to select the input file by showing a dialog, then reading and storing the file contents. For my testing I kept only 5 words in the input file.

3) printCorrections: the method for the spell check. For 'isempty', the output will not be the expected 'is empty' but rather 'is em pt', because the dictionary contains 'em' and 'pt' as valid words.

Please make some small changes to this method so that the expected suggestions are produced for a word such as 'int'.

Program:

package com.dummy.suriya;

import javax.swing.*;
import java.io.File;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Scanner;
import java.util.TreeSet;

public class SpellChecker
{
    //All three sets hold strings, so declare them with a type parameter
    private static HashSet<String> dictionaryHashSet;
    private static HashSet<String> inputFileHashSet;
    private static TreeSet<String> outputCorrectionTreeSet;

    public static void main(String args[])
    {
        try{
            //Read the dictionary file words.txt which contains 73845 words
            //Each word in each line
            //And store in the HashSet in a lowercase
            readDictionaryFile();

            //Check whether all the words from the dictionary file are read and stored
            // Print the count of the words
            //It should be 73845 but since it contains some duplicates only 72875 words are stored
            //Eg. Words like Sherry, Afghan are found as duplicates.
            if(!dictionaryHashSet.isEmpty())
            {
                System.out.println("Count of total number of words read from the dictionary-->"+dictionaryHashSet.size());
            }

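            //Ask the user to select the file that contains the words to be spell checked
            File inputFileForSpellCheckTest = getInputFileNameFromUser();
            if (inputFileForSpellCheckTest == null) {
                //The user cancelled the dialog, so there is nothing to check
                System.out.println("No input file selected.");
                return;
            }
            Scanner testInputFileIn = new Scanner(inputFileForSpellCheckTest);
            //Set the delimiter once, before reading, so that non-letters separate the words
            testInputFileIn.useDelimiter("[^a-zA-Z]+");
            inputFileHashSet = new HashSet<>();
            while (testInputFileIn.hasNext()) {
                String tk = testInputFileIn.next();
                inputFileHashSet.add(tk.toLowerCase());
            }
            testInputFileIn.close();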
            if(!inputFileHashSet.isEmpty())
            {
                System.out.println("Count of total number of words read from the input test file-->"+inputFileHashSet.size());
            }

            //Check for spelling
            //Only print the words that are not found in the dictionary
            //If no suggestions can be provided, print as (no suggestions)
            //Suggestions include split words , rhyming words with only one letter change
            printCorrections();

        }
        catch(Exception e)
        {
            e.printStackTrace();
        }

    }

    public static void readDictionaryFile()
    {
        try {
            //Change this path to wherever words.txt is stored on your machine
            Scanner filein = new Scanner(new File("C:/Users/Guest/IdeaProjects/Java8Practice/src/com/dummy/suriya/SpellCheckerTextFiles/words.txt"));
            dictionaryHashSet = new HashSet<>();
            while (filein.hasNext()) {
                String tk = filein.next();
                dictionaryHashSet.add(tk.toLowerCase());
            }
            filein.close();
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }

    public static File getInputFileNameFromUser(){
        JFileChooser fileDialog = new JFileChooser();
        fileDialog.setDialogTitle("Select File for Input");
        int option = fileDialog.showOpenDialog(null);
        if (option != JFileChooser.APPROVE_OPTION)
            return null;
        else
            return fileDialog.getSelectedFile();
    }

    public static void printCorrections(){
        outputCorrectionTreeSet = new TreeSet();
        if(!inputFileHashSet.isEmpty()){
            Iterator itr = inputFileHashSet.iterator();
            while(itr.hasNext()){
                String inputWord = itr.next().toString();
                if(!dictionaryHashSet.contains(inputWord)){
                    String outputString = checkForSplitWords(inputWord);
                    if((outputString == null || outputString.trim().length()==0))
                    {
                        outputCorrectionTreeSet.add(inputWord+": (no suggestions)");
                    }
                    else{
                        outputCorrectionTreeSet.add(inputWord+": "+outputString);
                    }
                }
            }
        }

        if(!outputCorrectionTreeSet.isEmpty()){
            Iterator itr = outputCorrectionTreeSet.iterator();
            while(itr.hasNext()) {
                System.out.println(itr.next().toString());
            }
        }
    }

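    //Greedy left-to-right split: each piece is emitted as soon as it matches a dictionary word.
    //Trailing letters that do not complete a word are dropped, which is why
    //'isempty' comes out as 'is em pt' rather than 'is empty'.
    public static String checkForSplitWords(String inputWord){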
        String splitWords = "";
        int newPosition = 0;
        for(int i=0; i<inputWord.length();i++)
        {
            if(dictionaryHashSet.contains(inputWord.substring(newPosition,i)))
            {
                splitWords +=  inputWord.substring(newPosition,i) +" ";
                newPosition = i;
            }

        }
        return splitWords;
    }

}

Outputs:
