Question

This project is meant to give you experience writing linked lists and graphs. As such, you...

This project is meant to give you experience writing linked lists and graphs. As such, you are not permitted to use arrays or any data structure library. You may, however, make use of code presented in class and posted to Blackboard.

Objective

Your goal for this project is to take a block of text, analyze it, and produce random sentences in the style of the original text. For example, your program, given Wizard of Oz, might produce:

how quite lion bread them no mighty us said then for you or would and boots her wizard arts and injury with they the he nose said them

How?

1. Build a Markov Model

I generated the above text using a two step process. My program first analyzed the story and recorded word sequences. For example, it observed that the word so was followed by the following words: you, you, do, get, you, goodbye. It produced a record for so as follows:

so” ==> [ “you” 3 ], [”do” 1] , [”get” 1], [”goodbye” 1]

Each unique word in the story, except the last one, has a record like this. This is called a Markov model.

In general, a Markov model is a statistical model that tells the probability of one state leading to another. In this case we model the probability of word sequences in a book. For example, we can say that so is followed by you 1/2 of the time, do 1/6 of the time, get 1/6 of the time, and goodbye 1/6 of the time. In other words, our model tells us that if we see the word so, there's a 50% chance that you will come next.

2. Do a "random walk"

Now I will do the following:

a. choose a random "starter word".

b. find its entry.

c. randomly pick a next word, based on the probability. For example, if the word is so, then there's a 50/50 chance that I'll pick you as my next word.

d. print out that word.

e. go to step b and repeat.

Steps.

You should complete the project in the following steps.

PART 1: REA D THE WORDS INTO A LIST

Step 1. Get some e-books in .txt form

Go to Project Gutenberg and download some free books, such as Alice in Wonderland and The Wizard of Oz. Make sure that they are in ASCII Text format, meaning that you can open them in Notepad. Save them in your project's directory.

Step 2. Make a KeyWordList class

You can start by downloading the class example LinkedList or make your own. Don't make it generic: it will only hold Strings. Call your class KeyWordList and implement the following methods:

public void add(String s)

adds s to the list

public String get(int index)

returns the string at index index

public int length()

returns the number of elements in the list

public int find(String s)

returns the index of the first instance of s

public void print()

prints out the entire list to the console

Step 3. Read the words into the list

In your main class, make a KeyWordList keywordlist. Read the words in one at a time. Preprocess them to convert them to lower case and remove numbers and punctuation. If the resulting word isn't empty, add it to keywordlist.

Call keywordlist.print() and check that you see all the words in the book.

Step 4. Only add unique words to the list

Make a new function inside of KeyWordList: void addUnique(String s)

This function should use the find function and add words to the list if they are not previously present. Call this function from your main class instead of add. When you print out your list you should see no repeated words.

PART 2: BUILD THE MARKOV MODEL

In this part you will store a list of next-words for each word in your KeyWordList.

Step 5. Make another linked list for holding next-words

Now we'll make a new class NextWordList. Each element in NextWordList will hold 1) a string word and 2) an int count.

NextWordList will have two methods:

void foundNextWord(String nextword)

search the list for nextword.

- if it doesn't find it, it will add it and set its count to 1.

- if it does find it, it will increase its count.

void print() will print out each list element and its count.

Test this:

In your main class, make a NextWordList and call foundNextWord with you, you, do, get, you, goodbye.Print it out and see if you get

you 3

do 1

get 1

goodbye 1

Step 6. Give every keyword a nextwordlist

Go back to your KeyWordList and modify it so that every KeyWordElement has a NextWordList objectnextwordlist. Don't forget to construct it in add.

Make a new KeyWordList method

public void foundWordSequence(String keyword, String nextword)

that does the following:

- calls addUnique to find the keyword element if it exists or make a new one if it doesn't

- adds nextword to that keyword's nextwordlist with foundNextWord

Step 7. Modify main()

Go back to your main class where you read the words in from the file.

Read in an initial word before you start your while loop, and call it keyword.

Inside your loop, each time you read a word word, don't call addUnique. Instead callkeywordlist.foundWordSequence(keyword, word); Then make that word your new key:

keyword=word;

Test it

Modify KeyWordList's print function so that it prints out not only the word but all of its nextwords as well. Run the program and see if it works. For example, my printout for the Wizard of included

break:

myself 1

them 1

mended:

asked 1

you 1

in 2

mr:

joker 3

joker:

one 1

youre 1

said 1

hundred:

places 1

PART 3. DO A RANDOM WALK

Step 8. Get a random next word

Add another method to NextWordList

public String getRandomWord()

that randomly picks a next word from the list. You can use the following approach:

1. Make a totalcount=0. Step through each NextWordElement. Add its count to totalcount.

2. Pick a random integer choice between 0 and totalcount.

3. Make a runningcount=0. Step through each NextWordElement again and add its count to runningcount. However, stop when runningcount>=choice. Return that next-word.

Step 9. Update KeyWordList

Add a method to KeyWordList:

String getRandomNextWord(String keyword)

The method should find the matching KeyWordElement, call getRandomWord on its nextwords, and return it.

Test your program.

Call keywordlist.getRandomNextWord("the") a few times in your main class and see if the words you get make sense.

Step 10. Pick a random KeyWord

Add another method to KeyWordList:

string getRandomKeyword() that will pick a random starting word.

This method should:

- step through the list and count the number of KeyWordElements

- pick a random number from 0 to keywordelementcount

- step through again and and count until you reach that random number

- return that keyword

Inside of your main class call this method to get a starting word startword and print it out.

Step 11. Do a random walk

Now we can generate text!

Inside of your main class do the following:

String nextword = keywordlist.getRandomNextWord(startword); and print it out.

startword = nextword;

Put this in a loop. Run it 100 times. You should get gibberish!

PART 4. MAKE A SECOND-ORDER MARKOV MODEL

You will get better gibberish if you use pairs of words as your keyword. Consider the following approach:

- Pick a starting word pair they are

- Get a next word for that pair and print it out --> sharks

- Combine it with the starting pair. are sharks

- Get a next word for that pair and print it out --> and

- Repeat this:

sharks and

--> they

and they

--> will

they will

--> eat

will eat

--> us

eat us

--> up

If you followed the approach in Parts 2 and 3, you will not have to modify either of your classes at all! All the modifications can be done in main().

Step 12. Add word pairs to the KeyWordList

To start with, read two words from the book before your while loop instead of one word. Call them word1and word2.

In your while loop:

- put them together: keyword = word1 + " " + word2

- get a next word nextword

- call keywordlist.foundWordSequence(keyword,nextword)

- shift them: word1=word2; word2=nextword;

Step 13. Modify your random walk

You won't have to change the lines:

startword = keywordlist.getRandomKeyWord()

nextword = keywordlist.getRandomNextWord(startword);

at all. However startword will now hold a pair of words, so you can no longer say

startword = nextword;

Instead do the following:

1. Remove everything in startword up to and including the first space.

startword.substring(startword.indexOf(" ")+1,startword.length())

2. Add nextword on to the end.

startword = startword + " " + nextword;

PART 5. MAKE AN N-ORDER MARKOV MODEL

Instead of single words or pairs of words, what if we made it variable? Have a constant int order. Modify your work in main so that if order is 3 you use three word sequences, if order is 5 you use five word sequences, and so on. Make sure, too, that your program still works if order is 1 or 2.

In general, as you increase order, the text will become better grammatically but less original. I find that 2 or 3 is the sweet spot.

Requirements

  • Your program must be written in Java

  • You may not use arrays (apart from Part 5) or any preexisting data structure libraries

  • Your program must compile without errors

  • You must submit your .java files to Blackboard

0 0
Add a comment Improve this question Transcribed image text
Answer #1

Working code implemented in Java and appropriate comments provided for better understanding:

Here I am attaching code for these files:

  • Markov.java
  • NextWordList.java
  • KeyWordList.java

Source code for Markov.java:

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Scanner;

public class Markov {

private KeyWordList keywordlist = new KeyWordList();
private int order;

private Markov() throws IOException {
Scanner kb = new Scanner(System.in);
// ask user for N-order
System.out.print("Enter a Markov ORDER (N): ");
// assign input to global order variable
order = kb.nextInt();

kb.close();

// raw text file name string
String bookFileName = "wizard_of_oz.txt";
// create clean text file
createCleanTextFile(bookFileName);
//build keyKeyWordList object
buildKeyWordList();
// print contents of key word list
keywordlist.printAll();
// preform random walk
randomWalk();

}

private void createCleanTextFile(String rawTextFile) throws IOException {
//open file with a FileReader
FileReader fileReader = new FileReader
("" + rawTextFile);
// create new Scanner object with fileReader as parameter
Scanner fileScanner = new Scanner(fileReader);

// file writer to create clean copy of text
FileWriter writer = new FileWriter("Clean_text.txt");

// create clean text
String wordToken;
String[] tempWordList;
while(fileScanner.hasNext()) {
wordToken = fileScanner.next();

//get rid of "--" appearing in the middle of two words
// put in a list of two words
tempWordList = wordToken.split("--");
//strip punctuation from end of word
tempWordList[0] = tempWordList[0].replaceAll("\\s*\\p{Punct}+\\s*$", "");
//get rid of all numerical characters
tempWordList[0] = tempWordList[0].replaceAll("\\d", "");
//set all characters to lowercase for both words in temp word list
tempWordList[0] = tempWordList[0].toLowerCase();
//tokenize each string
char[] firstWord = tempWordList[0].toCharArray();
// replace punctuation with '_' from front of word until a letter is reached
for(int i = 0; i < firstWord.length; i++) {
if(firstWord[i] >= 97 && firstWord[i] <= 122) {
break;
}
firstWord[i] = '_';
}
//assemble word back together from char arrays
tempWordList[0] = new String(firstWord);
// get rid of all '_' chars in the string
tempWordList[0] = tempWordList[0].replaceAll("_", "");
// if resulting string is not empty, add it to the master list
if(!tempWordList[0].equalsIgnoreCase("")) {
writer.write(tempWordList[0]);
writer.write(" ");
}

// process second word if it exists
if(tempWordList.length == 2) {
tempWordList[1] = tempWordList[1].replaceAll("\\s*\\p{Punct}+\\s*$", "");
tempWordList[1] = tempWordList[1].replaceAll("\\d", "");
tempWordList[1] = tempWordList[1].toLowerCase();
char[] secondWord = tempWordList[1].toCharArray();
for(int i = 0; i < secondWord.length; i++) {
if(secondWord[i] >= 97 && secondWord[i] <= 122) {
break;
}
secondWord[i] = '_';
}
tempWordList[1] = new String(secondWord);
tempWordList[1] = tempWordList[1].replaceAll("_", "");
if(!tempWordList[1].equalsIgnoreCase("")) {
writer.write(tempWordList[1]);
writer.write(" ");
}
}
}
writer.close();
fileReader.close();
fileScanner.close();
}

private void buildKeyWordList() throws IOException {
// now process clean text
FileReader fileReader = new FileReader
("Clean_text.txt");
// create new Scanner object with fileReader as parameter
Scanner fileScanner = new Scanner(fileReader);

// array holding all words in keyword entry
String[] keyWordArray = new String[order];
// create new KeyWordList - read in first key words
for(int i=0; i<order; i++) {
keyWordArray[i] = fileScanner.next();
}

String nextword;
// main loop through clean text file
while(fileScanner.hasNext()) {
StringBuilder keyword = new StringBuilder();
// if 1 word in keyword element, just add the word
if(keyWordArray.length == 1) {
keyword.append(keyWordArray[0]);
}
// if more than one word is used in key word..
else if(keyWordArray.length > 1) {
for(int i = 0; i < keyWordArray.length; i++) {
if(i == 0) {
keyword.append(keyWordArray[i]);
}
else {
keyword.append(" ").append(keyWordArray[i]);
}
}
}
nextword = fileScanner.next();
keywordlist.foundWordSequence(keyword.toString(), nextword);

// shift array down and make last element = nextword
for(int i = 0; i < keyWordArray.length; i++) {
if(i < keyWordArray.length - 1) {
keyWordArray[i] = keyWordArray[i+1];
}
else if (i == keyWordArray.length - 1){
keyWordArray[i] = nextword;
}

}
}
fileReader.close();
fileScanner.close();
}

private void randomWalk() {
// get random starting word from key word list
String startWord = keywordlist.getRandomKeyword();

// preform random walk
for(int i=0; i < 100; i++) {
String nextword = keywordlist.getRandomNextWord(startWord);
System.out.print(nextword + " ");
if(order == 1) {
startWord = nextword;
}
else {
startWord = startWord.substring(startWord.indexOf(" ")+1);
startWord = startWord + " " + nextword;
}
}
}


public static void main(String[] args) throws IOException {
new Markov();
}

}

Source code for NextWordList.java:

import java.util.Objects;
import java.util.Random;

class NextWordList {

private static class NextWordElement {
String word;
int count;
NextWordElement next;

// Element class constructor
private NextWordElement(String w) {
word = w;
count = 0;
next = null;
}
private NextWordElement getNext() {
if(this.next==null) {
return null;
}
return this.next;
}
private void setNext(NextWordElement newElement) {
this.next = newElement;
}
private String getWord() {
return this.word;
}
private int getCount() {
return this.count;
}
}

// fields
private NextWordElement head;

void foundNextWord(String nextWord) {
int index = find(nextWord);
if(index == -1) {
this.add(nextWord);
}
else {
Objects.requireNonNull(this.get(index)).count++;
}
}

private void add(String newData) {
NextWordElement curr = this.head;
if(curr == null) {
this.head = new NextWordElement(newData);
this.head.count = 1;
}
else {
while(curr.getNext() != null) {
curr = curr.getNext();
}
curr.setNext(new NextWordElement(newData));
curr.getNext().count = 1;
}
}

private int find(String s) {
int index = 0;
NextWordElement curr = this.head;
while(curr != null) {
if(s.equalsIgnoreCase(curr.getWord())) {
return index;
}
index++;
curr = curr.getNext();
}
return -1;
}

private NextWordElement get(int index) {
NextWordElement curr = this.head;
for(int i=0; i < index; i++) {
curr = curr.getNext();
if(curr==null) {
System.out.println("invalid index");
return null;
}
}
return curr;
}

String getRandomWord() {
// create rand object
Random randNum = new Random();
int totalCount = 0;
// point to first element in list
NextWordElement curr = this.head;
// get total count for all words in list
while(curr != null) {
totalCount += curr.getCount();
curr = curr.getNext();
}

int choice = randNum.nextInt(totalCount);
int runningCount = 0;
curr = this.head;
while(runningCount < choice && curr.next != null) {
runningCount += curr.getCount();
curr = curr.getNext();
}
assert curr != null;
return curr.getWord();
}

void printAll() {
NextWordElement curr = this.head;
while (curr != null) {
System.out.println("\t" + curr.getWord() + " " + curr.count);
curr = curr.getNext();
}
}
}

Source code for KeyWordList.java:

import java.util.Random;

public class KeyWordList {

// nested Element Class - node class that will hold the Strings
private static class KeyWordElement {
String value;
KeyWordElement next;
NextWordList nextwordlist;

// Element class constructor
private KeyWordElement(String thingToStore) {
value = thingToStore;
next = null;
nextwordlist = null;
}
private KeyWordElement getNext() {
return this.next;
}
private void setNext(KeyWordElement newElement) {
this.next = newElement;
}
private String getData() {
return this.value;
}
}

// KeyWordList fields
private KeyWordElement head;
private int size;

// KeyWordList constructor
KeyWordList() {
this.head = null;
this.size = 0;
}

private void add(String newData) {
KeyWordElement curr = this.head;
if(curr == null) {
this.head = new KeyWordElement(newData);
this.head.nextwordlist = new NextWordList();
}
else {
while(curr.getNext() != null) {
curr = curr.getNext();
}
curr.setNext(new KeyWordElement(newData));
curr.getNext().nextwordlist = new NextWordList();
}
this.size++;
}

private void addUnique(String s) {
int index = find(s);
if(index == -1) {
add(s);
}
}

public KeyWordElement get(int index) {
KeyWordElement curr = this.head;
for(int i=0; i < index; i++) {
curr = curr.getNext();
if(curr==null) {
System.out.println("invalid index");
return null;
}
}
return curr;
}

public int length() {
return this.size;
}

public int find(String s) {
int index = 0;
KeyWordElement curr = this.head;
while(curr != null) {
if(s.equalsIgnoreCase(curr.getData())) {
return index;
}
index++;
curr = curr.getNext();
}
return -1;
}

void foundWordSequence(String keyword, String nextWord) {
this.addUnique(keyword);
int index = this.find(keyword);
this.get(index).nextwordlist.foundNextWord(nextWord);
}

String getRandomNextWord(String keyword) {
int foundWordIndex = this.find(keyword);
if(foundWordIndex != -1) {
KeyWordElement kwe = this.get(foundWordIndex);
return kwe.nextwordlist.getRandomWord();
}
System.out.println("Keyword not present in document");
return null;
}

String getRandomKeyword() {
Random rand = new Random();
int wordCount = 0;
KeyWordElement curr = this.head;
while(curr != null) {
wordCount++;
curr = curr.getNext();
}
int randInt = rand.nextInt(wordCount);
return this.get(randInt).getData();
}

public void printAll() {
KeyWordElement curr = this.head;
while (curr != null) {
System.out.println(curr.getData() +":");
curr.nextwordlist.printAll();
System.out.println();
curr = curr.getNext();
}
}

}

Sample Output Screenshots:

Hope it helps, if you like the answer give it a thumbs up. Thank you.

Add a comment
Know the answer?
Add Answer to:
This project is meant to give you experience writing linked lists and graphs. As such, you...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Not the answer you're looking for? Ask your own homework help question. Our experts will answer your question WITHIN MINUTES for Free.
Similar Homework Help Questions
  • You will be writing some methods that will give you some practice working with Lists. Create...

    You will be writing some methods that will give you some practice working with Lists. Create a new project and create a class named List Practice in the project. Then paste in the following code: A program that prompts the user for the file names of two different lists, reads Strings from two files into Lists, and prints the contents of those lists to the console. * @author YOUR NAME HERE • @version DATE HERE import java.util.ArrayList; import java.util.List; import...

  • Write a C++ program for the instructions below. Please read the instructions carefully and make sure they are followed correctly.   and please put comment with code! Problem:2 1. Class Student Create...

    Write a C++ program for the instructions below. Please read the instructions carefully and make sure they are followed correctly.   and please put comment with code! Problem:2 1. Class Student Create a "Hello C++! I love CS52" Program 10 points Create a program that simply outputs the text Hello C++!I love CS52" when you run it. This can be done by using cout object in the main function. 2. Create a Class and an Object In the same file as...

  • Hi, I need some help finishing the last part of this Python 1 code. The last...

    Hi, I need some help finishing the last part of this Python 1 code. The last few functions are incomplete. Thank you. The instructions were: The program has three functions in it. I’ve written all of break_into_list_of_words()--DO NOT CHANGE THIS ONE. All it does is break the very long poem into a list of individual words. Some of what it's doing will not make much sense to you until we get to the strings chapter, and that's fine--that's part of...

  • I need this in java please Create an automobile class that will be used by a...

    I need this in java please Create an automobile class that will be used by a dealership as a vehicle inventory program. The following attributes should be present in your automobile class: private string make private string model private string color private int year private int mileage. Your program should have appropriate methods such as: default constructor parameterized constructor add a new vehicle method list vehicle information (return string array) remove a vehicle method update vehicle attributes method. All methods...

  • In python Count the frequency of each word in a text file. Let the user choose...

    In python Count the frequency of each word in a text file. Let the user choose a filename to read. 1. The program will count the frequency with which each word appears in the text. 2. Words which are the spelled the same but differ by case will be combined. 3. Punctuation should be removed 4. If the file does not exist, use a ‘try-execption’ block to handle the error 5. Output will list the words alphabetically, with the word...

  • write a program which include a class containing an array of words (strings).The program will search...

    write a program which include a class containing an array of words (strings).The program will search the array for a specific word. if it finds the word:it will return a true value.if the array does not contain the word. it will return a false value. enhancements: make the array off words dynamic, so that the use is prompter to enter the list of words. make the word searcher for dynamic, so that a different word can be searched for each...

  • CSCI 2010 Lab11 Link-Lists Lab 11A Linked-Lists Preparation Create a Visual Studio C++ Project C...

    CSCI 2010 Lab11 Link-Lists Lab 11A Linked-Lists Preparation Create a Visual Studio C++ Project C2010Lab11A Add the following to the project. //LinkedList.cpp #include <cstdlib> #include "LinkedList.h" using namespace std; //--------------------------------------------------- //List Element Members //--------------------------------------------------- ListElement::ListElement(int d, ListElement * n) {    datum=d;    next=n; } int ListElement::getDatum () const {    return datum; } ListElement const* ListElement::getNext () const {    return next; } //--------------------------------------------------- //LinkedList Members //--------------------------------------------------- LinkedList::LinkedList () {    head=NULL; } void LinkedList::insertItem(int item) {    ListElement *currPtr = head;    ListElement *prevPtr =...

  • Assignment: Chapter 18 introduces you to linked lists. In this homework assignment you will use the...

    Assignment: Chapter 18 introduces you to linked lists. In this homework assignment you will use the algorithms presented in this chapter to implement a list of names that should be kept in alphabetical order. Section 18.2 defines the Linked List Operations based on a list of numbers. You will change this code to work with a list of names. struct ListNode { string firstNname; string lastName; struct ListNode *next; }; Rename class NumberList to something more appropriate like StringList or...

  • CSBP 319 Data structures - Linked Lists - USE JAVA (NetBeans) A company would like to...

    CSBP 319 Data structures - Linked Lists - USE JAVA (NetBeans) A company would like to implement its inventory of computing machines as a linked list, called ComputerList. Write a Computer node class, called ComputerNode, to hold the following information about a Computer: • code (as a String) • brand (as a String) • model (as a String) • price (as double) • quantity (as int) ComputerNode should have constructors and methods (getters, setters, and toString()) to manage the above...

  • IN PYTHON Assignment Overview This assignment will give you experience on the use of classes. Understand...

    IN PYTHON Assignment Overview This assignment will give you experience on the use of classes. Understand the Application The assignment is to first create a class calledTripleString.TripleStringwill consist of threeinstance attribute strings as its basic data. It will also contain a few instance methods to support that data. Once defined, we will use it to instantiate TripleString objects that can be used in our main program. TripleString will contain three member strings as its main data: string1, string2, and string3....

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT