Python program
This assignment requires you to write a single large program. I have broken it into two parts below as a suggestion for how to approach writing the code. Please turn in one program file.
Sentiment Analysis is a Big Data problem which seeks to determine the general attitude of a writer given some text they have written. For instance, we would like to have a program that could look at the text "The film was a breath of fresh air" and realize that it was a positive statement while "It made me want to poke out my eye balls" is negative.
One algorithm that we can use for this is to assign a numeric value to any given word based on how positive or negative that word is and then score the statement based on the values of the words. But, how do we come up with our word scores in the first place?
That's the problem that we’ll solve in this assignment. You are going to search through a file containing movie reviews from the Rotten Tomatoes website, each of which has both a numeric score and text. You’ll use this to learn which words are positive and which are negative.
The data file is movie_reviews.txt, and looks like this:
4 This quiet , introspective and entertaining independent is worth seeking .
1 Aggressive self-glorification and a manipulative whitewash .
4 Best indie of the year , so far .
2 Nothing more than a run-of-the-mill action flick .
2 Reeks of rot and hack work from start to finish .
Note that each review starts with a number 0 through 4 with the following meaning:
You are going to write a program that prompts the user to enter a phrase and then indicates whether that phrase is generally "positive" or "negative", by using the sentiment data contained in the data file.
To begin, your program has to compute the average score for all words in the movie_reviews.txt file. You should do this by writing code to do the following:
4 I loved it
1 I hated it
... might look like this as a dictionary:
words['i'] = [5,2]
words['loved'] = [4,1]
words['it'] = [5,2]
words['hated'] = [1,1]
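A minimal sketch of how the dictionary above could be built, assuming each review line starts with its score followed by a space and then the review text. The function name `build_word_scores` is hypothetical, and the `[total score, count]` layout is simply read off the example; the assignment may intend a different design.

```python
def build_word_scores(lines):
    """Map each lowercase word to [sum of review scores, number of occurrences]."""
    words = {}
    for line in lines:
        tokens = line.lower().split()
        if not tokens:
            continue  # skip blank lines
        score = int(tokens[0])  # first token is the 0-4 review score
        for word in tokens[1:]:
            entry = words.setdefault(word, [0, 0])
            entry[0] += score  # running total of scores
            entry[1] += 1      # running count of occurrences
    return words


sample = ["4 I loved it", "1 I hated it"]
words = build_word_scores(sample)
print(words["i"])      # [5, 2]
print(words["hated"])  # [1, 1]
```

The average score for a word is then `words[word][0] / words[word][1]`, for example 5 / 2 = 2.5 for 'i' in this two-line sample.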
Initializing sentiment database.
Sentiment database initialization complete.
Read 8529 lines.
Total unique words analyzed: 16442
Analysis took 0.142 seconds to complete.
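The timing and summary lines in this output could be produced along the following lines. This is only a sketch: the helper names `timed` and `format_report` are hypothetical, and `time.perf_counter()` is one reasonable choice of clock, not something the assignment prescribes.

```python
import time


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def format_report(line_count, unique_words, elapsed):
    """Build the status lines shown in the sample output."""
    return "\n".join([
        "Sentiment database initialization complete.",
        f"Read {line_count} lines.",
        f"Total unique words analyzed: {unique_words}",
        f"Analysis took {elapsed:.3f} seconds to complete.",
    ])


# sum(range(1000)) is a stand-in for the real file-loading step.
result, elapsed = timed(sum, range(1000))
print(format_report(8529, 16442, 0.142))
```

The `:.3f` format specifier rounds the elapsed time to three decimal places, matching the "0.142 seconds" style above.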
Now, your program should:
Here is an example session:
Initializing sentiment database.
Sentiment database initialization complete.
Read 8529 lines.
Total unique words analyzed: 16442
Analysis took 0.130 seconds to complete.
Enter a phrase to test: i loved it
* 'i' appears 383 times with an average score of 1.8302872062663185
* 'loved' appears 9 times with an average score of 2.6666666666666665
* 'it' appears 2405 times with an average score of 1.99002079002079
Average score for this phrase is: 2.1623248876512586
This is a POSITIVE phrase.
Enter a phrase to test: this movie was awful
* 'this' appears 994 times with an average score of 1.9657947686116701
* 'movie' appears 969 times with an average score of 1.8286893704850362
* 'was' appears 169 times with an average score of 1.621301775147929
* 'awful' appears 23 times with an average score of 1.0869565217391304
Average score for this phrase is: 1.6256856089959415
This is a NEGATIVE phrase.
Enter a phrase to test: pikachu is watching you
* 'pikachu' does not appear in any movie reviews.
* 'is' appears 2409 times with an average score of 2.0568700705687006
* 'watching' appears 80 times with an average score of 1.875
* 'you' appears 850 times with an average score of 2.050588235294118
Average score for this phrase is: 1.9941527686209397
This is a NEGATIVE phrase.
Enter a phrase to test: pikachu charmander
* 'pikachu' does not appear in any movie reviews.
* 'charmander' does not appear in any movie reviews.
Not enough words to determine sentiment.
Enter a phrase to test: happy birthday sad kitten
* 'happy' appears 17 times with an average score of 2.588235294117647
* 'birthday' appears 9 times with an average score of 2.7777777777777777
* 'sad' appears 33 times with an average score of 2.212121212121212
* 'kitten' appears 1 times with an average score of 2.0
Average score for this phrase is: 2.3945335710041595
This is a POSITIVE phrase.
Enter a phrase to test: it made me want to poke out my eyeballs
* 'it' appears 2405 times with an average score of 1.99002079002079
* 'made' appears 148 times with an average score of 1.945945945945946
* 'me' appears 81 times with an average score of 1.5802469135802468
* 'want' appears 67 times with an average score of 1.8208955223880596
* 'to' appears 2996 times with an average score of 1.9589452603471296
* 'poke' does not appear in any movie reviews.
* 'out' appears 298 times with an average score of 1.8187919463087248
* 'my' appears 83 times with an average score of 2.036144578313253
* 'eyeballs' appears 1 times with an average score of 1.0
Average score for this phrase is: 1.7688738696130188
This is a NEGATIVE phrase.
Enter a phrase to test: I would not, could not, Sam I Am
* 'i' appears 383 times with an average score of 1.8302872062663185
* 'would' appears 213 times with an average score of 1.6431924882629108
* 'not' appears 596 times with an average score of 1.919463087248322
* 'could' appears 155 times with an average score of 1.8838709677419354
* 'not' appears 596 times with an average score of 1.919463087248322
* 'sam' appears 2 times with an average score of 1.5
* 'i' appears 383 times with an average score of 1.8302872062663185
* 'am' appears 7 times with an average score of 2.7142857142857144
Average score for this phrase is: 1.90510621966498
This is a NEGATIVE phrase.
Enter a phrase to test: quit
Quitting.
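The session above suggests three behaviors: input is lowercased, punctuation such as the commas in "not," is dropped, and unknown words are reported but left out of the average. A minimal sketch of that logic follows. The names `score_phrase` and `label` are hypothetical, the toy dictionary stands in for the real review data, and the "greater than 2 is positive" cutoff is an assumption consistent with the session (2.16 is positive, 1.99 is negative), not a rule stated in it.

```python
import string


def score_phrase(phrase, words):
    """Return (list of (word, average or None), overall average or None).

    `words` maps a word to [total score, count].
    """
    # Lowercase and strip punctuation so "not," matches "not".
    table = str.maketrans("", "", string.punctuation)
    cleaned = phrase.lower().translate(table)
    averages = []
    for word in cleaned.split():
        if word in words:
            total, count = words[word]
            averages.append((word, total / count))
        else:
            averages.append((word, None))  # reported, but not averaged
    known = [avg for _, avg in averages if avg is not None]
    overall = sum(known) / len(known) if known else None
    return averages, overall


def label(overall):
    """Classify an overall average; the cutoff of 2 is an assumption."""
    if overall is None:
        return "Not enough words to determine sentiment."
    if overall > 2:
        return "This is a POSITIVE phrase."
    return "This is a NEGATIVE phrase."


# Toy data only, not taken from movie_reviews.txt.
toy_words = {"i": [5, 2], "loved": [4, 1], "it": [5, 2], "hated": [1, 1]}
averages, overall = score_phrase("I loved it!", toy_words)
print(averages, overall, label(overall))
```

Note that stripping everything in `string.punctuation` also removes apostrophes and hyphens, so a token like "run-of-the-mill" would become "runofthemill"; whether that is desirable depends on how the review file itself is tokenized.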
Some notes:
#Sentiment Analysis
#Data of movie ratings followed by the review given by the critic.
#Program takes into account the rating and every individual word in the review.
import time
begin_time = time.time()
#set up empty dictionary to hold words
sentiment = {}
#open reviews
file_object = open('movie_reviews.txt', 'r')
#grab data from file
alldata = str.lower(file_object.read())
#close file
file_object.close()
#cut based on new line character to analyze each review
split_reviews = alldata.split('\n')
print('Initializing sentiment database')
#examine every review in database
for review in split_reviews:
    words = review.split()
    #skip blank lines, such as the one left by a trailing newline
    if not words:
        continue
    #the first token is the review's score, not a word
    score = int(words[0])
    #examine every word in this review
    for word in words[1:]:
        #add to sentiment dictionary if necessary, update if it exists already
        if word not in sentiment:
            sentiment[word] = [1, score]
        else:
            sentiment[word][0] += 1
            sentiment[word][1] += score
end_time = time.time()
#display stats
#use a separate name so the time module is not overwritten
elapsed = format(end_time - begin_time, '.2f')
print('Sentiment database initialization complete')
print('Total unique words analyzed:', len(sentiment))
print('Analysis took', elapsed, 'seconds to complete')
print('')
#keep prompting until the user types quit
while True:
    #convert to lowercase
    phrase = str.lower(input('Enter a phrase to test: '))
    if phrase == 'quit':
        print('Quitting.')
        break
    phrase_split = phrase.split()
    total_avg = 0
    amount = 0
    #count values to figure out the average score for the phrase
    for word in phrase_split:
        if word in sentiment:
            avg_score = sentiment[word][1] / sentiment[word][0]
            print('* \'', word, '\' appears ', sentiment[word][0], ' times with an average score of ', avg_score, sep = '')
            total_avg += avg_score
            amount += 1
        else:
            print('* \'', word, '\' does not appear in any movie reviews', sep = '')
    #if no words appear in reviews
    if amount == 0:
        print('Not enough words to determine sentiment.')
    #else display the average and if > 2 display as a positive statement.
    #if less, display as negative
    else:
        print('Average score for this phrase is:', total_avg / amount)
        if (total_avg / amount) > 2:
            print('This is a POSITIVE phrase')
        else:
            print('This is a NEGATIVE phrase')
To solve this problem, we need to follow the steps mentioned in the assignment. We'll first compute the average score for all words in the 'movie_reviews.txt' file and store the data in a dictionary. Then, we'll prompt the user to enter phrases and analyze them using the dictionary to compute sentiment scores. Finally, we'll check whether the overall phrase is positive or negative based on the average scores of the words in the phrase.
Here's the Python program to accomplish this:
import time

def read_movie_reviews(file_name):
    words = {}
    total_lines = 0
    with open(file_name, 'r') as file:
        for line in file:
            total_lines += 1
            parts = line.strip().split(' ', 1)
            # Skip blank lines and lines with a score but no review text.
            if len(parts) < 2:
                continue
            score = int(parts[0])
            words_in_review = set(parts[1].lower().replace("'", "").replace("-", "").split())
            for word in words_in_review:
                if word in words:
                    words[word][0] += score
                    words[word][1] += 1
                else:
                    words[word] = [score, 1]
    return words, total_lines

def main():
    print("Initializing sentiment database.")
    start_time = time.time()
    words, total_lines = read_movie_reviews('movie_reviews.txt')
    end_time = time.time()
    print("Sentiment database initialization complete.")
    print(f"Read {total_lines} lines.")
    print(f"Total unique words analyzed: {len(words)}")
    print(f"Analysis took {end_time - start_time:.3f} seconds to complete.")
    while True:
        user_input = input("Enter a phrase to test (type 'quit' to exit): ")
        if user_input.lower() == 'quit':
            print("Quitting.")
            break
        words_in_phrase = user_input.lower().replace("'", "").replace("-", "").split()
        num_known_words = 0
        total_score = 0
        for word in words_in_phrase:
            if word in words:
                # Only words found in the reviews count toward the average.
                word_average = words[word][0] / words[word][1]
                print(f"* '{word}' appears {words[word][1]} times with an average score of {word_average}")
                total_score += word_average
                num_known_words += 1
            else:
                print(f"* '{word}' does not appear in any movie reviews.")
        if num_known_words == 0:
            print("Not enough words to determine sentiment.")
        else:
            average_score = total_score / num_known_words
            print(f"Average score for this phrase is: {average_score:.2f}")
            if average_score >= 2:
                print("This is a POSITIVE phrase.")
            else:
                print("This is a NEGATIVE phrase.")

if __name__ == "__main__":
    main()
This Python program reads the 'movie_reviews.txt' file, analyzes it, and stores the word scores in a dictionary. It then prompts the user to enter phrases and computes the sentiment scores based on the average word scores. The program continues to prompt for phrases until the user enters "quit" to exit.
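Make sure the 'movie_reviews.txt' file is in the same directory as the Python program before running it. The output follows the format of the examples in the assignment, but the exact counts and averages may differ from them, because this version counts each word at most once per review and strips apostrophes and hyphens before splitting.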