Before running this code, make sure that validators, requests packages are installed.
validators is used to check whether the given file_name is url or not.
remove_word(): this function is used to remove the specific word from the list and returns the modified list
along with the list of different representation of the specific word.
import os
import validators
import requests as req
# DELIMITERS is used to split the string based on the spacial characters it contains.
# Update DELIMITERS as per your needs
DELIMITERS = ['.',',','\t','\n']
def remove_word(data, word):
# this list is used to store the different representations of the word
representations = []
for d in data:
if d.lower() == word:
if d not in representations:
representations.append(d)
data.remove(d)
return data, representations
def word_stats(data: str):
# Replace all delimiters with space
for char in DELIMITERS:
data = data.replace(char,' ')
# coverting the string into list of words by splitting the string
words = data.split()
words_copy = data.lower()
total_words = len(words)
while len(words) != 0:
word = words[0].lower()
word_count = words_copy.count(word)
words, representations = remove_word(words, word)
print("{0}:{1}%,
{2} total occurrences, {3} representations
{4}".format(word,(word_count/total_words)*100,word_count,
len(representations), str(representations)))
if __name__ == "__main__":
file_name = input()
data = None
if validators.url(file_name):
data = req.get(file_name).text
else:
data = open(file_name, mode = 'r').read()
word_stats(data)
Output:
Could you please do this in Python to run in Spark. For your second assignment I...
Could anyone please help with this Python code assignment? In this programming assignment you are to create a program called numstat.py that reads a series of integer numbers from a file and determines and displays the following: The name of the file. The sum of the numbers. The count of how many numbers are in the file. The average of the numbers. The average is the sum of the numbers divided by how many there are. The maximum value. The...
I need some help with programming this assignment. Programming Assignment 6: A Python Class, Attributes, Methods, and Objects Obiectives .Be able to write a Python class Be able to define class attributes Be able to define class methods .Be able to process input from a text file .Be able to write an application using objects The goal of this programming assignment is to develop a simple image processing application. The application will import a class that creates image objects with...
In this assignment, you will explore more on text analysis and an elementary version of sentiment analysis. Sentiment analysis is the process of using a computer program to identify and categorise opinions in a piece of text in order to determine the writer’s attitude towards a particular topic (e.g., news, product, service etc.). The sentiment can be expressed as positive, negative or neutral. Create a Python file called a5.py that will perform text analysis on some text files. You can...
Assignment 3: Word Frequencies Prepare a text file that contains text to analyze. It could be song lyrics to your favorite song. With your code, you’ll read from the text file and capture the data into a data structure. Using a data structure, write the code to count the appearance of each unique word in the lyrics. Print out a word frequency list. Example of the word frequency list: 100: frog 94: dog 43: cog 20: bog Advice: You can...
Python program This assignment requires you to write a single large program. I have broken it into two parts below as a suggestion for how to approach writing the code. Please turn in one program file. Sentiment Analysis is a Big Data problem which seeks to determine the general attitude of a writer given some text they have written. For instance, we would like to have a program that could look at the text "The film was a breath of...
Could anyone help add to my python code? I now need to calculate the mean and median. In this programming assignment you are to extend the program you wrote for Number Stats to determine the median and mode of the numbers read from the file. You are to create a program called numstat2.py that reads a series of integer numbers from a file and determines and displays the following: The name of the file. The sum of the numbers. The...
Python Help Please! This is a problem that I have been stuck on.I am only suppose to use the basic python coding principles, including for loops, if statements, elif statements, lists, counters, functions, nested statements, .read, .write, while, local variables or global variables, etc. Thank you! I am using python 3.4.1. ***( The bottom photo is a continuation of the first one)**** Problem statement For this program, you are to design and implement text search engine, similar to the one...
I am having problems with the following assignment. It is done in the c language. The code is not reading the a.txt file. The instructions are in the picture below and so is my code. It should read the a.txt file and print. The red car hit the blue car and name how many times those words appeared. Can i please get some help. Thank you. MY CODE: #include <stdio.h> #include <stdlib.h> #include <string.h> struct node { char *str; int...
Python Assignment In this assignment, you will use Pandas library to perform analysis on the dataset stored in the following csv file: breast-cancer-wisconsin.csv. Please write script(s) to do the following: 1. Read the csv file and covert the dataset into a DataFrame object. 2. Persist the dataset into a SQL table and a JASON file. • Write the content of the DataFrame object into an SQLite database table. This will convert the dataset into a SQL table format. You can...
You need not run Python programs on a computer in solving the following problems. Place your answers into separate "text" files using the names indicated on each problem. Please create your text files using the same text editor that you use for your .py files. Answer submitted in another file format such as .doc, .pages, .rtf, or.pdf will lose least one point per problem! [1] 3 points Use file math.txt What is the precise output from the following code? bar...