Question

Problem B: Clean up your potty mouth: blogging software

Online blogs and other written content generation are very popular. However, as we try different ways to allow anybody on the internet to create and share written content, we run into a problem. A lot of people on the internet have foul potty-mouths and write lots of bad words. The common solution to this is to perform some form of automatic content filtering to either remove or obscure inappropriate text on an online system.

While a lot of “bad” words (like the ones in the example files for this problem) are not actually that offensive (just crass), there are also many words that are authentically hurtful or offensive, such as racial slurs and other hate speech. Therefore, while our program will focus on a relatively arbitrary or silly list of words, you should keep in mind that this can be a serious issue, and it is treated quite seriously in many online games and content hosting websites.

Program Behavior

Our program will process a text file one line at a time based on two lists of “inappropriate words”. The program will process the text, removing words and lines that are bad and printing back to the user only the good lines of text.

The first list of bad words is the banned word list. Any line of text in the text file that contains a banned word should not be shown to the user. The entire line of text should simply be removed. The second list of bad words is the bad word list. A line of text with a bad word can be printed; however, each and every occurrence of the bad word needs to be “removed” by marking it with asterisks “*”.

Formally, your program should:

1. Prompt the user for the file of user text to read.

2. Open this file, printing an error and halting if it cannot be opened.

3. For each line of text in the input file:

(a) If the line of text contains a banned word (from the banned word file) do not print this line

(b) If the line of text does not contain a banned word, then it should replace every occurrence of any bad word (from the bad word file) with an appropriate number of asterisks “*”. (The number of asterisks should equal the length of the replaced word.)

(c) After replacing the bad words, the line can be printed back to the user.
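
Steps (a) through (c) can be sketched as a small per-line filter. This is only an illustration under stated assumptions: the names `filter_line` and `filter_lines` are hypothetical, and matching whole words case-insensitively is one possible reading of the assignment, which does not say how words should be matched.

```python
import re


def filter_line(line, banned_words, bad_words):
    """Return None if the line contains a banned word, else the line with bad words masked."""

    # Assumption: whole-word, case-insensitive matching (not specified by the assignment).
    def pattern(word):
        return re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)

    # Step (a): drop the line entirely if any banned word appears.
    if any(pattern(w).search(line) for w in banned_words):
        return None

    # Step (b): replace each bad word with one asterisk per character.
    for w in bad_words:
        line = pattern(w).sub("*" * len(w), line)
    return line


def filter_lines(lines, banned_words, bad_words):
    """Yield only the printable lines, already masked (step (c))."""
    for line in lines:
        result = filter_line(line.rstrip("\n"), banned_words, bad_words)
        if result is not None:
            yield result


if __name__ == "__main__":
    # Hypothetical sample data, not the assignment's example files.
    sample = ["What a darn mess.", "This line says heck.", "A clean line."]
    for out in filter_lines(sample, banned_words=["heck"], bad_words=["darn"]):
        print(out)
```

Because the masking works on the original line rather than on a lowercased copy, the printed line keeps its capitalization and punctuation; only the bad words themselves change.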

Note that there are some obvious tasks which are not in the previous outline, notably opening the bad word list and banned word list and reading their contents. You can choose when and how you do this. It’s possible to write your code to read these files once and only once, storing the bad words in an array, or you can write your code to open these files many times, looping over their contents as needed. The choice is yours.

Notes on the files:

• BadWords.txt - You can assume this file will always be named BadWords.txt (and that you do not need to ask the user for this file name). You cannot assume that the example BadWords file will be used when grading (we may swap for a different file). You cannot assume a specific length for this file, although we promise it won’t be longer than 100 words. Any word in this file should be replaced with asterisks “*” anywhere it appears in the input file.

• BannedWords.txt - You can assume this file will always be named BannedWords.txt (and that you do not need to ask the user for this file name). You cannot assume that the example BannedWords file will be used when grading (we may swap for a different file). You cannot assume a specific length for this file, although we promise it won’t be longer than 100 words. Any line of text that contains these words should not be printed.
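
The "read once and only once" option mentioned above might look like the following sketch. The helper name `load_words` is hypothetical, and the sketch assumes one word per line in each file, which the assignment does not state explicitly.

```python
def load_words(path):
    """Read a word list once into a set, lowercased, skipping blank lines.

    Assumes one word per line. Any OSError from open() is left for the
    caller to handle.
    """
    with open(path, "r") as fp:
        return {line.strip().lower() for line in fp if line.strip()}
```

A program would call this once per list at startup, for example `banned = load_words("BannedWords.txt")` and `bad = load_words("BadWords.txt")`, and reuse the two sets for every line of the input file. No fixed length is assumed, so a swapped-in file of any size up to the promised 100 words is handled the same way.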

Answer #1

Screenshots of the sample input file, BannedWords.txt, and BadWords.txt are given after the code. Refer to them to understand the code better.

import re
import sys

def read_words(path):
    # read one word per line, lowercased, skipping blank lines
    with open(path, "r") as fp:
        return [w.strip().lower() for w in fp if w.strip()]

def whole_word(word):
    # case-insensitive pattern that matches the word only as a whole word
    return re.compile(r'\b' + re.escape(word) + r'\b', re.IGNORECASE)

#reading banned words and bad words
BannedWords = read_words("BannedWords.txt")
BadWords = read_words("BadWords.txt")

#reading user input
filename = input("Enter input file path: ")

try:
    inFile = open(filename, "r")
except OSError:
    # the assignment asks for an error message and a halt if the file cannot be opened
    print("Error: could not open", filename)
    sys.exit(1)

with inFile:
    for line in inFile:
        line = line.rstrip('\n')
        # a line containing any banned word is not printed at all
        if any(whole_word(w).search(line) for w in BannedWords):
            continue
        # every occurrence of a bad word becomes one asterisk per letter
        for w in BadWords:
            line = whole_word(w).sub('*' * len(w), line)
        print(line)

Input To the program(input.txt)

Output of the program:

From the output you can see that lines 3, 4, 5, and 6 of input.txt contain banned words and hence are not printed, while lines 1 and 2 have bad words, which are replaced with asterisks.

BannedWords.txt:

BadWords.txt
