Description:
Overview: You will write a program (say, wordcountfreq.c) to find
out the number of words and how many times each word appears (i.e.,
the frequency) in multiple text files. Specifically, the program
will first determine the number of files to be processed. Then, the
program will create multiple threads, where each thread is
responsible for one file: it counts the number of words appearing in
the file and records the number of times each word appears in a
global linked list. The typical format to run the program with
input parameters is as follows:
./wordcountfreq File_1 File_2 ... File_n
Details: First, the program needs to determine the number of
files to be processed. This can be done with the argc parameter of
the main function (argc also counts the program name, so there are
argc - 1 files). Then, the argv parameter can be used to retrieve
the name of each file.
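The argc/argv step can be sketched as below. This is only an illustration under stated assumptions: the helper names input_file_count, input_file_name and list_input_files are hypothetical and are not part of the assignment or of the posted solution.

```c
#include <stdio.h>

/* Number of input files: argv[0] is the program name, so it is excluded. */
int input_file_count(int argc)
{
    return argc > 1 ? argc - 1 : 0;
}

/* Name of the i-th input file (0-based), or NULL if i is out of range. */
const char *input_file_name(int argc, char *argv[], int i)
{
    if (i < 0 || i >= input_file_count(argc))
        return NULL;
    return argv[i + 1];
}

/* Print one line per input file and return how many were listed. */
int list_input_files(int argc, char *argv[])
{
    int n = input_file_count(argc);
    for (int i = 0; i < n; i++)
        printf("File_%d: %s\n", i + 1, input_file_name(argc, argv, i));
    return n;
}
```

Calling list_input_files(argc, argv) at the top of main would list the names in the same order in which the threads are later created.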
After that, you need to use pthread_create() to create multiple
threads (one for each file). Each thread needs to count the number
of words in its file and keep track of how many times each word
appears in all the files using a global linked list, where each
node represents a different word found in any file. When a thread
finishes, it should print out the number of words it found as
follows:
Thread x: number of words in File_x is XXX
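One possible shape for the create/report/join cycle is sketched below. To keep it self-contained, each thread counts the words of an in-memory string instead of a file; that substitution, and the names struct job, count_job and count_all, are assumptions made for this illustration only. Each thread writes its result into its own slot, so the main thread can read it safely once pthread_join() has returned.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical per-thread job: one slot per input, never shared. */
struct job {
    pthread_t tid;
    int thread_num;      /* 1-based thread number used in the report */
    const char *text;    /* stand-in for the contents of File_x */
    long words;          /* result written by the worker thread */
};

/* Worker: count the space-separated words in job->text and report them. */
static void *count_job(void *arg)
{
    struct job *j = arg;
    int in_word = 0;
    j->words = 0;
    for (const char *p = j->text; *p != '\0'; p++) {
        if (*p == ' ') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            j->words++;
        }
    }
    printf("Thread %d: number of words in File_%d is %ld\n",
           j->thread_num, j->thread_num, j->words);
    return NULL;
}

/* Start one thread per job, wait for all of them, and return the total.
 * Returns -1 if a thread could not be created. */
long count_all(struct job jobs[], int n)
{
    int started = 0;
    long total = 0;
    for (; started < n; started++) {
        jobs[started].thread_num = started + 1;
        if (pthread_create(&jobs[started].tid, NULL,
                           count_job, &jobs[started]) != 0)
            break;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++) {
        pthread_join(jobs[i].tid, NULL);
        total += jobs[i].words;
    }
    return started == n ? total : -1;
}
```

Because every job has its own words field, no lock is needed for the per-thread counts; only data that several threads update, such as a shared word list, needs a mutex.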
At the end, the main thread needs to get the number of words from each thread and report the total number of words found by all threads, as well as the frequency of each word across all files, as follows:
All n files have been counted and the total of xxx words found !
aaa appears XXX times
bbb appears YYY times
...
zzz appears MMM times
Here, the words should appear in dictionary order in the report!
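One way to get the dictionary order is to keep the list sorted at insertion time, so the report is a plain walk from the head. The sketch below assumes that approach; struct word_node, word_list, word_list_lock and record_word are hypothetical names, and the order is the byte order of strcmp(), which places uppercase letters before lowercase ones in ASCII.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one node per distinct word. */
struct word_node {
    char *word;
    int count;
    struct word_node *next;
};

static struct word_node *word_list = NULL;
static pthread_mutex_t word_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record one occurrence of word, keeping the list sorted by strcmp().
 * Returns the updated count, or -1 if memory runs out. */
int record_word(const char *word)
{
    int result = -1;
    pthread_mutex_lock(&word_list_lock);

    /* Walk a pointer to the link, so inserting at the head is not special. */
    struct word_node **link = &word_list;
    while (*link != NULL && strcmp((*link)->word, word) < 0)
        link = &(*link)->next;

    if (*link != NULL && strcmp((*link)->word, word) == 0) {
        result = ++(*link)->count;          /* word already known */
    } else {
        struct word_node *node = malloc(sizeof *node);
        char *copy = malloc(strlen(word) + 1);
        if (node != NULL && copy != NULL) {
            strcpy(copy, word);             /* the list owns its own copy */
            node->word = copy;
            node->count = 1;
            node->next = *link;
            *link = node;
            result = 1;
        } else {
            free(node);
            free(copy);
        }
    }

    pthread_mutex_unlock(&word_list_lock);
    return result;
}
```

The lookup and the insertion happen under the same lock, so two threads that meet the same new word at the same time cannot both insert it.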
For word counting, you could simply use the space character as the delimiter. Anything that is not separated by a space will be counted as a single word. For instance, the example “The first program is a Hello-world.” will be reported as 6 words (where Hello-world. is counted as a single word). You can compare your results using the wc utility that is available on Linux machines.
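The counting rule can be sketched with strtok_r(), the reentrant POSIX variant of strtok(), which is the safer choice when several threads tokenize at once. The function name count_words_in_line is hypothetical. Treating tab, carriage return and newline as delimiters in addition to the space is an assumption made here so that the result lines up with wc -w on ordinary text.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Count the words in one line of text.
 * The buffer is modified, because strtok_r() writes '\0' over delimiters. */
long count_words_in_line(char *line)
{
    const char *delims = " \t\r\n";
    char *save = NULL;      /* per-call state, so the function is reentrant */
    long words = 0;

    for (char *tok = strtok_r(line, delims, &save);
         tok != NULL;
         tok = strtok_r(NULL, delims, &save))
        words++;

    return words;
}
```

For the example sentence above, a buffer holding "The first program is a Hello-world." followed by a newline yields 6.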
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <pthread.h>
struct thread_info { /* Used as argument to processFile() */
    pthread_t thread_id; /* ID returned by pthread_create() */
    int thread_num; /* Application-defined thread # */
    char *argv_string; /* From command-line argument filename */
};
// A linked list node
struct Node
{
    char *data;
    int count;
    struct Node *next;
};
struct Node *head = NULL;
int totalWordCount = 0;
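void push(char *new_data)
{
    /* 1. allocate the node and a private copy of the word; the caller's
       buffer is reused for the next line, so keeping its pointer is not enough */
    struct Node *new_node = malloc(sizeof(struct Node));
    char *copy = malloc(strlen(new_data) + 1);
    if (new_node == NULL || copy == NULL)
    {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    strcpy(copy, new_data);
    /* 2. put in the data */
    new_node->data = copy;
    new_node->count = 1;
    /* 3. find the link where the word belongs in dictionary (strcmp) order */
    struct Node **link = &head;
    while (*link != NULL && strcmp((*link)->data, new_data) < 0)
        link = &(*link)->next;
    /* 4. splice the new node in at that position */
    new_node->next = *link;
    *link = new_node;
}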
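/* Returns true and increments the count if x is already in the list */
bool search(char *x)
{
    struct Node *current = head; // Initialize current
    while (current != NULL)
    {
        // compare the characters, not the pointers
        if (strcmp(current->data, x) == 0)
        {
            current->count++;
            return true;
        }
        current = current->next;
    }
    return false;
}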
void AddWord(char *word)
{
    // search() already updates the count when the word is found
    if (!search(word))
    {
        push(word);
    }
}
// This function prints contents of linked list starting from head
void printList(void)
{
    struct Node *node = head;
    while (node != NULL)
    {
        printf("%s appears %d times\n", node->data, node->count);
        node = node->next;
    }
}
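/* Protects the shared list and totalWordCount */
pthread_mutex_t listLock = PTHREAD_MUTEX_INITIALIZER;
void *processFile(void *vargp)
{
    int noOfWords = 0;
    // Store the value argument passed to this thread
    struct thread_info *tinfo = vargp;
    FILE *fp;
    fp = fopen(tinfo->argv_string, "r"); // read mode
    if (fp == NULL)
    {
        perror("Error while opening the file");
        exit(EXIT_FAILURE);
    }
    char line[256]; /* a word that straddles this limit is split in two */
    while (fgets(line, sizeof line, fp) != NULL) /* read a line */
    {
        /* strtok_r() keeps its state in saveptr, so threads do not interfere;
           the newline kept by fgets() is treated as a delimiter too */
        char *saveptr = NULL;
        char *token = strtok_r(line, " \t\r\n", &saveptr);
        while (token != NULL)
        {
            noOfWords++;
            pthread_mutex_lock(&listLock);
            AddWord(token);
            pthread_mutex_unlock(&listLock);
            token = strtok_r(NULL, " \t\r\n", &saveptr);
        }
    }
    fclose(fp);
    printf("Thread %d: number of words in %s is %d\n", tinfo->thread_num, tinfo->argv_string, noOfWords);
    pthread_mutex_lock(&listLock);
    totalWordCount += noOfWords;
    pthread_mutex_unlock(&listLock);
    return NULL;
}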
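int main(int argc, char *argv[])
{
    int noOfFiles = argc - 1;
    if (noOfFiles < 1)
    {
        fprintf(stderr, "Usage: %s File_1 File_2 ... File_n\n", argv[0]);
        return EXIT_FAILURE;
    }
    // one thread_info per thread, so no two threads share an argument
    struct thread_info *tinfo = calloc(noOfFiles, sizeof *tinfo);
    if (tinfo == NULL)
    {
        perror("calloc");
        return EXIT_FAILURE;
    }
    for (int i = 0; i < noOfFiles; i++)
    {
        tinfo[i].thread_num = i + 1;
        tinfo[i].argv_string = argv[i + 1];
        if (pthread_create(&tinfo[i].thread_id, NULL, processFile, &tinfo[i]) != 0)
        {
            fprintf(stderr, "pthread_create failed for %s\n", argv[i + 1]);
            return EXIT_FAILURE;
        }
    }
    for (int i = 0; i < noOfFiles; i++) // wait for every thread before reading the results
        pthread_join(tinfo[i].thread_id, NULL);
    printf("All %d files have been counted and the total of %d words found !\n", noOfFiles, totalWordCount);
    printList();
    free(tinfo);
    return 0;
}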
//output: