Question

Write a program named text_indexing.c that does the following: Reads text and stores it as one...

Write a program named text_indexing.c that does the following:

Reads text and stores it as one string called text. You can read from a file or from the user. (In my implementation, I read only one paragraph (up to new line) from the user. With this same code, I am able to read data from a file by using input redirection (executable < filename) when I run the program. See sample runs below).

You can assume that the text will not have more than 10000 characters.

We want to be able to perform fast searches in this text. We will index it by keeping the indexes where each word starts in alphabetical order.

Find, store (in an array) and print the index for each word and the word itself. They should show in the order in which they appear in the text. (See sample run.)

Sort using INSERTION SORT the indexes and print again the indexes and the words. They should show in sorted order now.(See sample run.)

The sorting must be case insensitive for the first letter. That is, if the first letter of any of the two words you are comparing is upper case, you must use the lower case value in your comparison. (Notice in the sample run with data1.txt that 'Chemical' is placed after 'application'.). You may want to write your own string compare method.

When searching for words (with binary search) count the number of words you look at until you get the answer (found or not found). Display that number in the run.

When searching (not when sorting) you should ignore the punctuation at the end of words. See the run with data0.txt:

        ./textidx.exe < data0.txt
        Enter the text:

        ...
        ...
        processes - found (3 iterations)
        processes. - not found (4 iterations)
        ...
        

REQUIREMENT: you must NOT make a copy of the data. That is, you must keep it still as one long string and process the words in it. DO NOT create an array of individual words. (-20 points if you make an array. If you do make the array, it should NOT be in sorted order. The goal here is to use indirect sorting and if you create a sorted array you would not be doing an indirect sort. You will sort the indexes into this array. See the indirect sorting with selection sort in the slides.)

8 points will be given for good practices: variable names, comments, descriptions. These will only be earned if the program is 75 correct. Non-working programs or programs that implement too few of the requirements will not get them.

C function that I used:scanf - to read the whole text:

                scanf("%[^\n]s",text); // the [^\n] part makes it stop reading when it reaches the first \n.
                

sscanf - to read from the string with the whole text one word in a separate variable

if text is a char array, text[i] is a char, and &text[i] is a pointer. Note that in class I showed the incorrect expression: *text[i].

Words are 'cleaned-up' by: making them all lower case and looking at the last symbol. If it was not a letter, that symbol was cut from the word (and so . and , at the end of some words can be removed). However this clean-up is also applied to the words that I am searching for.

Sample runs

compile:
gcc -o textidx text_indexing.c

Sample run 1 with input redirected from a file (not typed by the user):
./textidx.exe < data0.txt
Enter the text:

Words and indexes (the printed words are 'cleaned'):
 i   |index[i]|  word
-----|--------|------------------
   0 |      0 |  chemical
   1 |      9 |  engineering
   2 |     21 |  encompasses
   3 |     33 |  the
   4 |     37 |  development
   5 |     50 |  application
   6 |     63 |  and
   7 |     67 |  operation
   8 |     77 |  of
   9 |     80 |  processes

The sorted data (the printed words are 'cleaned'):
 i   |index[i]|  word
-----|--------|------------------
   0 |     63 |  and
   1 |     50 |  application
   2 |      0 |  chemical
   3 |     37 |  development
   4 |     21 |  encompasses
   5 |      9 |  engineering
   6 |     77 |  of
   7 |     67 |  operation
   8 |     80 |  processes
   9 |     33 |  the

Binary search. ---- Enter words to search for. Stop with -1.
(original: that)
cleaned: that - not found (4 iterations)
(original: at)
cleaned: at - not found (3 iterations)
(original: the)
cleaned: the - found (4 iterations)
(original: application)
cleaned: application - found (2 iterations)
(original: processes)
cleaned: processes - found (3 iterations)
(original: processes.)
cleaned: processes - found (3 iterations)
(original: and)
cleaned: and - found (3 iterations)


 The original text is still the same: Chemical engineering encompasses the development, application, and operation of processes.




Sample run 2 (with user typing the text):

./textidx.exe
Enter the text:
THIS, is another, EXAMple.

Words and indexes (the printed words are 'cleaned'):
 i   |index[i]|  word
-----|--------|------------------
   0 |      0 |  this
   1 |      6 |  is
   2 |      9 |  another
   3 |     18 |  example

The sorted data (the printed words are 'cleaned'):
 i   |index[i]|  word
-----|--------|------------------
   0 |      9 |  another
   1 |     18 |  example
   2 |      6 |  is
   3 |      0 |  this

Binary search. ---- Enter words to search for. Stop with -1.
THis
(original: THis)
cleaned: this - found (3 iterations)
THIS
(original: THIS)
cleaned: this - found (3 iterations)
this,
(original: this,)
cleaned: this - found (3 iterations)
is,
(original: is,)
cleaned: is - found (2 iterations)
is,,
(original: is,,)
cleaned: is, - not found (3 iterations)
ANTOTHER
(original: ANTOTHER)
cleaned: antother - not found (2 iterations)
ANOTHER
(original: ANOTHER)
cleaned: another - found (2 iterations)
EXAMPLE
(original: EXAMPLE)
cleaned: example - found (1 iterations)
-1

 The original text is still the same: THIS, is another, EXAMple.


Sample run 3:  - Posted at 6pm, 9/14
Enter the text:
TT tabc

Words and indexes (the printed words are 'cleaned'):
 i   |index[i]|  word
-----|--------|------------------
   0 |      0 |  tt
   1 |      3 |  tabc

The sorted data (the printed words are 'cleaned'):
 i   |index[i]|  word
-----|--------|------------------
   0 |      3 |  tabc
   1 |      0 |  tt

Binary search. ---- Enter words to search for. Stop with -1.
-1

 The original text is still the same: TT tabc
0 0
Add a comment Improve this question Transcribed image text
Request Professional Answer

Request Answer!

We need at least 10 more requests to produce the answer.

0 / 10 have requested this problem solution

The more requests, the faster the answer.

Request! (Login Required)


All students who have requested the answer will be notified once they are available.
Know the answer?
Add Answer to:
Write a program named text_indexing.c that does the following: Reads text and stores it as one...
Your Answer:

Post as a guest

Your Name:

What's your source?

Earn Coins

Coins can be redeemed for fabulous gifts.

Similar Homework Help Questions
  • This program should test the running time of these algorithms: Selection Sort Insertion Sort Bubble Sort...

    This program should test the running time of these algorithms: Selection Sort Insertion Sort Bubble Sort Merge Sort Quick Sort Heap Sort You have access to the implementation of all of these sorting algorithms, and you may use what is provided in your text directly. Be sure to cite the source of these implementations in the header of your program. Please maintain the property that these sorting algorithms sort arrays in ascending order. For this homework, you will write a...

  • C++ Lab 1. Read in the contents of a text file up to a maximum of...

    C++ Lab 1. Read in the contents of a text file up to a maximum of 1024 words – you create your own input. When reading the file contents, you can discard words that are single characters to avoid symbols, special characters, etc. 2. Sort the words read in ascending order in an array (you are not allowed to use Vectors) using the Selection Sort algorithm implemented in its own function. 3. Search any item input by user in your...

  • Using Arrays with Sorting and Searching Algorithms 1) This program has six required outputs and involves...

    Using Arrays with Sorting and Searching Algorithms 1) This program has six required outputs and involves searching and sorting an array of integers. Write a java application that initializes an array with the following numbers, in this order: 23, 17, 5, 90, 12, 44, 38, 84, 77, 3, 66, 55, 1, 19, 37, 88, 8, 97, 25, 50, 75, 61, and 49. Then display the unsorted values. This is required output #1 of 6 for this program. 2) Using a...

  • previous question was not answered correctly, I need to write a program for sortarray2.cpp that is...

    previous question was not answered correctly, I need to write a program for sortarray2.cpp that is based off searcharray.cpp and sortarray1.cpp. sortarray2 should have a counter that calculates the number of swaps used to complete the sorting of the array (Exercise) Create-sortArray2.cpp In this part of the lab, you will write a program that uses a different implementation of selection sort from the one you chose before: if you had chosen to search for the minimum for the traverse of...

  • You will be reading in 3 files in the program. One will contain a list of...

    You will be reading in 3 files in the program. One will contain a list of 1000 words in unsorted order. The second file will contain 1000 words in sorted order. The final file will contain 20 words to be searched for. The main program has been written for you. You will be implementing three functions: bool readWords(string array[], int size, string fileName); int linearSearch(string wordToFind, const string words[], int size); int binarySearch(string wordToFind, const string words[], int size); The...

  • LANGUAGE = C i. Write a program that takes int numbers from user until user gives...

    LANGUAGE = C i. Write a program that takes int numbers from user until user gives a sentinel value (loop terminating condition). Sort the numbers in ascending order using Insertion sort method. Sorting part should be done in different function named insertionSort(). Your code should count the number of swaps required to sort the numbers. Print the sorted list and total number of swaps that was required to sort those numbers. (4 marks) ii. In this part take another number...

  • (Python 3) Write a program that reads the contents of a text file. The program should...

    (Python 3) Write a program that reads the contents of a text file. The program should then create a dictionary in which the keys are individual words found in the file and the values are the number of times each word appears and a list that contains the line numbers in the file where the word (the key) is found. Then the program will create another text file. The file should contain an alphabetical listing of the words that are...

  • write a program which include a class containing an array of words (strings).The program will search...

    write a program which include a class containing an array of words (strings).The program will search the array for a specific word. if it finds the word:it will return a true value.if the array does not contain the word. it will return a false value. enhancements: make the array off words dynamic, so that the use is prompter to enter the list of words. make the word searcher for dynamic, so that a different word can be searched for each...

  • This is binary search tree problem. The program reads the text file, and creates a binary...

    This is binary search tree problem. The program reads the text file, and creates a binary search tree based on the words in the file. I can create the tree but I also have to store 'the order of insertion' in each node.   For example, if text includes "the" 3 times and it is the 1st, 5th, and 9th word in the file, in binary search tree, one node should have string value "the" and array list{1, 5, 9}. How...

  • In java, write a program with a recursive method which asks the user for a text...

    In java, write a program with a recursive method which asks the user for a text file (verifying that the text file exists and is readable) and opens the file and for each word in the file determines if the word only contains characters and determines if the word is alpha opposite. Now what I mean by alpha opposite is that each letter is the word is opposite from the other letter in the alphabet. For example take the word...

ADVERTISEMENT
Free Homework Help App
Download From Google Play
Scan Your Homework
to Get Instant Free Answers
Need Online Homework Help?
Ask a Question
Get Answers For Free
Most questions answered within 3 hours.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT