Question

Detecting Substrings (C++ Version)
Introduction
A very common task for programs that work with text files is locating a specific substring within the file. I am sure we’ve all done this many times when working with Word, Notepad, or other editors.

Since we don’t have a GUI or other means of displaying the contents of a file all at once, let’s modify the problem slightly. Rather than locating a specific substring within a file and then highlighting the results, as most modern programs would do, let’s write a C++ program that locates the occurrences of a specific substring within a file and then displays the occurrence number as well as a portion of the text around the found substring. It should also count the number of occurrences and indicate that number in a brief report at the end.

The input file that we will use for this exercise is a text copy of the Declaration of Independence. You will find this in the input file “DeclOfIndep.txt” on eLearning. It is okay to hard code this file name for this project, but in the future, you might have your program ask the user for the filename. That would generalize your program to work on any input file.   

Even though this is a C++ program, let’s use C-strings in this exercise for all string operations. Thus, your program will read each line of input from the file into a C-string, not into a C++ string. The functions you use to locate the substrings should be C-string functions and not members of the string class. All printing and other manipulations on strings should be done with C-strings. In short, there should be no C++ strings used in this program. (Therefore, do not #include <string>.)   
Overview
Here’s a high-level overview of what your program should do:

1) Ask the user for the substring to search for (and store it in a C-string).
2) Calculate and display each of the found occurrences of the substring in the file. For each found occurrence, your program should display (a) the location number starting at 1 and going up to the total number of locations found, and (b) the portion of the string containing the found substring. This portion should consist of the substring itself and up to 8 characters before and 8 characters after the found substring. (Note that if the substring is within 8 characters of the beginning or end of a line this won’t be possible.)
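The "up to 8 characters before and after" rule in step 2 can be sketched as a small helper that clamps the window to the boundaries of the line. This is only one possible approach: the names copyContext and CONTEXT are hypothetical, and the helper assumes that hit is a pointer returned by strstr for the same line.

```cpp
#include <cstring>

const int CONTEXT = 8; // characters to show on each side (from the assignment)

// Copies the found substring plus up to CONTEXT characters on each side
// into out, never reading before the start or past the end of line.
// Hypothetical helper: hit must point inside line (e.g. a strstr result).
void copyContext(const char *line, const char *hit, std::size_t needleLen,
                 char *out, std::size_t outSize)
{
    if (outSize == 0) return;

    // Clamp the left edge to the beginning of the line.
    const char *start = (hit - line >= CONTEXT) ? hit - CONTEXT : line;

    // Clamp the right edge to the terminating '\0' of the line.
    const char *lineEnd = line + std::strlen(line);
    const char *afterHit = hit + needleLen;
    const char *end = (lineEnd - afterHit >= CONTEXT) ? afterHit + CONTEXT
                                                      : lineEnd;

    // Never write more than the output buffer can hold.
    std::size_t len = static_cast<std::size_t>(end - start);
    if (len >= outSize) len = outSize - 1;
    std::memcpy(out, start, len);
    out[len] = '\0';
}
```

A caller would typically pass std::strlen of the search string as needleLen and then print out between double quotes.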
Sample Runs
Here is a sample run looking for the string “people” in the file “DeclOfIndep.txt”.   

Looking for the substring "people" in file "DeclOfIndep.txt":

Location 1: String: "for one people to"
Location 2: String: "icts of people, unless"
Location 3: String: "s those people would r"
Location 4: String: " of the people."
Location 5: String: " of our people."

There were 5 occurrences of the string "people" within the file "DeclOfIndep.txt".

Or consider another run looking for the substring “oo” in the file “DeclOfIndep.txt.”   

Looking for the substring "oo" in file "DeclOfIndep.txt":

Location 1: String: "public goooood."
Location 2: String: "ublic goooood."
Location 3: String: "blic goooood."
Location 4: String: "lic goooood."
Location 5: String: "armed troops among"
Location 6: String: ". They too have"
Location 7: String: "of the good People"
Location 8: String: "illiam Hooper"
Location 9: String: "s Lightfoot Lee"
Location 10: String: "Witherspoon"

There were 10 occurrences of the string "oo" within the file "DeclOfIndep.txt".
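The four "goooood" hits in the run above come from overlapping matches. A minimal sketch of one way to find them is to restart strstr one character past the start of the previous hit rather than past its end. The function name countOverlapping is hypothetical and not part of the assignment.

```cpp
#include <cstring>

// Counts every occurrence of needle in line, including overlapping ones.
// Hypothetical helper name, used here only for illustration.
int countOverlapping(const char *line, const char *needle)
{
    if (needle[0] == '\0') return 0; // an empty needle would match everywhere

    int count = 0;
    const char *hit = std::strstr(line, needle);
    while (hit != nullptr) {
        ++count;
        // Resume one character after the START of this hit, not after its
        // end, so that overlapping occurrences are not skipped.
        hit = std::strstr(hit + 1, needle);
    }
    return count;
}
```

Advancing by the full length of the search string instead of by one character would report only two hits in "goooood".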
Programming Notes:
There are several points to be made about this problem in general, and about both sample runs.   

1) Both of the sample runs given above are actual data, so you can use them to test your program.   
2) Note that we deliberately modified a single word “good” in the Declaration file to “goooood”. In other words, we added some extra “o’s” to the word.   

This makes the point that some substrings can overlap. For example, if we search for “oo” in the word “goooood,” the first “oo” will certainly be a hit. But the second hit occurs with the second “o” of the first hit, i.e., the two instances of “oo” overlap. Because of this overlap, there are really four hits of “oo” within that word, as illustrated above, not two, and your program should find all four of them.
3) Do not use “inFile >>” to read the data from the input file. As you know, “inFile >>” tokenizes around white space and would therefore extract each word from the input file separately. Of course, this has both advantages and disadvantages depending on the circumstances. It would, for example, be a good function to use if we wanted to process within individual words only. But since our program should be able to detect substrings consisting of more than one word, “inFile >>” will not serve our purposes.   

Therefore, use “inFile.getline()” (i.e., the member function version of getline() – see Chapter 10, Slide 17+.) as your primary input function. As we discussed in class, this will read a single line of input from the file at a time and place it in the target buffer, which should be a character array of adequate size.   
4) Note that the file will be processed one line at a time. It is not necessary to look for substrings that span more than one line.   
5) Since we are using C-strings for this assignment, we’ll have to use the C string processing functions. A very useful function for this assignment would be the strstr(const char *, const char *) function, which locates an instance of the right hand string inside the left hand string and returns a pointer to the found instance. (It returns NULL if no instance is found.) This function is described on slide 27 of the Chapter 10 slide set. The strchr(const char *, int ch) function, not listed in the slide set, locates the first occurrence of “ch” in the string and returns a pointer to it.   
6) Since all C-strings are based on character arrays, be careful about running off either end of the array. Since you are required to print not only the substring but also 8 characters to either side of it, this overrun can occur if the substring you are looking for is within 8 characters of either the beginning or the end of that line. (Often an array overrun will be detected if your program starts printing out gibberish or default characters instead of text from the Declaration.)

For an example, consider the first sample run (i.e., looking for the word “people”). Note that in locations 4 and 5, the word “people” appears not only at the end of a sentence but also at the end of a line of input. It is, therefore, impossible to display 8 characters after the found substring in those cases, since we are processing on a line-by-line basis. This is perfectly okay. If there are not 8 characters either before or after the found substring, just terminate the output report at that point.   
7) Build up your solution in a modular fashion, debugging as you go. Do not attempt to write the whole program at once. If you feel lost at some point, simplify your problem down to something manageable. You might, for example, create a sample input file with a single sentence in it and see if your program can detect substrings within it. In any case, unless it is necessary to solve a problem, you should never have more than one function under development at once. Debug that function before moving on to the next. If you code this way, your overall development time will be much shorter.
8) Be alert to array overruns on either side as you look for substrings.   
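Notes 3 through 5 can be combined into a short sketch: read one line at a time with the member getline() into a character array, then use strstr and pointer subtraction to locate a hit. The names findFirstHit and MAX_LINE are hypothetical, and the buffer size is an assumption (the assignment only asks for an array "of adequate size"). If a line is longer than the buffer, getline() sets the stream's fail state and the loop below simply stops.

```cpp
#include <cstring>
#include <fstream>

const int MAX_LINE = 4096; // assumed upper bound on the length of one line

// Reads inFile line by line and looks for the first line containing needle.
// On success stores the 1-based line number and the 0-based column of the
// hit and returns true; returns false if no line contains needle.
bool findFirstHit(std::istream &inFile, const char *needle,
                  int &lineNo, int &column)
{
    char line[MAX_LINE];
    int current = 0;

    // getline() stores at most MAX_LINE - 1 characters plus the '\0'.
    while (inFile.getline(line, MAX_LINE)) {
        ++current;
        const char *hit = std::strstr(line, needle);
        if (hit != nullptr) {
            lineNo = current;
            column = static_cast<int>(hit - line); // offset within the line
            return true;
        }
    }
    return false;
}
```

Because the parameter is a std::istream reference, the same function accepts a std::ifstream opened on "DeclOfIndep.txt".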
Deliverables
Please submit your C++ source code file. There is no output file on this problem.

Answer #1

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int File_to_Search(const char *filename, const char *str);

/* Print every line of the file that contains str.
   Returns 0 on success, -1 if the file cannot be opened. */
int File_to_Search(const char *filename, const char *str)
{
    FILE *fp = fopen(filename, "r");
    int line_num = 1;
    int find_res_string = 0;
    char temp[512];

    if (fp == NULL) {
        return -1;
    }
    while (fgets(temp, sizeof temp, fp) != NULL) {
        if (strstr(temp, str) != NULL) {
            printf("line : %d\t", line_num);
            printf("string: %s\n", temp);
            find_res_string++;
        }
        line_num++;
    }
    if (find_res_string == 0) {
        printf("\nSorry, couldn't find a match.\n");
    }

    fclose(fp);
    return 0;
}

int main(void)
{
    int res_string;

    system("cls"); /* clears the console on Windows only */
    res_string = File_to_Search("Index.txt", "for one people to");
    if (res_string == -1) {
        int saved_errno = errno; /* save it before perror can change it */
        perror("Error");
        printf("Error number = %d\n", saved_errno);
        exit(1);
    }
    return 0;
}

==


The search string is set to "for one people to"; you can change it to whatever you need.
Keep the input file name as Index.txt.

Thanks, let me know if there is any concern, I will be happy to help

=====
EDIT: This version takes the search string from the command line.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int File_to_Search(const char *filename, const char *str);

/* Print every line of the file that contains str.
   Returns 0 on success, -1 if the file cannot be opened. */
int File_to_Search(const char *filename, const char *str)
{
    FILE *fp = fopen(filename, "r");
    int line_num = 1;
    int find_res_string = 0;
    char temp[512];

    if (fp == NULL) {
        return -1;
    }
    while (fgets(temp, sizeof temp, fp) != NULL) {
        if (strstr(temp, str) != NULL) {
            printf("line : %d\t", line_num);
            printf("string: %s\n", temp);
            find_res_string++;
        }
        line_num++;
    }
    if (find_res_string == 0) {
        printf("\nSorry, couldn't find a match.\n");
    }

    fclose(fp);
    return 0;
}

int main(int argc, char *argv[])
{
    int res_string;

    /* The search string is the first command-line argument. */
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <search string>\n", argv[0]);
        return 1;
    }

    system("cls"); /* clears the console on Windows only */
    res_string = File_to_Search("Index.txt", argv[1]);
    if (res_string == -1) {
        int saved_errno = errno; /* save it before perror can change it */
        perror("Error");
        printf("Error number = %d\n", saved_errno);
        exit(1);
    }
    return 0;
}