Question

Let’s build a dynamic string tokenizer!

Start with the existing template and work on the areas marked with TODO in the comments: Homework 8 Template.c

Note: If you turn the template back in to me without adding any original work, you will receive a 0.

By itself the template does nothing. You need to fill in the code to dynamically allocate an array of strings that are returned to the user.

Remember: A string is an array. A tokenizer goes through this array looking for special characters called delimiters. The result is a series of strings (tokens) that were taken out of the original array.

Example: Assume the input array is "1.1, 2.2, 3.3". The tokens would be "1.1" then "2.2" then "3.3".
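
The example above can be reproduced with plain strtok. The following is a minimal sketch, not part of the template: split_in_place is a hypothetical helper name, and it only stores pointers into the original buffer rather than copying anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the template): splits text in place and
   stores a pointer to each token in out[]. Returns the number of tokens
   stored, which is at most max_out. strtok overwrites delimiters in text
   with '\0', so text must be a writable array, not a string literal. */
size_t split_in_place(char *text, const char *delims, char **out, size_t max_out)
{
    size_t n = 0;
    char *tok;

    assert(text != NULL && delims != NULL && out != NULL);

    /* The first call passes the buffer; later calls pass NULL to continue. */
    for (tok = strtok(text, delims); tok != NULL && n < max_out;
         tok = strtok(NULL, delims)) {
        out[n++] = tok;
    }
    return n;
}
```

Called on a writable copy of "1.1, 2.2, 3.3" with the delimiters ", ", this stores three pointers: "1.1", "2.2" and "3.3". Because those pointers refer to the original buffer, the tokens disappear when the buffer is reused, which is why the homework copies each token into its own allocation.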

Your objective: Finish up the template. We will use the old tokenizer system (strtok) to parse the string for comma characters ',' and space characters ' '.

  1. In the BetterTokenize() function you will dynamically allocate the array of strings.
    • The BetterTokenize() function calls GetDelimiterCnt(), which returns an estimate of the number of tokens that will be found.
    • Use this estimate to dynamically allocate an array of strings and store its address in tokenizedArray, which is within the BetterStrTok structure.
  2. Once you know the number of tokens to expect, and have allocated for them (step 1), begin extracting each token.
    • The function GetNextToken() will handle the strtok calls for you.
      • It returns the next token string and sets the newStringLength variable to, you guessed it, the string's length (counting the terminating '\0').
    • Use the newStringLength variable to dynamically allocate the next string to be copied into strTokStruct.tokenizedArray.
  3. You want to dynamically allocate, and copy to, the arrayIndex of strTokStruct.tokenizedArray[].
    • You may use strcpy (which is one line) or, similarly, memcpy.
    • Or use a for loop and assign the values if you are uncomfortable with the other options.
  4. Fill out the PrintTokens function, which goes through the array of tokens (pInput->tokenizedArray[]) and prints them to the screen using printf.
  5. Fill out the CleanUpTokens function. You need to free any memory that was dynamically allocated.
    • Set up a loop to go through the array (pInput->tokenizedArray[]).
    • Free each string as you go.
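
The allocate, copy and free pattern described in the steps above can be sketched on its own, independent of the template. The names copy_strings and free_strings are hypothetical and chosen only for illustration; the template's own functions and structure are not used here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees the first count strings, then the array itself. */
void free_strings(char **arr, size_t count)
{
    size_t i;

    if (arr == NULL)
        return;
    for (i = 0; i < count; i++)
        free(arr[i]);
    free(arr);
}

/* Hypothetical helper: returns a newly allocated array that holds a private
   copy of each of the count strings in src, or NULL if an allocation fails. */
char **copy_strings(const char *const *src, size_t count)
{
    size_t i;
    char **arr;

    assert(src != NULL);

    /* One pointer slot per string; calloc zero-fills the slots. */
    arr = calloc(count, sizeof *arr);
    if (arr == NULL)
        return NULL;

    for (i = 0; i < count; i++) {
        size_t len = strlen(src[i]) + 1;   /* +1 for the terminating '\0' */

        arr[i] = malloc(len);
        if (arr[i] == NULL) {
            free_strings(arr, i);          /* release what was built so far */
            return NULL;
        }
        memcpy(arr[i], src[i], len);
    }
    return arr;
}
```

Every malloc or calloc in copy_strings has a matching free in free_strings: one per string, plus one for the array of pointers.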

#define _CRT_SECURE_NO_WARNINGS

#include <stdio.h>

#include <stdlib.h>

#include <stdbool.h>

#include <string.h>

#define MAX_INPUT_SIZE 10000

/* I used typedef to create a "string" data type. All strings are just

arrays of characters. I really did this to get rid of the scary

double pointers. */

typedef char* string;

/*This is the primary structure

It contains an array of strings (tokens) in tokenizedArray

And a count of how many tokens were found in tokenCnt.*/

typedef struct

{

string* tokenizedArray;

int tokenCnt;

} BetterStrTok;

//Declare all functions up front

string GetNextToken(string sStr, string delimiter, bool* pTokenizedOnce, int*

pLength);

BetterStrTok BetterTokenize(string sInput, string delimiter, int maxStringLength);

void PrintTokens(const BetterStrTok* pInput);

void CleanUpTokens(BetterStrTok* pInput);

int GetDelimiterCnt(string sStr, int maxStringLength, string delimiter);

void main()

{

/*Don't worry about this. The main function has an array of characters

(which is the same as the string data type). It uses gets() to read the

string in.*/

char sInput[MAX_INPUT_SIZE];

BetterStrTok tokens;

printf("Enter a series of numbers (or words) separated by commas: ");

gets(sInput);

/* BetterTokenize parses out the tokens in between the ',' and ' '

characters.

It will return a data structure that contains an array of strings

(dynamically allocated).

We are mimicking the Java String.split() function. It breaks a string apart into its

component parts.

*/

tokens = BetterTokenize(sInput, ", ", MAX_INPUT_SIZE);

/* Once the tokens have been made, we print each one and finally clean up the

dynamically allocated memory. Ultimately this is just a test.

We read in a string, break it apart, print out each piece, then clean up.

In the final homework we will do more with this.

*/

PrintTokens(&tokens);

CleanUpTokens(&tokens);

}

/*HOMEWORK 8:

Find the TODO comments and fill them in with the appropriate lines of code.

This function takes the original input string and fills out a BetterStrTok

structure.

This structure will contain a count of tokens and a dynamically allocated

array

of strings (tokens) that were extracted from the original string.

*/

BetterStrTok BetterTokenize(string sInput, string delimiter, int maxStringLength)

{

int i;

bool tokenizedOnce = false;

int newStringLength = 0;

int expectedCnt = 0;

BetterStrTok strTokStruct;

memset(&strTokStruct, 0, sizeof(BetterStrTok));

expectedCnt = GetDelimiterCnt(sInput, maxStringLength,delimiter);

/*TODO 1:

Use calloc or malloc to dynamically create the array of strings

This is an array of size expectedCnt times sizeof(string).

strTokStruct.tokenizedArray will be set to the pointer returned by

the calloc (or malloc) call.

Explanation: We are constructing an array of strings dynamically.

GetDelimiterCnt returned a total number of tokens. We cannot know

this at compile time, so an array of strings is dynamically

allocated.

*/

strTokStruct.tokenCnt = 0;

//This loop goes through every token that is expected

//The tokens are extracted, and then copied into dynamic memory.

for (i = 0; i < expectedCnt; i++)

{

/*Don't worry about this GetNextToken function, it returns the next

token string.

and sets newStringLength to the length of that string.

Explanation: This extracts the next token from the original string.

A token is every character between the space characters and ,

characters.

So given the string "1.1, 2.2, 3.3" the first token would be

1.1,

the next token would be 2.2, and the final token would be 3.3.

*/

string token = GetNextToken(sInput, delimiter, &tokenizedOnce,

&newStringLength);

if (newStringLength != 0)

{

int arrayIndex = strTokStruct.tokenCnt++;

/*TODO 2:

Use malloc or calloc to create a string of size

newStringLength times

the sizeof char. This pointer would be set to next index of

the array:

strTokStruct.tokenizedArray[arrayIndex]

Explanation: Remember that the string data type is an array

of chars.

We do not know, in advance (at compile-time), the length

of the token.

So a new array of chars is allocated. Keep in mind that

strTokStruct.tokenizedArray is an array of strings. That

is, it is an

array of array of chars.

*/

/*TODO 3:

Copy the string, token, to the newly allocated string which

is pointed at

by strTokStruct.tokenizedArray[arrayIndex] (assuming you

did the above step

correctly). You may use a for loop, the strcpy function or

the memcpy function.

Explanation: The token was found and is in the token

string, but that will be

overwritten. Before that occurs, we want to copy it to

the newly

allocated string (In the TODO 2 step).

*/

}

}

return strTokStruct;

}

/*HOMEWORK 8:

Assuming the above function works, this one will print every token in the

array.

*/

void PrintTokens(const BetterStrTok* pInput)

{

/* TODO 4:

Create a loop that goes from 0 to pInput->tokenCnt.

Inside the loop print the token. If you want output like mine:

printf "%d: %s\n"

where the %d variable is the counter variable for the loop

and %s is the string in pInput's tokenizedArray array at

index i (or whatever you call your counter variable).

*/

}

/*HOMEWORK 8:

You always have to clean up dynamic memory, otherwise the system does not

know

that you are done with it. This function goes through the array of strings

and

releases each one. Then it releases the array itself.

*/

void CleanUpTokens(BetterStrTok* pInput)

{

/* TODO 5:

Create a loop that goes from 0 to pInput->tokenCnt.

Inside the loop call the free function on pInput->tokenizedArray[]

at index i, or whatever you call your counter variable.

Explanation: pInput->tokenizedArray is an array of dynamically

allocated

strings. At the next line we will free the array itself, but first

you want

to go through the loop and free the dynamic memory one string at a

time.

Failure to do this creates a memory leak.

Always remember, for every malloc or calloc you need a corresponding

free

somewhere in your code.

Similarly in C++ for every new call you would want a corresponding

delete,

but that isn't a part of this homework assignment.

*/

/*After freeing every string inside pInput->tokenizedArray the array itself

must be freed.*/

if(pInput->tokenizedArray != NULL)

free(pInput->tokenizedArray);

}

/*You don't have to worry about this function. It walks through the original string

And compares each character with the list of delimiters. It returns an accurate

Count of how many tokens exist in the original string.*/

int GetDelimiterCnt(string sStr, int maxStringLength, string delimiter)

{

int i;

int j;

int cnt = 0;

for (i = 0; i < maxStringLength || sStr[i] != '\0'; i++)

{

for (j = 0; j < strlen(delimiter); j++)

{

if (sStr[i] == delimiter[j])

cnt++;

}

}

//There will also be a concluding string

return cnt + 1;

}

/*You don't have to worry about this function. It uses the original strtok

functions to

extract the next token. strtok is not thread safe, so as good practice I actually

use strtok_s or strtok_r depending on if you are using OSX or Windows.

If you want to know what thread safe means, please feel free to ask.*/

string gContext;

string GetNextToken(string sStr, string delimiter, bool* pTokenizedOnce, int*

pLength)

{

string rv = NULL;

if (*pTokenizedOnce == false)

{

*pTokenizedOnce = true;

#ifdef _WIN32

rv = strtok_s(sStr, delimiter, &gContext);

#else

rv = strtok_r(sStr, delimiter, &gContext);

#endif

}

else

{

#ifdef _WIN32

rv = strtok_s(NULL, delimiter, &gContext);

#else

rv = strtok_r(NULL, delimiter, &gContext);

#endif

}

if (rv != NULL)

*pLength = strlen(rv)+1;

else

*pLength = 0;

return rv;

}

Answer #1

Summary:

The C code is pasted below, with the output at the end.

I had to modify the GetDelimiterCnt logic because it did not behave properly in the original code.

I added a few print statements to show the input and the number of tokens received.

Output is shown for tokens of varied length.
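
One reason the template's GetDelimiterCnt can miscount is its loop condition: `i < maxStringLength || sStr[i] != '\0'` keeps scanning past the end of the string until i reaches maxStringLength, so it also counts whatever bytes sit in the unused part of the input buffer. A minimal sketch of a bounded scan follows, assuming the count only needs to be an upper bound on the number of tokens. The name count_delimiter_chars is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: counts the characters in s that appear in delims.
   The scan stops at the terminating '\0' or after max_len characters,
   whichever comes first. */
int count_delimiter_chars(const char *s, int max_len, const char *delims)
{
    int i;
    int cnt = 0;

    assert(s != NULL && delims != NULL);

    /* && (not ||): both conditions must hold for the scan to continue. */
    for (i = 0; i < max_len && s[i] != '\0'; i++) {
        if (strchr(delims, s[i]) != NULL)
            cnt++;
    }
    return cnt;
}
```

For "1.1, 2.2, 3.3" with the delimiters ", " this returns 4 (two commas and two spaces), so 4 + 1 = 5 is an upper bound on the 3 tokens that strtok will produce. Overestimating is harmless here because the extraction loop skips iterations in which no token is returned.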

######################## C Code ########################################

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#define MAX_INPUT_SIZE 10000

/* I used typedef to create a "string" data type. All strings are just

arrays of characters. I really did this to get rid of the scary

double pointers. */

typedef char* string;

/*This is the primary structure

It contains an array of strings (tokens) in tokenizedArray

And a count of how many tokens were found in tokenCnt.*/

typedef struct
{
   string* tokenizedArray;
   int tokenCnt;
} BetterStrTok;

//Declare all functions up front

string GetNextToken(string sStr, string delimiter, bool* pTokenizedOnce, int* pLength);
BetterStrTok BetterTokenize(string sInput, string delimiter, int maxStringLength);
void PrintTokens(const BetterStrTok* pInput);
void CleanUpTokens(BetterStrTok* pInput);
int GetDelimiterCnt(string sStr, int maxStringLength, string delimiter);

int main(void)
{

/*Don't worry about this. The main function has an array of characters

(which is the same as the string data type). It uses fgets() to read the

string in, because gets() is unsafe and was removed from the C standard in C11.*/

   char sInput[MAX_INPUT_SIZE];

   BetterStrTok tokens;

   printf("Enter a series of numbers (or words) separated by commas: ");

   if (fgets(sInput, sizeof(sInput), stdin) == NULL)
       return 1;
   sInput[strcspn(sInput, "\n")] = '\0'; //strip the trailing newline kept by fgets
   //strcpy(sInput ,"1.2, 1.3, 1.43456, 01.5, 13.456, 12.7, 1");
   printf(" Given Str : %s \n" , sInput);

/* BetterTokenize parses out the tokens in between the ',' and ' '
characters.
It will return a data structure that contains an array of strings
(dynamically allocated).
We are mimicking the Java String.split() function. It breaks a string apart into its
component parts.
*/

   tokens = BetterTokenize(sInput, ", ", MAX_INPUT_SIZE);

/* Once the tokens have been made, we print each one and finally clean up the
dynamically allocated memory. Ultimately this is just a test.
We read in a string, break it apart, print out each piece, then clean up.
In the final homework we will do more with this.
*/

   PrintTokens(&tokens);
   CleanUpTokens(&tokens);

}

/*HOMEWORK 8:

Find the TODO comments and fill them in with the appropriate lines of code.

This function takes the original input string and fills out a BetterStrTok

structure.

This structure will contain a count of tokens and a dynamically allocated

array

of strings (tokens) that were extracted from the original string.

*/
BetterStrTok BetterTokenize(string sInput, string delimiter, int maxStringLength)
{

   int i;
   bool tokenizedOnce = false;
   int newStringLength = 0;
   int expectedCnt = 0;
   BetterStrTok strTokStruct;
   memset(&strTokStruct, 0, sizeof(BetterStrTok));
   expectedCnt = GetDelimiterCnt(sInput, maxStringLength,delimiter);
  
   printf(" Num of Tokens : %d \n", expectedCnt);
/*TODO 1:
Use calloc or malloc to dynamically create the array of strings
This is an array of size expectedCnt times sizeof(string).
strTokStruct.tokenizedArray will be set to the pointer returned by
the calloc (or malloc) call.
Explanation: We are constructing an array of strings dynamically.
GetDelimiterCnt returned a total number of tokens. We cannot know
this at compile time, so an array of strings is dynamically
allocated.
*/
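   strTokStruct.tokenizedArray = malloc( sizeof(string) * (expectedCnt + 1));
   if (strTokStruct.tokenizedArray == NULL)
       return strTokStruct; //allocation failed: return an empty result (tokenCnt is 0)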

   strTokStruct.tokenCnt = 0;

//This loop goes through every token that is expected
//The tokens are extracted, and then copied into dynamic memory.

   for (i = 0; i < expectedCnt; i++)
   {
/*Don't worry about this GetNextToken function, it returns the next
token string.
and sets newStringLength to the length of that string.
Explanation: This extracts the next token from the original string.
A token is every character between the space characters and ,
characters.
So given the string "1.1, 2.2, 3.3" the first token would be
1.1,
the next token would be 2.2, and the final token would be 3.3.
*/

       string token = GetNextToken(sInput, delimiter, &tokenizedOnce, &newStringLength);

       if (newStringLength != 0)
       {
           int arrayIndex = strTokStruct.tokenCnt++;
          
           //printf(" Got token %s , of length %zu , arrayIndx : %d , tokenCnt : %d \n" , token , strlen(token), arrayIndex, strTokStruct.tokenCnt);
          
/*TODO 2:
Use malloc or calloc to create a string of size
newStringLength times
the sizeof char. This pointer would be set to next index of
the array:
strTokStruct.tokenizedArray[arrayIndex]
Explanation: Remember that the string data type is an array
of chars.
We do not know, in advance (at compile-time), the length
of the token.
So a new array of chars is allocated. Keep in mind that
strTokStruct.tokenizedArray is an array of strings. That
is, it is an
array of array of chars.
*/
           strTokStruct.tokenizedArray[arrayIndex] = malloc(sizeof( char) * (strlen(token) + 1));
          
/*TODO 3:
Copy the string, token, to the newly allocated string which
is pointed at
by strTokStruct.tokenizedArray[arrayIndex] (assuming you
did the above step
correctly). You may use a for loop, the strcpy function or
the memcpy function.
Explanation: The token was found and is in the token
string, but that will be
overwritten. Before that occurs, we want to copy it to
the newly
allocated string (In the TODO 2 step).
*/
           strcpy(strTokStruct.tokenizedArray[arrayIndex], token);
       }

   }

   return strTokStruct;

}

/*HOMEWORK 8:

Assuming the above function works, this one will print every token in the

array.

*/

void PrintTokens(const BetterStrTok* pInput)

{

/* TODO 4:

Create a loop that goes from 0 to pInput->tokenCnt.

Inside the loop print the token. If you want output like mine:

printf "%d: %s\n"

where the %d variable is the counter variable for the loop

and %s is the string in pInput's tokenizedArray array at

index i (or whatever you call your counter variable).

*/
   for(int i = 0 ; i < pInput->tokenCnt ; i++ ) {
       printf(" Token num : %d is -> %s \n" , i , pInput->tokenizedArray[i]);
   }

}

/*HOMEWORK 8:

You always have to clean up dynamic memory, otherwise the system does not

know

that you are done with it. This function goes through the array of strings

and

releases each one. Then it releases the array itself.

*/

void CleanUpTokens(BetterStrTok* pInput)

{

/* TODO 5:

Create a loop that goes from 0 to pInput->tokenCnt.

Inside the loop call the free function on pInput->tokenizedArray[]

at index i, or whatever you call your counter variable.

Explanation: pInput->tokenizedArray is an array of dynamically

allocated

strings. At the next line we will free the array itself, but first

you want

to go through the loop and free the dynamic memory one string at a

time.

Failure to do this creates a memory leak.

Always remember, for every malloc or calloc you need a corresponding

free

somewhere in your code.

Similarly in C++ for every new call you would want a corresponding

delete,

but that isn't a part of this homework assignment.

*/

/*After freeing every string inside pInput->tokenizedArray the array itself

must be freed.*/

if(pInput->tokenizedArray != NULL)
{
   for(int i = 0 ; i < pInput->tokenCnt ; i++ ) {
       free(pInput->tokenizedArray[i]);
   }

free(pInput->tokenizedArray);
}

}

/*You don't have to worry about this function. It walks through the original string

And compares each character with the list of delimiters. It returns an upper bound

on how many tokens exist in the original string.*/

int GetDelimiterCnt(string sStr, int maxStringLength, string delimiter)
{
       int i;
       int j;
       int cnt = 0;
       int delimiterLen = (int)strlen(delimiter);

       //Stop at the end of the string or at maxStringLength, whichever comes first.
       //The original condition used ||, which kept scanning past the '\0'.
       for (i = 0; i < maxStringLength && sStr[i] != '\0'; i++)
       {
           //strtok treats every character in delimiter as a separator,
           //so each delimiter character is counted individually.
           for (j = 0; j < delimiterLen; j++)
           {
               if (sStr[i] == delimiter[j])
                   cnt++;
           }
       }
       //There will also be a concluding string.
       //The result may overestimate: ", " counts as two separators.
       return cnt + 1;

}

/*You don't have to worry about this function. It uses the original strtok
functions to
extract the next token. strtok is not thread safe, so as good practice I actually
use strtok_s or strtok_r depending on if you are using OSX or Windows.
If you want to know what thread safe means, please feel free to ask.*/

string gContext;

string GetNextToken(string sStr, string delimiter, bool* pTokenizedOnce, int* pLength)
{
       string rv = NULL;
       if (*pTokenizedOnce == false)
       {
           *pTokenizedOnce = true;
           #ifdef _WIN32
               rv = strtok_s(sStr, delimiter, &gContext);
           #else
               rv = strtok_r(sStr, delimiter, &gContext);
           #endif
       }
       else
       {
           #ifdef _WIN32
               rv = strtok_s(NULL, delimiter, &gContext);
           #else
               rv = strtok_r(NULL, delimiter, &gContext);
           #endif
       }

       if (rv != NULL)
           *pLength = strlen(rv)+1;
       else
           *pLength = 0;
       return rv;
}

########################## End Code ######################################
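
The original template reads its input with gets(), which cannot limit how many characters it writes and was removed from the C standard in C11. A minimal sketch of a bounded replacement built on fgets follows. The helper name read_line is hypothetical and is not part of the template.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (at most size - 1
   characters) and strips the trailing newline. Returns false on end of
   file or read error, leaving buf as an empty string. */
bool read_line(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && size > 0 && in != NULL);

    if (fgets(buf, (int)size, in) == NULL) {
        buf[0] = '\0';
        return false;
    }
    /* fgets keeps the '\n'; strcspn finds it (or the end of the string). */
    buf[strcspn(buf, "\n")] = '\0';
    return true;
}
```

In the template's main this would be called as read_line(sInput, sizeof(sInput), stdin). Stripping the newline matters for the tokenizer, because '\n' is not one of the delimiters and would otherwise end up inside the last token.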

############################# Output ###################################

########################################################################
