The first project involves modifying the attached lexical analyzer and the compilation listing ge...

Question

The first project involves modifying the attached lexical analyzer and the compilation listing generator code.

You need to make the following modifications to the lexical analyzer, scanner.l:

1. A second type of comment should be added that begins with // and ends with the end of line. As with the existing comment, no token should be returned.
2. The definition for the identifiers should be modified so that underscores can be included; however, consecutive, leading, and trailing underscores should not be permitted.
3. A real literal token should be added. It should begin with a sequence of one or more digits followed by a decimal point followed by zero or more additional digits. It may optionally end with an exponent. If present, the exponent should begin with an e or E, followed by an optional plus or minus sign followed by one or more digits. The token should be named REAL_LITERAL.
4. A Boolean literal token should be added. It should have two lexemes, which are true and false. The token should be named BOOL_LITERAL.
5. Two additional logical operators should be added. The lexeme for the first should be or and its token should be OROP. The second logical operator added should be not and its token should be NOTOP.
6. Five relational operators should be added. They are =, /=, >, >= and <=. All of the lexemes should be represented by the single token RELOP.
7. One additional lexeme should be added for the ADDOP token. It is binary -.
8. One additional lexeme should be added for the MULOP token. It is /.
9. A new token REMOP should be added for the remainder operator. Its lexeme should be rem.
10. A new token EXPOP should be added for the exponentiation operator. Its lexeme should be **.
11. A new token ARROW should be added for the two-character punctuation symbol =>.
12. The following reserved words should be added: case, else, endcase, endif, if, is, others, real, returns, then, when. Each reserved word should be a separate token. The token name should be the same as the lexeme, but in all upper case.

You must also modify the header file tokens.h to include each of the new tokens mentioned above.

The compilation listing generator code should be modified as follows:

1. The lastLine function should be modified to compute the total number of errors. If any errors occurred, the number of lexical, syntactic and semantic errors should be displayed. If no errors occurred, it should display Compiled Successfully. It should return the total number of errors.
2. The appendError function should be modified to count the number of lexical, syntactic and semantic errors. The error message passed to it should be added to a queue of messages that occurred on that line.
3. The displayErrors function should be modified to display all the error messages that have occurred on the previous line and then clear the queue of messages.

An example of the output of a program with no lexical errors is shown below:

   1  (* Program with no errors *)
   2
   3  function test1 returns boolean;
   4  begin
   5      7 + 2 > 6 and 8 = 5 * (7 - 4);
   6  end;

Compiled Successfully

Here is the required output for a program that contains more than one lexical error on the same line:

   1  -- Function with two lexical errors
   2
   3  function test2 returns integer;
   4  begin
   5      7 $ 2 ^ (2 + 4);
Lexical Error, Invalid Character $
Lexical Error, Invalid Character ^
   6  end;

Lexical Errors 2
Syntax Errors 0
Semantic Errors 0

You are to submit two files.

1. The first is a .zip file that contains all the source code for the project. The .zip file should contain the flex input file, which should be a .l file, all .cc and .h files and a makefile that builds the project.
2. The second is a Word document (PDF or RTF is also acceptable) that contains the documentation for the project, which should include the following:
   a. A discussion of how you approached the project
   b. A test plan that includes test cases that you have created, indicating what aspects of the program each one is testing, and a screen shot of your compiler run on that test case
   c. A discussion of lessons learned from the project and any improvements that could be made.

SKELETON CODE:

========================================================================
listing.h

enum ErrorCategories {LEXICAL, SYNTAX, GENERAL_SEMANTIC, DUPLICATE_IDENTIFIER, UNDECLARED};

void firstLine();
void nextLine();
int lastLine();
void appendError(ErrorCategories errorCategory, string message);

========================================================================
listing.cc

#include <cstdio>
#include <string>

using namespace std;

#include "listing.h"

static int lineNumber;
static string error = "";
static int totalErrors = 0;

static void displayErrors();

void firstLine()
{
    lineNumber = 1;
    printf("\n%4d ",lineNumber);
}

void nextLine()
{
    displayErrors();
    lineNumber++;
    printf("%4d ",lineNumber);
}

int lastLine()
{
    printf("\r");
    displayErrors();
    printf(" \n");
    return totalErrors;
}

void appendError(ErrorCategories errorCategory, string message)
{
    string messages[] = { "Lexical Error, Invalid Character ", "",
        "Semantic Error, ", "Semantic Error, Duplicate Identifier: ",
        "Semantic Error, Undeclared " };

    error = messages[errorCategory] + message;
    totalErrors++;
}

void displayErrors()
{
    if (error != "")
        printf("%s\n", error.c_str());
    error = "";
}

==========================================================================
tokens.h

enum Tokens {RELOP = 256, ADDOP, MULOP, ANDOP, BEGIN_, BOOLEAN, END, ENDREDUCE,
    FUNCTION, INTEGER, IS, REDUCE, RETURNS, IDENTIFIER, INT_LITERAL};

==========================================================================
scanner.l

/* This file contains flex input file */

%{
#include <cstdio>
#include <string>

using namespace std;

#include "listing.h"
#include "tokens.h"
%}

%option noyywrap

ws          [ \t\r]+
comment     \-\-.*\n
line        [\n]
id          [A-Za-z][A-Za-z0-9]*
digit       [0-9]
int         {digit}+
punc        [,:;]

%%

{ws}        { ECHO; }
{comment}   { ECHO; nextLine();}
{line}      { ECHO; nextLine();}
"<"         { ECHO; return(RELOP); }
"+"         { ECHO; return(ADDOP); }
"*"         { ECHO; return(MULOP); }
begin       { ECHO; return(BEGIN_); }
boolean     { ECHO; return(BOOLEAN); }
end         { ECHO; return(END); }
endreduce   { ECHO; return(ENDREDUCE); }
function    { ECHO; return(FUNCTION); }
integer     { ECHO; return(INTEGER); }
is          { ECHO; return(IS); }
reduce      { ECHO; return REDUCE; }
returns     { ECHO; return(RETURNS); }
and         { ECHO; return(ANDOP); }
{id}        { ECHO; return(IDENTIFIER);}
{int}       { ECHO; return(INT_LITERAL); }
{punc}      { ECHO; return(yytext[0]); }
.           { ECHO; appendError(LEXICAL, yytext); }

%%

int main()
{
    firstLine();

    FILE *file = fopen("lexemes.txt", "wa");
    int token = yylex();
    while (token)
    {
        fprintf(file, "%d %s\n", token, yytext);
        token = yylex();
    }
    lastLine();
    fclose(file);
    return 0;
}

==============================================================================
makefile

compile: scanner.o listing.o
	g++ -o compile scanner.o listing.o

scanner.o: scanner.c listing.h tokens.h
	g++ -c scanner.c

scanner.c: scanner.l
	flex scanner.l
	mv lex.yy.c scanner.c

listing.o: listing.cc listing.h
	g++ -c listing.cc

============================================================================
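
One possible shape for the listing.cc changes requested in the question, offered as a minimal sketch rather than the required solution. It assumes that SYNTAX errors are the "syntactic" ones and that the three remaining non-lexical categories all count as semantic. The names lexicalErrors, syntaxErrors, semanticErrors and errors are hypothetical, and the ErrorCategories enum is repeated from listing.h only so the block stands alone.

```cpp
#include <cstdio>
#include <queue>
#include <string>

using namespace std;

// Repeated from listing.h so that this sketch compiles on its own.
enum ErrorCategories {LEXICAL, SYNTAX, GENERAL_SEMANTIC, DUPLICATE_IDENTIFIER, UNDECLARED};

static int lineNumber;
static queue<string> errors;    // hypothetical name: messages pending for the current line
static int lexicalErrors = 0;   // hypothetical per-category counters
static int syntaxErrors = 0;
static int semanticErrors = 0;

static void displayErrors();

void firstLine()
{
    lineNumber = 1;
    printf("\n%4d  ", lineNumber);
}

void nextLine()
{
    displayErrors();
    lineNumber++;
    printf("%4d  ", lineNumber);
}

// Flushes pending messages, prints the summary, returns the total error count.
int lastLine()
{
    printf("\r");
    displayErrors();
    printf("     \n");
    int total = lexicalErrors + syntaxErrors + semanticErrors;
    if (total == 0)
        printf("Compiled Successfully\n");
    else
    {
        printf("Lexical Errors %d\n", lexicalErrors);
        printf("Syntax Errors %d\n", syntaxErrors);
        printf("Semantic Errors %d\n", semanticErrors);
    }
    return total;
}

// Counts the error by category and queues its message for the current line.
void appendError(ErrorCategories errorCategory, string message)
{
    static const string prefixes[] = { "Lexical Error, Invalid Character ", "",
        "Semantic Error, ", "Semantic Error, Duplicate Identifier: ",
        "Semantic Error, Undeclared " };

    switch (errorCategory)
    {
        case LEXICAL: lexicalErrors++; break;
        case SYNTAX:  syntaxErrors++;  break;
        default:      semanticErrors++; break;   // assumption: all other categories are semantic
    }
    errors.push(prefixes[errorCategory] + message);
}

// Prints every queued message in arrival order and leaves the queue empty.
static void displayErrors()
{
    while (!errors.empty())
    {
        printf("%s\n", errors.front().c_str());
        errors.pop();
    }
}
```

A queue keeps the messages in the order the scanner reported them, which matches the two-error example output in the question, where $ is reported before ^.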


Answer 1

A standalone lexical analyzer written in C:

#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <ctype.h>
#include <string.h>
#include <errno.h>
#include <stdbool.h>
#include <limits.h>

#define NELEMS(arr) (sizeof(arr) / sizeof(arr[0]))

/* Minimal dynamic array: da_dim(text, char) declares text, qy_text_p and qy_text_max */
#define da_dim(name, type)  type *name = NULL;          \
                            int qy_ ## name ## _p = 0;  \
                            int qy_ ## name ## _max = 0
#define da_rewind(name)     qy_ ## name ## _p = 0
#define da_redim(name)      do {if (qy_ ## name ## _p >= qy_ ## name ## _max) \
                                name = realloc(name, (qy_ ## name ## _max += 32) * sizeof(name[0]));} while (0)
#define da_append(name, x)  do {da_redim(name); name[qy_ ## name ## _p++] = x;} while (0)
#define da_len(name)        qy_ ## name ## _p

typedef enum {
    tk_EOI, tk_Mul, tk_Div, tk_Mod, tk_Add, tk_Sub, tk_Negate, tk_Not, tk_Lss, tk_Leq,
    tk_Gtr, tk_Geq, tk_Eq, tk_Neq, tk_Assign, tk_And, tk_Or, tk_If, tk_Else, tk_While,
    tk_Print, tk_Putc, tk_Lparen, tk_Rparen, tk_Lbrace, tk_Rbrace, tk_Semi, tk_Comma,
    tk_Ident, tk_Integer, tk_String
} TokenType;

typedef struct {
    TokenType tok;
    int err_ln, err_col;
    union {
        int n;          /* value for constants */
        char *text;     /* text for idents */
    };
} tok_s;

static FILE *source_fp, *dest_fp;
static int line = 1, col = 0, the_ch = ' ';
da_dim(text, char);

tok_s gettok();

/* Report a fatal error at the given position and stop */
static void error(int err_line, int err_col, const char *fmt, ... ) {
    char buf[1000];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(buf, sizeof buf, fmt, ap);    /* bounded, unlike vsprintf */
    va_end(ap);
    printf("(%d,%d) error: %s\n", err_line, err_col, buf);
    exit(1);
}

static int next_ch() {      /* get next char from input */
    the_ch = getc(source_fp);
    ++col;
    if (the_ch == '\n') {
        ++line;
        col = 0;
    }
    return the_ch;
}

static tok_s char_lit(int n, int err_line, int err_col) {   /* 'x' */
    if (the_ch == '\'')
        error(err_line, err_col, "gettok: empty character constant");
    if (the_ch == '\\') {
        next_ch();
        if (the_ch == 'n')
            n = 10;
        else if (the_ch == '\\')
            n = '\\';
        else error(err_line, err_col, "gettok: unknown escape sequence \\%c", the_ch);
    }
    if (next_ch() != '\'')
        error(err_line, err_col, "multi-character constant");
    next_ch();
    return (tok_s){tk_Integer, err_line, err_col, {n}};
}

static tok_s div_or_cmt(int err_line, int err_col) {    /* process divide or comments */
    if (the_ch != '*')
        return (tok_s){tk_Div, err_line, err_col, {0}};

    /* comment found: skip to the closing star-slash pair */
    next_ch();
    for (;;) {
        if (the_ch == '*') {
            if (next_ch() == '/') {
                next_ch();
                return gettok();
            }
        } else if (the_ch == EOF)
            error(err_line, err_col, "EOF in comment");
        else
            next_ch();
    }
}

static tok_s string_lit(int start, int err_line, int err_col) {     /* "st" */
    da_rewind(text);

    while (next_ch() != start) {
        if (the_ch == '\n') error(err_line, err_col, "EOL in string");
        if (the_ch == EOF)  error(err_line, err_col, "EOF in string");
        da_append(text, (char)the_ch);
    }
    da_append(text, '\0');

    next_ch();
    return (tok_s){tk_String, err_line, err_col, {.text=text}};
}

/* bsearch comparator: both arguments point at a char * (key and first struct member) */
static int kwd_cmp(const void *p1, const void *p2) {
    return strcmp(*(char **)p1, *(char **)p2);
}

static TokenType get_ident_type(const char *ident) {
    /* keyword table; must stay sorted by name for bsearch */
    static struct {
        char *s;
        TokenType sym;
    } kwds[] = {
        {"else",  tk_Else},
        {"if",    tk_If},
        {"print", tk_Print},
        {"putc",  tk_Putc},
        {"while", tk_While},
    }, *kwp;

    return (kwp = bsearch(&ident, kwds, NELEMS(kwds), sizeof(kwds[0]), kwd_cmp)) == NULL ? tk_Ident : kwp->sym;
}

static tok_s ident_or_int(int err_line, int err_col) {
    long n;     /* strtol returns long, so an int cannot hold LONG_MAX on every platform */
    int is_number = true;

    da_rewind(text);
    while (isalnum(the_ch) || the_ch == '_') {
        da_append(text, (char)the_ch);
        if (!isdigit(the_ch))
            is_number = false;
        next_ch();
    }
    if (da_len(text) == 0)
        error(err_line, err_col, "gettok: unrecognized character (%d) '%c'\n", the_ch, the_ch);
    da_append(text, '\0');
    if (isdigit(text[0])) {
        if (!is_number)
            error(err_line, err_col, "invalid number: %s\n", text);
        errno = 0;
        n = strtol(text, NULL, 0);
        if (errno == ERANGE || n > INT_MAX)
            error(err_line, err_col, "Number exceeds maximum value");
        return (tok_s){tk_Integer, err_line, err_col, {(int)n}};
    }
    return (tok_s){get_ident_type(text), err_line, err_col, {.text=text}};
}

static tok_s follow(int expect, TokenType ifyes, TokenType ifno, int err_line, int err_col) {   /* look ahead for '>=', etc. */
    if (the_ch == expect) {
        next_ch();
        return (tok_s){ifyes, err_line, err_col, {0}};
    }
    if (ifno == tk_EOI)
        error(err_line, err_col, "follow: unrecognized character '%c' (%d)\n", the_ch, the_ch);
    return (tok_s){ifno, err_line, err_col, {0}};
}

tok_s gettok() {            /* return the token type */
    /* skip white space */
    while (isspace(the_ch))
        next_ch();
    int err_line = line;
    int err_col  = col;
    switch (the_ch) {
        case '{':  next_ch(); return (tok_s){tk_Lbrace, err_line, err_col, {0}};
        case '}':  next_ch(); return (tok_s){tk_Rbrace, err_line, err_col, {0}};
        case '(':  next_ch(); return (tok_s){tk_Lparen, err_line, err_col, {0}};
        case ')':  next_ch(); return (tok_s){tk_Rparen, err_line, err_col, {0}};
        case '+':  next_ch(); return (tok_s){tk_Add, err_line, err_col, {0}};
        case '-':  next_ch(); return (tok_s){tk_Sub, err_line, err_col, {0}};
        case '*':  next_ch(); return (tok_s){tk_Mul, err_line, err_col, {0}};
        case '%':  next_ch(); return (tok_s){tk_Mod, err_line, err_col, {0}};
        case ';':  next_ch(); return (tok_s){tk_Semi, err_line, err_col, {0}};
        case ',':  next_ch(); return (tok_s){tk_Comma, err_line, err_col, {0}};
        case '/':  next_ch(); return div_or_cmt(err_line, err_col);
        case '\'': next_ch(); return char_lit(the_ch, err_line, err_col);
        case '<':  next_ch(); return follow('=', tk_Leq, tk_Lss, err_line, err_col);
        case '>':  next_ch(); return follow('=', tk_Geq, tk_Gtr, err_line, err_col);
        case '=':  next_ch(); return follow('=', tk_Eq, tk_Assign, err_line, err_col);
        case '!':  next_ch(); return follow('=', tk_Neq, tk_Not, err_line, err_col);
        case '&':  next_ch(); return follow('&', tk_And, tk_EOI, err_line, err_col);
        case '|':  next_ch(); return follow('|', tk_Or, tk_EOI, err_line, err_col);
        case '"':  return string_lit(the_ch, err_line, err_col);
        default:   return ident_or_int(err_line, err_col);
        case EOF:  return (tok_s){tk_EOI, err_line, err_col, {0}};
    }
}

void run() {    /* tokenize the given input */
    /* printable token names, indexed by TokenType */
    static const char *const names[] = {
        "End_of_input", "Op_multiply", "Op_divide", "Op_mod", "Op_add",
        "Op_subtract", "Op_negate", "Op_not", "Op_less", "Op_lessequal",
        "Op_greater", "Op_greaterequal", "Op_equal", "Op_notequal", "Op_assign",
        "Op_and", "Op_or", "Keyword_if", "Keyword_else", "Keyword_while",
        "Keyword_print", "Keyword_putc", "LeftParen", "RightParen", "LeftBrace",
        "RightBrace", "Semicolon", "Comma", "Identifier", "Integer",
        "String"
    };
    tok_s tok;
    do {
        tok = gettok();
        fprintf(dest_fp, "%5d %5d %-15s", tok.err_ln, tok.err_col, names[tok.tok]);
        if (tok.tok == tk_Integer)     fprintf(dest_fp, " %4d", tok.n);
        else if (tok.tok == tk_Ident)  fprintf(dest_fp, " %s", tok.text);
        else if (tok.tok == tk_String) fprintf(dest_fp, " \"%s\"", tok.text);
        fprintf(dest_fp, "\n");
    } while (tok.tok != tk_EOI);
    if (dest_fp != stdout)
        fclose(dest_fp);
}

/* Use the standard stream when no file name is given, otherwise open the file */
void init_io(FILE **fp, FILE *std, const char mode[], const char fn[]) {
    if (fn[0] == '\0')
        *fp = std;
    else if ((*fp = fopen(fn, mode)) == NULL)
        error(0, 0, "Can't open %s\n", fn);
}

int main(int argc, char *argv[]) {
    init_io(&source_fp, stdin,  "r",  argc > 1 ? argv[1] : "");
    init_io(&dest_fp,   stdout, "wb", argc > 2 ? argv[2] : "");
    run();
    return 0;
}
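
The lexer above accepts any mix of letters, digits and underscores as an identifier and has no real literals, so it does not cover items 2 and 3 of the question. As a hedged sketch, the two patterns can be prototyped with std::regex in C++ (the language of the question's skeleton) before being moved into scanner.l. The function names isIdentifier and isRealLiteral are hypothetical, and the patterns are one reading of the requirements, not the graded answer.

```cpp
#include <regex>
#include <string>

// Identifier: starts with a letter; each underscore must be followed by at
// least one letter or digit, which rules out leading, trailing and
// consecutive underscores.
bool isIdentifier(const std::string& lexeme)
{
    static const std::regex pattern("[A-Za-z][A-Za-z0-9]*(_[A-Za-z0-9]+)*");
    return std::regex_match(lexeme, pattern);
}

// Real literal: one or more digits, a decimal point, zero or more digits,
// then an optional exponent (e or E, optional sign, one or more digits).
bool isRealLiteral(const std::string& lexeme)
{
    static const std::regex pattern("[0-9]+\\.[0-9]*([eE][+-]?[0-9]+)?");
    return std::regex_match(lexeme, pattern);
}
```

Both patterns use only character classes, grouping, `*`, `+` and `?`, which flex also supports, so they should carry over to flex definitions with little change; that carry-over is an assumption to verify against the flex manual.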
