Problem statement For this program, you are to implement a simple machine-learning algorithm that uses a...

Question

Question

Problem statement For this program, you are to implement a simple machine-learning algorithm that uses a...

Problem statement For this program, you are to implement a simple machine-learning algorithm that uses a rule-based classifier to predict whether or not a particular patient has diabetes. In order to do so, you will need to first train your program, using a provided data set, to recognize a disease. Once a program is capable of doing it, you will run it on new data sets and predict the existence or absence of a disease. While solving this problem, you are to use a procedural approach in Java. Decompose your program into methods and use small lists where appropriate, i.e. you are only to use data types covered in chapters 1-6 and chapter 10. Training In order to train your program to recognize diabetes, you need to use a data set that contains both data and diagnoses (this is the learning phase). Your program is to ask the user to enter the name of the training file. You may assume that the file will be a txt file. Each line on that file constitutes one patient record. Each patient's record consists of 6 attributes and a numerical diagnosis. From our perspective, it does not matter much what these attributes signify. The only important observation here is that the attributes are floating point numbers delimited by tabs. If an attribute is missing, a question mark replaces a number. In your program, treat question marks as zeros. As far as diagnosis is concerned (the 7th value listed on a line), any value0.65 indicates a presence of a disease, while a value0.65 indicates its absence. You may assume that all data in the file will be valid, i.e there is no need to handle user error or file errors. Sample file train.txt is provided Here is a sample line from the input file 25 175.6 101.14 159.4 Since the last element is a zero, this patient has no diabetes. In case you are wondering, the first 6 attributes are: age, sex, random blood sugar level fasting blood sugar level, oral glucose tolerance level, hemoglobin A1c. Based on the input data that follows this format, the program is to create a classifier that will be used later on to process new patients and make diagnostic predictions for these new patients. The classifier is created in the following fashion. For all patients with no diabetes, for each attribute, we calculate the average value of that attribute. For all patients with diabetes, for each attribute, we calculate the average value of that attribute. After processing all the patients, we end up with two sets of averages-one set for healthy patients (use a list) and one set for ill patients (use a list). In short, as you process a line, you need to first determine whether you are dealing with a healthy or ill patient and update your averages accordingly The picture below illustrates the idea (one row in either table is one line in a file). Health ents Ill patients 13 51 10 4 2 16 61 13 54 11 52 14 | S| 3 4 300 115 0 3 10 0

engineering Computer-Science

Add a comment Improve this question Transcribed image text

Answer 1

Answer #1

Answer: fromfuture import with statement #COUNT FOR HEALTHY PATIENTS goodCnt0i att1=0 att2=0 att3=0 att4=0 att5=0 #COUNT FOR badCountt-1 #HEALTHY PATIENTS AVERAGE Hatt1-att1/goodCnt Hatt2-att2/goodCnt Hatt3-att 3/goodCnt Hatt Hatt5-att5/goodCnt Hatt6 if float (st [6])-0: ng+=1 ntwnt+i1 else: Accuracy(ng/geodCnt) pant Accuracy : print Accuracy #DIAGNOSING TEST DATA SET len code:

from __future__ import with_statement
#COUNT FOR HEALTHY PATIENTS
goodCnt=0;
att1=0
att2=0
att3=0
att4=0
att5=0
att6=0
#COUNT FOR ILL PATIENTS
badCount=0;
attA=0
attB=0
attC=0
attD=0
attE=0
attF=0
attG=0
#READING TRAINING DATA SET
filen=raw_input("enter the name of the training set file:")
with open(filen,'rb') as f:
for temp in f:
st=temp.split(",")
#HEALTHY PATIENTS
if float(st[6])==0:
att1+=(float(st[0]))
att2+=(float(st[1]))
att3+=(float(st[2]))
att4+=(float(st[3]))
att5+=(float(st[4]))
att6+=(float(st[5]))
goodCnt+=1
#ILL PATIENTS
else:
attA+=(float(st[0]))
attB+=(float(st[1]))
attC+=(float(st[2]))
attD+=(float(st[3]))
attE+=(float(st[4]))
attF+=(float(st[5]))
badCount+=1
#HEALTHY PATIENTS AVERAGE
Hatt1=att1/goodCnt
Hatt2=att2/goodCnt
Hatt3=att3/goodCnt
Hatt4=att4/goodCnt
Hatt5=att5/goodCnt
Hatt6=att6/goodCnt
#ILL PATIENTS AVERAGE
HattA=attA/badCount
HattB=attB/badCount
HattC=attC/badCount
HattD=attD/badCount
HattE=attE/badCount
HattF=attF/badCount

#PRINT HEALTHY AND ILL PATIENTS AVERAGE
print "healthy patients' averages:"
print Hatt1,Hatt2,Hatt3,Hatt4,Hatt5,Hatt6

print "Ill patients' averages:"
print HattA,HattB,HattC,HattD,HattE,HattF

#FIND SEPARATOR VALUES
satt1=(Hatt1+HattA)/2
satt2=(Hatt2+HattB)/2
satt3=(Hatt3+HattC)/2
satt4=(Hatt4+HattD)/2
satt5=(Hatt5+HattE)/2
satt6=(Hatt6+HattF)/2

#DISPLAY SEPARATOR VALUES
print "Separator values:"
print satt1,satt2,satt3,satt4,satt5,satt6

#FINDING ACCURACY
ng=0;
nt=0
with open(filen,'rb') as f:
for temp in f:
st=temp.split(",")
if float(st[6])==0:
ng+=1
nt=nt+1
else:
nt+=1
Accuracy=(ng/goodCnt)

print "Accuracy:"
print Accuracy

#DIAGNOSING TEST DATA SET
infilename=raw_input("enter the name of the test set file:")
outfile=raw_input("Enter the name of the output file:")
ot=open(outfile,'wb')
with open(infilename,'rb') as f:
for temp in f:
st=temp.split(",")
att1=(int(st[0]))
att2=(float(st[1]))
att3=(float(st[2]))
att4=(float(st[3]))
att5=(float(st[4]))
att6=(float(st[5]))
att7=(float(st[6]))
if((att2>satt1) or (att3>satt2) or (att4>satt3) or (att5>satt4) or (att6>satt5) or (att7>satt6)):
st=str(att1)+",Ill"
else:
st=str(att1)+",Healthy"
ot.write(st) #WRITING TO FILE
ot.write(" ")
ot.close()#CLOSE OUTPUT FILE
print "Done"

Note: Due to time constraint i am not able to prepare the flowchart

Add a comment

Answer 2

Problem statement For this program, you are to implement a simple machine-learning algorithm that uses a...

Homework Answers

Add Answer to:
Problem statement For this program, you are to implement a simple machine-learning algorithm that uses a...

Post as a guest

Earn Coins

Write a program that will take input from a file of numbers of type double and...

In this assignment, you will write a program in C++ which uses files and nested loops...

Any little bit of information will be helpful. Thank you. Problem Statement You are to develop a program to read Mo...

Problem: You will write a program to compute some statistics based on monthly average temperatures for...

PLEASE DO THIS IN C#.Design and implement a program (name it ProcessGrades) that reads from the...

I need this in JAVA please: Design and implement a program (name it MinMaxAvg) that defines...

HELP NEEDED in C++ (preferred) or any other language Write a simple program where you create...

Topics: list, file input/output (Python) You will write a program that allows the user to read...

In this assignment, you are to write a C++ program (using Visual C++ 2015) that reads...

Create the Python code for a program adhering to the following specifications. Write an Employee class...

Problem statement For this program, you are to implement a simple machine-learning algorithm that uses a...

Homework Answers

Add Answer to: Problem statement For this program, you are to implement a simple machine-learning algorithm that uses a...

Post as a guest

Earn Coins

Add Answer to:
Problem statement For this program, you are to implement a simple machine-learning algorithm that uses a...