Python Assignment In this assignment, you will use Pandas library to perform analysis on the dataset stored in the following csv file: breast-cancer-wisconsin.csv. Please write script(s) to do the fol...

Question

Python Assignment

In this assignment, you will use the Pandas library to analyze the dataset stored in the following CSV file: breast-cancer-wisconsin.csv.

Please write script(s) to do the following:

1. Read the CSV file and convert the dataset into a DataFrame object.

2. Persist the dataset into a SQL table and a JSON file.
   • Write the content of the DataFrame object into an SQLite database table. This converts the dataset into a SQL table format. You can define your own database and table names.
   • Write the content of the DataFrame object into a JSON file. This converts the dataset into JSON format. You can decide which JSON orientation (columns, records or split) to use.

3. Calculate the mean and standard deviation for every (numerical) column using DataFrame methods.

4. Use the DataFrame data visualization methods to draw either a boxplot or a kernel density estimate (KDE) plot showing the distribution of each column of the DataFrame object. Compare the generated plots and determine which columns have distributions of similar shape.

5. Use the DataFrame method to calculate the correlation coefficient between any two columns. Also draw scatter plots to show how any two columns are correlated. Use the correlation coefficients and scatter plots to determine whether any two columns are positively correlated, negatively correlated or not correlated.

6. Use the class column to group the records in the dataset and repeat steps 3 and 4 for each group.
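The statistics asked for in steps 3, 5 and 6 can be sketched as follows. This is only an illustration under stated assumptions: the small DataFrame is made-up stand-in data, and the column names clump_thickness and cell_size are hypothetical, since the real header of breast-cancer-wisconsin.csv is not shown here. Only the class column name comes from the assignment text.

```python
import pandas as pd

# Made-up stand-in for the real dataset; feature column names are hypothetical.
df = pd.DataFrame({
    "clump_thickness": [5, 3, 8, 1, 10, 2],
    "cell_size": [1, 1, 7, 1, 9, 2],
    "class": [2, 2, 4, 2, 4, 2],
})

# Treat "class" as a label rather than a measurement and keep numeric columns only.
features = df.drop(columns="class").select_dtypes(include="number")

# Step 3: mean and (sample) standard deviation of every numeric column.
summary = features.agg(["mean", "std"])
print(summary)

# Step 5: pairwise Pearson correlation coefficients between columns.
corr = features.corr()
print(corr)

# Step 6: the same statistics, computed separately for each class value.
grouped = df.groupby("class").agg(["mean", "std"])
print(grouped)
```

A coefficient near +1 suggests a positive correlation, near -1 a negative one, and near 0 little linear relationship. The grouped result has two-level column labels, so a single value is read with a tuple such as grouped.loc[4, ("clump_thickness", "mean")].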

Answer 1

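import pandas as pd

# Step 1: read the CSV file into a DataFrame.
data = pd.read_csv("breast-cancer-wisconsin.csv")
# Print the first rows; a bare data.head() shows nothing when run as a script.
print(data.head())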
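Once the DataFrame is loaded, step 2 (persisting to SQLite and JSON) might look like the sketch below. The database file name, the table name breast_cancer, the JSON file name and the stand-in data are all assumptions made for illustration; with the real file, data would come from pd.read_csv as above.

```python
import json
import os
import sqlite3
import tempfile

import pandas as pd

# Made-up stand-in for the DataFrame read from the CSV; column names are hypothetical.
data = pd.DataFrame({
    "clump_thickness": [5, 3, 8, 1, 10, 2],
    "cell_size": [1, 1, 7, 1, 9, 2],
    "class": [2, 2, 4, 2, 4, 2],
})

# Write into a scratch directory so the example leaves no files behind in the project.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "cancer.db")      # hypothetical database name
json_path = os.path.join(workdir, "cancer.json")  # hypothetical JSON file name

# Write the DataFrame to an SQLite table, then read it back to confirm.
conn = sqlite3.connect(db_path)
try:
    data.to_sql("breast_cancer", conn, if_exists="replace", index=False)
    restored = pd.read_sql("SELECT * FROM breast_cancer", conn)
finally:
    conn.close()

# Write the DataFrame as JSON, one object per row ("records" orientation).
data.to_json(json_path, orient="records")
with open(json_path) as fh:
    records = json.load(fh)

print(restored.head())
print(records[0])
```

The "records" orientation is chosen here because each row becomes one self-describing object; orient="columns" or orient="split" are equally valid choices for the assignment.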

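import pandas
import numpy
from sklearn.preprocessing import MinMaxScaler

# Note: this snippet loads the Pima Indians diabetes data, not breast-cancer-wisconsin.csv.
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values
# Split the array into the eight feature columns (X) and the class label (Y).
X = array[:, 0:8]
Y = array[:, 8]
# Rescale every feature to the range [0, 1].
scaler = MinMaxScaler(feature_range=(0, 1))
rescaledX = scaler.fit_transform(X)
# Print the first five rescaled rows with three decimal places.
numpy.set_printoptions(precision=3)
print(rescaledX[0:5, :])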
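The plotting parts of the assignment (the boxplot in step 4 and the scatter plot in step 5) are not covered by the code above. A minimal sketch, assuming matplotlib is installed and again using made-up stand-in data with hypothetical column names, could be:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render to image files; no display window needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the real dataset; feature column names are hypothetical.
data = pd.DataFrame({
    "clump_thickness": [5, 3, 8, 1, 10, 2],
    "cell_size": [1, 1, 7, 1, 9, 2],
    "class": [2, 2, 4, 2, 4, 2],
})
features = data.drop(columns="class")
out_dir = tempfile.mkdtemp()

# Step 4: one box per column, drawn on shared axes for easy comparison.
fig, ax = plt.subplots()
features.plot.box(ax=ax)
ax.set_title("Distribution of each feature")
box_path = os.path.join(out_dir, "boxplot.png")
fig.savefig(box_path)
plt.close(fig)

# Step 5: scatter plot of one pair of columns.
fig, ax = plt.subplots()
data.plot.scatter(x="clump_thickness", y="cell_size", ax=ax)
ax.set_title("clump_thickness vs cell_size")
scatter_path = os.path.join(out_dir, "scatter.png")
fig.savefig(scatter_path)
plt.close(fig)

print(box_path, scatter_path)
```

For step 6, the same calls can be repeated inside a loop over data.groupby("class"), saving one figure per class. A KDE plot via features.plot.kde() is the alternative to the boxplot, but it additionally requires SciPy.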
