import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import scipy.stats as stats
from sklearn.datasets import load_breast_cancer
In [1]:
iris=sns.load_dataset('iris')
In [3]:
iris.shape,iris.species.shape
Out[3]:
((150, 5), (150,))
In [4]:
X=iris.iloc[:,:-1].values
In [5]:
y=iris.iloc[:,4].values
In [6]:
from sklearn.model_selection import train_test_split
In [7]:
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.20)
In [8]:
X_train.shape,X_test.shape,y_train.shape,y_test.shape
Out[8]:
((120, 4), (30, 4), (120,), (30,))
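The split above is reshuffled on every run, so the neighbor indices and predictions shown later will differ between executions. A minimal sketch of a reproducible, class-balanced split follows. It assumes scikit-learn's bundled copy of iris (`load_iris`) in place of the seaborn download and an arbitrary seed of 42; both are illustrative choices, not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: scikit-learn's bundled iris data, so no download is needed.
X_demo, y_demo = load_iris(return_X_y=True)

# random_state fixes the shuffle; stratify keeps the class ratio equal in both parts.
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=42, stratify=y_demo
)

print(Xd_train.shape, Xd_test.shape)
print(np.bincount(yd_test))  # number of test samples per class
```

With 50 samples per class and a 20% test fraction, stratification puts 10 samples of each species in the test set.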
In [9]:
def top_neighbors(train_instance_1,train_instance_2,k=5):
    distances=[]
    for idx in range(len(train_instance_1)):
        # Euclidean distance from training row idx to the query row
        distances.append(np.linalg.norm(train_instance_1[idx,:]-train_instance_2))
    # indices of the k nearest rows, closest first
    return np.argsort(distances)[:k]
In [11]:
idxs=top_neighbors(X_train[:,:],X_test[0,:])
idxs
Out[11]:
array([ 66, 52, 68, 116, 106], dtype=int32)
In [12]:
y_train[idxs]
Out[12]:
array(['versicolor', 'versicolor', 'versicolor', 'versicolor', 'versicolor'], dtype=object)
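All five nearest neighbors share one label here, but in general the k labels disagree and the prediction is the most common one. A minimal sketch of that voting step follows. It assumes a tiny made-up training set (the arrays and the `predict_one` helper are hypothetical names for illustration), with ties going to whichever label `Counter` encounters first.

```python
from collections import Counter
import numpy as np

def predict_one(X_train, y_train, query, k=5):
    """Classify one query row by majority vote among its k nearest training rows."""
    # Euclidean distance from every training row to the query, computed in one call
    distances = np.linalg.norm(X_train - query, axis=1)
    nearest = np.argsort(distances)[:k]
    # most_common(1) returns a list holding one (label, count) pair
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters in 2-D.
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_toy = np.array(["a", "a", "a", "b", "b", "b"])

print(predict_one(X_toy, y_toy, np.array([0.15, 0.15]), k=3))  # a
print(predict_one(X_toy, y_toy, np.array([5.0, 5.1]), k=3))    # b
```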
In [13]:
from sklearn.neighbors import KNeighborsClassifier
In [14]:
classifier=KNeighborsClassifier(n_neighbors=5)
In [15]:
classifier.fit(X_train,y_train)
Out[15]:
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski', metric_params=None, n_jobs=None, n_neighbors=5, p=2, weights='uniform')
In [20]:
y_pred=classifier.predict(X_test)
y_pred
Out[20]:
array(['versicolor', 'setosa', 'virginica', 'setosa', 'virginica', 'versicolor', 'versicolor', 'virginica', 'versicolor', 'virginica', 'versicolor', 'versicolor', 'setosa', 'virginica', 'setosa', 'setosa', 'versicolor', 'setosa', 'versicolor', 'virginica', 'versicolor', 'setosa', 'setosa', 'virginica', 'virginica', 'versicolor', 'setosa', 'setosa', 'versicolor', 'versicolor'], dtype=object)
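The predictions are printed but never compared with `y_test`. A minimal sketch of scoring them follows. It assumes scikit-learn's bundled iris data and a fixed seed so the example runs on its own; the `_demo` and short variable names are hypothetical. In the notebook you would pass `y_test` and `y_pred` directly.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: bundled iris data and seed 0, chosen only to make the sketch reproducible.
X_demo, y_demo = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=0, stratify=y_demo
)

clf_demo = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
y_hat = clf_demo.predict(X_te)

# Fraction of correct predictions, and a table of true class (rows) vs predicted class (columns)
acc = accuracy_score(y_te, y_hat)
cm = confusion_matrix(y_te, y_hat)
print(f"accuracy: {acc:.3f}")
print(cm)
```

The diagonal of the confusion matrix counts correct predictions per class, so its sum divided by the total equals the accuracy.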
I need to create the "nnclassifier" using the Euclidean distance formula to find the nearest neighbor. I...
Classification in Python: Classification. In this assignment, you will practice using the kNN (k-Nearest Neighbors) algorithm to solve a classification problem. kNN is a simple and robust classifier that is used in many applications. The goal is to train the kNN algorithm to distinguish the species from one another. The dataset can be downloaded from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/machine-learning-databases/iris/). Download the `iris.data` file from the Data Folder. The Data Set description...
You are to build trees with varying max tree depth for the
dataset provided (use maximum tree depths 2-10). For each tree of a
given maximum depth, record the accuracy, precision and recall.
Plot each of these metrics as a line plot (tree depth on the x axis
and % on the y axis). Below is what I have attempted.
In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics...
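The task described above (one tree per maximum depth from 2 to 10, recording accuracy, precision, and recall, then plotting them) can be sketched as follows. The original dataset is not shown, so this sketch assumes scikit-learn's bundled breast cancer data as a stand-in, a fixed seed, and the hypothetical helper names `sweep_depths` and `plot_metrics`.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def sweep_depths(X, y, depths=range(2, 11), seed=0):
    """Fit one tree per max_depth and return test-set metrics in percent."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    results = {"depth": [], "accuracy": [], "precision": [], "recall": []}
    for depth in depths:
        tree = DecisionTreeClassifier(max_depth=depth, random_state=seed).fit(X_tr, y_tr)
        y_hat = tree.predict(X_te)
        results["depth"].append(depth)
        results["accuracy"].append(100 * accuracy_score(y_te, y_hat))
        results["precision"].append(100 * precision_score(y_te, y_hat))
        results["recall"].append(100 * recall_score(y_te, y_hat))
    return results

def plot_metrics(results):
    """Line plot with tree depth on the x axis and % on the y axis."""
    fig, ax = plt.subplots()
    for name in ("accuracy", "precision", "recall"):
        ax.plot(results["depth"], results[name], marker="o", label=name)
    ax.set_xlabel("maximum tree depth")
    ax.set_ylabel("%")
    ax.legend()
    return ax

# Assumption: a binary stand-in dataset, so precision and recall need no averaging mode.
X_bc, y_bc = load_breast_cancer(return_X_y=True)
results = sweep_depths(X_bc, y_bc)
print(results["depth"])
```

Calling `plot_metrics(results)` followed by `plt.show()` draws the figure. For a target with more than two classes, `precision_score` and `recall_score` need an `average` argument such as `average="macro"`.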
Can someone please run this on their system and post a screenshot so I know it works? The .csv file and program are given below; please just run it and provide the screenshot. Also, I am using Python 3.6 in PyCharm. Thanks! Bank_Predictions.csv: use a portion of the dataset Bank_Predictions, which I have provided below (it is only 10 lines because the actual file has over 1000 lines, so here is a small snippet): Number Customer_ID Last_Name Cr_Score...
I need to use the distance formula below to find the exact
coordinates of the terminal point of pi/8 from pi/4 and then solve
for (b) and then (c). It has to use the distance formula, not the
half-angle formulas for cosine and sine. Please include your
step-by-step solution. Thank you.
The point at $\pi/8$ is halfway between $0$ and $\pi/4$. So, if its coordinates are $(x,y)$, then we have $d[(x,y),(1,0)] = d[(x,y),(\sqrt{2}/2,\sqrt{2}/2)]$, which is...
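One way to finish the equal-distance argument, offered as a sketch rather than the original solution: squaring both distances and using $x^2+y^2=1$ reduces the equation to $y=(\sqrt{2}-1)x$, and substituting back into the unit circle gives $x=\sqrt{2+\sqrt{2}}/2$ and $y=\sqrt{2-\sqrt{2}}/2$. The check below verifies these values numerically with only the standard library; the variable names are illustrative.

```python
import math

# Candidate exact coordinates of the terminal point of pi/8
x = math.sqrt(2 + math.sqrt(2)) / 2
y = math.sqrt(2 - math.sqrt(2)) / 2

def dist(p, q):
    """Distance formula between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

start = (1.0, 0.0)                               # terminal point of 0
quarter = (math.sqrt(2) / 2, math.sqrt(2) / 2)   # terminal point of pi/4

# The point must lie on the unit circle and be equidistant from both endpoints.
on_circle = math.isclose(x * x + y * y, 1.0)
equidistant = math.isclose(dist((x, y), start), dist((x, y), quarter))
# Cross-check against the library trig functions (used only for verification).
matches_trig = (math.isclose(x, math.cos(math.pi / 8))
                and math.isclose(y, math.sin(math.pi / 8)))
print(on_circle, equidistant, matches_trig)  # True True True
```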
PYTHON
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
Our goal is to create a linear regression model to estimate
values of ln_price using ln_carat as the only feature. We will now
prepare the feature and label arrays.
"carat" "cut" "color"
"clarity" "depth" "table"
"price" "x" "y" "z"
"1" 0.23 "Ideal" "E" "SI2" 61.5 55 326
3.95 3.98 2.43
"2" 0.21 "Premium" "E" "SI1"...
This is my code to predict a housing price based on data, but I
get a lot of errors. How do I resolve all these errors?
In [9]:
import seaborn as sns
from sklearn.linear_model import LinearRegression
In [20]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
data = pd.read_csv('C:\\Users\\Downloads\\house-prices-advanced-regression-techniques\\test.csv')
data
Out[20]:
ScreenPorch PoolArea PoolQC Fence Id MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape LandContour Utilities NaN MnPrv 120 Lvl AllPub Pave NaN Reg...