STATISTICS PROBABILITY AND CODE IN PYTHON TO PLOT THE GRAPH. Consider the following Gross Domestic...

Question


Answer 1


First, create the data frame with pandas in Python using the code below:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

year = [1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [1.015, 1.33, 2.29, 3.26, 4.951, 6.759, 9.366, 13.131, 15.599]

# Build the DataFrame directly from the two lists.
df = pd.DataFrame({"year": year, "gdp": gdp})

The DataFrame looks like:

	year	gdp
0	1930	1.015
1	1940	1.330
2	1950	2.290
3	1960	3.260
4	1970	4.951
5	1980	6.759
6	1990	9.366
7	2000	13.131
8	2010	15.599

1.) Plotting the data:

plt.plot('year', 'gdp', data=df, linestyle='-', marker='o')
plt.xlabel("Year")
plt.ylabel("GDP")
plt.title("Real US GDP (in trillions)")
plt.show()

2.) Finding the mathematical relationship between year and GDP

Let's first check the correlation between year and GDP to see whether a linear relationship is plausible.

df["year"].corr(df["gdp"])

Output: 0.96

The correlation is very high, which suggests that a linear relationship between these two variables is a reasonable model to fit and plot.
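As a cross-check on the correlation value, the Pearson coefficient can be computed from its definition without pandas. This is a minimal sketch in plain Python; the helper name pearson_r is illustrative and not part of the original answer.

```python
import math

# Data copied from the table above.
year = [1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [1.015, 1.33, 2.29, 3.26, 4.951, 6.759, 9.366, 13.131, 15.599]


def pearson_r(xs, ys):
    """Pearson correlation: co-variation divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(year, gdp)
print(round(r, 2))  # 0.96
```

The result should agree with df["year"].corr(df["gdp"]), since pandas uses the Pearson method by default.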

Let's fit a linear regression model to the data above in Python.

from sklearn import linear_model

# The normalize argument was removed in scikit-learn 1.2; it does not change the fitted line here.
model = linear_model.LinearRegression()

model.fit(df[["year"]], df["gdp"])

Here the raw year values are used as the only feature, with no transformation applied.

Checking the coefficient and intercept to build the equation:

model.intercept_

-359.319

model.coef_

0.18565

So the linear equation becomes:

GDP = 0.18565 * year - 359.319
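The coefficients can also be reproduced from the closed-form least-squares formulas, slope = Sxy / Sxx and intercept = mean(gdp) - slope * mean(year). The sketch below uses plain Python rather than scikit-learn; the variable names are illustrative assumptions, not taken from the original answer.

```python
# Data copied from the table above.
year = [1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [1.015, 1.33, 2.29, 3.26, 4.951, 6.759, 9.366, 13.131, 15.599]

n = len(year)
mean_x = sum(year) / n
mean_y = sum(gdp) / n

# Sxy: co-variation of year and GDP; Sxx: variation of year.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(year, gdp))
sxx = sum((x - mean_x) ** 2 for x in year)

slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(round(slope, 5), round(intercept, 3))  # 0.18565 -359.319
```

Because there is only one feature, this closed form gives the same line that LinearRegression fits.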

Now predict the GDP values with the fitted model:

df["gdp_predicted"] = model.predict(df[["year"]])

	year	gdp	gdp_predicted
0	1930	1.015	-1.014778
1	1940	1.330	0.841722
2	1950	2.290	2.698222
3	1960	3.260	4.554722
4	1970	4.951	6.411222
5	1980	6.759	8.267722
6	1990	9.366	10.124222
7	2000	13.131	11.980722
8	2010	15.599	13.837222

Now check the coefficient of determination (R²):

from sklearn.metrics import r2_score
r2_score(df["gdp"], df["gdp_predicted"])

0.93 (approximately)

The coefficient of determination is high, indicating that the simple linear model alone explains most of the variation in GDP.

Checking for Root Mean Squared Error:

from sklearn.metrics import mean_squared_error
rms = np.sqrt(mean_squared_error(df["gdp"], df["gdp_predicted"]))
print(rms)

1.32 (approximately)

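The error is small relative to the overall range of GDP values (about 1 to 15.6 trillion), so the predictions follow the actual values reasonably well. It is large relative to the earliest values, though: the model predicts a negative GDP for 1930.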
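Both metrics can be re-derived from their definitions, R² = 1 - SSE / SST and RMSE = sqrt(SSE / n), which is a useful sanity check on the library output. The following is a minimal sketch in plain Python that refits the line in closed form; the variable names are illustrative and not part of the original answer.

```python
import math

# Data copied from the table above.
year = [1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [1.015, 1.33, 2.29, 3.26, 4.951, 6.759, 9.366, 13.131, 15.599]

# Ordinary least-squares line through the points (closed form).
n = len(year)
mean_x = sum(year) / n
mean_y = sum(gdp) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(year, gdp))
sxx = sum((x - mean_x) ** 2 for x in year)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [slope * x + intercept for x in year]

# SSE: squared error left after the fit; SST: total squared variation of GDP.
sse = sum((y - p) ** 2 for y, p in zip(gdp, predicted))
sst = sum((y - mean_y) ** 2 for y in gdp)

r2 = 1 - sse / sst
rmse = math.sqrt(sse / n)

print(f"R^2 = {r2:.4f}, RMSE = {rmse:.3f}")
```

For a one-variable least-squares fit, R² equals the square of the Pearson correlation, so the two checks should agree with each other.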

3.) Drawing the fitted line on top of the original graph

plt.plot( 'year', 'gdp', data=df, linestyle='-', marker='o', label = "original")
plt.plot( 'year', 'gdp_predicted', data=df, linestyle='-', marker='o', label="predicted")
plt.xlabel("Year")
plt.ylabel("GDP")
plt.title("Real US GDP(in trillions)")
plt.legend()
plt.show()

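The plot shows that the predicted values differ somewhat from the original ones. Some error is expected: a model that reproduced the original values exactly would very likely be overfitting. Note, however, that in the table of predictions the actual GDP lies above the line at both ends of the range and below it in the middle, which suggests that GDP grows faster than linearly. The straight line captures the overall trend but not this curvature.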
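The gap between the two curves can also be inspected numerically through the residuals (actual minus predicted). The sketch below is one possible way to do this in plain Python, assuming the same closed-form least-squares fit as the model; a run of residuals with the same sign would hint at structure that a straight line does not capture.

```python
# Data copied from the table above.
year = [1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
gdp = [1.015, 1.33, 2.29, 3.26, 4.951, 6.759, 9.366, 13.131, 15.599]

# Closed-form least-squares line (same fit as a one-feature linear regression).
n = len(year)
mean_x = sum(year) / n
mean_y = sum(gdp) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(year, gdp)) / sum(
    (x - mean_x) ** 2 for x in year
)
intercept = mean_y - slope * mean_x

# Residual = actual GDP minus the value on the fitted line.
residuals = [y - (slope * x + intercept) for x, y in zip(year, gdp)]

for x, res in zip(year, residuals):
    print(x, f"{res:+.3f}")
```

Least-squares residuals always sum to zero when an intercept is fitted, so the sign pattern across years is more informative than the total.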

