Question

Before you start

For this homework, we will need to import some libraries. You need to execute the following cell only once; you don't need to copy this in every cell you run.

In [ ]:

import pandas
import numpy
from urllib.request import urlretrieve
from matplotlib import pyplot
%matplotlib inline
#This library is needed for testing
from IPython.display import set_matplotlib_close
set_matplotlib_close(False)

Introduction

In this homework, you will work with data from the World Bank. The subject of study is the "Life expectancy at birth, total (years)". According to the World Bank, life expectancy at birth indicates the number of years a newborn infant would live if prevailing patterns of mortality at the time of its birth were to stay the same throughout its life.

Importing the data

We are going to import the data and assign it to a pandas data frame named life_exp. Before you start, make sure to execute the cells below to download the data file, read it into a data frame, clean it, and set the index column.

In [ ]:

URL = 'http://go.gwu.edu/engcomp2hw1data'
urlretrieve(URL, 'life_expectancy.csv')

In [ ]:

life_exp = pandas.read_csv('life_expectancy.csv')
life_exp = life_exp.dropna() #clean data frame
life_exp = life_exp.set_index('Country') #Set Country column as index

To see the first lines of our data, uncomment the following cell and execute it.

In [ ]:

#life_exp.head()

Exercise 1

a) Using pandas, compute the minimum and maximum values in 1989 of the life_exp data frame, assign the results to variables named min_1989 and max_1989, respectively, and print them.

Hint

The syntax to compute the minimum value with pandas is:

    name_of_the_data_frame['name_of_column'].min()

and the maximum:

    name_of_the_data_frame['name_of_column'].max()
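As an illustration of the syntax in the hint, here is a minimal sketch on a toy data frame. The values, the labels "A", "B", "C", and the names toy, toy_min and toy_max are hypothetical and are not taken from the World Bank file; the sketch also assumes the column labels are strings such as "1989".

```python
import pandas

# Toy data (hypothetical values, not the World Bank file).
toy = pandas.DataFrame(
    {"1989": [50.0, 72.5, 64.0], "1990": [51.0, 73.0, 63.5]},
    index=["A", "B", "C"],
)

# Select one column by its label, then reduce it to a single number.
toy_min = toy["1989"].min()
toy_max = toy["1989"].max()

print(toy_min, toy_max)
```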

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

b) Using the pandas built-in functions to compute minimum and maximum values, and for-statements, compute the minimum and maximum values in each year and append them to the lists min_per_year and max_per_year, respectively.

Hint: pandas data frames have a built-in function that returns the columns. Check the syntax in the documentation.
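The attribute the hint refers to is most likely columns, which holds the column labels of a data frame. The sketch below loops over it on toy data; the data and the list names col_mins and col_maxs are hypothetical.

```python
import pandas

# Toy data (hypothetical values, not the World Bank file).
toy = pandas.DataFrame(
    {"1989": [50.0, 72.5, 64.0], "1990": [51.0, 73.0, 63.5]},
    index=["A", "B", "C"],
)

col_mins = []
col_maxs = []

# toy.columns yields each column label in order.
for col in toy.columns:
    col_mins.append(toy[col].min())
    col_maxs.append(toy[col].max())

print(col_mins, col_maxs)
```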

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

c) Using pandas, obtain the countries that correspond to the minimum and maximum values in 1989 of the life_exp data frame, assign the result to variables named min_1989_country and max_1989_country respectively, and print them.

Hint

  1. Remember that the index of our data frame is the "Country" column.
  2. The syntax to compute the index of the minimum is:
    name_of_the_data_frame['name_of_column'].idxmin()

and the index of the maximum is:

    name_of_the_data_frame['name_of_column'].idxmax()
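To see what these two methods return, here is a small sketch on toy data. idxmin() and idxmax() return the index label of the extreme value rather than the value itself. The data and the names label_of_min and label_of_max are hypothetical.

```python
import pandas

# Toy data with a named index, mimicking a "Country" index column.
toy = pandas.DataFrame(
    {"1989": [50.0, 72.5, 64.0]},
    index=pandas.Index(["A", "B", "C"], name="Country"),
)

# The result is an index label (here a string), not a number.
label_of_min = toy["1989"].idxmin()
label_of_max = toy["1989"].idxmax()

print(label_of_min, label_of_max)
```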

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

d) Using the pandas built-in functions to obtain the index of the minimum and maximum values, and for-statements, compute the countries to which the minimum and maximum values correspond each year, and append them to the lists min_per_year_country and max_per_year_country respectively.

Hint: pandas data frames have a built-in function that returns the columns. Check the syntax in the documentation.

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

e) Using pandas, create a Series based on the list min_per_year_country, apply the method value_counts(), assign the result to a variable named min_count, and print it.

Tip: The syntax to generate a Series from a list is:

    pandas.Series(your_list)
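A short sketch of what value_counts() produces, using a hypothetical list of labels. The result is itself a Series whose index holds the distinct entries and whose values are how often each one occurs, sorted from most to least frequent by default.

```python
import pandas

# Hypothetical list of labels with repeats.
labels = ["A", "B", "A", "C", "A", "B"]

# Index: distinct labels; values: number of occurrences.
counts = pandas.Series(labels).value_counts()

print(counts)
```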

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

f) Using the pandas series methods min() and idxmin(), compute the minimum of the min_count series and the country to which it corresponds, assign them to variables named min_count_min and min_count_min_country, and print them.

Notes:

  1. You can check that min_count is a Series using the type() built-in function.
  2. The country min_count_min_country is the country that for min_count_min times had the minimum life expectancy over the years.

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

g) Using the pandas series methods max() and idxmax(), compute the maximum of the min_count series and the country to which it corresponds, assign them to variables named min_count_max and min_count_max_country, and print them.

Note:

  • The country min_count_max_country is the country that for min_count_max times had the minimum life expectancy over the years.

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

h) Repeat the same process applied in e), f) and g) but now using the list max_per_year_country obtained in exercise d).

The variable names should now be:

  • For part e): max_count
  • For part f): max_count_min and max_count_min_country
  • For part g): max_count_max and min_count_max_country

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

Exercise 2

a) Using the pandas built-in function mean(), and for-statements, compute the mean value of each year and append it to the list mean_per_year.

Hint: pandas data frames have a built-in function that returns the columns. Check the syntax in the documentation.

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

b) Using NumPy:

  1. Convert the list mean_per_year into an array and assign the result to a variable named mean_arr.
  2. Create an array that goes from 1960 to 2015 jumping by one year, and assign the result to a variable named year_arr.
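A minimal sketch of the two NumPy calls that are likely useful here, numpy.array and numpy.arange, on hypothetical values. Note that numpy.arange excludes its stop value, so a range that should include its final year needs a stop of that year plus one.

```python
import numpy

# Hypothetical list of numbers converted to an array.
values = [1.5, 2.5, 4.0]
values_arr = numpy.array(values)

# arange(start, stop) steps by 1 and leaves out stop itself:
# this gives 2000, 2001, 2002, 2003.
years = numpy.arange(2000, 2004)

print(values_arr, years)
```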

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

c) Use the year_arr and mean_arr arrays to perform a linear regression (year_arr as the "x" and mean_arr as the "y" axes) using the built-in functions from NumPy. Name the coefficients of the linear regression a1 and a0, and name the fitting function f_linear.
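One way to do a linear fit with NumPy's built-in functions is numpy.polyfit with degree 1 together with numpy.poly1d. This is a sketch on toy data that is exactly linear, not the assignment's solution; the names x, y, slope, intercept and fit are hypothetical. polyfit returns the coefficients from the highest degree down, so the first one is the slope (a1 in the notation above) and the second is the intercept (a0).

```python
import numpy

# Toy data lying exactly on the line y = 0.5 * x - 950.
x = numpy.array([2000.0, 2001.0, 2002.0, 2003.0])
y = 0.5 * x - 950.0

# Degree-1 least-squares fit: coefficients come back as (slope, intercept).
slope, intercept = numpy.polyfit(x, y, 1)

# poly1d turns the coefficients into a callable polynomial.
fit = numpy.poly1d((slope, intercept))

print(slope, intercept, fit(2010.0))
```

Calling fit with a value outside the range of x, as in the last line, extrapolates the fitted line.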

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

d) Plot the original data and the linear regression curve on one plot. The plot must include:

  • Title
  • Label in the x-axis
  • Label in the y-axis with units
  • Legend for the linear regression
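The listed plot elements map onto a few pyplot calls. The sketch below uses toy data and hypothetical labels; the matplotlib.use("Agg") line is there only so the sketch runs outside a notebook and should be dropped in a notebook that already uses %matplotlib inline.

```python
import numpy
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
from matplotlib import pyplot

# Toy data (hypothetical values) and a degree-1 fit.
x = numpy.array([0.0, 1.0, 2.0, 3.0])
y = numpy.array([1.1, 2.9, 5.2, 6.8])
slope, intercept = numpy.polyfit(x, y, 1)

fig, ax = pyplot.subplots()
ax.scatter(x, y, label="data")                      # original points
ax.plot(x, slope * x + intercept, color="C1",
        label="linear fit")                         # fitted line
ax.set_title("Toy data with linear fit")            # title
ax.set_xlabel("x")                                  # x-axis label
ax.set_ylabel("y (arbitrary units)")                # y-axis label with units
ax.legend()                                         # legend built from the labels
```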

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

e) Using the f_linear function, compute the predicted value of the average life expectancy for the year 2030, assign it to a variable named le_estimate_2030 and print it.

In [ ]:

# YOUR CODE HERE
raise NotImplementedError()

In [ ]:

Remember!!

  1. Before you turn in this homework, make sure everything runs as expected. To do this, go to the Kernel menu option, and select "Restart & Run All".

  2. Make sure you filled in any place that says "YOUR CODE HERE" or "YOUR ANSWER HERE", as well as your name at the beginning of the notebook.

  3. DO NOT change the name of the file! Make sure your submission is still named engcomp2hw2.ipynb.

Answer #1

Create a Jupyter notebook named "engcomp2hw2" and submit that file as your assignment. Just follow the instructions in the question. I look forward to solving Exercise 2 as well.

To show how the code fits into a Jupyter notebook, I have attached some screenshots to this answer.

This assignment should have been posted as two questions, so I am solving only Exercise 1, as we are instructed to do. To get a solution for Exercise 2, please post it as a separate question.

I was supposed to solve only four sub-parts of Exercise 1, but I solved all of them.

Note: In the separate question, include the text of Exercise 1 as well, and add a line asking for the solution to Exercise 2. Also edit this question by adding the text "Provide a solution for Exercise 1."

Solution for Exercise 1 (all options) is as follows.

# In[1]:


import pandas
import numpy
from urllib.request import urlretrieve
from matplotlib import pyplot
get_ipython().run_line_magic('matplotlib', 'inline')


#This library is needed for testing

from IPython.display import set_matplotlib_close

set_matplotlib_close(False)


# In[2]:
URL = 'http://go.gwu.edu/engcomp2hw1data'
urlretrieve(URL, 'life_expectancy.csv')


# In[4]:
life_exp = pandas.read_csv('life_expectancy.csv')
life_exp = life_exp.dropna() #clean data frame
life_exp = life_exp.set_index('Country') #Set Country column as index


# In[35]:
life_exp.head()


# ## Exercise 1

# a) Using pandas, compute the minimum and maximum values in 1989 of the life_exp data frame, assign the results to variables named min_1989 and max_1989, respectively, and print them.

# In[28]:
min_1989 = life_exp["1989"].min()
max_1989 = life_exp["1989"].max()

print("min_1989 : {} \nmax_1989: {}".format(min_1989,max_1989))


# In[ ]:
# b) Using the pandas built-in functions to compute minimum and maximum values, and for-statements, compute the minimum and maximum values in each year and append them to the lists min_per_year and max_per_year, respectively.

# In[12]:
min_per_year = []
max_per_year = []

# life_exp.columns holds the column labels (one per year).
for year in life_exp.columns:
    min_per_year.append(life_exp[year].min())
    max_per_year.append(life_exp[year].max())

# In[ ]:

# c) Using pandas, obtain the countries that correspond to the minimum and maximum values in 1989 of the life_exp data frame, assign the result to variables named min_1989_country and max_1989_country respectively, and print them.

# In[31]:
min_1989_country = life_exp["1989"].idxmin()
max_1989_country = life_exp["1989"].idxmax()

print("min_1989_country: {} \nmax_1989_country: {}".format(min_1989_country,max_1989_country))


# In[ ]:

# d) Using the pandas built-in functions to obtain the index of the minimum and maximum values, and for-statements, compute the countries to which the minimum and maximum values correspond each year, and append them to the lists min_per_year_country and max_per_year_country respectively.

# In[32]:
min_per_year_country = []
max_per_year_country = []

for year in life_exp.columns:
    min_per_year_country.append(life_exp[year].idxmin())
    max_per_year_country.append(life_exp[year].idxmax())


# In[ ]:

# e) Using pandas, create a Series based on the list min_per_year_country, apply the method value_counts(), assign the result to a variable named min_count, and print it.
#

# In[37]:
min_count = pandas.Series(min_per_year_country).value_counts()

print(min_count)


# In[ ]:

# f) Using the pandas series methods min() and idxmin(), compute the minimum of the min_count series and the country to which it corresponds, assign them to variables named min_count_min and min_count_min_country, and print them.

# In[39]:


min_count_min = min_count.min()
min_count_min_country = min_count.idxmin()

print("min_count_min: {} \nmin_count_min_country: {}".format(min_count_min,min_count_min_country))


# In[ ]:

# g) Using the pandas series methods max() and idxmax(), compute the maximum of the min_count series and the country to which it corresponds, assign them to variables named min_count_max and min_count_max_country, and print them.

# In[40]:


min_count_max = min_count.max()
min_count_max_country = min_count.idxmax()

print("min_count_max: {} \nmin_count_max_country: {}".format(min_count_max,min_count_max_country))


# In[ ]:

# h) Repeat the same process applied in e), f) and g) but now using the list max_per_year_country obtained in exercise d). The variable names should now be:
#
# For e) part : max_count For f) part : max_count_min and max_count_min_country. For g) part : max_count_max and min_count_max_country

# In[42]:
max_count = pandas.Series(max_per_year_country).value_counts()

print(max_count)

# In[43]:
max_count_min = max_count.min()
max_count_min_country = max_count.idxmin()

print("max_count_min: {} \nmax_count_min_country: {}".format(max_count_min,max_count_min_country))


# In[44]:
max_count_max = max_count.max()
max_count_max_country = max_count.idxmax()

print("max_count_max: {} \nmax_count_max_country: {}".format(max_count_max,max_count_max_country))
