Question

I am working on a data frame using pandas; some of the column names are PCTFLOAN, SATMTMID, STATE, and INSTITUTION_NAME. For context, the STATE column holds the state abbreviation for each school.

a. Data grouping. For each state in the dataframe, find the 5 institutions that have the lowest loan percentage (PCTFLOAN). Ignore all missing values.

b. Data summarizing. For each state in the dataframe, calculate the average of the median SAT math scores (SATMTMID) for the 5 low-loan institutions found in question (a). Ignore all missing values when calculating the average.

PYTHON PANDAS PLEASE

Answer #1

# Import pandas (aliased as pd) to read and process the data

import pandas as pd

# Read the data from a CSV (comma-separated values) file using pd.read_csv("path to file directory/file name")

dataframe = pd.read_csv("path_to_file/file_name.csv")

# (a) Ignore the rows where PCTFLOAN is missing

valid = dataframe.dropna(subset=["PCTFLOAN"])

# Sort by state and then by PCTFLOAN (ascending), group by STATE, and keep the first 5 rows of each group. These are the 5 institutions with the lowest loan percentage in each state (fewer if a state has fewer than 5 valid rows)

low_loan = valid.sort_values(["STATE", "PCTFLOAN"]).groupby("STATE").head(5)

# Show the selected institutions for each state

print(low_loan[["STATE", "INSTITUTION_NAME", "PCTFLOAN"]])

# (b) For each state, average SATMTMID over the institutions selected in (a). SATMTMID is already each institution's median SAT math score, so no separate median calculation is needed, and mean() skips missing values by default

average_sat = low_loan.groupby("STATE")["SATMTMID"].mean()

print(average_sat)
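
To check the per-state logic that parts (a) and (b) ask for without the real CSV, here is a minimal sketch on a small made-up DataFrame. The toy rows and the helper names lowest_loan_schools and average_sat_math are assumptions for illustration; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; the real data comes from the asker's CSV.
toy = pd.DataFrame({
    "STATE": ["AL"] * 7 + ["AK"] * 3,
    "INSTITUTION_NAME": [f"School {i}" for i in range(10)],
    "PCTFLOAN": [0.10, 0.20, np.nan, 0.30, 0.40, 0.50, 0.60, 0.70, np.nan, 0.05],
    "SATMTMID": [500, np.nan, 700, 520, 540, 560, 800, 480, 600, 450],
})


def lowest_loan_schools(df, n=5):
    """Part (a): up to n rows per state with the smallest non-missing PCTFLOAN."""
    valid = df.dropna(subset=["PCTFLOAN"])
    return valid.sort_values(["STATE", "PCTFLOAN"]).groupby("STATE").head(n)


def average_sat_math(df, n=5):
    """Part (b): per-state mean SATMTMID over the rows from part (a).

    Series.mean() skips NaN by default, so missing SAT scores are ignored.
    """
    return lowest_loan_schools(df, n).groupby("STATE")["SATMTMID"].mean()


low_loan = lowest_loan_schools(toy)
average_sat = average_sat_math(toy)
print(low_loan[["STATE", "INSTITUTION_NAME", "PCTFLOAN"]])
print(average_sat)
```

On this toy data, AL keeps five of its six rows with a known PCTFLOAN, and AK keeps both of its valid rows because it has fewer than five. Ties at the cutoff are broken by sort order here; if tied institutions should all be kept, a different selection rule would be needed.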
