Question

I have a data set called ACS that I set my_data <- read.csv('acs_ny_CSV.csv'). One of the...

I have a data set called ACS that I load with my_data <- read.csv('acs_ny_CSV.csv'). One of the variables in the data set is FamilyIncome, with values ranging from 50 to over 1 million:

FamilyIncome
Min.   :     50
1st Qu.:  52540
Median :  87000
Mean   : 110281
3rd Qu.: 133800
Max.   :1605000

I need to convert this variable to 0 and 1, because I need to "Make a binary variable with value TRUE for income above $150,000 and FALSE for income below." Can you tell me what code I need?

For additional info, the overall problem I am solving is:

Use the subset (acs_ny.csv) of the 2010 American Community Survey (ACS) for New York state found here http://www.jaredlander.com/data/acs_ny.csv , make a logistic regression model in R. Predict whether a household has an income greater than $150,000. Explain your results including deviance residuals, coefficients, and AIC. Make a coefficient plot for logistic regression on family income greater than $150,000. Make a new binary variable with value TRUE for income above $150,000 and FALSE for income below. Make a density plot of family income to see distribution. Use glm() function to perform logistic regression in R.

I am going to use the code

model <- glm(formula=FamilyIncome~.,data=my_data,family='binomial')

which I am hoping will run once FamilyIncome has been changed to TRUE for >= 150,000 and FALSE for < 150,000.

Answer #1

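m <- 150000
# TRUE where income is above the threshold, FALSE otherwise
my_data$y <- my_data$FamilyIncome > m

The code above creates a new column y holding TRUE or FALSE, which glm() treats as 1 and 0.

You can now remove FamilyIncome, so that the formula y ~ . does not use income as a predictor of its own indicator:

my_data$FamilyIncome <- NULL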
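
The thresholding step can be checked on a tiny made-up data frame before touching the real data. This is only a sketch: `toy` and its income values are invented for illustration and are not taken from the ACS file.

```r
# Toy stand-in for my_data; the incomes are made up for illustration
toy <- data.frame(FamilyIncome = c(50, 87000, 150000, 150001, 1605000))

threshold <- 150000
# A vectorised comparison returns a logical column directly
toy$y <- toy$FamilyIncome > threshold

# Count how many rows fall on each side of the threshold
table(toy$y)
```

Note that an income of exactly 150,000 comes out as FALSE here; use `>=` instead if the boundary value should count as TRUE.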

Then run the code

model <- glm(formula=y~.,data=my_data,family='binomial')
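
Once the model is fitted, the quantities the assignment asks about (deviance residuals, coefficients, AIC) can be pulled out with standard base R functions. The sketch below runs on simulated data so that it is self-contained; the data frame `sim`, its two predictor columns, and the formula used to generate incomes are all assumptions for illustration, not the real ACS data.

```r
set.seed(1)
n <- 200

# Simulated predictors (hypothetical stand-ins for the ACS columns)
sim <- data.frame(
  NumWorkers = sample(0:3, n, replace = TRUE),
  HouseCosts = round(runif(n, 500, 5000))
)

# Simulated income, then the same binary recoding as in the answer
income <- 40000 + 30000 * sim$NumWorkers + 20 * sim$HouseCosts +
  rnorm(n, sd = 30000)
sim$y <- income > 150000

# Logistic regression of the indicator on all remaining columns
model <- glm(y ~ ., data = sim, family = binomial)

summary(model)                                # coefficient table, deviances, AIC
coef(model)                                   # estimates on the log-odds scale
AIC(model)                                    # Akaike information criterion
summary(residuals(model, type = "deviance"))  # spread of deviance residuals
```

On the real data, the same calls applied to the model fitted on my_data should give the corresponding output; exponentiating the coefficients with `exp(coef(model))` turns them into odds ratios, which are often easier to explain.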
