Please describe stages of modelling of Classical Linear Regression Model: Specification, Estimation, Contrast and Validation and Utilization.
Thank you.
Steps in regression analysis:
Regression analysis includes the following steps:
• Statement of the problem under consideration
• Choice of relevant variables
• Collection of data on relevant variables
• Specification of model
• Choice of method for fitting the data
• Fitting of model
• Model validation and criticism
• Using the chosen model(s) for the solution of the posed
problem.
1. Statement of the problem under consideration:
The first important step in conducting any regression analysis is to specify the problem and the objectives to be addressed by the regression analysis. A wrong formulation or a wrong understanding of the problem will give wrong statistical inferences. The choice of variables depends upon the objectives of the study and the understanding of the problem. For example, the height and weight of children are related. Now there can be two issues to be addressed:
(i) determination of height for given weight, or
(ii) determination of weight for given height.
In case (i), height is the response variable, whereas in case (ii), weight is the response variable. The roles of the explanatory and response variables are likewise interchanged between the two cases.
2. Choice of relevant variables:
Once the problem is carefully formulated and the objectives have been decided, the next question is to choose the relevant variables. It has to be kept in mind that the correct choice of variables determines the correctness of the statistical inferences. For example, in any agricultural experiment, the yield depends on explanatory variables like quantity of fertilizer, rainfall, irrigation, temperature, etc. These variables are denoted by X1, X2, ..., Xk as a set of k explanatory variables.
3. Collection of data on relevant variables:
Once the objective of the study is clearly stated and the variables are chosen, the next step is to collect data on the relevant variables. The data are essentially the measurements on these variables. For example, suppose we want to collect data on age. For this, it is important to know how to record the data on age. Either the date of birth can be recorded, which will provide the exact age on any specific date, or the age in completed years as on a specific date can be recorded. Moreover, it is also important to decide whether the data have to be collected on the variables as quantitative variables or as qualitative variables. For example, if the ages (in years) are 15, 17, 19, 21, 23, then these are quantitative values. If the ages are instead recorded by a variable that takes the value 1 when the age is less than 18 years and 0 when the age is 18 years or more, then the earlier recorded data are converted to 1, 1, 0, 0, 0. Note that there is a loss of information in converting quantitative data into qualitative data. The methods and approaches for qualitative and quantitative data are also different. If the study variable is binary, then logistic and probit regressions etc. are used. If all explanatory variables are qualitative, then the analysis of variance technique is used. If some explanatory variables are qualitative and others are quantitative, then the analysis of covariance technique is used. The techniques of analysis of variance and analysis of covariance are special cases of regression analysis.
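The conversion of quantitative ages into the qualitative indicator described above can be sketched in a few lines of Python (an illustrative addition, not part of the original notes):

```python
# Illustrative sketch: converting the quantitative ages 15, 17, 19, 21, 23
# into the qualitative (binary) variable described in the text.
ages = [15, 17, 19, 21, 23]

# 1 if the age is less than 18 years, 0 otherwise
indicator = [1 if a < 18 else 0 for a in ages]

print(indicator)  # [1, 1, 0, 0, 0]
```

Note that the original ages cannot be recovered from the indicator, which is exactly the loss of information mentioned above.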
Generally, the data are collected on n subjects. Then y denotes the response or study variable, and y1, y2, ..., yn are its n values. If there are k explanatory variables X1, X2, ..., Xk, then xij denotes the ith value of the jth variable, i = 1, 2, ..., n; j = 1, 2, ..., k. The observations can be presented in the following table:
Notation for the data used in regression analysis
______________________________________________________________
Observation    Response    Explanatory variables
number         y           X1      X2      ...     Xk
______________________________________________________________
1              y1          x11     x12     ...     x1k
2              y2          x21     x22     ...     x2k
3              y3          x31     x32     ...     x3k
...            ...         ...     ...     ...     ...
n              yn          xn1     xn2     ...     xnk
______________________________________________________________
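The indexing convention in the table (xij is the ith observation on the jth explanatory variable) can be sketched with plain Python lists; the numbers here are made up purely for illustration:

```python
# Illustrative layout of the regression data table: n = 3 observations,
# k = 2 explanatory variables (the values are invented for this sketch).
y = [10.0, 12.0, 15.0]            # y1, ..., yn: the response values
X = [[1.0, 2.0],                  # row i holds xi1, ..., xik
     [2.0, 3.0],
     [3.0, 5.0]]

n = len(y)        # number of observations
k = len(X[0])     # number of explanatory variables
print(n, k)       # 3 2
```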
4. Specification of model:
The experimenter or the person working in the subject usually helps in determining the form of the model. Only the form of the tentative model can be ascertained, and it will depend on some unknown parameters. For example, a general form will be like
y = f(X1, X2, ..., Xk; β1, β2, ..., βk) + ε
where ε is the random error reflecting mainly the difference between the observed value of y and the value of y obtained through the model. The form of f(X1, X2, ..., Xk; β1, β2, ..., βk) can be linear as well as nonlinear depending on the form of the parameters β1, β2, ..., βk. A model is said to be linear if it is linear in the parameters.
For example,
y = β1X1 + β2X2 + β3X3 + ε
y = β1 + β2X + ε
are linear models, whereas
y = β1X1 + β2²X2 + β3X3 + ε
y = β1X1 + ln(β2X2) + ε
are nonlinear models (each involves the parameters nonlinearly, through β2² and ln(β2X2) respectively). Many times, nonlinear models can be converted into linear models through some transformations. So the class of linear models is wider than it appears initially.
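A standard example of such a transformation, sketched here in Python with numpy (the multiplicative model and the numbers are illustrative, not from the notes): the model y = β1·X^β2·ε is nonlinear in the parameters, but taking logarithms gives ln y = ln β1 + β2 ln X + ln ε, which is linear in the parameters ln β1 and β2 and can be fitted by least squares.

```python
import numpy as np

# Simulate data from the multiplicative model y = b1 * X**b2 * e
# (illustrative values: b1 = 2.0, b2 = 1.5, small log-normal error).
rng = np.random.default_rng(0)
X = np.linspace(1.0, 10.0, 50)
b1_true, b2_true = 2.0, 1.5
y = b1_true * X**b2_true * np.exp(rng.normal(0.0, 0.05, X.size))

# Fit the transformed linear model  ln y = ln b1 + b2 * ln X  by least squares.
b2_hat, a_hat = np.polyfit(np.log(X), np.log(y), 1)   # [slope, intercept]
b1_hat = np.exp(a_hat)                                # back-transform ln b1

print(b1_hat, b2_hat)   # close to 2.0 and 1.5
```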
If a model contains only one explanatory variable, it is called a simple regression model. When there is more than one explanatory variable, it is called a multiple regression model. When there is only one study variable, the regression is termed univariate regression. When there is more than one study variable, the regression is termed multivariate regression. Note that simple and multiple regressions are not the same as univariate and multivariate regressions. Simple and multiple regressions are determined by the number of explanatory variables, whereas univariate and multivariate regressions are determined by the number of study variables.
5. Choice of method for fitting the data:
After the model has been defined and the data have been collected,
the next task is to estimate the
parameters of the model based on the collected data. This is also
referred to as parameter estimation or
model fitting. The most commonly used method of estimation is the
least squares method. Under certain
assumptions, the least squares method produces estimators with
desirable properties. The other estimation
methods are the maximum likelihood method, ridge method, principal
components method etc.
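The least squares method mentioned above can be sketched as follows (an illustrative numpy implementation with simulated data; the true parameter values are invented for the example): with the data arranged as a matrix X and vector y, the least squares estimate solves the normal equations X'Xβ = X'y.

```python
import numpy as np

# Simulate data from a linear model with an intercept and k = 2 regressors
# (illustrative setup: beta_true and the noise level are made up).
rng = np.random.default_rng(1)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # first column: intercept
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(0.0, 0.1, n)

# Least squares estimate: solve the normal equations (X'X) beta_hat = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

print(beta_hat)   # approximately [1.0, 2.0, -0.5]
```

In practice np.linalg.lstsq (or a dedicated statistics library) is preferred over forming X'X explicitly, for numerical stability; the normal-equations form is shown because it matches the textbook derivation.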
6. Fitting of model:
The estimation of the unknown parameters using an appropriate method provides the values of the parameters. Substituting these values in the equation gives us a usable model. This is termed model fitting. The estimates of the parameters β1, β2, ..., βk in the model
y = f(X1, X2, ..., Xk; β1, β2, ..., βk) + ε
are denoted by β̂1, β̂2, ..., β̂k. When the value of y is obtained for given values of X1, X2, ..., Xk, it is denoted as ŷ and called the fitted value.
The fitted equation is used for prediction. In this case, ŷ is termed the predicted value. Note that a fitted value is one where the values used for the explanatory variables correspond to one of the n observations in the data, whereas a predicted value is one obtained for any set of values of the explanatory variables. It is generally not recommended to predict y-values for values of the explanatory variables that lie outside the range of the data. When the values of the explanatory variables are future values of the explanatory variables, the predicted values are called forecasted values.
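The distinction between fitted and predicted values can be sketched for a simple regression (the data below are invented for illustration):

```python
import numpy as np

# Illustrative simple regression data (made up for this sketch).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.9])

# Fit y = b0 + b1*x by least squares.
b1, b0 = np.polyfit(x, y, 1)

# Fitted values: y-hat evaluated at the observed x's (one per observation).
y_fitted = b0 + b1 * x

# Predicted value: y-hat at a new x inside the range of the data.
x_new = 2.5
y_pred = b0 + b1 * x_new
print(y_pred)
```

Extrapolating, say, to x_new = 50 would use the same formula but is not recommended, since that value lies far outside the range of the data.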
7. Model criticism and selection
The validity of the statistical methods used for regression analysis depends on various assumptions. These assumptions become essentially the assumptions for the model and the data. The quality of the statistical inferences heavily depends on whether these assumptions are satisfied or not. Making these assumptions valid requires care from the very beginning of the experiment. One has to be careful in choosing the required assumptions and in determining whether the assumptions hold for the given experimental conditions. It is also important to identify the situations in which the assumptions may not be met.
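One elementary check of this kind, sketched here as an illustration (the data are simulated; the notes do not prescribe a specific diagnostic), is to examine the residuals ei = yi - ŷi: they should average to zero and show no systematic pattern against the fitted values.

```python
import numpy as np

# Simulate data satisfying the usual assumptions (illustrative values).
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 10.0, 200)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, 200)

# Fit by least squares and compute the residuals.
b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)

# With an intercept in the model, least squares forces the residual mean
# to zero (up to rounding); a clearly nonzero mean signals a coding error.
print(abs(residuals.mean()) < 1e-10)  # True
```

A plot of residuals against fitted values (or against time, for ordered data) is the usual next step; visible trends or funnel shapes suggest a violated assumption.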
8. Objectives of regression analysis
The determination of the explicit form of the regression equation is the ultimate objective of regression analysis. It is, finally, a good and valid relationship between the study variable and the explanatory variables. The regression equation helps in understanding the interrelationships among the variables. Such a regression equation can be used for several purposes: for example, to determine the role of any explanatory variable in the joint relationship for policy formulation, or to forecast the values of the response variable for a given set of values of the explanatory variables.
Thank you.