The linear regression model

Let us return to the example of an economist investigating the relationship between food expenditure and income. What factors or variables does a household consider when deciding how much money to spend on food every week or every month? Certainly, household income is one factor. Many other factors, such as the size of the household and the preferences and tastes of household members, also influence a household's decision about food expenditure. These variables are called independent variables because they vary independently and explain the variation in food expenditure among different households. In other words, these variables explain why different households spend different amounts of money on food. Food expenditure is called the dependent variable because it depends on the independent variables. Studying the effect of two or more independent variables on a dependent variable using regression analysis is called multiple regression. If we choose only one (usually the most important) independent variable and study the effect of that single variable on a dependent variable, it is called simple regression. Thus, simple regression includes only two variables: one independent and one dependent.

Definition: A regression model is a mathematical equation that describes the relationship between two or more variables. A simple regression model includes only two variables: one independent and one dependent. The dependent variable is the one being explained, and the independent variable is the one used to explain the variation in the dependent variable.

The relationship between two variables in a regression analysis is expressed by a mathematical equation called a regression equation or model.

A regression equation that gives a straight-line relationship between two variables is called a linear regression model; otherwise, it is called a nonlinear regression model. In this chapter we will consider only the linear regression model.

In a regression model, the independent variable is usually denoted by x and the dependent variable by y. The simple linear regression model is written as

$y = \alpha + \beta x$ (1)

In model (1), $\alpha$ gives the value of y for $x = 0$, and $\beta$ gives the change in y due to a change of one unit in x. This model simply states that y is determined exactly by x and that for a given value of x there is one and only one value of y. For example, if y is food expenditure and x is income, then model (1) would state that food expenditure is determined by income only and that all households with the same income spend the same amount on food. But as mentioned above, food expenditure is determined by many variables, only one of which is included in model (1). In reality, different households with the same income spend different amounts of money on food because of differences in household size, preferences and tastes. Hence, to take these variables into consideration and make the model complete, we add another term to the right side of model (1). This term is called the error term. It is denoted by $\varepsilon$ (Greek letter epsilon). The complete regression model is written as

$y = \alpha + \beta x + \varepsilon$ (2)

 

Equation (2) is called the population (or true) regression line of y on x.

In equation (2), $\alpha$ and $\beta$ are the population model coefficients and $\varepsilon$ is a random error term.
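The contrast between the exact model (1) and the complete model (2) can be illustrated with a small simulation. This is only a sketch: the coefficient values, the income figure, and the choice of a normally distributed error term are assumptions made for illustration and do not come from the text.

```python
import random

# Hypothetical population coefficients (illustrative values only).
ALPHA = 1.5   # value of y when x = 0
BETA = 0.25   # change in y per one-unit change in x


def exact_model(x):
    """Model (1): y is determined exactly by x."""
    return ALPHA + BETA * x


def population_model(x, rng, sigma=0.8):
    """Model (2): the same line plus a random error term epsilon.

    Drawing epsilon from a normal distribution with standard deviation
    `sigma` is an assumption of this sketch.
    """
    epsilon = rng.gauss(0.0, sigma)
    return ALPHA + BETA * x + epsilon


rng = random.Random(42)   # fixed seed so the run is repeatable
income = 40.0             # five hypothetical households with the same income

exact = [exact_model(income) for _ in range(5)]
noisy = [population_model(income, rng) for _ in range(5)]

print("model (1):", exact)
print("model (2):", [round(y, 2) for y in noisy])
```

Under model (1) all five households spend the same amount, while under model (2) the error term lets households with identical income spend different amounts.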

 

Population data are difficult to obtain. As a result, we almost always use sample data to estimate model (2). The estimated regression model is given by the equation

$y_i = a + b x_i + e_i$

where a and b are the estimated values of the coefficients and $e_i$ is the difference between the predicted value of y on the regression line, defined as

$\hat{y}_i = a + b x_i$

and the observed value $y_i$. The difference between $y_i$ and $\hat{y}_i$ for each value of x is defined as the residual

$e_i = y_i - \hat{y}_i = y_i - (a + b x_i)$
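The predicted values and residuals defined above can be computed directly once a and b are known. The sketch below uses made-up values for a, b and for the sample; they are hypothetical and were not obtained by fitting any real data.

```python
# Hypothetical estimated coefficients (not fitted, for illustration only).
a = 1.0
b = 0.25

# Hypothetical sample: incomes x and observed food expenditures y.
x = [20.0, 30.0, 40.0, 50.0]
y = [6.5, 8.0, 11.5, 13.0]

# Predicted value on the estimated line for each observed x.
y_hat = [a + b * xi for xi in x]

# Residual: observed value minus predicted value.
residuals = [yi - yhi for yi, yhi in zip(y, y_hat)]

for xi, yi, yhi, ei in zip(x, y, y_hat, residuals):
    print(f"x={xi:5.1f}  observed={yi:5.1f}  predicted={yhi:5.1f}  residual={ei:+.1f}")
```

A positive residual means the household spent more than the estimated line predicts for its income, and a negative residual means it spent less.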

 

Thus for each observed value of x there is a predicted value of y from the estimated model and an observed value. The difference between the observed and predicted values of y is defined as the residual. The residual, $e_i$, is not the model error, $\varepsilon_i$, but a combined measure of the model error and the errors in estimating a and b, and in turn the errors in estimating the predicted value.
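The distinction between the residual and the model error can be checked numerically on simulated data, where the true errors are known. In this sketch the population coefficients, the normal error distribution, and the sample are all assumptions. The estimates a and b are obtained with the standard least squares formulas, which this section does not derive.

```python
import random

# Hypothetical population values, chosen only for illustration.
ALPHA, BETA, SIGMA = 1.5, 0.25, 0.8

rng = random.Random(7)
x = [float(v) for v in range(10, 60, 5)]        # ten hypothetical incomes
eps = [rng.gauss(0.0, SIGMA) for _ in x]        # true model errors (known here)
y = [ALPHA + BETA * xi + ei for xi, ei in zip(x, eps)]

# Least squares estimates of the coefficients (assumed estimation method).
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b = sxy / sxx
a = y_bar - b * x_bar

# Residuals from the estimated line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Residual minus model error equals the error made in estimating the line:
# e_i - eps_i = (ALPHA - a) + (BETA - b) * x_i
gap = [ri - ei for ri, ei in zip(residuals, eps)]
expected_gap = [(ALPHA - a) + (BETA - b) * xi for xi in x]

print("estimated a, b:", round(a, 3), round(b, 3))
for ri, ei in zip(residuals, eps):
    print(f"residual={ri:+.3f}  model error={ei:+.3f}")
```

Each residual differs from the corresponding model error by exactly the amount by which the estimated line misses the population line at that x, which is why the residual combines the model error with the errors in estimating a and b.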

 


Date added: 2015-08-05

