This section introduces statistical measures that describe the relationship between two variables more precisely than a scatter diagram can.
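Let X and Y be a pair of random variables, with means $\mu_X$ and $\mu_Y$, and variances $\sigma_X^2$ and $\sigma_Y^2$. As a measure of the association between these variables, we introduced the covariance. For a sample it is defined as
$$\mathrm{Cov}(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
where $x_i$ and $y_i$ are the observed values, $\bar{x}$ and $\bar{y}$ are the sample means, and $n$ is the sample size.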
A positive value of the covariance indicates a direct, or increasing, linear relationship, and a negative value indicates a decreasing linear relationship. A positive association means that high values of X tend to be associated with high values of Y, and low values of X with low values of Y. When the association is negative, so that high values of X are associated with low values of Y and low values of X with high values of Y, the covariance is negative. If there is no linear association between X and Y, their covariance is 0.
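The sign behaviour described above can be checked with a minimal Python sketch. It assumes the usual $n - 1$ divisor for the sample covariance; the function name `sample_covariance` and the two small data sets are hypothetical, chosen only for illustration.

```python
def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 divisor assumed)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of products of deviations from the two sample means.
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross / (n - 1)


# Hypothetical data, chosen only to show the sign of the covariance.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 6]    # tends to rise with x
y_down = [6, 4, 5, 4, 2]  # tends to fall with x

cov_up = sample_covariance(x, y_up)      # positive
cov_down = sample_covariance(x, y_down)  # negative
print(cov_up, cov_down)
```

Note that the size of the covariance depends on the units of X and Y, so only its sign is easy to interpret directly.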
Another measure of the relationship between two variables is the correlation coefficient. This section considers the simple linear correlation, or linear correlation for short, which measures the strength of the linear association between two variables.
Definition:
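The simple linear correlation, denoted by $r$, measures the strength of the linear relationship between two variables for a sample and is calculated as
$$r = \frac{\mathrm{Cov}(x, y)}{s_x s_y} \qquad (3.1)$$
where $\mathrm{Cov}(x, y)$ is the sample covariance and $s_x$ and $s_y$ are the sample standard deviations of the two variables.
An equivalent expression is
$$r = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left(\sum x_i^2 - n\bar{x}^2\right)\left(\sum y_i^2 - n\bar{y}^2\right)}} \qquad (3.2)$$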
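A short derivation may help show why a covariance-based expression and a sums-based expression for $r$ agree. It is a sketch that assumes the sample covariance and both sample standard deviations use the same divisor $n - 1$; all notation is written out in full so the steps can be followed on their own.

```latex
\begin{aligned}
r &= \frac{\operatorname{Cov}(x,y)}{s_x s_y}
   = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})/(n-1)}
          {\sqrt{\sum (x_i-\bar{x})^2/(n-1)}\,\sqrt{\sum (y_i-\bar{y})^2/(n-1)}}
   = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}
          {\sqrt{\sum (x_i-\bar{x})^2 \,\sum (y_i-\bar{y})^2}}, \\[4pt]
\sum (x_i-\bar{x})(y_i-\bar{y})
  &= \sum x_i y_i - \bar{y}\sum x_i - \bar{x}\sum y_i + n\bar{x}\bar{y}
   = \sum x_i y_i - n\bar{x}\bar{y}, \\[4pt]
\sum (x_i-\bar{x})^2 &= \sum x_i^2 - n\bar{x}^2,
\qquad
\sum (y_i-\bar{y})^2 = \sum y_i^2 - n\bar{y}^2 .
\end{aligned}
```

The second line uses $\sum x_i = n\bar{x}$ and $\sum y_i = n\bar{y}$. Substituting the last two lines into the first gives $r$ in terms of $\sum x_i y_i$, $\sum x_i^2$, $\sum y_i^2$ and the two sample means, which is the form that is easiest to evaluate from a table of sums.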
1. The sample correlation coefficient ranges from $-1$ to $+1$, with
a) $r = +1$ indicating a perfect positive linear relationship between X and Y;
b) $r = 0$ indicating no linear relationship between X and Y;
c) $r = -1$ indicating a perfect decreasing linear relationship between X and Y.
2. Positive correlations indicate a positive, or increasing, linear relationship. Values closer to +1 indicate data points closer to a straight line, and values closer to 0 indicate greater deviations from a straight line.
3. Negative correlations indicate a negative, or decreasing, linear relationship. Values closer to -1 indicate data points closer to a straight line, and values closer to 0 indicate greater deviations from a straight line.
If $r = +1$, the correlation is said to be a perfect positive linear correlation. In this case, all points in the scatter diagram lie on a straight line that slopes upward from left to right. If $r = -1$, the correlation is said to be a perfect negative linear correlation. In this case, all points in the scatter diagram lie on a straight line that slopes downward from left to right.
If the correlation between two variables is positive and close to 1, we say that the variables have a strong positive linear correlation. If the correlation is positive but close to 0, the variables have a weak positive linear correlation. On the other hand, if the correlation is negative and close to -1, the variables are said to have a strong negative linear correlation. If the correlation is negative and close to 0, there is a weak negative linear correlation between the variables.
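The limiting cases above can be checked numerically. The following is a minimal Python sketch: the helper `sample_correlation` and the data are hypothetical, and the third data set is added only to illustrate that a value of 0 rules out a linear relationship, not every relationship.

```python
import math


def sample_correlation(x, y):
    """Sample correlation from sums of squared and cross deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: x is symmetric around 0.
x = [-2, -1, 0, 1, 2]

r_up = sample_correlation(x, [3 + 2 * xi for xi in x])    # points on a rising line
r_down = sample_correlation(x, [3 - 2 * xi for xi in x])  # points on a falling line
r_curve = sample_correlation(x, [xi ** 2 for xi in x])    # points on a parabola

print(r_up, r_down, r_curve)
```

In the third case Y is completely determined by X, yet the correlation is zero because the relationship is not linear.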
Example:
An economist is interested in the relationship between food expenditure and income. Calculate the sample correlation coefficient for the data recorded on monthly incomes and food expenditure of seven households.
Household | |||||||
Income (100’s of $) | |||||||
Food expenditure (100’s of $) |
Solution: The sample means are
;
The sample correlation coefficient can be calculated by either (3.1) or (3.2). Here it is more convenient to use (3.2).
The necessary calculations for the data are set out in Table 3.1.
Table 3.1
Household | Income | Food expenditure ( | |||
Sums |
Hence, the sample correlation is $r \approx 0.96$.
The sample correlation, 0.96, indicates a very strong positive linear relationship between monthly income and food expenditure. High values of monthly income tend to be associated with high values of food expenditure.
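The table-based calculation can be sketched in Python as follows. The seven income and food-expenditure pairs are illustrative values assumed for this sketch, not figures taken from the tables in this section; they happen to give a correlation of about 0.96. The formula is the sums-based form of $r$, built from $\sum x_i y_i$, $\sum x_i^2$, $\sum y_i^2$ and the two sample means.

```python
import math

# Illustrative values assumed for this sketch (hundreds of dollars per month).
income = [35, 49, 21, 39, 15, 28, 25]  # x
food = [9, 15, 7, 11, 5, 8, 9]         # y

n = len(income)
x_bar = sum(income) / n
y_bar = sum(food) / n

# Column sums of the kind collected in a working table.
sum_xy = sum(x * y for x, y in zip(income, food))
sum_x2 = sum(x * x for x in income)
sum_y2 = sum(y * y for y in food)

# Sums-based form of the sample correlation coefficient.
numerator = sum_xy - n * x_bar * y_bar
denominator = math.sqrt((sum_x2 - n * x_bar ** 2) * (sum_y2 - n * y_bar ** 2))
r = numerator / denominator

print(round(r, 2))  # 0.96 for the assumed values
```

Working from the column sums avoids computing each deviation from the mean separately, which is why this form suits hand calculation.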
Date added: 2015-08-05