Understanding the Factors Affecting the Unemployment Rate Through Regression Analysis

An Individual Report Presented to the Faculty of the Economics Department
In Partial Fulfillment of the Requirements for ECONMET C31

Submitted to: Dr. Cesar Rufino
Submitted by: Aaron John Dee, 10933557
April 8, 2011

TABLE OF CONTENTS

I. INTRODUCTION
   A. Background of the Study
   B. Statement of the Problem
   C. Objective
II. THEORETICAL FRAMEWORK AND RELATED LITERATURE
   A. GDP
   B. Average Years in School
   C. Population
   D. Literacy Rate
III. OPERATIONAL FRAMEWORK
   A. Model Specification
   B. List and Description of Variables
   C. A-priori Expectations
IV. METHODOLOGY
V. EMPIRICAL RESULTS AND INTERPRETATIONS
   A. Regression of the Original Model
   B. Summary Statistics
   C. Testing for Misspecification in the Model
   D. Testing for Multicollinearity
   E. Testing for Heteroscedasticity
VI. CONCLUSION
VII. BIBLIOGRAPHY

I. INTRODUCTION

A. Background of the Study

When we were still kids, we dreamed of what we wanted to be in the future. Older people would usually ask us what we wanted to be when we grew up, and most of us would say a doctor, a lawyer or an engineer, to name a few.
We think and think about our careers, and once we reach college we dream of becoming successful in life and having a stable job. But with the rate of unemployment in the country continuing to increase, there is no guarantee that we will have a job immediately after we graduate. Unfortunately, many still fail to find stable jobs. Some cannot find a job even though they graduated from top schools, and they find themselves in the pool of the unemployed.

Unemployment is indeed a very important issue all over the world. People are getting laid off, some cannot find a job, and the number is increasing. The government wants to achieve full employment, but we all know that it will never happen, simply because there are millions of people in the country and neither the government nor the private sector can provide jobs for that huge number of laborers. The government cannot just expand and increase total output to provide job opportunities for the unemployed, because doing so also has negative impacts on the economy.

I am aware that our country is suffering from a high unemployment rate, because some workers are employed only on a contractual basis. They can work, but usually only for 6 months and with no benefits included. After the 6 months, they find themselves unemployed again, and they will have a difficult time, especially if they did not finish schooling. Companies nowadays are more sophisticated and competitive; they do not hire just any college graduate, even one from a top school. Having a master's degree will surely help in finding a job, for companies look only for the best. Some people take work that is unrelated to their training. In the past year, for example, people have worked as call center agents even though their college degree is not in mass communications or anything connected with call center work. They do this because they do not want to be unemployed and idle for an extended period of time.

B. Statement of the Problem

Unemployment is a very important issue, not just in our country but in the rest of the world. This paper seeks to answer whether the literacy rate, average years in school, GDP and total population have a relationship with total unemployment. Can these exogenous variables explain the unemployment that is happening all over the world?

C. Objective

The objectives of this paper are to (1) find out what the determinants of unemployment are.
For this study, the literacy rate, average years in school, GDP and total population are considered as determinants of unemployment; (2) create an econometric model that explains unemployment; and (3) give readers an idea of what should be done to alleviate unemployment.

II. REVIEW OF RELATED LITERATURE

A. GDP

Gross domestic product, or GDP, is considered an indicator of the standard of living in a country: the higher the GDP, the higher the country's standard of living, and the lower the GDP, the lower the standard of living. According to Abuqamar, Coomans, and Louckx (2011), unemployment, like GDP per capita, is an important factor in measuring a country's economic strength. If the unemployment level is high, economic growth is very low, because the two have a negative relationship. Sustainable growth accompanied by macroeconomic policies that promote employment will eventually cut down the level of unemployment in the economy, and growth is considered a solution for decreasing unemployment (Hussain, Siddiqi, & Iqbal, 2010). This holds because when the government wants to increase output by building infrastructure and the like, it creates job opportunities for the unemployed, thus alleviating unemployment in the economy. More people will get jobs and earn enough to sustain, or even raise, their standard of living, depending on their salaries.

B. Average Years in School

Education is very important in everyone's life. It is our foundation of knowledge and it reflects on us. Even though going to school and doing homework can be boring, we still benefit because we learn, and by learning we become mature and responsible.
According to Weisberg and Meltz (n.d.), the higher a person's level of education or years in school, the lower the unemployment rate. This makes sense: educated people are more likely to have decent jobs, and they can even create their own firms or businesses, thus promoting employment.

C. Population

The population of a country is always increasing, and that is inevitable. Population is also a determinant of unemployment. Based on the research paper of Rafiq, Iftikhar, Asmat, and Zahoor (n.d.) entitled Determinants of Unemployment: A Case Study of Pakistan Economy (1998-2008), population growth has an adverse effect on unemployment. The results of their tests show that when the population increases, unemployment also increases, which is bad for every economy. Rapid population growth puts pressure on employment: when many people do not have jobs, unemployment increases further.

Moen (1999) argues that in the competition for jobs, workers will prefer to attain higher degrees so that they have an edge over other workers. As people increase their educational attainment, the rate of unemployment will decrease (Nickell, 1979; Moen, 1999).

D. Literacy Rate

Literacy is important, just like education. People must be literate in order to fit the norm. According to the article Literacy and Unemployment, people who are illiterate are at a disadvantage because they cannot read and write, and thus they are more likely to be unemployed. The article also states that once people become part of the unemployment cycle, it is difficult for them to break out of it, and after a long period of unemployment they feel discouraged and lose self-confidence.

III. OPERATIONAL FRAMEWORK
A. Model Specification

$$\mathit{totunem} = \beta_1 + \beta_2\,\mathit{litrate} + \beta_3\,\mathit{yearinsch} + \beta_4\,\mathit{gdp} + \beta_5\,\mathit{totpop} + \varepsilon$$

B. List and Description of Variables

Before proceeding to the a-priori expectations for each exogenous variable and the discussion of the results, we must first describe the components of the model. The model comprises the exogenous variables and the endogenous variable. The exogenous, or independent, variables are not affected or determined by any other variable in the model, unlike the endogenous variable, which depends on the exogenous variables.
Table 1 gives a brief description of the variables used in the model.

Table 1. Names of Variables Used and Descriptions

Variable     Description
totunem      This quantitative variable pertains to the total unemployment rate of each country in the sample for the year 2000.
litrate      This quantitative variable pertains to the literacy rate of each country in the sample for the year 2000.
yearinsch    This quantitative variable pertains to the average years in school of adults aged 15 and up in each country in the sample for the year 2000.
gdp          This quantitative variable pertains to the gross domestic product of each country in the sample for the year 2000.
totpop       This quantitative variable pertains to the total population of each country in the sample for the year 2000.

C. A-priori Expectations

The a-priori expectations capture the effect of an increase in each exogenous variable on the endogenous variable, which in our model is totunem. The a-priori expectations are taken from the review of related literature above.
Note, however, that the a-priori expectations do not cover the magnitude of each relationship; they only indicate its direction. A positive sign means that the exogenous variable is expected to have a favorable effect on the endogenous variable, that is, an increase in it is expected to lower totunem; a negative sign implies the opposite. A "+" in Table 2 therefore corresponds to an expected negative coefficient in the regression. The magnitudes will be discussed later on. Table 2 shows the variables, their signs and the intuition behind them.

Table 2. Variables, Signs and Intuition
Endogenous variable: totunem

Variable     Sign    Intuition
litrate      +       Literacy is very important to everyone because it is a social norm. It therefore has a favorable effect on unemployment: when literacy increases, it implies that people have attended school and learned, so companies will hire them and the unemployment rate will decrease.
yearinsch    +/-     An increase in yearinsch does not necessarily mean that every level was finished successfully; years in school can also increase because a student keeps failing. If the increase in average years in school has a positive effect, people will be able to work or to create businesses that give job opportunities to the unemployed. But if it has a negative effect, it implies that people did not learn, and they will have a hard time looking for a job because companies only accept people who performed well in school.
gdp          +       An increase in GDP will promote employment: when the government expands output by building infrastructure, it gives job opportunities to the unemployed, thus alleviating unemployment.
totpop       -       An increase in total population will have an adverse effect on unemployment. When the total population increases, more people demand jobs, which puts pressure on employment, and if the government cannot supply the growing population with jobs, the unemployment rate will increase severely.

IV. METHODOLOGY

Cross-sectional data covering 65 countries from around the world for the year 2000 were used in the study. All of the data were obtained from World Bank data sets. The researcher uses the software program Gretl to estimate the model. With this software, the coefficients of the exogenous variables are obtained.
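To make the estimation step more concrete, the following is a minimal sketch of an ordinary least squares fit of a linear model with an intercept, written in plain Python. It is only an illustration under stated assumptions: the data are hypothetical, the function names `solve_linear_system` and `ols` are made up for this example, and it is not how Gretl itself is implemented.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(rows, y):
    """Return (coefficients, r_squared) for y regressed on a constant and rows."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)  # normal equations: X'X b = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    y_bar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
    return beta, 1.0 - ssr / sst


# Hypothetical data (not from the study): two regressors, six observations.
regressors = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0), (6.0, 5.0)]
outcome = [3.1, 4.9, 6.2, 8.1, 8.8, 11.0]
coefficients, r_squared = ols(regressors, outcome)
print(coefficients, r_squared)
```

The first element of `coefficients` is the intercept and the remaining elements are the slopes, in the order of the regressor columns.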
A lin-lin model is used in this study, estimated by Ordinary Least Squares (OLS). The regression produces several outputs, such as the coefficients, standard errors, p-values and R-squared, to mention a few. The model is then subjected to various tests to check for violations of the classical linear regression model (CLRM) assumptions, namely multicollinearity and heteroscedasticity. Autocorrelation is not a concern in this model because the study uses cross-sectional data. To test for multicollinearity, auxiliary regressions and the Variance Inflation Factor (VIF) will be used.
To check for the presence of heteroscedasticity, both the Breusch-Pagan test and White's test will be used. If the tests indicate multicollinearity or heteroscedasticity, corrective measures should be applied to the model. Ramsey's RESET is used to test for misspecification errors in the model. The interpretation of the results is given after each test.

V. EMPIRICAL RESULTS AND INTERPRETATIONS

A. Regression of the Original Model

The regression results shown below were obtained using the Ordinary Least Squares (OLS) method.
  Model 1: OLS, using observations 1-65 (n = 11)
  Missing or incomplete observations dropped: 54
  Dependent variable: totunem

                 coefficient      std. error      t-ratio   p-value
    ----------------------------------------------------------------
    const        14.6143          6.02794          2.424    0.0516   *
    litrate      -0.344479        0.129558        -2.659    0.0376   **
    yearinsch     3.48303         1.04882          3.321    0.0160   **
    gpd          -1.34898e-011    5.94827e-012    -2.268    0.0639   *
    totpop        1.08535e-08     5.83976e-09      1.859    0.1124

  Mean dependent var    6.200000    S.D. dependent var    3.965098
  Sum squared resid     40.74309    S.E. of regression    2.605862
  R-squared             0.740853    Adjusted R-squared    0.568088
  F(4, 6)               4.288221    P-value(F)            0.056084
  Log-likelihood       -22.80997    Akaike criterion      55.61995
  Schwarz criterion     57.60942    Hannan-Quinn          54.36586

  Excluding the constant, p-value was highest for variable 5 (totpop)

The results from the initial regression are shown above. Several values need to be examined in interpreting them, starting with the coefficients and the p-values. Since our model is linear, each coefficient gives the effect of a unit increase in an exogenous variable on the endogenous variable, holding the other variables constant: a unit increase in an independent variable increases or decreases the dependent variable by the value of its coefficient. The p-value shows the individual significance of each exogenous variable. For an exogenous variable to be considered significant, its p-value should be less than or equal to the significance level of 0.05, which corresponds to a 95% confidence level. The next thing to look at is the R-squared, or goodness of fit. It tells us what percentage of the variation in the endogenous variable is explained by the exogenous variables.
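As a quick consistency check on the Model 1 output above, several of the reported statistics can be recomputed from the others. The sketch below copies figures from that output and applies the standard textbook formulas for the t-ratio, adjusted R-squared, overall F statistic and standard error of the regression. The variable names are illustrative only, and only two of the coefficients are included to keep it short.

```python
import math

# Figures copied from the Model 1 output.
n_obs, n_params = 11, 5          # observations used; coefficients incl. constant
r_squared = 0.740853
sum_squared_resid = 40.74309
estimates = {                    # name: (coefficient, standard error)
    "litrate": (-0.344479, 0.129558),
    "yearinsch": (3.48303, 1.04882),
}

df_resid = n_obs - n_params      # residual degrees of freedom (6)

# t-ratio = coefficient / standard error
t_ratios = {name: coef / se for name, (coef, se) in estimates.items()}

# Adjusted R-squared penalises R-squared for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

# Overall F statistic for the joint significance of the 4 slope coefficients.
f_stat = (r_squared / (n_params - 1)) / ((1 - r_squared) / df_resid)

# Standard error of the regression.
se_regression = math.sqrt(sum_squared_resid / df_resid)

for name, t in t_ratios.items():
    print(f"t-ratio for {name}: {t:.3f}")
print(f"adjusted R-squared: {adj_r_squared:.6f}")
print(f"F({n_params - 1}, {df_resid}): {f_stat:.6f}")
print(f"S.E. of regression: {se_regression:.6f}")
```

With only 11 usable observations and 5 estimated coefficients, 6 residual degrees of freedom remain, which is why the adjusted R-squared falls well below the unadjusted value.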
The value of R-squared is multiplied by 100 to express it as a percentage. Interpreting the model above, a unit increase in litrate decreases totunem by 0.344479, and a unit increase in yearinsch increases totunem by 3.48303. These two variables are significant in our model, with p-values of 0.0376 and 0.0160 respectively. The other two variables, gdp and totpop, are insignificant at the 0.05 level, with p-values of 0.0639 and 0.1124 respectively. With a unit increase in gdp, totunem decreases by 1.34898e-011, and with a unit increase in totpop, it increases by 1.08535e-08. We now look at the R-squared of the model: as shown in the results above, it has a value of 0.740853, or 74.09%. This implies that 74.09% of the variation in the endogenous variable is explained by the exogenous variables. Note that these results and interpretations are reliable only if our model is free from any violation. These violations are discussed later on, and corrective measures will be applied if necessary.

B. Summary Statistics

  Summary statistics, using the observations 1 - 65
  (missing values were skipped)

                 Mean           Median         Minimum        Maximum
    litrate      74.787         79.555         25.654         99.767
    yearinsch    6.9008         6.8000         0.83900        12.049
    gpd          3.3822e+011    3.7718e+010    2.1546e+008    9.8988e+012
    totpop       4.5753e+007    1.0467e+007    7.8661e+005    1.2626e+009

                 Std. Dev.      C.V.           Skewness       Ex. kurtosis
    litrate      20.570         0.27505        -0.66121       -0.67087
    yearinsch    2.8389         0.41138        -0.080552      -0.86711
    gpd          1.2582e+012    3.7202         6.9480         49.954
    totpop       1.5949e+008    3.4859         7.0442         50.864

The summary statistics describe the variables in our model. The mean, variance, skewness and kurtosis are the four moments of a random variable.
To discuss these further, the mean measures central tendency; it is the sum of all the observed values divided by the total number of observations, or the average. The variance measures how spread out, or dispersed, the observations are around the mean. A large variance implies that the observations are widely scattered around the mean, while a small variance implies that they lie close to it. A dataset is negatively skewed if its mean is less than its median. In that case most of the observations lie among the higher values, with a longer tail toward the lower ones.
A positively skewed dataset is the other way around.

C. Testing for Misspecification in the Model

  RESET test for specification (squares and cubes)
  Test statistic: F = 0.727289,
  with p-value = P(F(2,4) > 0.727289) = 0.538

  RESET test for specification (cubes only)
  Test statistic: F = 0.874685,
  with p-value = P(F(1,5) > 0.874685) = 0.393

  RESET test for specification (squares only)
  Test statistic: F = 0.664374,
  with p-value = P(F(1,5) > 0.664374) = 0.452

Misspecification occurs when important variables are omitted. If the model is not correctly specified, the estimators will be biased and inconsistent, and the error variance will not be estimated correctly. Because of these misspecification errors, the statistical significance of the variables may lead to misleading conclusions. To check whether our model is correctly specified, we run Ramsey's RESET test, a general test for misspecification error. The null hypothesis is Ho: there is no misspecification, and the alternative hypothesis is Ha: there is misspecification. To interpret the results above, we look at the p-values of the three versions of the test. All of the p-values are greater than the significance level of 0.05, so we have no reason to reject the null hypothesis of no misspecification error. On this evidence, the model does not appear to be misspecified.

D. Testing for Multicollinearity

Multicollinearity exists when the independent variables are related to one another (Gujarati and Porter, 2009). It means that there is a linear relationship among the independent variables.
This is one of the violations of the classical linear regression model, and it is usually present in multiple regressions. Gujarati and Porter (2009) also point out that even in the presence of multicollinearity, the OLS estimators are still BLUE. With multicollinearity, however, the standard errors of the coefficients become larger than they would otherwise be, so it is difficult to estimate the coefficients precisely. To know whether the model exhibits multicollinearity, the Variance Inflation Factor (VIF) of each independent variable must be examined. If the VIF of an independent variable exceeds 10, multicollinearity exists among the exogenous variables and corrective measures are taken to eliminate it.

  Variance Inflation Factors
  Minimum possible value = 1.0
  Values > 10.0 may indicate a collinearity problem

    litrate      5.011
    yearinsch    4.724
    gpd          4.890
    totpop       4.480

  VIF(j) = 1/(1 - R(j)^2), where R(j) is the multiple correlation coefficient
  between variable j and the other independent variables

  Properties of matrix X'X:

    1-norm = 1.8146616e+024
    Determinant = 3.3597218e+046
    Reciprocal condition number = 9.335124e-026

To interpret the results above, we look at the VIF of each exogenous variable. If the VIFs are less than 10, multicollinearity is tolerable and no corrective measures need to be applied. If a VIF is greater than 10, severe multicollinearity exists and the necessary correction should be made. As seen in the results above, the VIFs of all the exogenous variables are less than 10, which implies that multicollinearity in the model is tolerable and does not require any corrective action.
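The VIF formula printed in the output, VIF(j) = 1/(1 - R(j)^2), can be inverted to see how strongly each regressor is explained by the others. The sketch below uses the four VIF values reported above; the helper names `vif_from_r2` and `r2_from_vif` are illustrative only and are not part of Gretl.

```python
# VIF values copied from the Gretl output.
vif = {"litrate": 5.011, "yearinsch": 4.724, "gpd": 4.890, "totpop": 4.480}


def vif_from_r2(r2):
    """VIF(j) = 1 / (1 - R(j)^2), for an auxiliary R-squared below 1."""
    return 1.0 / (1.0 - r2)


def r2_from_vif(v):
    """Invert the VIF formula to recover the auxiliary R-squared."""
    return 1.0 - 1.0 / v


implied_r2 = {name: r2_from_vif(v) for name, v in vif.items()}
needs_correction = [name for name, v in vif.items() if v > 10.0]

for name, r2 in implied_r2.items():
    print(f"{name}: implied auxiliary R-squared = {r2:.3f}")
print(f"A VIF of 10 corresponds to an auxiliary R-squared of {r2_from_vif(10.0):.2f}")
print("Variables above the VIF threshold:", needs_correction)
```

Read this way, the rule of thumb of 10 flags a regressor only when about 90% of its variation is explained by the other regressors. The VIFs here imply auxiliary R-squared values of roughly 0.78 to 0.80, which is below that threshold.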
E. Testing for Heteroscedasticity

Heteroscedasticity is also a violation of the classical linear regression model (CLRM) that is commonly present in panel and cross-sectional data sets. It violates the assumption that the error term has a constant variance across observations. If one continues with the usual testing procedures even though heteroscedasticity is present, whatever conclusions one draws from the results may be misleading (Gujarati and Porter, 2009). To know whether our model exhibits heteroscedasticity, we perform the Breusch-Pagan test and White's test.
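The arithmetic behind both test statistics is short, and it can be sketched using figures taken from the Gretl output reported in this section. The sketch assumes the usual conventions, which the reported numbers are consistent with: the Breusch-Pagan LM statistic is half the explained sum of squares of the auxiliary regression, and White's statistic is the number of observations times the auxiliary R-squared. The helper `chi2_upper_tail` is a hypothetical name, and the closed form it uses is valid for even degrees of freedom only.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square(df) > x); this closed form holds for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Breusch-Pagan: LM = ESS / 2, with df = number of regressors (4).
ess = 2.80998
bp_lm = ess / 2.0
bp_p = chi2_upper_tail(bp_lm, 4)

# White: T * R-squared, with df = regressors in the auxiliary regression (8).
n_obs, aux_r2 = 11, 0.470293
white_stat = n_obs * aux_r2
white_p = chi2_upper_tail(white_stat, 8)

for name, stat, p in [("Breusch-Pagan", bp_lm, bp_p), ("White", white_stat, white_p)]:
    verdict = "reject" if p < 0.05 else "fail to reject"
    print(f"{name}: statistic = {stat:.4f}, p-value = {p:.4f} -> {verdict} Ho")
```

A general-purpose chi-square routine from a statistics library would serve equally well; the closed form is used here only to keep the sketch free of dependencies.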
Let us look first at the result of the Breusch-Pagan test for heteroscedasticity.

  Breusch-Pagan test for heteroskedasticity
  OLS, using observations 1-65 (n = 11)
  Missing or incomplete observations dropped: 54
  Dependent variable: scaled uhat^2

                 coefficient      std. error      t-ratio    p-value
    -----------------------------------------------------------------
    const         0.353903        2.83863          0.1247    0.9049
    litrate       0.0400827       0.0610102        0.6570    0.5356
    yearinsch    -0.394681        0.493903        -0.7991    0.4547
    gpd          -1.46506e-012    2.80111e-012    -0.5230    0.6197
    totpop        2.07008e-010    2.75001e-09      0.07528   0.9424

  Explained sum of squares = 2.80998

  Test statistic: LM = 1.404991,
  with p-value = P(Chi-square(4) > 1.404991) = 0.843327

Ho: Constant variance vs. Ha: Heteroscedasticity exists

As we can see from the results above, the p-value is 0.843327, which is greater than 0.05. Thus we fail to reject the null hypothesis that our model exhibits constant variance. Let us also use White's test for heteroscedasticity to check whether it gives the same result as the Breusch-Pagan test.
  White's test for heteroskedasticity
  OLS, using observations 1-65 (n = 11)
  Missing or incomplete observations dropped: 54
  Dependent variable: uhat^2

                    coefficient      std. error      t-ratio    p-value
    --------------------------------------------------------------------
    const           -111.711         169.951         -0.6573    0.5785
    litrate          3.22957         5.32033          0.6070    0.6056
    yearinsch        0.271900        18.2904          0.01487   0.9895
    gpd              2.20028e-011    9.24076e-011     0.2381    0.8340
    totpop          -7.59484e-09     1.24364e-07     -0.06107   0.9569
    sq_litrate      -0.0208996       0.0331887       -0.6297    0.5932
    sq_yearinsch    -0.142336        1.28197         -0.1110    0.9217
    sq_gpd           0.000000        0.000000        -0.437     0.7639
    sq_totpop        0.000000        0.000000         0.2166    0.8486

  Unadjusted R-squared = 0.470293

  Test statistic: TR^2 = 5.173227,
  with p-value = P(Chi-square(8) > 5.173227) = 0.738911

Ho: Constant variance vs. Ha: Heteroscedasticity exists

The results from White's test lead to the same conclusion as the Breusch-Pagan test. The p-value here is 0.738911, which is greater than 0.05. Since both tests have p-values greater than 0.05, neither is significant, and we fail to reject the null hypothesis that the model exhibits constant variance. There is no evidence of heteroscedasticity in the model.

VI. CONCLUSION

Based on the regression results, we can conclude that all of the exogenous variables except yearinsch match our a-priori expectations. The regression shows that litrate is significant at the 0.05 level and gdp is significant only at the 0.10 level, so we can say that they are factors in determining unemployment. The variable totpop is insignificant, because an increase in population does not mean that the additional people are immediately available to work.
But the review of related literature indicates that when the population increases, the unemployment rate will eventually increase as well. As for yearinsch, it is the most significant of the four exogenous variables. Its coefficient is positive, so the regression captures the unfavorable effect: more years in school are associated with higher unemployment. As said in the a-priori expectations, years of schooling may increase because of poor performance in school, with students repeating levels again and again. This increases the years of schooling, but it implies a negative effect: people end up unemployed because they are not doing well in school.

The government plays an important role in maintaining a low level of unemployment. It will not be able to achieve its goal of full employment, but it can provide job opportunities to alleviate unemployment. People should also do their part to avoid unemployment by performing well in school and aiming for a higher level of education.

VII. BIBLIOGRAPHY

Abuqamar, M., Coomans, D., & Louckx, F. (2011, January). Correlation between socioeconomic differences and infant mortality in the Arab World (1990-2009). International Journal of Sociology and Anthropology, 3(1), 15-21.

Gujarati, D. N., & Porter, D. C. (2009). Basic Econometrics (5th ed.). New York: McGraw-Hill/Irwin.

Hussain, T., Siddiqi, M., & Iqbal, A. (2010). A Coherent Relationship between Economic Growth and Unemployment: An Empirical Evidence from Pakistan. International Journal of Human and Social Sciences, 332-339.

Literacy Fact Sheet. (n.d.). Retrieved from Northwest Territories Literacy Council: http://www.nwt.literacy.a/litfacts/LiteracyandUnemployment.pdf

Rafiq, M., Iftikhar, A., Asmat, U., & Zahoor, K. (n.d.). Determinants of Unemployment: A Case Study of Pakistan Economy (1998-2008). Abasyn Journal of Social Sciences, 3(1), 17-24.

The effects of education on the natural rate of unemployment. (2008, April 1). Retrieved April 7, 2011, from Goliath: Business knowledge on demand: http://goliath.ecnext.com/coms2/gi_0199-8128098/The-effects-of-education-on.html

Weisberg, Y., & Meltz, N. M. (n.d.). Education and Unemployment in Israel, 1976-1994: Reducing the Anomaly.