Professor Mumford
mumford@purdue.edu
Econ 360 - Fall 2012
Problem Set 1 Answers

True/False (30 points)

1. FALSE If $(a_i, b_i) : i = 1, 2, \ldots, n$ and $(x_i, y_i) : i = 1, 2, \ldots, n$ are sets of $n$ pairs of numbers, then:
\[ \sum_{i=1}^{n} (a_i x_i + b_i y_i) = \sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n} b_i y_i \]

2. FALSE If $x_i : i = 1, 2, \ldots, n$ is a set of $n$ numbers, then:
\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \]
where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

3. TRUE If $x_i : i = 1, 2, \ldots, n$ is a set of $n$ numbers and $a$ is a constant, then:
\[ \sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i = a n \bar{x} \]
where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

4. FALSE If $X$ and $Y$ are independent random variables then:
\[ E(Y \mid X) = E(Y) \]

5. TRUE If $\{a_1, a_2, \ldots, a_n\}$ are constants and $\{X_1, X_2, \ldots, X_n\}$ are random variables then:
\[ E\left( \sum_{i=1}^{n} a_i X_i \right) = \sum_{i=1}^{n} a_i E(X_i) \]

6. FALSE For a random variable $X$, let $\mu = E(X)$. The variance of $X$ can be expressed as:
\[ \mathrm{Var}(X) = E(X^2) - \mu^2 \]

7. TRUE For random variables $Y$ and $X$, the variance of $Y$ conditional on $X = x$ is given by:
\[ \mathrm{Var}(Y \mid X = x) = E(Y^2 \mid x) - [E(Y \mid x)]^2 \]

8. TRUE An estimator, $W$, of $\theta$ is an unbiased estimator if $E(W) = \theta$ for all possible values of $\theta$.

9.
FALSE The central limit theorem states that the average from a random sample for any population (with finite variance), when it is standardized by subtracting the mean and then dividing by the standard deviation, has an asymptotic standard normal distribution.

10. TRUE The law of large numbers states that if $X_1, X_2, \ldots, X_n$ are independent, identically distributed random variables with mean $\mu$, then
\[ \operatorname{plim} \bar{X}_n = \mu \]

Multiple Choice Questions (20 points)

11. The idea of holding "all else equal" is known as
(a) ceteris paribus (b) correlation (c) causal effect (d) independence

12.
If our dataset has one observation for every state for the year 2000, then our dataset is
(a) cross-sectional data (b) pooled cross-sectional data (c) time series data (d) panel data

13. If our dataset has one observation for every state for the year 2000 and another observation for each state in 2005, then our dataset is
(a) cross-sectional data (b) pooled cross-sectional data (c) time series data (d) panel data

14. If our dataset has one observation for the state of Indiana each year from 1950-2005, then our dataset is
(a) cross-sectional data (b) pooled cross-sectional data (c) time series data (d) panel data

15.
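Consider the function $f(X, Y) = (aX + bY)^2$. What is $\frac{\partial f(X, Y)}{\partial X}$?
(a) $2aX$ (b) $a(aX + bY)$ (c) $2a(aX + bY)$ (d) $a^2 X$

Long Answer Questions (50 points)

16. The sum of squared deviations (subtracting the average value of $x$ from each observation on $x$) is the sum of the squared $x_i$ minus $n$ times the square of $\bar{x}$. There are several ways to show this; here is one:
\[
\begin{aligned}
\sum_{i=1}^{n} x_i (x_i - \bar{x}) &= \sum_{i=1}^{n} (x_i - \bar{x} + \bar{x})(x_i - \bar{x}) \\
&= \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x}) + \sum_{i=1}^{n} \bar{x} (x_i - \bar{x}) \\
&= \sum_{i=1}^{n} (x_i - \bar{x})^2 + \bar{x} \sum_{i=1}^{n} (x_i - \bar{x})
\end{aligned}
\]
and we know that $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$, so
\[ \sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
Expanding the left-hand side directly gives $\sum_{i=1}^{n} x_i^2 - \bar{x} \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$, which is the stated result.

17. There are several ways to show that this expression equals the sample covariance between $x$ and $y$; here is one:
\[
\begin{aligned}
\sum_{i=1}^{n} x_i (y_i - \bar{y}) &= \sum_{i=1}^{n} (x_i - \bar{x} + \bar{x})(y_i - \bar{y}) \\
&= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) + \bar{x} \sum_{i=1}^{n} (y_i - \bar{y}) \\
&= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
\end{aligned}
\]

18. Correlation and causation are not always the same thing.
(a) A negative correlation means that larger class size is associated with lower test performance. This could be because the relationship is causal, meaning that having a larger class size actually hurts student performance.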
However, there are other reasons we might find a negative relationship. For example, children from more affluent families might be more likely to attend schools with smaller class sizes, and affluent children generally score better on standardized tests. Another possibility is that within a school, a principal might assign the better students to smaller classes. Or, some parents might insist that their children are in the smaller classes, and these same parents tend to be more involved in their children's education. Given the potential for confounding factors such as these, finding a negative correlation between class size and test scores is not strong evidence that smaller class sizes actually lead to better performance. Thus, without other information, we cannot draw meaningful economic conclusions. A correct answer should explain that we should be careful about drawing economic conclusions from simple correlations.
(b) The sample correlation between $N$ and $T$ is defined as:
\[ r_{NT} = \frac{s_{NT}}{s_N s_T} \]
where the sample covariance, $s_{NT}$, is given by:
\[ s_{NT} = \frac{1}{999} \sum_{i=1}^{1000} (N_i - \bar{N})(T_i - \bar{T}) \]
and the sample standard deviations are given by:
\[ s_N = \sqrt{\frac{1}{999} \sum_{i=1}^{1000} (N_i - \bar{N})^2} \qquad s_T = \sqrt{\frac{1}{999} \sum_{i=1}^{1000} (T_i - \bar{T})^2} \]
Note that there are several alternative ways to write this, and statistical programs generally use other algorithms to calculate the correlation that are less prone to loss of precision due to roundoff error or storage overflow.

19. Wage data
(a) There are 526 observations.
(b) There are 274 men in the sample. This means that the sample is 52.09 percent male.
(c) The average level of education in the sample is 12.6 years. The median level of education is 12 years.
(d) The highest education level in the sample is 18 years of school. Nine people in the sample report having 18 years of education.
(e) The average hourly wage in the sample is $5.90. The median hourly wage in the sample is $4.65.

20. Fertility data
(a) There are 363 women in the sample.
(b) The average number of children ever born to a woman in the sample is 2.3. The median number is 2.
(c) The largest number of children ever born to a woman in the sample is 7. Six women report having seven children.
(d) 25 percent of the sample lived in the eastern United States at age 16.
(e) The average level of education in the sample is 13.2 years.
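
A numerical check can make the algebra in questions 16 and 17 and the correlation formula in 18(b) concrete. The following is a minimal sketch in Python using only the standard library. The data are simulated for illustration and are not the course data; the function names, the seed, and the made-up relationship between class size and test scores are all assumptions.

```python
import math
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sum_sq_dev(x):
    """Sum of squared deviations of x from its mean."""
    xbar = mean(x)
    return sum((xi - xbar) ** 2 for xi in x)


def sum_cross_dev(x, y):
    """Sum of cross-products of deviations of x and y from their means."""
    xbar, ybar = mean(x), mean(y)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))


def sample_corr(x, y):
    """Sample correlation r = s_xy / (s_x * s_y), using n - 1 divisors."""
    n = len(x)
    s_xy = sum_cross_dev(x, y) / (n - 1)
    s_x = math.sqrt(sum_sq_dev(x) / (n - 1))
    s_y = math.sqrt(sum_sq_dev(y) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical data: 1000 class sizes N and test scores T (assumed, not real).
random.seed(0)
N = [random.uniform(15, 35) for _ in range(1000)]
T = [80 - 0.5 * n_i + random.gauss(0, 5) for n_i in N]

n = len(N)
N_bar, T_bar = mean(N), mean(T)

# Question 16: sum x_i (x_i - xbar) = sum (x_i - xbar)^2 = sum x_i^2 - n xbar^2
lhs16 = sum(n_i * (n_i - N_bar) for n_i in N)
alt16 = sum(n_i ** 2 for n_i in N) - n * N_bar ** 2
print(math.isclose(lhs16, sum_sq_dev(N), rel_tol=1e-9))
print(math.isclose(alt16, sum_sq_dev(N), rel_tol=1e-9))

# Question 17: sum x_i (y_i - ybar) = sum (x_i - xbar)(y_i - ybar)
lhs17 = sum(n_i * (t_i - T_bar) for n_i, t_i in zip(N, T))
print(math.isclose(lhs17, sum_cross_dev(N, T), rel_tol=1e-9))

# Question 18(b): sample correlation between N and T
r_NT = sample_corr(N, T)
print(r_NT)
```

The comparisons use `math.isclose` rather than exact equality because the two sides of each identity are computed in different orders and can differ by floating-point roundoff, which is the same precision issue noted in 18(b).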