CHAPTER 2

TEACHING NOTES

This is the chapter where I expect students to follow most, if not all, of the algebraic derivations. In class I like to derive at least the unbiasedness of the OLS slope coefficient, and usually I derive the variance. At a minimum, I talk about the factors affecting the variance. To simplify the notation, after I emphasize the assumptions in the population model, and assume random sampling, I just condition on the values of the explanatory variables in the sample. Technically, this is justified by random sampling because, for example, \(E(u_i \mid x_1, x_2, \ldots, x_n) = E(u_i \mid x_i)\) by independent sampling. I find that students are able to focus on the key assumption SLR.4 and subsequently take my word about how conditioning on the independent variables in the sample is harmless. (If you prefer, the appendix to Chapter 3 does the conditioning argument carefully.) Because statistical inference is no more difficult in multiple regression than in simple regression, I postpone inference until Chapter 4. (This reduces redundancy and allows you to focus on the interpretive differences between simple and multiple regression.)

You might notice how, compared with most other texts, I use relatively few assumptions to derive the unbiasedness of the OLS slope estimator, followed by the formula for its variance. This is because I do not introduce redundant or unnecessary assumptions. For example, once SLR.4 is assumed, nothing further about the relationship between \(u\) and \(x\) is needed to obtain the unbiasedness of OLS under random sampling.

Incidentally, one of the uncomfortable facts about finite-sample analysis is that there is a difference between an estimator that is unbiased conditional on the outcome of the covariates and one that is unconditionally unbiased. If the distribution of the \(x_i\) is such that they can all equal the same value with positive probability, as is the case with discreteness in the distribution, then the unconditional expectation does not really exist. Or, if it is made to exist, then the estimator is not unbiased. I do not try to explain these subtleties in an introductory course, but I have had instructors ask me about the difference.

SOLUTIONS TO PROBLEMS

2.1 (i) Income, age, and family background (such as number of siblings) are just a few possibilities. It seems that each of these could be correlated with years of education. (Income and education are probably positively correlated; age and education may be negatively correlated because women in more recent cohorts have, on average, more education; and number of siblings and education are probably negatively correlated.)

(ii) Not if the factors we listed in part (i) are correlated with educ. Because we would like to hold these factors fixed, they are part of the error term. But if \(u\) is correlated with educ then \(E(u \mid educ) \neq 0\), and so SLR.4 fails.

2.2 In the equation \(y = \beta_0 + \beta_1 x + u\), add and subtract \(\alpha_0\) from the right hand side, where \(\alpha_0 = E(u)\), to get \(y = (\alpha_0 + \beta_0) + \beta_1 x + (u - \alpha_0)\). Call the new error \(e = u - \alpha_0\), so that \(E(e) = 0\). The new intercept is \(\alpha_0 + \beta_0\), but the slope is still \(\beta_1\).

2.3 (i) Let \(y_i = GPA_i\), \(x_i = ACT_i\), and \(n = 8\). Then \(\bar{x} = 25.875\), \(\bar{y} = 3.2125\), \(\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = 5.8125\), and \(\sum_{i=1}^{n} (x_i - \bar{x})^2 = 56.875\). From equation (2.9), we obtain the slope as \(\hat{\beta}_1 = 5.8125/56.875 \approx .1022\), rounded to four places after the decimal. From (2.17), \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \approx 3.2125 - (.1022)25.875 \approx .5681\). So we can write

\[ \widehat{GPA} = .5681 + .1022\,ACT, \qquad n = 8. \]

The intercept does not have a useful interpretation because ACT is not close to zero for the population of interest. If ACT is 5 points higher, \(\widehat{GPA}\) increases by \(.1022(5) = .511\).

(ii) The fitted values and residuals, rounded to four decimal places, are given along with the observation number \(i\) and GPA in the following table:

i    GPA    fitted GPA    residual
1    2.8    2.7143         .0857
2    3.4    3.0209         .3791
3    3.0    3.2253        -.2253
4    3.5    3.3275         .1725
5    3.6    3.5319         .0681
6    3.0    3.1231        -.1231
7    2.7    3.1231        -.4231
8    3.7    3.6341         .0659

You can verify that the residuals, as reported in the table, sum to \(-.0002\), which is pretty close to zero given the inherent rounding error.

(iii) When ACT = 20, \(\widehat{GPA} = .5681 + .1022(20) \approx 2.61\).

(iv) The sum of squared residuals, \(\sum_{i=1}^{n} \hat{u}_i^2\), is about .4347 (rounded to four decimal places), and the total sum of squares, \(\sum_{i=1}^{n} (y_i - \bar{y})^2\), is about 1.0288. So the R-squared from the regression is

\[ R^2 = 1 - SSR/SST \approx 1 - (.4347/1.0288) \approx .577. \]

Therefore, about 57.7% of the variation in GPA is explained by ACT in this small sample of students.

2.4 (i) When cigs = 0, predicted birth weight is 119.77 ounces. When cigs = 20, \(\widehat{bwght} = 109.49\). This is about an 8.6% drop.

(ii) Not necessarily. There are many other factors that can affect birth weight, particularly overall health of the mother and quality of prenatal care. These could be correlated with cigarette smoking during pregnancy. Also, something such as caffeine consumption can affect birth weight, and might also be correlated with cigarette smoking.

(iii) If we want a predicted bwght of 125, then \(cigs = (125 - 119.77)/(-.514) \approx -10.18\), or about \(-10\) cigarettes! This is nonsense, of course, and it shows what happens when we are trying to predict something as complicated as birth weight with only a single explanatory variable. The largest predicted birth weight is necessarily 119.77. Yet almost 700 of the births in the sample had a birth weight higher than 119.77.

(iv) 1,176 out of 1,388 women did not smoke while pregnant, or about 84.7%. Because we are using only cigs to explain birth weight, we have only one predicted birth weight at cigs = 0. The predicted birth weight is necessarily roughly in the middle of the observed birth weights at cigs = 0, and so we will underpredict high birth weights.

2.5 (i) The intercept implies that when inc = 0, cons is predicted to be negative $124.84. This, of course, cannot be true, and reflects the fact that this consumption function might be a poor predictor of consumption at very low income levels. On the other hand, on an annual basis, $124.84 is not so far from zero.

(ii) Just plug 30,000 into the equation: cons =