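Part 5 – Other aspects of simple linear regression

STA 503

Outline

5.1 The centered simple linear regression model
5.2 Regression through the origin
5.3 Regression through the origin in SAS
5.4 Normal correlation models
5.5 Inferences about the correlation coefficient
5.6 Estimation and inference about the correlation coefficient in SAS
5.7 Power in simple linear regression
5.8 Power calculations in SAS
5.9 Choice of predictor values

5.1 The centered simple linear regression model

- We now consider a model which is equivalent to the normal errors simple linear regression model considered up to this point.

- We may redefine the predictor $X_i$ as the deviation from its own sample mean, $X_i - \bar{X}$; that is, we may regress the $Y_i$'s on the $(X_i - \bar{X})$'s.

- We see

\[
\begin{aligned}
Y_i &= \beta_0 + \beta_1 X_i + \varepsilon_i \\
    &= \beta_0 + \beta_1 (X_i - \bar{X}) + \beta_1 \bar{X} + \varepsilon_i \\
    &= (\beta_0 + \beta_1 \bar{X}) + \beta_1 (X_i - \bar{X}) + \varepsilon_i \\
    &= \beta_0^* + \beta_1 (X_i - \bar{X}) + \varepsilon_i ,
\end{aligned}
\]

  where $\beta_0^* = \beta_0 + \beta_1 \bar{X}$.

- The fitted intercept of this model is equal to $\bar{Y}$.

- Hence, the estimators of the coefficients of this model are $b_0^* = \bar{Y}$ and

\[
b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
    = \frac{\sum_{i=1}^{n} (X_i - \bar{X})\, Y_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2} .
\]

- Predicted values may then be obtained from $\hat{Y}_i = \bar{Y} + b_1 (X_i - \bar{X})$.

- Recall, in our discussion of the original model, that

\[
\mathrm{Cov}(b_0, b_1) = \frac{-\bar{X}\,\sigma^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} .
\]

- Since the mean of the predictor variable is now $\sum_{i=1}^{n} (X_i - \bar{X}) / n = 0$, we have $\mathrm{Cov}(b_0^*, b_1) = 0$.

- Remember that zero covariance alone does not, in general, imply independence. Under the normal errors model, however, $b_0^*$ and $b_1$ are jointly normal, so here they are in fact independent.

- Inference calculations result in the same confidence and prediction intervals and hypothesis testing procedures as before, with the exception of inferences about $\beta_0^*$.

5.2 Regression through the origin

- The data and/or theory may dictate that the straight line should pass through the origin ($X = 0$, $Y = 0$). In this case we may consider the model

\[
Y_i = \beta_1 X_i + \varepsilon_i , \qquad i = 1, 2, \ldots, n ,
\]

  where $Y_i$ is the response corresponding to the $i$th observation, $\beta_1$ is the regression coefficient, $X_i$ is the known (fixed) predictor variable associated with the $i$th observation, and $\varepsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$.

- Note this is the normal errors simple linear regression model with $\beta_0 = 0$.

- This model implies that $Y_i \sim N(\beta_1 X_i, \sigma^2)$.

- Least squares estimation of $\beta_1$ involves minimization of the quantity

\[
Q = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - \beta_1 X_i)^2 ,
\]

  which leads to the sole normal equation,

\[
-2 \sum_{i=1}^{n} X_i (Y_i - b_1 X_i) = 0 .
\]

- Algebra reveals we may estimate the slope with the estimator

\[
b_1 = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2} .
\]

- It can be shown that

\[
b_1 \sim N\!\left( \beta_1 ,\; \frac{\sigma^2}{\sum_{i=1}^{n} X_i^2} \right) .
\]

- The estimate of the common variance is

\[
MSE = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - 1} .
\]

- Notice that the quantity $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ now has $n - 1$ degrees of freedom.

- To perform hypothesis testing corresponding to the following null hypothesis and composite alternative hypothesis,

\[
H_0 : \beta_1 = \beta_{1,0} \qquad H_a : \beta_1 \neq \beta_{1,0} ,
\]

  we may consider the test statistic

\[
T = \frac{b_1 - \beta_{1,0}}{\sqrt{MSE \Big/ \sum_{i=1}^{n} X_i^2}} .
\]

- Under $H_0$, the above has a $t_{n-1}$ distribution.

- A $100(1 - \alpha)$ percent confidence interval for $\beta_1$ is

\[
b_1 \pm t_{n-1;\,1-\alpha/2} \sqrt{\frac{MSE}{\sum_{i=1}^{n} X_i^2}} .
\]

- Predicted values may be obtained using $\hat{Y}_i = b_1 X_i$.

- A $100(1 - \alpha)$ percent confidence interval for the conditional mean of $Y$ for some specific value of $X$, $X_h$, is given by

\[
\hat{Y}_h \pm t_{n-1;\,1-\alpha/2} \sqrt{MSE \left[ \frac{X_h^2}{\sum_{i=1}^{n} X_i^2} \right]} .
\]

- It can be shown that a $100(1 - \alpha)$ percent prediction interval for $Y_{h(new)}$, corresponding to some value $X_h$, is given by

\[
\hat{Y}_h \pm t_{n-1;\,1-\alpha/2} \sqrt{MSE \left[ 1 + \frac{X_h^2}{\sum_{i=1}^{n} X_i^2} \right]} .
\]

- Notice that the confidence intervals for $E(Y_h)$ and the prediction intervals widen as $|X_h|$ increases.

- We see that the confidence interval has zero width at $X_h = 0$, since the intercept is known under the model. The prediction interval has nonzero width there because the random error of a future observation must be taken into account.

- One surprising aspect of regression through the origin is that usually $\sum_{i=1}^{n} e_i \neq 0$.

- The only constraint on the residuals, which is from the normal equation, is $\sum_{i=1}^{n} X_i e_i = 0$.

- The partition of the SSTO no longer holds; that is, usually $SSTO \neq SSR + SSE$.

- The traditional F-test is no longer valid.

- In fact, it is possible that $SSE > SSTO$ when the model is not appropriate (e.g. $\beta_0 \neq 0$), and so $R^2 = 1 - SSE/SSTO < 0$.

- It can be seen for the model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, for a known $\beta_0$, that

\[
\sum_{i=1}^{n} (Y_i - \beta_0)^2 = \sum_{i=1}^{n} (\hat{Y}_i - \beta_0)^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 .
\]

- So, if $\beta_0 = 0$,

\[
\sum_{i=1}^{n} Y_i^2 = \sum_{i=1}^{n} \hat{Y}_i^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 .
\]

- Therefore, an $R^2$-like quantity we can consider is the ratio of the term analogous to the regression sum of squares, $\sum_{i=1}^{n} \hat{Y}_i^2$, to the uncorrected total sum of squares, $\sum_{i=1}^{n} Y_i^2$, which is

\[
R_0^2 = \frac{\sum_{i=1}^{n} \hat{Y}_i^2}{\sum_{i=1}^{n} Y_i^2} .
\]

- A valid F-test may also be developed.

5.3 Regression through the origin in SAS

We now use SAS to compare a fitted no-intercept model to the with-intercept model for our example data set.

    /* Fit the usual model with an intercept; save predicted values */
    proc reg data=mercury noprint;
      model blood=intake;
      output out=outdata1 p=p_int;
    run;

    /* Fit the model through the origin (no intercept) */
    proc reg data=mercury;
      model blood=intake / noint;
      output out=outdata2 p=p_noint;
    run;

    /* Sort both output data sets so they can be merged by intake */
    proc sort data=outdata1;
      by intake;
    run;
    proc sort data=outdata2;
      by intake;
    run;

    data outdata3;
      merge outdata1 outdata2;
      by intake;
      label p_int='Intercept model' p_noint='No intercept model';
    run;

    proc print data=outdata3;
    run;

    /* Overlay the data and both fitted lines */
    goptions colors=(none);
    symbol1 v=dot;
    symbol2 v=none i=join l=1;
    symbol3 v=none i=join l=20;
    axis1 label=(angle=90 height=1.5 'Blood');
    axis2 label=(height=1.5 'Intake');
    proc gplot data=outdata3;
      plot (blood p_int p_noint)*intake / legend overlay vaxis=axis1 haxis=axis2;
    run;
    quit;

    The REG Procedure
    Model: MODEL1
    Dependent Variable: blood

    Number of Observations Read    12
    Number of Observations Used    12

    NOTE: No intercept in model. R-Square is redefined.

    Analysis of Variance

                                 Sum of         Mean
    Source               DF     Squares       Square    F Value    Pr > F
    Model                 1      737832       737832     361.22    <.0001
    Error                11       22468   2042.58740
    Uncorrected Total    12      760300

    Root MSE           45.19499    R-Square    0.9704
    Dependent Mean    219.16667    Adj R-Sq    0.9678
    Coeff Var          20.62129

    Parameter Estimates

                    Parameter    Standard
    Variable   DF    Estimate       Error    t Value    Pr > |t|
    intake      1     0.59623     0.03137      19.01      <.0001

    Obs    intake    blood      p_int    p_noint
      1       105       70     46.699     62.604
      2       180       90     94.755    107.321
      3       200      120    107.570    119.246
      4       230      125    126.792    137.132
      5       250      105    139.607    149.057
      6       275      170    155.626    163.963
      7       410      290    242.127    244.453
      8       460      205    274.164    274.265
      9       550      290    331.831    327.925
     10       580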