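Hefei University, 2015-2016 Academic Year, Second Semester
Course paper for "Multivariate Statistical Analysis"

Paper title: Regression Analysis
Name: Chen Yi
Student ID: 1307021036
Major: Mathematics and Applied Mathematics (1)
Grade:
2015.5

Simple Linear Regression Analysis and Its Application

Abstract

Simple linear regression is applied to the January temperatures recorded at the Antarctic station CAPETOWN 68816 over the 60 years from 1901 to 1960. Following the least squares principle, the data are processed with the SAS statistical software to fit a linear relationship between year and temperature. The software output is then analyzed to obtain the fitted linear relationship.

Keywords: temperature and year; simple linear regression; t-test

I. Linear regression theory

(1) The simple linear regression model

\[ Y = \beta_0 + \beta_1 X + \varepsilon, \qquad E(\varepsilon) = 0, \quad \mathrm{Var}(\varepsilon) = \sigma^2, \]

where $\beta_0, \beta_1$ are the model parameters, $\varepsilon$ is the random error term, $X$ is the independent variable, and $Y$ is the dependent variable. Observing $(X, Y)$ gives $n$ sample observations $(x_i, y_i)$, $i = 1, 2, \dots, n$, so that

\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i . \]

Here $\beta_0 + \beta_1 x_i$ is the systematic part produced by the linear effect of $x$ on $y$; it reflects the average relationship between the two variables, that is, its essential character. The term $\varepsilon_i$ is the random disturbance: the combined effect of chance factors, observation error, and other neglected factors.

(2) Least squares estimation

The formulas below are stated for a general linear model with $p$ regressors; the simple linear model is the case $p = 1$.

The least squares estimator $\hat\beta$ of the parameter $\beta$ minimizes the error sum of squares $Q(\beta)$, that is,

\[ Q(\hat\beta) = \min_{\beta} Q(\beta), \]

where

\[ Q(\beta) = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_p x_{ip} \right)^2 = (y - X\beta)'(y - X\beta). \]

Normal equations: $X'X\hat\beta = X'y$. If $X'X$ is invertible, then

\[ \hat\beta = (X'X)^{-1} X'y . \]

Empirical regression equation:

\[ \hat Y = \hat\beta_0 + \hat\beta_1 X_1 + \cdots + \hat\beta_p X_p . \]

Fitted values and residuals:

Fitted value: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \cdots + \hat\beta_p x_{ip}$
Fitted vector: $\hat y = X\hat\beta = X(X'X)^{-1}X'y = Hy$, with $\hat y = (\hat y_1, \dots, \hat y_n)'$
Residual: $\hat\varepsilon_i = y_i - \hat y_i$
Residual vector: $\hat\varepsilon = y - X\hat\beta = (I - H)y$

The residuals satisfy

\[ \sum_{i} \hat\varepsilon_i = 0, \qquad \sum_{i} \hat\varepsilon_i x_{ij} = 0 \ (j = 1, \dots, p), \qquad \sum_{i} \hat\varepsilon_i \hat y_i = 0 . \]

(3) Properties of the least squares estimator

1. $E(\hat\beta) = \beta$ and $\mathrm{Var}(\hat\beta) = \sigma^2 (X'X)^{-1}$.
2. $\hat\beta$ is the best linear unbiased estimator (BLUE) of $\beta$.
3. The residual vector $\hat\varepsilon = y - X\hat\beta = (I - H)y$ satisfies $E(\hat\varepsilon) = 0$, $\mathrm{Var}(\hat\varepsilon) = \sigma^2 (I - H)$, and $\mathrm{Cov}(\hat\varepsilon, \hat\beta) = 0$.
4. $\hat\sigma^2 = \dfrac{ESS}{n - p - 1} = \dfrac{\hat\varepsilon'\hat\varepsilon}{n - p - 1}$ is an unbiased estimator of $\sigma^2$.

(4) Significance tests for the regression equation

The total sum of squares decomposes as

\[ \sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} (\hat y_i - \bar y)^2 + \sum_{i=1}^{n} (y_i - \hat y_i)^2, \]

that is, $TSS = MSS + ESS$.

Multiple correlation coefficient:

\[ R = r_{y\hat y} = \frac{\sum_i (y_i - \bar y)(\hat y_i - \bar y)}{\sqrt{\sum_i (y_i - \bar y)^2 \, \sum_i (\hat y_i - \bar y)^2}} = \sqrt{\frac{\sum_i (\hat y_i - \bar y)^2}{\sum_i (y_i - \bar y)^2}} = \sqrt{\frac{MSS}{TSS}} . \]

Coefficient of determination $R^2$:

\[ R^2 = \frac{MSS}{TSS} = 1 - \frac{ESS}{TSS} . \]

Adjusted $R^2$, denoted $R_a^2$:

\[ R_a^2 = 1 - \frac{ESS/(n - p - 1)}{TSS/(n - 1)} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} . \]

Test of the linear regression model:

\[ H_0 : \beta_1 = \beta_2 = \cdots = \beta_p = 0 . \]

Theorem. Under the model $y = X\beta + \varepsilon$, $\varepsilon \sim N_n(0, \sigma^2 I)$:
(1) $\hat\beta \sim N_{p+1}\left(\beta, \sigma^2 (X'X)^{-1}\right)$;
(2) $ESS/\sigma^2 \sim \chi^2(n - p - 1)$;
(3) $\hat\beta$ and $ESS$ are independent;
(4) when $H_0 : \beta_1 = \cdots = \beta_p = 0$ holds, $MSS/\sigma^2 \sim \chi^2(p)$.

The test statistic is

\[ F = \frac{MSS/p}{ESS/(n - p - 1)} . \]

If $H_0$ holds, $F \sim F(p, n - p - 1)$. If $F > F_\alpha(p, n - p - 1)$, the null hypothesis is rejected.

Analysis of variance table (the regression sum of squares MSS is also written RSS):

Source of variation   Sum of squares   Degrees of freedom   F value
Regression            MSS              p                    (MSS/p) / (ESS/(n-p-1))
Error                 ESS              n-p-1
Total                 TSS              n-1

Test of an individual regression coefficient:

\[ H_0 : \beta_i = 0 . \]

Since $\hat\beta_i \sim N(\beta_i, c_{ii}\sigma^2)$, where $c_{ii}$ is the diagonal element of $(X'X)^{-1}$ corresponding to $\beta_i$, under $H_0$

\[ F_i = \frac{\hat\beta_i^2 / c_{ii}}{ESS/(n - p - 1)} \sim F(1, n - p - 1) . \]

II. Problem statement and analysis

The table below lists the January temperatures at the Antarctic Southern Ocean station CAPETOWN 68816 for the 60 years from 1901 to 1960. A SAS data file is built from these data to explore the relationship between year and temperature.

Year   Temperature   Year   Temperature
1901   19.6          1931   23.6
1902   19.3          1932   20.5
1903   19.9          1933   21.3
1904   20.7          1934   22.2
1905   20.8          1935   22.1
1906   19.9          1936   19.4
1907   20.7          1937   21.7
1908   19.8          1938   21.1
1909   21.3          1939   21.8
1910   21.4          1940   22.2
1911   21.1          1941   22.2
1912   20.9          1942   21.4
1913   22.8          1943   20.3
1914   20.4          1944   21.8
1915   22.9          1945   21.2
1916   21.4          1946   20.7
1917   21.6          1947   21.1
1918   21.6          1948   21.8
1919   20.5          1949   21.7
1920   22.7          1950   21.6
1921   20.0          1951   20.5
1922   20.3          1952   21.7
1923   21.0          1953   22.7
1924   22.1          1954   21.4
1925   20.9          1955   22.2
1926   21.8          1956   22.0
1927   22.3          1957   22.3
1928   21.7          1958   21.7
1929   22.5          1959   20.7
1930   21.2          1960   21.9

Data source:

III. Model building

Let temperature be the dependent variable $Y$ and year the independent variable $X$. The simple linear regression model is

\[ Y = \beta_0 + \beta_1 X + \varepsilon, \]

where $\beta_0, \beta_1$ are the model parameters and $\varepsilon$ is the random error term. The scatter plot of the data is as follows:

[Figure: scatter plot of temperature against year]

The plot shows that the dependent and independent variables are linearly related within a band-shaped region, and that the dependent variable increases as the independent variable increases. We therefore expect that the data can be fitted by a straight line and that, in the regression model, $\beta_1 > 0$.

(1) Programs

Program (1), using all 60 observations (1901-1960):

/* DATA step: read (temperature, year) pairs; @@ allows several pairs per line */
data ch;
input wendu nianfen @@;
cards;
19.6 1901 19.3 1902 19.9 1903 20.7 1904 20.8 1905 19.9 1906
20.7 1907 19.8 1908 21.3 1909 21.4 1910 21.1 1911 20.9 1912
22.8 1913 20.4 1914 22.9 1915 21.4 1916 21.6 1917 21.6 1918
20.5 1919 22.7 1920 20.0 1921 20.3 1922 21.0 1923 22.1 1924
20.9 1925 21.8 1926 22.3 1927 21.7 1928 22.5 1929 21.2 1930
23.6 1931 20.5 1932 21.3 1933 22.2 1934 22.1 1935 19.4 1936
21.7 1937 21.1 1938 21.8 1939 22.2 1940 22.2 1941 21.4 1942
20.3 1943 21.8 1944 21.2 1945 20.7 1946 21.1 1947 21.8 1948
21.7 1949 21.6 1950 20.5 1951 21.7 1952 22.7 1953 21.4 1954
22.2 1955 22.0 1956 22.3 1957 21.7 1958 20.7 1959 21.9 1960
;
/* Regress temperature on year; print prediction limits and overlay the plots */
proc reg data=ch;
model wendu=nianfen;
print cli;
plot wendu*nianfen p.*nianfen l95.*nianfen u95.*nianfen / overlay;
symbol1 c=black v=triangle;
symbol2 c=blue v=circle;
symbol3 c=green v=square;
symbol4 c=red v=star;
run;

Program (2), using the first 30 observations (1901-1930):

data ch;
input wendu nianfen @@;
cards;
19.6 1901 19.3 1902 19.9 1903 20.7 1904 20.8 1905 19.9 1906
20.7 1907 19.8 1908 21.3 1909 21.4 1910 21.1 1911 20.9 1912
22.8 1913 20.4 1914 22.9 1915 21.4 1916 21.6 1917 21.6 1918
20.5 1919 22.7 1920 20.0 1921 20.3 1922 21.0 1923 22.1 1924
20.9 1925 21.8 1926 22.3 1927 21.7 1928 22.5 1929 21.2 1930
;
proc reg data=ch;
model wendu=nianfen;
print cli;
plot wendu*nianfen p.*nianfen l95.*nianfen u95.*nianfen / overlay;
symbol1 c=black v=triangle;
symbol2 c=blue v=circle;
symbol3 c=green v=square;
symbol4 c=red v=star;
run;

Program (3), using the last 30 observations (1931-1960):

data ch;
input wendu nianfen @@;
cards;
23.6 1931 20.5 1932 21.3 1933 22.2 1934 22.1 1935 19.4 1936
21.7 1937 21.1 1938 21.8 1939 22.2 1940 22.2 1941 21.4 1942
20.3 1943 21.8 1944 21.2 1945 20.7 1946 21.1 1947 21.8 1948
21.7 1949 21.6 1950 20.5 1951 21.7 1952 22.7 1953 21.4 1954
22.2 1955 22.0 1956 22.3 1957 21.7 1958 20.7 1959 21.9 1960
;
proc reg data=ch;
model wendu=nianfen;
print cli;
plot wendu*nianfen p.*nianfen l95.*nianfen u95.*nianfen / overlay;
symbol1 c=black v=triangle;
symbol2 c=blue v=circle;
symbol3 c=green v=square;
symbol4 c=red v=star;
run;

(2) Program notes

The DATA step first creates the data set ch; in the INPUT statement, wendu denotes temperature and nianfen denotes year. In the MODEL statement of the REG procedure, nianfen is the regressor (independent variable) and wendu is the response (dependent variable). PRINT CLI produces the predicted values, the upper and lower 95% prediction limits, and the residuals. The PLOT statement draws the data points, the regression line, and the prediction limits.

IV. Model testing and analysis

(1) Output of Program (1):

Output 1 of Program (2):

(2) Output 2:

The REG Procedure
Model: MODEL1
Dependent Variable: wendu
Output Statistics

Obs   Dependent Variable   Predicted Value   Std Error Mean Predict   95% CL Predict        Residual
1     19.6000              20.2140           0.2971                   18.4005   22.0275     -0.6140
2     19.3000              20.2753           0.2821                   18.4719   22.0787     -0.9753
3     19.9000              20.3366           0.2675                   18.5426   22.1307     -0.4366
4     20.7000              20.3980           0.2532                   18.6127   22.1833      0.3020
5     20.8000              20.4593           0.2394                   18.6820   22.2366      0.3407
6     19.9000              20.5207           0.2261                   18.7507   22.2906     -0.6207
7     20.7000              20.5820           0.2134                   18.8186   22.3454      0.1180
8     19.8000              20.6433           0.2015                   18.8858   22.4008     -0.8433
9     21.3000              20.7047           0.1904                   18.9524   22.4570      0.5953
10    21.4000              20.7660           0.1804                   19.0181   22.5138      0.6340
11    21.1000              20.8273           0.1716                   19.0832   22.5715      0.2727
12    20.9000              20.8887           0.1642                   19.1475   22.6298      0.0113
13    22.8000              20.9500           0.1585                   19.2111   22.6889      1.8500
14    20.4000              21.0113           0.1545                   19.2739   22.7487     -0.6113
15    22.9000              21.0727           0.1525                   19.3360   22.8093      1.8273
16    21.4000              21.1340           0.1525                   19.3973   22.8707      0.2660
17    21.6000              21.1953           0.1545                   19.4579   22.9327      0.4047
18    21.6000              21.2567           0.1585                   19.5178   22.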