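Time:
Place:
Experiment title: Diagnosing and Correcting Multicollinearity

I. Purpose and Requirements

1. Diagnose multicollinearity in a multiple linear regression model.
2. Correct multicollinearity in a multiple linear regression model.

II. Experiment Content

The experiment follows the opening case of Chapter 4 of the textbook, "Does agricultural development reduce fiscal revenue?". Using 1978-2007 data on fiscal revenue, agricultural value added, industrial value added, construction value added, and related series, we run a regression analysis in EViews, test whether multicollinearity is present, and correct it.

III. Experiment Procedure (steps, all parameters and indicators, theoretical basis)

(I) Model specification and estimation

Besides agricultural, industrial, and construction value added, fiscal revenue may also depend on factors such as total population. To study the question "does agricultural development reduce fiscal revenue?", we specify the following econometric model:

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{4i} + \beta_5 X_{5i} + \beta_6 X_{6i} + \beta_7 X_{7i} + u_i$$

where
Y is fiscal revenue (CS, 100 million yuan);
X2 is agricultural value added (NZ, 100 million yuan);
X3 is industrial value added (GZ, 100 million yuan);
X4 is construction value added (JZZ, 100 million yuan);
X5 is total population (TPOP, 10,000 persons);
X6 is final consumption (CUM, 100 million yuan);
X7 is disaster-affected area (SZM, 1,000 hectares).

Figure 1: Fiscal revenue and its influencing factors, 1978-2007
(CS, NZ, GZ, JZZ, CUM in 100 million yuan; TPOP in 10,000 persons; SZM in 1,000 hectares)

Year        CS       NZ        GZ      JZZ    TPOP       CUM    SZM
1978    1132.3   1027.5      1607    138.2   96259    2239.1  50790
1979    1146.4   1270.2    1769.7    143.8   97542    2633.7  39370
1980    1159.9   1371.6    1996.5    195.5   98705    3007.9  44526
1981    1175.8   1559.5    2048.4    207.1  100072    3361.5  39790
1982    1212.3   1777.4    2162.3    220.7  101654    3714.8  33130
1983      1367   1978.4    2375.6    270.6  103008    4126.4  34710
1984    1642.9   2316.1      2789    316.7  104357    4846.3  31890
1985    2004.8   2564.4    3448.7    417.9  105851    5986.3  44365
1986      2122   2788.7      3967    525.7  107507    6821.8  47140
1987    2199.4     3233    4585.8    665.8  109300    7804.6  42090
1988    2357.2   3865.4    5777.2      810  111026    9839.5  50870
1989    2664.9   4265.9      6484      794  112704   11164.2  46991
1990    2937.1     5062      6858    859.4  114333   12090.5  38474
1991   3149.48   5342.2    8087.1   1015.1  115823   14091.9  55472
1992   3483.37   5866.6   10284.5     1415  117171   17203.3  51333
1993   4348.95   6963.8     14188   2266.5  118517   21899.9  48829
1994    5218.1   9572.7   19480.7   2964.7  119850   29242.2  55043
1995    6242.2  12135.8   24950.6   3728.8  121121   36748.2  45821
1996   7407.99  14015.4   29447.6   4387.4  122389   43919.5  46989
1997   8651.14  14441.9   32921.4   4621.6  123626   48140.6  53429
1998   9875.95  14817.6   34018.4   4985.8  124761   51588.2  50145
1999  11444.08    14770   35861.5   5172.1  125786   55636.9  49981
2000  13395.23  14944.7     40036   5522.3  126743     61516  54688
2001  16386.04  15781.3   43580.6   5931.7  127627   66878.3  52215
2002  18903.64    16537   47431.3   6465.5  128453   71691.2  47119
2003  21715.25  17381.7   54945.5   7490.8  129227   77449.5  54506
2004  26396.47  21412.7     65210   8694.3  129988   87032.9  37106
2005  31649.29    22420   76912.9  10133.8  130756   96918.1  38818
2006   38760.2    24040   91310.9  11851.1  131448  110595.3  41091
2007  51321.78    28095  107367.2  14014.1  132129  128444.6  48992

In EViews, generate the series Y, X2, X3, X4, X5, X6, and X7 from these data and estimate the model by OLS.

(II) Diagnosing multicollinearity

1. Double-click "Eviews" to open the main window. Load the data from the main menu: File / Open / EViews Workfile, choose Excel, and open 多重共线性的数据.xls.

2. In the command window of the EViews main screen, type "ls y c x2 x3 x4 x5 x6 x7" and press Enter. The OLS results appear as in Figure 2.

Figure 2: OLS regression results

Dependent Variable: Y
Method: Least Squares
Date: 10/12/10   Time: 17:07
Sample: 1978 2007
Included observations: 30

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C             -6646.694     6454.156     -1.029832   0.3138
X2            -0.970688     0.330409     -2.937841   0.0074
X3             1.084654     0.228521      4.746397   0.0001
X4            -2.763928     2.076994     -1.330735   0.1963
X5             0.077613     0.067974      1.141808   0.2653
X6            -0.047119     0.081509     -0.578084   0.5688
X7             0.007580     0.035039      0.216329   0.8306

R-squared             0.994565    Mean dependent var      10049.04
Adjusted R-squared    0.993147    S.D. dependent var      12585.51
S.E. of regression    1041.849    Akaike info criterion   16.93634
Sum squared resid     24965329    Schwarz criterion       17.26329
Log likelihood       -247.0452    F-statistic             701.4747
Durbin-Watson stat    2.167410    Prob(F-statistic)       0.000000

The coefficient of determination is 0.995 and the adjusted coefficient of determination is 0.993, so the model fits the data very well; the F-statistic is 701.47, so the regression is significant overall. However, at $\alpha = 0.05$ the critical value is $t_{\alpha/2}(n-k) = t_{0.025}(23) = 2.069$. The t-tests on the coefficients of X4, X5, X6, and X7 are not significant, and the signs of the coefficients on X2, X4, and X6 are opposite to what is expected. In other words, apart from agricultural value added X2 and industrial value added X3, no factor has a significant effect on fiscal revenue, and the coefficients on agricultural value added X2, construction value added X4, and final consumption X6 are negative. This suggests that severe multicollinearity is very likely present.

3. Compute the correlation coefficients among the explanatory variables. In the Workfile window, select the series X2, X3, X4, X5, X6, and X7, then click Quick / Group Statistics / Correlations / OK. The correlation matrix appears as in Figure 3.

Figure 3: Correlation matrix

                     X2                 X3                 X4                 X5                 X6                 X7
X2                    1   0.97298061456147  0.982660623499789  0.927978429406745  0.988962619724667  0.226199965872465
X3     0.97298061456147                  1  0.998521808393188  0.843900206568758  0.992641236711784  0.129443710336215
X4    0.982660623499789  0.998521808393188                  1  0.864152135928051  0.996056843441596  0.154645718404353
X5    0.927978429406745  0.843900206568758  0.864152135928051                  1  0.888848055546979  0.387767264808787
X6    0.988962619724667  0.992641236711784  0.996056843441596  0.888848055546979                  1  0.185172880851582
X7    0.226199965872465  0.129443710336215  0.154645718404353  0.387767264808787  0.185172880851582                  1

The correlation matrix shows that the explanatory variables are highly correlated with one another. In particular, the pairwise correlations among agricultural value added X2, industrial value added X3, construction value added X4, and final consumption X6 are all above 0.8 (in fact, all are above 0.97). This indicates that the model suffers from multicollinearity.

(III) Correcting multicollinearity

1. Use stepwise regression to test for and resolve the multicollinearity. First regress Y on each of X2, X3, X4, X5, X6, and X7 separately. For example, in the command window of the EViews main screen, type "ls y c x2" and press Enter:

Dependent Variable: Y
Method: Least Squares
Date: 10/12/10   Time: 17:49
Sample: 1978 2007
Included observations: 30

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C             -4086.544     1463.091     -2.793090   0.0093
X2             1.454186     0.117235      12.40398   0.0000

R-squared             0.846034    Mean dependent var      10049.04
Adjusted R-squared    0.840536    S.D. dependent var      12585.51
S.E. of regression    5025.770    Akaike info criterion   19.94689
Sum squared resid     7.07E+08    Schwarz criterion       20.04030
Log likelihood       -297.2033    F-statistic             153.8588
Durbin-Watson stat    0.166951    Prob(F-statistic)       0.000000

The simple regressions on X3, X4, X5, X6, and X7 are obtained in the same way. The results are summarized in Figure 4.

Figure 4: Simple regression estimates

Variable                  X2         X3         X4         X5         X6         X7
Parameter estimate  1.454186   0.426817   3.186851   0.829789   0.330354   0.111530
t-statistic         12.40398   28.90168   22.67733   6.206025   18.12895   0.320338
R-squared           0.846034   0.967567   0.948364   0.579041   0.921494   0.003651
Adjusted R-squared  0.840536   0.966408   0.946520   0.564006   0.918690  -0.031932

2. The regression on X3 has the largest R-squared, so take X3 as the base and add the other variables one at a time in a stepwise regression. The results are shown in Figure 5:

Dependent Variable: Y
Method: Least Squares
Date: 10/13/10   Time: 01:27
Sample: 1978 2007
Included observations: 30

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              1976.086     388.2413      5.089841   0.0000
X2            -1.105339     0.105222     -10.50486   0.0000
X3             0.721989     0.028879      25.00056   0.0000

R-squared             0.993624    Mean dependent var      10049.04
Adjusted R-squared    0.993152    S.D. dependent var      12585.51
S.E. of regression    1041.474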