2010-6-221非参数回归的RRRR语言实现中国人民大学统计学院陈堰平陈堰平陈堰平陈堰平22010-6-22背景•回归模型•回归函数形式已知---参数回归•回归函数形式未知---非参数回归(|)()EYf=XXXXXXXX32010-6-22参数回归x=sort(runif(200))y=2*x+1+rnorm(200,0,0.1)fit.lin-lm(y~x)Example:42010-6-22summary(fit.lin)Call:lm(formula=y~x)Residuals:Min1QMedian3QMax-0.200168-0.066969-0.0034020.0704640.208087Coefficients:EstimateStd.ErrortvaluePr(|t|)(Intercept)0.979970.0127776.752e-16***x2.023680.0223690.502e-16***---Signif.codes:0‘***’0.001‘**’0.01‘*’0.05‘.’0.1‘’1Residualstandarderror:0.09269on198degreesoffreedomMultipleR-squared:0.9764,AdjustedR-squared:0.9763F-statistic:8189on1and198DF,p-value:2.2e-1652010-6-220.00.20.40.60.81.01.01.52.02.53.0xy62010-6-22非参数回归•回归函数未知,要根据观测值估计给定点的估计值–假设观测为(Yi,Xi),i=1,…,n,假设模型为()YfXε=+72010-6-22核函数法82010-6-22•核函数法(Nadaraya-Watson)92010-6-22局部多项式估计利用局部展开的思想,在待估计点,将函数泰勒展开000()()'()()fxfxfxxx=+−+⋯200,1(,)argmin[(())]niiiabiXxabYabXxKh=−⎛⎞=−+−⎜⎟⎝⎠∑距离x0较近的点,提供的信息多,距离远的点,提供的信息少0ˆ()fxa=可以转化为加权最小二乘的问题102010-6-22y.est=-0.9689503,sin(10*0.5)=-0.9589243112010-6-22带宽h的选择�CrossValidation2()11ˆ[()]niiiiCVYfXn−==−∑选取一系列的h,计算相应的CV,使得CV最小的就是最优带宽122010-6-22现成的包KernSmoothKernSmoothKernSmoothKernSmooth,locpollocpollocpollocpol,…………132010-6-220.00.20.40.60.81.0-1.0-0.50.00.51.0xy142010-6-22•quantreg包中有lprq函数在分位回归的应用lprq-function(x,y,h,tau=0.5,m=50){xx-seq(min(x),max(x),length=m)fv-xxdv-xxfor(iin1:length(xx)){z-x-xx[i]wx-dnorm(z/h)r-rq(y~z,weights=wx,tau=tau,ci=FALSE)fv[i]-r$coef[1]dv[i]-r$coef[2]}list(xx=xx,fv=fv,dv=dv)}152010-6-22•原理()00,1(,)argmin(())niiiabiXxabYabXxKhτρ=−⎛⎞=−+−⎜⎟⎝⎠∑()qyabxτ=+(,)1(,)argmin()niiabiabYabXτρ==−−∑线性分位回归估计方程非参数分位回归的估计方程162010-6-22172010-6-221020304050-100-50050millisecondsacceleration(ing)h=1h=2h=3h=4182010-6-22WhyR?灵活:研究新的模型时,可以在原有代码的基础上修改变系数分位回归模型:10()()qycuxcτ=+()000,1argmin(())niiiiabiUuYabUuXcKhτρ=−⎛⎞−+−−⎜⎟⎝⎠∑()000,1argmin()niiiiiabiUuYaXbUuXcKhτρ=−⎛⎞=−−−−⎜⎟⎝⎠∑192010-6-22lprq0-function(x,u,y,h,lprq0-function(x,u,y,h,lprq0-function(x,u,y,h,lprq0-function(x,u,y,h,tautautautau====0.5,u0){0.5,u0){0.5,u0){0.5,u0){####对单点进行估计require(quantregrequire(quantregrequire(quantregrequire(quantreg))))fv-u0fv-u0fv-u0fv-u0dvdvdvdv-u0-u0-u0-u0z-u-u0z-u-u0z-u-u0z-u-u0wxwxwxwx----Ker(z/hKer(z/hKer(z/hKer(z/h))))r-r-r-r-rq(yrq(yrq(yrq(y~~~~x+I(zx+I(zx+I(zx+I(z****x),weights=x),weights=x),weights=x),weights=wxwxwxwx,,,,method=method=method=method=br,taubr,taubr,taubr,tau====tautautautau,,,,cicicici=FALSE)=FALSE)=FALSE)=FALSE)fv-r$coef[c(1,2)]fv-r$coef[c(1,2)]fv-r$coef[c(1,2)]fv-r$coef[c(1,2)]dvdvdvdv-r$coef[3]-r$coef[3]-r$coef[3]-r$coef[3]list(u0=u0,fv=list(u0=u0,fv=list(u0=u0,fv=list(u0=u0,fv=fvfvfvfv,,,,dvdvdvdv====dvdvdvdv))))}}}}