Guangdong University of Finance (广东金融学院)
Finance in Practice: Financial Data Mining

Stock Price Prediction for China's "New Energy" Sector

Students: Wei Jun (卫俊), Yan Yuguang (严宇光), Chen Haojie (陈浩杰)
Supervisor: Luo Shiguang (骆世广)
Submission date: June 28, 2012

Abstract

This paper studies stock price prediction. Stock price data are nonlinear and random, and the stock market responds to economic and political changes at home and abroad, so a single simple model can hardly predict stock prices accurately and effectively. More accurate prediction requires applying different models at different levels. Using SPSS Clementine Client 11.1 and Eviews, we first forecast the overall price level of the sector (that is, stock index prediction) with exponential smoothing and ARIMA respectively. We then apply neural networks, logistic regression, and the C5.0 algorithm to predict the rise and fall of stock prices in China's "new energy" sector. Finally, we group the stocks with K-means and two-step cluster analysis, aiming to identify from the data itself which groups of stocks contribute to price rises. Combining these methods allows a more reasonable, systematic, and accurate prediction of stock prices.

Keywords: China new energy; stock price; exponential smoothing; ARIMA; neural network (ANN); logistic regression; C5.0 algorithm; K-means; two-step cluster analysis; SPSS Clementine Client 11.1; Eviews

Contents

Introduction
1. Current State of New Energy in China
  1.1 Development of new energy in China
  1.2 China's new energy market
2. Stock Index Prediction
  2.1 Exponential smoothing
    2.1.1 Basic formula of exponential smoothing
    2.1.2 Forecasting formula of exponential smoothing
    2.1.3 Determining the smoothing coefficient α
    2.1.4 Trend adjustment in exponential smoothing
  2.2 ARIMA model
    2.2.1 AR, MA, and ARIMA modeling of time series
3. Analysis of the Main Factors Behind Rises and Falls
  3.1 Logistic regression model
    3.1.1 Logistic regression model
    3.1.2 Hypothesis testing
    3.1.3 Meaning of the regression coefficients
  3.2 Decision tree algorithm
    3.2.1 Concepts involved in the C4.5 classification algorithm
    3.2.2 Handling of missing data in C4.5
    3.2.3 Decision tree pruning in C4.5
    3.2.4 Advantages and disadvantages of C4.5
  3.3 Neural network algorithm
    3.3.1 Basic principles of neural networks
    3.3.2 Structure of neurons and neural networks
    3.3.3 BP network
    3.4.4 Hopfield neural network
4. Stock Grouping
  4.1 K-means clustering algorithm
  4.2 Two-step clustering algorithm
5. Numerical Simulation
  5.1 Stock index prediction
    5.1.1 Forecasting with exponential smoothing
    5.1.2 Forecasting with the ARIMA model
  5.2 Analysis of the main factors behind rises and falls
    5.2.1 Data processing
    5.2.2 Summary
  5.3 Stock grouping