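Chapter 7  Descriptive Statistics

Contents
  Frequency distributions
    The frequency distribution analysis procedure
    Frequency distribution analysis examples
  Descriptive statistics
    The descriptive statistics procedure and examples
  Exploratory analysis
    Basic content of exploratory analysis
    The exploratory analysis procedure
    Exploratory analysis examples
  Crosstabulation (contingency table) analysis
    The crosstabulation analysis procedure
    Crosstabulation analysis examples
  Ratio analysis
    The ratio analysis procedure
    Ratio analysis examples
  P-P plots and Q-Q plots
  Exercises and reference answers


Frequency distribution analysis

The frequency distribution analysis procedure
  Frequencies main dialog box
  Dialog box for selecting output statistics
  Chart selection dialog box
  Frequency table: Format dialog box

Frequency distribution analysis examples

Example 1
  Table 7-2  Frequency table of the respondents' sex variable
  Table 7-3  Frequency table of the respondents' race variable
  Table 7-4  Frequency table of the respondents' general happiness variable

Example 2
  Table 7.5  Descriptive statistics for respondents of different ages and their years of education
  Table 7.6  Frequency table of the years-of-education variable

  Years of education
                       (1) Frequency   (2) Percent   (3) Valid Percent   (4) Cumulative Percent
  (5) Valid    0               2            .1              .1                   .1
               3               5            .3              .3                   .5
               4               5            .3              .3                   .8
               5               6            .4              .4                  1.2
               6              12            .8              .8                  2.0
               7              25           1.6             1.7                  3.6
               8              68           4.5             4.5                  8.1
               9              56           3.7             3.7                 11.9
               10             73           4.8             4.8                 16.7
               11             85           5.6             5.6                 22.3
               12            461          30.4            30.5                 52.8
               13            130           8.6             8.6                 61.5
               14            175          11.5            11.6                 73.0
               15             73           4.8             4.8                 77.9
               16            194          12.8            12.8                 90.7
               17             43           2.8             2.8                 93.6
               18             45           3.0             3.0                 96.6
               19             22           1.5             1.5                 98.0
               20             30           2.0             2.0                100.0
               Total        1510          99.5           100.0
  (6) Missing  NA              7            .5
  (7) Total                 1517         100.0

  Histogram of the variable
  Histogram of the educ variable

Example 3
  Figure 7-7  Recode into Different Variables main dialog box
  Figure 7-8  Recode into Different Variables: Old and New Values dialog box
  Figure 7-9  Value Labels box
  Table 7-7  Frequency table of the age variable after grouping


Descriptive statistics

The descriptive statistics procedure and examples

Basic parameters
  Arithmetic mean, median, and mode
  Quartiles and percentiles
  Range, variance, standard deviation, and standard error
  Skewness and kurtosis
  Contingency tables and their tests of independence
  Ratio analysis
  Tests of normality

  Descriptives main dialog box
  Descriptives: Options dialog box

Descriptive statistics for crime data across the United States

  Descriptive Statistics
                          N   Range   Minimum   Maximum     Sum      Mean   Std. Deviation
  Murder                 50      15         1        15     343      6.86        3.848
  Rape                   50      32         4        36     781     15.62        7.348
  Robbery                50     437         7       443    5076    101.51       91.193
  Assault                50     272        21       293    6771    135.42       68.170
  Burglary               50    1467       286      1753   46540    930.80      361.050
  Larceny                50    2856       694      3550   97182   1943.64      709.829
  Auto theft             50     800        78       878   18393    367.86      199.610
  Valid N (listwise)     50


Exploratory analysis

The exploratory analysis procedure
  Boxplot
  Stem-and-leaf plot

  Boxplot and spread-vs-level plot (a): boxplot of Educational Level (years) by Employment Category (Clerical N = 363, Custodial N = 27, Manager N = 84).
  Boxplot and spread-vs-level plot (b): Spread vs. Level Plot of EDUC By JOBCAT (plot of LN of spread vs. LN of level). Slope = -.413; power for transformation = 1.413.

  Explore main dialog box
  Dialog box for selecting descriptive statistics
  Plots dialog box

Example output 1: case processing summary

  Case Processing Summary
                            Valid             Missing            Total
                    Sex     N     Percent     N     Percent      N     Percent
  Salary            Female  216   100.0%      0     .0%          216   100.0%
                    Male    258   100.0%      0     .0%          258   100.0%

Descriptive statistics for the salary variable

Extreme values of the variable

  Extreme Values
  Salary   Sex                    Case Number   Employee ID      Value
           Female   Highest  1        371           371         $58,125
                             2        348           348         $56,750
                             3        468           468         $55,750
                             4        240           240         $54,375
                             5         72            72         $54,000
                    Lowest   1        378           378         $15,750
                             2        338           338         $15,900
                             3        411           411         $16,200
                             4        224           224         $16,200
                             5         90            90         $16,200
           Male     Highest  1         29            29        $135,000
                             2         32            32        $110,625
                             3         18            18        $103,750
                             4        343           343        $103,500
                             5        446           446        $100,000
                    Lowest   1        192           192         $19,650
                             2        372           372         $21,300
                             3        258           258         $21,300
                             4         22            22         $21,750
                             5         65            65         $21,900

Results of the normality tests

  Tests of Normality
                      Kolmogorov-Smirnov(a)           Shapiro-Wilk
            Sex       Statistic   df    Sig.          Statistic   df    Sig.
  Salary    Female      .146      216   .000            .842      216   .000
            Male        .208      258   .000            .813      258   .000
  a. Lilliefors Significance Correction

Results of the homogeneity-of-variance test

  Test of Homogeneity of Variance
                                                   Levene Statistic   df1   df2       Sig.
  Salary   Based on Mean                               119.669         1    472       .000
           Based on Median                              51.603         1    472       .000
           Based on Median and with adjusted df         51.603         1    310.594   .000
           Based on trimmed mean                        95.446         1    472       .000

  Stem-and-leaf plots of Current Salary grouped by sex
  Boxplots of salary for the male and female groups


Crosstabulation analysis

The crosstabulation analysis procedure
  Crosstabs main dialog box
  Dialog box for selecting statistics
  Exact Tests dialog box

Relevant formulas, where $X^2$ is the Pearson chi-square statistic, $N$ is the number of valid cases, and $k$ is the smaller of the number of rows and the number of columns:

  Contingency coefficient: $C = \sqrt{\dfrac{X^2}{X^2 + N}}$
  Cramér's V: $V = \sqrt{\dfrac{X^2}{N(k-1)}}$

  Cell Display dialog box
  Format dialog box

Crosstabulation results for Data-03

Table 7.15  Case processing summary

  Case Processing Summary
                                                 Valid              Missing            Total
                                                 N      Percent     N     Percent      N      Percent
  Number of children * Job category * Region     1414   93.2%       103   6.8%         1517   100.0%

Table 7.16  Multi-way frequency table of the variables

Table 7.17  Chi-square tests

  Chi-Square Tests (MC = Monte Carlo; Lower and Upper are the bounds of the 99% confidence interval)
                                                       Asymp. Sig.   MC Sig. (2-sided)          MC Sig. (1-sided)
  Region                                  Value    df  (2-sided)     Sig.    Lower   Upper      Sig.    Lower   Upper
  Northeast  Pearson Chi-Square           47.163a  40    .203        .186b   .160    .212
             Likelihood Ratio             44.483   40    .289        .262b   .233    .291
             Fisher's Exact Test          48.225                     .117b   .095    .138
             Linear-by-Linear Association  9.514c   1    .002        .003b   .000    .006       .002b   .000    .005
             N of Valid Cases             639
  Southeast  Pearson Chi-Square           61.974d  40    .014        .016b   .008    .025
             Likelihood Ratio             65.957   40    .006        .009b   .003    .016
             Fisher's Exact Test          55.621                     .011b   .004    .018
             Linear-by-Linear Association  9.398e   1    .002        .003b   .000    .006       .001b   .000    .002
             N of Valid Cases             381
  West       Pearson Chi-Square           47.883f  40    .183        .191b   .165    .216
             Likelihood Ratio             52.035   40    .096        .115b   .094    .136
             Fisher's Exact Test          47.618                     .072b   .055    .089
             Linear-by-Linear Association   .683g   1    .408        .411b   .378    .443       .200b   .174    .227
             N of Valid Cases             394
  a. 28 cells (51.9%) have an expected count of less than 5; the minimum expected count is 0.02.
  b. Fisher's test is based on the sample of 1517 observations.
  c. The standardized statistic is 3.084.
  d. 30 cells (55.6%) have an expected count of less than 5; the minimum expected count is 0.14.
  e. The standardized statistic is 3.066.
  f. 32 cells (59.3%) have an expected count of less than 5; the minimum expected count is 0.07.
  g. The standardized statistic is 0.827.

Crosstabulation results for Data-04
  Table 7.18  Case processing summary
  Table 7.19  Crosstabulation, and Table 7.20  Chi-square test results


Ratio analysis
  Ratio Statistics main dialog box
  Ratio Statistics: Statistics dialog box

Ratio analysis example: results for data07-05
  Table 7-21  Summary of the sample data
  Table 7-22  Ratio statistics for the ratio of final appraised value to sale price of real estate


P-P plots and Q-Q plots

  P-P Plots main dialog box
  Figure 7-32(a)  Weibull P-P plot of lung cancer survival time (Weibull P-P Plot of time; Observed Cum Prob vs. Expected Cum Prob)
  Figure 7-32(b)  Detrended Weibull P-P plot of lung cancer survival time (Detrended Weibull P-P Plot of time; Observed Cum Prob vs. Deviation from Weibull)

  Distribution of the pb variable in data07-07 before and after transformation
    Normal P-P Plot of Blood Pb (μg/100g), untransformed (Observed Cum Prob vs. Expected Cum Prob)
    Normal P-P Plot of Blood Pb (μg/100g), natural log transform (Observed Cum Prob vs. Expected Cum Prob)

  Q-Q Plots main dialog box
  Figure 7-35(a)  Normal Q-Q plot of the heights of 150 three-year-old girls in a city (Normal Q-Q Plot of Height (cm); Observed Value vs. Expected Normal Value)
  Figure 7-35(b)  Detrended normal Q-Q plot of the heights of 150 three-year-old girls in a city (Detrended Normal Q-Q Plot of Height (cm); Observed Value vs. Deviation from Normal)


Exercises and reference answers

Exercise 7, Question 5
Analyze, for respondents of each sex, whether the proportion subscribing to a newspaper differs across wage levels. The data file is data05-05; the variable "inccat" is the wage category, "News" is newspaper subscription status, and "gender" is sex.

Question 5: steps
(1) Open the data file data05-09, then choose Analyze → Descriptive Statistics → Crosstabs to open the Crosstabs main dialog box.
(2) Move the variable "inccat" into the Row(s) box and the variable "news" into the Column(s) box to set the row and column variables. Move the variable "gender" into the Layer box as the control variable.
(3) Click the Statistics button to open the Statistics dialog box and select the Chi-square check box.
(4) Click the Cells button to open the Cell Display dialog box; in the Counts group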