1. Basic Statistics (Measure Phase)

2. Statistics

Statistics is the branch of science that deals with the collection, presentation, analysis and interpretation of data for the purpose of decision-making and problem-solving.

Statistics is a critical skill in quality improvement, because statistical techniques can be used to describe and understand variability.

3. Population vs Sample

Population: the entire set of measurements of interest.
Sample: a subset of data from the population.
Parameters: numerical measures of a population.
Statistics: numerical measures of a sample.

4. Parameters vs Statistics

Measure               Parameter      Statistic
Mean                  $\mu$          $\bar{x}$
Variance              $\sigma^2$     $s^2$
Standard deviation    $\sigma$       $s$
Proportion            $\pi$          $p$

5. What's in a Name?

Field                  X                                    Y
Mathematics            Independent variable                 Dependent variable
Statistics             Predictor (factor)                   Response
Systems Engineering    Input                                Output
Quality Engineering    Cause                                Effect (quality)
Control Engineering    Parameter                            Performance index
Process Engineering    Control characteristic               Process characteristic
Six Sigma              KPIV (key process input variable)    KPOV (key process output variable)

6. Statistics: An Overview

Statistics
    Descriptive Statistics
        Graphical Presentations (data display): Charts, Tables
        Numerical Measures (data summary): Location, Dispersion, Shape
    Inferential Statistics
        Parameter Estimation: Point Estimate, Interval Estimate
        Hypothesis Testing: Parametric Methods, Nonparametric Methods

7. Descriptive Statistics: An Overview

Descriptive Statistics
    Graphical Presentations
        Charts: Dot Plot, Box Plot, Histogram, Stem & Leaf Diagram, Bar Chart, Trend Chart
        Tables: Frequency Distribution
    Numerical Measures
        Location: Mean, Median, Mode, Quartiles
        Dispersion: Range, Standard Deviation, Variance, Inter-quartile Range
        Shape: Skewness, Kurtosis

8. Descriptive Statistics

Graphical presentations provide visualization of the data set and facilitate detection of anomalies.

Numerical measures provide indices
that can be useful for making decisions, i.e. they facilitate inferential statistics.

9. Numerical Measures

Numerical measures describe the characteristics of the data set. The key numerical measures are:

    measures of location (central tendency)
    measures of dispersion (variation)
    measures of shape (distribution)

10. Mean

If the observations in a sample of size n are x1, x2, ..., xn, then the sample mean is the equally weighted average

$\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$

The mean is the most common measure of the location, or center, of the data.

11. Mean

The pull strengths (in gf) of 10 gold bonding wires are:

16.85  16.40  17.21  16.35  16.52  17.04  16.96  17.15  16.59  16.57

The sample mean pull strength for the 10 observations is

$\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i = \dfrac{16.85 + 16.40 + \cdots + 16.57}{10} = \dfrac{167.64}{10} = 16.764 \text{ gf}$

12. Mean

The sample mean represents the average value of all observations in the sample.

For a finite number of observations N, the population mean (denoted by $\mu$) may be determined by

$\mu = \dfrac{1}{N}\sum_{i=1}^{N} x_i$

13. Median

Let x(1), x(2), ..., x(n) denote a sample arranged in increasing order of magnitude. The sample median (the 50th percentile) is then defined as

$\tilde{x} = \begin{cases} x_{([n+1]/2)} & \text{if } n \text{ is odd} \\ \dfrac{x_{(n/2)} + x_{([n/2]+1)}}{2} & \text{if } n \text{ is even} \end{cases}$

The advantage of the median is that it is not influenced very much by extreme values.

14. Median

If the sample observations are

1  3  4  2  7  8  6

the sample mean and median are 4.4 and 4 respectively. Both quantities give a reasonable measure of the central tendency of the data.

If the last observation is changed so that the data are

1  3  4  2  7  8  2450

the sample mean becomes 353.6, while the sample median remains unchanged at 4.

15. Mode

The mode is the observation that occurs most frequently in the sample.

The mode may be unique, or there may be more than one mode. Sometimes the mode does not exist.

16. Mode

If the sample observations are

3  6  9  3  5  8  3  4  6  3  1  10

the sample mode is 3, since it occurs four times.

If the sample observations are

3  6  9  3  5  8  3  4  6  3  1  10  6  2  5  6

the sample modes are 3 and 6, since they both occur four times.

If the sample observations are

1  3  4  2  7  6  8

the sample mode does not exist, since every value occurs exactly once.

17. Mode

Why use the mode?

When the observations are categorical in nature, the mode is the single measure that best describes the data.

A classic example is the shirt manufacturer who wants to know which sizes (S, M, L, XL, XXL) to introduce to the market.

18. Quartiles

When an ordered set of data is divided into four equal parts, the division points are called quartiles.

The first or lower quartile Q1 is a value that has approximately 25% of the observations below it.

The second quartile Q2 is a value that has approximately 50% of the observations below it. It is also called the median.

The third or upper quartile Q3 is a value that has approximately 75% of the observations below it.

19. Quartiles

Examp