SPC: Basic Statistics

What is statistics?

Definition of statistics:
• Statistics are facts and figures.
• Statistics consist of a set of methods and rules for organizing and interpreting observations from populations and samples.

Populations and Samples
• The population is the entire group or set of all possible events of interest in the particular study.
• A sample is a subset of the population.
[Diagram: ENTIRE POPULATION, with SAMPLE WITHIN (subset)]

A statistic:
• A numerical value that describes a sample.

Two Types of Data
• Attribute data (qualitative)
  – Categories
  – Yes, No
  – Go, No-go
  – Machine 1, Machine 2, Machine 3
  – Pass/Fail
• Variable data (quantitative)
  – Discrete (count) data
    • Maintenance equipment failures, number of clogs
    • Number of customer returns
  – Continuous data
    • Decimal subdivisions are meaningful
    • Time, pressure, conveyor speed

Description of Continuous Data - Graphical
[Figure: histogram, "Height of 90 ladies"; x axis: Height (inch); y axis: # of occurrences]
Data distribution plot: in practice, observations are grouped into intervals, so the resulting distribution plot is a bar chart (histogram). When analyzing the data, the histogram is usually fitted with a continuous curve.

Examples of distributions
[Figure: three histograms titled "Comparison of Distributions", each plotting Frequency against the variable]
• C3: negative skew (left-tailed)
• C2: positive skew (right-tailed)
• C1: symmetric distribution (two-tailed)

Description of Quantitative Data - Central Tendency
Mean: the arithmetic average of a set of values.
  – Reflects the influence of all values
  – Strongly influenced by extreme values
Median: reflects the 50% rank, that is, the center number after a set of numbers has been sorted from low to high.
  – Does not include all values in the calculation
  – Is "robust" to extreme outlier scores
The mean and median are both affected by the nature of the distribution of the numbers.
Mode: the most frequently occurring value in a data set. In a Pareto chart this is the largest bar on the chart.

Relationship: mean and median
[Figure: three frequency histograms. Normal: mean and median coincide. NegSkew: median and mean marked separately. PosSkew: median and mean marked separately.]

Sample mean of a distribution
$\text{Mean} = \text{Average} = \bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i = \frac{X_1 + X_2 + \dots + X_N}{N}$
Example:
Part weights: 8.47, 8.67, 9.34, 7.99
$\text{AVERAGE} = \frac{8.47 + 8.67 + 9.34 + 7.99}{4} = 8.62$

[Cartoon caption: "Don't worry. That rope is one inch thick on the average."]

Description of Quantitative Data - Dispersion or 'Spread'
Range = maximum value - minimum value (within the sample)
Variance = mean squared distance from the mean
Standard deviation = the square root of the variance; it provides a measure of the standard distance from the mean.

• Deviation is the distance from the mean.
• Deviation score = observation - true mean, $X_i - \mu$
• Variance = mean or average of the squared deviation scores. $\sigma^2$ is the symbol for variance.
• Standard deviation = square root of the variance. $\sigma$ is the symbol for the standard deviation.
The Standard Deviation is a Measure of Variability
[Figure labels: $\mu$ = population mean; $X_i - \mu$ = deviation (distance from mean); $\sigma^2$ = variance; $\sigma$ = standard deviation]

Statistics or parameters?
SAMPLE (statistics):
  $\bar{X}$ = sample mean
  $s = \hat{\sigma}$ = sample standard deviation
POPULATION (parameters):
  $\mu$ = population mean
  $\sigma$ = population standard deviation
Statistics estimate parameters. The essence of statistical work is to use sample statistics to estimate population parameters, and thereby understand the population.

Population vs. sample: formulas
Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$
Sample mean: $\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$
Sample standard deviation: $\hat{\sigma} = s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$

Example: Calculating "sigma"
Using the form below, calculate the standard deviation for these five numbers: 2, 1, 3, 5, 4.
Form: one row per observation (rows 1 to 10), with columns $i$, $X$, $X - \bar{X}$, $(X - \bar{X})^2$, followed by rows for the sum, the mean, $s^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1}$, and $s$.

Exercise Solution
  i      X    X - X̄    (X - X̄)²
  1      2     -1         1
  2      1     -2         4
  3      3      0         0
  4      5      2         4
  5      4      1         1
  6-10   (unused)
  Sum   15                10
  Mean       3
  s²         2.5
  s          1.581139

Overview of data description
Numerical
• Location
  – Mean
  – Median
  – Mode
  – Quantiles: Q1 (first quartile), Q2 (second quartile), Q3 (third quartile), P#% (percentile)
• Spread
  – Range
  – Standard Deviation
  – Variance
  – Stability Factor
  – Span
  – Interquartile Range
  – Sum of Squares
Graphical
• Shape
  – Histograms
  – Run Charts
  – Time Plots
  – Scatter Plots
  – Box Plots
  – Block Chart
  – Normality Plot

Normal Distribution
Bell-shaped, symmetric distribution centered on the mean:
$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$
Measured by Standard Deviation
[Figure: normal curve with the axis marked from $-3\sigma$ to $+3\sigma$ around the mean; 68.27% of the area lies within $\pm 1\sigma$, with 15.865% in each tail]
[Figure: the same curve; 95.45% of the area lies within $\pm 2\sigma$]
Share of the data falling within a given distance of the mean:
  Range     Share of data
  ±1σ       68.27%
  ±2σ       95.45%
  ±3σ       99.73%
  ±4σ       99.9937%
  ±5σ       99.999943%
  ±6σ       99.9999998%

Statistical relationship between population and samples
From the population (parameters $\mu$, $\sigma$), draw 4 random samples of 3 items each. Each sample yields its own statistics: $(\bar{x}_1, s_1)$, $(\bar{x}_2, s_2)$, $(\bar{x}_3, s_3)$, $(\bar{x}_4, s_4)$.

Statistical distributions across samples
population → random samples of size 3 → $(\bar{x}_1, s_1)$, $(\bar{x}_2, s_2)$, $(\bar{x}_3, s_3)$, $(\bar{x}_4, s_4)$
[Figure: three density plots. Distribution of X: f(x) against x, 55 to 85. Distribution of S: f(s) against s, 0 to 15. Distribution of X-bar: f(xbar) against x, 55 to 85.]

Central Limit Theorem*
• Condition: $X_1, X_2, \dots, X_n$ are measurements of some characteristic on items drawn at random from a population whose mean for that characteristic is $\mu$ and whose standard deviation is $\sigma$.
• Conclusion: the distribution of the sample mean (imagine many such samples; their means form a distribution) has the following mean and standard deviation:
  $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
  In addition, the larger the sample size $n$, the closer the distribution of the sample means is to a normal distribution.
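
The Central Limit Theorem statement can be checked with a small simulation. The following is a minimal sketch, not part of the original slides: it assumes an exponential population with rate 1 (so that the population mean and standard deviation are both 1), and the function names, sample counts, and seed are illustrative choices.

```python
import math
import random
import statistics

# Assumed population (not from the slides): exponential with rate 1,
# which is strongly skewed and has mu = 1 and sigma = 1.
MU = 1.0
SIGMA = 1.0


def simulate_sample_means(n, num_samples=10000, seed=0):
    """Draw num_samples random samples of size n and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


def summarize(n):
    """Return (mean of sample means, std dev of sample means, sigma/sqrt(n))."""
    means = simulate_sample_means(n)
    return statistics.fmean(means), statistics.pstdev(means), SIGMA / math.sqrt(n)


if __name__ == "__main__":
    for n in (3, 30):
        mean_of_means, sd_of_means, predicted_sd = summarize(n)
        print(f"n={n}: mean of x-bar = {mean_of_means:.3f} (mu = {MU}), "
              f"sd of x-bar = {sd_of_means:.3f} (sigma/sqrt(n) = {predicted_sd:.3f})")
```

With a skewed population like this one, the mean of the sample means should stay close to the population mean for any n, while their spread should shrink roughly as sigma divided by the square root of n.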
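
The "Calculating sigma" exercise (data 2, 1, 3, 5, 4) can also be reproduced in code. This sketch follows the sample standard deviation formula with the n - 1 divisor; the helper name `sample_std` is an illustrative choice, not something defined in the slides.

```python
import math
import statistics


def sample_std(values):
    """Return (mean, sample variance, sample standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    # Squared deviation of each observation from the sample mean.
    squared_devs = [(x - mean) ** 2 for x in values]
    # Sample variance divides by n - 1, not n.
    variance = sum(squared_devs) / (n - 1)
    return mean, variance, math.sqrt(variance)


data = [2, 1, 3, 5, 4]
mean, variance, s = sample_std(data)
print(mean, variance, round(s, 6))  # 3.0 2.5 1.581139

# The standard library gives the same sample standard deviation.
print(math.isclose(s, statistics.stdev(data)))  # True
```

Dividing by n instead of n - 1 (as `statistics.pstdev` does) would give the population formula, which is only appropriate when the five values are the entire population.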
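
The coverage percentages listed for the normal distribution follow from the fact that, for a normal variable, the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)). A short sketch using only the standard library; the function name `coverage` is an illustrative choice.

```python
import math


def coverage(k):
    """Probability that a normal variable lies within k sigma of its mean."""
    return math.erf(k / math.sqrt(2))


for k in range(1, 7):
    print(f"within +/-{k} sigma: {100 * coverage(k):.7f}%")
```

The result does not depend on the particular mean or standard deviation, which is why the slide can state the percentages in units of sigma.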