Applied Business Statistics Lecture Notes, Chapter 2 (Chinese-English Bilingual)

1. Chapter 2: Organizing and Visualizing Variables

In this chapter you learn:
- Organizing categorical variables.
- Organizing numerical variables.
- Visualizing categorical variables.
- Visualizing numerical variables.

2. Organizing Data Creates Both Tabular and Visual Summaries

- Summaries both guide further exploration and sometimes facilitate decision making.
- Visual summaries enable rapid review of larger amounts of data and show possible significant patterns.
- Often, the Organize and Visualize steps in DCOVA occur concurrently.

3. Categorical Data Are Organized by Utilizing Tables

Categorical data are organized by tallying data:
- One categorical variable: summary table.
- Two categorical variables: contingency table.

4. Organizing Data: Summary Table (One Categorical Variable)

A summary table tallies the frequencies (counts) or percentages of items in a set of categories so that you can see differences between categories.

- A summary table which tallies the frequencies (counts) is also called a frequency table.
- A summary table which tallies the relative frequencies is also called a relative frequency table.

5. Organizing Data: Summary Table (One Categorical Variable)

Main Reason Young Adults Shop Online

Reason for Shopping Online?            Percent
Better prices                            37%
Avoiding holiday crowds or hassles       29%
Convenience                              18%
Better selection                         13%
Ships directly                            3%

Source: Data extracted and adapted from "Main Reason Young Adults Shop Online?" USA Today, December 5, 2012, p. 1A.

6. Contingency Table (Two Categorical Variables)

- A random sample of 400 invoices is drawn.
- Each invoice is categorized as a small, medium, or large amount.
- Each invoice is also examined to identify if there are any errors.
- These data are then organized in the following contingency table.

Contingency Table Showing Frequency of Invoices Categorized by Size and the Presence of Errors

                 No Errors   Errors   Total
Small Amount        170         20     190
Medium Amount       100         40     140
Large Amount         65          5      70
Total               335         65     400

7. Contingency Table Based on Percentage of Overall Total

                 No Errors   Errors    Total
Small Amount       42.50%     5.00%    47.50%
Medium Amount      25.00%    10.00%    35.00%
Large Amount       16.25%     1.25%    17.50%
Total              83.75%    16.25%   100.00%

42.50% = 170/400
25.00% = 100/400
16.25% = 65/400

83.75% of sampled invoices have no errors and 47.50% of sampled invoices are for small amounts.

8. Contingency Table Based on Percentage of Row Totals

                 No Errors   Errors    Total
Small Amount       89.47%    10.53%   100.00%
Medium Amount      71.43%    28.57%   100.00%
Large Amount       92.86%     7.14%   100.00%
Total              83.75%    16.25%   100.00%

89.47% = 170/190
71.43% = 100/140
92.86% = 65/70

Medium invoices have a larger chance (28.57%) of having errors than small (10.53%) or large (7.14%) invoices.

9. Tables Used for Organizing Numerical Data

Numerical data are organized using:
- Ordered array
- Frequency distributions
- Cumulative distributions

10. Organizing Numerical Data: Ordered Array

An ordered array is a sequence of data, in rank order, from the smallest value to the largest value.

- Shows range (minimum value to maximum value).
- May help identify outliers (unusual observations).

Age of Surveyed College Students

Day Students:   16 17 17 18 18 18 19 19 20 20 21 22 22 25 27 32 38 42
Night Students: 18 18 19 19 20 21 23 28 32 33 41 45

11. Organizing Numerical Data: Frequency Distribution

The frequency distribution is a summary table in which the data are arranged into numerically ordered classes.

You must give attention to selecting the appropriate number of class groupings for the table, determining a suitable width of a class grouping, and establishing the boundaries of each class grouping to avoid overlapping.

The number of classes depends on the number of values in the data. With a larger number of values, typically there are more classes.

To determine the width of a class interval, you divide the range (highest value - lowest value) of the data by the number of class groupings desired.

12. Organizing Numerical Data: Frequency Distribution Example

Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature.

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

13. Organizing Numerical Data: Frequency Distribution Example

Sort raw data in ascending order:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Step 1: Find range: 58 - 12 = 46.

Step 2: Select number of classes: the book chooses 5 (usually between 5 and 15).
Follow the "2 to k" rule:
- Number of observations is n.
- Choose the smallest k such that 2^k > n.

Step 3: Compute class inte...
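The first steps of the temperature example can be checked with a short Python sketch. It sorts the 20 values, finds the range, applies the "2 to k" rule, and divides the range by the number of classes as described for class interval width. The function names `smallest_k` and `class_width` are hypothetical, and the rule is assumed to use the strict inequality 2^k > n.

```python
# Daily high temperatures for the 20 sampled winter days (from the example).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def smallest_k(n):
    """Smallest k with 2**k > n ("2 to k" rule; strict inequality is assumed)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def class_width(values, k):
    """Range (highest value - lowest value) divided by the number of classes."""
    return (max(values) - min(values)) / k


ordered = sorted(temps)                 # ordered array, smallest to largest
data_range = ordered[-1] - ordered[0]   # Step 1: find the range
k = smallest_k(len(temps))              # Step 2: number of classes
width = class_width(temps, k)           # range divided by number of classes

print(ordered)
print(data_range, k, width)
```

For these data the sketch gives a range of 46 and k = 5, matching the values stated in the example, and an unrounded width of 46 / 5 = 9.2. Whether and how that width is rounded to a more convenient class size is not covered here.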
