Foods

Background

Data were collected to investigate the consumption pattern of a number of provisions in different European countries. The purpose of the investigation was to examine similarities and differences between the countries and the possible explanations.

Objective

The objective of this study is to understand how the variation in food consumption among a number of industrialized countries is related to culture and tradition, and hence to find the similarities and dissimilarities among the countries. To this end, data have been collected on 20 variables and 16 countries. The data show the percentage of households that use each of the 20 food items regularly.

Data

The data set consists of 20 variables (the different foods) and 16 observations (the European countries). The values are the percentages of households in each country where a particular product was found. For the complete data table, see below. This table is a good example of how to organise your data.

There are two secondary observation identifiers, Location (geographic) and Latitude (of capital).

The coding of Location is: C = central; S = south; N = north; U = UK & Ireland; X = beneluX.

The coding of Latitude is: 1 = <45°; 2 = 45-50°; 3 = 50-55°; 4 = 55-60°; 5 = >60°.

Outline

The steps to follow in SIMCA-P are:
• Import the data set.
• Prepare the data (Workset menu).
• Fit a PC model and review the fit (Analysis menu).
• Interpret the results (Analysis menu).

Define project

Start SIMCA-P and create a new project from FILE | NEW.

SIMCA-P Tutorial • Foods

Select the type of data (XLS) or All Supported Files (the default) and find the data set (FOODS_update.XLS).

Data can be imported from your hard disk or from a network drive. Data can be imported in different formats, so select the one that is appropriate, or All Supported Files. In this example we have the data in an XLS file created from Excel.

If the data set is on a floppy disk, we recommend that you first copy the file to the hard disk.

If you want to keep the current project open, remove the check mark from the box Close Current Project.

Note: The data set to import can be located anywhere on an accessible directory. It does not have to be located where you have defined the destination directory.

When you click on Open, SIMCA-P opens the Import Wizard. With SIMCA-P+, mark the radio button SIMCA-P normal project.

The import wizard detects that there is an empty row and asks if you want to exclude that row. Choose Yes.

SIMCA-P has tried to interpret the data table and has made some settings. Observations and variables must have a primary ID but can have many secondary IDs. The primary ID must be unique, but the secondary IDs need not be. The IDs will be used as labels in plots. In this case we have country names (unique) that are suitable as a primary ID, and food names that are also unique and can be set as a primary ID.

On each row and column there is a small arrow that can be used to change settings. Click on the arrow for the column with country names and choose "Primary Observation ID". The available settings for columns can be seen in the list. The default for variables is X.

The setting for column 2 is now Primary ID. The first column is set to exclude, which is fine. The 3rd and 4th columns (Geographic location and Capital Latitude) are not unique and will both be set as secondary IDs. The rest of the columns are the data (X-variables) and are not changed.

The same procedure is followed for the rows in the table. The first row contains numbers and the second row contains the food names (unique). Set the second row to Primary Variable ID. The first row will be excluded, which is fine.

Click on Next and give the project a name and a destination directory. Missing values are indicated.

Analysis

After the import wizard has finished, the primary data set is created in SIMCA. The primary data set is the data from which models are created. By default, the whole data set is selected with UV scaling (unit variance). The primary data set does not change, even when you make models in which you change the observations and/or variables, the scaling, etc.

The primary data set can be shown by choosing the menu Dataset: Open: FOODS_update, or by using the speed button. Here it is possible to do several things. If you right-click in the table, several options are available.

When data are imported, the project window opens and shows the start of the 1st model (PCA-X, unfitted).

In this case we want to fit a model to the data, and we use the menu Analysis: Autofit or a speed button. This calculates components one at a time and checks the significance of each component (based on cross validation). When a component is not significant, the procedure stops. A summary window opens, showing R2 and Q2 for the significant components.

The project window is updated.

To see the details of the model, double-click on the model row in the project window.

The plot with the summary of the fit of the model is displayed with R2X(cum) (fraction of the variation of the data explained after each component) and Q2(cum) (cross-validated R2X(cum)).

The summary of the fit of the model is displayed with R2X (fraction of the variation of the data explained by each component) and cumulative R2X(cum), Q2 and Q2(cum) (cross-validated R2X and R2X(cum)), as well as the eigenvalues.

The food variables are, as expected, correlated, and fairly well summarized by three new variables, the scores, explaining 65% of the variation.

In total the model describes 64.8% of the variation (R2(cum)) in the data, with a Q2 of 14.4% (bad prediction properties of the model). The 1st component describes 31.7% of the variation.

Scores and Loadings

Scores

To get a quick overview of the results from the model, use a speed button that will create four important plots directly.

These plots are the score plot (upper left, t1 vs. t2), the loading plot (lower left, p1 vs. p2), DModX (distance to model) and X/Y Overview plo