AGeometricPerspectiveonMachineLearning何晓飞浙江大学计算机学院1MachineLearning:theproblemf何晓飞Information(trainingdata)f:X→YXandYareusuallyconsideredasaEuclideanspaces.2ManifoldLearning:geometricperspectiveThedataspacemaynotbeaEuclideanspace,butanonlinearmanifold.☒Euclideandistance.☒fisdefinedonEuclideanspace.☒ambientdimension☑geodesicdistance.☑fisdefinedonnonlinearmanifold.☑manifolddimension.instead…3ManifoldLearning:thechallengesThemanifoldisunknown!Wehaveonlysamples!HowdoweknowMisasphereoratorus,orelse?HowtocomputethedistanceonM?versusThisisunknown:Thisiswhatwehave:??orelse…?TopologyGeometryFunctionalanalysis4ManifoldLearning:currentsolutionFindaEuclideanembedding,andthenperformtraditionallearningalgorithmsintheEuclideanspace.5Simplicity6Simplicity7Simplicityisrelative8Manifold-basedDimensionalityReductionGivenhighdimensionaldatasampledfromalowdimensionalmanifold,howtocomputeafaithfulembedding?Howtofindthemappingfunction?Howtoefficientlyfindtheprojectivefunction?fff9AGoodMappingFunctionIfxiandxjareclosetoeachother,wehopef(xi)andf(xj)preservethelocalstructure(distance,similarity…)k-nearestneighborgraph:Objectivefunction:Differentalgorithmshavedifferentconcerns10LocalityPreservingProjectionsPrinciple:ifxiandxjareclose,thentheirmapsyiandyjarealsoclose.11LocalityPreservingProjectionsPrinciple:ifxiandxjareclose,thentheirmapsyiandyjarealsoclose.Mathematicalformulation:minimizetheintegralofthegradientoff.12LocalityPreservingProjectionsPrinciple:ifxiandxjareclose,thentheirmapsyiandyjarealsoclose.Mathematicalformulation:minimizetheintegralofthegradientoff.Stokes’Theorem:13LocalityPreservingProjectionsPrinciple:ifxiandxjareclose,thentheirmapsyiandyjarealsoclose.Mathematicalformulation:minimizetheintegralofthegradientoff.Stokes’Theorem:LPPfindsalinearapproximationtononlinearmanifold,whilepreservingthelocalgeometricstructure.14ManifoldofFaceImagesExpression(SadHappy)Pose(RightLeft)15ManifoldofHandwrittenDigitsThicknessSlant16Learningtarget:TrainingExamples:LinearRegressionModelActiveandSemi-SupervisedLearning:AGeometricPerspective17GeneralizationErrorGoalofRegressionObtainalearnedfunctionthatminimizesthegeneralizationerror(expectederrorforunseentestinputpoints).MaximumLikelihoodEstimate18Gauss-MarkovTheoremForagivenx,theexpectedpredictionerroris:19-4-3-2-10123400.10.20.30.40.50.60.70.8-4-3-2-10123400.10.20.30.40.50.60.70.8Gauss-MarkovTheoremForagivenx,theexpectedpredictionerroris:Good!Bad!20ExperimentalDesignMethodsThreemostcommonscalarmeasuresofthesizeoftheparameter(w)covariancematrix:A-optimalDesign:determinantofCov(w).D-optimalDesign:traceofCov(w).E-optimalDesign:maximumeigenvalueofCov(w).Disadvantage:thesemethodsfailtotakeintoaccountunmeasured(unlabeled)datapoints.21ManifoldRegularization:Semi-SupervisedSettingMeasured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructure?22Measured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructure?randomlabelingManifoldRegularization:Semi-SupervisedSetting23Measured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructure?randomlabelingactivelearningactivelearning+semi-supervsedlearningManifoldRegularization:Semi-SupervisedSetting24UnlabeledDatatoEstimateGeometryMeasured(labeled)points:discriminantstructure25UnlabeledDatatoEstimateGeometryMeasured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructure26UnlabeledDatatoEstimateGeometryMeasured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructureComputenearestneighborgraphG27UnlabeledDatatoEstimateGeometryMeasured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructureComputenearestneighborgraphG28UnlabeledDatatoEstimateGeometryMeasured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructureComputenearestneighborgraphG29UnlabeledDatatoEstimateGeometryMeasured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructureComputenearestneighborgraphG30UnlabeledDatatoEstimateGeometryMeasured(labeled)points:discriminantstructureUnmeasured(unlabeled)points:geometricalstructureComputenearestneighborgraphG31LaplacianRegularizedLeastSquare(BelkinandNiyogi,2006)LinearobjectivefunctionSolution32ActiveLearningHowtofindthemostrepresentativepointsonthemanifold?33Objective:Guidetheselectionofthesubsetofdatapointsthatgivesthemostamountofinformation.Experimentaldesign:selectsamplestolabelManifoldRegularizedExperimentalDesignSharethesameobjectivefunctionasLaplacianRegularizedLeastSquares,simultaneouslyminimizetheleastsquareerroronthemeasuredsamplesandpreservethelocalgeometricalstructureofthedataspace.ActiveLearning34,Inordertomaketheestimatorasstableaspossible,thesizeofthecovariancematrixshouldbeassmallaspossible.D-optimality:minimizethedeterminantofthecovariancematrix2()CovIy12TXLXI12TTHZZXLXIˆwˆ()Covw112ˆ()TTZZXLXIZwyAnalysisofBiasa