VISION AND LEARNING FOR INTELLIGENT HUMAN-COMPUTER INTERACTION

2020-04-23


© Copyright by Ying Wu, 2001

VISION AND LEARNING FOR INTELLIGENT HUMAN-COMPUTER INTERACTION

BY

YING WU

B.E., Huazhong University of Science and Technology, 1994
M.E., Tsinghua University, 1997

THESIS

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering in the Graduate College of the University of Illinois at Urbana-Champaign, 2001

Urbana, Illinois

ABSTRACT

It has long been a dream to make computers see. Research in computer vision provides promising technologies to capture, analyze, transmit, retrieve and interpret visual information. However, owing to the richness and large variation of visual inputs, many statistical learning techniques for visual motion capture and recognition face similar problems in practice, so building intelligent and visually capable machines is still a challenging task. This dissertation concentrates on two important problems: capturing and recognizing human motion in video sequences, which are crucial for the research and applications of intelligent human-computer interaction, multimedia communication, and smart environments.

This dissertation presents three effective techniques for visual motion analysis tasks: non-stationary color model adaptation for efficient localization, integration of multiple visual cues for robust tracking, and learning motion models for capturing articulated hand motion. In addition, this dissertation describes a novel statistical learning method, the Discriminant-EM (D-EM) algorithm, in the framework of the self-supervised learning paradigm. D-EM employs both labeled and unlabeled training data and brings together supervised and unsupervised learning. Many topics in the dissertation are unified by the four problems of self-supervised learning, i.e., transduction, co-transduction, model transduction and co-inferencing. Extensive experiments and two prototype systems have validated the proposed approaches in the domain of vision-based human-computer interaction.

To my parents and to Jindan

ACKNOWLEDGMENTS

Above all, I would like to express my sincere thanks to my advisor, Professor Thomas S. Huang, for his insightful guidance, enlightening advice, and endless encouragement throughout my Ph.D. study, which has given me a great opportunity to explore various difficult but interesting problems. I was lucky, and am proud, to be a student of his, a great man with extraordinary vision and wisdom.

I would especially like to thank my two mentors at Microsoft Research, Dr. Kentaro Toyama and Dr. Zhengyou Zhang, for their selfless discussions and suggestions, without which this work would not have been possible.

I would also like to thank my Ph.D. advisory committee members, Professor Narendra Ahuja, Professor David Kriegman, and Dr. Kentaro Toyama, for their inspiring and constructive discussions during my study.

I would also like to thank all my colleagues in the Image Formation and Processing Group and many of my friends at Microsoft Research. In particular, I would like to thank Dr. Steve Shafer, Dr. Ying Shan, Dr. Harry Shum, Dr. John Krumm, Dr. Rick Szeliski, Erik Hanson, Dr. Vladimir Pavlovic, Greg Berry, Dr. Nebojsa Jojic, Dr. Qiong Liu, John Y. Lin, Qi Tian, Sean Xiang Zhou, and Yunqiang Chen. Special thanks to John Y. Lin for his hard work in collecting finger motion data and his selfless help with finger tracking experiments and paper proofreading.

I wish to thank my family for all their endless love, support and encouragement throughout my study abroad.

Last but not least, I would like to express my deep thanks to my dear wife, Jindan, for all her love, sacrifice, understanding and help, which can be felt in every word of this work.

TABLE OF CONTENTS

CHAPTER

1 INTRODUCTION
  1.1 Background
    1.1.1 Virtual environments
    1.1.2 Human-computer interaction
    1.1.3 Vision-based human-computer interaction
    1.1.4 Gesture interfaces
    1.1.5 Visual learning
  1.2 Motivation
  1.3 Organization
  1.4 Contributions

2 VISION-BASED GESTURE INTERFACES: A REVIEW
  2.1 Introduction
  2.2 Gesture Representation
  2.3 Hand Modeling
    2.3.1 Modeling the shape
    2.3.2 Modeling the kinematic structure
    2.3.3 Modeling the dynamics
  2.4 Capturing Human Hand Motion
    2.4.1 Formulating hand motion
    2.4.2 Localizing hands in video sequences
    2.4.3 Selecting image features
    2.4.4 Capturing hand motion in full DOF
  2.5 Data Preparation for Recognition
    2.5.1 Features for gesture recognition
    2.5.2 Data collection for recognition
  2.6 Static Hand Posture Recognition
    2.6.1 3-D Model-based approaches
    2.6.2 Appearance-based approaches
  2.7 Temporal Gesture Recognition
    2.7.1 Recognizing low-level motion
    2.7.2 Recognizing high-level motion
    2.7.3 Gesture recognition by HMM