Persona: A contextualized and personalized web search

Francisco Tanudjaja | fstanud@mit.edu
Lik Mui | lmui@mit.edu
Laboratory of Computer Science at MIT, Cambridge, MA 02139
June 1, 2001

Abstract

Recent advances in graph-based search techniques derived from Kleinberg's work [1] have been impressive. This paper further improves the graph-based search algorithm in two dimensions. Firstly, variants of Kleinberg's techniques do not take into account the semantics of the query string nor of the nodes being searched. As a result, polysemy of query words cannot be resolved. This paper presents an interactive query scheme utilizing the simple web ontology provided by the Open Directory Project to resolve meanings of a user query. Secondly, we extend a recently proposed personalized version of the Kleinberg algorithm [3]. Simulation results are presented to illustrate the sensitivity of our technique. We outline the implementation of our algorithm in the Persona personalized web search system.

1 Overview

Search engines index large numbers of documents and let users query for desired documents. However, most search engines are not tailored to meet individual user preferences. [6] noted that almost half of the documents returned by search engines are deemed irrelevant by their users. There are several aspects to the problem. First is the problem of synonyms and homonyms. Synonyms are words that are spelt differently but have the same meaning. Homonyms are words that are spelt the same but have different meanings. Without prior knowledge, there is no way for the search engine to predict user interest from simple text-based queries. Secondly, search engines are expected to be deterministic, in that they should return the same set of documents to all users who issue the same query at a given time. It is therefore inherent that search engines are not designed to adapt to personal preferences.

Current information retrieval and data mining research tries to enhance the user's web experience from several directions. One direction is to create a better structural model of the web, such that it can interface more efficiently with search engines. Another approach is to model user behavior so as to predict users' interests better.

Along the lines of the former are efforts at better defining the meaning of queries themselves. The Wordnet project at Princeton University is an online lexical reference system that organizes English words into synonym sets [7]. A similar approach is to build a taxonomy of words. A taxonomy comprises a tree structure in which a word belongs to a certain node, each with parents and children. A node's parent serves as a general category that encompasses all of its children. A node may have children that are sub-categories of itself. Examples of such word taxonomies are the Open Directory Project [] and the Magellan hierarchy [].

Yet another approach is to create a semantic structure in machine-readable format. As opposed to classifying content from a person's point of view, this method embeds metadata for classification, allowing document content to be machine readable. There are currently efforts at standardizing these classifications, for example OIL (Ontology Interchange Language) and DAML (DARPA Agent Markup Language). Haystack [4] is an ongoing project in semantic metadata indexing.

Along the lines of the latter approach, various research efforts in data mining and knowledge representation have built models to record user interest and predict user behavior. Ultimately, these user models interface with a system so as to give it a priori knowledge regarding user preferences.

Clearly, work in user profiling is closely related to building better personalized systems. Different methods of gathering user data are often coupled with various personalization systems. We found that the combinations that are available in the context of personalized search are unsatisfactory. We propose a novel approach to building a better system, with two contributions. First, we extend existing theory with regard to personalized search. Second, we propose to model users' interests using an interactive query scheme utilizing the web ontology provided by the Open Directory Project.

To support our argument, we have built an implementation of a personalized search engine. The system wraps a personalization module onto an existing search engine, and refines search results using the proposed extension of the graph-based algorithm. At its core, the proposed system utilizes a taxonomy of user interest and disinterest. We use a tree coloring method to represent user profiles. A visited node is 'colored' by the number of times it is visited, whether the user rates it as positive or negative, and the URLs that it is associated with.

In addition, we run sets of controlled experiments to analyze the performance of each of the existing variants. The experimental results verify our predictions and confirm that the proposed extension performs better.

We offer a roadmap of this document. Section 2 outlines related work in personalized web browsing and reviews existing methods using graph-based search algorithms. Section 3 describes our extension to existing theory. Section 4 describes the user modeling technique. Section 5 outlines the implementation of Persona. Section 6 describes the simulation results. We conclude in section 7 with some directions for future work.

2 Related Works

2.1 Examples of personalization applications

Personalization applications cover a wide spectrum. At one end of the spectrum, we have filtering systems, which filter input from an information resource. Information of possible interest is marked. An example of such a filtering system, SmartPush [8], combines several novel ideas. The system finds information by means of semantic metadata to filter news articles. In addition, it builds the user profile using a simple hierarchical concept model. For example, under the category news, there are the categories sports, literature, economics, etc. The model records user preference by giving weightings to the