Java Web: English-Language Literature


PIDALION: Implementation issues of a Java-based Multimedia Search Engine over the web

Dimitris E. Charilas, Ourania I. Markaki
National Technical University of Athens, Department of Electrical and Computer Engineering, Heroon Polytechneiou 9, Athens, 15773, Greece
Phone: (+30) 210-7722078
E-mail: omarkaki@gmail.com

Keywords: multimedia content, queries, content-based retrieval, multimedia crawler, metadata, image histogram, hierarchical presentation

Abstract - Fuelled by the rapid expansion of broadband connectivity and increasing interest in online multimedia-rich applications, the growth of digital multimedia content has skyrocketed. Among other things, this growth is compounding the need for more effective methods for searching multimedia information. The automated web search engines currently in use rely only on text descriptions and, as a result, provide matches of poor quality in the case of multimedia content. The services of a multimedia search engine are therefore something that internet users still lack. Thus, the scope of this paper is to present an implementation approach for a personalized web-based multimedia search engine in the Java programming language. This approach combines the characteristics of current search engines with new, innovative features which guarantee at the same time the system’s quick response and better search results. In this paper the reader can find an analytical presentation of all the components required to form a multimedia search engine, as well as indications on how to implement key algorithms and functions.

1. INTRODUCTION

The web creates new challenges for information retrieval. The amount of information on the web is growing rapidly, and so is the number of new users inexperienced in the art of web research. It is estimated that 1-2 Exa-Bytes (millions of Tera-Bytes) of new information are created each year over the Web. This huge amount of information is anticipated to grow by a factor of 10 in the following two years. Automated search engines that rely on keyword matching usually return too many low-quality matches. The situation is worse as far as multimedia content is concerned. The most popular search engine, Google [1], relies only on keywords to search for images and does not contain any information on semantic content.

Content-based image retrieval (CBIR) systems try to solve this problem. Many CBIR systems have recently been proposed and implemented in the literature. Examples include the QBIC system [2], where colour information is exploited; the PicToSeek system [3], which combines colour and shape invariant features to perform image retrieval; and Virage [4], which allows users to manually regulate the importance of the extracted descriptors according to their own perception. Fuzzy organization of the descriptors is proposed in [5] for increasing the retrieval precision at a certain recall value, while 3D searching is discussed in [6]. Applications of content-based retrieval systems are examined in [7], while in [8] a system regarding music access is proposed. Personalized retrieval is examined in the work presented in [9]. Last but not least, Marvel, the latest and more intelligent content-based search engine, developed by the IBM research centre, USA, in 2004 [10], tries to increase retrieval precision by incorporating semantic annotation in the media volumes.

However, all the adopted approaches have only static and local access to the system’s database and thus cannot retrieve content from the web [11]. Furthermore, the aforementioned works focus on the algorithms for efficient content-based retrieval and not on the practical issues regarding the implementation of a large-scale multimedia search engine over the Web.

So far, several different techniques for making distributed multimedia content searchable have been proposed. In [12] there is information on the techniques of checking the outgoing links, analyzing the referring page, mining for textual information in the media file and utilizing metadata using the Dublin Core metadata model or the MPEG-7 standard.

This paper focuses on describing a multimedia search engine that combines features from existing search engines and enhances their functionalities through innovative algorithms and mechanisms. Our goal is not only to describe the system’s architecture and interconnectivity, but also to explain how the algorithms can be implemented in Java code. The proposed system, named PIDALION, runs in a Windows environment, while the Java Server Pages (JSP) and Java Servlets technologies are adopted to ensure the system’s interoperability and dynamic behaviour. The system’s database runs on SQL Server 2000.

One of the key features of the proposed search engine is the provision of fully personalized retrieval services: users of PIDALION may share their personal content either with all web users or within the frame of groups, as well as maintain a personal profile, where their preferences are stored. Personalized retrieval can be achieved through the creation of social groups and the use of dynamic relevance feedback mechanisms, which tailor the system’s performance to the current user’s preferences.

978-1-4244-4530-1/09/$25.00 ©2009 IEEE

This paper is organized as follows: Section 2 presents the system’s architecture, explaining briefly the role of each main component. Sections 3 to 7 present the functionality, architecture and key features and innovations of each component. Key algorithms are depicted in the form of pseudo-code. Finally, in Section 8 the issues covered in this paper are summarized and future expansions are proposed.

2. SYSTEM OVERVIEW

Fig. 1. Interconnection between subsystems

The platform described in this paper consists of the following subsystems:

• The multimedia crawling subsystem, whose role is to index multimedia content a