

PIDALION: Implementation issues of a Java-based Multimedia Search Engine over the web

Dimitris E. Charilas, Ourania I. Markaki
National Technical University of Athens, Department of Electrical and Computer Engineering, Heroon Polytechneiou 9, Athens, 15773, Greece
Phone: (+30) 210-7722078
E-mail: omarkaki@gmail.com

Keywords: multimedia content, queries, content-based retrieval, multimedia crawler, metadata, image histogram, hierarchical presentation

Abstract - Fuelled by the rapid expansion of broadband connectivity and increasing interest in online multimedia-rich applications, the growth of digital multimedia content has skyrocketed. Among other effects, this growth is compounding the need for more effective methods for searching multimedia information. The automated web search engines currently in use rely only on text descriptions and as a result provide matches of poor quality in the case of multimedia content. The services of a multimedia search engine are therefore something that internet users still lack. Thus, the scope of this paper is to present an implementation approach for a personalized web-based multimedia search engine in the Java programming language. This approach combines the characteristics of current search engines with new innovative features that guarantee both quick system response and better search results. In this paper the reader can find an analytical presentation of all the components required to form a multimedia search engine, as well as indications on how to implement key algorithms and functions.

978-1-4244-4530-1/09/$25.00 ©2009 IEEE

1. INTRODUCTION

The web creates new challenges for information retrieval. The amount of information on the web is growing rapidly, and so is the number of new users inexperienced in the art of web research. It is estimated that 1-2 exabytes (millions of terabytes) of new information are created each year over the Web. This huge amount of information is anticipated to grow by a factor of 10 in the following two years. Automated search engines that rely on keyword matching usually return too many low-quality matches. The situation is worse as far as multimedia content is concerned. The most popular search engine, Google [1], relies only on keywords to search for images and does not contain any information on semantic content.

Content-based image retrieval (CBIR) systems try to solve this problem. Many CBIR systems have recently been proposed and implemented in the literature. Examples include the QBIC system [2], where colour information is exploited, the PicToSeek system [3], which combines colour and shape invariant features to perform image retrieval, and Virage [4], which allows users to manually regulate the importance of the extracted descriptors according to their own perception. Fuzzy organization of the descriptors is proposed in [5] for increasing the retrieval precision at a certain recall value, while 3D searching is discussed in [6]. Applications of content-based retrieval systems are examined in [7], while in [8] a system regarding music access is proposed. Personalized retrieval is examined in the work presented in [9]. Last but not least, Marvel, the latest and more intelligent content-based search engine, developed by the IBM research centre, USA, in 2004 [10], tries to increase retrieval precision by incorporating semantic annotation in the media volumes.

However, all the adopted approaches have only static and local access to the system's database and thus cannot retrieve content from the web [11]. Furthermore, the aforementioned works focus on the algorithms for efficient content-based retrieval and not on the practical issues regarding the implementation of a large-scale multimedia search engine over the Web. So far, several different techniques for making distributed multimedia content searchable have been proposed. In [12] there is information on the techniques of checking the outgoing links, analyzing the referring page, mining for textual information in the media file and utilizing metadata using the Dublin Core metadata model or the MPEG-7 standard.

This paper focuses on describing a multimedia search engine that combines features from existing search engines and enhances their functionalities through innovative algorithms and mechanisms. Our goal is not only to describe the system's architecture and interconnectivity, but also to explain how the algorithms can be implemented in Java code. The proposed system, named PIDALION, runs in a Windows environment, while the Java Server Pages (JSP) and Java Servlets technologies are adopted to ensure the system's interoperability and dynamic behaviour. The system's database runs on SQL Server 2000.

One of the key features of the proposed search engine is the provision of fully personalized retrieval services: users of PIDALION may share their personal content either with all web users or within the frame of groups, as well as maintain a personal profile, where their preferences are stored. Personalized retrieval can be achieved through the creation of social groups and the use of dynamic relevance feedback mechanisms, which tailor the system's performance to the current user's preferences.

This paper is organized as follows: Section 2 presents the system's architecture, explaining briefly the role of each main component. Sections 3 to 7 present the functionality, architecture and key features and innovations of each component. Key algorithms are depicted in the form of pseudo-code. Finally, in Section 8 the issues covered in this paper are summarized and future expansions are proposed.

2. SYSTEM OVERVIEW

Fig. 1. Interconnection between subsystems

The platform described in this paper consists of the following subsystems:

• The multimedia crawling subsystem, whose role is to index multimedia content a
