Web Pages Retrieval with Adaptive Neuro Fuzzy System based on Content and Structure (IJMECS-V7-N8-8)


I.J. Modern Education and Computer Science, 2015, 8, 69-84
Published Online August 2015 in MECS
DOI: 10.5815/ijmecs.2015.08.08
Copyright © 2015 MECS

Web Pages Retrieval with Adaptive Neuro Fuzzy System based on Content and Structure

Mohammad Saber Iraji
Faculty Member of Department of Computer Engineering and Information Technology, Payame Noor University, I.R. of Iran
Email: iraji.ms@gmail.com

Hakimeh Maghamnia
Department of Computer Engineering and Information Technology, Payame Noor University, I.R. of Iran
Email: h.maghamnia@gmail.com

Marzieh Iraji
Department of Computer Engineering and Information Technology, University College of Rouzbahan, Sari, Iran
Email: marziehiraji@gmail.com

Abstract: The volume of web pages and information on the web is constantly increasing. In this paper, we present a system to retrieve pages relevant to a query, which can be used by search engines. In the design of our proposed system, content, page content of neighbors, and connectivity (link analysis) features were used, and the fuzzy Sugeno and adaptive neuro-fuzzy network methods were considered. Results showed that the neural method has lower error than the other methods in retrieving web pages tailored to the user's search query on the Web, and can increase the efficiency of search engines.

Index Terms: Web pages retrieval, adaptive neuro fuzzy, search engines.

I. INTRODUCTION

One application of computers and the Internet is searching large volumes of pages, a task of interest to information retrieval researchers and computer users. Most people search by submitting a query to a search engine such as Google. The volume of information available on the Internet is increasing every moment. Users are often looking for useful information within this mass, so it is important for search engines to provide useful information to them. The main challenge for search engines is to determine which documents are relevant and which are irrelevant to the query. Search engines benefit from spiders (Web robots), which use different algorithms to retrieve Web pages relevant to a specific domain. Filtering methods are divided into four categories:

1. The relevance of a Web page to a subject is determined manually by experts [1].
2. The suitability of a Web page to a specific topic is determined by the number of occurrences of keywords [2].
3. TFIDF (term frequency-inverse document frequency) is computed based on a lexicon [3].
4. Text classification methods are applied to Web pages [4].

Web page filtering is practical in search engines and in Web applications such as Web content management. Spamming has also spread to web pages, after email. Web spamming decreases the quality of search engine results. As a result, useless pages are indexed by search engines and the cost of query processing increases [5]. Providing appropriate information to Internet users based on the content and links of web pages is therefore a challenge for servers.

This article is motivated by the design of a neuro-fuzzy system that accurately retrieves Web pages according to Internet users' queries. The aim of this study is to examine and discuss the web pages retrieval system; the paper attempts to optimize the web pages retrieval algorithm.

The paper is organized in five sections. After the introduction in Section I, Section II introduces related work on web page filtering and continues with adaptive neuro-fuzzy models for the proposed system, with examples in Section III. Sections IV and V present the results and conclusions of the research. The paper ends with a list of references.

II. WORK HISTORY

Google Scholar is a search engine used by researchers. In Google Scholar, citation count is the highest-weighted factor in the retrieval process, and the occurrence of a search word in an article's title has a strong impact on the article's ranking [6]. Rongmei Li evolved clicked pages from clicked domains in order to improve the efficiency of Web information retrieval [9]. Hema Dubey and B. N. Roy offer a new page rank algorithm, based on the mean of page ranks, that reduces algorithm complexity [10]. Bhamidipati et al. introduce a score fusion technique, applied when two pages have the same ranking [11]. Sharma et al. compared different methods for ranking web pages with different algorithms [12]. Minnie et al. implemented the Links algorithm and other algorithms [13]; a result is retrieved only if an exact match occurs. Qiu, Hemmje, et al. offer a page filtering system based on page links, using page link information to improve search query algorithms [14].
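
To make the second and third filtering categories listed in Section I more concrete, the following minimal Python sketch scores pages against a query, first by raw keyword count and then by a simple TF-IDF weighting. The function names (`tokenize`, `keyword_count_score`, `tfidf_scores`), the whitespace tokenizer, the sample pages, and the smoothed IDF formula are illustrative assumptions; this is not the method of this paper or of the works cited as [2] and [3].

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems parse HTML)."""
    return text.lower().split()


def keyword_count_score(page, query):
    """Category 2: number of occurrences of the query keywords in the page."""
    counts = Counter(tokenize(page))
    return sum(counts[term] for term in set(tokenize(query)))


def tfidf_scores(pages, query):
    """Category 3: sum of TF-IDF weights of the query terms, per page.

    Assumed variant: tf = count / page length and a smoothed
    idf = ln((1 + N) / (1 + df)) + 1, where N is the number of pages.
    """
    docs = [tokenize(p) for p in pages]
    n_docs = len(docs)
    query_terms = set(tokenize(query))
    # Document frequency: number of pages containing each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        if not d:  # an empty page cannot match anything
            scores.append(0.0)
            continue
        counts = Counter(d)
        score = 0.0
        for t in query_terms:
            tf = counts[t] / len(d)
            idf = math.log((1 + n_docs) / (1 + df[t])) + 1
            score += tf * idf
        scores.append(score)
    return scores


# Hypothetical sample data, for illustration only.
pages = [
    "fuzzy logic for web page retrieval",
    "neural network web spam detection",
    "cooking recipes and kitchen tips",
]
query = "web retrieval"

print([keyword_count_score(p, query) for p in pages])  # [2, 1, 0]
print(tfidf_scores(pages, query))
```

In this toy example both scores rank the first page highest and give the third page a score of zero. Unlike the raw count, the TF-IDF score also gives more weight to the rarer term "retrieval" than to "web", which appears in two of the three pages.
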
The authors of [15] report a machine-learning-based method that combines Web content and structure analysis. They represent each Web page by a set of content-based and link-based features. They used a type of neural network, namely the support vector machine, and compared their proposed method with two existing web page filtering methods, a keyword-based method and a lexicon-based method; the proposed method performed better. Scarselli et al. introduced a machine learning approach for web spam detection based on Graph Neural Networks and PM-GraphSOMs. They use link-based features (degree-related measures, PageRank, TrustRank, Truncated PageRank, estimation of supporters) and content-based features (fraction of anchor text, fraction of visible text, compression rate, corpus precision and corpus recall, query precision and query recall, independent trigram likelihood, entropy of trigrams) in their proposed system [5]. They evaluated their system on a training set (8339 pages) and a test set (1851 pages) from the WEBSPAM-UK2006 dataset. The results demonstrate the optimization achieved by their method. Khokale et al. offered a Web information retrieval approach with fuzzy logic. In th
