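Slide 1: Information Retrieval Strategies and Techniques (資訊檢索策略與技巧)
  - 黃慕萱 (Huang Mu-hsuan), Chap. 6
  - Harter, Chap. 7

Slide 2: Search Strategy vs. Search Heuristics
  - "Strategy" was originally a military term
  - Views of various authors
    - 1979, Marcia Bates, "Information Search Tactics"
    - Hartly
      - Methods for avoiding the retrieval of irrelevant articles
      - Possible responses when too many or too few relevant articles are found
    - Palmer: refers to building blocks search and citation pearl growing
    - Pao: refers to Boolean logic, citations, and probabilistic retrieval strategies
  - Search strategy (檢索策略)
    - An overall consideration of, or comprehensive plan for, a search problem
    - Examples: building blocks search, citation pearl growing, etc.
  - Search heuristics (檢索技巧)
    - Actions taken to accomplish a specific purpose

Slide 3: Brief Search (簡易檢索)
  - The most common way of searching
  - Fast and inexpensive
  - But often low recall and low precision
  - Suitable when
    - The topic is clearly defined
    - The searcher wants to learn which descriptors and index terms the database producer uses
    - Bibliographic data needs to be verified
    - The title, author, etc. are already known

Slide 4: Building Blocks Search (分區組合檢索法)
  - Also called "block building" or "building block"
  - Procedure
    - Break the search problem down into several facets (主題層面)
    - Determine the relationships among the facets
      - Facets are usually related by "AND"; "OR" and "NOT" occur less often
    - Identify search terms that represent each facet
      - Combine them with Boolean "OR" (union) for completeness
  - The most widely used strategy; early reference interview forms were often designed around it

Slide 5: Building Blocks Search Strategy (1/4)
  1. Conduct reference interviews
  2. Formulate search objectives
     - High recall
     - High precision
     - Moderate levels of recall and precision
  3. Select database(s) and search system
  4. Identify major concepts or facets and their logical relationships with one another

Slide 6: Building Blocks Search Strategy (2/4)
  5. Identify search strings that represent the concepts
     - Words
     - Full-text phrases
     - Pieces of words
     - Descriptors
     - Identifiers
     - Codes
     - Non-semantic bibliographic characteristics
       - Fields unrelated to subject, such as document type, language, and year
     - Include synonyms, near-synonyms, narrower terms, and related terms
     - Fields to be searched

Slide 7: Building Blocks Search Strategy (3/4)
  6. For each distinct facet of the search, a set of postings will be created for each search string within that facet. The sets are then combined into a single set representing that facet using Boolean OR
  7. Following step 6, the facet sets themselves will be combined with Boolean AND and NOT
  8. Plan alternatives

Slide 8: Building Blocks Search Strategy (4/4)
  9. Formulate the initial statements of the search in the command language of the system
  10. Log on and put the search to the system
  11. Evaluate the intermediate results
  12. Iterate
      - Use the interactive features of the system to carry out search heuristics (tactics, maneuvers, strategies, tricks, devices, approaches) to try to improve search results

Slide 9: Building Blocks Approach
  Facet A: Term A1 OR Term A2 OR ... OR Term Ap
  Facet B: Term B1 OR Term B2 OR ... OR Term Bq
  Facet C: Term C1 OR Term C2 OR ... OR Term Cr
  Answer set: Boolean combination of facets (AND, OR, NOT)

Slide 10: Building Blocks Search Sample
  Topic: Measurement of Risk Tendencies (looking for high recall)
  Facet 1, RISK: risk
  Facet 2, MEASUREMENT: measurement, assessment, choice, decision, outcome
  Facet 3, RISK AVERSION: risk aversion, risk avoidance, risk neutrality, risk prone, risk tendency
  Facet 4, BEHAVIORAL DECISION THEORY: behavioral decision theory
  Facet 5, INSURANCE: insurance, contract, bank, finance, stock, investment, advertisement
  Boolean combination: ((RISK AND MEASUREMENT) OR RISK AVERSION OR BEHAVIORAL DECISION THEORY) NOT INSURANCE

Slide 11: Reviewing the Results and Searching Again
  - To increase recall
    - Find additional concepts or search terms to add to one or more facets
    - Delete a facet
  - To increase precision
    - Delete some of the broader or more ambiguous terms in the facets
    - Add an additional facet to be intersected with the others

Slide 12: Successive Facet Strategies (主題層面連續檢索法) (1/3)
  - Other names
    - Fewest postings first
    - Most specific concept first
    - Successive fractions (a successive search that does not start from a subject facet)
  - Building blocks vs. successive facets
    - Building blocks search uses all of the facets
    - Successive facet search tries to use as few facets as possible
  - After deciding on the facets of the search problem, set their order of priority, then decide from the results whether to continue searching

Slide 13: Successive Facet Strategies (2/3)
  First facet
  AND second facet (optional)
  AND other facets (optional)
  = solution set
  - Example 1: "members and activities of 4-H clubs"
  - Example 2: "the emotional, physical, and intellectual characteristics of children who have studied violin with the Suzuki method"

Slide 14: Successive Facet Strategies (3/3)
  - Suitable when
    - Combining all of the facets with Boolean operators is likely to yield zero records
    - One or two facets of the search problem are quite vague in meaning
    - The search problem has non-subject criteria, such as document type, language, or publication year; such a criterion can be treated as the first search concept
    - The searcher would rather tolerate false drops than lose relevant articles
    - The time and money spent adding further facets may exceed the cost of simply printing the results
    - Relevant literature is scarce and the searcher is willing to look through less relevant articles

Slide 15: Pairwise Facets (主題層面配對法) (1/3)
  - Pair the facets two at a time, intersect each pair, then take the union of the results
  - Suitable when
    - All facets are equally important
    - The facets differ little in specificity or vagueness
    - Combining all facets yields zero records
  - Note: when there are many facets, use groups of about 3 to 4 facets as the basic unit for intersection, to avoid confusion

Slide 16: Pairwise Facets (2/3)
  - Building blocks search: A AND B AND C
  - Pairwise facets search: (A AND B) OR (A AND C) OR (B AND C)

Slide 17: Pairwise Facets (3/3)
  - Facet #1, Facet #2, Facet #3
  - Solution sets A, B, and C: each is the AND of one pair of facets
  - FINAL SOLUTION SET: A OR B OR C
  - Sample: A doctoral student wants a high recall bibliography prepared on the relationship between facial musculature and the physiological (autonomic) responding of emotions, e.g., fear.

Slide 18: Citation Pearl Growing (引用文獻滾雪球法)
  - Aims at high precision
  - Starts from 100% precision (a known relevant article) and works backward toward recall
  - Repeatedly takes descriptors, identifiers, and words from known relevant documents and searches again with them
  - Suitable when
    - The database has no thesaurus or vocabulary list
    - The discipline is newly emerging
  - Often requires many rounds of searching; not suitable for beginners

Slide 19: Other Facet Strategies
  - Multiple brief search
    - Use different databases to obtain recall as high as possible
  - Interactive scanning
    - Most time-consuming and interactive
    - For example, using classification codes or natural language
  - Implied concepts
    - Capture implied concepts and choose different vocabulary according to the subject nature of the database
    - Example: possible health hazards from foods cooked using microwave ovens

Slide 20: Citation Indexing Strategies
  - Build the search strategy on the relationship between citing and cited documents
  - Offer highly interdisciplinary and multidisciplinary approaches to online searching
  - Search strategies
    - Cited publication, cited author, cocited authors
  - Taiwan Humanities Citation Index (THCI), Research Center for Humanities, National Science Council (國科會人文學研究中心人文學引用文獻資料庫)

  - Non-subject searching
    - Document type, year of publication, language, author, corporate source
    - Double limiting
  - Fact searching
    - Search for a known item
  - Multiple database searching
    - Pay attention to the fields covered and to how controlled vocabulary is used

Slide 22: Search Heuristics (檢索技巧)
  - Language heuristics
  - Command language, database, and file structure heuristics
  - Recall and precision heuristics
    - Heuristics for increasing recall
    - Heuristics for increasing precision
  - Personal heuristics

Slide 23: Language Heuristics (1/2)
  - Use natural-language searching in the following situations
    - One or more of the concepts of interest involves a subtle nuance of meaning
    - One or more of the concepts of interest is high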