Query Automata (extended abstract)


Frank Neven*
Limburgs Universitair Centrum

Thomas Schwentick
Johannes Gutenberg-Universität Mainz, Institut für Informatik

* Research Assistant of the Fund for Scientific Research, Flanders. Contact author for this submission: Frank Neven, LUC, Dept. WNI, Universitaire Campus, B-3590 Diepenbeek, Belgium. E-mail: fneven@luc.ac.be. Phone: +32-268211. Fax: +32-268299.

Abstract

It is common to model structured document databases by context-free and extended context-free grammars. A crucial difference is that the derivation trees of the former are ranked, while those of the latter are not. A main task in document transformation and information retrieval is locating subtrees satisfying some pattern. Therefore, unary queries, i.e., queries that map a tree to a set of its nodes, play an important role in the context of structured document databases. We want to understand how the natural and well-studied computation model of tree automata can be used to express such queries. We define a query automaton (QA) as a deterministic two-way finite automaton over trees that has the ability to select nodes depending on the state and the label at those nodes. We study QAs over ranked as well as over unranked trees. More precisely, we characterize the expressiveness of the different formalisms by linking them to monadic second-order logic, and we establish the complexity of their non-emptiness and equivalence problem.

1 Introduction

A common approach to structured document databases is to model them as derivation trees of context-free (CFG) or extended context-free grammars (ECFG) [1,4,7,12,13,15,22,23]. ECFGs are context-free grammars that allow arbitrary regular expressions over grammar symbols on the right-hand side of productions. A crucial difference between CFGs and ECFGs, is that derivation trees of the former are ranked while those of the latter are not. A main task in document transformation and information retrieval is locating subtrees satisfying some pattern [2,3,20,21,16,17]. Therefore, unary queries, i.e., queries that map a tree to a set of its nodes, play an important role in the context of structured document databases.

Our goal is to understand how the natural and well-studied computation model of tree automata [10,27] can be used to express such queries. We abstract away from the grammar by considering databases simply as ranked or unranked trees over some alphabet.¹ We define a query automaton (QA) as a two-way deterministic finite automaton over trees that can select nodes depending on the state and the label at those nodes. A QA can express queries in a natural way: the result of a QA on a tree consists of all those nodes that are selected during the computation of the QA on that tree.

The query automata we consider are quite different from the tree acceptors studied in formal language theory [10]. For example, it is not so difficult to see that query automata, as opposed to two-way tree automata, are not equivalent to bottom-up ones. Indeed, a bottom-up QA cannot express the query "select all leaves if the root is labelled with σ". More surprising, however, is that various QA formalisms accept the same class of tree languages², while not expressing the same class of queries. This indicates a substantial difference between looking at automata from a formal language point of view (i.e., for defining tree languages) and looking at automata from a database point of view (i.e., for expressing queries).

First, we consider in Section 3 QAs over ranked trees, i.e., trees with a fixed bound on the number of children that a vertex might have. A QA^r (r stands for ranked) is a two-way deterministic tree automaton as defined by Moriya [18]³ extended with a selection function that depends on the state and the label at a node. We show that these automata can express exactly the unary queries definable in monadic second-order logic (MSO).

Second, in Section 4, we consider automata on unranked trees. Only recently, in the context of XML [6] and SGML [30], a systematic study of automata over unranked trees has been initiated. Based on work of Pair and Quere [24], and Takahashi [26], Murata defined a bottom-up automaton model for unranked trees [19]. The difficulty in obtaining this is that the transition function of the automaton should be defined for any number of children. Murata's approach is to assign a state to a node, depending on the label of that node and depending on whether the sequence of states assigned to its children belongs to a certain regular language. In this way, the "infinite" transition function is represented in a finite way. A tree language now is recognizable if it is accepted by such an automaton. The theory of automata for unranked trees has been further developed by Brüggemann-Klein, Murata and Wood [5], and has been applied by Murata [20], and Neumann and Seidl [21].

¹ This is no loss of generality, as tree automata can easily determine whether the input tree is a derivation tree of a given (E)CFG [10,19].
² A tree language is a set of trees. We say, a QA accepts a tree if the underlying tree automaton accepts it.
³ These automata are very different from the (alternating) tree-walking automata used in, e.g., [29].

A first approach to define query automata for unranked trees, is to add a selection function to the two-way deterministic tree automata over unranked trees defined by Brüggemann-Klein, Murata and Wood [5]. We denote these automata with QA^u (u stands for unranked). Although these automata can accept all recognizable tree languages, they cannot even express all unary queries definable in first-order logic. The reason for this weakness is that information cannot be passed from one sibling to another. To resolve this, we introduce stay transitions where a two-way string-automaton reads the string formed by the states at the children of a certain node, and then outputs for each child a new s
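
To make the bottom-up model for unranked trees described above more concrete, the following is a minimal Python sketch, not the construction from the paper. It assumes states are single characters, so that the sequence of states at the children of a node is a string, and it uses Python's `re` module as a stand-in for "belongs to a certain regular language". The rule table `RULES`, the selection set `SELECT`, and the Boolean-circuit example are all hypothetical illustrations. Note that this is a single bottom-up pass only; the query automata of the text are two-way and therefore strictly more expressive as query mechanisms.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


# Hypothetical transition rules: (label, regular expression over the
# children's states, state assigned to the node). States are one character
# each: "t" (true) and "f" (false). For each label the patterns are disjoint,
# so at most one rule applies and the assignment is deterministic.
RULES = [
    ("1", r"", "t"),
    ("0", r"", "f"),
    ("and", r"t*", "t"),
    ("and", r"[tf]*f[tf]*", "f"),
    ("or", r"[tf]*t[tf]*", "t"),
    ("or", r"f*", "f"),
]

# Hypothetical selection function, given as the set of (state, label) pairs
# that are selected: here, every "or" node that evaluates to true.
SELECT = {("t", "or")}


def assign_state(node, path, selected):
    """Assign a state bottom-up and record the paths of selected nodes."""
    # The word formed by the states of the children, left to right.
    word = "".join(
        assign_state(child, path + (i,), selected)
        for i, child in enumerate(node.children)
    )
    for label, pattern, state in RULES:
        if label == node.label and re.fullmatch(pattern, word):
            if (state, node.label) in SELECT:
                selected.append(path)
            return state
    raise ValueError(f"no rule for {node.label!r} with child states {word!r}")


def query(tree):
    """Return the selected nodes, each identified by its path from the root."""
    selected = []
    assign_state(tree, (), selected)
    return sorted(selected)


# or( and(1, 0, 1), or(0, 1), 0 ): the "and" node has three children and the
# inner "or" has two, which a ranked transition table could not handle uniformly.
example = Node("or", [
    Node("and", [Node("1"), Node("0"), Node("1")]),
    Node("or", [Node("0"), Node("1")]),
    Node("0"),
])
print(query(example))
```

Each node is identified by the tuple of child indices leading to it from the root, so the empty tuple denotes the root. Because each rule constrains the whole sequence of child states by a regular expression, finitely many rules cover nodes with any number of children, which is the point of the finite representation mentioned in the text.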
