TechniquesforStoringXMLXMLGroup,IDKE,RUC2002/5/202Outline1DataorDocument?Data-CentricDocument-Centric2StoringData-CentricXMLDatainXML-EnanbledDBMS3Commercial(O)DBMSSupportForXML4IntroductiontoXMLBenchMark3I.DataorDocument?4ClassesofXMLDocumentsData-centricDocumentproductcatalogs,inventorydata,medicalrecords,networkmessages,logs,stockquotesmachineconsumptionDocument-centricDocumentUsermanualsHumanconsumptionMixed5Data-CentricExamplecompanyid=id2_3cnameOracle/cnameaddressstreet1435Bayave./streetcityRedwoodShores/citystateCA/statezip11234/zip/address…6Document-CentricExampleaction-descsponsorMr.English/sponsorintroducedthefollowingbill;whichwasreferredtothecommittee-nameCommitteeonFinancialServices/committee-name.../action-desc7MixedExamplebillbill-stage=Introductioncongress110thCONGRESS/congresssession1stSession/sessionlegis-numH.R.133/legis-numaction-descsponsorMr.English/sponsorintroducedthefollowingbill………/bill8II.StoringData-CentricXMLDatainXML-EnanbledDBMS9TwoMethodsCLOBMappingXMLDatatoDBMSTable10CLOBTheSimplestWay!StrongFull-TextSearchCompatibleWithSQLExtensibleByAddingFunctions.Drawbacks:PieceUpdatingNOTsupportedXML-specificQueryNOTsupported11CLOBExamples:Oracle9iXMLTypeDB2XMLCLOB——大的XML文档,类似于Oracle的XMLType;XMLVARCHAR——小的XML文档XMLFile——外部的XML文档12TwoMethodsCLOBMappingXMLDatatoDBMSTable13MappingSchema-LessMappingSchema-basedMapping14Schema-LessMappingDefinition:Edge(source,ordinal,name,flag,target)Value(vid,value)Rules:storealledgesinonetableScalarvaluesstoredinseparatetable15EdgeRelationExample&0HL7&1PATIENT&2PIDOBX&3&4…@IDNumPaNaDTofBi&5&6&7PATID1234“JonesWm”date&81961-06-13sourceordinaltagflagtarget&01HL7ref&1&11PATIENTref&2&21PIDref&3&22OBXref&4&31IDNumstring&5&32PaNastring&6&33DTofBiref&7NodeValue&5PATID1234&6JonesWilliamEdgeTableValueTable16EdgeTable优点:不依赖于XMLDTD(Schema),不许事先建立映射关系可以还原源文档,但开销巨大缺点:大量的连接,导致低效17MappingSchema-LessMappingSchema-basedMapping18Schema-basedMappingRulesRule1:ComplexElement:Complexelementtypesgenerateclasstableswithprimarykeycolumns.19Schema-basedMappingRulesRule2:Simpleelement:Singlereferencestosimpleelementtypesgeneratecolumns;ifthereferenceisoptional(?operator),thecolumnisnullable.Repeatedreferencestosimpleelementtypesgeneratepropertytableswithforeignkeys.Referencestocomplexelementtypesgenerateforeignkeysinremoteclasstables.PCDATAinmixedcontentgeneratesapropertytablewithaforeignkey.OptionallygenerateordercolumnsforallreferencedelementtypesandPCDATA.20Schema-basedMappingRulesRule3:AttirbutesSingle-valuedattributesgeneratecolumns;iftheattributeisoptional,thecolumnisnullable.Multi-valuedattributesgeneratepropertytableswithforeignkeys.Ifanattributehasadefault,itisusedasthecolumndefault.21Schema-basedMappingExample!ELEMENTOrder(OrderNum,Date,CustNum,Item*)!ELEMENTOrderNum(#PCDATA)!ELEMENTDate(#PCDATA)!ELEMENTCustNum(#PCDATA)!ELEMENTItem(Quantity,Part)!ATTLISTItemNumID#REQUIRED!ELEMENTQuantity(#PCDATA)!ELEMENTPart(PartNum,Price)!ELEMENTPartNum(#PCDATA)!ELEMENTPrice(#PCDATA)22TableOrderColumnOrderPKColumnOrderNumColumnDateColumnCustNumTableItemColumnItemPKColumnItemNumColumnQuantityColumnOrderFKTablePartColumnPartPKColumnPartNumColumnPriceColumnPartFK23III.Commercial(O)DBMSSupportForXML24TwoMethodsCLOBOracle9iXMLTypeMappingXMLDatatoDBMSTable25XMLType(介绍)1XMLType是Oracle定义的一种数据对象CREATETABLEwarehouses(warehouse_idNUMBER(3),warehouse_specSYS.XMLTYPE,warehouse_nameVARCHAR2(35),location_idNUMBER(4));26XMLType(介绍)2插入操作如:INSERTintowarehouses(warehouse_id,warehouse_spec)VALUES(1001,sys.XMLType.createXML('?xmlversion=1.0?WarehousewhNo=“200BuildingOwned/Building/Warehouse'));27XMLTpye(介绍)3XMLType不支持pieciesupdate如果要把BuildingOwned/Building改为BuildingLoaned/Building:UPDATEwarehousesSETwarehouse_spec=sys.XMLType.createXML('?xmlversion=1.0?Warehousewhono=200BuildingLoaned/Building/Warehouse'));28XMLType(方法)1Extract()和ExistNode()其它方法包括:creatXML()、isFragment()、getStringVal()、getNumberVal()通过这些内建的方法,实现与SQL的交互。如:29XMLType(方法)2SELECTw.warehouse_spec.extract('//Building/text()').getStringVal()asbuldingFROMwarehouseswWHEREw.warehouse_spec.existsNode('/Warehouse/Building')=1ANDwarehouse_id=100130XMLType(OracleText)1Oracle用OracleText(interMedia的一种)来进行对XMLType的索引和查找。它的主要函数是Contains(),为了支持XML的Xpath,引入了两个新的函数hasPath()和inPath()。31XMLType(OracleText)2hasPath()的使用:找出po_xml_tab的poDoc中有CUSTOMER元素的所有元组。(因为CUSTOMRE是可选的元素)SELECT*FROMpo_xml_tabwWHERECONTAINS(w.poDoc,'hasPath(/PO/CUSTOMER)')0;32XMLType(OracleText)3InPath()的使用:找出路径/PO/CITY中出现了Edison的所有元组。SELECT*FROMpo_xml_tabwWHERECONTAINS(w.poDoc,'EdisoninPath(/PO/CITY)')0;33XMLType(OracleText)4withIn()的使用找出CUSTNAME属性中出现John的所有元组:SELECT*FROMpo_xml_tabwWHERECONTAINS(w.poDoc,'JohnwithIn/PO/CUSTOMER@CUSTNAME')0;34XMLType(索引)1针对Element或Attribute的索引有些Element或者Attribute可能经常查询,这时可以建立相应的索引,如:SELECT*FROMpo_xml_tabeWHEREe.poDoc.extract('//PONO/text()').getNumberVal()=100