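Introduction to the Informatica Data Integration Platform
Cao Shunbo (曹顺波)

Contents
• Overview
• Introduction to the Informatica data integration platform
• Summary

Four case studies to be shared: data integration
Customers (in slide order):
• Everbright Securities (光大证券)
• Shenzhen Securities Information Co., Ltd. (深圳证券信息有限公司)
• Huarong Xiangjiang Bank (华融湘江银行)
• Shanghai Telecom (上海电信)
Product combinations (in slide order):
• PowerCenter + PowerExchange CDC for DB2 UDB
• PowerCenter + B2B Data Transformation
• PowerCenter + PowerExchange for Greenplum
• PowerCenter + PowerExchange for SAP

Products
• PowerCenter
• PowerExchange connection
• PowerExchange CDC
• B2B Data Transformation
• Metadata Manager
• Data Service
• Data Replication

PowerCenter + PowerExchange connection
[Diagram labels: PowerCenter; salesforce.com; PowerExchange CDC; RDBMS; PowerExchange for SAP NetWeaver; PowerExchange for RDBMS; PowerExchange for salesforce.com; On-Demand; Informatica B2B]

PowerCenter: a powerful data integration platform
• Access: any system, in batch or real time
• Integrate: transform and reconcile all data types
• Cleanse: standardize, validate, and correct all data types
• Deliver: provide the right data, at the right time, in the right format
• Monitor, observe, report: ensure data consistency; provide impact analysis and continuous data quality monitoring
[Diagram labels: Extraction; Transformation; Loading; PowerCenter Transformation]

Informatica data integration connectivity
Supports a wide range of data sources
• Messages: Web Services, MQSeries, JMS, TIBCO, webMethods, SAP NetWeaver XI, Encrypted stream
• Databases: Oracle, DB2 UDB, DB2/400, SQL Server, Sybase, Informix, Teradata, ODBC
• Flat files: Flat Files, Web Logs, FTP, Complex Files, Tape Formats, …
• XML: XML, Industry Formats
• Mainframe: ADABAS, Datacom, DB2, IDMS, IMS, VSAM, C-ISAM, Tape Formats
• Unstructured data: .PDF, .DOC, .XLS, Email

Informatica product features: higher development productivity
• A graphical interface improves development and maintenance productivity
• More than 100 built-in operations, including those commonly needed when loading data models, such as Sequence Generator and Lookup, plus common functions such as RANK
• Supports Slowly Changing Dimensions, a common data-modeling requirement, with wizards that speed up development
• Supports database partitioning, including dynamic partitioning, so partition settings do not need manual adjustment

Informatica product features: emphasis on manageability
• Logical design is independent of the database; developers do not need direct access to the physical database, which fits IT governance requirements
• Impact/dependency analysis: when a table-level item changes, this feature shows and helps assess the scope of the impact on other tables and items
• Execution is displayed as a Gantt chart, so you can track run status, counts of successful and failed rows, resource consumption, and so on; logs can be stored in a database for later querying and integration

Informatica product features: flexibility and open standards
• Source and target databases can be changed as needed, with no vendor restriction; development and production environments can use databases from different vendors, or text files, without recompiling
• Active metadata can be stored in any of the mainstream open databases and migrated easily
• Supports C, DCOM, Java, and stored procedures, and provides an SDK, making interfaces for external integration more flexible
• Supports SOA; can act as both a Web Services consumer and provider

Rich ETL functionality
• Heterogeneous sources and heterogeneous targets
• Multiple types of slowly changing dimensions
• Global variables and parameters, with parameter file support
• Local variables; comparison of previous and next records
• Conditional aggregation
• Joins across heterogeneous sources
• Row/column transposition
• Static and dynamic Lookup
• ETL transaction control
• Custom SQL
• Pre-SQL and Post-SQL
• Reusable components
• Reusable mappings
• Stored procedure calls
• Calls to external user-defined procedures
• Visual debugging
• Strong function support and a feature-rich transformation language
• …
• File list as a data source
• Session Recovery
• Constraint-based loading across multiple target tables
• Error count control
• FTP sources and FTP targets
• ETL task partitioning
• Incremental aggregation
• Test load
• Bulk Loading
• External Loader (Oracle, DB2, Sybase, …)
• Reusable workflows
• Feature-rich workflow control
• Serial and parallel task control
• Tasks triggered by time, events, and indicator files
• Operating system commands called from a workflow
• Email sent from a workflow
• Multiple ETL servers working together
• …

PowerCenter graphical interface
Fully graphical operation; easy to use, easy to develop with, easy to maintain

The easiest product to implement and use

Informatica: easy to learn, easy to use

Comprehensive data collection and distribution
• Data collection and distribution
  • Batch
  • Incremental
  • Real time
  • Resumable transfer
  • RSA encryption and compression
• Scheduling and monitoring
  • Serial and parallel task control
  • Multiple servers working together
  • Triggers based on time, events, operating system commands, and so on
  • Email alerts on exceptions
  • …

Partitioning (parallel processing) support: better system performance
[Diagram labels: Extract, Transform, Load; Partition 1, Partition 2, Partition 3; Read/Write In-memory Lookup Cache; Read/Write In-memory Aggregator Cache]

Grid computing support: better performance
• PowerCenter servers on Server Grid
• Off-grid PowerCenter server
• Distributed processing of workflows
• Dynamically route tasks to available servers
• Platforms: UNIX, Linux, NT

PowerCenter High Availability Option
[Diagram labels: Backup Services Config; Repository Services; Data Integration Services; Component Failure (HW/SW); Automatic Failover; Restart; Recovery; Automatic Failover Simulation]

PowerCenter enterprise extension options
• Real-Time: improve actionability by delivering data when needed, where needed, in real time
• Data Cleansing: standardize, validate, and correct data to maximize its integrity and value
• Data Profiling: measure, monitor, and ensure quality of data over time using reusable tools
• Partitioning: optimize parallel processing on multi-processing hardware
• Team Based Dev: reduce IT costs by accelerating development and simplifying administration
• Server Grid: harness the grid to enhance scalability and performance
• High Availability: eliminates a single point of failure and provides minimal service interruption in the event of failure

PowerCenter HP Grid Benchmark
Elapsed time (h:mm:ss)

Data volume      2 Nodes (4 CPUs)   4 Nodes (8 CPUs)   8 Nodes (16 CPUs)
1 Terabyte       5:21:43            2:58:07            1:30:17
300 Gigabytes    1:36:24            0:53:38            0:27:18
100 Gigabytes    0:32:37            0:18:43            0:09:26

PowerCenter summary
• Supports a wide range of data sources and targets
• Rich ETL cleansing and transformation functionality
• Simple development and deployment
• Comprehensive monitoring and scheduling
• Enterprise options (partitioning, grid, HA, version control, …)
• High performance (1 TB of data on 8 nodes in about an hour and a half)

What Is PowerExchange for SAP?
Enables native integration of SAP into Informatica
[Diagram labels: Table Relationships; Hierarchies; Metadata; SAP Applications Layer (ABAP Data Dictionary); SAP R/3 Application Tables; RDBMS (Oracle/SQL Server/DB2); Informatica Data Integration Platform]

How INFA Integrates with SAP NetWeaver Architecture - Complete
[Diagram labels: Table extraction; ALE/IDOC; BAPI/Function Modules; DMI; BCI; OS; SAP GUI; Informatica; RFC; RFC Library; Call to Transport Function Modules; Database Layer; WAS - Web Application Server; Java; Web Services; ESB's; Servlets; JSP's; J2EE; Web Services; EP]

Ways to load data into Greenplum through PowerCenter
• PowerExchange for Greenplum
• ODBC

PowerExchange for Greenplum
• PowerExchange for Greenplum loads data into a Greenplum database quickly and efficiently. It bulk-loads data by invoking Greenplum gpload, which is far more efficient than ODBC.

PowerExchange for Greenplum architecture

PowerExchange for Greenplum throughput (rows per second)
• ODBC: 1,000
• PowerExchange for Greenplum: 260,000

PowerExchange CDC

PowerExchange Universal Access to Data
Enterprise Applications, Software as a Service (SaaS)
• JDE EnterpriseOne
• JDE World
• Lotus Notes
• Oracle E-Business Suite ✔
• PeopleSoft Enterprise
• Salesforce (salesforce.com) ✔
• SAP NetWeaver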