1 / 48

SAP Netweaver Business Intelligence – ETL Process

SAP University Alliances Version 3.0 Authors Klaus Freyburger Tobias Hagen. SAP Netweaver Business Intelligence – ETL Process. Abstract: This chapter covers the data extraction, transformation and loading process of SAP Netweaver Business Intelligence. . Product

tolla
Download Presentation

SAP Netweaver Business Intelligence – ETL Process

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SAP University Alliances Version 3.0 Authors Klaus Freyburger Tobias Hagen SAP Netweaver Business Intelligence – ETL Process Abstract: This chapter covers the data extraction, transformation and loading process of SAP Netweaver Business Intelligence. Product SAP NetWeaver 7.3 BI Level Undergraduate Beginner

  2. Agenda Architecture Overview SAP NetWeaver BI Data Flow Overview Source Systems Data Sources Transformations Data Flow Control Real-time Summary

  3. Architecture of a Data Warehouse Access to Information OperationalData Store (Architected) Data Marts Data Warehouse (Persistent) Staging Area Any Source • Data extraction • Data tranformation • Data loading and data flows

  4. Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary

  5. Data Flow in SAP Netweaver BI Info Spoke SAP Netweaver Business Intelligence InfoProvider Open Hub Desti- nation * Transformation Data Transfer Process InfoSource Transformation any target Data Source InfoPackage SAP SAP NetWeaver BI NON SAP any source *optional

  6. Multi-level Data Flow in SAP Netweaver BI

  7. Data Flow Concept – Simple Example Infoprovider (DTP) TR • E:- Extraction • T:- Transformation [TR] • L:- Loading • PSA:- Persistent stagingarea (IP)

  8. Data Flow Concept – Simplified Infoprovider TR (DTP)

  9. Data Transfer Process – More Sophisticated Infoprovider

  10. InfoSource Scenario: InfoSource as a uniform source for several targets and as target from different sources Example: unit conversion / currency conversion

  11. GraphicalModeling of Data Flow

  12. Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary

  13. Different Types of Source Systems SAP Netweaver Business Intelligence DataSource File Interface BI Service BAPI DB Connect UD Connect Web Service Staging BAPIs ETL Tool File SAP Source Relational Source Multi-Dimensional Source XML Legacy Applications *

  14. File Interface • File Interface • File name, number of header rows, data format, number format, field list, preview, ….

  15. SAP Source • SAP Business Content DataSources reside in SAP ERP, SAP CRM, … • SAP Business Content DataSources exist in SAP BW after replication Data Source Replication Service API SAP ERP OLTP System Transfer structure User Exit Extraction structure Customizing Application specific extration

  16. SAP Source • SAP BW system collects in SAP BW all objects required to activate an InfoCube

  17. Database Connectivity (DB Connect) • Any relational database • Access to table / view • Field selection and data type conversation

  18. Universal Data Connect DataSources (UDC) • Industry standard: J2EE connection architecture >200 3rd party database drivers (JDBC) • Works with SAP Netweaver J2EE connection framework (used for application integration in SAP‘s portal)

  19. Web Service • Web Service push data to PSA • Web Service is SOAP compliant • The incoming XML file will be transformed into SAP BW format • Used for business activity monitoring or real-time data delivery • Note: only for small data volume (due to XML structure overhead) Data Example with SAP BW see http://www.dbai.tuwien.ac.at/proj/lixto/WebBI.pdf

  20. 3rd Party ETL Tool Connector • SAP BW provides open interfaces - staging BAPIs • These interfaces enable connection between various third-party tools and SAP BW • For example, data from an Oracle application can be transferred to SAP BW and evaluated there • Data transfer can take place via a data request from SAP BW or be triggered by the third-party tool via BAPIs • See vendor’s documentation for more details BAPI :- Business Application Programming Interface

  21. Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary

  22. Types of DataSources • A DataSource is the SAP BW system view to data in the source system SAP Netweaver Business Intelligence DataSource File Interface BI Service BAPI DB Connect UD Connect Web Service Staging BAPIs ETL Tool File SAP Source Relational Source Multi-Dimensional Source XML Legacy Applications *

  23. DataSource • How to extract? (Source system dependent) • Full upload vs. Delta upload • Data location • Data format specifications • Etc.

  24. DataSource • DataSource describes the fields of external data source • Fields are in internal SAP BW format • Error handling is often required! • What to extract? • Field selection • Data types and length • …

  25. Examples of conversions Source numberformat 1.234,56  hastobespecified in DataSource In alphanumeric field numbers should be stored with leading zeros in database, e.g., `0000001000` instead of `1000`  Conversion Routine ALPHA hastobespecified  In the GBI Curriculum, there is no need to specify conversion routines!

  26. InfoPackages Move Data from Source Systems DataSource • When to extract? • Time handling • Data target definition • Exception handling • Selection criteria • Preview and Monitoring • … InfoPackage „Successfully loaded to PSA“ SourceSystem

  27. Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary

  28. Transformations in SAP NW BI – Overview • Transformation types • Move, aggregate, constant, master data look up, … • Business rules, e.g. unit & currency translation • Formula builder with rich predefined functions library • ABAP routines incl. regular expressions • Transformation: Record by record processing • Start / End Routine: Data set processing SAP Netweaver Business Intelligence Data Target End Routine Transformation - Field 1 Transformation - Field n Start Routine Data Source

  29. Transformation – User Interface Transformation Rules Target Fields Source Fields

  30. Transformation Rules • Rule Types: • Constant assignment • Direct assignment • Reading master data (eg. product group from Infoobject) • ABAP Routine • Formula editor • Time conversion, unit conversion • Aggregation • Sum, min, max, first, last, avg Directassignment

  31. Transformation Rules (Formula) Formulaassignment

  32. Transformation Rules (Formula editor) Formula Source Fields Functions

  33. Transformation Rules (Time conversion) FromCalendar Day … … toYear/Monthonly

  34. Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems DataSources Transformations Data Flow Control Real-time Summary

  35. Full vs. Delta Data Upload • Use Data Transfer Process to set the extraction mode • Full: Extract all data from Data Source • Delta: Extract data only from Data Source which have changed since last upload InfoPackage attached to DataSource Set extraction mode

  36. Monitor for Data Transfer Process • Integrated in InfoProvider Management • Integrated in DataTransferProcess Management • Error Stack is displayed in Monitor Error Stack

  37. Error Handling for DataTransfer Process • Error Handling Options: • Valid Records Update, No Reporting at all (status FAIL) • No Update, No Reporting (status FAIL) • Valid Records Update, No Reporting (status FAIL) • Valid Records Update, Reporting possible (status OK) SAP Netweaver Business Intelligence Data Target Error DTP Transformation – Error Handling DTP Data Source

  38. Master Data Upload from Flat Files An InfoObject represents a type of business entity Attributes describe the characteristics of a business entity Texts may have multilingual descriptions of an attribute and must therefore be loaded as a single process Hierarchies represent groups of business entities. There may be multiple hierarchies. Characteristics InfoObjects Attributes Hierarchies Multi lang. texts

  39. Master Data Upload from Flat Files without InfoPackage 4. Create and start Data Transfer Process 3. Create Transformation 2. Create DataSource 1. Select Source System

  40. Master Data Upload from Flat Files with InfoPackage 5. Create and start Data Transfer Process 4. Create Transformation 3. Create InfoPackage and load data into PSA 2. Create DataSource 1. Select Source System

  41. Master Data Flow - Global Bike 7. Create andstart Data Transfer Process 2 6. Create Transformation for Texts 5. Create andstart Data Transfer Process 6 5 4. Create Transformation for Attributes 4 7 3 3. Create newDataSource 2. Add existingInfoObject Material 1. Create Data Flow

  42. Transactional Data Upload 5. Create and start Data Transfer Process 4. Create Transformation 3. Create InfoPackage and load data into PSA 2. Create DataSource 1. Select Source System

  43. Transactional Data Flow - Global Bike 10. Create andstart Data Transfer Processto Cube NetSales:= Revenue - Discount CostofGoodsManufactured := SalesQuantity * TransferPrice 9. Create Transformation 8 Calmonth:= Caldayas YYYYMM 9 10 8. AssignexisitingInfoCube 7. Activate Data in DataStoreObject 7 4 6. Create andstart Data Transfer Processto DSO 5 6 5. Create Transformation Calday:= Year & Month & Day 4. Create DataStoreObject 2 3. Create InfoPackageandload PSA 3 2. Create newDataSource 1. Create Data Flow from Template

  44. Data Transfer Process - Process Chain A process chain is a sequence of processes that are scheduled to wait in the background for an event. Each process can trigger one or more events that trigger other processes in turn. In the administration function area of the Data Warehousing Workbench, choose Process Chains from the navigation pane to call the process chain maintenance. A process chain is a very complex area. It Automates the complex schedules in BW with the help of the event-controlled processing Visualizes the processes using network graphics Centrally controls and monitors the processes  To reduce complexity, we‘ll skip this now, but we need to cover this in the future

  45. Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems Data Sources Transformations Data Flow Control Real-time Summary

  46. Real-time Data Acquisition Demon Realtime Update External System Realtime Update Application SAP Source System

  47. Agenda Architecture Overview SAP Netweaver BI Data Flow Overview Source Systems Data Sources Transformations Data Flow Control Real-time Summary

  48. Summary Loading Master Data Loading Transactional Data

More Related