220 likes | 311 Views
Explore the data and IT applications involved in banking operations, from cashier interactions to management reporting. Learn about operational system attributes, database design, volume of data, and analytical applications.
 
                
                E N D
Sheffield Hallam University Data Warehousing & Data Mining Principles Operational v Analytical Systems
A customer walks into a bank, and speaks to a cashier. The cashier uses his/her computer to answer the queries / fulfil the actions. • List five typical queries / actions • __________________________________ • __________________________________ • __________________________________ • __________________________________ • __________________________________
What similarities can you spot about the nature of the data involved in your five queries / actions? • Hint: volume, up-to-dateness (“currency”), level of detail • List four similarities • __________________________________ • __________________________________ • __________________________________ • __________________________________
What general characteristics do you think can be stated about the IT application that the cashier uses to handle the customer queries / actions? • Hint: Complete the following sentence: The application is oriented towards .......... • List four characteristics • __________________________________ • __________________________________ • __________________________________ • __________________________________
High volume of transactions Small processing per transaction Frequent updating of data Data is always current Transaction driven Predictable query types Static structure Content varies High accuracy High availability Mature support Operational Systems Table 1: Attributes of an Operational System
Explanation of points 7 & 8 on the previous slide • How often does the underlying design of the bank’s database have to change (tables, relationships, integrity rules etc)? • Just thinking about the current data (not the archive data), how much bigger /smaller will the bank’s volume of data be in one year’s time? • Just thinking about the current data, how much of that data will have different values in one year’s time? • Do you understand why we said • “Static structure; content varies” ?
Operational Systems Operational systems are generally based on Relational Database systems. • Very highly optimised towards fast writing / retrieval of small items of data(eg Oracle, DB2, SQL-Server, MySQL = very long established, huge $$ research investments) • Highly optimised towards using Relationships to fetch related data (eg Customer Name, and current balance) These reasons make Relational Databases extremely quick
Operational Systems Operational systems are generally based on Relational Database systems (cont ...) • The database itself can enforce Referential Integrity (eg cannot delete customer name & address if they still have an account open) • Relational design: Only store each data item in one place (eg if customer changes address, only one copy to change) • These reasons make Operational applications much easier to write.
The literature tends to use the term On-Line Transactional Processing (OLTP) for what we have described as “operational” systems. Transactional = On-line : This term is a bit historic ... originally most systems processed batches of data non-interactively (cheques are one of the few batch-oriented systems left now). Now systems all tend to be on-line / interactive Since the industry calls it OLTP, we will too.
Management Reporting Management have different needs to that of the operational side of the business. Managers are much more concerned with trends and totals and are generally not so concerned with the finer details. What they want are reporting systems that: • Give quick access to summaries of data • Have data structures that are business oriented • Allow users to explore the data • Give them control over report writing
The manager of the bank wants to analyse the effectiveness of last month’s business • List five typical queries • __________________________________ • __________________________________ • __________________________________ • __________________________________ • __________________________________ • Nb: TRY to stick with just data drawn from the Cashier system. But you will find this hard. Later we will see that it is a feature of Analytical system that they integrate data from many sources
What similarities can you spot about the nature of the data involved in your five queries? • Hint: volume, up-to-dateness (“currency”), level of detail • List four similarities • __________________________________ • __________________________________ • __________________________________ • __________________________________
What general characteristics do you think can be stated about analytical applications? • Hint: Complete the following sentence: The application is oriented towards .......... • List four characteristics • __________________________________ • __________________________________ • __________________________________ • __________________________________
Small volume of transactions Often huge processing per transaction Data output level is summary Data routinely added to, but infrequently changed Analysis driven Flexible results structure 'Fairly accurate' better than no result Medium availability Requires different database tools Analytical Systems Table 2: Attributes of an Analytical System
On-line Analytical Processing (OLAP) The Management Data is better stored in a Data Warehouse. • Data is stored in structures easy for business users to understand • (not constrained by Relational rules) • Data held in duplicate if this makes access easier/quicker • (eg can hold summary/totals of data too) • Out-of-date data held (with timestamp) as well as new • (allows examination of historical trends) These sort of systems are referred to as OLAP systems or On-Line Analytical Processing Systems.
The Process Flow The diagram below shows how Data is converted into Business Intelligence Data Information Knowledge Intelligence Structure Apply Analyse
The diagram below shows the relationships between the various components: SQL Server Analysis Services, Crystal Analysis, Oracle Discovery etc Application Software, SAP, SQL Server, Oracle, Spreadsheets etc OLAP OLTP SQL Server, Oracle, Ingres, Informix, DB/2 etc SAS Warehouse Administrator, SQL Server etc Database Data Warehouse Cleansing/Staging
Data Information Knowledge Intelligence Structure Analyse Apply SQL Server Analysis Services, Crystal Analysis, Oracle Discovery etc Application Software, SAP, SQL Server, Oracle, Spreadsheets etc OLAP OLTP SQL Server, Oracle, Ingres, Informix, DB/2 etc SAS Warehouse Administrator, SQL Server etc Database Data Warehouse Cleansing/Staging
Definition A Data Warehouse is ... "... where data is specifically structured for query and analysis performance and ease-of-use" Kimball, 2002
Why a Data Warehouse? Data Warehouses offer the flexibility needed to cope with the Management demands. A major issue is that often there are many differing OLTPsystems and other data storage media. The Data Warehouse offers the opportunity to gather these together into one system with a unified structure. Because data is stored in a simplified aggregated format it allow reports to be written by staff who have a lesser computing background.
OLAP versus OLTP If a report is designed to obtain information from an OLTPsystem it will generally be: • Slow to produce the answer • Complicated to write • Slow down the operational system • Need complicated formulas for grouping data
Operational Analytical Data Individual Items Summarised Data Relations Simple Chains Complex/Unknown Time Present Past Access Record-at-a-time Many records Approach Support a transaction Explore a domain Implementation Relational Follow the module! Summary