
Monitoring of Aggregation Levels in Distributed Component Based Data Production Systems

Monitoring of Aggregation Levels in Distributed Component Based Data Production Systems. BTW 2003, Leipzig, 27.02.2003. Anja Schanzenberger (GfK Marketing Services, Nürnberg; University of Middlesex, London); Colin Tully, Dave Lawrence (University of Middlesex, London).


Presentation Transcript


  1. Monitoring of Aggregation Levels in Distributed Component Based Data Production Systems. BTW 2003, Leipzig, 27.02.2003
     • Anja Schanzenberger: GfK Marketing Services, Nürnberg; University of Middlesex, London
     • Colin Tully, Dave Lawrence: University of Middlesex, London

  2. Agenda
     • 1 Application Area
       - The General Business of GfK Marketing Services
     • 2 Application
       - The Basic Idea of Data Production System
       - The Planning, Controlling and Monitoring System
     • 3 Monitoring of Aggregation Levels
       - Single Record Tracking
       - The Tubing System
       - Reconstructing Aggregation Levels

  3. 1 Application Area

  4. The GfK Group: Key Features
     • Total revenue: anticipated EUR 568 million in 2002 (previous year: EUR 506 million); increase on the previous year: +12%
     • Employees: more than 4,800 full-time staff, 70% of whom work abroad
     • Services: integrated systems using standardised instruments throughout Europe and beyond
     • Network: over 130 subsidiaries, branches and participations in 50 countries on five continents

  5. Four Complementary Business Divisions
     • Consumer Tracking: consumer and retail panel based business information solutions for manufacturers and retailers, for consumer packaged goods and service companies
     • Non-Food Tracking: retail panel based marketing information for manufacturers and retailers in consumer technology industries
     • Ad Hoc Research: interview and test market based support information for new product development and brand management across a wide range of industries
     • Media: interview and panel based audience and readership measurement and consumer response testing for TV, print, radio and Internet

  6. GfK Business Divisions: Share of total performance
     • Consumer Tracking: 16.4%
     • Non-Food Tracking: 23.6%
     • Media: 12.1%
     • Ad Hoc Research: 41.5%
     • Other: 6.4%

  7. Non-Food Tracking: Key Services
     • Non-Food Tracking retail panel (periodical monitoring): information services on consumer durables, in particular for the consumer electronics, photographic, information technology, telecommunications, software, domestic appliances and equipment markets
     • Key services: information services in 44 countries on marketing, sales and logistics in retail and industry for companies operating in consumer technology markets
     • The advantage for clients: direct access to databases and/or transmission of standardized analyses to support, monitor and manage short, medium and long term decisions on product and pricing policy, advertising, distribution, sales and logistics
     • Positioning: market leader in the regions Europe and Asia and Pacific as well as in the Arab countries; together with partner NPD Intelect, market leader in North America

  8. 2 Application

  9. StarTrack Working Areas [diagram]
     • Parties: Retailers, Clients
     • Working areas: Data-IN, Data Preparation, Data Warehouse (extrapolation, reports)
     • Systems named in the diagram: IDAS, MDM, DWH

  10. Data Production System [diagram]
     • Tiers: local client, local server, central server, mainframe
     • Components: data receipt, Identification (WebTAS), General Interface Manager (GIM), Separation, local output, central IDAS output pool, DWH Projection system
     • Overarching: Planning – Controlling – Monitoring System

  11. PCMS Dimensions
     • Dimensions around the Data Production System: PLANNING, CONTROLLING, MONITORING
     • Current state
       - predefined process steps
       - manual state checking
       - manual error tracking
     • Envisioned state
       - dynamic production process configuration
       - production planning and monitoring
       - proactive error handling

  12. 3 Monitoring of Aggregation Levels

  13. Definitions
     • aggregate functions: SUM, AVG, ...
     • aggregation levels: GROUP BY, multiple groupings
     • aggregation: instruction with many data sets as input and one data set as output
     • separation (disaggregation): instruction with one data set as input and many data sets as output
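The two kinds of instruction defined on this slide can be illustrated with a minimal Python sketch. The record layout and the helper names `aggregate` and `separate` are assumptions made for illustration only; they are not taken from the presentation.

```python
from collections import defaultdict

def aggregate(records, group_keys, value_key):
    """Aggregation: many input data sets are summed into one output per group."""
    totals = defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in group_keys)  # plays the role of GROUP BY
        totals[group] += rec[value_key]            # plays the role of SUM
    return dict(totals)

def separate(records, key):
    """Separation (disaggregation): one input data set is split into many."""
    parts = defaultdict(list)
    for rec in records:
        parts[rec[key]].append(rec)
    return dict(parts)

# Hypothetical records, invented for this example.
records = [
    {"retailer": "Vobis", "item": "A", "volume": 6},
    {"retailer": "Vobis", "item": "B", "volume": 9},
    {"retailer": "Vobis", "item": "A", "volume": 4},
]

# Two different groupings of the same data give two aggregation levels.
by_item = aggregate(records, ["retailer", "item"], "volume")
by_retailer = aggregate(records, ["retailer"], "volume")
split = separate(records, "item")
```

Grouping the same records by different key sets is what produces the multiple aggregation levels the slide refers to.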

  14. Aspects of the Monitoring of Aggregation Levels
     • Summaries
       - after significant process steps
       - summaries of operating figures
     • Single Record Tracking
       - tracking of single retailer items up to the customer report
       - simulation of planned production cycles (ETL tools)

  15. Example - Single Record Tracking
     • Legend: R: retailer; CW: calendar week; DP: delivery period; RP: reporting period
     • pool A (input of component X)
       - Item A, R: Vobis, DP: CW 04/2002, sales volume: 6
       - Item B, R: Vobis, DP: CW 04/2002, sales volume: 9
       - Item A, R: Vobis, DP: CW 05/2002, sales volume: 4
     • pool B (output of component X)
       - Item A, R: Vobis, RP: Jan 2002, sales volume: 10
       - Item B, R: Vobis, RP: CW 04/2002, sales volume: 9
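The slide's numbers can be reproduced with a small Python sketch in which component X remembers which pool A records went into each pool B record. The mapping from delivery period to reporting period is an assumption read off the slide's pool B, and the function name `component_x` is hypothetical.

```python
from collections import defaultdict

# Pool A records from the slide: (item, retailer, delivery period, sales volume).
pool_a = [
    ("Item A", "Vobis", "CW 04/2002", 6),
    ("Item B", "Vobis", "CW 04/2002", 9),
    ("Item A", "Vobis", "CW 05/2002", 4),
]

# Assumed mapping (item, delivery period) -> reporting period, inferred from pool B.
reporting_period = {
    ("Item A", "CW 04/2002"): "Jan 2002",
    ("Item A", "CW 05/2002"): "Jan 2002",
    ("Item B", "CW 04/2002"): "CW 04/2002",
}

def component_x(records):
    """Aggregate delivery periods into reporting periods and keep the lineage."""
    volumes = defaultdict(int)
    sources = defaultdict(list)
    for item, retailer, dp, volume in records:
        key = (item, retailer, reporting_period[(item, dp)])
        volumes[key] += volume
        sources[key].append((item, retailer, dp, volume))  # single record tracking
    return dict(volumes), dict(sources)

pool_b, lineage = component_x(pool_a)
```

Here `lineage` answers the tracking question directly: the pool B record for Item A in Jan 2002 traces back to the two deliveries of CW 04/2002 and CW 05/2002.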

  16. Strategies of Tracing Aggregation Levels • Tubing System • the complete workflow cycle • error situations

  17. Characteristics of Monitoring

  18. Possibilities to reconstruct Aggregation Levels (1)
     • Static Volumes of Data
     • Legend: DP: delivery period; CW: calendar week
     • pool A: all items, all retailers, all delivery periods
     • component X
       - job parameters: item, retailer, reporting period
       - instruction: SELECT ... WHERE DP1 = CW 6, DP2 = CW 7, DP3 = CW 8, DP4 = CW 9 GROUP BY Vobis, item
     • pool B: item, retailer: Vobis, reporting period: Feb/2002

  19. (1) Static Volumes of Data
     • Advantages
       - no additional storage required
       - historically stored data allows stepwise tracking
     • Disadvantages
       - historically stored data requires increased storage capacity
       - this approach is only practical for a small quantity of (historically stored) data
       - all job parameters are required
       - increasing the quantity of data in storage slows down the control system as well as the controlled system
       - requires additional administration effort
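As a rough sketch of approach (1): if pool A is left unchanged after processing, the records behind an aggregate can be recovered by re-running the selection with the stored job parameters. The example below uses Python's standard `sqlite3` module; the table layout, column names, function names and all volumes are invented for illustration, and the reporting period Feb/2002 is taken to cover CW 6 to CW 9 as in the slide's instruction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pool_a (item TEXT, retailer TEXT, delivery_cw INTEGER, volume INTEGER)"
)
# Hypothetical static content of pool A.
conn.executemany(
    "INSERT INTO pool_a VALUES (?, ?, ?, ?)",
    [
        ("A", "Vobis", 6, 3),
        ("A", "Vobis", 7, 2),
        ("A", "Vobis", 10, 5),  # outside the reporting period
        ("B", "Vobis", 8, 4),
        ("A", "Other", 6, 1),   # different retailer
    ],
)

def contributing_records(conn, retailer, calendar_weeks):
    """Re-run the selection without grouping to list the source records."""
    marks = ", ".join("?" for _ in calendar_weeks)
    sql = (
        "SELECT item, delivery_cw, volume FROM pool_a "
        f"WHERE retailer = ? AND delivery_cw IN ({marks}) "
        "ORDER BY item, delivery_cw"
    )
    return conn.execute(sql, [retailer, *calendar_weeks]).fetchall()

def aggregated(conn, retailer, calendar_weeks):
    """The grouped instruction as component X might have executed it."""
    marks = ", ".join("?" for _ in calendar_weeks)
    sql = (
        "SELECT item, SUM(volume) FROM pool_a "
        f"WHERE retailer = ? AND delivery_cw IN ({marks}) "
        "GROUP BY item ORDER BY item"
    )
    return conn.execute(sql, [retailer, *calendar_weeks]).fetchall()

# Job parameters: retailer Vobis, reporting period Feb/2002 (assumed CW 6 to 9).
feb_2002 = [6, 7, 8, 9]
sources = contributing_records(conn, "Vobis", feb_2002)
totals = aggregated(conn, "Vobis", feb_2002)
```

The reconstruction is only trustworthy while pool A really is static, which matches the slide's point that all job parameters and the historical data must still be available.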

  20. Possibility (2)
     • Single Record Logging
     • pool A: all items, all retailers, all delivery periods
     • component X
       - job parameters: job_id, item, retailer, reporting period
       - log: timestamp, job parameters, records of A (item, retailer, delivery period, facts, price)
     • pool B: item, retailer: Vobis, reporting period: Feb/2002

  21. (2) Single Record Logging
     • Advantages
       - no policies needed
       - no static volumes of data
     • Disadvantages
       - additional job parameters are needed
       - at least twice the storage requirement
       - additional administration effort
       - slowdown of systems
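A minimal sketch of approach (2), assuming an in-memory list as the log and dictionaries as records; the field names follow the slide, while the function name and all values are hypothetical. The component copies every processed pool A record into the log together with a timestamp and the job parameters, so the log stays valid even if pool A changes later.

```python
import copy
from datetime import datetime, timezone

log = []  # hypothetical log store; a real system would persist this

def component_x_with_logging(pool_a, job):
    """Aggregate facts per item and log a full copy of every record used."""
    selected = [
        rec for rec in pool_a
        if rec["retailer"] == job["retailer"]
        and rec["delivery_period"] in job["delivery_periods"]
    ]
    log.append({
        "timestamp": datetime.now(timezone.utc),
        "job": copy.deepcopy(job),
        "records": copy.deepcopy(selected),  # full copies: roughly doubles storage
    })
    totals = {}
    for rec in selected:
        totals[rec["item"]] = totals.get(rec["item"], 0) + rec["facts"]
    return totals

# Invented example data.
pool_a = [
    {"item": "A", "retailer": "Vobis", "delivery_period": "CW 06/2002", "facts": 3, "price": 10.0},
    {"item": "A", "retailer": "Vobis", "delivery_period": "CW 07/2002", "facts": 2, "price": 10.0},
    {"item": "B", "retailer": "Vobis", "delivery_period": "CW 08/2002", "facts": 4, "price": 5.0},
]
job = {
    "job_id": 1,
    "retailer": "Vobis",
    "reporting_period": "Feb/2002",
    "delivery_periods": ["CW 06/2002", "CW 07/2002", "CW 08/2002", "CW 09/2002"],
}

pool_b = component_x_with_logging(pool_a, job)

# Pool A may change freely afterwards; the log still holds the original records.
pool_a[0]["facts"] = 99
del pool_a[2]
logged_facts = [rec["facts"] for rec in log[0]["records"]]
```

The deep copies are the reason for the storage cost named on the slide: every processed record exists once in pool A and once in the log.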

  22. Possibility (3)
     • Primary Key Logging
       - most important attributes
       - job parameters needed
       - logging: item, retailer, delivery period
       - reduction at GfK: ~1/5
     • Advantages
       - storage requirement (approach 3) < storage requirement (approach 2)
       - no policies needed
       - no static data volumes
     • Disadvantages
       - no deleting of records, but new attribute values for the same records
       - additional administration effort
       - slowdown of systems
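Approach (3) can be sketched in the same style, again with hypothetical names and invented values: only the primary key (item, retailer, delivery period) of each processed record is logged, and the attribute values are looked up in pool A when tracking.

```python
key_log = []  # hypothetical log holding primary keys only

def component_x_with_key_logging(pool_a, job):
    """Aggregate volumes per item and log only the keys of the records used."""
    keys = sorted(
        key for key in pool_a
        if key[1] == job["retailer"] and key[2] in job["delivery_periods"]
    )
    key_log.append({"job_id": job["job_id"], "keys": keys})
    totals = {}
    for key in keys:
        totals[key[0]] = totals.get(key[0], 0) + pool_a[key]["volume"]
    return totals

def reconstruct(pool_a, job_id):
    """Look up the logged keys in pool A as it is at tracking time."""
    entry = next(e for e in key_log if e["job_id"] == job_id)
    return [(key, pool_a[key]["volume"]) for key in entry["keys"]]

# Pool A keyed by (item, retailer, delivery period); values are invented.
pool_a = {
    ("A", "Vobis", "CW 06/2002"): {"volume": 3, "price": 10.0},
    ("A", "Vobis", "CW 07/2002"): {"volume": 2, "price": 10.0},
    ("B", "Vobis", "CW 08/2002"): {"volume": 4, "price": 5.0},
    ("A", "Other", "CW 06/2002"): {"volume": 1, "price": 9.0},
}
job = {
    "job_id": 1,
    "retailer": "Vobis",
    "delivery_periods": {"CW 06/2002", "CW 07/2002", "CW 08/2002", "CW 09/2002"},
}

pool_b = component_x_with_key_logging(pool_a, job)

# An attribute update is tolerated: tracking finds the record, with its new value.
pool_a[("A", "Vobis", "CW 06/2002")]["volume"] = 30
traced = reconstruct(pool_a, 1)
```

In this sketch, deleting a logged record from `pool_a` would make `reconstruct` fail with a `KeyError`, which is one way to read the slide's restriction that records must not be deleted.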

  23. Possibility (4)
     • Data Evaluation
     • job parameters: item, retailer, reporting period
     • processing time: pool A1 (all) → component X (instruction) → pool B1 (item, retailer: Vobis, reporting period: Feb/2002)
     • tracking time: pool A2 (all) → component X (instruction) → pool B2 (item, retailer: Vobis, reporting period: Feb/2002)

  24. (4) Data Evaluation
     • Advantages
       - no additional logging
       - no additional storage required
       - alterations of records allowed
       - no static data volumes
     • Disadvantages
       - policies are needed (program extension)
       - job parameters are needed
       - only an imprecise estimate (processing time <> tracking time)
       - double execution time of component
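The imprecision of approach (4) can be made concrete with a short sketch: the same instruction is executed once at processing time and again at tracking time, and the two results differ when pool A has changed in between. The function name `instruction` and all data are assumptions made for this example.

```python
def instruction(pool, job):
    """The component's instruction: sum volumes per item for the job's retailer."""
    totals = {}
    for rec in pool:
        if rec["retailer"] == job["retailer"] and rec["cw"] in job["calendar_weeks"]:
            totals[rec["item"]] = totals.get(rec["item"], 0) + rec["volume"]
    return totals

job = {"retailer": "Vobis", "calendar_weeks": {6, 7, 8, 9}}

# Pool A1: state of the input pool at processing time (invented values).
pool_a1 = [
    {"item": "A", "retailer": "Vobis", "cw": 6, "volume": 3},
    {"item": "A", "retailer": "Vobis", "cw": 7, "volume": 2},
    {"item": "B", "retailer": "Vobis", "cw": 8, "volume": 4},
]
pool_b1 = instruction(pool_a1, job)

# Pool A2: state at tracking time, after a late delivery for item A arrived.
pool_a2 = pool_a1 + [{"item": "A", "retailer": "Vobis", "cw": 9, "volume": 1}]
pool_b2 = instruction(pool_a2, job)  # second execution of the component

# Difference between the re-evaluated and the originally produced aggregates.
drift = {
    item: pool_b2.get(item, 0) - pool_b1.get(item, 0)
    for item in sorted(set(pool_b1) | set(pool_b2))
}
```

A non-zero entry in `drift` shows why the slide calls the result only an estimate, and the second call to `instruction` is the doubled execution time listed among the disadvantages.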

  25. Conclusion (I)
     • Static Volumes of Data
       - environments: (historical) static data volumes
       - least logging effort
       - best approach, but often not applicable
     • Single Record Logging
       - environments: at least twice the storage available and slowdown acceptable
       - suitable when gathered amount of data >> processed amount of data (e.g. ad-hoc reports)
     • Primary Key Logging
       - environments: fewer manipulations acceptable
       - deleting of records is not allowed
       - additional logging effort

  26. Conclusion (II)
     • Data Evaluation
       - environments: some level of imprecision acceptable
       - no additional logging effort
       - no additional implementation work
       - no additional storage required
       - system load increases -> recommended for slack times
     • Support from Database Tools?
     • More information: anja.schanzenberger@gfk.de
