Discover the innovative approaches employed by Betfair in utilizing StreamInsight for real-time event-driven cube monitoring. David Prime and David Elliott delve into the intricacies of managing over five million daily bets with a latency of less than one second, addressing challenges such as fraud detection and compliance. They explore architectural directions, BI integration, and the complexities of event processing crucial for effective analytics. This presentation highlights the significance of lightweight aggregated usage information and alerting mechanisms to enhance operational efficiency.
EVENT DRIVEN CUBE MONITORING. David Prime & David Elliott SQLBits 6
WHO WE ARE.
• David Prime – Betfair Research
• David Elliott – Information Management & Analytics Architect
Betfair
• Launched June 2000 around an exchange betting platform
• You can bet that an outcome will happen (back) or that it won't happen (lay)
• You can choose the odds at which you want to play
• You can bet whilst the game is in play
• You can play on a range of products and games other than sports wagering
What this means in terms of data
• Bets: >5 million bets daily
• Latency: 99.9% of bets processed in <1 sec
• More trades than all of the European stock exchanges combined
OUR OBJECTIVES.
Background
• Early look at StreamInsight in Deep-dive
• Architectural direction: EDSOA
• Real Time requirements: Anti-Fraud, Legislation, Exposure Monitoring
• BI / OI
• Analytics API / Continuous ETL
Cube Monitoring
• A good use-case and an opportunity to assess using SI with the rest of the BI stack
• Provide light-weight aggregated usage information for the business
Real-Time
• Alerts: name and shame greedy users, discover broken code
• Aggregate session data
• Using time windows to run complex monitoring scenarios
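A minimal sketch of the "time windows" and "aggregate session data" idea, written as plain LINQ to Objects rather than against the StreamInsight API. The QueryEvent and WindowSummary types, their field names, and the sample data are all hypothetical; the point is only to show events being bucketed into fixed, non-overlapping (tumbling) windows and aggregated per user.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event shape; field names are illustrative only.
public sealed class QueryEvent
{
    public string UserName { get; set; }
    public DateTime EndTime { get; set; }
    public double DurationMs { get; set; }
}

// Hypothetical per-window, per-user aggregate.
public sealed class WindowSummary
{
    public DateTime WindowStart { get; set; }
    public string UserName { get; set; }
    public int QueryCount { get; set; }
    public double TotalDurationMs { get; set; }
}

public static class TumblingWindow
{
    // Floors each event's end time to a multiple of the window size,
    // then aggregates per (window start, user).
    public static List<WindowSummary> Summarise(IEnumerable<QueryEvent> events, TimeSpan window)
    {
        return events
            .GroupBy(e => new
            {
                WindowStart = new DateTime(e.EndTime.Ticks - (e.EndTime.Ticks % window.Ticks), e.EndTime.Kind),
                e.UserName
            })
            .Select(g => new WindowSummary
            {
                WindowStart = g.Key.WindowStart,
                UserName = g.Key.UserName,
                QueryCount = g.Count(),
                TotalDurationMs = g.Sum(e => e.DurationMs)
            })
            .OrderBy(s => s.WindowStart)
            .ThenBy(s => s.UserName, StringComparer.Ordinal)
            .ToList();
    }
}

public static class TumblingWindowDemo
{
    public static void Main()
    {
        // Made-up sample data, not real trace output.
        var t0 = new DateTime(2010, 1, 1, 9, 0, 0, DateTimeKind.Utc);
        var events = new[]
        {
            new QueryEvent { UserName = "alice", EndTime = t0.AddSeconds(5),  DurationMs = 200 },
            new QueryEvent { UserName = "bob",   EndTime = t0.AddSeconds(10), DurationMs = 1500 },
            new QueryEvent { UserName = "alice", EndTime = t0.AddSeconds(50), DurationMs = 300 },
            new QueryEvent { UserName = "alice", EndTime = t0.AddSeconds(65), DurationMs = 100 }
        };

        foreach (var s in TumblingWindow.Summarise(events, TimeSpan.FromMinutes(1)))
        {
            Console.WriteLine("{0:HH:mm} {1} queries={2} totalMs={3}",
                s.WindowStart, s.UserName, s.QueryCount, s.TotalDurationMs);
        }
    }
}
```

In a CEP engine the same grouping runs continuously over the live stream; the batch version here is only meant to make the windowing arithmetic easy to check.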
COMPLEX EVENT PROCESSING. Betfair is awash with events. Your online business probably is too.
STREAMINSIGHT. • What is StreamInsight? • New • Integrated • Fast • Improving
OVERVIEW. [Architecture diagram; labels: Cubes, Trace events, Trace & Real-Time ETL, StreamInsight (Input adaptor, CEP Server, Output adaptors), Alerting, DB]
TRACE EVENTS. • QUERY BEGIN • QUERY SUBCUBE • QUERY END • EXISTING SESSION • AUDIT LOGIN • SESSION INITIALIZE • AUDIT LOGOUT • ERROR
TRACING ANALYSIS SERVICES. C#
String connString = "Provider=MSOLAP;Data Source=bigbox;Initial Catalog=AdventureWorksSample;Integrated Security=SSPI;";
// Create AS server object
server = new Microsoft.AnalysisServices.Server();
// Connect
server.Connect(connString);
// Add a trace and subscribe to the SessionInitialize event
Trace trace = server.Traces.Add();
TraceEvent sessionInit = trace.Events.Add(TraceEventClass.SessionInitialize);
// Columns to capture for each SessionInitialize event
sessionInit.Columns.Add(TraceColumn.TextData);
sessionInit.Columns.Add(TraceColumn.ConnectionID);
sessionInit.Columns.Add(TraceColumn.NTDomainName);
sessionInit.Columns.Add(TraceColumn.NTUserName);
sessionInit.Columns.Add(TraceColumn.ApplicationName);
sessionInit.Columns.Add(TraceColumn.StartTime);
sessionInit.Columns.Add(TraceColumn.CurrentTime);
sessionInit.Columns.Add(TraceColumn.DatabaseName);
etc...
TRACING ANALYSIS SERVICES. C#
trace.Update(); // engage the traces
// Create each handler once and subscribe that same instance
TraceEventHandler onTraceEvent = new TraceEventHandler(OnTraceEvent);
TraceStoppedEventHandler onTraceStopped = new TraceStoppedEventHandler(OnTraceStopped);
trace.OnEvent += onTraceEvent;
trace.Stopped += onTraceStopped;
trace.Start();
TRACING ANALYSIS SERVICES. C#
private void OnTraceEvent(object sender, TraceEventArgs e)
{
    siAdapter.PutEvent(e); // send the event out to StreamInsight
    dbwriter.putEvent(e);  // the dbwriter constructs a load of inserts based on the shape of the event
                           // and dumps to our DB for cube-ness
    switch (e.EventClass.ToString())
    {
        case "SessionInitialize":
            break;
        case "ExistingSession":
            break;
        case "QueryEnd":
            break;
        case "QuerySubcube":
            decodeQuery(e, querySubCubeID);
            break;
        . . .
STREAMINSIGHT. LINQ
// filters out the events we want
CepStream<QuerySumm> querySumm =
    from e in producer.AlterEventDuration(e => TimeSpan.FromMinutes(1))
    where e.eventClass == "QueryEnd"
    select new QuerySumm
    {
        userName = e.userName,
        allTime = e.duration,
        cpuTime = e.cpuTime,
        startTime = e.startTime,
        endTime = e.endTime
    };
// detects slow queries so we can go and moan at the user
// (assuming allTime is a TimeSpan: .Milliseconds is only the 0-999 component,
// so the total elapsed time is compared instead)
CepStream<SlowAlert> slowProducer =
    from e in querySumm
    where e.allTime.TotalMilliseconds > 1000
    select new SlowAlert
    {
        userName = e.userName,
        allTime = e.allTime,
        cpuTime = e.cpuTime,
        startTime = e.startTime,
        endTime = e.endTime
    };
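One detail worth checking in a slow-query filter like this, assuming the duration field is a System.TimeSpan (the slide does not say): TimeSpan.Milliseconds returns only the millisecond component (0 to 999), while TimeSpan.TotalMilliseconds returns the whole elapsed time. The sketch below uses plain LINQ to Objects, not the StreamInsight API, and the SlowQueryFilter helper is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SlowQueryFilter
{
    // Keeps durations whose total elapsed time exceeds the threshold.
    public static List<TimeSpan> Slow(IEnumerable<TimeSpan> durations, double thresholdMs)
    {
        return durations.Where(d => d.TotalMilliseconds > thresholdMs).ToList();
    }
}

public static class SlowQueryDemo
{
    public static void Main()
    {
        var d = TimeSpan.FromMilliseconds(2500);
        Console.WriteLine(d.Milliseconds);      // 500: the component only
        Console.WriteLine(d.TotalMilliseconds); // 2500: the full duration

        var durations = new[]
        {
            TimeSpan.FromMilliseconds(400),
            TimeSpan.FromMilliseconds(2500),
            TimeSpan.FromSeconds(3)
        };
        // A test on d.Milliseconds > 1000 would match none of these.
        Console.WriteLine(SlowQueryFilter.Slow(durations, 1000).Count); // 2
    }
}
```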
AND THEN? • Output adaptors are nice • Nagios • Splunk • Homebrew • MORE CUBES :)
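As one illustration of what an output adaptor might hand to Nagios, the sketch below formats an alert as a passive service check result in Nagios's external command syntax (`[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output`). The NagiosFormatter class, the service description, and the alert text are hypothetical; this is not the adaptor shown in the talk, and writing the line to the Nagios command file is left out.

```csharp
using System;
using System.Globalization;

public static class NagiosFormatter
{
    // Nagios return codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    public static string PassiveCheck(DateTimeOffset when, string host, string service,
                                      int returnCode, string output)
    {
        // Fields are semicolon-delimited and a newline ends the command,
        // so defensively strip both from the free-text output.
        string safe = output.Replace(';', ',').Replace('\r', ' ').Replace('\n', ' ');
        return string.Format(CultureInfo.InvariantCulture,
            "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}",
            when.ToUnixTimeSeconds(), host, service, returnCode, safe);
    }
}

public static class NagiosDemo
{
    public static void Main()
    {
        // "bigbox" is the server name from the connection string slide;
        // the service description and message are made up.
        string line = NagiosFormatter.PassiveCheck(
            DateTimeOffset.UtcNow, "bigbox", "SSAS slow queries", 1,
            "alice ran a 2500 ms query");
        Console.WriteLine(line);
    }
}
```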
OUTPUT CUBE. How do we do this? • FACTS • Dimensions
NEXT STEPS: SECURITY. Security Monitoring / Auditing • Alerting on suspicious querying activity / disallowed querying • Alerting • Reporting • Analysis • Provide an audit trail of querying on sensitive attributes • Regulatory Reporting • Dynamic Security
NEXT STEPS: PERFORMANCE. Performance Recommendations • Provide data to enable assessment of ‘hot’ areas within the cubes • Alerting • Reporting • Analysis • Feed into third party monitoring tools • Identify heavy users • Identify poorly performing queries for tuning • Automatic aggregation generation
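"Identify heavy users" could be approached as a simple ranking over captured QueryEnd data. The sketch below is one hedged way to do it in plain LINQ to Objects: sum CPU time per user and keep the top N. The CpuSample type, its field names, and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record of one finished query.
public sealed class CpuSample
{
    public string UserName { get; set; }
    public long CpuMs { get; set; }
}

public static class HeavyUsers
{
    // Totals CPU time per user and returns the n largest totals,
    // breaking ties by user name so the order is deterministic.
    public static List<KeyValuePair<string, long>> Top(IEnumerable<CpuSample> samples, int n)
    {
        return samples
            .GroupBy(s => s.UserName)
            .Select(g => new KeyValuePair<string, long>(g.Key, g.Sum(s => s.CpuMs)))
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .Take(n)
            .ToList();
    }
}

public static class HeavyUsersDemo
{
    public static void Main()
    {
        // Made-up sample data.
        var samples = new[]
        {
            new CpuSample { UserName = "alice", CpuMs = 1200 },
            new CpuSample { UserName = "bob",   CpuMs = 300 },
            new CpuSample { UserName = "alice", CpuMs = 800 },
            new CpuSample { UserName = "carol", CpuMs = 1500 }
        };

        foreach (var p in HeavyUsers.Top(samples, 2))
        {
            Console.WriteLine("{0}: {1} ms CPU", p.Key, p.Value);
        }
    }
}
```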