1 / 34

Seattle bI Meetup BI & ETL @ Big Fish

Seattle bI Meetup BI & ETL @ Big Fish. April 2 nd , 2014 Emre Motan. About the Speaker. Emre Motan BI Engineer, Big Fish Been in Seattle 1.5 years, previously in Chicago Involved in BI community Chicago SQL BI PASS chapter UW BI Certificate Program BI Over Beers TDWI

ashtyn
Download Presentation

Seattle bI Meetup BI & ETL @ Big Fish

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Seattle bIMeetupBI & ETL @ Big Fish April 2nd, 2014 Emre Motan

  2. About the Speaker • Emre Motan • BI Engineer, Big Fish • Been in Seattle 1.5 years, previously in Chicago • Involved in BI community • Chicago SQL BI PASS chapter • UW BI Certificate Program • BI Over Beers • TDWI • Random: Basketball, co-rec sports, greyhounds

  3. About the Seattle BI Meetup • Started in 2012, I took over after period of inactivity • Meetups will be monthly • Primary goal is to educate • Topics will be wide-ranging but more technical • Networking is encouraged • Speakers will be people who use technologies

  4. About BI Meetup (cont.) • Looking to develop relationships with willing speakers, venues, and sponsors • Desire is to have meetings in different venues each month • As part of hosting, it would be nice to “sponsor” with food & drink

  5. About Big Fish • World’s largest producer of casual games • Core business used to be PC-based “Hidden Object” games • Now pushing deep into Mobile space with titles such as Big Fish Casino and Fairway Solitaire

  6. Today’s Story • BI/DW Implementation • ETL Framework • ETL Development • BIML • BIDS Helper • Mist BIML IDE w/ Hadron BIML Compiler • Summary

  7. Bi infrastructure April 2nd, 2014 Emre Motan

  8. BI Infrastructure

  9. What do we want from our ETL? • Minimize manual coding errors • Minimize time spent on boilerplate • Support agile software development practices • Use software development best practices • Pair / collaborate • Source control / diff changes • Auto-generate / automate • Reduce effort spent on operations and support

  10. Etl framework April 2nd, 2014 Emre Motan

  11. ETL Framework Overview • Cycles (subject areas, master package) • Jobs (restart-able units of work, package) • Steps (logic units, Data Flow / Execute SQL / …)

  12. Job A Job B Step 1 Step 4 Step 2 Step 5 Step 3 Step 6 Job C Log Start Cycle Step 7 Do Work Log Fail Step 8 Step 9 Log Success Job D Job E Step 10 Step 13 Step 11 Step 14 Step 12 Step 15

  13. 1st Run Job A Job B Step 1 Step 4 Step 2 Step 5 Step 3 Step 6 Job C Step 7 Step 8 Step 9 Job D Job E Step 10 Step 13 Success Failure Step 11 Step 14 Didn’t Run Step 12 Step 15

  14. 2nd Run Completed Successfully last run; Skip Next Time Choose to Retry because of a logical dependency Job A Job B Step 1 Step 4 Step 2 Step 5 Step 3 Step 6 Rerun on Failure Job C Step 7 Step 8 Step 9 Choose to Skip Run because it did not run last time Job D Job E Step 10 Step 13 Step 11 Step 14 Step 12 Step 15

  15. Example Job Flow Source Systems ELT Server Data Warehouse ODS Reporting Layer Staging ods_sales ods_customers ods_products ods_payment_methods … fact_sales dim_customers dim_products dim_payment_methods … Stg_sales Stg_customers Stg_products Stg_payment_methods … Ecommerce Extract/Load

  16. ETL Framework Summary • Framework is injected at compile time • Uses SQL Server 2008 R2 via stored procedures • Logical units automatically logged • Event handlers automatically added • Metadata based alterations for flow control (skip/restart) • Metadata based balances and validation scripts to detect warning/error conditions • Variables and values stored for use in ETL • Metadata based run-time Alterations for flow control (e.g. skip job, skip step) • Custom tools to administer ETL infrastructure and metadata

  17. BIML Ecosystem April 2nd, 2014 Emre Motan

  18. Language and Compiler • BIML (BIMarkup Language) • Lightweight XML dialect • Represents SQL Server BI Stack objects (SSIS, SSAS, SQL Server) • Works like ASP.NET / PHP (combines declarative & imperative language) • Hadron • Compiles BIML to SQL Server BI Stack artifacts • Called via MSBuild, Mist, …

  19. Tools • BIDS Helper • Free, open-source extension to BIDS • Code in BIML, then generate SSIS • Subset of functionality • Mist IDE • Graphical & Text Based Editors • Transformers • Extensions

  20. ETL Development April 2nd, 2014 Emre Motan

  21. Big Fish BI Engineering • We integrate a wide variety of data sources • We don’t develop in BIDS/SSIS • We code in BIML, compile in Mist or via Hadron directly

  22. BI Engineering ETL Development Flow • Develop BIML locally, committed to SVN • Generate most of the code besides business logic • Run code validations before / during compile • Compile BIML during development or on deployment to ELT boxes • Produce SSIS packages • Handle pushing to target environment • Kick off Cycles using Job Scheduler or DtExec

  23. Demos April 2nd, 2014 Emre Motan

  24. Demo: Simple SSIS Package • Demo of BIMLScript1.biml • Show BIDS environment • Show BIML • Generate SSIS • Run SSIS

  25. Demo: Programmatic BIML • Demo of BIMLScript2.biml • Introduce .NET addition to BIML script • Describe what we’re doing with getting tables from DB • Describe how we’ll loop over each table, and then each column of table, to generate insert commands • Generate SSIS • Run SSIS

  26. Demo: Mist Visual Designer • Show audience visual designer of one job • Select elements to see visual designer • We don’t use visual designer very often since most code is auto-generated now and we have established patterns

  27. Demo: Mist Project • Show Mist environment with sample cycle • Show cycle file with one job • Show job file • Show metadata for sample table (source, ODS) • Show Extensions

  28. Demo: Auto-generating ETL • One substantial accelerator of our work is auto-generating ETL for new extracts, loads, and processing • BIML representation of table • Columns, business keys, primary keys, data types • Annotations like ETL pattern required (full load, incremental new, incremental new/updated) • Only need to code transformation logic, all boilerplate is auto-generated • BimlScript to autogenerate boilerplate ETL code

  29. Demo: NZ SSIS Console • Cycle/Job/Step Status • Alterations • Variables • Deploy / Execute Cycles

  30. Summary April 2nd, 2014 Emre Motan

  31. Why did Big Fish choose BIML? • Non-standard technology needs (extensibility) • Ease of developing and maintaining ETL to leave more time for high business value work • Plenty of people with SSIS experience in Seattle • Cost effective • Organization already supported SQL Server and SSIS • Happy developers 

  32. Q & A April 2nd, 2014 Emre Motan

  33. Thank yoU April 2nd, 2014 Emre Motan

  34. Resources • BIDSHelper.codeplex.com • Varigence.com • BimlScript.com • Biml Tutorials by Andy Leonard

More Related