All That Movies BIFE Presentation
Agenda • Brief Introduction • Dashboard Showcase • Other Interesting Discoveries • Technical Design • Team Work
Brief Introduction • Team: Cereal Killers • Jiang Yongli • Peng Cen • Xia Bing
Motivation • Our project is all about movies • Analyze movies and the movie business from different perspectives • Give suggestions to different kinds of people
Target Customers • Movie fans • Movie journalists • Movie companies • Directors and actors/actresses • Investors
Data source • IMDB website • 11,000+ movies from 1898 to present • 81,000+ actors/actresses • 4,700+ directors • 11,000+ movie companies
Sex Battle • Males’ Favorite • Females’ Favorite
And many more… • Directors born in Maryland are fond of Comedy movies (36/55 movies) and have no interest in Animation movies (0/55 movies) • Directors born in Rome love Horror movies (40/83 movies) and hate Romance movies (4/83 movies) • …
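The ratios quoted above (36/55, 0/55, 40/83, 4/83) are genre shares within a group of directors' movies. A minimal sketch of that arithmetic follows; the function name `genre_share` and the toy data are hypothetical, and the 19 non-Comedy movies are labeled "Drama" only as a placeholder, since the slides do not say what they were.

```python
def genre_share(movie_genres, genre):
    """Return (count, total, share) of movies tagged with `genre`.

    `movie_genres` is a list with one set of genre names per movie.
    """
    total = len(movie_genres)
    count = sum(1 for genres in movie_genres if genre in genres)
    share = count / total if total else 0.0
    return count, total, share


# Hypothetical toy data shaped like the Maryland example:
# 55 movies, 36 of them tagged Comedy, none tagged Animation.
# "Drama" is a placeholder genre, not a figure from the slides.
maryland = [{"Comedy"}] * 36 + [{"Drama"}] * 19

comedy = genre_share(maryland, "Comedy")
animation = genre_share(maryland, "Animation")
print(f"Comedy: {comedy[0]}/{comedy[1]} = {comedy[2]:.1%}")
print(f"Animation: {animation[0]}/{animation[1]} = {animation[2]:.1%}")
```

With these numbers, Comedy accounts for roughly 65% of the Maryland group's movies.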
ETL Processing • Data scale: • 131,000+ web pages • Crawler: • Simulate HTTP request • Extraction: • XPath + Regular Expressions • Save to DB: • ODBC + SQL
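The three ETL stages above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's actual code: the page markup, the `movie` table, and all function names are hypothetical; `sqlite3` stands in for the ODBC connection; and the standard-library `ElementTree` supports only a limited XPath subset on well-formed markup, so real pages would likely need a more tolerant HTML parser. No network request is sent here.

```python
import re
import sqlite3
import urllib.request
import xml.etree.ElementTree as ET


def build_request(url):
    # Crawler step: simulate a browser-like HTTP request by setting a
    # User-Agent header. The request is only built here, not sent.
    return urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})


# Hypothetical, well-formed page standing in for a crawled movie page.
SAMPLE_PAGE = """<html><body>
<h1 class="title">Example Movie (1999)</h1>
<div id="director">Jane Doe</div>
</body></html>"""


def extract(page):
    # Extraction step: XPath locates the nodes, a regular expression
    # splits the heading into title and release year.
    root = ET.fromstring(page)
    heading = root.find(".//h1[@class='title']").text
    director = root.find(".//div[@id='director']").text
    match = re.match(r"^(?P<title>.+?)\s*\((?P<year>\d{4})\)$", heading)
    return {
        "title": match.group("title"),
        "year": int(match.group("year")),
        "director": director,
    }


def save(conn, row):
    # Load step: parameterized SQL insert (sqlite3 stands in for ODBC).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movie (title TEXT, year INTEGER, director TEXT)"
    )
    conn.execute(
        "INSERT INTO movie VALUES (?, ?, ?)",
        (row["title"], row["year"], row["director"]),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
save(conn, extract(SAMPLE_PAGE))
```

Keeping the three stages as separate functions makes it possible to re-run extraction on saved pages without crawling again, which matters at a scale of 131,000+ pages.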
Logical Data Model • Time Hierarchy: Year, Month of Year, Quarter, Month, Day
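Each level of the time hierarchy can be derived from a single calendar date. The sketch below shows one way to do that; the function name, the key names, and the "YYYY-MM" month format are assumptions for illustration, not the project's actual attribute definitions.

```python
from datetime import date


def time_dimension(d):
    """Derive the time-hierarchy levels for one calendar date."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,     # months 1-3 -> Q1, 4-6 -> Q2, ...
        "month": f"{d.year}-{d.month:02d}",    # a month within a specific year
        "month_of_year": d.month,              # 1-12, independent of the year
        "day": d.isoformat(),
    }


print(time_dimension(date(1999, 3, 31)))
```

Month of Year is kept separate from Month so that reports can compare, say, all December releases across years.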
Logical Data Model (continued) • Geography Hierarchy: Continent, Country, Language, State, City, Movie
Logical Data Model (continued) • Production Hierarchy: Birth Country, Birth Date, Gender, Performer, Director, Movie Company, Genres
Data Warehouse Schema • 16 Look Up Tables
Data Warehouse Schema (continued) • 2 Fact Tables
Data Warehouse Schema (continued) • 6 Relationship Tables • MOVIE_DIRECTOR • MOVIE_PERFORMER • MOVIE_GENRES • MOVIE_COUNTRY • MOVIE_LANG • MOVIE_COMPANY
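A relationship table resolves a many-to-many link, for example one movie with several genres. The sketch below illustrates the idea with `MOVIE_GENRES`; the lookup table names `LU_MOVIE` and `LU_GENRES`, all column names, and the sample rows are hypothetical, and `sqlite3` is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical lookup tables; only MOVIE_GENRES is named in the slides.
CREATE TABLE LU_MOVIE  (movie_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE LU_GENRES (genre_id INTEGER PRIMARY KEY, genre TEXT);

-- Relationship table: one row per (movie, genre) pair.
CREATE TABLE MOVIE_GENRES (
    movie_id INTEGER REFERENCES LU_MOVIE(movie_id),
    genre_id INTEGER REFERENCES LU_GENRES(genre_id),
    PRIMARY KEY (movie_id, genre_id)
);

INSERT INTO LU_MOVIE  VALUES (1, 'Movie A'), (2, 'Movie B');
INSERT INTO LU_GENRES VALUES (10, 'Comedy'), (20, 'Romance');
INSERT INTO MOVIE_GENRES VALUES (1, 10), (1, 20), (2, 10);
""")

# Count movies per genre by joining through the relationship table.
rows = conn.execute("""
    SELECT g.genre, COUNT(*) AS movies
    FROM MOVIE_GENRES AS mg
    JOIN LU_GENRES AS g ON g.genre_id = mg.genre_id
    GROUP BY g.genre
    ORDER BY g.genre
""").fetchall()
print(rows)
```

The same pattern would apply to the other five relationship tables, each pairing a movie with a director, performer, country, language, or company.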
Project/Report/Dashboard Design • 25 Tables including one Data Mart table • 21 Attributes • 53 Facts • 3 User Hierarchies • 72 Metrics • Report features used: smart metrics, level metrics, evaluation order, derived metrics, view filters, conditional metrics, report as filter, etc. • Widgets used: Interactive Stack Graph, Interactive Bubble Graph, Media, Data Cloud, Heat Map, Time Series Sliders, etc. • Miscellaneous selectors
Problems We Met • The Media widget automatically shrinks the image whenever we resize it. • We set the fill color to match the border color and placed it in another container with the same fill color, so the shrinking is not obvious.
Problems We Met • We cannot use dynamic text for different attributes that share a name (e.g. Director’s Birth Date and Performer’s Birth Date), even with the {[dataset name]}:{[object name]} syntax. • Instead, we show these attributes in a grid and use formatting tricks.
Problems We Met • View filters on most metrics do not work in dashboards. • We tried to work around this with more sophisticated level metrics and report filters.
Problems We Met • Flash mode always timed out while loading after we merged all dashboards into one. • We split the dashboards into two.
Problems We Met • And many more problems… • And many more solutions…
Cooperation • Face-to-face discussion • Communicator • Email • Shared Folders • Shared Intelligence Server • Everyone took part in each section to some degree
Work Focus • Xia Bing: team leader, ETL process, recommended directors and performers dashboard and related reports • Jiang Yongli: warehouse design, project building, movie business dashboard and related reports • Peng Cen: logical model design, top and bottom movies dashboard and related reports, dashboard formatting
Thanks Do Not Imitate! We Are Professional!