Wrangling Customer Usage Data with Hadoop

Wrangling Customer Usage Data with Hadoop. Clearwire – Thursday, June 27th. Carmen Hall – IT Director; Mathew Johnson – Sr. IT Manager. Starting with a little ingenuITy: ingenuITy Day @ Clearwire was an opportunity for everyone in IT to innovate and present new and even crazy ideas.

Presentation Transcript

Wrangling Customer Usage Data with Hadoop

Clearwire – Thursday, June 27th

Carmen Hall – IT Director

Mathew Johnson – Sr. IT Manager

Starting With…
  • …a little ingenuITy!
ingenuITy Day @ Clearwire
  • Opportunity for everyone in IT to innovate and present new and even crazy ideas
  • One of those crazy ideas was from Roger Hosto
  • Roger had the solution for Clearwire’s Big Data problem: Hadoop
But Wait!
  • Now we had a solution for Big Data
  • We needed a Big Data opportunity
  • We had just the thing…
The Perfect Problem
  • Customer Usage Data – our commodity to Wholesale partners
Totally (un)Wired
  • Americans used more than 1,304 petabytes of wireless data in 2012 - an increase of 69.3% over the previous 12 months' usage (827 PB)
  • Clearwire processes over 3B individual usage detail records each month
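To put the monthly record volume in perspective, it can be converted into an average ingest rate. This is a rough sketch only: it assumes a 30-day month and records arriving evenly around the clock, neither of which is stated in the presentation, and the function name is illustrative.

```python
def average_records_per_second(records_per_month=3_000_000_000, days_in_month=30):
    """Average arrival rate if records were spread evenly over the month.

    Both defaults are assumptions: "over 3B" is treated as exactly 3 billion,
    and the month is taken to be 30 days long.
    """
    seconds_in_month = days_in_month * 24 * 60 * 60
    return records_per_month / seconds_in_month


rate = average_records_per_second()
print(f"Roughly {rate:,.0f} usage detail records per second on average")
```

Real traffic is bursty, so peak rates would be higher than this average.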
Shifting Landscape
  • The U.S. wireless industry is a $195.5 billion enterprise - larger than publishing, agriculture, hotels and lodging, air transportation and movies – just to name a few
  • Prepaid/Pay-As-You-Go services account for 23.4% of overall market penetration, which increases exposure to lost revenue if usage delivery is delayed
  • In some cases, a customer can consume data faster than we can bill for it
Let’s Talk Numbers
  • Assume a 2GB plan
  • An HD movie from Netflix consumes 2+ GB per hour
  • Assume wholesale price = $6/GB
  • Assume the retail price for a GB of data (as top up or overage) ranges from $20 – $100
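The assumptions above can be combined into a rough estimate of how much data value goes unbilled while usage records are in transit. The sketch below is illustrative only: it assumes a subscriber who has already exhausted the 2GB plan and streams HD video continuously for the whole delay, and the function name and the one-hour delay are hypothetical.

```python
def unbilled_exposure(delay_hours, gb_per_hour=2.0, wholesale_per_gb=6.0,
                      retail_low=20.0, retail_high=100.0):
    """Value of data consumed before the usage records reach billing.

    Defaults come from the slide: 2 GB per hour of HD video, $6/GB wholesale,
    and $20 to $100 per GB retail. Continuous streaming is an assumed worst case.
    """
    gb_used = gb_per_hour * delay_hours
    return {
        "gb_used": gb_used,
        "wholesale_usd": gb_used * wholesale_per_gb,
        "retail_low_usd": gb_used * retail_low,
        "retail_high_usd": gb_used * retail_high,
    }


# Hypothetical one-hour delay in usage delivery
one_hour = unbilled_exposure(delay_hours=1.0)
print(one_hour)
```

Under these assumptions, a single hour of delay lets one subscriber consume an entire 2GB plan's worth of additional data before any overage can be billed.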
As If That Wasn’t Enough
  • Clearwire was locked into a very expensive vendor contract which handled both network provisioning and usage delivery needs
  • Legacy solution was not adaptable or flexible
  • We needed something innovative, reliable, internally supportable, scalable – and we needed it fast
Putting ingenuITy to Work!
  • Roger’s idea was suddenly a project
  • We needed to build a platform to ingest, process, and provide cleaned usage data for downstream applications – and quickly
  • We needed:
    • A Hadoop Cluster
    • 24x7 Operations
    • Code to ingest data and handle a myriad of business rules
    • Integration with legacy and new systems
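The ingest, process, and provide flow listed above might look something like the following in miniature. This is a sketch under stated assumptions, not the Atlas code: the record fields, the two business rules (drop duplicates, drop invalid records), and all names are hypothetical, and the real platform ran on Hadoop rather than in a single Python process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    record_id: str       # hypothetical unique identifier
    subscriber_id: str   # hypothetical subscriber key
    bytes_used: int


def clean(records):
    """Apply two illustrative business rules: drop duplicates and invalid rows."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec.record_id in seen:
            continue  # duplicate delivery of the same record
        if rec.bytes_used < 0 or not rec.subscriber_id:
            continue  # negative usage or missing subscriber
        seen.add(rec.record_id)
        cleaned.append(rec)
    return cleaned


def usage_by_subscriber(records):
    """Aggregate cleaned usage for a downstream consumer such as billing."""
    totals = {}
    for rec in records:
        totals[rec.subscriber_id] = totals.get(rec.subscriber_id, 0) + rec.bytes_used
    return totals


raw = [
    UsageRecord("r1", "sub-a", 500),
    UsageRecord("r1", "sub-a", 500),  # duplicate
    UsageRecord("r2", "sub-b", -10),  # invalid usage
    UsageRecord("r3", "sub-a", 250),
    UsageRecord("r4", "", 100),       # missing subscriber
]
totals = usage_by_subscriber(clean(raw))
print(totals)
```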
Atlas was Born
  • Development work began immediately on Clearwire’s private cloud infrastructure
  • Selected BigTop Packaging of Apache Hadoop v1.0.1
  • Custom code leveraging Hive and other common tools to ingest and process data was written
  • Infrastructure was built
Hybrid Approach to Hadoop
  • Virtual Edge Nodes
    • Leveraged our existing private cloud
  • Physical Data Nodes
    • Per Unit Cost (Storage & CPU) was lower than existing infrastructure
  • Smaller and more efficient than you think
    • 24 data nodes, each with 3TB of usable storage
    • Gives us 72TB of usable space
    • 3x block replication for production data
  • Deployed identical DR/Analytics platform
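The storage figures above follow from simple arithmetic, sketched below. One caveat: the slide does not say whether the 72TB figure is counted before or after replication. If it is disk space before replication (an assumption), 3x block replication leaves room for about a third of that in unique data. The function name is illustrative.

```python
def cluster_capacity_tb(data_nodes=24, usable_tb_per_node=3, replication=3):
    """Return (total usable TB, TB of unique data at the given replication).

    Assumes the per-node figure is disk space before HDFS block replication.
    """
    total_tb = data_nodes * usable_tb_per_node
    unique_tb = total_tb / replication
    return total_tb, unique_tb


total_tb, unique_tb = cluster_capacity_tb()
print(f"{total_tb} TB usable, about {unique_tb:.0f} TB of unique data at 3x replication")
```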
Operational in No Time
  • 2.5 months from project approval to production
  • Leveraged our existing support organizations
    • Solution leveraged common tools, did not require specialized teams
    • Fault tolerance inherent within Hadoop helps us minimize late-night calls
  • An endless supply of data was quickly flowing through the system
  • The results were looking good!
Real Results
  • 65% improvement in end-to-end delivery times
    • From 2.5 hours to 1.3 hours
  • Reduced catch-up time after upstream outages by more than half
  • Reduced outage impacts by introducing flexibility to deliver partial files
  • Eliminated 4-hour weekly usage delivery outages tied to provisioning system maintenance
Real (Financial) Results
  • 6-month return on investment
  • Delivered at 1/3 the cost of competing solutions
  • Foundational – enabled the Wholesale support plan for legacy platform migration
    • Saving Clearwire tens of millions of dollars over the life of the contract and internalizing support and development
The Intangibles
  • Proved to internal and external partners that we deliver what we promise with limited negative impacts to ongoing business
    • This was KEY to the speed at which we were able to migrate our billing platform
  • Delivered more than just a single, targeted process – delivered an enterprise usage platform to grow from
  • Kept true to our innovative spirit and the commitment to IT professionals that they can make a difference
Evolution – Proving More

The Atlas Hadoop platform is now a go-to IT solution

  • LTE Usage Data – Now in production
  • Other Data Sources - ESR Data
  • Data Replication and real-time ETL
  • Exploring opportunities with network team to move closer to usage generation
  • Changing mindset of what IT can mean to an organization