Understanding Big Data

By: Paul Kenosky


Presentation Transcript


  1. By: Paul Kenosky Understanding Big Data

  2. What will be covered! • Big Data Defined • Big Data Challenges • Increase in Technology • Characteristics of Big Data • Fraud Detection • Social Media • Hadoop • BigInsights

  3. Define • Understanding Big Data • Big Data applies to information that cannot be processed or analyzed using traditional processes or tools. • Wikipedia • Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications

  4. Challenges • Businesses face big data challenges more and more in today's world • They are overloaded with information that could be beneficial to the organization • However, they do not know how to make use of the raw and unstructured data

  5. Big Data Technology • Interconnectivity: • More and more systems, people, and technology are becoming interconnected • Inexpensive: • Integrated circuits are continually becoming cheaper to produce and buy • This allows intelligence to be added to many devices that once seemed too costly

  6. Big Data on the rise! • Example: railway cars have hundreds of sensors. • Sensors can track things such as the conditions experienced by the rail car, the state of individual parts, and GPS-based data for shipments • With the rise of technology these rail cars are becoming more advanced, and sensors are added to collect data on parts that are prone to wear, so they can be replaced before they fail • Data is also stored about the rails, railroad crossing sensors, weather patterns that cause rail movements, cargo location, and cargo arrival and departure times • Processing all this data using a traditional relational system would be impractical if not impossible

  7. Characteristics of Big Data • Volume: • The amount of data being stored today is increasing at an overwhelming rate • Booking a flight, posting to Facebook, sending a text, and more • Variety: • Represents all types of data • Velocity: • How quickly data arrives, is stored, and is analyzed

  8. Fraud Detection Background • Transactions • Online auctions, insurance claims • A big data platform can present opportunities to increase detection success • Patterns of fraud can come and go in hours, days, or weeks. • If the data behind a fraud detection pattern is not available with low latency, the damage is already done by the time the pattern is discovered

  9. Fraud Detection Questions • An estimated 20% of the available information that could be useful for fraud detection is actually being used • Why not load the other 80% of the data into the traditional analytic warehouse? • Too expensive • Would it not pay for itself? • How can we be sure this new information will be valuable before making a costly business decision? • Use BigInsights to provide an elastic and cost-effective repository to establish which of the remaining 80% of the information is useful for fraud modeling.

  10. Fraud Detection • IBM teamed up with a large credit card issuer to improve their fraud detection model. • They discovered they could improve the speed of detection and get more accurate results using the new model • A process that once took three weeks was improved to just a few hours. • They also found that about half of the previously unused 80% was actually beneficial information that could be used
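
The percentages on slides 9 and 10 can be combined into a rough total. The short sketch below is only illustrative arithmetic: it treats "about half" as exactly 0.5, which is an assumption, and the variable names are made up for this example.

```python
# Percentages quoted in the slides; "about half" is treated as exactly half here.
used_pct = 20                    # share of information already used for fraud detection
unused_pct = 100 - used_pct      # the remaining 80% that was not being used
useful_share_of_unused = 0.5     # assumption: "about half" taken as 0.5

# Portion of all available information that turned out to be newly useful.
newly_useful_pct = unused_pct * useful_share_of_unused

# Information usable for fraud modeling once the new data is included.
total_useful_pct = used_pct + newly_useful_pct

print(f"Newly useful: {newly_useful_pct:.0f}% of all available information")
print(f"Usable in total: {total_useful_pct:.0f}%")
```

Under that assumption, roughly 40% of all available information becomes newly useful, bringing the usable total to about 60%.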

  11. The Social Media Pattern • Organizations can use the Big Data social media usage pattern to find out what is being said about the company and its competitors • This information can be used to significantly improve decision making • IBM has built a solution to accelerate an organization's usage called Cognos Consumer Insights (CCI) • CCI allows an organization to see what people are saying, how topics are trending in social media, and all sorts of things that affect the organization

  12. Why are they unhappy with my company? • Although you can find out what people are saying, a more important question is why they are saying it and behaving in this way • An organization needs to look beyond that data to answer the question • Sales, promotions, loyalty programs, merchandising mix, competitor actions, and even weather can come into play.

  13. Example • A company introduced a different kind of packaging for one of its products. • Customers were giving negative feedback on the new packaging • Months later the company discovered the problem and switched to an eco-friendly package. • This in turn increased sales and customer happiness

  14. Example 2 • An author of the book is a prolific Facebook poster • Traveling on airlines is essential to his job, and after a number of flight delays he posted his frustration with these airlines on his Facebook wall • The airline found these posts on his Facebook wall and contacted him • Although it doesn't mention what the airline did to compensate or fix the problem, it does show one thing: the company was listening

  15. Hadoop • Hadoop is a top-level Apache project and is open source • It is designed to scan through large data sets and produce its results through a highly scalable, distributed batch processing system • Data is redundantly stored in multiple places across clusters • The programming model is built to expect failures, and it automatically resolves them by running portions of the program on various servers. • Hardware components might fail, but due to the redundancy Hadoop can provide fault tolerance
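
Hadoop's programming model is MapReduce. The sketch below imitates its three stages (map, shuffle, reduce) in plain Python on a single machine, using a word count as the example job. It is a toy illustration and not Hadoop's API: the function names and the sample chunks are invented for this example, and a real cluster would run each chunk's map step on a different server.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of input text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key to get the final counts."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map over each chunk independently, then shuffle and reduce."""
    pairs = []
    for chunk in chunks:
        # Each chunk is processed on its own, so a failed chunk could be
        # rerun elsewhere without redoing the rest of the job.
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


# Hypothetical input split into two chunks, standing in for blocks of a large file.
chunks = ["big data is big", "data is everywhere"]
counts = word_count(chunks)
print(counts)
```

Because the map step only ever sees one chunk at a time, the work can be spread over many servers, and a portion that fails can be repeated on another server that holds a redundant copy of the same data.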

  16. InfoSphere BigInsights • Hadoop can be complex to install, configure, and administer • IBM takes this complexity away with the BigInsights installer • BigInsights makes it simpler for people to use Hadoop and build big data applications. • It enhances this open source technology to withstand the demands of your enterprise, adding administrative, discovery, development, provisioning, and security features, along with best-in-class analytical capabilities from IBM Research. • The result is that you get a more developed and user-friendly solution for complex, large-scale analytics.

  17. Special Thanks to • http://www-01.ibm.com/software/data/infosphere/biginsights/index.html • http://en.wikipedia.org/wiki/Big_data • http://www.decalsplanet.com/item-10485-black-pot-of-gold.html • http://drshocker.blogspot.com/2007_03_01_archive.html • http://www.mytinyphone.com/wallpaper/31448/ • https://www.facepunch.com/showthread.php?t=1332655

  18. What Big Data Says About You • Short YouTube video that explains Big Data • Some interesting stories the speaker went over

  19. Extra Story • Bats flying around airports • Noise was produced and airports filtered this noise out • Weather patterns • Airplane movement • 15 years later scientists got together • Collecting data on bat migration • Throwing this data away • One man's garbage is another man's treasure

  20. Extra Story • Gates Foundation • Eradicate polio in Nigeria • Satellite maps • Found villages no one knew of • Government did not know these people were there • No maps showed these villages • Gates gave out GPS phones to polio eradication workers • Combining satellites, vaccines, and cell phones is not something that comes to mind when thinking of big data • Problems caused by misinformation or by getting the information too late

  21. Special Thanks • http://motherboard.vice.com/blog/big-data-explained-brilliantly-in-one-short-video • http://www.netanimations.net/Moving-vampire-bat-and-Dracula-blood-sucking-animations.htm • http://www.nbcnews.com/id/37086846#.Uxd7-YXpbYg

  22. Any Questions?
