
Massive Data Analysis Lab (MassDAL)



Presentation Transcript


  1. Massive Data Analysis Lab (MassDAL). S. Muthukrishnan, CS Dept.

  2. MassDAL
     • Agenda: Gather, manage and process massive data logs: Web, IP/wireless traffic data, location trajectories of objects, sensor readings of the physical world.
     • Key Challenges:
       • Scale: Beyond the traditional “human” scale. E.g., IP data at a single router interface for an hour exceeds total yearly worldwide credit card transactions!
       • Data Collection: Probes/sensors with associated data quality and communication problems.
     • Need breakthroughs in Mathematics, Algorithms, Systems and Engineering to meet these challenges.
     • Potential: Major impact in Homeland Security, Telecom, Transportation and Society-at-large.

  3. State of MassDAL: Mathematics and Computer Science
     • Algorithmic tools for embedding vectors, strings, trees and other objects for “compact” representation.
     • Algorithmic tools for analyzing data summaries for heavy hitters, deviants, clustering, decision trees, etc.
     • Invited talks at ACM, SIAM, and European conferences in Algorithms, Databases, Statistics, and Data Mining on novel models and algorithms.
     • Over a dozen research papers in the last 2 years on experience with massive data analysis.
     • Supported by NSF grants. Partners: MIT, DIMACS.

  4. State of MassDAL: Science
     • Developing wearable sensors for tracking the location of objects as well as “interactions” between objects. Measuring behavioral data.
     • Current partner: Telcordia. Their initial investment: $300k/3 months (est). Potential partner in the works: Los Alamos National Lab.
     • Potential: Analysis of social networks for Epidemiology, Homeland Security, and the health industry.

  5. State of MassDAL: Engineering
     • Consulting in analysis of wireless network logs. AT&T Wireless, 3rd largest in US, 20 million customers. Terabytes/month. Fully operational, telco-grade!
     • Incorporated novel algorithms in operational IP network data analysis tools. Partner: Gigascope.
     • Developed a principled approach to data cleaning and data quality monitoring for an operational IP network. Partner: PACMAN.
     • Developed new burst-detection algorithms for text streams. Partner: DIMACS, Monitoring Message Streams.

  6. Future • See http://cs.rutgers.edu/~muthu/massdal.html

  7. Future of MassDAL
     • Research: Need breakthrough research in mathematics, systems, databases, algorithms, sensor networking.
     • Expand data domains.
       • Potential partners: Google, NJ auto insurance fraud data, USPTO patent data, AWS location trajectories, etc.
     • Build a state-of-the-art facility at Rutgers.
       • Secure, 24x7 data hosting and analysis infrastructure capable of gathering and processing petabytes of data/month across domains, data sources, etc. Unique in the world!
     • Potential.
       • Every wireless, telecom, and internet service provider is looking to farm out this crucial piece of their operations. Estimated market for these services: hundreds of millions of US dollars per year. Crucial for NJ State. Interest from multiple VCs now.
