Stream Hierarchy Data Mining for Sensor Data

Presentation Transcript


  1. Stream Hierarchy Data Mining for Sensor Data Margaret H. Dunham SMU Dallas, Texas 75275 mhd@engr.smu.edu Vijay Kumar UMKC Kansas City, Missouri 64110 kumarv@umkc.edu

  2. From Sensors to Streams – An Outline • Data Stream Overview • Data Stream Visualization • Temporal Heat Map • Data Stream Modeling • Extensible Markov Model • Data Stream Hierarchy

  3. From Sensors to Streams – An Outline • Data Stream Overview • Data Stream Visualization • Temporal Heat Map • Data Stream Modeling • Extensible Markov Model • Data Stream Hierarchy

  4. From Sensors to Streams • Data captured and sent by a set of sensors is usually referred to as “stream data”. • It is a real-time sequence of encoded signals that contain the desired information. • It is a continuous, ordered sequence of items (ordered implicitly by arrival time, or explicitly by timestamp or geographic coordinates). • Stream data is infinite: the data keeps coming.

  5. Data Stream Management Systems (DSMS) • Software to facilitate querying and managing stream data. • Retrieves the most recent information from the stream. • Data aggregation merges multiple streams together. • Modeling “summarizes” the stream. • Visualization is needed to observe, in real time, the spatial and temporal patterns and trends hidden in the data.

  6. DSMS Problems • Stream management development is in a state similar to that of databases prior to the 1970s • Each system/researcher looks at a specific application or system • No standards concerning functionality • No standard query language • It is unreasonable to expect end users to access raw data, data in the DSMS, or even a summarized view of the data • Domain experts need to “see” a higher level of data

  7. Our Proposal • A four-level data abstraction to facilitate the creation of actionable intelligence for domain experts evaluating sensor data.

  8. From Sensors to Streams – An Outline • Data Stream Overview • Data Stream Visualization • Temporal Heat Map • Data Stream Modeling • Extensible Markov Model • Data Stream Hierarchy

  9. Assumptions for Our Research • End User: • May not be knowledgeable concerning sensors • Probably a Domain Expert • May not need to see exact sensor values • Concerned with trends and approximate values • Need to see data from MANY sensors at one time • Need to see data continuously in a visualization of the stream

  10. Suppose There Were MANY Sensors • Traditional line graphs would be very difficult to read • Requirements for new visualization technique: • High level summary of data • Handle multiple sensors at once • Continuous • Temporal • Spatial

  11. Temporal Heat Map • Also called Temporal Chaos Game Representation (TCGR) • Temporal Heat Map (THM) is a visualization technique for streaming data derived from multiple sensors. • It is a two-dimensional structure similar to an infinite table. • Each row of the table is associated with one sensor value. • Each column of the table is associated with a point in time. • Each cell within the THM is a color representation of the sensor value. • Colors are normalized (in our examples) • 0 – White • 0.5 – Blue • 1.0 – Red
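
The color scale on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the slide gives three anchor colors (0, 0.5, 1.0), and the linear interpolation between them, the function names `thm_color` and `thm_column`, and the per-call `lo`/`hi` normalization bounds are all assumptions made here.

```python
def thm_color(value):
    """Map a normalized value in [0, 1] to an (R, G, B) tuple.

    Anchors from the slide: 0 -> white, 0.5 -> blue, 1.0 -> red.
    Linear interpolation between anchors is an assumption.
    """
    v = min(1.0, max(0.0, value))          # clamp out-of-range input
    white, blue, red = (255, 255, 255), (0, 0, 255), (255, 0, 0)
    if v <= 0.5:
        start, end, t = white, blue, v / 0.5
    else:
        start, end, t = blue, red, (v - 0.5) / 0.5
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def thm_column(readings, lo, hi):
    """Build one THM column: one color cell per sensor at one time point.

    lo and hi are the (hypothetical) bounds used to normalize raw readings.
    """
    span = hi - lo
    return [thm_color(0.0 if span == 0 else (r - lo) / span) for r in readings]
```

Calling `thm_column` once per time step and appending the result to the right-hand side of a display would give the "infinite table" layout described above, with one row per sensor.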

  12. Cisco – Internal VoIP Traffic Data • [Figure: temporal heat map of the complete stream (CiscoEMM.png); axes: Values, Time] • VoIP traffic data was provided by Cisco Systems and represents logged VoIP traffic in their Richardson, Texas facility from Mon Sep 22 12:17:32 2003 to Mon Nov 17 11:29:11 2003. NGDM'07

  13. Derwent River (UK) • [Figure: Derwent Temporal Heat Map (derwentrotate.png)]

  14. From Sensors to Streams – An Outline • Data Stream Overview • Data Stream Visualization • Temporal Heat Map • Data Stream Modeling • Extensible Markov Model • Data Stream Hierarchy

  15. Data Stream Modeling Requirements • Summarization (synopsis) of data • Use the data, not a sample • Temporal and Spatial • Dynamic • Continuous (infinite stream) • Learn • Forget • Sublinear growth rate - Clustering

  16. Extensible Markov Model • Extensible Markov Model (EMM): at any time t, an EMM consists of a Markov Chain (MC) with a designated current node, Nn, and algorithms to modify it, where the algorithms include: • EMMCluster, which defines a technique for matching between input data at time t + 1 and existing states in the MC at time t. • EMMIncrement, which updates the MC at time t + 1 given the MC at time t and the clustering measure result at time t + 1. • EMMDecrement, which removes nodes from the EMM when needed. • In addition, the EMM has associated data mining functions such as Rare Event Detection and Prediction. • Jie Huang, Yu Meng, and Margaret H. Dunham, “Extensible Markov Model,” Proceedings IEEE ICDM Conference, November 2004, pp 371-374.
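
To make the three algorithms concrete, here is a minimal Python sketch of how they might fit together. It is a toy under stated assumptions, not the algorithm from the cited paper: the class name `EMM`, the Euclidean distance with a fixed `threshold`, the use of the first matching vector as a node's permanent representative, and the "delete a node and all its links" form of forgetting are all simplifications chosen here for illustration.

```python
import math


class EMM:
    """Toy Extensible Markov Model: nodes are clusters, links hold counts."""

    def __init__(self, threshold):
        self.threshold = threshold   # assumed: max distance to match a node
        self.centroids = {}          # node id -> representative vector
        self.out_counts = {}         # node id -> transitions leaving the node
        self.link_counts = {}        # (from id, to id) -> transition count
        self.current = None          # designated current node
        self._next_id = 0

    def cluster(self, vector):
        """EMMCluster stand-in: nearest node within threshold, else None."""
        best, best_d = None, self.threshold
        for node, rep in self.centroids.items():
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(vector, rep)))
            if d <= best_d:
                best, best_d = node, d
        return best

    def increment(self, vector):
        """EMMIncrement stand-in: add a node if needed, count the transition."""
        node = self.cluster(vector)
        if node is None:
            node = self._next_id
            self._next_id += 1
            self.centroids[node] = tuple(vector)
            self.out_counts[node] = 0
        if self.current is not None:
            key = (self.current, node)
            self.link_counts[key] = self.link_counts.get(key, 0) + 1
            self.out_counts[self.current] += 1
        self.current = node
        return node

    def decrement(self, node):
        """EMMDecrement stand-in: drop a node and every link touching it."""
        for (src, dst), count in list(self.link_counts.items()):
            if node in (src, dst):
                del self.link_counts[(src, dst)]
                if src != node:
                    self.out_counts[src] -= count  # keep probabilities consistent
        del self.centroids[node]
        del self.out_counts[node]
        if self.current == node:
            self.current = None

    def transition_probability(self, src, dst):
        """Estimated P(src -> dst) as link count over outgoing count."""
        total = self.out_counts.get(src, 0)
        return self.link_counts.get((src, dst), 0) / total if total else 0.0


# Hypothetical two-sensor stream, made up for this example.
emm = EMM(threshold=2.0)
for reading in [(18, 10), (17, 10), (14, 8), (18, 10)]:
    emm.increment(reading)
```

Because a new node is created only when no existing node is close enough, the number of nodes grows more slowly than the number of inputs, which is the property the modeling requirements call sublinear growth.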

  17. EMM Learning • [Figure: Markov chain diagrams with nodes N1, N2, N3 and fractional transition labels (e.g., 1/1, 1/2, 1/3, 2/3, 2/2)] • Input vectors shown: <18,10,3,3,1,0,0>, <17,10,2,3,1,0,0>, <16,9,2,3,1,0,0>, <14,8,2,3,1,0,0>, <14,8,2,3,0,0,0>, <18,10,3,3,1,1,0>

  18. EMM Forgetting • [Figure: Markov chain diagrams with nodes N1, N2, N3, N5, N6 and fractional transition labels (1/2, 1/3, 1/6, 2/2)]

  19. EMM Sublinear Growth Rate Minnesota Department of Transportation (MnDot)

  20. From Sensors to Streams – An Outline • Data Stream Overview • Data Stream Visualization • Temporal Heat Map • Data Stream Modeling • Extensible Markov Model • Data Stream Hierarchy

  21. Traditional DBMS Data Abstraction • Three levels of data abstraction • Physical • Logical • External • Data is normally pulled to the user by a query

  22. Proposed DSMS Data Abstraction • Abstraction • Level 0 – Physical Level • Raw data from sensors • Cannot be stored • Level 1 – DSMS • Sensor data is merged, aggregated, and cleansed. • DSMS queries may be processed against this data. • Level 2 – Model • Summarization (synopsis) of data • Level 3 – Domain Expert • Summary Visualization • Data is normally pushed to the user
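
One way to picture the four levels is as a chain of Python generators, where each level consumes the one below it and the top level pushes results to the user. This is a hedged sketch only: the function names, the choice of a per-tick mean as the Level 1 aggregate, the coarse bucketing used as a stand-in Level 2 synopsis, and the `on_update` callback are all hypothetical and are not taken from the presentation.

```python
from statistics import mean


def level0_raw(ticks):
    """Level 0: yield raw per-tick sensor readings; nothing is stored."""
    for readings in ticks:
        yield readings


def level1_dsms(raw):
    """Level 1: cleanse (drop missing values) and aggregate sensors per tick."""
    for readings in raw:
        clean = [r for r in readings if r is not None]
        if clean:                      # skip ticks with no usable reading
            yield mean(clean)


def level2_model(aggregated, bucket_width):
    """Level 2: summarize each value as a coarse bucket index (toy synopsis)."""
    for value in aggregated:
        yield int(value // bucket_width)


def level3_push(summary, on_update):
    """Level 3: push each summary item to the domain expert's view."""
    for item in summary:
        on_update(item)


# Hypothetical input: three ticks from three sensors, None marks a dropout.
seen = []
ticks = [[10.0, 12.0, None], [None, None], [31.0, 29.0]]
level3_push(level2_model(level1_dsms(level0_raw(ticks)), bucket_width=10),
            seen.append)
```

Generators fit the "cannot be stored" constraint at Level 0 because each reading is consumed as it arrives, and the callback at Level 3 reflects data being pushed to the user rather than pulled by a query.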

  23. Stream Hierarchy Summary • Except for the inter-level functionality requirements, each level's functionality is independent of the others and may differ across implementations. • The model used must capture time and ordering of data, be able to both learn and forget, and use some variation of clustering. • Visualization at the domain expert level must capture both time and ordering. In addition, it should be easy to “read” for many sets of sensors.
