A Key-Value-based Persistence Model for Sensor Networks

A Key-Value-based Persistence Model for Sensor Networks. By: Marcello Alves de Sales Junior, Master of Science in Computer Science. Advisor: Prof. Arno Puder, Ph.D. Committee Chair: Prof. Marguerite Murphy, Ph.D. Department of Computer Science.


Presentation Transcript


  1. A Key-Value-based Persistence Model for Sensor Networks • By: Marcello Alves de Sales Junior, Master of Science in Computer Science • Advisor: Prof. Arno Puder, Ph.D. • Committee Chair: Prof. Marguerite Murphy, Ph.D. • Department of Computer Science

  2. Outline • Motivation and Literature Review • A Taxonomy for Data Persistence • NetBEAMS: A Case Study • Empirical Analysis for Technology Selection • DSP Data Persistence: Design and Architecture • Experimental Results: Correct Behavior and Performance • Conclusions and Future Works

  3. Motivation • Persistence for NetBEAMS Sensor Network Infrastructure • Component-based sensor network for environmental monitoring (Puder et al.) • Biologists are the main users of the system • What types of database systems? • Use the traditional relational data model? • Bai et al. propose domain-specific programming languages

  4. State of the Art in Data Persistence • Sensor Networks [Akyildiz et al.] • Infrastructure (Topology) and Node Types • Size and Lifetime • Sensor Network Nodes [Romer et al.] • Deployment and Mobility • Size, Resources, Cost, Energy, Heterogeneity • Communication Mode • Coverage and Connectivity • Images: [wp] [snc]

  5. Persistence Storage for Sensor Networks • How the Collected Data is Used • Real-time Data Stream • Data Archival • Storage Location for Collected Data • Local or External • Data-Centric • Query Processing Used • In-Network • Centralized • Data Volume Produced [ns]

  6. Data Models and Query Engines • Tabular Data Model: flat files • Comma-separated values • Text Comparison • No Index • Relational Data Model: binary files • Data Normalization • Structured Query Language (SQL) • Lee et al. proposed new operators for SQL • Structured Data Model: structured XML documents • XML Schema: document structure • XPath: data retrieval • (Figure labels: Database System, Data Sink)

  7. How Is the Collected Data Described? • Data Stream: 334 55.45 -23.44 119.394 44 1 22 | 5/ • Ledlie et al. propose the use of Data Provenance • Metadata: Data about data • What was collected? • Temperature = 54.3 : data • Scale = 'fahrenheit' : metadata • When was the data collected? • Valid Time = Collected at 10:34am • Transaction Time = Time of Arrival • From where was the data collected? • GPS Coordinates: (12.342, -145.304) • Site: 'lower-pier' : metadata
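The what/when/where description on this slide can be sketched as a single record that carries a reading together with its metadata. This is only an illustration: the function name, the field names, and the arrival time "10:35am" are assumptions, not part of the original deck.

```javascript
// Hypothetical helper: wraps one raw reading with the "what / when / where"
// metadata listed on the slide. All field names are illustrative assumptions.
function describeReading(value, arrivalTime) {
  return {
    observation: { temperature: value },               // what: the data itself
    metadata: { scale: "fahrenheit" },                 // what: data about the data
    time: {
      valid: "10:34am",                                // when the sensor collected it
      transaction: arrivalTime                         // when it arrived at the data sink
    },
    location: {
      gps: { latitude: 12.342, longitude: -145.304 },  // where it was collected
      site: "lower-pier"
    }
  };
}

// The arrival time below is a made-up value for the example.
const record = describeReading(54.3, "10:35am");
console.log(JSON.stringify(record, null, 2));
```

Keeping valid time and transaction time as separate fields lets a query distinguish when a value was measured from when it reached the data sink.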

  8. Problem: recent oil spill in the San Francisco Bay (Oct 2009) [sfb09] • Correlations between the collected data and the oil spill • Describing historical data events • Data Annotation • Liu et al. annotate video frames from sensor cameras • Descriptive Metadata • YouTube Video Tag • Tags for Web 2.0 "Junk" Data [an]

  9. 2. Data Persistence in Sensor Networks: a Proposed Taxonomy

  10. 2. A Taxonomy for Data Persistence • Taxonomy (Greek τάξις, taxis, meaning 'order' or 'arrangement', and νόμος, nomos, meaning 'law' or 'science') [Wikipedia]: • Practice and science of classification • Represented by hierarchical diagrams • Relationships between the root and branches • (Figure labels: Taxon, Taxa)

  11. Data Persistence Taxonomy

  12. 3. NetBEAMS: A Case Study

  13. 3. NetBEAMS: A Case Study • NetBEAMS: Data collection using Data Sensor Platform (DSP) • Automates operation of SF-BEAMS • SF-BEAMS: single-star sensor network – data archival • Nodes geographically fixed • Single-hop communication • Production intervals: 1, 6 or 15 minutes • Heterogeneous Devices • Coverage: Tiburon coast • 1 Data Sink (RTC Labs)

  14. 3. NetBEAMS: A Case Study • NetBEAMS Gateway Node • YSI Sonde + Gumstix Embedded System + GSM Modem • Centralized Data Sink: RTC Labs

  15. Device Used by NetBEAMS • YSI 6600EDS V2: COTS Water Quality Monitoring • 13 measurement parameters • 1 year's worth of raw data • Max 23.99 Mb at 1 sample/min • 483,840 samples per year • 5 YSI sondes in current deployment [ysi]

  16. SF-BEAMS Classification

  17. NetBEAMS Data Collection Scenario • Missing component! • Sample collected data: 12.20 192 179 55 88.40 0.09 0.084 0.059 7.98 -79.6 99.5 8.83 0.4 8.7 • (Figure labels: Collected Data, DSP Messages)

  18. Functional Requirements

  19. Non-Functional Requirements • Open-Source • Free of charge • Easy to Scale (Data Partitioning) • Accessibility (API) • Cope with RTC's Small Volume of Data

  20. 4. Technology Selection Empirical Analysis

  21. 4. Empirical Analysis for Technology Selection • Technologies used in the reviewed literature • MySQL: Jacob Nikom used it in a Linux cluster for sensor networks; • TinyDB: Madden et al. and Lee et al. used it for sensor networks; • DB2: Sow et al. used it as a hybrid of the XML and Relational models to persist and query biometric events; • mongoDB: Buyya et al. reported it among new trends in persistence in the cloud.

  22. Use Traditional Relational Databases? • Tony Bain questions the adoption of the Relational Model • Traditional approach: 30 years • Accommodates changes? • Try adding entities • Try adding properties • Changes to the schema • Keep the schema normalized • Change Software Layers

  23. Schema-less: Key-Value Pair Data Model • Data Collections: "denormalized" data • No Data Integrity = Data located on same physical space • (Figure labels: Annotation, Observation, Provenance)

  24. • KVP Databases: better support for horizontal data partitioning • Shenker et al. survey Data-Centric Storage • Targeted Query vs Global Query • (Figure labels: Tiburon, CA; Berkeley, CA; South Bay, CA; Collected Sensor Data - Region 1 – Master Shard; Collected Sensor Data - Region 2 – Master Shard; Collected Sensor Data - Region 3 – Master Shard; Collected Sensor Data - Region 1 – Shard 2; Collected Sensor Data - Region 3 – Shard 2; Projection; Count Operation)
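The difference between a targeted query and a global query over region-partitioned data can be sketched with plain in-memory arrays standing in for shards. This is not a MongoDB API; the shard names, the pH values, and both function names are assumptions made up for the illustration.

```javascript
// In-memory stand-in for three regional shards (hypothetical sample data).
const shards = {
  tiburon:  [{ region: "tiburon",  pH: 7.11 }, { region: "tiburon",  pH: 7.98 }],
  berkeley: [{ region: "berkeley", pH: 8.02 }],
  southBay: [{ region: "southBay", pH: 7.45 }, { region: "southBay", pH: 7.60 }]
};

// Targeted query: the shard key (region) is known, so only one shard is read.
function targetedQuery(region, predicate) {
  return { shardsVisited: 1, docs: shards[region].filter(predicate) };
}

// Global query: no shard key is given, so every shard is visited and the
// partial counts are summed (comparable to the count operation in the figure).
function globalCount(predicate) {
  let count = 0;
  let shardsVisited = 0;
  for (const docs of Object.values(shards)) {
    shardsVisited += 1;
    count += docs.filter(predicate).length;
  }
  return { shardsVisited, count };
}

const targeted = targetedQuery("tiburon", (d) => d.pH > 7.5);
const overall = globalCount((d) => d.pH > 7.5);
console.log(targeted.docs.length, targeted.shardsVisited); // 1 1
console.log(overall.count, overall.shardsVisited);         // 3 3
```

With the region as the partition key, a query about one site touches a single shard, while a bay-wide count has to fan out to all of them.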

  25. Selection Criteria

  26. 5. DSP Data Sensor Platform: Design and Architecture

  27. 5. DSP Data Persistence: Design and Architecture • Persistence Scenario for NetBEAMS – Solution

  28. Data Model Design: mongoDB Document Instance • Data Manipulation: Programming Language Abstraction • "Dot Notation" • Where: sensor.location.latitude = 37.89 • When: time.transaction = Dec 17, 2009 • What: observation.pH = 7.11
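Dot notation addresses a field nested inside a document by joining the keys along its path. A minimal sketch of the idea in plain JavaScript follows; `setPath` and `getPath` are hypothetical helpers written for this illustration, not mongoDB functions.

```javascript
// Hypothetical helper: writes a value at a dotted path, creating
// intermediate objects as needed.
function setPath(doc, path, value) {
  const keys = path.split(".");
  let node = doc;
  for (const key of keys.slice(0, -1)) {
    if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
  return doc;
}

// Hypothetical helper: reads a value at a dotted path, or undefined if
// any key along the way is missing.
function getPath(doc, path) {
  return path.split(".").reduce(
    (node, key) => (node == null ? undefined : node[key]), doc);
}

const doc = {};
setPath(doc, "sensor.location.latitude", 37.89);           // where
setPath(doc, "time.transaction", new Date(2009, 11, 17));  // when (month is zero-based: 11 = December)
setPath(doc, "observation.pH", 7.11);                      // what

console.log(getPath(doc, "sensor.location.latitude")); // 37.89
console.log(getPath(doc, "observation.pH"));           // 7.11
```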

  29. Adding DSP Data Component • Adding mongoDB

  30. Deployment of the DSP Data Persistence • As External Storage: Single Server • As Data-Centric Storage: Distributed Server

  31. 6. Experimental Results: Correct Behavior and Performance

  32. 6. Experimental Results: Correct Behavior and Performance • Goal: Simulate RTC Environment • Experiment Setup - Infrastructure • Key-Value definition • Randomly Generated YSI Sonde Data (R0) • Simulates Different Types of Storages using Virtualization; • Workload • Compatible data volume used by RTC • 1 YSI = 483,840 documents = First Round • 5 YSI = 5 * 483,840 = 2,419,200 = Consecutive Rounds

  33. Scenarios • Use Cases as Agile User Stories – Persona, Action, Result • (R1) "As a marine biologist, I would like to search observations by filtering values of the sensor device's properties, such as water temperature and salinity on December 17, 2009, so that I can find the values associated with the observation." • db.SondeDataContainer.find( { "observation.Salinity" : 0.01, "observation.WaterTemperature" : 46.47, "time.valid" : new Date(2009, 11, 17) } ) • (Figure labels: Programming Language, mongoDB, Abstraction to Access Data)

  34. Scenarios • (U1) "As an estuarine ecologist, I would like to annotate observations from the time the "oil spill" occurred in the San Francisco Bay, so that I can maintain historical evidence of the impact of such an event." • db.SondeDataContainer.update( { "time.valid" : { $gte : new Date(2009, 9, 12), $lt : new Date(2009, 10, 13) } }, { $set : { tag : "oil spill" } } ) • (Figure labels: Programming Language, mongoDB, Abstraction to Access Data)
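The range-and-tag update for (U1) can be sketched in plain JavaScript without a database. Two points are worth making explicit: JavaScript `Date` months are zero-based, so `new Date(2009, 9, 12)` is October 12, 2009, and `$gte`/`$lt` describe a half-open interval. The sketch assumes the intended window runs from October 12 to November 13, 2009 (the cited oil spill report is dated October 30, 2009); the sample documents and the `tagRange` function are made up for illustration.

```javascript
// Hypothetical sample documents; only time.valid matters here.
const docs = [
  { time: { valid: new Date(2009, 9, 5) } },   // Oct 5, 2009: before the window
  { time: { valid: new Date(2009, 9, 30) } },  // Oct 30, 2009: inside the window
  { time: { valid: new Date(2009, 10, 13) } }  // Nov 13, 2009: upper bound, excluded
];

// Tags every document whose valid time falls in [from, to) and returns
// how many documents were modified.
function tagRange(collection, from, to, tag) {
  let modified = 0;
  for (const d of collection) {
    const t = d.time.valid.getTime();
    if (t >= from.getTime() && t < to.getTime()) { // $gte and $lt semantics
      d.tag = tag;                                 // $set semantics
      modified += 1;
    }
  }
  return modified;
}

const modified = tagRange(docs, new Date(2009, 9, 12), new Date(2009, 10, 13), "oil spill");
console.log(modified); // 1
```

Note that the sketch tags every match, whereas the mongo shell's `update` changes only the first matching document unless the multi option is requested.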

  35. Implementation of Use Cases • API diversity

  36. Implementation fulfills all the taxonomies • 1.35GB Claimed Disk Space • ~25,091 Inserts/min • Retrieval ~milliseconds • Update Varies (Depends on Partition Size, Dataset) • Simpler Implementation of Use Cases • Data accessibility • Different APIs, different languages • Key-Value Data Model • No schema changes to modify data design • Trade-off between Disk Storage (commodity) and performance
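As a rough back-of-the-envelope check, the reported insert rate can be combined with the workload sizes from the experiment setup. The sketch assumes the rate of about 25,091 inserts per minute stays constant over a whole run, which the deck does not state; the variable names are illustrative.

```javascript
const docsPerSonde = 483840;     // one YSI sonde, one year of samples (first round)
const sondes = 5;                // sondes in the current deployment
const insertsPerMinute = 25091;  // approximate rate reported on this slide

// Estimated load time, assuming a constant insert rate (an assumption).
const firstRoundMinutes = docsPerSonde / insertsPerMinute;
const allSondesMinutes = (docsPerSonde * sondes) / insertsPerMinute;

console.log(firstRoundMinutes.toFixed(1)); // 19.3
console.log(allSondesMinutes.toFixed(1));  // 96.4
```

Under that assumption, one sonde-year of data loads in roughly 19 minutes and the full five-sonde workload in a little over an hour and a half.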

  37. Data-Centric approach • Scales in terms of disk space available • Decreased processing time • Less data in a shard, faster query processing • Novel approach: alternative to existing ones • New Data Model Taxonomy

  38. 7. Conclusions and Future Works

  39. 7. Conclusions and Future Works • How Important is Data Collection • Environmental Sensor Networks: Hazard Alerts • How to describe data: Data Provenance guidelines • Important descriptions: annotations, tags • Contributions • Data Persistence in Sensor Networks Taxonomies • Novel Approach: KVP data model for sensor networks data • Implementation for External and Data-Centric Storages • Technology ready for Cloud Computing

  40. Future Works • Data-Centric Deployment with MapReduce Application • Sorting, subsets

  41. Future Works • RTC gathers data by time period; Data are mostly repeated • Wang et al surveyed efficient schedulers for Sensor Networks; • Yin et al and Chen et al showed the use of Data Clustering before sending data to data sink; • Creation of a DSP Data Clustering before persisting data; • Research Problems • In-network storage/query using KVP databases • Partitioned Data nodes • Event-Based application developed on top of YSI Sonde Data • “observation.Battery” carries the battery life-time information;

  42. References • Arno Puder, Teresa Johnson, Kleber Sales, Marcello de Sales, and Dale Davidson. A component-based sensor network for environmental monitoring. In SNA-2009: 1st International Conference on Sensor Networks and Applications, pages 54–60, San Francisco, CA, USA, November 2009. The International Society for Computers and Their Applications - ISCA. • I.F. Akyildiz, Weilian Su, Y. Sankarasubramaniam, and E. Cayirci. A survey on sensor networks. Communications Magazine, IEEE, 40(8):102–114, Aug 2002. • K. Romer and F. Mattern. The design space of wireless sensor networks. IEEE Wireless Communications, 11(6):54–61, December 2004. • Seungjae Lee, Changhwa Kim, and Sangkyung Kim. New database operators for sensor networks. In SERA '07: Proceedings of the 5th ACIS International Conference on Software Engineering Research, Management & Applications, pages 689–696, Washington, DC, USA, 2007. IEEE Computer Society. • Jonathan Ledlie, Chaki Ng, and David A. Holland. Provenance-aware sensor data storage. In ICDEW '05: Proceedings of the 21st International Conference on Data Engineering Workshops, page 1189, Washington, DC, USA, 2005. IEEE Computer Society. • Xiaotao Liu, Mark Corner, and Prashant Shenoy. Seva: Sensor-enhanced video annotation. ACM Trans. Multimedia Comput. Commun. Appl., 5(3):1–26, 2009. • Jacob Nikom. Real-time sensor data warehouse architecture using mysql. In MySQL Users Conference. O'Reilly Media, Inc., April 2005. • [sfb09] Oil spills into s.f. bay south of bay bridge. http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2009/10/30/BA9B1ACTST.DTL, October 2009. • Images • [sd] http://www.zess.uni-siegen.de/ipp_home/ipp/research/master-student-topics/ • [snc] http://www.dei.unipd.it/~schenato/pics/SensorNetwork.jpg • [ns] http://www.cc.gatech.edu/projects/disl/specialProjects/figure1.gif • [an] http://eurekr.com/pics/AnnotatinganImageinWPF_A7D8/image.png • [ysi] http://www.ckjorc.org/cn/admin/news/edit/UploadFile/200681616301130.jpg

  43. References • Daby M. Sow, Lipyeow Lim, Min Wang, and Kyu Hyun Kim. Persisting and querying biometric event streams with hybrid relational-xml dbms. In DEBS '07: Proceedings of the 2007 inaugural international conference on Distributed event-based systems, pages 189–197, New York, NY, USA, 2007. ACM. • Samuel R. Madden, Michael J. Franklin, Joseph M. Hellerstein, and Wei Hong. Tinydb: an acquisition query processing system for sensor networks. ACM Trans. Database Syst., 30(1):122–173, 2005.

  44. Department of Computer Science • A Key-Value-based Persistence Model for Sensor Networks ? • Marcello de Sales, Master of Science in Computer Science (msales@sfsu.edu) • http://code.google.com/p/netbeams • http://www.netbeams.org • "The brick walls are not there to keep us out. The brick walls are there to give us a chance to show how badly we want something. Because the brick walls are there to stop the people who don't want it badly enough." Dr. Randy Pausch

  45. DSP in practice = NetBEAMS Use Cases • Data Payload for the YSI Sonde 6600V2 • SondeDataType: representation for the collected data • SondeDataContainer: collection of the collected data

  46. Data Sensor Platform (DSP) Message Structure • DSP Message • Header • Producer • Consumer • Body • Message Content • DSP Messages Container • Package of DSP Messages
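The message outline on this slide can be sketched as a pair of plain objects. The deck does not show the actual DSP types, so the constructor names, the field names, and the component names "ysi-sonde" and "persistence" are all assumptions.

```javascript
// Hypothetical constructor mirroring the outline: a header naming the
// producer and consumer, and a body holding the message content.
function createMessage(producer, consumer, content) {
  return { header: { producer, consumer }, body: { content } };
}

// Hypothetical container: a package of DSP messages sent together.
function createContainer(messages) {
  return { messages: messages.slice() }; // copy so later changes to the input do not leak in
}

const msg = createMessage("ysi-sonde", "persistence", { pH: 7.11 });
const container = createContainer([msg]);
console.log(container.messages.length); // 1
```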

  47. Data Sensor Platform (DSP) Communication Mechanism • DSP Broker • Local delivery • Remote delivery • Gateway Component • DSP Matcher • Filtering based on rules • Independent Per Host
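One way to picture the matcher and broker working together is a per-host rule table that the matcher filters and the broker turns into local or remote deliveries. The rule format, the host names, and both function names below are assumptions made for this sketch; the deck does not describe the real rule syntax.

```javascript
// Hypothetical rule table for one host: each rule maps a producer to a
// consumer and the host where that consumer runs.
const rules = [
  { producer: "ysi-sonde", consumer: "persistence", host: "localhost" },
  { producer: "ysi-sonde", consumer: "archiver",    host: "sink.example.org" },
  { producer: "gps",       consumer: "persistence", host: "localhost" }
];

// Matcher: keep only the rules that apply to the message's producer.
function match(message, ruleTable) {
  return ruleTable.filter((r) => r.producer === message.header.producer);
}

// Broker: choose local or remote delivery for every matched rule.
function route(message, ruleTable, localHost) {
  return match(message, ruleTable).map((r) => ({
    consumer: r.consumer,
    delivery: r.host === localHost ? "local" : "remote"
  }));
}

const message = { header: { producer: "ysi-sonde" }, body: { content: { pH: 7.11 } } };
const deliveries = route(message, rules, "localhost");
console.log(deliveries);
```

Because each host evaluates its own rule table, the same message can be delivered locally on one node and forwarded through the gateway on another.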

  48. DSP Data Persistence Component

  49. DSP Data Persistence Component

  50. 3. NetBEAMS: A Case Study • DSP Data Persistence Component Requirements • Open-Source • Support Data-Centric • Free of charge • Accessibility (API) • Cope with RTC's Small Volume of Data
