RFID Data Management


    Presentation Transcript
    1. RFID Data Management Kamlesh Laddhad (05329014) Karthik B.(05329021) Guide: Prof. Bernard Menezes

    2. Outline • Introduction to RFID Technology. • Issues with RFID Technology. • RFID Data Characteristics. • Data Warehousing. • Expressive Temporal Model: Dynamic Relationship ER Model. • RFID-Cuboids. • Use of Bitmap Datatype. • Data Cleaning. • Extensible Sensor stream Processing (ESP). • Statistical sMoothing for Unreliable RFid data (SMURF). • Future Plans.

    3. Introduction • Radio Frequency Identification (RFID): • An automatic identification and data capture technology. • Fast. • Needs no contact or line of sight. • Uses radio-frequency waves to transfer data. • Components • Tag: small, low-cost device that can hold a limited amount of data. • Associated with objects such as pallets, cases, and even individual items. • Reader: recognizes the presence of a tag and reads the information stored on it. • A unique electronic product code (EPC) is associated with each tag. • By placing RFID readers at various locations, one can track the movement of objects through supply chain networks.

    4. Applications and Adoptions • Supply chain management: real-time inventory tracking. • US Department of Defense: shipments to armed forces. • Retail: active shelves monitor product availability. • Wal-Mart, Albertsons: major retail stores. • Access control: toll collection, transportation. • Airline luggage management: • British Airways: 20 million bags a year. • Implemented to reduce lost or misplaced luggage. • Anti-counterfeiting and security: • Food and Drug Administration: to reduce counterfeiting in the pharmaceutical supply chain.

    5. Perspectives for RFID Research • The physics of building tags and readers: • Tags have few gates: apart from basic operation, they have very little computing power. • Radio-frequency signals have trouble operating in certain physical media. • Privacy and safety issues: • Complex encryption schemes are not possible on RFID tags. • Counterfeiting by means of either illegitimate readers or spoofed tags is possible. • Reader-tag communication is wireless: third parties can eavesdrop on signals. • Software architecture to collect, filter, organize, and answer online queries: • The number of tags is proportional to the number of items being serviced or tracked. • The number of readers is proportional to the number of strategic locations or areas to be traced. • Each reader picks up tag signals on a continuous basis. • The data generated by RFID systems is enormous: • E.g., Wal-Mart is expected to generate 7 terabytes of RFID data per day. • Our focus: the third stream.

    6. Data Warehousing Techniques

    7. Data Management Challenges • Data explosion: example • A retailer with 3,000 stores, each selling 10,000 items a day. • Each item moves 10 times on average before being sold. • Each movement is recorded as (EPC, location, second). • Data volume: 3,000 × 10,000 × 10 = 300 million tuples per day. • Example OLAP query: “What is the average time for items to move from warehouse to checkout counter in March 2006?” • Costly to answer when March 2006 alone holds billions of tuples.
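
The arithmetic behind the data-explosion example can be checked in a few lines. This is only a back-of-the-envelope sketch: the tuple size `BYTES_PER_TUPLE` is a hypothetical figure chosen for illustration and does not come from the slides.

```python
# Back-of-the-envelope data volume for the retailer example on this slide.
stores = 3_000
items_per_store_per_day = 10_000
moves_per_item = 10          # each item moves 10 times before being sold

tuples_per_day = stores * items_per_store_per_day * moves_per_item
tuples_in_march = tuples_per_day * 31   # March has 31 days

# Hypothetical tuple size (not from the slides):
# 8-byte EPC + 8-byte location id + 8-byte timestamp.
BYTES_PER_TUPLE = 24

print(f"{tuples_per_day:,} tuples per day")
print(f"{tuples_in_march:,} tuples in March")
print(f"{tuples_per_day * BYTES_PER_TUPLE / 1e9:.1f} GB per day at {BYTES_PER_TUPLE} bytes per tuple")
```

At this rate a single month already holds several billion tuples, which is why the slide calls a full-month OLAP query costly.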

    8. Data Characteristics • Temporal and history oriented • Applications dynamically generate observations (readings). • Object locations and containment relationships among objects change over time. • Need: expressive data model. • Inaccurate data and implicit semantics • False positive: a non-existent tag is incorrectly read. • False negative: the reader misses a tag that was in its vicinity. • Noisy data and duplicate readings (redundancy): the same tag is read more than once. • Need: automated data filtering and transformation. • Streaming and large volume • Objects stay in place for long durations and readers record them periodically, so data keeps accumulating. • This data must be preserved for tracking and monitoring. • Need: scalable storage scheme and compression techniques to reduce data. • Data granularity • The granularity of data collection needs to be decided. • It differs across applications.

    9. Warehousing Helps!! • Lossless compression • Remove redundancy: (r1,l1,t1) (r1,l1,t2) ... (r1,l1,t10) => (r1,l1,t1,t10) • Group objects that move and stay together. • Data cleaning: multi-reading, missed-reading, error-reading, bulky movement. • Data mining: find trends, outliers, frequent, sequential, and flow patterns. • Multi-dimensional summary: product, location, time, … • Store manager: checks item movements from the backroom to different shelves in the store. • Region manager: collapses intra-store movements and looks at distribution centers, warehouses, and stores. • Query processing • Support for OLAP: roll-up, drill-down, slice, and dice. • Path query: new to RFID warehouses, about the structure of paths. • Which products that go through quality control have shorter paths? • Which locations are common to the paths of a set of defective auto parts? • Identify containers at a port that have deviated from their historic paths.
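
The redundancy-removal rule on this slide, where ten readings (r1,l1,t1) ... (r1,l1,t10) become one record (r1,l1,t1,t10), can be sketched as follows. The function name `collapse_readings` and the sample data are made up for illustration, and the sketch assumes that consecutive readings at the same location belong to one uninterrupted stay.

```python
from itertools import groupby

def collapse_readings(readings):
    """Collapse (epc, location, time) readings into (epc, location, time_in, time_out).

    A new stay starts whenever a tag's location changes in time order.
    """
    stays = []
    ordered = sorted(readings, key=lambda r: (r[0], r[2]))   # by tag, then time
    for epc, tag_readings in groupby(ordered, key=lambda r: r[0]):
        # Consecutive readings at the same location form one stay.
        for location, run in groupby(tag_readings, key=lambda r: r[1]):
            times = [t for _, _, t in run]
            stays.append((epc, location, times[0], times[-1]))
    return stays

readings = [("r1", "l1", t) for t in range(1, 11)]
readings += [("r1", "l2", 11), ("r1", "l2", 12)]
print(collapse_readings(readings))
```

Twelve raw readings shrink to two stay records here; the saving grows with how long objects sit in one place.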

    10. Dynamic Relationship ER Model • Proposed by Wang and Liu from Siemens. • RFID entities are static and are not altered. • RFID relationships are dynamic and change all the time. • Two types of dynamic relationships are added: • Event-based dynamic relationship: a timestamp attribute records when the event occurred. • State-based dynamic relationship: tstart and tend attributes record the lifespan of a state.

    11. Static entity tables • SENSOR (sensor_epc, name, description) • OBJECT (object_epc, name, description) • LOCATION (location_id, name, owner) • TRANSACTION (transaction_id, transaction_type) • Dynamic relationship tables • OBSERVATION (sensor_epc, value, timestamp) • CONTAINMENT (epc, parent_epc, tstart, tend) • OBJECTLOCATION (epc, location_id, tstart, tend) • SENSORLOCATION (sensor_epc, location_id, position, tstart, tend) • TRANSACTIONITEM (transaction_id, epc, timestamp)

    12. Monitoring • Missing RFID Object Detection: • Find when and where the object with EPC 'MEPC' was lost. • select location_id, tstart, tend from objectlocation where epc='MEPC' and tstart = ( select max(o.tstart) from objectlocation o where o.epc='MEPC' ) • Check whether objects are missing at the current location C, knowing that all objects were present at the previous location L at time T. • select l.epc from objectlocation l where l.location_id = 'L' and l.tstart <= 'T' and l.tend >= 'T' and l.epc not in ( select c.epc from objectlocation c where c.location_id = 'C' )
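
To see the two monitoring queries run, the sketch below loads a toy OBJECTLOCATION table into an in-memory SQLite database. The sample rows, the integer timestamps, and the helper names `last_known_stay` and `missing_at` are assumptions made for illustration; the slides do not say which DBMS or timestamp format was used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objectlocation (epc TEXT, location_id TEXT, tstart INTEGER, tend INTEGER)"
)
conn.executemany(
    "INSERT INTO objectlocation VALUES (?, ?, ?, ?)",
    [
        ("E1", "L", 10, 20), ("E1", "C", 25, 40),
        ("E2", "L", 10, 20),                      # E2 never arrives at C
        ("E3", "L", 10, 20), ("E3", "C", 26, 40),
    ],
)

def last_known_stay(epc):
    """Most recent (location_id, tstart, tend) row for an EPC."""
    return conn.execute(
        """SELECT location_id, tstart, tend FROM objectlocation
           WHERE epc = ?
             AND tstart = (SELECT MAX(o.tstart) FROM objectlocation o WHERE o.epc = ?)""",
        (epc, epc),
    ).fetchone()

def missing_at(current, previous, t):
    """EPCs that were at `previous` at time t but have no row at `current`."""
    rows = conn.execute(
        """SELECT l.epc FROM objectlocation l
           WHERE l.location_id = ? AND l.tstart <= ? AND l.tend >= ?
             AND l.epc NOT IN (SELECT c.epc FROM objectlocation c WHERE c.location_id = ?)
           ORDER BY l.epc""",
        (previous, t, t, current),
    ).fetchall()
    return [epc for (epc,) in rows]

print(last_known_stay("E2"))
print(missing_at("C", "L", 15))
```

Both functions use parameter binding in place of the literal 'MEPC', 'L', 'C', and 'T' placeholders from the slide.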

    13. Tracking • RFID Object Moving Time Inquiry: • How long does it take to supply 'OEPC' from location S to location E? • select (e.tstart - s.tstart) as supplying_time from objectlocation e, objectlocation s where e.epc = 'OEPC' and s.epc = 'OEPC' and s.location_id = 'S' and e.location_id = 'E'

    14. Compression Idea [Figure: supply chain from Factory to Dist. Center 1 and Dist. Center 2, then store 1 and store 2, then shelf 1 and shelf 2; labels: 10 pallets (1000 cases), 20 cases (1000 packs), 10 packs (12 sodas)] • Bulky object movements • Objects often move and stay together through the supply chain. • If 1000 packs of product P stay together at the distribution center, register a single record: (GID, distribution center, time_in, time_out). • GID is a generalized identifier that represents the 1000 packs that stayed together at the distribution center. • Analysis usually takes place at a much higher level of abstraction than the one present in raw RFID data.
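
A minimal sketch of the bulky-movement idea: items that share a location and a stay interval are replaced by one record keyed by a GID. The grouping rule (exact match on location, time_in, and time_out), the `g0`, `g1` naming, and the function name `compress_stays` are simplifying assumptions for illustration, not the scheme of the cited warehousing paper.

```python
from collections import defaultdict

def compress_stays(stays):
    """Group (epc, location, time_in, time_out) rows that share location and interval.

    Returns (stay_table, gid_members): one stay row per group, plus the EPCs
    behind each GID.
    """
    groups = defaultdict(list)
    for epc, location, time_in, time_out in stays:
        groups[(location, time_in, time_out)].append(epc)

    stay_table, gid_members = [], {}
    for n, (key, epcs) in enumerate(sorted(groups.items())):
        gid = f"g{n}"                      # hypothetical GID naming
        stay_table.append((gid, *key))
        gid_members[gid] = sorted(epcs)
    return stay_table, gid_members

# 1000 packs stay together at the distribution center; two of them reach store1.
stays = [(f"pack{i}", "dist_center", 100, 200) for i in range(1000)]
stays += [("pack0", "store1", 250, 300), ("pack1", "store1", 250, 300)]
table, members = compress_stays(stays)
print(len(stays), "raw rows ->", len(table), "stay rows")
```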

    15. RFID Cuboids • Fact table: (EPC, location, time_in, time_out). • In a supply chain, items travel through a series of locations. • Query: what is the average time that product P stays at a store in location A? • Traditional cubes miss the path structure of the data. • Stay table: (GIDs, location, time_in, time_out: measures) • Records information on items that stay together at a given location. • Recording transitions instead would make queries difficult to answer, since many intersections are needed. • Map table: (GID, <GID1,..,GIDn>) • Links together stages that belong to the same path; provides additional compression and query-processing efficiency. • A high-level GID points to lower-level GIDs. • Saving complete EPC lists instead would mean high I/O cost to retrieve long lists and costly query processing. • Information table: (EPC list, attribute 1,...,attribute n) • Records path-independent attributes of the items, e.g., color, manufacturer, price.
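
The interplay of the stay table and the map table can be illustrated with a toy example in which a high-level GID points to lower-level GIDs and only the lowest level carries EPC lists. All identifiers and data below (`map_table`, `epc_lists`, `epcs_of`, `locations_of`, the GID names) are hypothetical and only illustrate the idea; the cited paper's actual encoding may differ.

```python
# Hypothetical map table: each GID points to lower-level GIDs.
map_table = {
    "g_factory": ["g_dc1", "g_dc2"],
    "g_dc1": ["g_store1", "g_store2"],
    "g_dc2": [],
    "g_store1": [],
    "g_store2": [],
}
# Only the lowest-level GIDs store explicit EPC lists.
epc_lists = {"g_dc2": ["e5"], "g_store1": ["e1", "e2"], "g_store2": ["e3", "e4"]}

# Stay table rows: (GID, location, time_in, time_out).
stay_table = [
    ("g_factory", "factory", 0, 10),
    ("g_dc1", "dc1", 12, 30),
    ("g_store1", "store1", 35, 50),
]

def epcs_of(gid):
    """All EPCs covered by a GID, found by following the map table downward."""
    children = map_table.get(gid, [])
    if not children:
        return list(epc_lists.get(gid, []))
    result = []
    for child in children:
        result.extend(epcs_of(child))
    return result

def locations_of(epc):
    """Locations visited by an EPC, in stay-table order."""
    return [loc for gid, loc, _, _ in stay_table if epc in epcs_of(gid)]

print(epcs_of("g_factory"))
print(locations_of("e1"))
```

The stay at the factory is stored once for all five items, yet the path of a single item can still be recovered through the map table.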

    16. EPC Overview • Electronic product code • Standard naming scheme, proposed by the Auto-ID Center. • An EPC uniquely identifies an item. • Format: <Header, Manager No., Object Class, Serial No.> • Header: identifies the length, type, structure, version, and generation of the EPC. • Manager Number: identifies an organizational entity. • Object Class: identifies a “class”, or type of thing. • Serial Number: the specific instance of the Object Class being tagged. • We will refer to • <Header, Manager No., Object Class> as the prefix • <Serial No.> as the suffix

    17. Use of Bitmap Datatype • Observation: items move together. • Groups of items in the same proximity, e.g., on a shelf or in a shipment. • Groups of items with the same property, e.g., the same product. • Use a bitmap type to model the collections of EPCs that occur in item-tracking applications. • Instead of storing one tuple per item, store one tuple for all items that share the same prefix. • Fields that replace the single epc column: • <Len, Suffix_length, Prefix, Suffix_start, Suffix_end, bitmap>
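
The encoding can be sketched with a Python integer standing in for the bitmap: bit k is set when the EPC with suffix `Suffix_start + k` belongs to the collection. The 24-bit suffix width (`SUFFIX_BITS`), the function names, and the sample EPCs are assumptions for illustration, and the `Len` and `Suffix_length` fields are left out to keep the sketch short.

```python
SUFFIX_BITS = 24   # assumption: width of the serial number (suffix)

def epcs_to_segment(epcs):
    """Encode EPCs that share one prefix as (prefix, suffix_start, suffix_end, bitmap)."""
    prefixes = {e >> SUFFIX_BITS for e in epcs}
    if len(prefixes) != 1:
        raise ValueError("all EPCs in one segment must share the same prefix")
    suffixes = [e & ((1 << SUFFIX_BITS) - 1) for e in epcs]
    start, end = min(suffixes), max(suffixes)
    bitmap = 0
    for s in suffixes:
        bitmap |= 1 << (s - start)      # bit k stands for suffix start + k
    return prefixes.pop(), start, end, bitmap

def segment_to_epcs(prefix, start, end, bitmap):
    """Decode a segment back into the sorted list of EPCs it represents."""
    return [(prefix << SUFFIX_BITS) | (start + k)
            for k in range(end - start + 1) if (bitmap >> k) & 1]

# Three made-up EPCs with a common prefix and suffixes 0x10, 0x12, 0x13.
epcs = [0x4AA890001F000010, 0x4AA890001F000012, 0x4AA890001F000013]
segment = epcs_to_segment(epcs)
print([hex(x) for x in segment])
print(segment_to_epcs(*segment) == sorted(epcs))
```

One tuple now represents the whole collection; its bitmap is as long as the suffix range, not the number of items.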

    18. Example: Product Inventory [Figure: the same product inventory stored with EPC collections and with epc_bitmaps]

    19. Use of Bitmap Datatype • 64-bit EPC layout: Header (2 bits), EPC_Manager (21 bits), Object_Class (17 bits), Serial_Number (24 bits) • Example EPC range: 0x4AA890001F62C160 … 0x4AA890001FA0B38E
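
With the field widths given on this slide (2, 21, 17, and 24 bits), an EPC can be split by shifting and masking. The sketch below applies this to the two example values; the function names `split_epc` and `prefix_and_suffix` are illustrative only.

```python
# Field widths as listed on the slide, most significant field first.
FIELD_BITS = [("header", 2), ("epc_manager", 21), ("object_class", 17), ("serial_number", 24)]

def split_epc(epc):
    """Split a 64-bit EPC into its four fields."""
    fields, shift = {}, 64
    for name, width in FIELD_BITS:
        shift -= width
        fields[name] = (epc >> shift) & ((1 << width) - 1)
    return fields

def prefix_and_suffix(epc):
    """Prefix = header + manager + object class (40 bits); suffix = serial number (24 bits)."""
    return epc >> 24, epc & 0xFFFFFF

first, last = 0x4AA890001F62C160, 0x4AA890001FA0B38E
print(split_epc(first))
print(hex(prefix_and_suffix(first)[0]), hex(prefix_and_suffix(last)[0]))
```

Both example EPCs share the 40-bit prefix 0x4AA890001F and differ only in the serial number, which is what makes a single bitmap tuple per prefix possible.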

    20. Bitmap Operations • To use this datatype in SQL, we need operations on bitmaps. • Conversion and counting operations: epc2Bmap, bmap2Epc, and bmap2Count • Pairwise logical operations: bmapAnd, bmapOr, bmapMinus, and bmapXor • Maintenance operations: bmapInsert and bmapDelete • Membership testing operation: bmapExists • Comparison operation: bmapEqual
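
The semantics of these operations can be mimicked with Python integers used as bit sets, where bit s stands for the item with suffix s. The functions below are toy stand-ins named after the operations on the slide; they are not the DBMS implementation, and their signatures are assumptions.

```python
def epc2bmap(suffixes):
    """Build a bitmap with one bit set per item suffix."""
    bmap = 0
    for s in suffixes:
        bmap |= 1 << s
    return bmap

def bmap2epc(bmap):
    """List the suffixes whose bits are set, in increasing order."""
    return [k for k in range(bmap.bit_length()) if (bmap >> k) & 1]

def bmap2count(bmap):
    return bin(bmap).count("1")

def bmap_and(a, b):   return a & b
def bmap_or(a, b):    return a | b
def bmap_minus(a, b): return a & ~b      # items in a that are not in b
def bmap_xor(a, b):   return a ^ b

def bmap_insert(bmap, s): return bmap | (1 << s)
def bmap_delete(bmap, s): return bmap & ~(1 << s)
def bmap_exists(bmap, s): return bool((bmap >> s) & 1)
def bmap_equal(a, b):     return a == b

# Shelf contents at two points in time; the difference is what was added.
shelf_t1 = epc2bmap([1, 2, 3])
shelf_t2 = epc2bmap([2, 3, 4, 5])
print(bmap2epc(bmap_minus(shelf_t2, shelf_t1)))
```

The last line mirrors a query of the form `bmap2Epc(bmapMinus(s2.item_bmap, s1.item_bmap))`.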

    21. Use of these operations in SQL • Items added to a given shelf between time t1 and t2: • SELECT bmap2Epc(bmapMinus(s2.item_bmap, s1.item_bmap)) FROM Shelf_Inventory s1, Shelf_Inventory s2 WHERE s1.shelf_id = <sid1> AND s1.shelf_id = s2.shelf_id AND s1.time = <t1> AND s2.time = <t2>; • A book store classifies its books into various categories. • The following query finds the shelves that currently hold books with both the 'Adventure' and 'Romance' properties: • SELECT s.shelf_id FROM Shelf_Inventory s WHERE bmap2Count(bmapAnd(s.item_bmap, (SELECT bmapAnd(p.Adventure, p.Romance) FROM Property_Inventory p))) > 0 AND s.time = <current_date>;

    22. Road Ahead • Extension to the bitmap proposal: • The bitmap datatype is more appropriate for initial bulk loads and batch updates. • It performs badly for incremental updates. • A 'hybrid scheme' for incremental updates: • Maintain periodic inventory checkpoints using bitmaps. • For changes occurring between checkpoints, maintain a traditional item-level table. • Answer queries by merging the latest checkpoint bitmap with the item-level data for the period since that checkpoint. • The epc_suffix values in a collection may not be contiguous: • The bitmap will be sparse, with a lot of zeros. • Compress it using some encoding scheme. • Good for initial bulk loading and batch updates. • May reduce the efficiency of bitmap operations.
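
The proposed hybrid scheme can be sketched as a merge of a checkpoint bitmap with an item-level change log. The log format of (suffix, op) rows and the function name `current_inventory` are assumptions made for illustration; the slides do not specify either.

```python
def current_inventory(checkpoint_bmap, changes):
    """Merge the latest checkpoint bitmap with item-level changes logged since then.

    `changes` is a time-ordered list of (suffix, op) rows, where op is
    "add" or "remove" (a hypothetical log format).
    """
    bmap = checkpoint_bmap
    for suffix, op in changes:
        if op == "add":
            bmap |= 1 << suffix
        elif op == "remove":
            bmap &= ~(1 << suffix)
        else:
            raise ValueError(f"unknown op: {op}")
    return bmap

checkpoint = 0b0111                       # items 0, 1, 2 present at the checkpoint
changes = [(3, "add"), (1, "remove")]     # item-level rows since the checkpoint
print(bin(current_inventory(checkpoint, changes)))
```

Incremental updates touch only the small item-level table, and the bitmap is rewritten only when the next checkpoint is taken.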

    23. Open Problems • Efficient methods for data mining problems: • Trend analysis • Outlier detection • Path clustering • We will try exploring data mining applications on RFID data.

    24. RFID Data Cleaning

    25. Issues in Data Cleaning • Lack of completeness • RFID readers capture only 60-70% of all tags that are in their vicinity. • Smoothing is applied to the data to make up for the missed intermediate readings. • Temporal nature of data, or tag dynamics • RFID tags are in motion, which makes them more difficult to handle. • The motion of a tag causes readings to be dropped. • RFID data streams arrive fast and in large volume. • Hence filtering is important before the data is sent to the database.

    26. Current Strategies • Temporal granule: • Based on the fact that tag data does not differ much over a small time period. • Readings within a small time frame can be grouped together. • Spatial granule: • Similarly, data from physically close readers is also homogeneous.

    27. Stages of ESP • Point: operates over a single value in a sensor stream, filtered by a predicate in the WHERE clause. • Smooth: corrects for missed readings temporally, over one input only, at a granularity defined by the application; uses an aggregate function over the input. • Merge: corrects for missed readings spatially, at a granularity specified by the application; grouped by the specified spatial granule.

    28. Stages of ESP (contd.) • Arbitrate: deals with conflicts between different spatial granules; readings are grouped by spatial granule first, and a HAVING construct then determines the conflicts. • Virtualize: combines data streams from different sources, possibly from different kinds of devices; a join construct combines the streams, and the result is filtered using a predicate.
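
The first four stages can be pictured as a chain of small functions over a list of raw readings. ESP expresses these stages as declarative queries; the sketch below is only a procedural analogue under simplifying assumptions: the reading format (epoch, reader_id, tag_id), the reader-to-granule mapping `GRANULE_OF`, and the rule "most readings wins" in the arbitrate step are all hypothetical. The Virtualize stage is omitted.

```python
from collections import Counter

# Hypothetical mapping of readers to spatial granules.
GRANULE_OF = {"r1": "shelf_A", "r2": "shelf_A", "r3": "shelf_B"}

def point(readings, known_tags):
    """Point: drop single readings that fail a predicate (here, unknown tag IDs)."""
    return [r for r in readings if r[2] in known_tags]

def smooth(readings, now, window):
    """Smooth: per (reader, tag), count readings in the temporal granule (now - window, now]."""
    counts = Counter()
    for epoch, reader, tag in readings:
        if now - window < epoch <= now:
            counts[(reader, tag)] += 1
    return counts

def merge(counts):
    """Merge: add up the counts of all readers that belong to the same spatial granule."""
    merged = Counter()
    for (reader, tag), n in counts.items():
        merged[(GRANULE_OF[reader], tag)] += n
    return merged

def arbitrate(merged):
    """Arbitrate: when several granules report a tag, keep the one with the most readings."""
    best = {}
    for (granule, tag), n in merged.items():
        if tag not in best or n > best[tag][1]:
            best[tag] = (granule, n)
    return {tag: granule for tag, (granule, _) in best.items()}

readings = [(1, "r1", "t1"), (2, "r2", "t1"), (3, "r3", "t1"),
            (3, "r1", "ghost"), (4, "r3", "t2")]
cleaned = arbitrate(merge(smooth(point(readings, {"t1", "t2"}), now=4, window=5)))
print(cleaned)
```

Tag t1 is seen by readers of both shelves, and the arbitrate step assigns it to shelf_A because two of its three readings come from there.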

    29. Smooth stage • False positives (erroneous readings): reporting objects that are not actually present. • False negatives (missed readings): not reporting objects that actually are present. • [Figure: false positives and false negatives, from [Jeff06]]

    30. Tag List • The reader has an internal table called the tag list. • An epoch is the smallest unit of interaction between the reader and the middleware. • Every epoch consists of a certain number of interrogation cycles. • An interrogation cycle is one run of the reader protocol to determine all tags. • At every epoch the reader sends the tag list to the middleware.

    31. SMURF – Per tag Cleaning • SMURF uses statistical methods to reduce the false negatives and false positives in the RFID stream. • The goal is twofold: first, to determine the statistical window size, and second, to ensure that tag transitions are detected. • To determine the window size, we fit a probability distribution to the sample size. • To detect the transition of a tag out of the reader's vicinity, we define a 98% confidence interval within that probability distribution on the sample size |Si|.

    32. SMURF – Per tag Cleaning (contd.) • Using the tag list, the per-epoch sampling probability $p_{i,t}$ is determined: $p_{i,t}$ = (number of times tag i was read in epoch t) / (interrogation cycles per epoch). • We average this over the sample $S_i$ to get the average read rate $p_i^{avg}$ for tag i. • If the same probability $p_i^{avg}$ is assumed for each epoch throughout the window, then each epoch is a Bernoulli trial whose success is an observation of the tag.

    33. SMURF – Per tag Cleaning (contd.) • So $|S_i|$ is a binomial random variable for a sample $S_i$, with mean $w_i \, p_i^{avg}$ and variance $w_i \, p_i^{avg} (1 - p_i^{avg})$. • Now using this we can express the window size as a limit, • If the current window size is less than the calculated one, then the window size is adjusted accordingly. • Similarly, using the central limit theorem for transition detection, we get $\left| |S_i| - \mu \right| > 2\sigma$, with $\mu = w_i \, p_i^{avg}$ and $\sigma = \sqrt{w_i \, p_i^{avg} (1 - p_i^{avg})}$.
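
The per-tag quantities can be put together in a short sketch. The mean, variance, and 2σ test follow the slide. The window bound in `required_window` is an assumption: the slide's own formula is not preserved in this transcript, so the sketch uses a bound that follows from $(1 - p)^w \le e^{-pw}$, namely that a window of at least $\ln(1/\delta)/p$ epochs misses the tag with probability at most δ. The parameter `delta` and all function names are illustrative.

```python
import math

def per_epoch_probability(reads_in_epoch, cycles_per_epoch):
    """p_{i,t}: fraction of interrogation cycles in one epoch that saw the tag."""
    return reads_in_epoch / cycles_per_epoch

def required_window(p_avg, delta=0.05):
    """Epochs needed so that the tag is missed with probability at most delta.

    Assumption (not taken from the slide): since (1 - p)^w <= exp(-p * w),
    any w >= ln(1 / delta) / p is large enough.
    """
    return math.ceil(math.log(1 / delta) / p_avg)

def transition_detected(observed, window, p_avg):
    """True if the observed count |S_i| is more than 2 sigma away from w * p_avg."""
    mean = window * p_avg
    sigma = math.sqrt(window * p_avg * (1 - p_avg))
    return abs(observed - mean) > 2 * sigma

p_avg = per_epoch_probability(7, 10)        # tag read in 7 of 10 interrogation cycles
print(required_window(p_avg))
print(transition_detected(observed=2, window=10, p_avg=p_avg))
print(transition_detected(observed=6, window=10, p_avg=p_avg))
```

A tag with a low read rate needs a longer window for completeness, while a sharp drop in observations inside the window suggests that the tag has left.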

    34. Normal Sliding window…. • Epoch based mid-point sliding window • Emits a reading with an epoch value corresponding to the middle of the window

    35. Ensuring Completeness • In the first window, $p_i^{avg}$ demands a larger window. • Thus the window size is increased.

    36. Transition Detection • In the first window, the number of readings decreases by a statistically significant amount. • Thus a transition is likely to have occurred, so the window is halved. [Franklin06]

    37. SMURF – Multi-tag aggregate Cleaning • Similar to per-tag cleaning, the window for multi-tag cleaning is determined by: Here, $p^{avg}$ is the average per-epoch sampling probability over all observed tags. • To detect a transition in the population count, we estimate the population count of two windows, $[t - w, t]$ and $[t - w/2, t]$, with true populations $N_w$ and $N_{w'}$. • A transition is flagged when the difference between the two estimates exceeds the limit $2(\sigma_w + \sigma_{w'})$.

    38. SMURF – Multi-tag aggregate Cleaning • To calculate the estimate of the population count, we use π-estimators; the estimated population count is given by: • Similarly, by π-estimators, and assuming independence across different tags, the variance of the estimate is estimated as: • Here $\pi_i$ is the probability of reading tag i at least once during the whole window, given by $\pi_i = 1 - (1 - p_i^{avg})^{w}$.
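
A sketch of the π-estimator computation follows. The formula for $\pi_i$ comes from the slide. The two estimators are the standard Horvitz-Thompson forms for independent sampling, $\hat{N} = \sum_{i \in S} 1/\pi_i$ and $\widehat{Var} = \sum_{i \in S} (1 - \pi_i)/\pi_i^2$; the slide's own formulas are not preserved in this transcript, so treating these as the intended ones is an assumption. The transition test compares a full window with its second half, as described on the previous slide, and the sample read rates are made up.

```python
import math

def pi_i(p_avg_i, window):
    """Probability that tag i is read at least once in `window` epochs."""
    return 1 - (1 - p_avg_i) ** window

def estimate_population(p_avgs, window):
    """Estimate the population from the observed tags' read rates.

    Each observed tag counts 1 / pi_i (assumed Horvitz-Thompson form).
    Returns (n_hat, var_hat).
    """
    pis = [pi_i(p, window) for p in p_avgs]
    n_hat = sum(1 / pi for pi in pis)
    var_hat = sum((1 - pi) / pi ** 2 for pi in pis)
    return n_hat, var_hat

def population_transition(p_avgs_full, p_avgs_half, window):
    """Flag a transition when the two estimates differ by more than 2 (sigma_w + sigma_w')."""
    n_w, var_w = estimate_population(p_avgs_full, window)
    n_h, var_h = estimate_population(p_avgs_half, window / 2)
    return abs(n_w - n_h) > 2 * (math.sqrt(var_w) + math.sqrt(var_h))

n_hat, var_hat = estimate_population([0.5, 0.5], window=2)
print(round(n_hat, 3), round(var_hat, 3))

# Ten tags seen over the full window, but only two of them in its second half.
print(population_transition([0.5] * 10, [0.5] * 2, window=4))
```

Tags with a low read probability count for more than one item in the estimate, which compensates for similar tags that were never observed at all.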

    39. The Road Ahead… • RFID applications do not tolerate delays in data delivery. • Data resides either in the cache or in the database; data in the database increases processing time, and data in the cache cannot be accessed with SQL-like queries. • Anomaly detection is also an important part of object tracking. • Issues like untraceability, forward security, and database desynchronization are still not completely resolved. • One more serious problem with RFID is counterfeiting. • In the next stage we expect to look into some of these issues.

    40. ????

    41. Thank You.

    42. References • Xiaolei Li, Hector Gonzalez, Jiawei Han, and Diego Klabjan. Warehousing and analyzing massive RFID data sets. ICDE, 2006. • Fusheng Wang and Peiya Liu. Temporal management of RFID data. VLDB, 2005. • Timothy Chorma, Ying Hu, Seema Sundara, and Jagannathan Srinivasan. Supporting RFID-based item tracking applications in Oracle DBMS using a bitmap datatype. VLDB, 2005.

    43. References • Minos Garofalakis, Shawn R. Jeffery, and Michael J. Franklin. Adaptive cleaning for RFID data streams. VLDB, 2006. • Michael J. Franklin, Wei Hong, Shawn R. Jeffery, Gustavo Alonso, and Jennifer Widom. Declarative support for sensor data cleaning. Pervasive, 2006. • Sridhar Ramachandran, Sudarshan S. Chawathe, Venkat Krishnamurthy, and Sanjay E. Sarma. Managing RFID data. VLDB, 2004.