Miodrag Bolic

Stony Brook University, Department of Electrical and Computer Engineering. Dissertation Defense: ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS. Miodrag Bolic. Advisor: Prof. Petar M. Djuric.

Presentation Transcript


  1. STONY BROOK UNIVERSITY Department of Electrical and Computer Engineering Stony Brook University Dissertation Defense ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS Miodrag Bolic Advisor: Prof. Petar M. Djuric

  2. Outline • PART I: Introduction • Motivation and goals • Challenges • PART II: Theory of PFs • Dynamic model • Monte Carlo sampling • Importance sampling • Resampling • Bearings-only tracking example • Steps and complexity • PART III: Implementation of PFs • VLSI signal processing architectures • Methodology • Non-parallel implementation • Algorithm characteristics • Modifications of the PF • New resampling algorithms • Architecture • Implementation results • Parallel implementation • Propagation of particles • Parallel resampling • Architectures for parallel resampling • Space exploration • Gaussian PFs • Conclusions and future work

  3. Introduction – Motivation and Goals • Goal • Increase speed of particle filters • [Figure: a sensor supplies the observed signal over time t to the particle filter chip, which outputs the estimate over time t]

  4. Introduction – Challenges • Challenges • Reducing computational complexity • Randomness: difficult to exploit regular structures in VLSI • Exploiting temporal and spatial concurrency • Contributions • First hardware implementation of particle filters (50-fold speed improvement over a DSP implementation) • New resampling algorithms suitable for hardware implementation • Fast particle filtering algorithms that do not use memories • First distributed algorithms and architectures for particle filters

  5. Outline • PART I: Introduction • Motivation and goals • Challenges • PART II: Theory of PFs • Dynamic model • Monte Carlo sampling • Importance sampling • Resampling • Bearings-only tracking example • Steps and complexity • PART III: Implementation of PFs • VLSI signal processing architectures • Methodology • Non-parallel implementation • Algorithm characteristics • Modifications of the PF • New resampling algorithms • Architecture • Implementation results • Parallel implementation • Propagation of particles • Parallel resampling • Architectures for parallel resampling • Space exploration • Gaussian PFs • Conclusions and future work

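  6. Theory of PFs – Dynamic model • General dynamic model • State equation: $\mathbf{x}_k = f_x(\mathbf{x}_{k-1}, \mathbf{u}_k)$ • Observation equation: $z_k = f_z(\mathbf{x}_k, v_k)$ • $f_x$: state transition function • $\mathbf{u}_k$: process noise • $f_z$: measurement function • $v_k$: observation noise • Example: Bearings-only tracking • States: position and velocity, $\mathbf{x}_k = [x_k, V_{x,k}, y_k, V_{y,k}]^T$ • Observations: angle $z_k$ • State equation: $\mathbf{x}_k = F\mathbf{x}_{k-1} + G\mathbf{u}_k$ • Observation equation: $z_k = \operatorname{atan}(y_k / x_k) + v_k$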
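The bearings-only model on this slide can be sketched in a few lines of Python. The slide does not give the entries of F and G, so the constant-velocity form used here, the sampling interval `T`, and the noise levels `sigma_u` and `sigma_v` are all assumptions made for illustration.

```python
import math
import random

T = 1.0  # sampling interval (assumed value, not given on the slide)

def propagate(state, rng, sigma_u=0.01):
    """One step of x_k = F x_{k-1} + G u_k for the state [x, Vx, y, Vy].

    F and G follow a constant-velocity model, which is an assumption here.
    """
    x, vx, y, vy = state
    ux, uy = rng.gauss(0.0, sigma_u), rng.gauss(0.0, sigma_u)
    # F moves each position by velocity * T; G injects the noise as a
    # random acceleration acting over one sampling interval.
    return [x + T * vx + 0.5 * T ** 2 * ux,
            vx + T * ux,
            y + T * vy + 0.5 * T ** 2 * uy,
            vy + T * uy]

def observe(state, rng, sigma_v=0.005):
    """Bearing measurement z_k = atan(y_k / x_k) + v_k."""
    x, _, y, _ = state
    # atan2 agrees with atan(y / x) for x > 0 and stays defined at x == 0.
    return math.atan2(y, x) + rng.gauss(0.0, sigma_v)
```

Calling `propagate` and then `observe` in a loop produces a synthetic trajectory together with its noisy bearing measurements.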

  7. Theory of PFs – Bayesian approach • Objective in Bayesian approach: estimate the posterior distribution $p(\mathbf{x}_{0:k} | z_{1:k})$ of the state $\mathbf{x}_k$ of the state space model • Use of knowing the posterior: all kinds of estimates can be calculated • Gaussian processes and linear model: Kalman filter • Non-Gaussian processes and/or non-linear model: particle filter • Problem and solution • Estimate posterior: integrals are not tractable; solution: Monte Carlo sampling • Monte Carlo sampling: difficult to draw samples; solution: importance sampling

  8. Theory of PFs – Monte Carlo Sampling • Densities can be approximated by discrete random measures: particles and weights • χ approximates the density p(x) • Integrals simplify to summations
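To make "integrals simplify to summations" concrete, here is a minimal sketch that approximates an expectation by a sum over equally weighted particles. The function name `mc_expectation` and the Gaussian example are illustrative choices, not taken from the presentation.

```python
import random

def mc_expectation(g, sampler, m, rng):
    """Approximate E[g(x)] = integral of g(x) p(x) dx by a weighted sum."""
    # Particles drawn directly from p(x) all carry the same weight 1/M.
    particles = [sampler(rng) for _ in range(m)]
    weights = [1.0 / m] * m
    return sum(w * g(x) for w, x in zip(weights, particles))
```

For example, `mc_expectation(lambda x: x * x, lambda r: r.gauss(0.0, 1.0), 20000, random.Random(1))` approximates the second moment of a standard normal, whose exact value is 1.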

  9. Theory of PFs – Importance Sampling • Objective: approximate a density p(x) by a discrete random measure • Steps: • 1. Generation of particles: proposal density • 2. Updating of the weights: Bayes theory
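The two steps can be illustrated with a static toy problem. This sketch assumes a target density p(x) = N(0, 1) and a wider proposal q(x) = N(0, 2²); both densities and the helper names are hypothetical. It is a simplified version of the recursive weight update used inside a particle filter.

```python
import math
import random

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def importance_sample(m, rng):
    """Return particles and normalized weights approximating p(x) = N(0, 1)."""
    # Step 1: generate particles from the proposal density q(x) = N(0, 2^2).
    particles = [rng.gauss(0.0, 2.0) for _ in range(m)]
    # Step 2: weight each particle by p(x) / q(x), then normalize.
    raw = [normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0) for x in particles]
    total = sum(raw)
    return particles, [w / total for w in raw]
```

The weighted sum `sum(w * x * x for x, w in zip(particles, weights))` then estimates the second moment under p(x), even though no particle was drawn from p(x) itself.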

  10. Theory of PFs – Resampling • Problems: • Weight degeneration • Wastage of computational resources • Solution: resampling • Replicate particles in proportion to their weights • [Figure: particles over time, before and after resampling]
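Replication in proportion to the weights can be sketched with systematic resampling, one standard scheme. The slide does not name a specific algorithm, so this choice is an assumption. The function returns how many copies of each particle survive.

```python
def systematic_resample(weights, u0):
    """Return replication counts for normalized weights.

    u0 is a single uniform draw from [0, 1/M); passing it in keeps the
    function deterministic and easy to check by hand.
    """
    m = len(weights)
    counts = [0] * m
    cumulative = 0.0
    drawn = 0  # number of particles selected so far
    for i, w in enumerate(weights):
        cumulative += w
        # Select particle i once for every comb position u0 + drawn/M
        # that falls below the running sum of the weights.
        while drawn < m and u0 + drawn / m < cumulative:
            counts[i] += 1
            drawn += 1
    counts[-1] += m - drawn  # guard against rounding in the running sum
    return counts
```

With weights `[0.7, 0.1, 0.1, 0.1]` and `u0 = 0.125`, the heavy particle is replicated three times and two of the light particles are discarded, so the particle count stays at four.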

  11. Theory of PFs – Bearings-Only Tracking Example

  12. Theory of PFs – Bearings-Only Tracking Example (Cont.) • Blue: true trajectory • Red: estimates

  13. Theory of PFs – Steps and Complexity • Flowchart blocks: Initialize particles; New observation; Particle generation (particles 1, 2, ..., M); Weight computation (particles 1, 2, ..., M); Normalize weights; Resampling; Propagation of the particles; Output estimates; More observations? (yes: process the next observation; no: Exit) • Complexity for the bearings-only tracking problem • Number of particles M = 1000 • 4M random number generations • M exponential and arctangent functions
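The steps above can be sketched as one iteration of a bootstrap (SIR) particle filter for the bearings-only problem. This is an illustration under assumptions, not the implementation from the dissertation: the constant-velocity model, the noise levels, and the multinomial resampling via `random.choices` are all choices made here. The comments mark where the per-particle random numbers, exponentials, and arctangents arise.

```python
import math
import random

def sir_step(particles, z, rng, T=1.0, sigma_u=0.01, sigma_v=0.05):
    """Process one bearing observation z; return (resampled particles, estimate)."""
    m = len(particles)
    # Particle generation: random numbers are drawn for every particle.
    generated = []
    for x, vx, y, vy in particles:
        ux, uy = rng.gauss(0.0, sigma_u), rng.gauss(0.0, sigma_u)
        generated.append((x + T * vx + 0.5 * T * T * ux, vx + T * ux,
                          y + T * vy + 0.5 * T * T * uy, vy + T * uy))
    # Weight computation: one arctangent and one exponential per particle.
    # Angle wrap-around is ignored to keep the sketch short.
    raw = [math.exp(-0.5 * ((z - math.atan2(y, x)) / sigma_v) ** 2)
           for x, _, y, _ in generated]
    # Normalize weights (fall back to uniform if every weight underflowed).
    total = sum(raw)
    weights = [w / total for w in raw] if total > 0.0 else [1.0 / m] * m
    # Output estimate: weighted mean of each state component.
    estimate = tuple(sum(w * p[d] for w, p in zip(weights, generated))
                     for d in range(4))
    # Resampling: replicate particles in proportion to their weights.
    resampled = rng.choices(generated, weights=weights, k=m)
    return resampled, estimate
```

Calling `sir_step` once per observation, and feeding the resampled particles back in, reproduces the loop in the flowchart.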

  14. Outline • PART I: Introduction • Motivation and goals • Challenges • PART II: Theory of PFs • Dynamic model • Monte Carlo sampling • Importance sampling • Resampling • Bearings-only tracking example • Steps and complexity • PART III: Implementation of PFs • VLSI signal processing architectures • Methodology • Non-parallel implementation • Algorithm characteristics • Modifications of the PF • New resampling algorithms • Architecture • Implementation results • Parallel implementation • Propagation of particles • Parallel resampling • Architectures for parallel resampling • Space exploration • Gaussian PFs • Conclusions and future work

  15. Implementation of PFs – VLSI Signal Processing Architectures • Types of architectures • Programmable digital signal processors • Application-domain specific processors • Application specific processors • Application specific processors • Speed is the main goal • Functionality of the system does not change • Approach • Temporal and spatial concurrency • One-to-one mapping between operations and hardware blocks • FPGA implementation

  16. Implementation of PFs – Methodology • Joint algorithmic and architectural design • To increase performance, algorithms must be matched to architectures • [Figure: impact of a design decision and complexity across the design levels: system level, algorithmic level, architecture level, RT level, gate level]

  17. Implementation of PFs – Algorithm Characteristics • Flowchart blocks: Start; New observation; Particle generation (particles 1, 2, ..., M); Weight computation (particles 1, 2, ..., M); Resampling; Propagation of particles; Exit

  18. Implementation of PFs – Modifications of the PF • Modifications at the architecture and algorithm levels • Fine-grain pipelining • Avoiding normalization • Spatial concurrency • Loop transformations • Dedicated hardware • Finite precision arithmetic • Addressing schemes

  19. Implementation of PFs – New Resampling Algorithms

  20. Implementation of PFs – Architecture

  21. Implementation of PFs – Implementation results • Hardware platform is Xilinx Virtex-II Pro • Clock period is 10 ns • The PF is applied to the bearings-only tracking problem • 1000 particles are used • Sampling frequency • DSP: ~1 kHz • FPGA: ~50 kHz • Resources • Logic blocks: 4% • Memories: 3% • [Figure: percentage of utilization of the PF blocks]
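The figures on this slide can be cross-checked with a short calculation. The constants are taken from the slide; the variable names and the per-particle interpretation are added here for illustration.

```python
CLOCK_PERIOD_S = 10e-9   # clock period of 10 ns
FPGA_RATE_HZ = 50e3      # FPGA sampling frequency, about 50 kHz
DSP_RATE_HZ = 1e3        # DSP sampling frequency, about 1 kHz
M = 1000                 # number of particles

# Clock cycles available between two consecutive observations on the FPGA.
cycles_per_observation = 1.0 / (FPGA_RATE_HZ * CLOCK_PERIOD_S)
# Average clock-cycle budget per particle.
cycles_per_particle = cycles_per_observation / M
# Speed-up of the FPGA implementation over the DSP implementation.
speedup = FPGA_RATE_HZ / DSP_RATE_HZ
```

This gives roughly 2000 clock cycles per observation, or about 2 cycles per particle, and a speed-up of 50, which matches the 50-fold improvement over a DSP quoted among the contributions.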

  22. Implementation of PFs – Parallelism • Universal architecture with a central unit • Processing elements (PE) • Particle generation • Weight computation • Central Unit • Algorithm for particle propagation • Resampling • [Figure: flowchart (Start; New observation; Particle generation; Weight computation; Resampling; Propagation of particles; Exit) mapped onto Processing Elements 1 to 4 connected to a Central Unit]

  23. Implementation of PFs – Propagation of Particles • Disadvantages of the particle propagation step • Random communication pattern • The connections are not known before run time • Requires a dynamic type of network • Speed-up is significantly affected • [Figure: particles after resampling exchanged over time among Processing Elements 1 to 4 through the Central Unit]

  24. Implementation of PFs – Parallel Resampling • Solution • The way in which Monte Carlo sampling is performed is modified • Advantages • Propagation is only local • Propagation is controlled in advance by a designer • Performance is the same as in the sequential applications • Result • Speed-up is almost equal to the number of PEs (up to 8 PEs) • [Figure: number of particles N held by each of the four PEs before and after resampling]
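One way to keep propagation local is to let every processing element resample only its own particles and keep its particle count fixed, with each group's share of the total weight carried by its survivors. The sketch below shows that idea only. It is an assumption made for illustration and is not necessarily the algorithm shown on the slide; the function name and data layout are hypothetical.

```python
import random

def local_resample(groups, rng):
    """Resample inside each PE; groups is a list of [(particle, weight), ...]."""
    total = sum(w for group in groups for _, w in group)  # assumed positive
    result = []
    for group in groups:
        particles = [p for p, _ in group]
        weights = [w for _, w in group]
        group_weight = sum(weights)
        n = len(group)
        if group_weight > 0.0:
            # Replicate particles in proportion to their weights, locally.
            drawn = rng.choices(particles, weights=weights, k=n)
        else:
            drawn = particles
        # The PE keeps n particles, so none cross PE boundaries; survivors
        # share the group's fraction of the total weight equally.
        result.append([(p, group_weight / (total * n)) for p in drawn])
    return result
```

Because the number of particles per PE never changes, no particle has to move between PEs after resampling. The cost is that particles in different PEs end up with different weights.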

  25. Implementation of PFs – Architectures for Parallel Resampling • Controlled particle propagation after resampling • Architecture that allows adaptive connection among the processing elements • [Figure: PE1, PE2, PE3, and PE4 connected to the Central Unit]

  26. Implementation of PFs – Space exploration • Hardware platform is Xilinx Virtex-II Pro • Clock period is 10 ns • PFs are applied to the bearings-only tracking problem • [Figure: design space with two limits marked: available memory and logic blocks]

  27. Implementation of PFs – Gaussian PFs • Functionality • Propagates only first two moments • Approximates densities by Gaussians • No need for resampling • Advantages • Sampling period is minimal, ~ $M T_{clk}$ • No need for memories for storing particles • Simple communication in parallel implementation • Disadvantages • Higher computational complexity • Limited scope of applications • Flowchart blocks: Start; New observation; Drawing conditioning particles (1, 2, ..., M); Particle generation (1, 2, ..., M); Weight computation (1, 2, ..., M); Computing the mean and the covariance matrix; decision (Yes/No); Exit
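The flowchart can be sketched for a scalar toy model. The model used here (x_k = x_{k-1} + u_k, z_k = x_k + v_k), the noise levels, and the function name `gpf_step` are assumptions, not details from the presentation. Each particle is used as soon as it is generated and only running sums are kept, which reflects the "no memories for storing particles" point.

```python
import math
import random

def gpf_step(mean, var, z, rng, m=2000, sigma_u=0.5, sigma_v=0.5):
    """One Gaussian PF update; returns the new (mean, variance)."""
    total = weighted_sum = weighted_sq = 0.0
    for _ in range(m):
        # Drawing conditioning particles from the Gaussian N(mean, var).
        x_prev = rng.gauss(mean, math.sqrt(var))
        # Particle generation through the assumed state equation.
        x = x_prev + rng.gauss(0.0, sigma_u)
        # Weight computation from the assumed Gaussian likelihood.
        w = math.exp(-0.5 * ((z - x) / sigma_v) ** 2)
        total += w
        weighted_sum += w * x
        weighted_sq += w * x * x
    # Computing the mean and the (here scalar) covariance; no resampling.
    new_mean = weighted_sum / total
    return new_mean, weighted_sq / total - new_mean ** 2
```

For this linear Gaussian toy model the exact answer is available from the Kalman filter. Starting from mean 0 and variance 1 with z = 1, the exact posterior has mean 5/6 and variance 5/24, which the sketch should approach as `m` grows.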

  28. Implementation of PFs – Gaussian PFs (cont.) Minimum sampling period versus number of PEs of parallel GPFs and SIRs

  29. Conclusions and Future Work • Summary • Modification of the algorithms to be suitable for hardware implementation • Development of parallel algorithms and architectures • Implementation of the particle filter in FPGA • Analysis of the other types of particle filtering algorithms • Future work • Simplifying floating to fixed-point conversion • Developing application-domain specific processor for PFs • Developing reconfigurable architectures for PFs

  30. STONY BROOK UNIVERSITY Department of Electrical and Computer Engineering Stony Brook University Dissertation Defense ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS Miodrag Bolic Advisor: Prof. Petar M. Djuric