
Research Loci (2002-2004)

  • Statistical modeling of forward- and back- scatter fields

    • Polarimetric Field Modeling and Reconstruction (Hory/Blatt)

      • Adaptive multicomponent Pearson model

      • Markov random field (MRF) model for extrapolation/reconstruction

      • Target vs clutter discrimination using MRF models

  • Distributed function optimization

    • Aggregation strategies for distributed sensors (Blatt/Patwari)

      • Optimal estimator clustering/aggregation method

      • Hierarchical censoring for distributed detection

      • Cyclically averaged incremental gradient decentralized optimization

  • Sequential adaptive sensor management

    • Non-myopic multi-modality sensor scheduling (Blatt/Kreucher)

      • Information-driven non-linear target tracking algorithms

      • Reinforcement Learning (RL) approaches to sensor management

      • Markov decision process (MDP) for detecting smart targets

  • Active Time Reversal Imaging

    • General MATILDA methodology (Raghuram)



Statistical Modeling

Experiment: Plate in Forest of Pine Trees

(Figure: randomized tree and plate positions; axes in meters)

  • 15cm x 15cm x 1cm plate at 1m from ground

  • Plate under forest canopy (10 pine trees)



Backscatter realizations

Forest Alone / Target in Forest

12 x 12 array of antennas with 0.5 degree interspacing



Linear Gaussian Models Inadequate

Q-Q Plot for Gaussianity Testing

Return from forest alone / Return from plate in forest

KS goodness-of-fit P value=0.0303

KS goodness-of-fit P value=0.9985
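
The Q-Q plot and KS p-values above are standard Gaussianity checks; the short sketch below shows how such a test could be run. This is a minimal illustration (not the authors' code): the array `returns` is a placeholder for one set of measured backscatter samples, and estimating the mean and variance from the data makes the KS p-value approximate.

  # Sketch of a Gaussianity check via a KS goodness-of-fit test and Q-Q quantiles.
  # `returns` is a placeholder for measured backscatter samples (e.g. one per antenna).
  import numpy as np
  from scipy import stats

  rng = np.random.default_rng(0)
  returns = rng.normal(size=144)                         # placeholder data, 12 x 12 array of antennas

  z = (returns - returns.mean()) / returns.std(ddof=1)   # standardize before testing
  ks_stat, p_value = stats.kstest(z, "norm")             # KS goodness-of-fit against N(0,1)
  (osm, osr), _ = stats.probplot(z, dist="norm")         # theoretical vs ordered quantiles for a Q-Q plot
  print(f"KS statistic = {ks_stat:.3f}, p-value = {p_value:.4f}")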



Non-parametric MRF Model

Step 1: construct empirical histogram over MRF feature space

Conditional Markov transition histogram

…estimated from data
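
As a rough illustration of Step 1, the sketch below builds an empirical conditional (Markov transition) histogram from a quantized 2-D field. The 3-pixel causal neighborhood and the number of amplitude bins are illustrative assumptions, not the choices made in the presentation.

  # Empirical conditional histogram P(center | causal neighbors), estimated by counting.
  # Neighborhood (west, north-west, north) and n_bins are illustrative choices.
  import numpy as np

  def conditional_histogram(field, n_bins=8):
      edges = np.linspace(field.min(), field.max(), n_bins + 1)[1:-1]
      q = np.digitize(field, edges)                     # quantize amplitudes to n_bins levels
      counts = np.zeros((n_bins,) * 4)                  # (west, NW, north) -> center
      for i in range(1, q.shape[0]):
          for j in range(1, q.shape[1]):
              counts[q[i, j-1], q[i-1, j-1], q[i-1, j], q[i, j]] += 1
      totals = counts.sum(axis=-1, keepdims=True)       # normalize over the center value
      return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)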



Non-parametric MRF Model

Step 2: construct penalized density estimator (one possible form is sketched after the list below)

  • y is observed data

  • parameter b enforces smoothness

  • function g(f) captures data-fidelity

    • g(f)=|f|^2: standard L2 quadratic regularization

    • g(f)=|f|: L1 edge-preserving regularization for denoising

  • w(x): smoothing within and across neighborhoods
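
One consistent way to assemble the quantities in the bullets above into a penalized estimator is sketched below; the slide's own functional is not in the transcript, so the placement of g, b, and w is an assumption:

  \hat{f} \;=\; \arg\min_{f} \; \sum_{x} \big(f(x) - y(x)\big)^2
      \;+\; b \sum_{x} \sum_{x' \in \mathcal{N}(x)} w(x - x')\, g\big(f(x) - f(x')\big),
  \qquad g(u) = |u|^2 \ \text{or}\ |u|.

With g(u) = |u|^2 the penalty is the standard quadratic one; g(u) = |u| gives the edge-preserving variant mentioned above.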



Progress 1: EMF Reconstruction

  • Synthesis is facilitated by implementing MRF reconstruction with a causal neighborhood structure

  • Dimension of causal MRF feature space=4x3=12

  • Choice of neighborhood size impacts the bias and variance (fidelity) of the synthesized field.

  • Neighborhood size can be optimized by maximizing feature spread over MRF feature space

Non-causal neighborhood – Causal Neighborhood



MRF for EMF Reconstruction

Original scattered EM field generated from physical simulation

Synthesized scattered EM fields



Progress 2: Target Segmentation

  • Piecewise constant Gibbs random field model

  • Non-parametric MRF LRT applied to segment the regions determined by g

  • Segmentation accuracy is improved with respect to Efros and other algorithms



Sequential Adaptive Sensor Management

  • Sequential: only one sensor deployed at a time

  • Adaptive: next sensor selection based on present and past measurements

  • Multi-modality: sensor modes can be switched at each time

  • Detection/Classification/Tracking: task is to minimize decision error

  • Centralized decision making: sensor has access to entire set of previous measurements

  • Smart targets: may hide from active sensor

Single-target state vector:



Sequential Adaptive Sensor Management

  • Progress made 2002-2004

    • Information-gain strategies for target tracking

      • Value function approximation using visibility constraints

        • Renyi-Divergence approximation

        • Established link between Renyi info and decision error exponents

      • Mitigation of computational bottleneck by adaptive PF

        • Coupled vs independent particle partitions for tracking multiple targets

        • Exploitation of permutation symmetry

      • Multiple model and multiple modality extensions

      • Real-time operation demonstrated for tracking > 40 real target motions

    • Reinforcement learning (RL) strategies

      • Q-learning for multiple target detection/tracking/id

      • Q-learning for detection of smart targets with model mismatch



SM for Multiple Target Tracking: Progress Since Feb. 04 Review

  • Myopic Algorithms

    • Acceleration of myopic SM algorithm towards real-time implementation

    • Symmetric vs asymmetric information divergence

    • Sensitivity to model mismatch

    • Multiple model filtering for lost targets (Kreucher et al.: ASAP, Mar. 04)

  • Non-myopic algorithms

    • Improved value function approximation in SM for tracking (Kreucher et al.: CDC, Dec. 04)

    • Q-learning SM for tracking: alpha divergence information state (Kreucher et al.: NIPS, Oct. 04)

    • Q-learning SM for detection of a smart target with model mismatch (Blatt et al.: NIPS, Oct. 04)



Sensor Scheduling Value Function

  • Action a: deploy a sensor, probe a cell at time t

  • Value of taking action a at time t, given the measurements available at time t-1: the predicted (expected) retrospective value of taking action a; sensor agility enters through the choice of a
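
The slide's equation image is not in the transcript; in the information-gain framework used throughout this deck the value is typically written as the expected Rényi divergence between the updated and the predicted posterior, so the following is a plausible sketch rather than the slide's exact expression:

  V_t(a) \;=\; \mathbb{E}_{y_t^{a}}\Big[\, D_{\alpha}\big(\, p(x_t \mid Y_{t-1}, y_t^{a}) \,\big\|\, p(x_t \mid Y_{t-1}) \,\big) \;\Big|\; Y_{t-1} \Big].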



In Retrospect: Posterior Density

(Figure: posterior densities of the state x and the estimate x̂ under the candidate actions)

Best action is a2 since its posterior update is most concentrated, i.e. it induces the “highest information gain”



Information Value Function

  • Properties of alpha Renyi divergence (the standard definition is recalled after this list)

    • Simpler and more stably implementable than KL divergence (the alpha = 1 limit) (Kreucher et al.: TSP 04, SPIE 03)

    • Parameter alpha can be adapted to non-Gaussian posteriors

    • More robust to mis-specified models than KL (Kreucher et al.: TSP 04, SPIE 03)

    • Related directly to decision error probability via Sanov's theorem (Hero et al.: SPM 02)

    • Information theoretic interpretation
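
For reference, the Rényi divergence of order alpha referred to in this list has the standard form (a textbook definition, not transcribed from the slide):

  D_{\alpha}(p \,\|\, q) \;=\; \frac{1}{\alpha - 1}\, \log \int p^{\alpha}(x)\, q^{1-\alpha}(x)\, dx,

which recovers the Kullback-Leibler divergence in the limit alpha -> 1.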



Myopic Target Tracking Application

  • Possible actions: point radar at cell c and take measurement, c=1, …, L

  • We illustrate the benefit of info-gain SM with AP implementation of JMPD tracking 10 actual moving target positions (2001 NTC exercise).

  • GMTI radar simulated: Rayleigh target/clutter statistics

  • Contrast to a periodic (non-managed) scan: same statistics

  • Coverage of managed and non-managed=50 dwells per second



Comparison with Other Myopic Managed Strategies

  • Renyi-Divergence method of sensor management outperforms others

    • Periodic scan sweeps through all cells and then repeats

    • Methods “A” and “B” point the sensor where targets are estimated to be

      • Method A – chooses cells randomly from cells predicted to have targets and cells surrounding those predicted to have targets

      • Method B – chooses cells probabilistically based on their estimated target count



Progress 3: Computational Tractability

Particle filter implementation allows for tractable algorithms

Simulated environment containing real ground targets and a GMTI-like sensor

Implemented via a hybrid Matlab/C algorithm on an off-the-shelf 3 GHz Linux box

Algorithm can track approximately 40 targets in real time

Algorithm performs tracking and sensor management on 10 targets in real time



Progress 4: Asymmetric vs Symmetric Divergence

  • Issue: ordinary RD is symmetric in narrowing vs broadening of posterior

  • Q. Does this lead to biased action sequence?

  • A. Explore broadening penalty

Penalized objective: expected Rényi divergence minus a penalty term P(a,k), where P(a,k) = expected change in Rényi entropy under action a
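
Spelled out, the fragments above suggest a penalized objective of roughly the following shape; the weight lambda and the sign convention are assumptions, since the slide's equation is not in the transcript:

  V_{\mathrm{pen}}(a) \;=\; \mathbb{E}\big[ D_{\alpha}(a) \big] \;-\; \lambda\, P(a,k),
  \qquad P(a,k) \;=\; \mathbb{E}\big[ H_{\alpha}\big(p_k^{a}\big) - H_{\alpha}\big(p_{k-1}\big) \big],

where H_alpha denotes the Rényi entropy of the posterior before and after taking action a, so actions that broaden the posterior are penalized.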



Progress 4: Asymmetric vs Symmetric Divergence (ctd)

  • Model Problem :

    • Tracking 10 real targets using a pixelated sensor that makes thresholded measurements

  • Results:

    • Divergence-optimal action narrows the posterior on average.

    • Performance of the asymmetric and symmetric divergence criteria is nearly identical.



Progress 5: Effect of model mismatch

Approach

  • We investigate the effect of mismatch between the filter estimate of SNR and the actual SNR

  • Experiment: 10 (real) targets with myopic SM.

  • CFAR detection with Pf = 0.001 and Pd = Pf^(1/(1+SNR))

    • i.e. Rayleigh-distributed energy returns from both background and signal; threshold set for Pf = 0.001

    • For a constant Pf, the SNR determines Pd (see the numeric sketch after this list)

  • Filter has an estimate of the SNR (and hence Pd) and uses this for SM and filtering. What is the effect on tracking of erroneous SNR information?

  • Bottom line: Filter appears quite robust to mismatch in SNR and pd.
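
A small numeric check of the detection model quoted above, assuming the Rayleigh relation Pd = Pf^(1/(1+SNR)); the SNR grid is illustrative, not taken from the experiment:

  # Pd as a function of SNR for fixed Pf under the Rayleigh detection model above.
  pf = 0.001
  for snr_db in (0, 3, 6, 10, 13):
      snr = 10 ** (snr_db / 10)        # dB -> linear SNR
      pd = pf ** (1 / (1 + snr))       # Pd = Pf^(1/(1+SNR)) for Rayleigh returns
      print(f"SNR = {snr_db:2d} dB -> Pd = {pd:.3f}")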



Progress 6: Multiple Model Selection

  • Last year we applied multiple models (HMM) to complex target states

  • We recently realized that an HMM is also useful for modeling state-estimator residual error and predicting imminent loss of track

  • Estimated transitions in the HMM control mode-switching of the tracker

  • The estimated state of the HMM tells the tracker when measurements are not consistent with the proposed target state, and the tracker modifies its proposal process by switching modes.

Tracker Modes

Tracker transition probabilities



Progress 6: Multiple Model Selection (ctd)

  • Interpretation: equivalent to biased sampling scheme for particle proposals

  • The state of the target has not changed; only the filter itself changes.

    • Unlike the earlier application, where the importance density was always the target kinematics (although it changed with time), here we may use something other than target kinematics for particle proposals.

    • This biased sampling scheme must be reflected in the particle weights (a weight-update sketch follows).
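
A minimal sketch of the weight correction implied by the last bullet, assuming a standard importance-sampling particle filter; the function arguments are illustrative placeholders, not the presentation's code:

  # Particle weight update when proposals come from a biased, mode-dependent
  # importance density q rather than the kinematic prior (illustrative sketch).
  def update_weight(w_prev, x_new, x_prev, y, likelihood, kinematic_prior, proposal):
      """w_t is proportional to w_{t-1} * p(y|x_t) * p(x_t|x_{t-1}) / q(x_t|x_{t-1}, y)."""
      return (w_prev * likelihood(y, x_new) * kinematic_prior(x_new, x_prev)
              / proposal(x_new, x_prev, y))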



Progress 6: Multiple Model Selection (ctd)

Tracking Performance - Averaged over 100 Trials

Bottom Line:

Multiple model filter outperforms filters based on the constituent models and is able to maintain track on targets much more reliably.



Non-Myopic Sensor Management

  • There are many situations where long-term planning provides a benefit

    • Sensor platform motion creates time varying sensor/target visibility

      • Sensor/target line of sight may change resulting in targets becoming obscured

      • Delay measuring targets that will remain visible in order to interrogate targets that are predicted to become obscured

    • Convoy Movement may involve targets that overtake/pass one another

      • Targets may become closely spaced (and unresolvable to the sensor)

      • Plan ahead to measure targets before they become unresolvable to the sensor

    • Crossing Targets become unresolvable to the sensor

      • Sensor resolution may prohibit successful target identification if targets are too close together

      • Plan ahead to identify targets before they become too close

  • Planning ahead in these situations allows better prediction of reemergence point, target trajectory, target intention



2-step Lookahead Non-Myopic Search Tree

(Search-tree diagram: the current state S0 branches over candidate actions A, B, C, D, each outcome occurring with probability p = 0.5 and labeled with its realized divergence D and expected divergence <D>; the second step expands into states S1-SK.)



Relevant Non-Myopic Sensor Management Situation

(Figure: sensor platform positions at times 1-6, showing visible targets, shadowed targets, regions of interest, and the useful extra dwells that the myopic strategy does not make.)



Comparison of Greedy and Non-Myopic (2-Step) Decision Making

Non-Myopic:

Target lost 11% of the time

Myopic:

Target lost 22% of the time



Can We Do Better? Optimal Non-Myopic Strategies

  • Reward at time t for action sequence

  • is “information state”

  • Optimal action sequence

  • Optimal action sequence satisfies Bellman’s equation (a generic form is sketched below)

  • Value function
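
A generic statement of the Bellman equation referenced above (standard dynamic-programming form with reward r and information state s; the slide's own notation is not in the transcript):

  V_t(s) \;=\; \max_{a} \Big\{ r(s, a) \;+\; \mathbb{E}\big[\, V_{t+1}(s') \mid s, a \,\big] \Big\},
  \qquad a_t^{*} \;=\; \arg\max_{a} \Big\{ r(s, a) + \mathbb{E}\big[ V_{t+1}(s') \mid s, a \big] \Big\}.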



Optimal Action Determined by Partition of “information state space”

(Figure: partition of the probability simplex into optimal-action regions; special case of a 3-state target.)



Application to Optimal Sensor Management

  • For discrete measurements and finite horizon (T), solution to value equation is linear program

  • Krishnamurthy (2002) exploited this property for SM

  • Problems with Krishnamurthy's approach:

    • Complexity of the linear program is geometric in T

    • When the number of states is large, computations become intractable

    • When measurements are continuous, the value equation is non-linear



Sub-optimal strategies explored

(The same search-tree diagram as before, now spanning times t-1, t, and t+1.)

  • Impose simple form on scheduling function: infinite horizon

  • Time invariant function of information state

  • Value Function Approximation

  • Approximate non-myopic reward

Approaches

  • Exploring depth with particle proposals

  • Optimal allocation of N particles



Value Function Approximation

The Bellman equation describes the value of an action in terms of the immediate (myopic) benefit and the long-term (non-myopic) benefit.

Bellman equation: (value of state) = (myopic part of V under action a) + (non-myopic correction under a)

For computational tractability, approximate the non-myopic term as sketched below,

where N_a(s) is an easily computed measure of the future benefit of action a (i.e. an approximate long-term value term).
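
Putting the two preceding statements together, the approximation has roughly the following shape; treat it as a sketch, since the exact weighting on the slide is not in the transcript:

  V(s) \;\approx\; \max_{a} \Big\{ \underbrace{r(s,a)}_{\text{myopic part of } V \text{ under } a} \;+\; \underbrace{N_a(s)}_{\text{approximate non-myopic correction under } a} \Big\}.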



Info gain value-to-go (VTG) approximation

(Figure: distribution of the myopic gain and its mean change)

  • Let : expected myopic gain when taking action a at time k

    : distribution of myopic gain when taking action a at time k

  • Approximate long-term value of taking action a

  • Optimization becomes

  • Gaussian approximation



Simulation description

VTG value approximation

  • At initialization, target is localized to a 300m x 500m region.

  • GMTI Sensor must search the region for the target.

  • Sensor visibility region changes with time.

  • Non-myopic strategy scans regions that will be obscured in the future while deferring regions that will be visible in the future.



Progress 7: Q-learning approximations

  • Main idea (Watkins89):

    • simulate actions and the induced information states (measurements)

    • Find the optimal schedules by stochastic averaging

  • Q-function defined as indexed value function

  • Algorithm: For n=1,2,…

  • Using simulate trajectory

  • Update the Q-function according to the recursion (the standard form is recalled below)

  • Repeat until variance of Q-function is below tolerance
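
For context, the standard Watkins Q-learning recursion has the form below, with step size gamma_n; the slide's exact indexing is not in the transcript:

  Q_{n+1}(s, a) \;=\; (1 - \gamma_n)\, Q_n(s, a) \;+\; \gamma_n \Big[ r(s, a) \;+\; \max_{a'} Q_n(s', a') \Big].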



Q-Learning Applied to Multiple Target Tracking

  • Training is used to learn a Q-function predicting the long-term value of taking action a in state s

    • Q-function approximation is necessary; a linear approximation is used

    • Training examples are used to learn coefficients of linear model.

  • The learning is done in batch form and iterated

    • The coefficient vector q is initialized (e.g. randomly or to all zeros)

    • The Q function is trained from the first batch of examples {s, a, r, s’}

    • The Q function is then used with the second batch of examples {s, a, r, s’} to estimate the value of s’ and the Q function is retrained

(Diagram of the batch loop: generate examples (s, a, s', r) -> calculate (s, a, s', Q_est) -> update q_k to q_{k+1}, then repeat with the next batch.)



Example: Two Real Targets

  • Target Trajectories Taken From Real, Recorded Data

    • 2 moving ground targets

    • Need to estimate the position and velocity in x and y (4-d state vector for each target)

  • Time varying visibility taken from real elevation map & simulated platform trajectory

  • Sensor decides where to steer an agile antenna and illuminates a 100 m x 100 m patch on the ground. Thresholded measurements indicate the presence or absence of a target (with Pd and Pfa)

  • At initialization, the filter knows only that the target position lies in a 300 m x 500 m area on the ground (i.e. the prior for target position is uniform over this region)



Results: Two Real Targets

  • Targets become invisible to the sensor, which impacts tracking performance

  • Random strategy selects cells (actions) uniformly

  • Myopic strategy only predicts one step ahead

  • VTG approximation predicts M steps ahead

  • Non-myopic RL converges to optimal strategy



Progress 8: Active Time Reversal for Imaging/Classification

Multistatic Adaptive Target Illumination and Detection (MATILDA) framework



MATILDA Block Diagram

Signals along MATILDA processing path (calibrated case):



Performance Assessment

  • CR bound on the MSE of unbiased estimators of the scattering coefficient matrix D is the inverse of the FIM (standard form recalled below):

  • FIM Trace optimized for spatial filtering operators satisfying

  • Simulation study of MATILDA performance for special case of mismatched beamformer
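
The Cramér-Rao bound in the first bullet, written in its standard matrix form (a standard result; the slide's own expression for the FIM is not in the transcript):

  \mathrm{Cov}\big(\hat{D}\big) \;\succeq\; \mathbf{F}^{-1}(D),
  \qquad \big[\mathbf{F}(D)\big]_{ij} \;=\; \mathbb{E}\!\left[ \frac{\partial \log p(y; D)}{\partial D_i}\, \frac{\partial \log p(y; D)}{\partial D_j} \right].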



Time Reversal Imaging Scenarios



Calibrated Time Reversal Imaging: CRB



Uncalibrated Time Reversal Imaging: CRB



Time Reversal AutoCalibration



Foci for 2004

  • Backscatter models for adaptive detection and classification:

    • Model fitting to spheres, plates, dihedrals under foliage

    • refining sensor performance metrics (Pf, Pd, Pid)

    • Parametric modeling of backscatter: Pearson mixture models

  • Adaptive non-myopic sensor scheduling and management:

    • Analytical value function approximations

    • combining Q-learning and particle filtering

    • Q-function approximation: linear

  • Bounds for time reversal 3D imaging

    • Problem formulation

    • Optimizing forward and time-reversal spatial filters for calibrated arrays



Foci for 2005

  • Continue to refine sensor performance metrics (Pf, Pd, Pid) and MRF classifier models

    • stationary target phantoms in foliage

    • full MRF modeling of backscatter for classification

  • Adaptive non-myopic sensor scheduling and management:

    • Q-function approximation: non-linear

    • advantage (A) learning for accelerating Q-training phase

    • combining A-learning and particle filtering

  • Time reversal 3D imaging with uncalibrated sensor arrays

    • autocalibration for uncalibrated arrays

    • Perform experiment on scale models



Foci for 2006

  • Implement performance metrics (Pf, Pd, Pid) and MRF classifier models

    • Stationary/Moving target/sensor for foliage and other clutter types

    • Quantitative comparisons between MRF and Pearson models

  • Adaptive non-myopic sensor scheduling and management:

    • On-line implementations: PF, advantage learning methods

  • Time reversal 3D imaging with uncalibrated sensor arrays

    • adaptive focus of attention using SM



Foci for 2007

  • Integration of SM, multimodality sensors, and MATILDA into one system

  • Integration of physics models and SM system and demonstration for realistic scenarios: smart moving targets under foliage, minefields, etc



Publications (2003-2004)

  • Kreucher, C., Hero, A., Singh, S., and Kastella, K., “Sensor management for multitarget tracking using a reinforcement learning approach,” under review for the 2004 Neural Information Processing Systems (NIPS) conference.

  • D. Blatt, S. Murphy, and J. Zhu, “A-learning for approximate planning,” under review for the 2004 Neural Information Processing Systems (NIPS) conference.

  • Kreucher, C., Hero, A., Kastella, K., and Chang, D., “Efficient methods of non-myopic sensor management for multitarget tracking,” under review for 43rd IEEE Conference on Decision and Control, December 2004.

  • Kreucher, C., Hero, A., and Kastella, K., “Multiple model particle filtering for multitarget tracking,” The Twelfth Annual Workshop on Adaptive Sensor Array Processing (ASAP), Lexington, Mass, March 2004.

  • D. Blatt and A. Hero, "Asymptotic distribution of log-likelihood maximization based algorithms and applications," in Energy Minimization Methods in Computer Vision and Pattern Recognition (EMM-CVPR), Eds. M. Figueiredo, A. Rangarajan, J. Zerubia, Springer-Verlag, 2003.

  • J. Costa, A. O. Hero and C. Vignat, "On solutions to multivariate maximum alpha-entropy problems," in Energy Minimization Methods in Computer Vision and Pattern Recognition (EMM-CVPR), Eds. M. Figueiredo, A. Rangarajan, J. Zerubia, Springer-Verlag, 2003.

  • C. Kreucher, K. Kastella, and A. Hero, “Multitarget tracking using particle representation of the joint multi-target density,” accepted subject to revisions in IEEE T-AES, Aug. 2003.



Publications (2003-2004) – ctd

  • C. Kreucher, K. Kastella, and A. Hero, “A Bayesian Method for Integrated Multitarget Tracking and Sensor Management”, 6th International Conference on Information Fusion, Cairns, Australia, July 2003.

  • C. Kreucher, K. Kastella, and A. Hero, “Tracking Multiple Targets Using a Particle Filter Representation of the Joint Multitarget Probability Density”, SPIE, San Diego, California, August 2003.

  • C. Kreucher, K. Kastella, and A. O. Hero, "Multitarget sensor management using alpha divergence measures," Proc. First IEEE Conference on Information Processing in Sensor Networks, Palo Alto, April 2003.

  • C. Kreucher, K. Kastella, and A. Hero, “Information-based sensor management for multitarget tracking”, SPIE, San Diego, California, August 2003.

  • C. Kreucher, K. Kastella, and A. Hero, “Particle filtering and information prediction for sensor management”, 2003 Defense Applications of Data Fusion Workshop, Adelaide, Australia, July 2003.

  • C. Kreucher, K. Kastella, and A. Hero, “Information Based Sensor Management for Multitarget Tracking”, Proc. Workshop on Multiple Hypothesis Tracking: A Tribute to Samuel S. Blackman, San Diego, CA, May 30, 2003.

  • N. Patwari and A. O. Hero, "Hierarchical censoring for distributed detection in wireless sensor networks," Proc. of ICASSP, Hong Kong, April 2003.

  • N. Patwari, A. O. Hero, M. Perkins, N. S. Correal and R. J. O'Dea, "Relative location estimation in sensor networks,” IEEE T-SP, vol. 51, No. 9, pp. 2137-2148, Aug. 2003.

  • A. O. Hero , “Secure space-time communication," IEEE T-IT, vol. 49, No 12, pp. 1-16, Dec. 2003.

  • M.F. Shih and A. O. Hero, "Unicast-based inference of network link delay distributions using mixed finite mixture models," IEEE T-SP, vol. 51, No. 9, pp. 2219-2228, Aug. 2003.



Synergistic Activities and Awards (2003-2004)

  • General Dynamics Medal Paper Award

    • C. Kreucher, K. Kastella, and A. O. Hero, "Multitarget sensor management using alpha divergence measures," Proc. First IEEE Conference on Information Processing in Sensor Networks, Palo Alto, April 2003

  • General Dynamics, Inc

    • K. Kastella: collaboration with A. Hero in sensor management, July 2002-

    • C. Kreucher: doctoral student of A. Hero, Sept. 2002-

  • ARL

    • NAS-SEDD: A. Hero is member of yearly review panel, May 2002-

    • NAS-Robotics: A. Hero chaired the cross-cutting review panel, May 2004.

    • B. Sadler: N. Patwari (doctoral student of A. Hero) internship in distributed sensor information processing, summer 2003

  • ERIM Intl.

    • B. Thelen & N. Subotic: H. Neemuchwala (Hero’s PhD student) internship in applying entropic graphs to pattern classification, summer 2003

  • Chalmers Univ.,

    • M. Viberg: A. Hero was Opponent on multimodality landmine detection doctoral thesis, Aug 2003

  • EMM-CVPR plenary speaker:

    • “Entropy, spanner graphs, and pattern matching,” plenary lecture, July 2003



Personnel on A. Hero’s sub-Project (2003-2004)

  • Chris Kreucher, 3rd year grad student

    • UM-Dearborn

    • General Dynamics Sponsorship

  • Neal Patwari, 2nd year doctoral student

    • Virginia Tech

    • NSF Graduate Fellowship/MURI GSRA

  • Doron Blatt, 2nd year doctoral student

    • Univ. Tel Aviv

    • Dept. Fellowship/MURI GSRA

  • Raghuram Rangarajan, 2nd year doctoral student

    • IIT Madras

    • Dept. Fellowship/MURI GSRA



Personnel on A. Hero’s sub-Project (2003-2004)

  • Cyrille Hory, Post-doctoral researcher (2003)

    • University of Grenoble

    • Area of specialty: data analysis and modeling, SAR, time-frequency

