
Position Calibration of Audio Sensors and Actuators in a Distributed Computing Platform
Vikas C. Raykar | Igor Kozintsev | Rainer Lienhart
University of Maryland, College Park | Intel Labs, Intel Corporation



Motivation

  • Many multimedia applications are emerging which use multiple audio/video sensors and actuators.

[Figure labels: Microphones, Cameras, Speakers, Displays; Distributed Capture, Number Crunching, Distributed Rendering; Current Work, Other Applications]


What can you do with multiple microphones…

  • Speaker localization and tracking.

  • Beamforming or Spatial filtering.



Some Applications…

  • Speech recognition

  • Hands-free voice communication

  • Novel interactive audio-visual interfaces

  • Multichannel speech enhancement

  • Smart conference rooms

  • Audio/image based rendering

  • Audio/video surveillance

  • Speaker localization and tracking

  • Multichannel echo cancellation

  • Source separation and dereverberation

  • Meeting recording


More Motivation…

  • Current work has focused on setting up all the sensors and actuators on a single dedicated computing platform.

  • Dedicated infrastructure required in terms of the sensors, multi-channel interface cards and computing power.

    On the other hand

  • Computing devices such as laptops, PDAs, tablets, cellular phones, and camcorders have become pervasive.

  • Audio/video sensors on different laptops can be used to form a distributed network of sensors.

Internal microphone


Common TIME and SPACE

  • Put all the distributed audio/visual input/output capabilities of all the laptops into a common TIME and SPACE.

  • For the common TIME, see our poster:

    Universal Synchronization Scheme for Distributed Audio-Video Capture on Heterogeneous Computing Platforms, R. Lienhart, I. Kozintsev and S. Wehr

  • In this paper we deal with the common SPACE, i.e., we estimate the 3D positions of the sensors and actuators.

    Why common SPACE

  • Most array processing algorithms require that precise positions of microphones be known.

  • Manual measurement is painful and imprecise.



If we know the positions of speakers….

  • If distances are not exact, or if we have more speakers: solve in the least-squares sense.

[Figure: X and Y axes, with the unknown position marked "?"]
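The least-squares step on this slide can be sketched as follows. This is a minimal 2D illustration, assuming the speaker positions and the measured distances are given; the function name `locate_2d` and the linearization (subtracting the first range equation from the others) are choices made for this sketch, not necessarily the authors' formulation.

```python
import math

def locate_2d(speakers, distances):
    """Least-squares 2D position from distances to known speaker positions.

    Each range equation (x - xi)^2 + (y - yi)^2 = di^2 is made linear by
    subtracting the first equation, then the overdetermined linear system
    is solved through its 2x2 normal equations.
    """
    (x1, y1), d1 = speakers[0], distances[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(speakers[1:], distances[1:]):
        rows.append((2.0 * (xi - x1), 2.0 * (yi - y1)))
        rhs.append(d1**2 - di**2 + xi**2 + yi**2 - x1**2 - y1**2)

    # Normal equations: (A^T A) p = A^T b
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))

    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("speakers are collinear; position is not unique")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Hypothetical layout: four speakers at the corners of a 4 m x 3 m rectangle.
speakers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]
mic = (1.0, 2.0)
distances = [math.hypot(mic[0] - sx, mic[1] - sy) for sx, sy in speakers]
estimate = locate_2d(speakers, distances)
```

With more speakers than unknowns, noisy distances no longer agree exactly, and the normal equations return the position that best fits all of them in the linearized sense.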


If positions of speakers unknown…

  • Consider M Microphones and S speakers.

  • What can we measure?

    • Distance between each speaker and all microphones, or Time Of Flight (TOF).

    • This gives an M×S TOF matrix.

    • Assume the TOF is corrupted by Gaussian noise.

    • We can then derive the ML estimate.

[Figure label: Calibration signal]


Nonlinear Least Squares..

  • More formally, we can derive the ML estimate using a Gaussian noise model.

  • Find the coordinates which minimize this.


Maximum Likelihood (ML) Estimate..

  • We can define a noise model and derive the ML estimate, i.e., maximize the likelihood.

  • Gaussian noise: if the noise is Gaussian and independent, ML is the same as least squares.
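The equivalence between ML and least squares can be written out as below. This is a sketch under stated assumptions: the symbols $\hat{\tau}_{ij}$ (measured TOF between microphone $i$ and speaker $j$), $m_i$, $s_j$ (their positions), $c$ (speed of sound), $\sigma_{ij}^2$ (noise variance) and $\Theta$ (all unknown coordinates) are notation introduced here, not taken from the slides.

```latex
% Assumed measurement model: true flight time plus independent Gaussian noise
\hat{\tau}_{ij} = \frac{\lVert m_i - s_j \rVert}{c} + n_{ij},
\qquad n_{ij} \sim \mathcal{N}(0, \sigma_{ij}^2)

% Log-likelihood of all M x S measurements
\log p(\hat{\tau} \mid \Theta)
  = -\sum_{i=1}^{M} \sum_{j=1}^{S}
    \frac{\left( \hat{\tau}_{ij} - \lVert m_i - s_j \rVert / c \right)^2}{2 \sigma_{ij}^2}
    + \text{const}

% Maximizing the likelihood is therefore a weighted nonlinear least-squares problem
\hat{\Theta}_{\mathrm{ML}}
  = \arg\min_{\Theta} \sum_{i=1}^{M} \sum_{j=1}^{S}
    \frac{\left( \hat{\tau}_{ij} - \lVert m_i - s_j \rVert / c \right)^2}{\sigma_{ij}^2}
```

When all variances are equal, the weights drop out and the estimate is the ordinary least-squares fit.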


Reference coordinate system l.jpg

Reference Coordinate system | Multiple Global minima

Reference Coordinate System

Positive Y axis

Similarly in 3D

1.Fix origin (0,0,0)

2.Fix X axis

(x1,0,0)

3.Fix Y axis

(x2,y2,0)

4.Fix positive Z axis

x1,x2,y2>0

Origin

X axis

Which to choose? Later…
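The four steps for fixing the 3D reference frame can be illustrated with a small sketch. It assumes the first three points define the origin, the X axis and the XY plane, and that a fourth point is used to pick the positive Z direction; that last choice and the function name `to_reference_frame` are assumptions made for illustration. The sketch enforces x1 > 0 and y2 > 0 only.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scale(a, k):
    return tuple(x * k for x in a)

def unit(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def to_reference_frame(points):
    """Re-express 3D points so that points[0] is the origin, points[1] lies
    on the positive X axis, points[2] lies in the XY plane with y > 0, and
    points[3] has z >= 0 (assumed way of fixing the positive Z axis)."""
    p0, p1, p2, p3 = points[:4]
    e1 = unit(sub(p1, p0))                      # X direction
    v = sub(p2, p0)
    e2 = unit(sub(v, scale(e1, dot(v, e1))))    # Y direction (Gram-Schmidt)
    e3 = cross(e1, e2)                          # Z direction
    if dot(sub(p3, p0), e3) < 0:                # resolve the mirror ambiguity
        e3 = scale(e3, -1.0)
    return [(dot(sub(p, p0), e1), dot(sub(p, p0), e2), dot(sub(p, p0), e3))
            for p in points]

# Hypothetical points in an arbitrary frame.
raw = [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (2.0, 4.0, 1.0), (1.0, 2.0, -5.0)]
fixed = to_reference_frame(raw)
```

Translation, rotation and reflection leave all pairwise distances unchanged, which is why the cost function has multiple global minima until a frame like this is fixed.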




The journey of an audio sample..

[Figure: layers between application and hardware: Multimedia/multistream applications, Operating system, I/O bus, Audio/video I/O devices; the laptops are connected by a Network]

  • This laptop wants to play a calibration signal on the other laptop.

  • A play command is issued in software.

  • When will the sound actually be played out from the loudspeaker?


On a Distributed system..

[Timeline figure: two time axes t measured from a common Time Origin. On one axis: Playback Started, then Signal Emitted by source j. On the other axis: Capture Started, then Signal Received by microphone i.]
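The effect of the unknown start times on a measured TOF can be sketched as below. The model is an assumption made for illustration: the apparent TOF is taken to be the true flight time plus the speaker emission start time `ts_j` minus the microphone capture start time `tm_i`. The function names and the speed-of-sound value are also assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def measured_tof(mic, spk, ts_j, tm_i, c=SPEED_OF_SOUND):
    """Apparent TOF seen in the recording (assumed model):
    true flight time + emission start ts_j - capture start tm_i."""
    true_tof = math.sqrt(sum((a - b) ** 2 for a, b in zip(mic, spk))) / c
    return true_tof + ts_j - tm_i

def measured_tdoa(mic_a, mic_b, spk, ts_j, tm_a, tm_b):
    """Time difference of arrival between two microphones for one speaker.
    The speaker emission start time ts_j cancels in the difference."""
    return (measured_tof(mic_b, spk, ts_j, tm_b)
            - measured_tof(mic_a, spk, ts_j, tm_a))

# Hypothetical geometry and start times (seconds).
mic_a, mic_b, spk = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.5)
tdoa_early = measured_tdoa(mic_a, mic_b, spk, ts_j=0.05, tm_a=0.0, tm_b=0.02)
tdoa_late = measured_tdoa(mic_a, mic_b, spk, ts_j=0.30, tm_a=0.0, tm_b=0.02)
```

Under this model the two TDOA values agree regardless of when the speaker started playing, which is one reason a TDOA-based formulation is attractive when emission times are unknown.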


Joint Estimation..

  • Observations: MS TOF measurements.

  • Microphone and speaker coordinates: 3(M+S)-6 parameters.

  • Microphone capture start times: M-1 parameters (assume tm_1 = 0).

  • Speaker emission start times: S parameters.

  • In total, 4M+4S-7 parameters to estimate from MS observations.

  • Can reduce the number of parameters.
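The parameter count on this slide can be checked with a few lines of arithmetic. The helper names are hypothetical; the counts themselves come from the slide. Having at least as many observations as unknowns is only a necessary counting condition, not a guarantee that the problem is well posed.

```python
def unknown_parameters(M, S):
    """Number of unknowns in the joint estimation, as counted on the slide."""
    coordinates = 3 * (M + S) - 6   # 3D positions minus the 6 fixed by the reference frame
    mic_start_times = M - 1         # first microphone start time assumed to be 0
    speaker_start_times = S
    return coordinates + mic_start_times + speaker_start_times

def enough_observations(M, S):
    """True when the M*S TOF measurements are at least as many as the unknowns."""
    return M * S >= unknown_parameters(M, S)

# Smallest equal number of microphones and speakers (n >= 2) that passes the count.
smallest_n = min(n for n in range(2, 50) if enough_observations(n, n))
```

For M = S = 7 there are 49 observations and 49 unknowns; with 6 of each there are 36 observations against 41 unknowns.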


Nonlinear least squares..

  • Levenberg-Marquardt method.

  • The cost is a function of a large number of parameters.

  • Unless we have a good initial guess, it may not converge to the minimum.

  • An approximate initial guess is required.
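To make the damping idea concrete, here is a toy sketch of a Levenberg-style damped Gauss-Newton iteration. It is deliberately much smaller than the joint problem in the slides: it estimates a single 2D position from distances to known speakers, and it uses identity damping rather than Marquardt's diagonal scaling. All names and the test geometry are assumptions for illustration.

```python
import math

def residuals(p, speakers, distances):
    return [math.hypot(p[0] - sx, p[1] - sy) - d
            for (sx, sy), d in zip(speakers, distances)]

def cost(p, speakers, distances):
    return sum(r * r for r in residuals(p, speakers, distances))

def levenberg_marquardt_2d(p0, speakers, distances, lam=1e-3, iters=100, tol=1e-12):
    """Minimize the sum of squared range residuals starting from guess p0."""
    p = tuple(p0)
    for _ in range(iters):
        r = residuals(p, speakers, distances)
        # Jacobian row i is the unit vector from speaker i to the current estimate.
        J = []
        for sx, sy in speakers:
            n = max(math.hypot(p[0] - sx, p[1] - sy), 1e-12)
            J.append(((p[0] - sx) / n, (p[1] - sy) / n))
        # Damped normal equations: (J^T J + lam * I) step = -J^T r
        a11 = sum(j[0] * j[0] for j in J) + lam
        a12 = sum(j[0] * j[1] for j in J)
        a22 = sum(j[1] * j[1] for j in J) + lam
        g1 = sum(j[0] * ri for j, ri in zip(J, r))
        g2 = sum(j[1] * ri for j, ri in zip(J, r))
        det = a11 * a22 - a12 * a12
        dx = -(a22 * g1 - a12 * g2) / det
        dy = -(a11 * g2 - a12 * g1) / det
        if math.hypot(dx, dy) < tol:
            break
        trial = (p[0] + dx, p[1] + dy)
        if cost(trial, speakers, distances) < cost(p, speakers, distances):
            p, lam = trial, lam / 10.0   # step helped: trust the Gauss-Newton model more
        else:
            lam *= 10.0                  # step hurt: fall back toward gradient descent
    return p

# Hypothetical setup: four speakers, true position (1, 2), initial guess (2, 1).
speakers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]
truth = (1.0, 2.0)
distances = [math.hypot(truth[0] - sx, truth[1] - sy) for sx, sy in speakers]
solution = levenberg_marquardt_2d((2.0, 1.0), speakers, distances)
```

The method only guarantees that the cost does not increase from step to step, so it finds the nearest minimum; with many more unknowns, the starting point decides which minimum that is.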


Closed form Solution..

Say we are given all pairwise distances between N points: can we recover the coordinates?


Classical Metric Multi Dimensional Scaling

  • Dot product matrix B: symmetric, positive semidefinite, rank 3.

  • Given B, can you get X? Yes, by Singular Value Decomposition.

  • Same as Principal Component Analysis.

  • But we can measure only the pairwise distance matrix.
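Classical MDS bridges that gap by converting the pairwise distance matrix into a dot product matrix and then factoring it. The sketch below is a dependency-free illustration: it double-centers the squared distances and extracts the leading eigenpairs with power iteration and deflation, which is a stand-in chosen here for the SVD mentioned on the slide. Function names and the test points are assumptions.

```python
import math

def double_center(D):
    """Dot product matrix B = -1/2 * J * D^2 * J, with J = I - (1/n) * ones."""
    n = len(D)
    sq = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]

def top_eigenpairs(B, k, iters=1000):
    """Leading k eigenpairs of a symmetric PSD matrix via power iteration."""
    n = len(B)
    B = [row[:] for row in B]
    pairs = []
    for c in range(k):
        v = [1.0 + 0.1 * (i + 1) * (c + 1) for i in range(n)]  # fixed start vector
        for _ in range(iters):
            w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        for i in range(n):               # deflate the component just found
            for j in range(n):
                B[i][j] -= lam * v[i] * v[j]
    return pairs

def classical_mds(D, dim):
    """Coordinates (up to rotation, translation, reflection) from distances D."""
    pairs = top_eigenpairs(double_center(D), dim)
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(D))]

# Hypothetical 2D points; only their pairwise distances are given to MDS.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0), (1.0, 2.0)]
D = [[math.hypot(a[0] - b[0], a[1] - b[1]) for b in points] for a in points]
X = classical_mds(D, 2)
```

The recovered coordinates differ from the originals by a rigid motion or reflection, but they reproduce the pairwise distances, which is why a reference coordinate system still has to be fixed afterwards.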




Can we use MDS..Two problems

  1. We do not have the complete pairwise distances.

  2. Measured distances include the effect of lack of synchronization.

[Figure labels: UNKNOWN, UNKNOWN]



Clustering approximation…

[Figure labels: i i, i j, j i, j j]


Finally the complete algorithm…

[Flowchart blocks:]

  • TOF matrix

  • Clustering approximation

  • Approx ts

  • Approx tm

  • Approx distance matrix between GPCs

  • Dot product matrix

  • Dimension and coordinate system

  • MDS to get approx GPC locations

  • perturb

  • Approx. microphone and speaker locations

  • TDOA based nonlinear minimization

  • Microphone and speaker locations, tm




Algorithm Performance…

  • The performance of our algorithm depends on

    • Noise Variance in the estimated distances.

    • Number of microphones and speakers.

    • Microphone and speaker geometry

  • One way to study the dependence is to run a large number of Monte Carlo simulations.

  • Alternatively, given a noise model, we can derive a bound on the best accuracy any unbiased estimator can achieve: the Cramér-Rao bound.


Jacobian

Rank deficient..remove the known parameters






[Figure: experimental setup with Mic 1 to Mic 4 and Speaker 1 to Speaker 4 in a room; Room Length = 4.22 m, Room Width = 2.55 m, Room Height = 2.03 m; axes X and Z]

Synchronized setup | bias 0.08 cm, sigma 3.8 cm



Summary

  • General purpose computers can be used for distributed array processing

  • It is possible to define common time and space for a network of distributed sensors and actuators.

  • For more information please see our two papers or contact [email protected]

  • [email protected]

  • Let us know if you are interested in testing or using our time and space synchronization software for developing distributed algorithms on GPCs (available in January 2004).


