
loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Spectral Hashing' - niveditha


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Spectral Hashing

Y. Weiss (Hebrew U.)

A. Torralba (MIT)

Rob Fergus (NYU)

What does the world look like?

Motivation

High level image statistics

Object Recognition for large-scale search

Semantic Hashing

[Salakhutdinov & Hinton, 2007]

Query Image

Semantic Hash Function

Address Space

Binary code

Images in database

Query address

Semantically similar images

Quite different to a (conventional) randomizing hash

1. Locality Sensitive Hashing
  • Gionis, A. & Indyk, P. & Motwani, R. (1999)
  • Take random projections of data
  • Quantize each projection with few bits

[Figure: a Gist descriptor is projected onto random directions; each projection is quantized with a few bits, giving a code such as 1 0 1 ... 0. No learning involved.]

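As a hedged sketch, the two LSH steps named above (random projections, then quantizing each projection) might look as follows; `lsh_hash` and the sign-at-zero quantizer are illustrative choices, not taken from the Gionis et al. paper:

```python
import numpy as np

def lsh_hash(X, n_bits, seed=0):
    """Locality Sensitive Hashing sketch: take random projections of the
    data and quantize each projection with a single bit (sign test).
    No learning involved: the projection directions are purely random."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], n_bits))  # one random direction per bit
    return (X @ R > 0).astype(np.uint8)            # quantize each projection

# Hash five 512-D Gist-like vectors to 8-bit codes
X = np.random.default_rng(1).standard_normal((5, 512))
codes = lsh_hash(X, n_bits=8)
print(codes.shape)  # (5, 8)
```

Similar vectors tend to fall on the same side of most random hyperplanes, so their codes agree in most bits.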
Toy Example
  • 2D uniform distribution
2. Boosting
  • Modified form of BoostSSC [Shaknarovich, Viola & Darrell, 2003]
  • Positive examples are pairs of similar images
  • Negative examples are pairs of unrelated images

Learn threshold & dimension for each bit (weak classifier)

[Figure: each weak classifier splits one dimension at a learned threshold, assigning 0 or 1 to each side.]

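A minimal sketch of learning one such bit, assuming a brute-force search over (dimension, threshold) stumps; unlike the real BoostSSC there is no reweighting of pairs between successive bits, and all names here are hypothetical:

```python
import numpy as np

def learn_bit(X, pos_pairs, neg_pairs, n_thresholds=16):
    """Learn one weak classifier: the (dimension, threshold) stump whose
    bit agrees on the most similar pairs and disagrees on the most
    unrelated pairs.  Simplified sketch: no boosting-style pair
    reweighting between bits."""
    best_dim, best_thresh, best_score = 0, 0.0, -1
    for d in range(X.shape[1]):
        for t in np.linspace(X[:, d].min(), X[:, d].max(), n_thresholds):
            bit = X[:, d] > t
            score = (sum(bit[i] == bit[j] for i, j in pos_pairs) +
                     sum(bit[i] != bit[j] for i, j in neg_pairs))
            if score > best_score:
                best_dim, best_thresh, best_score = d, t, score
    return best_dim, best_thresh

# Toy data: pairs close along x are "similar", pairs far apart are not
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
dim, thresh = learn_bit(X, pos_pairs=[(0, 1), (2, 3)], neg_pairs=[(0, 2), (1, 3)])
print(dim)  # 0: the x-axis separates similar from unrelated pairs
```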
Toy Example
  • 2D uniform distribution
3. Restricted Boltzmann Machine (RBM)
  • Type of Deep Belief Network
  • Hinton & Salakhutdinov, Science 2006

[Figure: a single RBM layer: binary & stochastic hidden units connected to the visible units by symmetric weights W.]

  • Attempts to reconstruct input at visible layer from activation of hidden layer
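A rough sketch of one up-down pass through a single RBM layer under these assumptions (binary stochastic hidden units, symmetric weights W); contrastive-divergence training is omitted and the function/variable names are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_up_down(v, W, b_hid, b_vis, rng):
    """One up-down pass through a single RBM layer: hidden units are
    binary & stochastic, and the same symmetric weights W are used to
    reconstruct the input at the visible layer."""
    p_hid = sigmoid(v @ W + b_hid)                 # hidden activation probabilities
    h = (rng.random(p_hid.shape) < p_hid) * 1.0    # sample binary stochastic units
    v_recon = sigmoid(h @ W.T + b_vis)             # reconstruct the visible units
    return h, v_recon

rng = np.random.default_rng(0)
v = rng.random((1, 512))                           # a Gist-like visible vector
W = 0.01 * rng.standard_normal((512, 256))         # symmetric weights (shared up/down)
h, v_recon = rbm_up_down(v, W, np.zeros(256), np.zeros(512), rng)
print(h.shape, v_recon.shape)  # (1, 256) (1, 512)
```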
Multi-Layer RBM: non-linear dimensionality reduction

[Figure: stacked RBM layers. Input Gist vector (512 dimensions, linear units at first layer) -> Layer 1 (w1, 512 -> 512) -> Layer 2 (w2, 512 -> 256) -> Layer 3 (w3, 256 -> N) -> output binary code (N-dimensional).]

Toy Example
  • 2D uniform distribution
2-D Toy example:

[Figure: retrieval with 3-, 7- and 15-bit codes; points colored by Hamming distance from the query point: red = 0 bits, green = 1 bit, blue = 2 bits, black = >2 bits.]

Toy Results

[Figure: toy retrieval results, colored by Hamming distance: red = 0 bits, green = 1 bit, blue = 2 bits.]

Semantic Hashing

[Salakhutdinov & Hinton, 2007]

Query Image

Semantic Hash Function

Address Space

Binary code

Images in database

Query address

Semantically similar images

Quite different to a (conventional) randomizing hash

Spectral Hash

Query Image

Spectral Hash

Non-linear dimensionality reduction

Address Space

Binary code

Images in database

Real-valued vectors

Query address

Semantically similar images

Quite different to a (conventional) randomizing hash

Spectral Hashing (NIPS ’08)
  • Assume points are embedded in Euclidean space
  • How to binarize so Hamming distance approximates Euclidean distance?

Ham_Dist(10001010, 11101110) = 3
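The Hamming distance in the example can be checked directly; XOR marks the bit positions where the two codes differ:

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two binary codes differ."""
    return bin(a ^ b).count("1")

print(hamming_distance(0b10001010, 0b11101110))  # 3
```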

Spectral Hashing theory
  • Want to minimize Y^T(D-W)Y subject to:
    • Each bit on 50% of the time
    • Bits are independent
  • Sadly, this is NP-complete
  • Relax the problem by letting Y be continuous
  • Now becomes an eigenvector problem
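On a small training set the relaxed problem can be solved directly as an eigenvector problem. This is an illustrative sketch (names and the Gaussian affinity are assumed), and it only yields codes for the training points themselves, which is exactly the out-of-sample limitation the following slides address:

```python
import numpy as np

def relaxed_codes(X, n_bits, sigma=1.0):
    """Solve the relaxed problem on a small training set: build Gaussian
    affinities W, form the graph Laplacian D - W, and take the
    eigenvectors with smallest nonzero eigenvalues as the continuous Y;
    thresholding at zero recovers binary bits."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2 * sigma ** 2))
    L = np.diag(W.sum(axis=1)) - W        # graph Laplacian D - W
    _, vecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    Y = vecs[:, 1:n_bits + 1]             # skip the trivial constant eigenvector
    return (Y > 0).astype(np.uint8)

X = np.random.default_rng(2).standard_normal((10, 2))
codes = relaxed_codes(X, n_bits=3)
print(codes.shape)  # (10, 3)
```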
Nyström Approximation
  • Method for approximating eigenfunctions
  • Interpolate between existing data points
  • Requires evaluating distances to existing data points; cost grows linearly with # points
  • Also overfits badly in practice
What about a novel data point?
  • Need a function to map new points into the space
  • Take the limit of the eigenvectors as n → ∞, giving eigenfunctions
  • Need to carefully normalize graph Laplacian
  • Analytical form of Eigenfunctions exists for certain distributions (uniform, Gaussian)
  • Constant time compute/evaluate new point
  • For uniform:

Only depends on extent of distribution (b-a)
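For the uniform case the closed form can be sketched as follows. This is a hedged reconstruction: it assumes x is measured on the interval [a, b] (shifted by a) and that `eps` is the affinity kernel width; the names are mine, not the paper's:

```python
import numpy as np

def eigenfunction(x, k, a, b):
    """k-th 1-D eigenfunction for a uniform distribution on [a, b]
    (assumes x lies on that interval; shift by a is my convention)."""
    return np.sin(np.pi / 2 + k * np.pi / (b - a) * (x - a))

def eigenvalue(k, a, b, eps=1.0):
    """Matching eigenvalue; note it depends only on the extent (b - a)."""
    return 1.0 - np.exp(-(eps ** 2 / 2) * (k * np.pi / (b - a)) ** 2)

# Same extent (b - a) gives the same eigenvalue regardless of location
print(eigenvalue(2, 0.0, 3.0) == eigenvalue(2, 10.0, 13.0))  # True
```

Higher modes k oscillate faster and have larger eigenvalues, which is why the algorithm below prefers low-order modes along long dimensions.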

The Algorithm

Input: Data {xi} of dimensionality d; desired # bits, k

  • Fit a multidimensional rectangle to the data
    • Run PCA to align axes, then bound uniform distribution
  • For each dimension, calculate k smallest eigenfunctions.
  • This gives dk eigenfunctions. Pick the k with the smallest eigenvalues.
  • Threshold eigenfunctions at zero to give binary codes
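Putting the four steps together, a minimal end-to-end sketch (hypothetical names; ranking candidate modes by (kπ/extent)² stands in for computing eigenvalues, since the eigenvalue grows monotonically with that quantity):

```python
import numpy as np

def train_spectral_hash(X, n_bits):
    """Steps 1-3: PCA-align the data, bound a rectangle, and rank the
    dk candidate 1-D eigenfunctions by (k*pi/extent)^2, keeping the
    n_bits with smallest values (i.e. smallest eigenvalues)."""
    mean = X.mean(0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)  # 1. PCA to align axes
    Z = (X - mean) @ Vt.T
    mn = Z.min(0)
    extent = Z.max(0) - mn                                   #    bound the rectangle
    cand = [((k * np.pi / extent[d]) ** 2, d, k)             # 2. dk candidate modes
            for d in range(Z.shape[1]) for k in range(1, n_bits + 1)]
    modes = [(d, k) for _, d, k in sorted(cand)[:n_bits]]    # 3. smallest eigenvalues
    return dict(mean=mean, Vt=Vt, mn=mn, extent=extent, modes=modes)

def spectral_hash(X, p):
    """4. Evaluate each selected eigenfunction and threshold at zero."""
    Z = (X - p["mean"]) @ p["Vt"].T - p["mn"]                # shift rectangle to origin
    cols = [np.sin(np.pi / 2 + k * np.pi / p["extent"][d] * Z[:, d])
            for d, k in p["modes"]]
    return (np.stack(cols, axis=1) > 0).astype(np.uint8)

# Toy 2-D data with one long axis: low-order modes of that axis tend to win
X = np.random.default_rng(0).uniform(size=(200, 2)) * np.array([4.0, 1.0])
params = train_spectral_hash(X, n_bits=4)
codes = spectral_hash(X, params)
print(codes.shape)  # (200, 4)
```

Thresholding a sinusoid at zero cuts each PCA dimension into alternating stripes, which is what the 2-D toy figures below visualize.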
1. Fit Multidimensional Rectangle
  • Run PCA to align axes
  • Bound uniform distribution
Back to the 2-D Toy example

[Figure: 3-, 7- and 15-bit codes; points colored by Hamming distance: red = 0 bits, green = 1 bit, blue = 2 bits.]

Input Image representation: Gist vectors
  • Pixels not a convenient representation
  • Use Gist descriptor instead (Oliva & Torralba, 2001)
  • 512 dimensions/image (real-valued → 16,384 bits)
  • L2 distance between Gist vectors is not a bad substitute for human perceptual distance

NO COLOR INFORMATION

Oliva & Torralba, IJCV 2001

LabelMe images
  • 22,000 images (20,000 train | 2,000 test)
  • Ground truth segmentations for all
  • Assume L2 Gist distance is true distance
Bit allocation between dimensions
  • Compare value of cuts in original space, i.e. before the pointwise nonlinearity.
Summary
  • Spectral Hashing
    • Simple way of computing good binary codes
    • Forced to make big assumption about data distribution
    • Use point-wise non-linearities to map distribution to uniform
    • Need more experiments on real data
Overview
  • Assume points are embedded in Euclidean space (e.g. output from RBM)
  • How to binarize the space so that Hamming distance between points approximates L2 distance?
Strategies for Binarization
  • Deliberately add noise during backprop - forces extreme values to overcome noise
