Hierarchical Mixture of Experts

Presented by Qi An

Machine learning reading group

Duke University

07/15/2005


Outline

  • Background

  • Hierarchical tree structure

  • Gating networks

  • Expert networks

  • E-M algorithm

  • Experimental results

  • Conclusions


Background

  • The idea of mixture of experts

    • First presented by Jacobs and Hinton in 1988

  • Hierarchical mixture of experts

    • Proposed by Jordan and Jacobs in 1994

  • Difference from previous mixture model

    • The mixing weights depend on the input (and, through the posteriors used in fitting, on the output), rather than being fixed constants



One-layer structure

Figure: a one-layer mixture of experts. A gating network (here with an ellipsoidal gating function) maps the input x to weights g1, g2, g3; three expert networks map the same input x to outputs μ1, μ2, μ3; the overall output is the gate-weighted combination μ = g1·μ1 + g2·μ2 + g3·μ3.
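To make the figure concrete, here is a minimal Python sketch of the one-layer forward pass, assuming softmax gating and linear experts; the parameter names V (gating weights) and W (expert weights) are illustrative, and the slide's ellipsoidal gating function would replace the softmax with a normalized Gaussian-style kernel.

```python
import numpy as np

def softmax(a):
    """Numerically stable softmax."""
    a = a - a.max()
    e = np.exp(a)
    return e / e.sum()

def one_layer_moe(x, V, W):
    """One-layer mixture of experts: mu = sum_i g_i * mu_i.

    x: input, shape (d,)
    V: gating parameters, shape (m, d)      -> gates g_1..g_m
    W: expert parameters, shape (m, k, d)   -> outputs mu_1..mu_m
    """
    g = softmax(V @ x)                    # gating weights, sum to 1
    mus = np.stack([Wi @ x for Wi in W])  # expert outputs, shape (m, k)
    return g @ mus                        # gate-weighted combination
```

With three experts (m = 3) this reproduces the figure: g1, g2, g3 weight μ1, μ2, μ3 into a single output μ.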



Hierarchical tree structure

Figure: a two-level tree in which gating networks (with linear gating functions) sit at the nonterminal nodes and route the input x down the tree, and expert networks sit at the leaves.


Expert network

  • At the leaves of the tree

    for each expert (i, j):

    linear predictor: ξ_ij = U_ij x

    output of the expert: μ_ij = f(ξ_ij)

    link function f: for example, the logistic function for binary classification (the identity for regression)
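As a small illustration of these definitions, a single expert with the logistic link mentioned above might look like the following sketch (the name expert_output and the Bernoulli reading of the output are assumptions for illustration).

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def expert_output(x, U, link=logistic):
    """Expert network: linear predictor xi = U x passed through a link f.

    With the logistic link, the output mu is read as P(y = 1 | x) for
    binary classification; with the identity link, it is a regression mean.
    """
    xi = U @ x        # linear predictor xi_ij = U_ij x
    return link(xi)   # mu_ij = f(xi_ij)
```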


Gating network

  • At the nonterminal nodes of the tree

    top layer: g_i = exp(ξ_i) / Σ_k exp(ξ_k), with ξ_i = v_i^T x

    other layers: g_j|i = exp(ξ_ij) / Σ_k exp(ξ_ik), with ξ_ij = v_ij^T x

    (a "softmax" of linear functions of the input; see the sketch after the Output slide below)


Output

  • At the non-leaf nodes

    top node: μ = Σ_i g_i μ_i

    other nodes: μ_i = Σ_j g_j|i μ_ij
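Putting the gating and expert pieces together, here is a minimal sketch of the two-level forward pass for a regression tree (identity link); the shapes and names are illustrative assumptions, and softmax is the helper defined in the one-layer sketch above.

```python
import numpy as np

def hme_output(x, v_top, v_mid, U):
    """Two-level HME output: mu = sum_i g_i sum_j g_{j|i} mu_ij.

    v_top: top gating parameters, shape (m, d)       -> g_i
    v_mid: nested gating parameters, shape (m, n, d) -> g_{j|i}
    U:     expert parameters, shape (m, n, k, d)     -> mu_ij = U_ij x
    """
    g_top = softmax(v_top @ x)               # g_i at the top node
    mu = 0.0
    for i in range(len(v_top)):
        g_mid = softmax(v_mid[i] @ x)        # g_{j|i} at node i
        mu_i = sum(g_mid[j] * (U[i, j] @ x)  # mu_i = sum_j g_{j|i} mu_ij
                   for j in range(len(g_mid)))
        mu = mu + g_top[i] * mu_i            # mu = sum_i g_i mu_i
    return mu
```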


Probability model

  • For each expert, assume the true output y is drawn from a distribution P with mean μ_ij

  • Therefore, the total probability of generating y from x is given by

    P(y | x, θ) = Σ_i g_i Σ_j g_j|i P(y | x, θ_ij)
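For instance, assuming unit-variance Gaussian experts (an illustrative choice consistent with regression), the total probability can be evaluated directly; softmax is as defined earlier.

```python
import numpy as np

def gaussian_pdf(y, mu):
    """Unit-variance Gaussian density of vector y with mean mu."""
    d = y - mu
    return np.exp(-0.5 * d @ d) / (2 * np.pi) ** (len(y) / 2)

def total_probability(y, x, v_top, v_mid, U):
    """P(y | x, theta) = sum_i g_i sum_j g_{j|i} P(y | x, theta_ij)."""
    g_top = softmax(v_top @ x)
    p = 0.0
    for i in range(len(v_top)):
        g_mid = softmax(v_mid[i] @ x)
        for j in range(len(g_mid)):
            p += g_top[i] * g_mid[j] * gaussian_pdf(y, U[i, j] @ x)
    return p
```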


Posterior probabilities

  • Since g_i and g_j|i are computed based only on the input x, we refer to them as prior probabilities.

  • With knowledge of both the input x and the output y, we can define the posterior probabilities using Bayes' rule:

    h_i = g_i Σ_j g_j|i P(y | x, θ_ij) / Σ_k g_k Σ_l g_l|k P(y | x, θ_kl)

    h_j|i = g_j|i P(y | x, θ_ij) / Σ_l g_l|i P(y | x, θ_il)

    and the joint posterior is h_ij = h_i h_j|i
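Bayes' rule transcribes directly into code. Under the same illustrative Gaussian assumption, the joint posteriors h_ij (from which h_i and h_j|i follow by summing and dividing) are:

```python
import numpy as np

def posteriors(y, x, v_top, v_mid, U):
    """Joint posterior responsibilities h_ij = h_i * h_{j|i}.

    Returns an (m, n) array that sums to 1; h_i = sum_j h_ij and
    h_{j|i} = h_ij / h_i recover the per-level posteriors.
    """
    m, n = U.shape[0], U.shape[1]
    g_top = softmax(v_top @ x)
    joint = np.zeros((m, n))
    for i in range(m):
        g_mid = softmax(v_mid[i] @ x)
        for j in range(n):
            # prior g_i * g_{j|i} times likelihood P(y | x, theta_ij)
            joint[i, j] = g_top[i] * g_mid[j] * gaussian_pdf(y, U[i, j] @ x)
    return joint / joint.sum()
```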


E-M algorithm

  • Introduce auxiliary indicator variables z_ij, which have an interpretation as labels that correspond to the experts (z_ij = 1 if expert (i, j) generated the data point)

  • The probability model can be simplified with knowledge of the auxiliary variables:

    P(y, z | x, θ) = Π_i Π_j [g_i g_j|i P(y | x, θ_ij)]^(z_ij)


E-M algorithm

  • Complete-data likelihood:

    l_c(θ) = Σ_t Σ_i Σ_j z_ij^(t) [ln g_i^(t) + ln g_j|i^(t) + ln P(y^(t) | x^(t), θ_ij)]

  • The E-step replaces each z_ij^(t) by its conditional expectation, the posterior h_ij^(t), giving

    Q(θ, θ') = Σ_t Σ_i Σ_j h_ij^(t) [ln g_i^(t) + ln g_j|i^(t) + ln P(y^(t) | x^(t), θ_ij)]


E m algorithm2
E-M algorithm

  • The M-step maximizes Q separately for each network:

    experts: θ_ij ← argmax Σ_t h_ij^(t) ln P(y^(t) | x^(t), θ_ij), a posterior-weighted maximum-likelihood problem

    gating: the gating parameters maximize Σ_t [Σ_i h_i^(t) ln g_i^(t) + Σ_i Σ_j h_i^(t) h_j|i^(t) ln g_j|i^(t)], a weighted multinomial likelihood solved by IRLS


IRLS

  • Iteratively reweighted least squares algorithm

    • An iterative algorithm for computing the maximum-likelihood estimates of the parameters of a generalized linear model

    • A special case of the Fisher scoring method
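As a concrete instance, here is a minimal IRLS sketch for a weighted binary logistic regression, the kind of subproblem the M-step hands to a gating or classification network; the observation weights h stand in for the posteriors, and the small ridge term is an assumption for numerical stability.

```python
import numpy as np

def irls_logistic(X, y, h, n_iter=20):
    """IRLS / Fisher scoring for logistic regression with weights h.

    X: design matrix (t, d); y: 0/1 targets (t,); h: weights (t,).
    Maximizes sum_t h_t [y_t ln p_t + (1 - y_t) ln(1 - p_t)].
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))  # current probabilities
        W = h * p * (1.0 - p)                  # IRLS working weights
        grad = X.T @ (h * (y - p))             # weighted score vector
        fisher = X.T @ (W[:, None] * X) + 1e-8 * np.eye(X.shape[1])
        beta += np.linalg.solve(fisher, grad)  # Fisher scoring step
    return beta
```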


Algorithm

E-step: compute the posteriors h_i^(t), h_j|i^(t), and h_ij^(t) for every data point using the current parameters.

M-step: solve the posterior-weighted likelihood problems, fitting each expert network by weighted maximum likelihood and each gating network by IRLS; iterate until convergence.
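A minimal sketch of one EM sweep under the same illustrative assumptions (unit-variance Gaussian experts, softmax gating), reusing the posteriors helper above. For brevity the gating M-step is a single gradient ascent step on its weighted likelihood rather than a full IRLS inner loop.

```python
import numpy as np

def em_step(X, Y, v_top, v_mid, U, lr=0.1):
    """One EM sweep for a two-level HME with Gaussian experts.

    X: inputs (t, d); Y: targets (t, k); remaining shapes as above.
    """
    t, d = X.shape
    m, n = U.shape[0], U.shape[1]

    # E-step: posterior responsibilities for every data point.
    H = np.stack([posteriors(Y[s], X[s], v_top, v_mid, U) for s in range(t)])

    # M-step for the experts: posterior-weighted least squares.
    for i in range(m):
        for j in range(n):
            w = H[:, i, j]
            A = X.T @ (w[:, None] * X) + 1e-8 * np.eye(d)
            B = X.T @ (w[:, None] * Y)
            U[i, j] = np.linalg.solve(A, B).T    # mu_ij = U_ij x

    # M-step for the gates: one gradient step on the weighted likelihood.
    for s in range(t):
        h_i = H[s].sum(axis=1)                   # h_i = sum_j h_ij
        g = softmax(v_top @ X[s])
        v_top += lr * np.outer(h_i - g, X[s])    # ascent on sum h_i ln g_i
        for i in range(m):
            gm = softmax(v_mid[i] @ X[s])
            h_ji = H[s, i] / max(h_i[i], 1e-12)  # h_{j|i}
            v_mid[i] += lr * h_i[i] * np.outer(h_ji - gm, X[s])
    return v_top, v_mid, U
```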


Online algorithm

  • This algorithm can be used for online regression

  • For the expert networks, a posterior-weighted recursive least-squares update of the form

    U_ij^(t+1) = U_ij^(t) + h_ij^(t) (y^(t) − μ_ij^(t)) x^(t)T R_ij^(t)

    where R_ij is the (recursively updated) inverse covariance matrix for EN(i,j)


Online algorithm

  • For the gating networks, analogous recursive updates:

    v_i^(t+1) = v_i^(t) + S_i^(t) (h_i^(t) − g_i^(t)) x^(t)

    where S_i is the inverse covariance matrix for the top-level gating network,

    and

    v_ij^(t+1) = v_ij^(t) + S_ij^(t) h_i^(t) (h_j|i^(t) − g_j|i^(t)) x^(t)

    where S_ij is the inverse covariance matrix for the nested gating networks
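A sketch of the posterior-weighted recursive least-squares update for a single expert network, written from the standard weighted RLS recursion; the forgetting factor lam and the symbol names are illustrative assumptions rather than the paper's exact notation.

```python
import numpy as np

def rls_expert_update(U, R, x, y, h, lam=0.99):
    """Online update for one expert: weighted recursive least squares.

    U: expert parameters (k, d); R: inverse covariance estimate (d, d);
    x: input (d,); y: target (k,); h: posterior weight h_ij of this
    expert for the current sample; lam: forgetting factor in (0, 1].
    """
    err = y - U @ x                           # prediction error y - mu_ij
    Rx = R @ x
    # Matrix inversion lemma keeps R as an inverse covariance estimate.
    R_new = (R - h * np.outer(Rx, Rx) / (lam + h * (x @ Rx))) / lam
    U_new = U + h * np.outer(err, R_new @ x)  # gain direction R_new x
    return U_new, R_new
```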


Results

  • Simulated data of a four-joint robot arm moving in three-dimensional space



Conclusions

  • Introduces a tree-structured architecture for supervised learning

  • Much faster to train than the traditional back-propagation algorithm

  • Can be used for on-line learning


Thank you

Questions?