Supervised learning: Mixture Of Experts (MOE) Network

[Figure: MOE module. The input x feeds three local experts, with outputs P(y|x, Θ1), P(y|x, Θ2), P(y|x, Θ3), and a gating network, with outputs a1(x), a2(x), a3(x). The module output is P(y|x, Φ).]

The objective is to estimate the model parameters that maximize the probability of the training set. For a given input x, the posterior probability of class y under a mixture of K experts is

$$P(y \mid x, \Phi) = \sum_{j=1}^{K} P(y \mid x, \Theta_j)\, a_j(x)$$

where Θj denotes the parameters of expert j and aj(x) is the weight the gating network assigns to that expert.
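The weighted sum above can be sketched in a few lines of NumPy. This is a minimal illustration, not the presenter's implementation: the function name `moe_posterior` and the example numbers are hypothetical, and the gating weights are assumed to be non-negative and to sum to 1.

```python
import numpy as np

def moe_posterior(expert_probs, gate_weights):
    """Return P(y|x,Phi) = sum_j a_j(x) * P(y|x,Theta_j) for one input x.

    expert_probs: shape (K, C); row j is expert j's distribution over C classes.
    gate_weights: shape (K,); the gating weights a_j(x), assumed to sum to 1.
    """
    expert_probs = np.asarray(expert_probs, dtype=float)
    gate_weights = np.asarray(gate_weights, dtype=float)
    # A vector-matrix product performs the weighted sum over the K experts.
    return gate_weights @ expert_probs

# Hypothetical outputs of three experts on a two-class problem.
expert_probs = [[0.9, 0.1],
                [0.2, 0.8],
                [0.5, 0.5]]
gate_weights = [0.5, 0.3, 0.2]
posterior = moe_posterior(expert_probs, gate_weights)
```

Because every expert row and the gating vector each sum to 1, the combined posterior is again a valid probability distribution.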

Each RBF Gaussian kernel can be viewed as a local expert.

[Figure labels: MAXNET, gating network.]

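MOE Classifier
The MAXNET selects the winning class from the gated combination of expert outputs:

$$\omega_{\text{winner}} = \arg\max_{c} \sum_{k} P(E_k \mid x)\, P(\omega_c \mid x, E_k)$$

where P(Ek|x) is the gating weight of expert Ek and P(ωc|x,Ek) is that expert's posterior for class ωc.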
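The MAXNET decision can be sketched for a batch of inputs as follows. This is a hedged illustration: the function name `moe_classify`, the array layout, and the example values are assumptions made here, and the MAXNET is modeled simply as an argmax over the combined class scores.

```python
import numpy as np

def moe_classify(gate_probs, expert_class_probs):
    """Pick the winning class for each input.

    gate_probs: shape (N, K); P(E_k | x_n) for each input n and expert k.
    expert_class_probs: shape (N, K, C); P(w_c | x_n, E_k).
    Returns (winners, combined), where combined has shape (N, C).
    """
    gate_probs = np.asarray(gate_probs, dtype=float)
    expert_class_probs = np.asarray(expert_class_probs, dtype=float)
    # Sum over experts k: combined[n, c] = sum_k gate[n, k] * expert[n, k, c].
    combined = np.einsum("nk,nkc->nc", gate_probs, expert_class_probs)
    # MAXNET step, modeled here as an argmax over classes.
    winners = combined.argmax(axis=1)
    return winners, combined

# Two hypothetical inputs, two experts, three classes.
experts = [[0.2, 0.5, 0.3],
           [0.6, 0.1, 0.3]]
gate_probs = [[0.7, 0.3],
              [0.1, 0.9]]
winners, combined = moe_classify(gate_probs, [experts, experts])
```

The same expert outputs lead to different winners for the two inputs because the gating weights shift trust from the first expert to the second.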

Mixture of Experts
The MOE [Jacobs91] exhibits an explicit relationship with statistical pattern classification methods as well as a close resemblance to fuzzy inference systems. Given a pattern, each expert network estimates the pattern's conditional a posteriori probability on the (adaptively tuned or pre-assigned) feature space. Each local expert network performs multi-way classification over K classes by using either K independent binomial models, each modeling only one class, or one multinomial model for all classes.
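The difference between the two output models can be illustrated with a small sketch. It assumes a logistic (sigmoid) parametrization for the independent binomial models and a softmax parametrization for the multinomial model; the slides do not specify either, and the function names are hypothetical.

```python
import numpy as np

def independent_binomial_outputs(scores):
    """K independent binomial models: one sigmoid per class score."""
    scores = np.asarray(scores, dtype=float)
    return 1.0 / (1.0 + np.exp(-scores))

def multinomial_output(scores):
    """One multinomial model: a softmax over all class scores."""
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical class scores produced by one expert for one pattern.
scores = [2.0, 0.0, -1.0]
binomial = independent_binomial_outputs(scores)
multinomial = multinomial_output(scores)
```

The independent binomial outputs each lie between 0 and 1 but need not sum to 1, whereas the multinomial output is a single distribution over all classes.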

Two Components of MOE
• local experts
• gating network

Local Experts
• The design of modular neural networks hinges upon the choice of local experts.
• Usually, a local expert is adaptively trained to extract a certain local feature particularly relevant to its local decision.
• Sometimes, a local expert can be assigned a predetermined feature space.
• Based on the local feature, a local expert gives its local recommendation.

LBF vs RBF Local Experts
[Figure: an LBF expert (MLP) separates classes with a hyperplane; an RBF expert uses a kernel function.]
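The two basis types can be contrasted with a short sketch. It assumes the usual textbook forms, a linear basis w·x + b and a Gaussian kernel exp(-||x - c||² / (2σ²)); the function names and parameter values are hypothetical.

```python
import numpy as np

def lbf_activation(x, w, b):
    """Linear basis function: signed score relative to the hyperplane w.x + b = 0."""
    return float(np.dot(w, x) + b)

def rbf_activation(x, center, sigma):
    """Gaussian radial basis function: decays with distance from the center."""
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

x = np.array([1.0, 2.0])
lbf_value = lbf_activation(x, w=np.array([0.5, -1.0]), b=0.5)
rbf_at_center = rbf_activation(x, center=[1.0, 2.0], sigma=1.0)
rbf_far_away = rbf_activation(x, center=[5.0, 5.0], sigma=1.0)
```

The linear unit responds to which side of a hyperplane the input falls on, while the Gaussian unit responds only near its center, which is why each Gaussian kernel behaves like a local expert.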

[Figure: Mixture of Experts. Panel labels: Class 1, Class 2.]

[Figure: Mixture of Experts. Panel labels: Expert #1, Expert #2.]

Gating Network
• The gating network computes the weights used for the final weighted decision.
• A probabilistic rule integrates the recommendations from several local experts, taking into account the experts' confidence levels.
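One common way to obtain such weights is a softmax over per-expert scores. The sketch below assumes a linear scoring layer followed by a softmax; the slides do not state which gating function is used, and the names `gating_weights`, `V`, and `c` are hypothetical.

```python
import numpy as np

def gating_weights(x, V, c):
    """Softmax gate: a_j(x) = exp(s_j) / sum_k exp(s_k), with s = V @ x + c.

    x: input vector, shape (D,)
    V: gating weight matrix, shape (K, D), one row per expert
    c: gating bias vector, shape (K,)
    """
    scores = np.asarray(V, dtype=float) @ np.asarray(x, dtype=float) + c
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

# Hypothetical gate for three experts on a two-dimensional input.
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
c = np.zeros(3)
a = gating_weights(np.array([2.0, 0.0]), V, c)
```

The resulting weights are positive and sum to 1, so they can be read as the confidence assigned to each expert for this particular input.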

Both the local experts and the gating network (that is, the experts' confidence levels) of the MOE network are trained with the expectation-maximization (EM) algorithm.
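In the standard EM formulation for mixtures of experts, the E-step computes for each training pair the posterior responsibility of every expert, and the M-step then refits the experts and the gate using those responsibilities as weights. The sketch below shows only the E-step and assumes that formulation; the slides do not give the update equations, and the function name `e_step` and the example values are hypothetical.

```python
import numpy as np

def e_step(gate_weights, expert_likelihoods):
    """Posterior responsibility of each expert for each training pair.

    gate_weights: shape (N, K); a_j(x_n) from the gating network.
    expert_likelihoods: shape (N, K); P(y_n | x_n, Theta_j) for the observed label.
    Returns h with h[n, j] = a_j * P_j / sum_k (a_k * P_k).
    """
    gate_weights = np.asarray(gate_weights, dtype=float)
    expert_likelihoods = np.asarray(expert_likelihoods, dtype=float)
    joint = gate_weights * expert_likelihoods
    # Normalize over the expert axis so each row sums to 1.
    return joint / joint.sum(axis=1, keepdims=True)

# Two hypothetical training pairs, two experts.
gate = [[0.5, 0.5],
        [0.8, 0.2]]
likelihoods = [[0.9, 0.1],
               [0.25, 1.0]]
h = e_step(gate, likelihoods)
```

In the second pair the gate prefers the first expert, but the second expert explains the observed label much better, so the two end up with equal responsibility.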
