

  1. Study on Statistical Machine Learning - Kernel Methods for Intelligent Systems and Their Applications. Mingzhu Lu, Department of ECE, University of Texas at San Antonio. This is joint work with Dr. C. L. Philip Chen, Dr. Long Chen, and Dr. Yufei Huang.

  2. Outline • Motivation • Introduction to Kernel Methods • Kernel Method for Multi-agent System • Multiple Kernel Learning for SVM with GA and PSO • Multiple Kernel Fuzzy c-means for Image Segmentation • Multiple Kernel Gaussian Process for miRNA Target Prediction (in progress) • Conclusions • Future Work

  3. Motivation • Limited by traditional natural resources and pressure from power demand, there is a great need to integrate distributed energy resources (DER) into the existing centralized power system. • As system scale increases, building a stable, reliable, and intelligent power system with learning capability becomes very important. • Existing intelligent systems lack learning capability. • Kernel methods are among the most active areas of machine learning in recent years.

  4. Outline • Motivation • Introduction to Kernel Methods • Kernel Method for Multi-agent System • Multiple Kernel Learning for SVM with GA and PSO • Multiple Kernel Fuzzy c-means for Image Segmentation • Multiple Kernel Gaussian Process for miRNA Target Prediction • Conclusions • Future Work

  5. Kernel Methods • Kernel methods are machine learning methods employing positive definite kernels. • Kernel trick: a mapping function φ, often nonlinear, from the input space to a high-dimensional feature space, such that inner products in the feature space can be computed through the kernel k(x, y) = ⟨φ(x), φ(y)⟩ without evaluating φ explicitly. Input Space → Feature Space.

  6. Commonly Used Kernel Methods and Functions • Well-known kernel methods: • Support Vector Machine (SVM) • Gaussian Process • Principal component analysis … any algorithm that involves the "kernel trick". • Commonly used kernels: • Polynomial • Radial basis function (RBF) • Two-layer neural network (sigmoid) …
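To make the kernel trick on these slides concrete, here is a minimal Python sketch (not from the original presentation) of two of the listed kernels and the Gram matrix they induce; the parameter values (degree, coef0, gamma) are illustrative assumptions.

```python
import numpy as np

def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel k(x, y) = (x.y + coef0)^degree."""
    return (np.dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function (Gaussian) kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def gram_matrix(X, kernel):
    """Build the Gram (kernel) matrix K with K[i, j] = kernel(X[i], X[j])."""
    n = X.shape[0]
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

# Inner products in the feature space are obtained without ever forming phi(x).
X = np.random.rand(5, 4)
K_rbf = gram_matrix(X, rbf_kernel)
K_poly = gram_matrix(X, polynomial_kernel)
```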

  7. Outline • Motivation • Introduction to Kernel Methods • Kernel Method for Multi-agent System • Multiple Kernel Learning for SVM with GA and PSO • Multiple Kernel Fuzzy c-means for Image Segmentation • Multiple Kernel Gaussian Process for miRNA Target Prediction • Conclusions • Future Work

  8. Distributed Power System The structure of the hybrid distributed power system (DPS)

  9. Architecture for LADA-DPS (LADA-DPS: LeArning-Driven multi-Agent based Distributed Power System). Figure panels: overall architecture for small-scale LADA-DPS; physical-layer architecture for medium-scale LADA-DPS; physical-layer architecture for large-scale LADA-DPS.

  10. LADA-DPS (Cont.) Learning Driven Single Agent Control of LADA-DPS by FIPA Protocol and JADE Platform

  11. Intelligent Fault Diagnosis Agent: Introduction to Linear SVM. Given a training set {(x_i, y_i), i = 1, …, N}, where x_i ∈ R^d and y_i ∈ {-1, +1}, the optimization problem is: minimize (1/2)||w||^2 subject to y_i (w·x_i + b) ≥ 1 for all i. The optimal hyper-plane is w·x + b = 0.

  12. Nonlinear SVM For a dataset that is not linearly separable, slack variables ξ_i are introduced and the optimization problem becomes: minimize (1/2)||w||^2 + C Σ_i ξ_i subject to y_i (w·x_i + b) ≥ 1 - ξ_i and ξ_i ≥ 0. Using Lagrange multipliers, it is converted into its dual problem, which is easier to solve: maximize Σ_i α_i - (1/2) Σ_i Σ_j α_i α_j y_i y_j (x_i·x_j) subject to 0 ≤ α_i ≤ C and Σ_i α_i y_i = 0. After applying the kernel trick, x_i·x_j is replaced by k(x_i, x_j), and the optimal hyper-plane of the SVM becomes f(x) = Σ_i α_i y_i k(x_i, x) + b = 0.
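As an illustration of the linear versus kernelized SVM described above, the following sketch uses scikit-learn's SVC on synthetic data; it is not the dissertation's experimental setup, and the dataset and hyper-parameters (C, gamma) are assumptions.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy data; the slides' transformer fault data are not reproduced here.
X, y = datasets.make_classification(n_samples=200, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Linear SVM versus a kernelized SVM (RBF kernel via the kernel trick).
linear_svm = SVC(kernel="linear", C=1.0).fit(X_tr, y_tr)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)

print("linear SVM accuracy:", linear_svm.score(X_te, y_te))
print("RBF-kernel SVM accuracy:", rbf_svm.score(X_te, y_te))
```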

  13. Multi-class lArge marGin lEarning SVM (MAGE-SVM) Suppose a dataset {(x_i, y_i), i = 1, …, N}, where x_i ∈ R^d, y_i ∈ {1, …, m}, and g is a function mapping the m class labels to {-1, +1}, i.e. a grouping of the classes into two super-classes. Given such a function g, train a binary SVM on the relabeled data and compute its margin, denoted by γ(g). The large margin learning of the SVM is then formulated as finding g* = argmax_g γ(g). (1) MAGE-SVM algorithm: given an m-class dataset of N samples: • Step 1: Solve Eq. (1) to obtain the separating function, and split the dataset into two subsets U_p and U_n based on the separating hyper-plane with the maximum margin among classes. • Step 2: Check whether both subsets U_p and U_n contain only one class; if yes, stop; otherwise, go to Step 3. • Step 3: If a subset is still a multi-class problem, treat U_p the same as the initial dataset and go to Step 1; deal with U_n similarly.
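The recursive splitting in Steps 1-3 can be sketched as follows. This is only an illustrative reading of the algorithm, assuming Eq. (1) is approximated by enumerating class-to-{-1, +1} groupings, training a linear SVM for each, and keeping the grouping with the largest margin (1/||w||); the dissertation's exact formulation may differ.

```python
import itertools
import numpy as np
from sklearn.svm import SVC

def best_binary_split(X, y):
    """Try every grouping of the classes into two non-empty sets, train a linear
    SVM for each grouping, and keep the one whose hyper-plane has the largest margin."""
    classes = np.unique(y)
    best = None
    for r in range(1, len(classes) // 2 + 1):
        for pos in itertools.combinations(classes, r):
            labels = np.where(np.isin(y, pos), 1, -1)
            svm = SVC(kernel="linear", C=1e3).fit(X, labels)
            margin = 1.0 / np.linalg.norm(svm.coef_)   # margin ~ 1 / ||w||
            if best is None or margin > best[0]:
                best = (margin, set(pos), svm)
    return best[1], best[2]

def mage_svm_train(X, y):
    """Recursively split the classes until each leaf holds a single class."""
    classes = np.unique(y)
    if len(classes) == 1:
        return {"label": classes[0]}                   # leaf node
    pos_classes, svm = best_binary_split(X, y)
    pos_mask = np.isin(y, list(pos_classes))
    return {"svm": svm,
            "pos": mage_svm_train(X[pos_mask], y[pos_mask]),    # subset U_p
            "neg": mage_svm_train(X[~pos_mask], y[~pos_mask])}  # subset U_n

def mage_svm_predict(node, x):
    """Walk the tree of binary SVMs until a leaf class is reached."""
    if "label" in node:
        return node["label"]
    branch = "pos" if node["svm"].predict(x.reshape(1, -1))[0] == 1 else "neg"
    return mage_svm_predict(node[branch], x)
```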

  14. Intelligent Fault Diagnosis Agent for Transformer Table. Gas content data of the transformer (unit: ppm). Fig. MAGE-SVM for the fault diagnosis agent of a power transformer. Table. Mean of 10-fold cross-validation accuracy on the testing data. *C1, C2, C3, C4, and C5 represent normal, low-energy discharge, high-energy discharge, low-temperature heating, and high-temperature superheating, respectively.

  15. Outline • Motivation • Introduction to Kernel Methods • Kernel Method for Multi-agent System • Multiple Kernel Learning for SVM with GA and PSO • Multiple Kernel Fuzzy c-means for Image Segmentation • Multiple Kernel Gaussian Process for miRNA Target Prediction • Conclusions • Future Work

  16. Why Multiple Kernel Learning? • A multi-agent system is composed of homogeneous or heterogeneous agents. • If data are obtained from different sources or by different methods, they may have different characteristics. • Take an image for example: the intensity of a pixel is read directly from the image itself, but complicated texture information is often obtained by wavelet filtering of the image. • A traditional single kernel is not sufficient for such data fusion. This motivates us to investigate multiple kernels to deal with the data.

  17. Multiple Kernels A multiple kernel is a combination of kernel functions through different operators and must satisfy Mercer's theorem. Its general representation is k_com = k_1^{n_1} ⊗ k_2^{n_2} ⊗ … ⊗ k_M^{n_M}, where k_i denotes the i-th kernel function, n_i is the exponent of the i-th kernel function, and ⊗ represents the operator between two kernel functions, which can be addition or multiplication. Without loss of generality, we study the linear multiple kernel k_com = Σ_i w_i k_i, where w_i is the weight of the i-th kernel.
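A minimal sketch of the linear multiple kernel k_com = Σ_i w_i k_i described above; the component kernels, their parameters, and the weights are illustrative assumptions.

```python
import numpy as np

def combined_kernel(kernels, weights):
    """Linear multiple kernel: k_com(x, y) = sum_i w_i * k_i(x, y).
    Non-negative weights keep k_com a valid Mercer kernel."""
    def k_com(x, y):
        return sum(w * k(x, y) for w, k in zip(weights, kernels))
    return k_com

# Illustrative component kernels.
rbf = lambda x, y, g=0.5: np.exp(-g * np.sum((x - y) ** 2))
poly = lambda x, y, d=2, c=1.0: (np.dot(x, y) + c) ** d

k_com = combined_kernel([rbf, poly], weights=[0.7, 0.3])
x, y = np.random.rand(4), np.random.rand(4)
print(k_com(x, y))
```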

  18. Multiple Kernel Learning for SVM • Traditional multiple kernel learning is based on the classification accuracy on the training dataset, which suffers from overfitting. • To improve the performance of the SVM, we instead learn a multiple kernel on the training set such that it attains the maximum margin. This becomes an optimization problem over the kernel weights.

  19. Genetic Algorithm Solution
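The GA flowchart is not reproduced in this transcript, so the sketch below only illustrates the general idea of the two previous slides: chromosomes encode kernel weights, and fitness is the margin of an SVM trained on the combined (precomputed) Gram matrix. The GA operators, population size, and the use of scikit-learn's precomputed-kernel SVC are assumptions, not the dissertation's exact procedure.

```python
import numpy as np
from sklearn.svm import SVC

def margin_fitness(weights, base_grams, y):
    """Fitness of one chromosome: the margin 1/||w|| of an SVM trained on the
    weighted combination of precomputed Gram matrices."""
    K = sum(w * G for w, G in zip(weights, base_grams))
    svm = SVC(kernel="precomputed", C=10.0).fit(K, y)
    sv = svm.support_
    alpha_y = svm.dual_coef_.ravel()                 # alpha_i * y_i for support vectors
    w_norm_sq = alpha_y @ K[np.ix_(sv, sv)] @ alpha_y
    return 1.0 / np.sqrt(w_norm_sq)

def ga_optimize_weights(base_grams, y, pop_size=20, generations=30,
                        rng=np.random.default_rng(0)):
    """A minimal GA: chromosomes are kernel-weight vectors on the simplex;
    selection keeps the fitter half, offspring come from crossover + Gaussian mutation."""
    n_k = len(base_grams)
    pop = rng.dirichlet(np.ones(n_k), size=pop_size)
    for _ in range(generations):
        fitness = np.array([margin_fitness(c, base_grams, y) for c in pop])
        parents = pop[np.argsort(fitness)[-pop_size // 2:]]      # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            child = np.abs((a + b) / 2 + rng.normal(0, 0.05, n_k))  # crossover + mutation
            children.append(child / child.sum())                    # renormalize weights
        pop = np.vstack([parents, children])
    fitness = np.array([margin_fitness(c, base_grams, y) for c in pop])
    return pop[np.argmax(fitness)]

# Usage sketch: three base Gram matrices on toy binary-labeled data.
X = np.random.rand(60, 5)
y = np.where(X.sum(axis=1) > 2.5, 1, -1)
rbf = lambda g: np.exp(-g * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
base_grams = [rbf(0.1), rbf(1.0), (X @ X.T + 1.0) ** 2]
best_weights = ga_optimize_weights(base_grams, y, pop_size=10, generations=10)
```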

  20. Simulation on UCI Database Note: The table shows the mean testing accuracy over 10 runs. The single kernel functions are the polynomial kernel, the Gaussian kernel, and the heavy-tailed RBF kernel, respectively.

  21. Particle Swarm Optimization Solution
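Similarly, a minimal PSO sketch for the same weight-optimization problem; the inertia and acceleration coefficients (w, c1, c2) are typical textbook values, not necessarily those used in the presented work, and the fitness function is assumed to be the margin-based fitness from the GA sketch above.

```python
import numpy as np

def pso_optimize_weights(fitness_fn, n_dims, n_particles=20, iters=30,
                         w=0.7, c1=1.5, c2=1.5, rng=np.random.default_rng(1)):
    """A minimal PSO: each particle is a candidate kernel-weight vector,
    moved by its own best position and the swarm's best position."""
    pos = rng.random((n_particles, n_dims))
    pos /= pos.sum(axis=1, keepdims=True)            # start on the weight simplex
    vel = np.zeros_like(pos)
    pbest, pbest_fit = pos.copy(), np.array([fitness_fn(p) for p in pos])
    gbest = pbest[np.argmax(pbest_fit)]
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.abs(pos + vel)
        pos /= pos.sum(axis=1, keepdims=True)        # keep weights non-negative, summing to 1
        fit = np.array([fitness_fn(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        gbest = pbest[np.argmax(pbest_fit)]
    return gbest

# Usage with the margin fitness sketched in the GA example (assumed available):
# best_w = pso_optimize_weights(lambda w_: margin_fitness(w_, base_grams, y), n_dims=3)
```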

  22. Simulation on UCI Database

  23. Outline • Motivation • Introduction to Kernel Methods • Kernel Method for Multi-agent System • Multiple Kernel Learning for SVM with GA and PSO • Multiple Kernel Fuzzy c-means for Image Segmentation • Multiple Kernel Gaussian Process for miRNA Target Prediction • Conclusions • Future Work

  24. Fuzzy C-means (FCM) • Given a dataset of size n, X = {x_1, …, x_n} with x_j ∈ X. • FCM groups X into c clusters by minimizing the weighted distance between the data and the cluster centers, defined as Q = Σ_{i=1}^{c} Σ_{j=1}^{n} u_ij^m ||x_j - o_i||^2, where u_ij is the membership of data x_j belonging to cluster i and m > 1 is the fuzzifier. • FCM iteratively updates the prototypes [o_1, o_2, …, o_c] and the memberships u_ij through o_i = Σ_j u_ij^m x_j / Σ_j u_ij^m and u_ij = (1/||x_j - o_i||^2)^{1/(m-1)} / Σ_{k=1}^{c} (1/||x_j - o_k||^2)^{1/(m-1)}. • The iteration stops when the difference between the old membership values and the updated ones is small enough. • Finally, based on the resulting u_ij, data x_j is assigned to the cluster whose membership value is the largest among u_ij (i = 1, …, c).
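A compact Python sketch (not from the presentation) of the FCM iteration summarized above; the fuzzifier m = 2, the tolerance, and the toy data are illustrative choices.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, rng=np.random.default_rng(0)):
    """Plain fuzzy c-means: alternate between updating the centers o_i and the
    memberships u_ij until the memberships stop changing."""
    n = X.shape[0]
    U = rng.dirichlet(np.ones(c), size=n).T          # U[i, j]: membership of x_j in cluster i
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Squared distances d[i, j] = ||x_j - o_i||^2 (small epsilon avoids division by zero).
        d = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2) + 1e-12
        U_new = (1.0 / d) ** (1.0 / (m - 1))
        U_new /= U_new.sum(axis=0, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Hard assignment: each point goes to the cluster with the largest membership.
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 4])
centers, U = fcm(X, c=2)
labels = U.argmax(axis=0)
```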

  25. Kernel Fuzzy C-Means • Both the data and the cluster centers are mapped from the original space to a new space by φ, and we do not know what φ exactly is. But because ||φ(x_j) - φ(o_i)||^2 = k(x_j, x_j) + k(o_i, o_i) - 2 k(x_j, o_i), we can reformulate the objective function as Q = Σ_{i=1}^{c} Σ_{j=1}^{n} u_ij^m [k(x_j, x_j) + k(o_i, o_i) - 2 k(x_j, o_i)]. • Because we know k, the new Q has an explicit formulation and can be optimized directly. The original problem with the mapping φ cannot be solved; through the "kernel trick" it becomes a problem involving only the kernel function k, which can be solved.
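The kernel-induced distance above is the only change KFCM makes to the FCM objective. A tiny sketch, assuming a Gaussian kernel and prototypes kept in the input space as on this slide:

```python
import numpy as np

def kernel_distance_sq(x, o, k):
    """Squared feature-space distance via the kernel trick:
    ||phi(x) - phi(o)||^2 = k(x, x) + k(o, o) - 2 k(x, o)."""
    return k(x, x) + k(o, o) - 2.0 * k(x, o)

# With a Gaussian kernel k(x, x) = 1, so the distance simplifies to 2 * (1 - k(x, o)).
gaussian = lambda x, y, s=1.0: np.exp(-np.sum((x - y) ** 2) / (2 * s ** 2))
x, o = np.random.rand(3), np.random.rand(3)
print(kernel_distance_sq(x, o, gaussian), 2 * (1 - gaussian(x, o)))
```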

  26. Multiple Kernel Fuzzy C-means (MKFCM) Without loss of generality, the combined kernel is k_com = w_1^b k_1 + w_2^b k_2 + … + w_n^b k_n. The objective function becomes the KFCM objective with the single kernel k replaced by the combined kernel k_com. The Lagrange multiplier method is then used to update the coefficients w_1, w_2, …, w_n automatically.

  27. MKFCM for Image Segmentation • We mathematically proved that the two most successful kernel fuzzy c-means variants are special cases of MKFCM. • They use Gaussian kernels to combine the pixel intensity and spatial information (the mean or median of the neighborhood pixels); one uses the traditional objective function and the other uses a modified version. • The objective function of MKFCM uses the combined multiple kernel in place of the single Gaussian kernel and covers both variants as special cases.

  28. Simulation on a two-textured image • (a) The two-textured image. • (b) The segmentation result of AKFCM_meanf (SA=0.716). • (c) The segmentation result of AKFCM_medianf (SA=0.748). • (d) The segmentation result of DKFCM_meanf (SA=0.715). • (e) The segmentation result of DKFCM_medianf (SA=0.747). • (f) The segmentation result of MKFCM-K, third variant (SA=0.753). • (g) The segmentation result of LMKFCM (SA=0.853). • (h) The segmentation result of MKFCM-K, kcom=k1k2k3 (SA=0.723). • (i) The segmentation result of MKFCM-K, kcom=k1+k2+k3 (SA=0.730). • (j) The segmentation result of KFCM, single intensity kernel (SA=0.720). • (k) The segmentation result of KFCM, single spatial kernel (SA=0.709). • (l) The segmentation result of KFCM, texture kernel (SA=0.763).

  29. Simulation on MRI (a) MR image and its correct segmentation. From left to right: the integrated MR image, the CSF, the GM, and the WM. (b) Segmentation results of AKFCM_meanf. (c) Segmentation results of DKFCM_meanf.

  30. Simulation on MRI (Cont.) (d) Segmentation results of MKFCM-K_meanf (first variant). (e) Segmentation results of MKFCM-K_poly. (f) Segmentation results of LMKFCM.

  31. Simulation on MRI (Cont.) Table. Segmentation accuracy of different methods on the MRI brain image. • Fig. Segmentation results of different methods on a PET dog lung image: (a) PET dog lung image. (b) Segmentation result of AKFCM_meanf. (c) Segmentation result of DKFCM_meanf. (d) Segmentation result of LMKFCM. (e) Segmentation result of MKFCM_poly.

  32. Outline • Motivation • Introduction to Kernel Methods • Kernel Method for Multi-agent System • Multiple Kernel Learning for SVM with GA and PSO • Multiple Kernel Fuzzy c-means for Image Segmentation • Multiple Kernel Gaussian Process for miRNA Target Prediction • Conclusions • Future Work

  33. Background of microRNA • Single-stranded RNA of about 21-23 nucleotides in length. • Known to regulate more than 20% of human genes. • Each miRNA is thought to regulate a few hundred targets. • Plays an important role at the post-transcriptional stage in cell development, stress responses, viral infection, and cancer. • Regulatory modes: primary: inhibit translation; secondary: degrade mRNA. • Observed effects: mRNA binding site and/or protein down-regulation; mRNA down-regulation.

  34. New Way to View the Binding Status Fig. Match (bind) ratio for miRNA-122 on Positive Luciferase data Fig. Match (bind) ratio for miRNA-122 on Seed Mapping (proteomic data) Fig. Match status for miRNA-122 and mRNA pairs on Positive Luciferase data Fig. Match status for miRNA-122 and mRNA pairs on proteomic data

  35. BCmicrO Algorithm for miRNA Target Prediction • Existing algorithms have poor sensitivity and specificity. • There is poor agreement between the results of different algorithms, yet they achieve similar performance. • Different algorithms rely on different mechanisms to make predictions, each with its own advantages. • For a gene, given the scores of TargetScan, miRanda, and PicTar, provide a final score that represents the probability that it is a target of the miRNA. Graphical model of BCmicrO.
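For illustration only: the sketch below is not the BCmicrO graphical model, just a simple probabilistic stand-in showing the idea of fusing the TargetScan, miRanda, and PicTar scores into a single posterior probability of being a target. The training data here are synthetic and the naive-Bayes fusion is an assumption.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical training data: rows are genes, columns are scores from
# TargetScan, miRanda, and PicTar; labels mark validated targets (1) or not (0).
rng = np.random.default_rng(0)
scores = rng.random((200, 3))
labels = (scores.mean(axis=1) + 0.1 * rng.standard_normal(200) > 0.5).astype(int)

# Naive-Bayes fusion: treat each algorithm's score as conditionally independent
# given the target status and output a posterior probability of being a target.
fusion = GaussianNB().fit(scores, labels)
new_gene_scores = np.array([[0.8, 0.6, 0.7]])
print("P(target | scores) =", fusion.predict_proba(new_gene_scores)[0, 1])
```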

  36. BCmicrO Performance ROC curves of different miRNA target prediction algorithms. Cumulative sum of protein fold change for different numbers of top-ranked predictions of miR-124.

  37. Multiple Kernel Gaussian Processes for miRNA Target Prediction Data sources: sequence data (ATCGGGCCTT…), CLIP-Seq data (binding position information), and expression data. Base kernels: string kernel, polynomial kernel, and RBF kernel. These are combined to construct the multiple-kernel mean and covariance functions for a Gaussian Process, which models the mRNA and miRNA target relationships.
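A minimal sketch of a Gaussian Process whose covariance is a weighted sum of base kernels, in the spirit of this slide. A string kernel for the sequence data is not implemented here; an RBF and a linear kernel stand in, and the weights, noise level, and toy data are assumptions.

```python
import numpy as np

def rbf_k(A, B, length=1.0):
    """RBF covariance between the rows of A and B."""
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d / length ** 2)

def linear_k(A, B):
    """Linear covariance between the rows of A and B."""
    return A @ B.T

def multi_kernel(A, B, weights=(0.6, 0.4)):
    """Weighted sum of base kernels used as the GP covariance."""
    return weights[0] * rbf_k(A, B) + weights[1] * linear_k(A, B)

def gp_predict(X_train, y_train, X_test, noise=1e-2):
    """Standard GP regression with the combined covariance function."""
    K = multi_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = multi_kernel(X_test, X_train)
    K_ss = multi_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v
    return mean, cov

# Toy usage; real inputs would be feature vectors built from sequence,
# expression, and CLIP-Seq data, each feeding its own base kernel.
X_tr = np.random.rand(30, 4)
y_tr = np.sin(X_tr.sum(axis=1))
mean, cov = gp_predict(X_tr, y_tr, np.random.rand(5, 4))
```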

  38. Outline • Motivation • Introduction to Kernel Methods • Kernel Method for Multi-agent System • Multiple Kernel Learning for SVM with GA and PSO • Multiple Kernel Fuzzy c-means for Image Segmentation • Multiple Kernel Gaussian Process for miRNA Target Prediction • Conclusions • Future Work

  39. Conclusions • Presented the framework of LADA-DPS, which includes MAGE-SVM for fault diagnosis. • Investigated multiple kernels for SVM based on large margin learning to improve its generalization capability. • Developed multiple kernel fuzzy c-means for image segmentation. • Created a new way to view the miRNA binding status. • Proposed a multiple kernel Gaussian process for miRNA target prediction (in progress).

  40. Future Work • Implement multiple intelligent agents and investigate better ways for them to cooperate to finish a task (swarm intelligence). • Finish the multiple kernel GP and also try some semi-supervised learning methods on the sequencing data. • Current multiple kernel learning relies heavily on prior experience when choosing kernels for the data; how to automatically design kernels, including kernel types and parameters, is challenging and worth exploring. Expected graduation time: Aug. 2011

  41. Acknowledgements This work is supported by NASA grant NNC04GB35G, NSF grant CCF-0546345, and the San Antonio Life Sciences Institute (SALSI). Dissertation Committee: Dr. C. L. Philip Chen*, Dr. Yufei Huang, Dr. David Akopian, Dr. Keying Ye. Lab Mates: Dr. Long Chen, Mr. Dong Yue. (*Now at the University of Macau)

  42. Published Papers • Mingzhu Lu, C. L. Philip Chen, and Long Chen, "The Design and Reliability Assessment of Learning Model Driven Multi-Agent based Distributed Power System", IEEE Transactions on Systems, Man, and Cybernetics: Part C (submitted). • Long Chen, C. L. Philip Chen, and Mingzhu Lu, "A Multiple-Kernel Fuzzy C-means Algorithm for Image Segmentation", IEEE Transactions on Systems, Man, and Cybernetics: Part B, accepted, to appear. • Daniel R. Boutz, Patrick Colins, Uthra Suresh, Mingzhu Lu, Yufei Huang, et al., "A two-tiered approach identifies a network of cancer and liver disease-related genes regulated by miR-122", The Journal of Biological Chemistry, accepted, to appear. • Jia Meng, Mingzhu Lu, Yidong Chen, et al., "Robust inference of the context specific structure and temporal dynamics of gene regulatory network", BMC Genomics 2010, 11(Suppl 3):S11. • Mingzhu Lu, Long Chen, and C. L. Philip Chen, "Sensitivity Analysis of Parametric t-norm and s-norm based Fuzzy Classification System", IEEE Conference on Systems, Man, and Cybernetics, Oct. 2010, Istanbul, Turkey. • Long Chen, Mingzhu Lu, and C. L. Philip Chen, "Multiple Kernel Fuzzy C-means for Image Segmentation", IEEE Conference on Systems, Man, and Cybernetics, Oct. 10-13, 2010, Istanbul, Turkey. • Dong Yue, Hui Liu, Mingzhu Lu, C. L. Philip Chen, Yidong Chen, and Yufei Huang, "A Bayesian Decision Fusion Approach for miRNA Target Prediction", ACM International Conference on Bioinformatics and Computational Biology, Aug. 2010, NY, USA. • Mingzhu Lu and C. L. Philip Chen, "Optimization of Multiple Kernels for SVM by Genetic Algorithm based on Large Margin Learning" (poster), CRA-W Graduate Cohort Workshop, Apr. 2010, Bellevue, WA, USA. • Mingzhu Lu and C. L. Philip Chen, "The Design of Multi-agent based Distributed Energy System", IEEE Conference on Systems, Man, and Cybernetics, Oct. 11-14, 2009, San Antonio, TX, USA, pp. 2001-2006. • Mingzhu Lu, C. L. Philip Chen, Jianbing Huo, et al., "Multi-Stage Decision Tree based on Inter-class and Inner-class Margin of SVM", IEEE Conference on Systems, Man, and Cybernetics, Oct. 2009, San Antonio, TX, USA, pp. 1875-1880. • Mingzhu Lu, C. L. Philip Chen, and Jianbing Huo, "Optimization of Combined Kernel Function for SVM by Particle Swarm Optimization", International Conference on Machine Learning and Cybernetics, July 2009, Baoding, China, pp. 1160-1166. • Mingzhu Lu, C. L. Philip Chen, Jianbing Huo, et al., "Optimization of Combined Kernel Function for SVM based on Large Margin Learning Theory", IEEE Conference on Systems, Man, and Cybernetics, 2008, Singapore, pp. 353-358. • Xizhao Wang, Mingzhu Lu, and Jianbing Huo, "Fault Diagnosis of Power Transformer Based on Large Margin Learning Classifier", International Conference on Machine Learning and Cybernetics, Dalian, China, Aug. 2006, Vol. 5, pp. 2886-2891.

  43. Thank you for your attention! Comments and Suggestions?
