
Constraint Reasoning and Kernel Clustering for Pattern Decomposition With Scaling



Presentation Transcript


  1. Constraint Reasoning and Kernel Clustering for Pattern Decomposition With Scaling Ronan LeBras (Computer Science), Theodoros Damoulas (Computer Science), Ashish Sabharwal (Computer Science), Carla P. Gomes (Computer Science), John M. Gregoire (Materials Science / Physics), Bruce van Dover (Materials Science / Physics) Sept 15, 2011, CP’11

  2. Motivation Cornell Fuel Cell Institute Mission: develop new materials for fuel cells. • An electrocatalyst must: • Be electronically conducting • Facilitate both reactions • Platinum is the best known metal to fulfill that role, but: • The reaction rate is still considered slow (causing energy loss) • Platinum is fairly costly, intolerant to fuel contaminants, and has a short lifetime. Goal: Find an intermetallic compound that is a better catalyst than Pt. Ronan Le Bras - CP’11, Sept 15, 2011

  3. Motivation • Recipe for finding alternatives to Platinum • In a vacuum chamber, place a silicon wafer. • Add three metals. • Mix until smooth, using three sputter guns. • Bake for 2 hours at 650ºC. • Deliberately inhomogeneous composition on Si wafer • Atoms are intimately mixed (38% Ta, 45% Rh, 17% Pd) [Figure: Ta-Rh-Pd composition spread; source: Pyrotope, Sebastien Merkel] Ronan Le Bras - CP’11, Sept 15, 2011

  4. Motivation • Identifying crystal structure using X-Ray Diffraction at CHESS • XRD pattern characterizes the underlying crystal fairly well • Expensive experimentation: Bruce van Dover’s research team has access to the facility one week every year. [Figure: Ta-Rh-Pd composition spread, sample at (38% Ta, 45% Rh, 17% Pd); source: Pyrotope, Sebastien Merkel] Ronan Le Bras - CP’11, Sept 15, 2011

  5. Motivation [Figure: Ta-Rh-Pd composition triangle with pure-phase regions α, β, γ, δ] Ronan Le Bras - CP’11, Sept 15, 2011

  6. Motivation [Figure: the same diagram with mixed-phase regions α+β and γ+δ added] Ronan Le Bras - CP’11, Sept 15, 2011

  7. Motivation [Figure: Ta-Rh-Pd composition triangle with phase regions α, β, γ, δ] Ronan Le Bras - CP’11, Sept 15, 2011

  8. Motivation [Figure: sample points Pi and Pj marked in and around the α+β mixed region] Ronan Le Bras - CP’11, Sept 15, 2011

  9. Motivation [Figure: sample points Pi, Pj, Pk marked across the α, α+β and β regions] Ronan Le Bras - CP’11, Sept 15, 2011

  10. Motivation Additional physical characteristics: • Peaks shift by up to 15% within a region • Phase connectivity • Mixtures of at most 3 phases • Small peaks might be discriminative • Peak locations matter more than peak intensities. INPUT: XRD patterns sampled across the composition spread (e.g., Al-Si-Fe), covering pure-phase and mixed-phase regions. OUTPUT: m phase regions (k pure regions, m−k mixed regions) and the XRD pattern characterizing each pure phase. Ronan Le Bras - CP’11, Sept 15, 2011

  11. Motivation [Figure 1: Phase regions of Ta-Rh-Pd] [Figure 2: Fluorescence activity of Ta-Rh-Pd] Ronan Le Bras - CP’11, Sept 15, 2011

  12. Outline • Motivation • Problem Definition • Abstraction • Hardness • CP Model • Kernel-based Clustering • Bridging CP and Machine learning • Conclusion and Future work Ronan Le Bras - CP’11, Sept 15, 2011

  13. Problem Abstraction: Pattern Decomposition with Scaling [Figure: input (observed patterns at sample points) and output (basis patterns and their scaled combinations)] Ronan Le Bras - CP’11, Sept 15, 2011

  14. Problem Abstraction: Pattern Decomposition with Scaling [Figure: input with five sample points v1–v5] Ronan Le Bras - CP’11, Sept 15, 2011

  15. Problem Abstraction: Pattern Decomposition with Scaling [Figure: observed patterns P1–P5 at sample points v1–v5] Ronan Le Bras - CP’11, Sept 15, 2011

  16. Problem Abstraction: Pattern Decomposition with Scaling [Figure: the same input with matrix M and parameters K = 2, δ = 1.5] Ronan Le Bras - CP’11, Sept 15, 2011

  17. Problem Abstraction: Pattern Decomposition with Scaling [Figure: output basis patterns B1 and B2 (K = 2, δ = 1.5)] Ronan Le Bras - CP’11, Sept 15, 2011

  18. Problem Abstraction: Pattern Decomposition with Scaling [Figure: scaling factors of B1 and B2 at each sample point, e.g. s11 = 1.00, s21 = 0.68 at v1; s12 = 0.95, s22 = 0.78 at v2; …; s15 = 0, s25 = 1.00 at v5 (K = 2, δ = 1.5)] Ronan Le Bras - CP’11, Sept 15, 2011
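To make the abstraction concrete, here is a minimal sketch, with hypothetical peak positions and matching tolerance (not data from the slides), of how one sample's decomposition can be represented and checked: the observed pattern should be the union of a few basis patterns, each multiplied by a per-sample scaling factor s_ki. The exact role of the shift bound δ is omitted from this sketch.

```python
# A minimal sketch (hypothetical peak positions and tolerance) of one sample's
# decomposition: observed peaks are the union of scaled copies of basis patterns.
TOL = 1e-6    # peak-matching tolerance (hypothetical)

basis = {
    "B1": [10.0, 20.0, 31.0],   # hypothetical peak positions of basis pattern B1
    "B2": [14.0, 23.0],         # hypothetical peak positions of basis pattern B2
}

# Decomposition of sample v2: (basis pattern, scaling factor) pairs, as on the slide.
decomposition = [("B1", 0.95), ("B2", 0.78)]

# Observed pattern at v2: union of the scaled copies of the basis peaks.
observed = sorted(s * q for b, s in decomposition for q in basis[b])

def explains(peaks, parts):
    """True iff the scaled basis peaks in `parts` match `peaks` one-to-one."""
    expected = sorted(s * q for b, s in parts for q in basis[b])
    return (len(peaks) == len(expected)
            and all(abs(p - q) <= TOL for p, q in zip(sorted(peaks), expected)))

print(explains(observed, decomposition))   # True
```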

  19. Problem Hardness Under the assumptions that each Bk appears by itself in some vi and that there is no experimental noise, the problem can be solved* in polynomial time. With only the no-noise assumption, the problem becomes NP-hard (reduction from the “Set Basis” problem). Assumption used in this work: experimental noise in the form of missing elements in Pi. [Figure: the running example with K = 2, δ = 1.5] Ronan Le Bras - CP’11, Sept 15, 2011

  20. Outline • Motivation • Problem Definition • CP Model • Model • Experimental Results • Kernel-based Clustering • Bridging CP and Machine learning • Conclusion and Future work Ronan Le Bras - CP’11, Sept 15, 2011

  21. CP Model Ronan Le Bras - CP’11, Sept 15, 2011 Advantage: Captures physical properties and relies on peak location rather than peak height. Drawback: Does not scale to realistic instances; propagation is poor in the presence of experimental noise.

  22. CP Model (continued) Ronan Le Bras - CP’11, Sept 15, 2011
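To give a flavour of such a model, here is a minimal sketch written with Google OR-Tools CP-SAT (not the solver or the full model from the paper) of two of the physical constraints the slides mention: at most three phases may coexist at a sample point, and a weak neighbour-based proxy for phase connectivity. The point count, neighbour graph, and number of phases K are assumptions made for the example, and the peak-matching constraints at the heart of the real model are omitted.

```python
# Sketch of two physical constraints with OR-Tools CP-SAT (hypothetical instance).
from ortools.sat.python import cp_model

N_POINTS, K = 6, 4
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

model = cp_model.CpModel()
# present[i][k] == 1 iff phase k occurs at sample point i.
present = [[model.NewBoolVar(f"p_{i}_{k}") for k in range(K)] for i in range(N_POINTS)]

for i in range(N_POINTS):
    # Mixtures of at most 3 phases at any sample point.
    model.Add(sum(present[i]) <= 3)
    for k in range(K):
        # Connectivity proxy: a phase present at point i must also be present at
        # one of its neighbours (the real model enforces connected regions).
        model.AddBoolOr([present[j][k] for j in neighbours[i]]).OnlyEnforceIf(present[i][k])

# Ask for each phase to appear somewhere, so the toy instance is not trivially empty.
for k in range(K):
    model.AddBoolOr([present[i][k] for i in range(N_POINTS)])

solver = cp_model.CpSolver()
status = solver.Solve(model)
print(status in (cp_model.OPTIMAL, cp_model.FEASIBLE))   # expect True
```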

  23. CP Model – Experimental Results For realistic instances, K’ = 6 and N ≈ 218... Ronan Le Bras - CP’11, Sept 15, 2011

  24. Outline • Motivation • Problem Definition • CP Model • Kernel-based Clustering • Bridging CP and Machine learning • Conclusion and Future work Ronan Le Bras - CP’11, Sept 15, 2011

  25. Kernel-based Clustering Set of features X; baseline similarity matrices from the linear kernel [X·XT]. Better approach: take shifts into account. [Figure: similarity matrices; red = similar, blue = dissimilar] Method: K-means on a Dynamic Time-Warping kernel. Goal: Select groups of samples that belong to the same phase region to feed the CP model, in order to extract the underlying phases of these sub-problems. Ronan Le Bras - CP’11, Sept 15, 2011
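As an illustration of the kernel step, here is a minimal sketch on hypothetical intensity vectors: a classic dynamic-time-warping distance is turned into a similarity matrix, and scikit-learn's spectral clustering on that precomputed kernel stands in for the kernel k-means named on the slide, since spectral clustering is readily available for precomputed affinities.

```python
# Hypothetical data: shifted, noisy copies of one peak shape; DTW distances are
# turned into a kernel and clustered (spectral clustering as a kernel-k-means stand-in).
import numpy as np
from sklearn.cluster import SpectralClustering

def dtw(a, b):
    """Classic dynamic-time-warping distance between two 1-D sequences."""
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[-1, -1]

rng = np.random.default_rng(0)
base = np.exp(-((np.arange(60) - 30) ** 2) / 20.0)        # one hypothetical peak shape
patterns = [np.roll(base, shift) + 0.05 * rng.random(60)   # shifted noisy copies
            for shift in (0, 2, 3, 15, 17, 18)]

dist = np.array([[dtw(p, q) for q in patterns] for p in patterns])
kernel = np.exp(-dist / (dist.mean() + 1e-9))               # distances -> similarities

labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(kernel)
print(labels)    # samples with similar (shifted) peaks should share a label
```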

  26. Bridging CP and ML Goal: a robust, physically meaningful, scalable, automated solution method that combines: Underlying Physics; Machine Learning for a “global data-driven view” (similarity “kernels” & clustering); a language for constraints enforcing “local details” (Constraint Programming model). Ronan Le Bras - CP’11, Sept 15, 2011

  27. Bridging Constraint Reasoning and Machine Learning: Overview of the Methodology INPUT: XRD measurements over the composition spread (e.g., Al-Si-Fe) → Peak detection → Machine Learning: kernel methods, Dynamic Time Warping → Machine Learning: partial “clustering”, fixing errors in data → CP Model & Solver on sub-problems (single-phase sub-problems) → Full CP model guided by partial solutions → OUTPUT: the phases and their (mixed) regions, as sketched below. Ronan Le Bras - CP’11, Sept 15, 2011
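Purely to show the ordering of the stages, here is a schematic, runnable sketch in which every function is a trivial stand-in; none of the names or bodies come from the paper, only the sequence of steps follows the slide.

```python
# Schematic pipeline sketch: each function is a placeholder so the script runs end-to-end.
import numpy as np

def detect_peaks(signal, height=0.5):
    # Toy peak detection: indices of local maxima above a threshold.
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > height and signal[i - 1] < signal[i] > signal[i + 1]]

def dtw_kernel(signals):
    # Stand-in for the DTW kernel of the previous slide (here: RBF on L2 distance).
    X = np.array(signals)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.exp(-d / (d.mean() + 1e-9))

def partial_clustering(kernel):
    # Stand-in for kernel k-means: split by similarity to the first sample.
    return (kernel[0] < np.median(kernel[0])).astype(int)

def solve_cp_subproblem(peaks, labels, cluster):
    # Stand-in for the CP solver on one cluster: pool the peaks of its samples.
    return sorted({p for pk, l in zip(peaks, labels) if l == cluster for p in pk})

def solve_full_cp(peaks, partial_solutions):
    # Stand-in for the full CP model guided by the partial solutions.
    return {"basis_patterns": partial_solutions}

signals = [np.sin(np.linspace(0, 6, 50) + 0.2 * i) for i in range(4)]      # hypothetical data
peaks = [detect_peaks(s) for s in signals]                                  # peak detection
labels = partial_clustering(dtw_kernel(signals))                            # ML: kernel + clustering
partial = {c: solve_cp_subproblem(peaks, labels, c) for c in set(labels)}   # CP on sub-problems
print(solve_full_cp(peaks, partial))                                        # full CP, guided
```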

  28. Experimental Validation Al-Li-Fe instance with 6 phases: Ground truth (known). Previous work (NMF) violates many physical requirements. Our CP + ML hybrid approach is much more robust.

  29. Conclusion Hybrid approach to clustering under constraints • More robust than data-driven “global” ML approaches • More scalable than a pure CP model “locally” enforcing constraints • An exciting application in close collaboration with physicists • Best inference out of expensive experiments • Towards the design of better fuel cell technology Ronan Le Bras - CP’11, Sept 15, 2011

  30. Future work Ongoing work: Spatial clustering, to further enhance cluster quality. Bayesian approach, to better exploit prior knowledge about local smoothness and available inorganic libraries. Active learning: Where to sample next, assuming we can interfere with the sampling process? When to stop sampling if sufficient information has been obtained? Correlating catalytic properties across many thin films, in order to understand the underlying physical mechanism of catalysis and to find promising intermetallic compounds. Ronan Le Bras - CP’11, Sept 15, 2011

  31. The end Thank you! Ronan Le Bras - CP’11, Sept 15, 2011

  32. Extra slides Ronan Le Bras - CP’11, Sept 15, 2011

  33. Experimental Sample Example on Al-Li-Fe diagram: Ronan Le Bras - CP’11, Sept 15, 2011

  34. Applications with similar structure Flight Calls / Bird Conservation: identifying bird populations from sound recordings at night. Analogy: basis pattern = species; samples = recordings; physical constraints = spatial constraints, species and season specificities… Fire Detection: detecting/locating fires. Analogy: basis pattern = warmth sources; samples = temperature recordings; physical constraints = gradient of temperatures, material properties… Ronan Le Bras - CP’11, Sept 15, 2011

  35. Previous Work 1: Cluster Analysis (Long et al., 2007) • xi = feature vector • Pearson correlation coefficients • Distance matrix • PCA (3-dimensional approximation) • Hierarchical Agglomerative Clustering. Drawback: Requires sampling of pure phases, detects phase regions (not phases), overlooks peak shifts, may violate physical constraints (phase continuity, etc.). Ronan Le Bras - CP’11, Sept 15, 2011
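A minimal sketch of that pipeline on hypothetical data follows; the exact feature construction of Long et al. (2007) is not reproduced, only the sequence of steps listed on the slide.

```python
# Pearson correlations -> distance matrix -> 3-D PCA -> hierarchical clustering,
# on 20 hypothetical XRD intensity vectors.
import numpy as np
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
X = rng.random((20, 100))                 # 20 hypothetical feature vectors x_i

corr = np.corrcoef(X)                     # Pearson correlation coefficients
dist = 1.0 - corr                         # correlations -> dissimilarities
embedding = PCA(n_components=3).fit_transform(dist)   # 3-dimensional approximation

Z = linkage(embedding, method="average")  # hierarchical agglomerative clustering
labels = fcluster(Z, t=4, criterion="maxclust")
print(labels)
```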

  36. Previous Work 2: NMF (Long et al., 2009) • xi = feature vector • X = A·S + E: linear positive combination (A) of basis patterns (S) • min ‖E‖: minimizing the squared Frobenius norm. Drawback: Overlooks peak shifts (linear combination only), may violate physical constraints (phase continuity, etc.). Ronan Le Bras - CP’11, Sept 15, 2011
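For completeness, here is a minimal sketch on hypothetical data of the decomposition this slide describes, using scikit-learn's NMF: X is factored into non-negative A·S while the squared Frobenius norm of the residual E = X − A·S is minimized. As the slide notes, peak shifts are not modelled by such a linear combination.

```python
# NMF sketch on hypothetical non-negative patterns: X ≈ A·S, minimizing ||E||_F^2.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(2)
X = rng.random((20, 100))                 # 20 hypothetical non-negative XRD patterns

model = NMF(n_components=6, init="nndsvda", max_iter=500, random_state=0)
A = model.fit_transform(X)                # mixing weights   (20 x 6)
S = model.components_                     # basis patterns   (6 x 100)

E = X - A @ S
print(np.linalg.norm(E, "fro") ** 2)      # squared Frobenius norm being minimized
```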
