
An Analysis of PLSA Modeling Ideas




  1. An Analysis of PLSA Modeling Ideas — 张小洪

  2. Contents: What is modeling • The LSA approach • PLSA image modeling • Conditions and assumptions for applying PLSA • Applications and developments of PLSA

  3. What is modeling? • Modeling in software development • Business modeling • Requirements model • Design model • Implementation model • Database model • Lexical analysis → extract objects → characterize objects (attributes or methods) → object relationships • A model reflects the relationships among things or objects

  4. What is a model: an example

  5. What is a model: an example — a mapping: building, car, telephone, portrait, bicycle, book, tree

  6. What is a model: an example

  7. What is a model: an example — a mapping: human, hand, horse, turtle, elephant, dog, crocodile

  8. What is a model: an example — a mapping

  9. What is modeling? [Diagram: inputs x and targets y, an objective function, a generator G of x, a system S producing y, and a learning machine LM that learns the mapping or function — the machine-learning view of modeling]

  10. What is modeling? • Mathematical modeling • Model ≈ function • Functional • The process of finding a function that satisfies the objective and the constraints • Modeling from empirical data is a machine-learning problem • A learning problem is the problem of choosing the desired dependency on the basis of empirical data • The learning process is the process of selecting a suitable function from a given set of functions • Pattern recognition • The function value Y: an index set (the class labels)
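A minimal sketch of this "select a function from a function set" view, assuming the function set is the straight lines y = a·x + b and the selection criterion is squared error on empirical data (the data values below are made up purely for illustration):

```python
import numpy as np

# Empirical data (made-up values, purely illustrative).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.1, 2.9, 4.2])

# Function set: all straight lines y = a*x + b.
# Learning: pick the member of the set that minimizes squared error on the data.
a, b = np.polyfit(x, y, deg=1)
print(f"selected function: y = {a:.2f} * x + {b:.2f}")
```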

  11. What is modeling: pattern recognition — the process of selecting a function from a function set

  12. The LSA approach — Problem: how do we classify documents? Technical Memo Titles: c1: Human machine interface for ABC computer applications; c2: A survey of user opinion of computer system response time; c3: The EPS user interface management system; c4: System and human system engineering testing of EPS; c5: Relation of user perceived response time to error measurement; m1: The generation of random, binary, ordered trees; m2: The intersection graph of paths in trees; m3: Graph minors IV: Widths of trees and well-quasi-ordering; m4: Graph minors: A survey

  13. The LSA approach — How to represent a document: the Vector Space Model. In raw term-frequency space, r(human, user) = -.378 and r(human, minors) = -.378. Steps: 1. build the word list (vocabulary); 2. count term frequencies. Problems?
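A minimal sketch of this vector space model step: build a vocabulary over the collection and count term frequencies to obtain a term-document matrix (the documents below are abbreviated versions of a few of the memo titles above, so the matrix is only illustrative):

```python
from collections import Counter

# Toy documents (abbreviated versions of the c*/m* memo titles above).
docs = {
    "c1": "human machine interface for computer applications",
    "c2": "a survey of user opinion of computer system response time",
    "m3": "graph minors widths of trees and well quasi ordering",
}

# 1. Word list: every distinct word across the collection.
vocab = sorted({w for text in docs.values() for w in text.split()})

# 2. Term-frequency vectors: one column per document, one row per word.
term_freq = {d: Counter(text.split()) for d, text in docs.items()}
matrix = [[term_freq[d][w] for d in docs] for w in vocab]

for w, row in zip(vocab, matrix):
    print(f"{w:12s} {row}")
```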

  14. The LSA approach: SVD • Singular Value Decomposition: A = U S V^T • Dimension reduction: keep only the k largest singular values, giving the rank-k approximation Â ≈ Û Ŝ V̂^T
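A minimal sketch of the SVD and rank-k truncation, assuming a term-document count matrix A has already been built (the values below are random placeholders, not the memo example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 3, size=(12, 9)).astype(float)  # term-document counts (placeholder)

# Full SVD: A = U S V^T
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Dimension reduction: keep the k largest singular values ("reduce to 2 dimensions").
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k approximation of A

# Word-word correlations such as r(human, user) are then computed between
# rows of A_k instead of rows of the raw count matrix A.
print(np.corrcoef(A_k[0], A_k[1])[0, 1])
```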

  15. The LSA approach: SVD — {U}, reduced to 2 dimensions

  16. The LSA approach: SVD — {S}, reduced to 2 dimensions

  17. The LSA approach: SVD — {V}, reduced to 2 dimensions

  18. The LSA approach: SVD — after dimension reduction, r(human, user) = .94 and r(human, minors) = -.83: synonymy is now reflected in the similarities. Remaining problems?

  19. The LSA approach: SVD

  20. The LSA approach: discussion • Why does the SVD method work? What are its assumptions? • LSA does not define a properly normalized probability distribution • There is no obvious interpretation of the directions in the latent space • From a statistical viewpoint, the use of the L2 norm in LSA corresponds to a Gaussian error assumption, which is hard to justify for count variables • The polysemy problem remains • How can the SVD result be visualized?

  21. PLSA: the problem — building, car, telephone, portrait, bicycle, book, tree

  22. PLSA: the problem • Questions: • How is an image represented as feature vectors? • How do feature vectors form "image words" (codewords)? • How is the training image set represented as a co-occurrence (term-frequency) matrix? • How is the model chosen?

  23. PLSA: the problem [Figure: histogram of codeword frequencies]

  24. PLSA: the problem — an object as a bag of visual "words"

  25. [Pipeline overview] 1. feature detection & representation → 2. codewords dictionary → 3. image representation; learning: category models (and/or) classifiers; recognition: category decision

  26. PLSA: Feature detection and representation

  27. PLSA: Feature detection and representation — Detect patches [Mikolajczyk and Schmid '02], [Matas, Chum, Urban & Pajdla '02], [Sivic & Zisserman '03] → Normalize patch → Compute SIFT descriptor [Lowe '99]. Slide credit: Josef Sivic
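A minimal sketch of this detect-and-describe step using OpenCV's SIFT implementation (the image path is a placeholder, and this assumes an OpenCV build in which SIFT is available; the patch detectors cited on the slide would be plugged in analogously):

```python
import cv2

# Load an image in grayscale (path is a placeholder).
img = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)

# Detect interest-point patches and compute a 128-d SIFT descriptor for each.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

print(len(keypoints), "patches,", descriptors.shape, "descriptor matrix")
```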

  28. PLSA: Feature detection and representation

  29. PLSA: Codewords dictionary formation

  30. PLSA: Codewords dictionary formation — Vector quantization
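A minimal sketch of the vector quantization step: cluster all patch descriptors from the training images with k-means, and take the cluster centers as the codewords dictionary (the descriptor data is a random placeholder and the dictionary size of 300 is an arbitrary choice for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

# all_descriptors: stacked SIFT descriptors from every training image,
# shape (num_patches_total, 128). Random placeholder data here.
all_descriptors = np.random.rand(5000, 128).astype(np.float32)

# Vector quantization: the k-means cluster centers form the codewords dictionary.
n_codewords = 300
kmeans = KMeans(n_clusters=n_codewords, n_init=4, random_state=0).fit(all_descriptors)
codebook = kmeans.cluster_centers_           # shape (300, 128)

# Any new descriptor is mapped to the index of its nearest codeword.
word_ids = kmeans.predict(all_descriptors[:10])
print(word_ids)
```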

  31. PLSA: Codewords dictionary formation

  32. PLSA: Image representation [Figure: histogram of codeword frequencies]
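A minimal sketch of this "frequency over codewords" representation: assign each patch descriptor of an image to its nearest codeword and count occurrences (this reuses the hypothetical kmeans codebook from the sketch above):

```python
import numpy as np

def image_histogram(descriptors, kmeans, n_codewords):
    """Bag-of-words histogram: how often each codeword occurs in this image."""
    word_ids = kmeans.predict(descriptors)               # nearest codeword per patch
    return np.bincount(word_ids, minlength=n_codewords)  # length-n_codewords counts
```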

  33. Representation: 1. feature detection & representation → 2. codewords dictionary → 3. image representation

  34. Learning and Recognition: codewords dictionary → category models (and/or) classifiers → category decision

  35. PLSA Learning and Recognition • Generative methods: graphical models • Discriminative methods: SVM • Either route yields category models (and/or) classifiers
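A minimal sketch of the discriminative route mentioned on the slide: train an SVM directly on the bag-of-words histograms (the histograms and labels below are random placeholders):

```python
import numpy as np
from sklearn.svm import SVC

# X: one codeword histogram per training image; y: its category label (placeholders).
X = np.random.rand(40, 300)
y = np.repeat([0, 1], 20)

clf = SVC(kernel="linear").fit(X, y)   # discriminative category classifier
print(clf.predict(X[:3]))              # category decision for some histograms
```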

  36. Generative models • Naïve Bayes classifier • Csurka, Bray, Dance & Fan, 2004 • Hierarchical Bayesian text models (pLSA and LDA) • Background: Hofmann 2001, Blei, Ng & Jordan, 2004 • Object categorization: Sivic et al. 2005, Sudderth et al. 2005 • Natural scene categorization: Fei-Fei et al. 2005

  37. First, some notation • w_n: each patch in an image, w_n = [0, 0, …, 1, …, 0, 0]^T • w: the collection of all N patches in an image, w = [w_1, w_2, …, w_N] • d_j: the j-th image in an image collection • c: category of the image • z: theme or topic of the patch

  38. Case #1: the Naïve Bayes model (Csurka et al. 2004). [Graphical model: class node c generating the N observed patches w.] Object class decision: c* = argmax_c p(c|w) ∝ p(c) p(w|c) = p(c) ∏_{n=1..N} p(w_n|c), where p(c) is the prior probability of the object classes and p(w|c) is the image likelihood given the class.
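A minimal sketch of this Naïve Bayes decision rule on codeword count histograms (the training interface, array shapes, and Laplace smoothing are assumptions of this sketch; log probabilities are used to avoid underflow):

```python
import numpy as np

def train_naive_bayes(histograms, labels, n_classes):
    """Estimate the class prior p(c) and per-class codeword distribution p(w|c).

    histograms: (n_images, n_codewords) count array; labels: (n_images,) int array.
    """
    priors = np.zeros(n_classes)
    word_probs = np.zeros((n_classes, histograms.shape[1]))
    for c in range(n_classes):
        counts = histograms[labels == c]
        priors[c] = len(counts) / len(histograms)
        word_counts = counts.sum(axis=0) + 1.0            # Laplace smoothing
        word_probs[c] = word_counts / word_counts.sum()
    return priors, word_probs

def classify(histogram, priors, word_probs):
    """Object class decision: c* = argmax_c  log p(c) + sum_i n(w_i) log p(w_i|c)."""
    scores = np.log(priors) + histogram @ np.log(word_probs).T
    return int(np.argmax(scores))
```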

  39. Case #2: Hierarchical Bayesian text models — Probabilistic Latent Semantic Analysis (pLSA), Sivic et al. ICCV 2005. [Graphical model: d → z → w, with a plate over the N patches in an image and over the D images; a latent topic z might correspond to, e.g., "face".]

  40. The pLSA model: the observed codeword distributions per image decompose into codeword distributions per theme (topic) and theme distributions per image, p(w_i | d_j) = Σ_k p(w_i | z_k) p(z_k | d_j). Slide credit: Josef Sivic
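In matrix form this decomposition is just a product of two stochastic matrices; a minimal sketch (all distributions are random placeholders, and the sizes are arbitrary):

```python
import numpy as np

M, K, N = 300, 5, 100     # codewords, topics, images (arbitrary sizes)
rng = np.random.default_rng(0)

p_w_given_z = rng.dirichlet(np.ones(M), size=K).T   # (M, K): codeword dist. per topic
p_z_given_d = rng.dirichlet(np.ones(K), size=N).T   # (K, N): topic dist. per image

# Observed codeword distribution per image: p(w|d) = sum_z p(w|z) p(z|d)
p_w_given_d = p_w_given_z @ p_z_given_d             # (M, N)
print(p_w_given_d.sum(axis=0)[:3])                  # each column sums to 1
```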

  41. Recognition using pLSA Slide credit: Josef Sivic
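One common way to do recognition with a learned pLSA model, roughly in the spirit of the Sivic et al. slides this section draws on, is "fold-in": keep the learned topic-codeword distributions p(w|z) fixed, estimate only p(z|d_test) for the new image with a few EM-style iterations, and pick the most likely topic. The function name and the details below are assumptions of this sketch, not something stated on the slide:

```python
import numpy as np

def fold_in(test_counts, p_w_given_z, n_iters=50):
    """Estimate p(z | d_test) with p(w|z) held fixed, then return the best topic.

    test_counts: (M,) codeword histogram of the test image.
    p_w_given_z: (M, K) learned codeword distributions per topic.
    """
    M, K = p_w_given_z.shape
    p_z = np.full(K, 1.0 / K)                        # initial topic distribution
    for _ in range(n_iters):
        # E-step: responsibility of each topic for each codeword in the test image.
        joint = p_w_given_z * p_z                    # (M, K)
        resp = joint / joint.sum(axis=1, keepdims=True)
        # M-step (over p(z|d_test) only): reweight by the observed codeword counts.
        p_z = (test_counts[:, None] * resp).sum(axis=0)
        p_z /= p_z.sum()
    return int(np.argmax(p_z)), p_z
```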

  42. Learning the pLSA parameters: maximize the likelihood of the data using EM, L = ∏_{i=1..M} ∏_{j=1..N} p(w_i | d_j)^{n(w_i, d_j)}, where n(w_i, d_j) is the observed count of word i in document j, M is the number of codewords, and N is the number of images. Slide credit: Josef Sivic
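A minimal sketch of this EM procedure on the M×N count matrix n(w_i, d_j) (dense arrays, random initialization, and a fixed iteration count are choices of this sketch; a real implementation would monitor the log-likelihood for convergence):

```python
import numpy as np

def plsa_em(counts, n_topics, n_iters=100, seed=0):
    """EM for pLSA. counts: (M codewords, N images) matrix of n(w_i, d_j)."""
    M, N = counts.shape
    rng = np.random.default_rng(seed)
    p_w_z = rng.dirichlet(np.ones(M), size=n_topics).T   # (M, K): p(w|z)
    p_z_d = rng.dirichlet(np.ones(n_topics), size=N).T   # (K, N): p(z|d)

    for _ in range(n_iters):
        # E-step: p(z | w, d) for every (word, document) pair.
        joint = p_w_z[:, :, None] * p_z_d[None, :, :]    # (M, K, N)
        resp = joint / joint.sum(axis=1, keepdims=True)  # normalize over topics

        # M-step: re-estimate p(w|z) and p(z|d) from expected counts.
        expected = counts[:, None, :] * resp             # (M, K, N)
        p_w_z = expected.sum(axis=2)                      # (M, K)
        p_w_z /= p_w_z.sum(axis=0, keepdims=True)
        p_z_d = expected.sum(axis=0)                      # (K, N)
        p_z_d /= p_z_d.sum(axis=0, keepdims=True)
    return p_w_z, p_z_d
```

The learned p_w_z can then be passed to the fold-in sketch above to obtain a topic (category) decision for a previously unseen image.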

  43. PLSA: discussion • What properties must the data have: what are the conditions and assumptions for applying PLSA? • PLSA is not a well-defined generative model of documents: d is a dummy index into the list of documents in the training set (as many values as documents) • There is no natural way to assign a probability to a previously unseen document • The number of parameters to be estimated grows with the size of the training set

  44. Applications and developments of PLSA

  45. Thank you! iiec.cqu.edu.cn
