Chapter 5

Perceiving Objects and Scenes

slide2
Computer perception system
    • The Defense Advanced Research Projects Agency (DARPA)
    • The March 2004 race (142 miles across the Mojave Desert): $1 million prize
slide3
The October 2005 race (132 miles): $2 million prize

winner

“…Now we need to teach them how to drive in traffic.” Gary Bradski, Intel Corporation, as quoted in the October 17, 2005 issue of EE Times

slide4
Urban challenge race
    • Victorville, CA, Nov 3, 2007
    • 55-mile course that resembles city streets, with other moving vehicles
    • $2 million
    • 1st place winner averaged approximately 14 mph throughout the course
    • http://www.youtube.com/watch#!v=SQFEmR50HAk&feature=related
    • http://www.youtube.com/watch#!v=6SfaCkhhQT8&feature=channel
slide5
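Object perception is actually not simple
  • Two basic problems
    • Perceptual organization: how does the visual system organize a complex array of environmental stimuli into "objects"?
    • Figure-ground: how does the visual system assign part of that complex stimulation to "ground" and part of it to "figure"?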
slide6
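Challenges facing perceiving machines (humans, vehicles)
    • The pattern of stimulation on the retina does not necessarily represent the environmental stimulus (3-D to 2-D)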
slide13
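The Gestalt approach to perceptual organization
  • A reaction against structuralism
    • Structuralism was established by Wundt and others (late 19th to early 20th century)
      • Perception is built up by combining elementary sensations
      • "Mental chemistry"
    • Max Wertheimer concluded that apparent movement contradicts structuralism, and so took up Gestalt psychology together with K. Koffka and W. Köhler
      • Structuralism also has difficulty explaining illusory contours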
slide23
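The Gestalt psychologists therefore rejected structuralism (perception as the sum of sensations), held that the whole is different from the sum of its parts, and began to focus on the problem of perceptual organization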
slide24
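Gestalt laws of perceptual organization
    • Law of Prägnanz (= law of good figure, law of simplicity): a stimulus pattern is perceived so that the resulting structure is as simple as possible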

good figure

slide25
Law of similarity
    • Similar objects are grouped together
slide28

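Law of good continuation

    • Points that form straight or smoothly curving lines when connected tend to be joined, so that lines are seen as following the smoothest path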

Fig. 5-16, p. 100

slide29

Law of proximity

    • Objects that are near each other in space are grouped together
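
The law of proximity can be read as a simple grouping rule. The sketch below is only an illustration, not a model taken from the slides: it assumes that elements are 2-D points and that two points belong together whenever their distance is at most a hypothetical threshold `max_gap` (single-linkage grouping).

```python
from itertools import combinations


def group_by_proximity(points, max_gap):
    """Group 2-D points so that points within max_gap of each other share a group."""
    parent = list(range(len(points)))  # union-find forest, one node per point

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= max_gap ** 2:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i, point in enumerate(points):
        groups.setdefault(find(i), []).append(point)
    return sorted(groups.values())


dots = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
print(group_by_proximity(dots, max_gap=1.5))
```

With these example dots, the three on the left form one group and the two on the right form another, which matches the way a row of dots with a gap is usually seen as two clusters.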
slide30
Common fate
    • Objects moving in the same direction are grouped together
  • Familiarity
    • Image components that together form a familiar pattern are grouped together
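
Common fate can also be sketched as a grouping rule. The following is a minimal illustration under stated assumptions, not a model from the slides: each element is described by a hypothetical velocity vector `(vx, vy)`, every element is moving, and two directions count as "the same" when they differ by at most `angle_tolerance_deg`.

```python
import math


def group_by_common_fate(velocities, angle_tolerance_deg=15.0):
    """Greedily group element indices whose motion directions roughly agree."""
    groups = []  # each entry: (reference direction in degrees, [element indices])
    for idx, (vx, vy) in enumerate(velocities):
        angle = math.degrees(math.atan2(vy, vx))
        for ref, members in groups:
            # Smallest absolute difference between two directions, in [0, 180].
            diff = abs((angle - ref + 180.0) % 360.0 - 180.0)
            if diff <= angle_tolerance_deg:
                members.append(idx)
                break
        else:
            # No existing group moves in this direction, so start a new one.
            groups.append((angle, [idx]))
    return [members for _, members in groups]


motion = [(1, 0), (1, 0.1), (0, 1), (2, 0), (0, 3)]
print(group_by_common_fate(motion))
```

Speed is ignored on purpose: elements 0, 1, and 3 move rightward at different speeds and still end up in one group, while elements 2 and 4 move upward and form a second group.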
slide33
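Principles of perceptual organization beyond the Gestalt laws (Palmer & Rock)
    • Common region
      • Elements that fall within the same region are grouped together
    • Element connectedness
      • Connected objects are grouped together
    • Synchrony
      • Visual events that occur at the same time are grouped together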
slide34

connectedness

synchrony

common region

slide35
What is the status of these Gestalt laws?
    • Law vs. principle vs. heuristic
      • Heuristic vs. algorithm
    • They are best-guess rules that do not work every time. But, when they do, they work very fast.
figure ground
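Figure-ground segregation
  • The Gestalt psychologists
    • Reversible figures
    • Properties that determine figure and ground
      • The figure is more "thing-like" and appears to be in front of the ground
      • Symmetrical areas are more likely to be seen as figure
      • Areas that occupy less space are more likely to be seen as figure
      • Areas oriented horizontally or vertically are more likely to be seen as figure
      • Meaningful objects are more likely to be seen as figure
      • Lower areas are more likely to be seen as figure; left and right do not differ (Vecera et al., 2002)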
slide37

Figure 5.24 A version of Rubin’s reversible face-vase figure.

slide40
Figure 5.27 (a) Stimuli from Vecera et al. (2002). (b) Percentage of trials on which lower or left areas were seen as figure

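Vecera et al. used two methods:

1) Judge which side is the figure

2) Over a 30-second period, press a key to indicate which area is currently perceived as figure (not ground); the lower area was perceived as figure 84% of the time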

rbc theory recognition by components
RBC theory (recognition by components)

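How do we recognize objects from different viewpoints?

Structural-description models

Objects are represented as "parts" plus the "spatial relations" among the parts

D. Marr (1982)

The "parts" are cylindrical, volumetric units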

slide42
The parts are geons
  • A small number of geons (and the spatial relations among them) suffice to represent a large number of objects
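
The idea that a few geons plus spatial relations can describe many objects can be made concrete with a small data structure. This is a sketch under stated assumptions: the geon names, the relation names, and the `StructuralDescription` class are all hypothetical and chosen only for illustration; they are not the actual geon inventory of RBC theory.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical, deliberately tiny vocabularies (illustration only).
GEONS = ("cylinder", "brick", "cone", "wedge", "curved_cylinder")
RELATIONS = ("on_top_of", "beside", "attached_to_side_of")


@dataclass(frozen=True)
class StructuralDescription:
    parts: tuple      # geon names
    relations: tuple  # (index of part A, relation, index of part B)


# Same two parts, different spatial relation, hence different objects.
mug = StructuralDescription(
    parts=("cylinder", "curved_cylinder"),
    relations=((1, "attached_to_side_of", 0),),
)
pail = StructuralDescription(
    parts=("cylinder", "curved_cylinder"),
    relations=((1, "on_top_of", 0),),
)


def count_two_part_objects(geons, relations):
    """Count ordered (geon, relation, geon) triples, one per two-part description."""
    return sum(1 for _ in product(geons, relations, geons))


print(mug == pail)
print(count_two_part_objects(GEONS, RELATIONS))
```

Even with only five geons and three relations there are 5 × 3 × 5 = 75 distinct two-part descriptions, and the count grows quickly as more parts are added.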
slide43
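The most important property of geons is that they can be identified regardless of viewing angle (view invariance)
  • This is because geons are defined by non-accidental properties (NAPs)
  • A non-accidental property of the 2-D image reflects an actual property of the 3-D object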
slide44

Parallelism

Curvature

slide47
As long as the critical features that define a geon are preserved, recognition is little affected by noise
slide48
Can be used to represent many types of objects
  • But cannot explain how people discriminate objects that differ only in their details
slide49
Image-description models
    • View invariance does not necessarily hold, so recognition compares the image with stored representations of multiple viewpoints
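
The comparison against stored views can be sketched as a nearest-neighbor match. The code below is an illustration only: the feature vectors, the view labels, and the use of Euclidean distance are assumptions made for the example, not details given in the slides.

```python
import math


def recognize(image, stored_views):
    """Return ((object name, view label), distance) for the closest stored view."""
    best, best_dist = None, math.inf
    for name, views in stored_views.items():
        for label, features in views.items():
            d = math.dist(image, features)  # Euclidean distance (assumed metric)
            if d < best_dist:
                best, best_dist = (name, label), d
    return best, best_dist


# Hypothetical memory: each object is stored as several viewpoint-specific vectors.
memory = {
    "cup": {"front": (1.0, 0.0, 0.0), "side": (0.8, 0.6, 0.0)},
    "shoe": {"front": (0.0, 0.0, 1.0)},
}
match, distance = recognize((0.9, 0.5, 0.0), memory)
print(match)
```

In this scheme, recognition of an unfamiliar view depends on how close it lies to some stored view, which is one way to express the claim that view invariance does not always hold.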
perceiving scenes
Perceiving Scenes
  • What is a scene?
    • It includes a background and objects (arranged together in a meaningful way)
perceiving scenes51
Perceiving Scenes
  • The gist of a scene is perceived rapidly
    • use of masks
  • phenomenological method - Li (2007)
    • Gists are reported very early.
slide53

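Why is the gist of a scene recognized so easily?

    • Rapid processing of global image features
    • Some rapidly detectable global features are correlated with the type of scene
    • Degree of naturalness
      • Textured zones, undulating contours
    • Degree of openness
      • A visible horizon line, few objects
    • Degree of roughness
      • Small, complex elements
    • Degree of expansion
      • Parallel lines converging in the distance
    • Characteristic color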
slide54

high low

  • Degree of naturalness: forest vs. street
  • Degree of openness : beach vs. forest
  • Degree of roughness : forest vs. beach
  • Degree of expansion : railroad vs. street
  • Color: blue sky; green forest
slide55
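For example, the oblique effect (the perceptual system is especially sensitive to vertical and horizontal stimuli) may arise because our natural environment is full of vertical and horizontal contours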
experience dependent plasticity back
Environmental regularities give rise to experience-dependent plasticity

Environment of horizontal lines

Environment of vertical lines

slide57
Gestalt law “uniform connectedness”
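    • The parts of an object usually share the same color, texture, and so on, so regions with uniform properties usually belong to the same object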
slide58

Shape from shading

Figure 5.46 (a) Some of these discs are perceived as jutting out, and some are perceived as indentations. (b) Light coming from above will illuminate the top of a shape that is jutting out, and (c) the bottom of an indentation.

Light-from-above heuristics
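
The light-from-above heuristic can be written as a one-line decision rule. The sketch below is a simplification made for illustration: it assumes each disc is summarized by two hypothetical numbers, the luminance of its top half and of its bottom half, and it ignores everything else in the image.

```python
def interpret_disc(top_luminance, bottom_luminance):
    """Classify a shaded disc under the assumption that light comes from above."""
    if top_luminance > bottom_luminance:
        return "bump"       # lit on top: consistent with a surface jutting out
    if top_luminance < bottom_luminance:
        return "dent"       # lit on the bottom: consistent with an indentation
    return "ambiguous"      # no shading gradient to interpret


def flip_vertically(disc):
    """Turn a (top, bottom) luminance pair upside down."""
    top, bottom = disc
    return (bottom, top)


disc = (0.9, 0.2)  # bright top, dark bottom
print(interpret_disc(*disc))
print(interpret_disc(*flip_vertically(disc)))
```

Flipping the disc swaps the interpretation from bump to dent, which is the same reversal that occurs when a shaded picture is turned upside down.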

slide59
Figure 5.47 Why does (a) look like indentations in the sand and (b) look like mounds of sand? See text for explanation.
slide61

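Scene image features reflect regularities in the environment

  • 1. Physical regularities
      • The visual environment contains a high proportion of vertical and horizontal contours (slide 58)
      • Gestalt law "uniform connectedness" (slide 60)
      • Light-from-above heuristic (slide 58)
  • 2. Semantic regularities
    • The functional components of a scene and their arrangement, which depend on the type of scene
      • For example, particular objects appear in regular locations within particular scenes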
slide63
Figure 5.45 Stimuli used in Palmer’s (1975) experiment. The scene at the left is presented first, and the observer is then asked to identify one of the objects on the right.
slide65
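Inference in perception
    • Inferential processes influence perception unconsciously
    • von Helmholtz (1866/1911), theory of unconscious inference: some of our perceptual experiences result from unconscious assumptions we make about the environment. Likelihood principle: out of all the possibilities that could have produced the 2-D pattern on the retina, we perceive the object that is most likely to have caused it.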
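
One common modern way to make the likelihood principle concrete is Bayesian inference: combine how often each interpretation occurs in the environment with how well it explains the retinal image, then pick the most probable one. This reading goes beyond the slides, and every number below is a hypothetical value chosen for illustration.

```python
def most_likely_interpretation(priors, likelihoods):
    """Return the interpretation with the highest posterior, plus all posteriors."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    posterior = {h: value / total for h, value in unnormalized.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Two interpretations that produce exactly the same 2-D image.
priors = {"occluded_rectangle": 0.9, "six_sided_figure": 0.1}       # assumed
likelihoods = {"occluded_rectangle": 1.0, "six_sided_figure": 1.0}  # same image
best, posterior = most_likely_interpretation(priors, likelihoods)
print(best, posterior)
```

Because both interpretations explain the image equally well, the outcome is decided entirely by the assumed prior, that is, by what is more common in the environment.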
slide66
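Why is human object perception so much better than machine perception?

Perceptual stimuli are ambiguous

Human perception solves perceptual problems by using environmental regularities to make inferences

Machine vision needs to be able to learn environmental regularities and to use them to make inferences

The example of Boss

Regularities (knowledge) assumed by Boss

Things that move are probably vehicles

If the other vehicle faces a red light, it should stop

Regularities that were not built into Boss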

slide67
The physiological basis of object and scene perception
  • Fig. 5.43: properties of neurons that respond to perceptual organization
slide68

Response pattern of a V1 neuron that responds to figure versus ground

Figure 5.44 How a neuron in V1 responds to stimuli presented to its receptive field (green rectangle). (a) The neuron responded when the stimulus on the receptive field is figure. (b) There is no response when the same pattern on the receptive field is not figure (Adapted from Lamme et al., 1995.)

slide69
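The response pattern matches perceptual experience rather than the physical properties of the stimulus
    • Why does this occur in V1 neurons?
    • It may be caused by contextual modulation
      • Feedback from higher-level visual processing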
slide70
How does the brain process information about objects?
    • Sheinberg & Logothetis (1997)
      • Monkey trained to pull lever in response to particular pattern
slide71
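A neuron in IT fires only when the monkey perceives a particular stimulus
    • The object stimuli presented to the monkey are always the same; the change in perceptual awareness happens in the brain
  • Grill-Spector et al. (2004)
    • Used the region-of-interest (ROI) method to locate each person's FFA
    • Used masking to present pictures of faces briefly
    • Found that the level of FFA activation matched the observers' subjective perceptual judgments (rather than the physical stimulus)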
slide74

House vs. Face

    • binocular rivalry
    • PPA vs. FFA
slide75

Figure 5.40 Time-course of brain activation for trials in which Harrison Ford’s face was presented. (Grill-Spector, et al., 2004)

slide76
Freedman et al. (2003)
    • Used morphing to create a series of stimuli
    • Used a delayed-matching-to-sample procedure to measure the monkeys' recognition performance
slide77

IT and PF neurons show different response patterns

Figure 5.43 (a) Response of a monkey IT neuron that responds better to a 100-percent dog stimulus (red line) than to a 100-percent cat stimulus (blue) during the “sample” period of the delayed-matching-to-sample task. Other combinations of dog and cat fell between these two extremes. (b) Response of PF neurons to the same stimuli. For this neuron, the response to dog is greater during the delay and test periods. (From Freedman, D. J. et al., (2003). A comparison of primate prefrontal and inferior temporal cortices during visual categorization. Journal of Neuroscience, 23, 5235-5246.)

models of brain activity perception
Models of Brain Activity & Perception
  • Image  Brain  Measured voxel activity pattern
  • Image  Decoder  Measured voxel activity pattern

∥?

∥?

  • Fig. 5.50
  • An orientation decoder was used to analyze the voxel activity.
    • The decoder could accurately predict which orientation had been presented.
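
The slides do not say how the orientation decoder works. As a stand-in, the sketch below uses a nearest-centroid classifier: it averages the voxel patterns recorded for each orientation and assigns a new pattern to the closest average. The voxel values and the three-voxel patterns are made up for illustration.

```python
def train_decoder(patterns_by_orientation):
    """Average the training voxel patterns for each orientation (in degrees)."""
    centroids = {}
    for orientation, patterns in patterns_by_orientation.items():
        n = len(patterns)
        centroids[orientation] = [sum(column) / n for column in zip(*patterns)]
    return centroids


def decode(centroids, pattern):
    """Predict the orientation whose average pattern is closest to `pattern`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda o: squared_distance(centroids[o], pattern))


# Hypothetical training data: two voxel patterns per grating orientation.
training = {
    45: [[1.0, 0.2, 0.1], [0.9, 0.3, 0.0]],
    135: [[0.1, 0.2, 1.0], [0.0, 0.4, 0.9]],
}
decoder = train_decoder(training)
print(decode(decoder, [0.8, 0.25, 0.2]))
```

The key point carried over from the slide is the direction of inference: the decoder starts from measured voxel activity and works back to the stimulus that was presented.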
slide83
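The modern view
  • The contribution of Gestalt psychology was mainly to "describe" perceptual phenomena
  • The modern view emphasizes "measurement" and "mechanism"
  • Why does the visual system respond especially to certain types of visual stimuli?
    • The human perceptual system tries to capture the properties of the environment, so it often responds according to environmental regularities
    • Regularities are environmental properties that occur regularly across many situations
      • For example, some of the Gestalt heuristics
        • The law of good continuation reflects the fact that our surroundings contain many straight and smooth contours
        • "Uniform connectedness" reflects the fact that the parts of an object usually share the same color, texture, and so on, so regions with uniform properties usually belong to the same object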
slide85
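Must a figure be separated from the ground before it can be perceived?
    • Observers tend to see the meaningful region as figure → this suggests that the order in which "recognition" and "figure-ground segregation" occur is……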
slide87
Recognition-by-components (RBC) theory
    • The parts are geons
    • A small number of geons (and the spatial relations among them) suffice to represent a large number of objects
slide89
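The most important property of geons is that they can be identified regardless of viewing angle (view invariance)
  • For example, the parallel edges of a rectangular block can be seen from most viewing angles: a non-accidental property
  • From a few viewing angles, a property appears in the 2-D image that is not actually present in the 3-D object: an accidental property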
slide101
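Why is human object perception so much better than machine perception?
  • Perceptual intelligence
    • von Helmholtz
      • Theory of unconscious inference: some of our perceptual experiences result from unconscious assumptions we make about the environment. Likelihood principle: out of all the possibilities that could have produced the 2-D pattern on the retina, we perceive the object that is most likely to have caused it.
      • Perception is like problem solving, but we apply perceptual intelligence "automatically" to solve perceptual problems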
slide102
Figure 5.44 The display in (a) looks like (b) -- a blue rectangle in front of a red rectangle -- but it could be (c), a blue rectangle and an appropriately positioned 6-sided red figure.
slide107
Machine vision systems need to incorporate perceptual intelligence in order to emulate the way humans rapidly solve perceptual problems