Cognitive Psychology 2. Pattern recognition

objective

  • make clear distinction between sensation and perception

How could you study it?

  • tachistoscope (a.k.a. t-scope; literally a "momentary exposure device")
    • Allows visual stimuli to be presented for brief periods of time.
      • Designed to present patterns very rapidly (milliseconds)
    • A box that presents visual stimuli at a specified duration and level of illumination (CP 18)

What is pattern recognition?

  • How people identify the objects in their environments
  • the stage of perception during which a stimulus is identified (CP 3,18)

Why is it cool to be human?

  • Computers don’t do this well

http://dl.dropbox.com/u/3770752/wiki/cognitive/02/letters.png

  • Speech recognition software

Theory of pattern recognition

  • template
    • an unanalyzed pattern that is matched against alternative patterns by using the degrees of overlap as a measure of similarity (CP 19)
    • Templates pull things from memory very quickly.
  • Assumptions:
  1. The memory representation is an unanalyzed entity
    1. only what is sensed at that moment.
  2. The input pattern is compared to the stored representation
    1. the input is compared with what is stored in LTM (long-term memory)
  3. The amount of overlap determines identity
    1. look for overlap
  4. That is, there is a direct (isomorphic) match between the input pattern and the stored pattern
  • This idea came from computer models, because computers solve the problem by comparing input with stored memory.
    • Computers often use template systems (e.g. bar code readers)

Template theory

  • sensory: N
  • memory: N Z M
  • Compare the image to the template. Select the best match.
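The compare-and-select step can be sketched in code. This is only a minimal illustration, assuming 5x5 letter grids and a simple cell-by-cell agreement count as the overlap measure; the grids and the names TEMPLATES, overlap, recognize, and shift_right are made up for this example and do not come from the lecture.

```python
# Hypothetical 5x5 letter templates ("X" = filled cell, "." = empty cell).
TEMPLATES = {
    "N": ["X...X", "XX..X", "X.X.X", "X..XX", "X...X"],
    "Z": ["XXXXX", "...X.", "..X..", ".X...", "XXXXX"],
    "M": ["X...X", "XX.XX", "X.X.X", "X...X", "X...X"],
}


def overlap(pattern, template):
    """Count the cells on which the two grids agree (filled or empty)."""
    return sum(
        p == t
        for p_row, t_row in zip(pattern, template)
        for p, t in zip(p_row, t_row)
    )


def recognize(pattern):
    """Return the name of the stored template with the largest overlap."""
    return max(TEMPLATES, key=lambda name: overlap(pattern, TEMPLATES[name]))


def shift_right(pattern):
    """Move every row one cell to the right, dropping the last column."""
    return ["." + row[:-1] for row in pattern]


sensory_input = TEMPLATES["N"]
print(recognize(sensory_input))                             # best match for an exact copy
print(overlap(shift_right(sensory_input), TEMPLATES["N"]))  # overlap after a one-cell shift
```

Shifting the same letter by one cell lowers its overlap with its own template, which is the kind of position sensitivity a pure template match would show.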

Evidence for templates in sensory store

  • PHILLIPS (1974)
    • Showed Ss (subjects) a slide of a star.
    • Once the star disappeared, a pattern was displayed for about 1 sec., followed by a blank slide and then another pattern.
    • Is the second pattern the same as or different from the first pattern?
  • Template theory actually predicts this.

http://dl.dropbox.com/u/3770752/wiki/cognitive/02/phillips.png

  • On the x-axis we have the interstimulus interval (the amount of time between the end of a stimulus and the beginning of another stimulus (CP 21))
  • The y-axis shows the percent correct.
  • Red ("still") line: when the patterns were displayed in the same location, accuracy declined as the interstimulus interval increased.
    • This suggests that subjects were able to use their sensory store to make a template match. However, the longer the interstimulus interval, the greater the decay of the information in the sensory store, and as a result the number of errors increases.
  • Black line: represents subjects' accuracy when the second pattern was displayed in a different location. As predicted by template theory, because the second pattern was in a different place, subjects were unable to make a template match.
  • The longer the interval, the more is forgotten.

Problems

  • Changing perspective doesn’t impair recognition.
  • Limited storage space in memory.
  • According to template theory, we would need a template for every different perspective, which is impossible.
    • Computers try to recognize patterns by preparing an enormous number of templates, yet they still fall short of the brain, which suggests that template theory is wrong.
  • Doesn’t allow for alternative interpretations of stimuli. (e.g., the duck/rabbit or old/young woman drawings)
  • Therefore, template theory is very inefficient and is no longer used.
  • How do templates differ? (P vs. R)
  • We need to describe features in order to describe how things differ.

Alternatives to templates?

  • F T B K R H
  • feature theory:
  • Feature theory assumptions:
    • Inputs are broken down into features (e.g. edges, shape, color)
    • LTM contains descriptions of past inputs as lists of attributes or features
    • The feature list that best matches input determines identity.

metaphor

http://dl.dropbox.com/u/3770752/wiki/cognitive/02/metaphor.jpg

  • The image demon picks up the external stimulus.
  • Feature demons monitor the image demon's output for specific features.
    • each handles one type of feature at one location
  • Cognitive demons could be neurons in the visual cortex. Each one corresponds to a feature list.
    • a cognitive demon starts to get excited when the features match its list.
    • screaming competition
  • The decision demon picks the loudest one.
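The demon metaphor can be sketched as a tiny program. This is a rough sketch under stated assumptions: the feature names, the four letters, and the scoring rule (shared features minus mismatched features) are hypothetical choices for illustration, and the image and feature demons are collapsed into a ready-made set of detected features.

```python
# Hypothetical feature lists, one per cognitive demon.
FEATURE_LISTS = {
    "P": {"vertical_line", "upper_loop"},
    "R": {"vertical_line", "upper_loop", "diagonal_line"},
    "B": {"vertical_line", "upper_loop", "lower_loop"},
    "T": {"vertical_line", "top_bar"},
}


def cognitive_demon_shout(letter, detected):
    """Loudness of one cognitive demon: matches minus mismatches (assumed rule)."""
    wanted = FEATURE_LISTS[letter]
    matches = len(wanted & detected)
    mismatches = len(wanted ^ detected)  # features present in only one of the two sets
    return matches - mismatches


def decision_demon(detected):
    """Pick the letter whose cognitive demon shouts the loudest."""
    return max(FEATURE_LISTS, key=lambda letter: cognitive_demon_shout(letter, detected))


# The feature demons are assumed to have reported these features.
detected_features = {"vertical_line", "upper_loop", "diagonal_line"}
for letter in FEATURE_LISTS:
    print(letter, cognitive_demon_shout(letter, detected_features))
print("decision:", decision_demon(detected_features))
```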

If feature theories are correct.

  • …similar letters should be harder to discriminate between than totally different letters (Gibson, Schapiro, & Yonas, 1968)
  • perceptual confusion
    • how often one pattern is mistaken for another
    • a measure of the frequency with which two patterns are mistakenly identified as each other (CP 22)

Evidence for Feature Theories.

  • Gibson, Schapiro, & Yonas (1968)
  • I.V.: overlap of features:
    • high or low
    • R vs. P / G vs. W
  • D.V.: time to make a same/different choice
  • Results: G - W: 458ms, R - P: 571ms

More human evidence for feature theories

  • Hubel and Wiesel’s 1962 single-cell recordings in the visual system.
  • Found neurons that responded most strongly to specific inputs.
  • An electrode is inserted into a monkey's brain.
  • youtube video: visual cortex cell recording

  • Huge for feature theory.
  • Showed that neurons respond strongly to a specific input.

Application of feature theory.

  • Allows discrimination between categories
  • distinctive feature
    • feature that is present in one pattern but not another
      • e.g., the stroke at the lower right of R, which P lacks
    • A feature present in one pattern but absent in another, aiding one's discrimination of the two patterns. (CP 23)
  • Face recognition (Rhodes, Brennen, & Carey, 1987)
  • Children who are trained on distinctive features make more correct identifications.
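The idea of a distinctive feature, and of high versus low feature overlap (R vs. P, G vs. W), can be illustrated with sets. This is a sketch only: the feature lists below are invented for the example, and using shared features divided by total features as the overlap measure is an assumption, not the measure used in the cited studies.

```python
# Hypothetical feature lists for four letters.
FEATURES = {
    "P": {"vertical_line", "upper_loop"},
    "R": {"vertical_line", "upper_loop", "diagonal_stroke"},
    "G": {"open_curve", "horizontal_bar"},
    "W": {"diagonal_stroke", "vertex"},
}


def distinctive_features(a, b):
    """Features present in one letter but absent in the other."""
    return FEATURES[a] ^ FEATURES[b]


def feature_overlap(a, b):
    """Shared features divided by all features of the pair (assumed measure)."""
    shared = FEATURES[a] & FEATURES[b]
    combined = FEATURES[a] | FEATURES[b]
    return len(shared) / len(combined)


print(distinctive_features("P", "R"))
print(feature_overlap("R", "P"), feature_overlap("G", "W"))
```

With these made-up lists, R and P differ by a single feature while G and W share none, which matches the high-overlap and low-overlap pairs in the notes.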

Feature detectors: nature or nurture?

  • Both
  • Critical Period
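    • e.g., cats that were shown only vertical lines of light during the critical period grew up unable to recognize horizontal line-shaped objects.
  • Problems
    • Doesn’t take context into account.
      • Feature theories don't describe how the features get organized and put together into a form.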

Limits

Feature theory also does not explain alternative interpretations.

structural theory

  • Builds off of feature theories
  • a theory that specifies how the features of a pattern are joined to other features of the pattern (CP 26)
  • Emphasize relations between features.
  • Biederman (1985): component model
    • geon
      • Different three-dimensional shapes that combine to form three-dimensional patterns (CP 28)
    • He proposed that every object is made up of simple parts called geons.

http://dl.dropbox.com/u/3770752/wiki/cognitive/02/geons.jpg

  • If we can identify the geons, we can identify the object just by looking at (c)

http://dl.dropbox.com/u/3770752/wiki/cognitive/02/geons2.jpg

  • But we cannot. On the other hand, we can do this by looking at (b).
  • Therefore, what matters to people is not the geons as such but their important features.
  • People were faster and more accurate when the complementary images had the same geons than when they had different geons. These findings support the theory that relations are important both for grouping features together into larger units (geons) and for showing the relation among geons to form more complex objects.

Information processing stages

http://dl.dropbox.com/u/3770752/wiki/cognitive/02/sensory%20memory.png

  • sensory memory
  • Are there capacity limits for pattern recognition?
    • Whole report vs. partial report. (Does all of the information get in, or only part of it?)

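(3) Whole-report and partial-report methods
When recalling letters presented in three rows (top, middle, bottom), the whole-report method asks participants to recall as many of the presented items as possible, in whatever order they come to mind. In the partial-report method, a high, medium, or low tone is assigned to the top, middle, and bottom rows respectively; after the stimulus is presented, one of the three tones is sounded, and participants recall the items in the corresponding row. Because the tone is chosen at random, a participant who can recall two items for one tone could presumably have recalled two items for either of the other tones as well, so a total of six items are taken to have been stored.
(Source: 記憶のメカニズム II)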

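Sperling reasoned that because sensory memory lasts only a short time, the originally stored memory fades while the participant is still reporting.
He therefore instructed participants to report only a designated part of the presented information, and had them report all of the information in stages.
This method is called the partial-report procedure.
With this method, Sperling demonstrated that all of the presented information is stored once, for a brief time.
In contrast to the partial-report procedure, the method in which participants report everything that was presented is called the whole-report procedure.
(Source: route cohの大学の勉強部屋)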

    • partial-report procedure: A task in which observers are cued to report only certain items in a display of items (CP 30)

George Sperling (1960)

G T F B
Q Z C R
K P S N

  • Partial Report
    • Tone presented at a delay of 0, 0.15, 0.30, 0.50, or 1 sec
    • Pitch of tone indicates which row to recall

Result

  • Whole-report procedure: Ss were able to recall about 4.5 items
    • Is this all we can see?
    • Is this limit due to the capacity of the sensory store or to the fading of memory?
  • Partial-report procedure: Ss were able to recall about 3 letters of the cued row
    • Ss did not know in advance which row they would have to report.
    • Suggests the sensory store holds about 9 pieces of information (about 3 letters per row × 3 rows)
    • The estimate decreases if participants have to wait to report.
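The reasoning behind the "about 9 pieces" estimate is a simple multiplication: since the cued row is unpredictable, recall for one row is taken to stand for every row. A minimal sketch of that arithmetic, using the figures from these notes (the function and variable names are made up for illustration):

```python
def estimate_available_items(recalled_per_cued_row, n_rows):
    """Partial-report estimate: per-row recall multiplied by the number of rows."""
    return recalled_per_cued_row * n_rows


whole_report_recall = 4.5                             # items recalled in whole report
partial_estimate = estimate_available_items(3, 3)     # 3 items per row, 3 rows

print(partial_estimate)
print(partial_estimate > whole_report_recall)
```

The estimate is larger than what whole report yields, which is why the whole-report limit is attributed to fading during the report rather than to how much was initially available.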

Ulric Neisser called the part of sensory memory that deals with visual information iconic memory (abbreviated: the icon).
It is also called the visual information store. The icon has a large capacity and fades within a short time.
(Source: route cohの大学の勉強部屋)

Sperling’s model (1963, 1967)

  • visual information store (VIS)
    • Large capacity but memory traces fade quickly.
      • Stimulus intensity, Duration, Masking
    • a sensory store that maintains visual information for approximately 1/4 of a second (CP 32)
    • scan component: the attention component of Sperling's model that determines what is recognized in the visual information store. (CP 33)
      • (parallel processing : Carrying out more than one operation at a time, such as looking at an art exhibit and making conversation (CP 33))
      • (serial processing : Carrying out one operation at a time, such as pronouncing one word at a time (CP 32))
    • iconic memory

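The sensory memory for visual stimuli is called "iconic memory", and according to Sperling's experiments its duration is about 500 milliseconds or less.

Rehearsal means practice or repeated study: information held in the short-term store is repeated aloud or recited silently in order to strengthen the memory. Coding means, for example, grouping words such as swimming, baseball, and tennis under the concept "sports", or linking words such as amusement park and shopping center to concrete visual images; it is generally understood as the work of organizing and encoding in cognition.
(Source: 短期記憶のメカニズムを説明する基礎理論)

The sensory memory for auditory stimuli is called "echoic memory", and according to the experiments of Glucksberg and Cowan its duration is about 5 seconds or less.

    • In current thinking, this does not belong here.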

Rumelhart’s model (1970)

  • Added feature detection to Sperling’s ideas of the visual store and parallel processing.
  • Perception depends upon:
    • Number of features
    • Time allowed
  • Perceptual limitations impair performance on partial-report task (not memory limits)
  • Was more specific about how pattern recognition occurs: assumed that recognition occurs by identifying the features in a pattern

But is everything bottom-up processing? (feature -> interpret?)

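The word superiority effect is the finding that, even when the constituent letters are the same, a letter is perceived better when it is in a word than when it is in a nonword. For example, comparing the word [word] with the nonword [rdow], when participants are asked to recognize whether a [d] was present, performance is better when the word [word] was presented.
It is explained by the interactive activation model: once the word is activated, the activation of the letter units it contains is strengthened.
(Source: ぶり。の勉強部屋)

    • word superiority effect: The finding that accuracy in recognizing a letter is higher when the letter is in a word than when it appears alone or is in a nonword (CP 35)
    • Letters are easier to recognize when they are in a word (top-down)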

top-down processing

  • Based on Reicher (1969)
    • Nonwords are much more difficult.
  • Q: If letter recognition needs to precede word recognition (as feature theories claim), then why are letters easier to identify when they are parts of words?
  • interactive activation model
    • A theory that proposes that both feature knowledge and word knowledge combine to provide information about the identity of letters in a word. (CP 36)

http://dl.dropbox.com/u/3770752/wiki/cognitive/02/recognition%20of%20the%20letter%20K%20in%20work.png

  • Recognition of the letter K in work.

interactive activation model

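The interactive activation model (McClelland & Rumelhart, 1981) develops the logogen model of Morton, J. and assumes a hierarchical network with three layers: a feature level, a letter level, and a word level. Each layer is made up of neuron-like units, and the units are linked to one another, within and between levels, by excitatory and inhibitory connections. Units within a level inhibit one another, and there is feedback from the word level to the letter level. Units at the three levels are activated when they are consistent with one another and inhibited when they are not.
(Source: ぶり。の勉強部屋)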

  • Assumptions:
    • parallel processing
      • Sensory store
      • Within each level
    • Each level influences activation of ‘nodes’ at other levels
      • excitatory connection:
        • A positive association between concepts that belong together, as when a vertical line provides support for the possibility that a letter is a K.(CP 36)
      • inhibitory connection:
        • A negative association between concepts that do not belong together, as when the presence of a vertical line provides negative evidence that a letter is a C (CP 36)
  • neural network model
    • A theory in which concepts (nodes) are linked to other concepts through excitatory and inhibitory connections to approximate the behavior of neural networks in the brain (CP 37)
      • node: The format for representing concepts in a semantic network (CP 37)
      • activation rule: A rule that determines how inhibitory and excitatory connections combine to determine the total activation of a concept (CP 38)
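How excitatory and inhibitory connections might combine can be sketched with a toy activation rule. This is a heavily simplified illustration, not the equations of the published model: the weights, the feature names, the three-word lexicon, the size of the word-level feedback, and the plain summation rule are all assumptions made for this example.

```python
# Hypothetical feature-to-letter connections: positive = excitatory, negative = inhibitory.
WEIGHTS = {
    "K": {"vertical_line": 1.0, "diagonal_line": 1.0, "curve": -1.0},
    "C": {"vertical_line": -1.0, "diagonal_line": -1.0, "curve": 1.0},
}
LEXICON = {"WORK", "FORK", "CORK"}   # hypothetical word-level nodes
WORD_FEEDBACK = 0.5                  # assumed strength of word-to-letter feedback


def letter_activation(letter, features, context):
    """Sum bottom-up feature evidence and top-down word evidence (assumed rule)."""
    bottom_up = sum(WEIGHTS[letter].get(feature, 0.0) for feature in features)
    top_down = WORD_FEEDBACK if (context + letter) in LEXICON else 0.0
    return bottom_up + top_down


seen = {"vertical_line", "diagonal_line"}
print(letter_activation("K", seen, "WOR"))   # letter in a word context
print(letter_activation("K", seen, "OWR"))   # same letter in a nonword context
print(letter_activation("C", seen, "WOR"))   # letter contradicted by the features
```

In this sketch the same features give K more activation after "WOR" than after "OWR", which is one way to picture the word superiority effect as top-down feedback.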

but what about

  • “Does the huamn mnid raed wrods as a wlohe?”

Hearing with your eyes

  • Or, interactivity across sensory modalities
  • McGurk effect

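McGurk and MacDonald (1976) were the first to report that when video of one phoneme being spoken is combined with audio of a different phoneme, a third phoneme is perceived. For example, when video of a speaker saying "ga" is combined with audio of "ba", listeners hear "da", which is neither "ga" nor "ba". This phenomenon shows that phoneme perception is not determined by auditory information alone but is also influenced by information from other sensory modalities, such as the visual information of the speaker's mouth, and it is a representative example of audiovisual integration. Outside Japan it is sometimes called the McGurk-MacDonald effect.
(Source: wikipedia)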

    • Perceptual phenomenon in which we integrate visual info with auditory info
    • visual + auditory
  • YouTube: McGurk effect with explanation

  • We recognize things based not only on what we see but also on various other factors.

In summary