Eye tracking

Applications within cognitive science

Dr. Christa van Mierlo

Why is eye tracking used in cognitive science?

Scan patterns give delayed information about the mental processes that are developing in a person’s mind and reveal what visual information is (going to be) used by these processes.

Frequently studied EM components

  • Fixations

    • Gaze stays fixed on one position

    • Intake of new visual information

    • Planning of new eye movement

  • Saccades

    • Fast movement of both eyes in the same direction

    • Processing of new visual input is limited:

      low spatial frequencies attenuated

      high spatial frequencies unaffected.

    • Peak velocity increases with saccade amplitude (the "main sequence")

Visual Search

  • The sometimes difficult process of finding a target among distractors in often cluttered visual environments.

  • Physical and cognitive processing limitations can prevent us from instantly recognizing the presence of a target item in a single glance (e.g. when the target shares a large number of features with the distractors, or when the target specification is fuzzy)

  • This can be overcome by focusing attention:

    • Bottom up

    • Top down

Bottom-up factors drawing attention

  • onsets (e.g., Theeuwes, Kramer, Hahn, Irwin, & Zelinsky, 1999; Yantis & Jonides, 1984)

  • unique colors (e.g., Theeuwes, 1994; Theeuwes & Burger, 1998)

    Even when the location of the target is known, highly salient features that are known not to be associated with the target can still capture attention (Christ & Abrams, 2006).

Top-down factors focusing attention

  • Target specificity: the number of features the candidate shares with the target (e.g., Folk, Remington, & Johnston, 1992; Folk, Remington, & Wright, 1994)

  • Memory (Boot, McCarley, Kramer, & Peterson, 2004; Brockmole & Henderson, 2006; Peterson & Kramer, 2001).

Four Eye Tracking studies within Visual Search

  • The effects of target template specificity on visual search in real-world scenes: Evidence from eye movements (Malcolm & Henderson, 2009)

  • Comparing eye movements to detected vs. undetected target stimuli in an identity search task (Jacob & Hochstein 2009)

  • Stable individual differences in search strategy?: The effects of task demands and motivational factors on scanning strategy in visual search (Boot et al., 2009)

  • Where to look next? Eye movements reduce local uncertainty (Renninger et al. 2007)

The effects of target template specificity on visual search in real-world scenes

Malcolm & Henderson (JOV 2009)

Searching is faster for more explicitly specified targets. Why?

  • Does this affect the activation map that is used to select probable target regions for fixation?

  • Does this allow for faster evaluation of a target candidate?

  • Or does it simply allow the search to begin faster?


Subjects had to look for a specific object within a visual scene.

  • Target could be specified by a word or a picture. Pictures specify the target template more elaborately than words.

  • To manipulate the time that the subject had to build up a target template and keep it salient in memory, they manipulated the SOA between the cue presentation and the onset of the visual scene (short/long).

  • To manipulate target familiarity, the target specification was either shown 4 times to the subject prior to the experiment or not at all.


Divide scanpaths into different epochs:

  • timing of first saccade (= time it takes to determine the first possible candidate for target)

  • time it takes to find target (all saccades and fixations up to the first fixation on the target, representing processes in which target candidates are selected and rejected)

  • time it takes to decide that the object really is the target (verification)
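The three epochs above can be sketched in code. This is a minimal illustration, not the authors' analysis pipeline: the `Fixation` record, the `on_target` flag and the sample trial are hypothetical, and the sketch assumes the trial starts with a fixation that is not on the target.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    start_ms: float   # fixation onset, relative to scene onset
    end_ms: float     # fixation offset (the next saccade starts here)
    on_target: bool   # True if the fixation lands in the target region

def split_epochs(fixations, response_ms):
    """Return (initiation, scanning, verification) times in ms for one trial."""
    # Search initiation: scene onset until the first saccade is launched.
    initiation = fixations[0].end_ms
    # First fixation on the target, skipping the initial (pre-saccade) fixation.
    first_hit = next(
        (f for f in fixations[1:] if f.on_target), None
    )
    if first_hit is None:
        return None  # target never fixated; epochs undefined in this sketch
    # Scanning: first saccade until the first fixation on the target.
    scanning = first_hit.start_ms - initiation
    # Verification: first target fixation until the response.
    verification = response_ms - first_hit.start_ms
    return initiation, scanning, verification

# Made-up trial for illustration only.
trial = [
    Fixation(0, 250, False),
    Fixation(290, 480, False),
    Fixation(520, 700, False),
    Fixation(740, 1300, True),
]
print(split_epochs(trial, response_ms=1250))  # (250, 490, 510)
```

By construction the three epochs sum to the total search time, which is what allows the cue-type effects to be attributed to one epoch or another.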


Picture rather than word cues resulted in:

  • Faster total search times

  • Shorter scanning and verification times

    • Fewer regions visited

    • Shorter scanning fixation durations (rejection of distractors)

      Longer SOAs resulted in faster search initiation, but there was no interaction with cue type


Knowledge of a target’s appearance prior to search benefits scanning in 2 ways:

  • Facilitating the selection of potential target locations

  • Decreasing the time that it takes to reject fixated distractors before moving on to next potential target

Proposed neural mechanism

  • People represent the visual scene in an activity map

  • Search is accelerated by increasing the topographical activity associated with target-similar features and decreasing the noisy activity of target-irrelevant features, so that candidate selection and target verification happen more quickly.

Comparing eye movements to detected vs. undetected target stimuli in an identity search task

Jacob & Hochstein (JOV 2009)

In conscious search:

  • What determines whether a target will be found?

  • Does conscious detection come before or after concentrated fixations on the target?

  • What is the relation between repeated fixations on the same scene region and limited WM capacity?

  • What in the sequence of fixations reflects or influences ultimate conscious perception?


  • Find two identical cards among distractors

  • In each set there were two pairs of identical cards instead of just one

  • Participants were not informed of this

  • After a learning session of 100 trials their eye movements were measured for 50 trials


Compare fixations on detected targets with fixations on undetected targets

(Figure legend: the detected pair in red, the undetected pair in blue.)


  • More and longer fixations on detected items than on undetected items.

  • Less distance between fixations on detected than on undetected items.

  • The patterns of fixations are nearly identical up to the point of approximately 4 fixations before the end of the trial (~1.5 s before the first mouse click). This is true for both long and short trials.

  • The number of fixations needed for identification is more or less fixed within a range.


So fixations are needed to identify the target; detection is not an inherent property of the stimulus!

Does the large number of fixations give rise to detection or is it a result of detection?

  • The fixations just before the mouse click did not always land on the target cards, indicating that they were not the result of a verification process.

  • The relatively small increase in the cumulative number of fixations on detected pairs in long searches implies that the number of fixations on targets needed for identification is defined within a certain period of time.

    So it seems that the increase in fixations near the target is necessary for its detection and does not result from verification!

Why a short burst of fixations near target cards just before detection?

It may be difficult to keep many cards in working memory at the same time, so that fixations need to be close to each other to associate place with identity. Since the sequential distance decreases when approaching detection, perhaps a necessary condition for detection is that two cards be represented concurrently in working memory.

  • Detection may depend on the increase in fixations rising above some threshold.

  • The point where the slope exceeds a pre-determined threshold may be regarded as the bifurcation point where there is a change of state in the search process; a transition between a first stage of “search in the dark” to a second stage of “early implicit recognition”.

Proposed model

  • Stage 1: Initial search; random fixations on the different cards in arbitrary order.

  • Stage 2: Implicit (unconscious) recognition of the target pair, perhaps controlling and guiding eye movements to the relevant sensed location of these target cards.

  • Stage 3: Insight: Explicit detection with conscious knowledge of target presence and its location followed by rapid marking of the two cards.

Stable individual differences in search strategy? The effects of task demands and motivational factors on scanning strategy in visual search

Boot et al. (JOV 2009)

This study seeks to further evaluate and understand individual differences in visual search behaviour in the context of search tasks in which poor strategies can have a major impact on performance.


  • In Boot et al. (2006), participants viewed dynamic displays in which up to 24 dots moved across the display.

  • During some trials a new dot appeared in the display and the task of the participant was to push a button when this occurred.

A surprisingly large range in accuracy:

some participants almost always detected the new dot

others missed 50% or more of the onset events

  • The more participants moved their eyes among moving objects in the display, the fewer targets they detected.

  • When overt searchers were instructed to search covertly, their performance matched the performance of covert searchers. Conversely, covert searchers instructed to search overtly performed just as poorly as overt searchers.

  • This ability to switch strategies suggests that strategy is not dictated by the size of an individual’s attentional field or individual differences in visual processing.

Their current study seeks to explore:

  • whether stable individual differences in preference for a certain scanning strategy might explain maladaptive scan strategies

  • the degree to which strategy might be modulated by task demands, feedback, motivation and monetary incentives


Study scanning strategies during:

  • dynamic dot detection task

  • an efficient search task (a 45° left or right tilted line among vertical lines)

  • an inefficient search task (a tilted T among randomly tilted Ls that had an offset of the _)

  • a change blindness task in which participants searched for changes in driving scenes (change in colour, presence or position masked by other changes).

  • Change blindness and inefficient search require focal attention to the target (overt attention).

  • Dynamic dot detection task and efficient search task do not (covert attention).


  • As a measure of overt versus covert searchers: average number of eye movements made per second.

  • averaged across set sizes.

  • correlated across tasks:

    If participants use the same scan strategy in different tasks, regardless of whether or not this strategy is adaptive, then the rates of eye movements on the different search tasks should be correlated.
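The measure and the cross-task correlation described above can be sketched as follows. The per-participant rates are invented purely for illustration, and the helper names `saccade_rate` and `pearson_r` are not taken from the study.

```python
from math import sqrt

def saccade_rate(n_saccades, duration_s):
    """Average number of eye movements per second for one trial or block."""
    return n_saccades / duration_s

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical saccade rates (saccades/s), one entry per participant,
# already averaged across set sizes within each task.
rates_dot_detection = [0.5, 1.2, 2.0, 2.8, 3.1]
rates_inefficient_search = [2.0, 2.6, 3.1, 3.9, 4.0]

r = pearson_r(rates_dot_detection, rates_inefficient_search)
print(f"cross-task correlation r = {r:.3f}")
```

In this made-up data set every participant moves their eyes more in the inefficient search task (task-specific modulation), yet the rank order of participants is preserved, which is the pattern a high cross-task correlation would indicate.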


Performance in the dynamic dot detection task has been shown to be almost exclusively driven by strategy (Boot et al., 2006).

If the eye movements on this task differ between individuals but are, for each subject, similar to those seen on other visual search tasks, these differences in scan pattern are likely to be caused by differences in strategy choice, not differences in visual processing ability.

In difficult and inefficient search tasks, a covert search strategy would be highly maladaptive due to the difficulty of discriminating complex stimuli in the periphery.

In an efficient or easy search task, eye movements might hinder performance by focusing attention on individual items rather than allowing the unique target item to pop-out.


  • A covert scanning strategy was the optimal strategy in the dynamic dot detection task

  • A clear trend toward faster response times was found for more overt searchers in the change blindness task.

  • An overt scanning strategy was the optimal strategy for the inefficient search task

  • No effect of scanning strategy on performance for the efficient search task

  • Observers retain their scanning strategy across different tasks; however, they also adjust their scanning strategy depending on the task performed.

  • Those observers who adjust their scanning strategy to a greater degree exhibit the greatest overall benefit in accuracy.


  • Although strategy remained similar, task-specific modulation of saccade rate was clearly observed. Participants made fewer saccades in tasks such as the dynamic dot detection task and the efficient search task compared to the change blindness task and the inefficient search task.

  • However, in general, strategy tended to remain similar across tasks, even when that strategy resulted in slow or inaccurate performance.

Experiment 2

  • Can participants modify their scanning approach when it becomes clear that it is resulting in poor performance?

  • In Experiment 2, participants were provided with feedback after each trial, and monetary incentive to ensure feedback would be attended.

    • If participants do not modify their strategy this would be evidence of strong, stable individual differences.

    • If participants change their strategy based on feedback and motivation, similar strategy across many tasks seems to be a weak preference to utilize one strategy over another under conditions of uncertain performance and low consequences.


  • Dynamic dot detection versus inefficient search (a tilted T among randomly tilted Ls that had an offset of the _ of the Ls).

  • Explicit feedback

    • for DDD:

      ‘incorrect; you missed the target/no target present’ or

      ‘correct – target/no target present’

    • For IS:

      ‘You were fast!’ or ‘A bit slow!’

      Fastest participant received an additional 20 dollars in payment


  • Feedback and monetary incentives caused participants to shift their strategy rather than maintain similar strategies across tasks.

  • Thus, based on situational factors, participants will abandon their default strategy and adopt a strategy that is more adaptive to the task at hand.

General Discussion

  • Scan strategies remain stable across a variety of both static and dynamic tasks when the relationship between strategy and performance is unclear or motivation to perform well is low.

  • This suggests that participants might be utilizing a default strategy.

  • Scan strategies also appear to be shaped by the task. On average, participants tended to adopt more overt or covert strategies depending on the demands of the visual search task at hand, even without explicit feedback about performance.

  • Those participants who varied their strategy performed more accurately overall compared to participants who showed less variability across tasks

Why are there differences in default strategy?

Maybe to compensate for:

  • Differences in visual discrimination ability

  • Differences in attentional abilities

    As a result of differences in the structure and function of various brain regions known to control endogenous eye movements

Where to look next? Eye movements reduce local uncertainty

Renninger et al. (JOV 2007)

  • They use information theory to probe the underlying decision strategies that govern eye-movement planning.

  • To evaluate the validity of their model they compare individual fixations against strategy predictions using a signal detection approach.


Subjects had to familiarize themselves with a novel and abstract silhouette and decide whether a second silhouette was identical to the one that they familiarized themselves with.

  • Five levels of difficulty: degree of boundary change

  • Degree of boundary change was calculated as the change in orientation entropy along the boundary using a ‘fixation’ at the shape centroid. This metric scales with human shape discrimination performance (Renninger, Verghese, & Coughlan, 2005a).

Model: Global strategy

  • Model’s aim is to build an accurate representation of each shape as it is studied with eye movements so that it can be discriminated from a highly similar shape during the matching phase.

  • Given current knowledge of V1 processing, the information needed for this task is the edge orientations derived from the shape contour.

With each fixation, the model takes a foveated measurement of the stimulus:

  • Estimating the orientations using a set of filters that are selective to eight discrete orientations

  • within a pooling neighbourhood whose size depends on distance from the current fixation point

  • the number of occurrences of each veridical edge orientation is counted to create a histogram (or probability distribution after normalization) of the different orientations at that location.

  • With each successive fixation, the current map is updated by multiplying it with the new measurement distribution. This map is flat before the very first fixation is made.

  • Information is the entropy of a probability distribution:

    Entropy: $H = -\sum_x p(x) \log p(x)$

  • When there are many different orientations in a neighbourhood (e.g. a bumpy contour in the periphery), all orientations are equally likely and the distribution will be flat (high entropy). Alternatively, straight edges will produce energy at a single orientation or very peaked distributions (low entropy).

  • As the evidence of orientations accumulates with successive fixations, the uncertainty of the shape knowledge at any point in time can be represented by computing a resolution-dependent entropy (RDE) map.
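The update and entropy steps above can be sketched for a single map location. This is a simplified illustration under stated assumptions: the base-2 logarithm (entropy in bits) is a choice made here, the measurement histogram is invented, and the resolution dependence (the eccentricity-dependent pooling neighbourhood) is not modelled.

```python
from math import log2

N_ORIENTATIONS = 8  # eight discrete orientation bins, as in the text

def normalize(counts):
    """Turn a histogram of counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def entropy(p):
    """Shannon entropy H = -sum p(x) log2 p(x), in bits."""
    return -sum(px * log2(px) for px in p if px > 0)

def update(belief, measurement):
    """Multiply the current distribution by a new measurement and renormalize."""
    return normalize([b * m for b, m in zip(belief, measurement)])

# Before the first fixation the map is flat: every orientation equally likely.
belief = normalize([1] * N_ORIENTATIONS)
print(entropy(belief))  # 3.0 bits, the maximum for eight bins

# Hypothetical measurement at a straight edge: energy mostly at one orientation.
measurement = normalize([1, 1, 10, 1, 1, 1, 1, 1])

after_one = update(belief, measurement)
after_two = update(after_one, measurement)
print(entropy(after_one), entropy(after_two))  # uncertainty shrinks each time
```

A flat distribution gives the maximum entropy, and each consistent measurement sharpens the distribution and lowers the entropy, which is the sense in which successive fixations reduce uncertainty at a location.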

Can this global strategy model predict subjects’ eye movements across the shapes?


  • Percentage correct ranged from 75% to 78%.

  • Mean amplitudes of object-exploring saccades ranged from 2.38° to 4.44°. Mean dwell times ranged from 175 to 403 ms. All are well within the normal range for naturalistic stimuli and search tasks.

  • Subjects typically made three to five fixations around the object in the viewing time allowed.

Fixated locations were found to be spatially distributed in a donut shape for three of four subjects.

  • Red fixations are the first fixations to the object. They do not have the same donut distribution as the other object-exploring fixations and are biased in the preview direction.

  • Their clustering suggests that they are simply localizing saccades that are mostly independent of detailed shape information.

Global strategy predictions?

  • If the goal of eye movements is to gather task relevant information, then the best strategy is to fixate on locations that maximize the total information gained about the contour orientations.

  • This prediction is computed by evaluating all possible next fixation locations in a grid of positions spaced 0.25° apart and selecting the position that yields the greatest gain in total information (= the greatest reduction in total uncertainty).

First impression

  • The distributions of saccade amplitude and fixation location are qualitatively similar to those measured for our subjects.

  • The distributions generated by a random strategy that predicts fixations anywhere on the stimulus are quite different.

Quantification of fit: fixation error

  • Every human fixation is mapped to the closest strategy fixation, and the distance errors are accumulated. The mean of these samples is the fixation error and is taken as one measure of how well strategy-predicted locations align with human fixations.

  • The significance of the alignment is assessed by bootstrapping (1,000 iterations) to get 95% confidence intervals of the fixation error.
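A minimal sketch of the fixation error and its bootstrapped confidence interval follows. The coordinates are invented, the function names are hypothetical, and the percentile method used for the interval is an assumption, since the slides do not say how the interval was derived from the 1,000 iterations.

```python
import random
from math import hypot

def nearest_distances(human, predicted):
    """Distance from each human fixation to the closest strategy fixation."""
    return [
        min(hypot(hx - px, hy - py) for px, py in predicted)
        for hx, hy in human
    ]

def fixation_error(human, predicted):
    """Mean nearest-neighbour distance between human and strategy fixations."""
    distances = nearest_distances(human, predicted)
    return sum(distances) / len(distances)

def bootstrap_ci(samples, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = []
    for _ in range(n_iter):
        resample = [rng.choice(samples) for _ in samples]
        means.append(sum(resample) / len(resample))
    means.sort()
    lower = means[int((alpha / 2) * n_iter)]
    upper = means[int((1 - alpha / 2) * n_iter) - 1]
    return lower, upper

# Made-up fixation coordinates in degrees of visual angle.
human = [(1.0, 1.0), (2.0, 3.0), (4.0, 4.0)]
predicted = [(1.0, 2.0), (4.0, 5.0)]

error = fixation_error(human, predicted)
low, high = bootstrap_ci(nearest_distances(human, predicted))
print(f"fixation error = {error:.3f} deg, 95% CI = [{low:.3f}, {high:.3f}]")
```

Comparing two strategies then amounts to checking whether their confidence intervals for the fixation error overlap.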

Human fixations are closer to the global strategy than to random fixation.

Receiver Operating Characteristic

  • Each new fixation is overlaid on the current map, which is updated using the previous series of fixations. The map is rescaled from 0 to 1, and the prediction value is taken as the maximum value that falls within 1° of the human fixation.

  • ROC curves are computed and the area under the curve (AUC) is determined to assess the power of the global strategy prediction.

    • Hits: the probability that the prediction value exceeds a threshold at fixated locations

    • False alarms: the probability that the prediction value exceeds a threshold at random locations.

    • Hits and false alarms are plotted with changing threshold, sweeping out the ROC curve

  • If the global prediction is no better than random at predicting human fixations, the ROC curve should lie along the positive diagonal (AUC = 0.5). If the global strategy is a good predictor of human fixations, it will tend toward the upper left-hand corner of the plot (0.5<AUC<1.0)

  • To assess the significance of the AUC, hits and false alarms are resampled with replacement to produce bootstrapped estimates (significantly better than chance if 95% confidence interval does not include 0.5).
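The ROC construction above can be sketched as a threshold sweep over prediction values. The values are invented for illustration, and one detail differs from the wording on the slide: the sketch counts values at or above the threshold (rather than strictly above) so that the curve ends at the point (1, 1).

```python
def roc_points(fixated_values, random_values):
    """Sweep a threshold and return (false-alarm rate, hit rate) pairs."""
    thresholds = sorted(set(fixated_values) | set(random_values), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        # Hit: prediction value at a fixated location reaches the threshold.
        hit_rate = sum(v >= t for v in fixated_values) / len(fixated_values)
        # False alarm: prediction value at a random location reaches it.
        fa_rate = sum(v >= t for v in random_values) / len(random_values)
        points.append((fa_rate, hit_rate))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical prediction values (map rescaled to 0..1).
at_fixations = [0.9, 0.8, 0.7, 0.4]
at_random = [0.6, 0.3, 0.2, 0.1]

print(auc(roc_points(at_fixations, at_random)))    # 0.9375
print(auc(roc_points(at_random, at_random)))       # 0.5, chance level
```

The significance test described in the last bullet would repeat this computation on resampled lists of hits and false alarms and check whether the resulting 95% interval of AUC values excludes 0.5.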

For all of our observers, the global model is significantly better than random fixations at predicting the next fixation.


  • Subjects seem to fixate locations that maximize reduction of uncertainty.

  • The simple fact that the global model produces a donut-shaped distribution of fixations may be enough to align it with human fixation patterns.

  • A much more stringent test would be one that compares the performance of the global strategy against a smarter random strategy that knows shape information is near the edges.

Smart Random Strategy

  • Fixation error:

    • Every new human fixation is mapped to a randomly drawn fixation on previous trials and the distance errors are accumulated.

  • ROC:

    • Hits: pixels of strategy map within 1° of human fixations that exceed threshold

    • False alarms: pixels of strategy map within 1° of a fixation randomly drawn from their fixations on other trials

ROC curves shift for each observer toward the diagonal but the AUC is still significantly greater than 0.5.


  • Global strategy is omniscient because the benefit of all possible fixations is fully known before a decision is made about the best next fixation. This would need an enormous amount of computational power. It is more likely that we use estimates (e.g., heuristics or learned priors) to determine the benefit of each possible next fixation. But again, it is unclear how the visual system would do this without complex computation.

  • Is there a simpler, more efficient strategy that produces similar fixation behaviour?

Other strategies

  • Two biologically plausible strategies for making eye-movement decisions:

    • Saliency

    • Local Uncertainty.

  • They evaluate each strategy against the smart random strategy baseline.


  • Given that the shapes in the psychophysical task are novel, top–down influences should be minimized and observers may simply look at salient points on the shape. In our stimuli, salient locations are those that have an orientation that differs from its surround, such as corners or sharp points.

  • We produced saliency prediction maps for our stimuli using a model of Itti and Koch (2000).

Local Uncertainty

  • Only the most informative points, or points of maximum entropy, are fixated.

    To better understand this difference, imagine two nearby locations that have similar prediction values. The global strategy might be to fixate between them to maximize information about both locations, whereas the local uncertainty strategy would fixate the one with slightly higher uncertainty (more information).

  • To model this, we used the RDE map directly as a prediction map and the strategy is to fixate the hot spots.
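The two-nearby-locations example can be made concrete with a one-dimensional toy map. This is only a sketch: the Gaussian fall-off of information gain with distance from fixation, its width `sigma`, and the uncertainty values are assumptions made here and stand in for the resolution-dependent model used in the study.

```python
from math import exp

def local_uncertainty_choice(uncertainty):
    """Fixate the hot spot: the single location with maximum uncertainty."""
    return max(range(len(uncertainty)), key=lambda i: uncertainty[i])

def expected_gain(uncertainty, fixation, sigma=2.0):
    """Total uncertainty resolved by one fixation (assumed Gaussian fall-off)."""
    return sum(
        u * exp(-((i - fixation) ** 2) / (2 * sigma ** 2))
        for i, u in enumerate(uncertainty)
    )

def global_choice(uncertainty, sigma=2.0):
    """Evaluate every candidate location and pick the largest total gain."""
    return max(
        range(len(uncertainty)),
        key=lambda f: expected_gain(uncertainty, f, sigma),
    )

# Toy uncertainty map: two nearby peaks, the right one slightly higher.
uncertainty = [0.0] * 11
uncertainty[4] = 1.0
uncertainty[6] = 1.1

print(local_uncertainty_choice(uncertainty))  # 6: the slightly higher peak
print(global_choice(uncertainty))             # 5: between the two peaks
```

The local strategy needs only one pass over the map, whereas the global strategy has to evaluate the consequences of every candidate fixation, which illustrates why the former is the computationally cheaper proposal.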


  • Fixation error and ROC curves for each strategy.

  • In the case of the saliency strategy, we include a 1° mask that inhibits saliency signals at previously fixated locations to mimic the dynamic changes in the saliency map due to IOR. This will presumably improve the prediction of the saliency strategy by reducing the number of salient locations that the random strategy may predict.

  • For the local uncertainty strategy, the RDE map is updated from the history of human fixations.

  • For both strategies, the prediction strength for the next fixation is evaluated using the maximum value of the strategy prediction map within 1 ° of the fixation.
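Reading a prediction value off a map can be sketched as below. The map is a plain list of rows, the radius is given in pixels (the conversion from 1° to pixels depends on the display and is left as a hypothetical parameter), and the sample values are invented.

```python
from math import hypot

def rescale(pred_map):
    """Rescale a 2-D map linearly to the range 0..1."""
    values = [v for row in pred_map for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in pred_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in pred_map]

def prediction_value(pred_map, fix_row, fix_col, radius_px):
    """Maximum map value within `radius_px` pixels of the fixation."""
    best = None
    for r, row in enumerate(pred_map):
        for c, value in enumerate(row):
            if hypot(r - fix_row, c - fix_col) <= radius_px:
                if best is None or value > best:
                    best = value
    return best

# Hypothetical 5x5 strategy map with one strong and one weaker hot spot.
raw_map = [[0.0] * 5 for _ in range(5)]
raw_map[0][4] = 5.0
raw_map[2][3] = 3.0
strategy_map = rescale(raw_map)

# Fixation at the map centre; radius_px stands in for 1 degree in pixels.
print(prediction_value(strategy_map, 2, 2, radius_px=1))
print(prediction_value(strategy_map, 2, 2, radius_px=3))
```

Taking the maximum within a small radius gives a strategy credit when the observer lands near, but not exactly on, a predicted location.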


Both the saliency and local uncertainty strategies produce a donut-shaped distribution, but neither strategy shows a distribution of saccade amplitudes exactly like the observers.

Fixation error

  • Neither the saliency nor the local uncertainty strategy performs equal to or better than the global strategy.

  • FE ignores the sequence of fixations, possibly explaining the larger errors with this metric.


  • The local uncertainty strategy is at least as good as the global strategy at predicting where observers will look next.

  • It is well known that humans make fixations toward the centroids of small shapes (Melcher & Kowler, 1999).

  • What if observers are combining the local uncertainty strategy with a simple centroid prior when planning fixations?

    The discrepancy between fixation error and the ROC finding could be explained if observers consistently undershoot the maximum of the local uncertainty prediction but still land within a hot spot.

Local Uncertainty + Centroid bias

  • The calculated fixation locations f will be biased toward the centroid by a weight w:

  • C is the centroid and f̂ is the strategy-defined prediction.
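The equation itself did not survive in this transcript. One natural reading of the definitions given here, offered as an assumption and not as the authors' exact formula, is a weighted average of the centroid and the strategy prediction:

```latex
f = w\,C + (1 - w)\,\hat{f}, \qquad 0 \le w \le 1
```

Under this reading, w = 0 leaves the strategy prediction unchanged and w = 1 places every fixation on the centroid. The same assumed form can be inverted to recover an intended fixation from an observed one, for w < 1:

```latex
\hat{f} = \frac{f - w\,C}{1 - w}
```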

Fixation error

The spatial distribution of predicted fixations has a more compact donut shape and looks strikingly similar to the human pattern. This improved distribution is reflected in the decrease in fixation error.

ROC: AUC comparison

Given observed fixation locations and different values of w, the observer’s intended fixation can be calculated and superimposed on the local uncertainty strategy map. Using the prediction values from these maps, again ROC curves are computed.

For all subjects, the local uncertainty strategy with centroid weighting provides the best prediction of human fixation locations.


  • The saliency model did not incorporate eccentricity effects. Salience is less pronounced for more eccentric locations. However, even without eccentricity factors, the saliency strategy shows some predictive power.

  • In the stimuli, local uncertainty and saliency predictions often overlap, especially early in the fixation sequence. This correlation is likely present in all natural stimuli. Stimuli that cleanly isolate local uncertainty and saliency effects would be needed to determine if the visual system makes use of only one strategy or if it uses both strategies.

  • The inference of orientation at a point, discretization into eight bins, and use of vernier parameters are all approximations that may introduce error into the estimates of local uncertainty. As a result, their prediction maps may not be correct in detail.

  • Isolated maxima will predict a fixation regardless of neighbouring activity. This may be the underlying cause of the bimodal distribution of saccade length for the local uncertainty strategy.

  • The visual system, perhaps through lateral interactions, may smooth these spurious signals. Also, large areas of activity may be sharpened through nonlinear competition.

Thank you for your time!

