Perception of differences in naturalistic dynamic scenes, and a V1-based model

To, Michelle and Gilchrist, I. D. and Tolhurst, D. J. (2015) Perception of differences in naturalistic dynamic scenes, and a V1-based model. Journal of Vision, 15 (1). pp. 1-13. ISSN 1534-7362

Full text not available from this repository.


We investigate whether a computational model of V1 can predict how observers rate perceptual differences between paired movie clips of natural scenes. Observers viewed 198 pairs of movies clips, rating how different the two clips appeared to them on a magnitude scale. Sixty-six of the movie pairs were naturalistic and those remaining were low-pass or high-pass spatially filtered versions of those originals. We examined three ways of comparing a movie pair. The Spatial Model compared corresponding frames between each movie pairwise, combining those differences using Minkowski summation. The Temporal Model compared successive frames within each movie, summed those differences for each movie, and then compared the overall differences between the paired movies. The Ordered-Temporal Model combined elements from both models, and yielded the single strongest predictions of observers' ratings. We modeled naturalistic sustained and transient impulse functions and compared frames directly with no temporal filtering. Overall, modeling naturalistic temporal filtering improved the models' performance; in particular, the predictions of the ratings for low-pass spatially filtered movies were much improved by employing a transient impulse function. The correlations between model predictions and observers' ratings rose from 0.507 without temporal filtering to 0.759 (p = 0.01%) when realistic impulses were included. The sustained impulse function and the Spatial Model carried more weight in ratings for normal and high-pass movies, whereas the transient impulse function with the Ordered-Temporal Model was most important for spatially low-pass movies. This is consistent with models in which high spatial frequency channels with sustained responses primarily code for spatial details in movies, while low spatial frequency channels with transient responses code for dynamic events.

Item Type:
Journal Article
Journal or Publication Title:
Journal of Vision
Uncontrolled Keywords:
ID Code:
Deposited By:
Deposited On:
06 Feb 2015 17:11
Last Modified:
17 Sep 2023 01:37