MIT Saliency Benchmark Results: MIT300

The following are the results of models evaluated on their ability to predict ground-truth human fixations on our benchmark data set of 300 natural images with eye-tracking data from 39 observers. We post the results here and provide a way for researchers to submit new models for evaluation.
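For context, fixation-prediction benchmarks like this one score a model's saliency map against the recorded fixation locations using standard metrics. Below is a minimal sketch of one such metric, normalized scanpath saliency (NSS), in Python; the function name, array shapes, and the choice of metric are illustrative assumptions, not the benchmark's official evaluation code.

```python
import numpy as np

def nss(saliency_map, fixation_map):
    """Normalized Scanpath Saliency: mean of the standardized saliency
    values at fixated pixels. `fixation_map` is a binary array with 1s at
    human fixation locations; `saliency_map` is the model's prediction."""
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-12)
    return s[fixation_map.astype(bool)].mean()

# Toy example (the real evaluation uses the held-out MIT300 fixations).
pred = np.random.rand(480, 640)         # a model's saliency map
fix = np.zeros((480, 640), dtype=int)   # ground-truth fixation locations
fix[240, 320] = 1
print(nss(pred, fix))
```

Higher NSS means the model assigns relatively higher saliency to the locations people actually fixated; a value near zero indicates chance-level prediction.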

Citations

If you use any of the results or data on this page, please cite the following:

@misc{mit-saliency-benchmark,
  author    = {Zoya Bylinskii and Tilke Judd and Ali Borji and Laurent Itti and Fr{\'e}do Durand and Aude Oliva and Antonio Torralba},
  title     = {MIT Saliency Benchmark},
}

This data set is released in conjunction with the paper "A Benchmark of Computational Models of Saliency to Predict Human Fixations" by Tilke Judd, Frédo Durand, and Antonio Torralba, available as a January 2012 MIT technical report.

@InProceedings{Judd_2012,
  author    = {Tilke Judd and Fr{\'e}do Durand and Antonio Torralba},
  title     = {A Benchmark of Computational Models of Saliency to Predict Human Fixations},
  booktitle = {MIT Technical Report},
  year      = {2012}
}

Images

300 benchmark images (the fixations of the 39 viewers per image are kept private so that no model can be trained or tuned on this data set).

Model Performances

