Our visual system is continuously confronted with an enormous amount of visual information: objects, faces, letters and simple features are constantly present in our visual field. This clutter imposes a fundamental limit on object recognition throughout most of the visual field most of the time, known as visual crowding: objects that can be easily identified in isolation seem jumbled and indistinct in clutter (Whitney & Levi, 2011). Still, despite such a cluttered environment, our visual system is able to organize the world around us into a coherent percept. How do we achieve this? The Manassi lab uses crowding as a tool to investigate how our visual system organizes the visual input from cluttered scenes into more global, organized representations.


Crowding was traditionally thought to be characterized by target-flanker interactions, which are (a) deleterious (e.g., pooling and substitution), (b) locally confined, (c) feature-specific and (d) restricted to low-level representations. We showed that none of these assumptions hold true:


Manassi et al. (2012, 2015); Doerig et al. (2020)

See also Sayim et al. (2010) and Hermens et al. (2008).


Manassi et al. (2013)

Manassi et al. (2016)

Fixate the central cross and compare the stimuli on the right with those on the left. Recognizing the rightward offset of the two central lines is much easier on the right side. Manassi et al. (2012, 2015): adding flankers can decrease crowding. Manassi et al. (2013): far elements (squares) can modulate crowding. Manassi et al. (2016): high-level processing (regular patterns of shapes) determines low-level processing (vernier offset).

These results show that the spatial configuration across the entire visual field determines crowding, challenging most current models of crowding and of object recognition in general. Importantly, they highlight that crowding of a single element is determined by the perceptual organization of the scene, thus making grouping a necessary component of any proposed crowding mechanism (Herzog et al., 2015a, 2015b).

In cluttered scenes, crowding happens at multiple levels of visual processing, from low-level representations, like orientation and lines, to high-level ones, like objects and faces. Importantly, crowding impairs the access we have to visual information at many levels, but it does not impair the representation of that information (Manassi & Whitney, 2018; Xia et al., 2020).

Crowding demo

Upper panel. Natural scenes are filled with many kinds of clutter, including different visual features, surfaces, objects, faces, etc. (Manassi & Whitney, 2018).

Lower-left panel. When fixating the bull’s eye in the middle, it is relatively easy to identify isolated visual features and objects: the oriented line of the flag (lines), the tilted blue banner (shapes), the letter ‘E’ on the race bib (letters), the face on the right side (faces), and the (hypothetical) motion direction of the runner on the left side (biological motion).

Lower-right panel. While staring at the bull’s eye, it is more difficult to recognize all these visual features and objects because of flanking elements. In crowding, nearby flanking objects impair identification of the visual target objects, making them appear jumbled and harder to recognize.

Manassi & Whitney (2018)