Eye Tracking Techniques for Linguistic Research

For Clara Cohen, language is a system of patterns. Cohen, a postdoctoral researcher in psychology, became fascinated with linguistic structure when she studied Russian as an undergraduate. Today, advances in technology allow her to observe how those patterns unfold in real time.

Working with the Center for Language Science and the Language and Bilingualism Lab in the Department of Spanish, Italian and Portuguese, Cohen uses eye-tracking technology to explore subtle differences in how monolingual English speakers and bilingual speakers process singular and plural nouns.

Although this research addresses a narrow topic within the larger field of linguistics, it has the potential to illuminate how listeners anticipate grammatical information in spoken language. The findings could inform language teaching, improve automatic speech recognition, and contribute to interventions for people with auditory processing disorders such as those sometimes associated with dyslexia.

Cohen hypothesizes that English listeners may detect plurality before hearing the final “s” suffix. Small durational differences in the pronunciation of noun stems—where the stem preceding a plural suffix tends to be slightly shorter—can provide early cues that help a listener predict whether a noun will be singular or plural.

“People who speak English regularly hear patterns in their language,” Cohen explained. “One pattern is that the stem before a plural suffix is often a bit shorter. So ‘cats’ can sound shorter than ‘cat.’ Those subtle duration cues may let listeners anticipate plurality even before the suffix arrives.”

By contrast, speakers of languages like Spanish often rely on overt grammatical markers, such as articles, to identify number. The Spanish articles el (singular) and los (plural) make number apparent before the noun itself is heard, reducing the need to use stem duration as a cue.

To measure how quickly listeners use these signals, Cohen records where study participants look while they listen to sentences. An eye-tracking system—using an invisible infrared light and an optical sensor—maps the reflection from the eye to determine gaze direction and timing. This method captures fine-grained, millisecond-level information about how listeners allocate attention as words unfold.

Participants sit in a soundproof booth with their heads stabilized on a chin rest, watching a screen and listening through headphones. Four images appear on the monitor—for example: a single seal, a bun, a bunny, and a group of seals. Cohen’s recorded voice plays a sentence such as “The man looked at the seals.” Researchers then track how quickly the participant’s gaze shifts to the image of the seals, allowing them to infer at what point the listener has identified plurality.
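The latency measure described above can be sketched in a few lines of Python. This is a minimal illustration, not Cohen's actual analysis pipeline: the names `GazeSample`, `Region` and `first_look_latency`, the screen layout, the 100 Hz sampling rate and the three-sample stability criterion are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GazeSample:
    t_ms: float  # time since sentence onset, in milliseconds
    x: float     # horizontal gaze position, in screen pixels
    y: float     # vertical gaze position, in screen pixels


@dataclass
class Region:
    """Rectangular screen area occupied by one of the four images."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


def first_look_latency(samples: List[GazeSample], target: Region,
                       noun_onset_ms: float, min_run: int = 3) -> Optional[float]:
    """Milliseconds from noun onset to the first stable look at the target.

    A "stable look" is assumed here to be min_run consecutive samples inside
    the target region, a crude stand-in for real fixation detection.
    Returns None if the participant never settles on the target.
    """
    run_start = None
    run_len = 0
    for s in samples:
        if s.t_ms < noun_onset_ms:
            continue  # ignore looks that precede the noun
        if target.contains(s.x, s.y):
            if run_len == 0:
                run_start = s.t_ms
            run_len += 1
            if run_len >= min_run:
                return run_start - noun_onset_ms
        else:
            run_len = 0  # the look was interrupted; start over
    return None


# Hypothetical trial: the "seals" picture fills the bottom-right quadrant of a
# 1000 x 800 screen, and the noun begins 1200 ms into the sentence.
seals = Region("seals", left=500, top=400, right=1000, bottom=800)
trial = ([GazeSample(t, 250, 200) for t in range(1000, 1300, 10)] +
         [GazeSample(t, 750, 600) for t in range(1300, 1500, 10)])
print(first_look_latency(trial, seals, noun_onset_ms=1200))  # 100
```

In this made-up trial the gaze reaches the seals 100 ms after the noun begins. Comparing that latency with the time at which the "s" suffix becomes audible is what would indicate whether plurality was identified early.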

“Eye-tracking gives us temporal precision that traditional methods lack,” Cohen said. “Older experiments often relied on post-sentence judgments—asking subjects whether a noun was singular or plural and measuring button-press times. Those responses are informative but coarse. By contrast, gaze data reveal the moment-by-moment process of comprehension.”

This real-time approach can show whether listeners commit to an interpretation before a suffix appears, and whether bilinguals can flexibly adjust which cues they attend to depending on language context.

As part of a National Science Foundation Partnerships for International Research and Education (PIRE) grant, Cohen will extend the study in May by testing monolingual and bilingual Spanish speakers in Tarragona, Spain. Although most participants in Spain learned some English in high school, the monolingual participants have not used English since then, while the bilingual participants continue to use both languages.

“In Spain I’ll test whether monolingual Spanish speakers are less sensitive to durational differences, since Spanish provides number information earlier,” Cohen said. “For bilinguals, I’ll examine how quickly they can switch their attention to duration cues when processing English.”

A computer in the lab lets Cohen calibrate the eye tracker for each participant. Credit: Rachel Garman.

The practical implications of this work extend beyond academic description. Knowing that durational patterns signal morphological boundaries can improve automatic speech recognition systems. If a speech engine detects a stem that is shorter than expected, it might weigh the probability of a suffix and revise its classification of the word—distinguishing, for instance, between a noun and a verb based on timing cues.
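One way to picture how a recognizer might "weigh the probability of a suffix" is a simple Bayesian update on stem duration. The sketch below is purely illustrative and is not drawn from Cohen's work or from any real speech engine: it assumes stem durations follow normal distributions, and the means, standard deviations and prior are invented numbers.

```python
import math


def gaussian_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def plural_posterior(stem_ms: float,
                     prior_plural: float = 0.5,
                     singular=(300.0, 30.0),   # hypothetical (mean, sd) in ms
                     plural=(270.0, 30.0)) -> float:  # hypothetical, shorter stem
    """Probability that a suffix follows, given the observed stem duration.

    Applies Bayes' rule with two assumed Gaussian duration models, one for
    stems spoken alone and one for stems followed by a plural suffix.
    """
    like_plural = gaussian_pdf(stem_ms, *plural)
    like_singular = gaussian_pdf(stem_ms, *singular)
    numerator = prior_plural * like_plural
    return numerator / (numerator + (1 - prior_plural) * like_singular)


# A shorter-than-expected stem shifts belief toward "a suffix is coming";
# a longer one shifts it away.
for duration in (260, 285, 310):
    print(duration, round(plural_posterior(duration), 3))
```

Under these assumed distributions, a stem near 285 ms is uninformative, because it lies midway between the two means. Shorter stems raise the estimated probability of a plural and longer stems lower it.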

For individuals with auditory processing impairments, including some people with dyslexia, a clearer understanding of how much listeners rely on temporal variation in speech could guide therapeutic and educational strategies. Identifying which cues are essential and which are supplemental across languages may help clinicians design exercises and technologies that compensate for specific perceptual difficulties.

Beyond the potential applications, Cohen finds intellectual reward in uncovering systematic patterns within language.

“Studying a second language reveals that speech has rules and regularities,” she said. “Learning those patterns removes the mystery and gives you tools to understand how languages work—and how listeners make sense of the speech signal in real time.”

About this language research

Source: Rachel Garman – Penn State
Image Source: The image is credited to Rachel Garman.
