Tanja Säily

Associate Professor in English Language
Faculty of Arts, University of Helsinki

Read more about my research and teaching below.

Research

My research interests include corpus linguistics, digital humanities, historical sociolinguistics, and the concept of productivity in linguistics. In addition to productivity, I am interested in the social embedding of language variation and change in general, including gendered styles in the history of English as well as extralinguistic factors influencing language change. My overarching aim is to develop new ways of understanding language variation and change, often in collaboration with experts from other fields such as history and computer science. I am a member of the Research Unit for Variation, Contacts and Change in English (VARIENG) and the Helsinki Computational History Group (COMHIS), and a co-compiler of the Corpora of Early English Correspondence (CEEC).

I am currently piloting AI methods for corpus development in the University of Helsinki-funded project New Methods for Developing Diverse Corpora (2025–). My recent projects include Historical Sociolinguistics Meets Construction Grammar: The Case of Productivity in English (funded by the Research Council of Finland in 2020–2023) and Rise of Commercial Society and Eighteenth-Century Publishing (joint project with Mikko Tolonen, funded by the Research Council of Finland in 2020–2024); our research into these topics continues.

New Methods for Developing Diverse Corpora (DEDICO)

Project website

Much of what we know about the history of the English language is based on formal written genres produced by highly educated men. Recent research has attempted to provide a more diverse picture by including materials written by women and the lower social ranks, such as handwritten letters. As turning these materials into linguistic corpora has traditionally required a great deal of time and expertise, such corpora tend to be quite small, which limits their use to the study of relatively high-frequency phenomena.

This project explores new ways of compiling and annotating more diverse corpora. For instance, the time-consuming transcription process can be facilitated by tools for handwritten text recognition like Transkribus, or more general vision-language models. These tools have been trained on well-educated hands, so new challenges arise when we apply them to more diverse materials. Similarly, tools for part-of-speech annotation are trained on present-day standard language and may struggle with more diverse data. We aim to improve the automated transcription and annotation of socially diverse corpora of Late Modern English. Our methods will also be applicable to other languages and periods.

Historical Sociolinguistics Meets Construction Grammar: The Case of Productivity in English (HISCOP)

Project website

Construction Grammar (CxG) is a recent theory of language that focuses on what speakers must know to be able to use a language; this knowledge is expressed in terms of constructions, or form-meaning pairings, such as words or phrases. The aim of my HISCOP project is to extend CxG by drawing on historical sociolinguistics, which focuses on relationships between the individual, language and society throughout history. We study the productivity of constructions in large historical text corpora from a sociolinguistic perspective. The purpose of the project is to increase the explanatory power of CxG, to learn more about linguistic phenomena in the field of productivity in the history of English, and to provide a more balanced picture of these phenomena by studying the language use of not only highly educated men (as previous research has often done) but also women and the lower social ranks, who may turn out to lead linguistic change.

Teaching