Change is Key! is a research program in which we aim to create computational tools that turn text into a story of our language, our societies, and our culture, and of how these have changed over time.

Firstly, we will develop corpus-based methods for detecting semantic change (over time) and variation (across social groups and media types). This will create general tools for the large-scale study and detection of language change and directly benefit historical linguistics and lexicography. Secondly, in collaboration with researchers from each field, we aim to answer research questions in social sciences, gender studies, and literary studies.

The program spans six years (2022–2027) and involves a total of 16 researchers, one research engineer, and six partner universities.

In May 2024, we co-organized the first edition of the workshop Large-scale computational approaches to evolution and change at Evolang XV. In August 2024, we will co-organize the fifth edition of the International Workshop on Computational Approaches to Historical Language Change (LChange'24), which will be co-located with ACL 2024.

This research program is funded by Riksbankens Jubileumsfond under reference number M21-0021 for a total of 33.5 million SEK.

Change is Key! is on Hugging Face!

News

Three Papers Accepted at EMNLP 2024
September 2024 – Present
We are delighted to announce that Change is Key! has had three papers accepted for EMNLP 2024, which will take place this December in Miami, Florida. Congratulations to our team members on their submissions.

Taichi Aida is visiting us
September 2024 – October 2024
Taichi Aida, a 3rd-year Ph.D. student in Computer Science from Tokyo Metropolitan University, is visiting us in Gothenburg during the fall semester. His research focuses on Computational Approaches for Lexical Semantic Change Detection (LSCD). During his visit, Taichi will work on application projects related to LSCD.

Recent Events and Talks

Recent Publications

(2024). Using Synchronic Definitions and Semantic Relations to Classify Semantic Change Types. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

(2024). Analyzing Semantic Change through Lexical Replacements. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

(2024). Hate Speech Detection and Reclaimed Language: Mitigating False Positives and Compounded Discrimination. In Proceedings of the 16th ACM Web Science Conference.

(2024). Strengthening the WiC: New Polysemy Dataset in Hindi and Lack of Cross Lingual Transfer. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024).

(2024). Quantitative text analysis.

(2024). A Systematic Comparison of Contextualized Word Embeddings for Lexical Semantic Change. Contextualized Word Embeddings for Lexical Semantic Change.

(2024). (Chat)GPT v BERT Dawn of Justice for Semantic Change Detection. (Chat)GPT v BERT.

(2023). ChiWUG: A Graph-based Evaluation Dataset for Chinese Lexical Semantic Change Detection. ChiWUG.

(2023). The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change. DURel Tool.

(2023). Studying Word Meaning Evolution through Incremental Semantic Shift Detection: A Case Study of Italian Parliamentary Speeches. Studying Word Meaning Evolution through Incremental Semantic Shift Detection.

(2023). XL-LEXEME: WiC Pretrained Model for Cross-Lingual LEXical sEMantic changE. XL-LEXEME.

(2023). Human and Computational Measurement of Lexical Semantic Change. University of Stuttgart.

(2023). A Survey on Contextualised Semantic Shift Detection. Survey of Contextualized Semantic Shift Detection.

(2023). Computational modeling of semantic change.

(2023). The Finer They Get: Combining Fine-Tuned Models For Better Semantic Change Detection. The Finer They Get.

(2022). DiaWUG: A Dataset for Diatopic Lexical Semantic Variation in Spanish. DiaWUG.

(2021). LSCDiscovery: A shared task on semantic change discovery and detection in Spanish. LSCDiscovery.

(2021). Survey of computational approaches to lexical semantic change detection. Survey of semantic change.

Project Description

Background

Language changes over time, often in processes that span long periods. However, modern events like the current situation around Covid-19 have stressed that the cultural aspects of words and their meanings can also change radically over short periods of time: isolation today carries a stronger sense of hopelessness and an extremely negative connotation. Vaccine, which previously had both positive and negative connotations, today carries with it a sense of hope: once the vaccine is in place, life will go back to normal. Our linguistic resources and our cultural existence are intertwined and must be studied as a single whole. To understand our contemporary and historical societies, we must understand the language used to describe them.

Researchers in the text-based humanities and social sciences regularly face hurdles caused by semantic change and linguistic variation (words acquire subtle new meanings, or are replaced by other, more prominent words). Despite technological breakthroughs that alleviate these problems, researchers are still left to handle such changes on their own, using resources like dictionaries that are slow to update and cover little of our language and its actual use. They risk missing important textual clues and are limited to small-scale manual analysis. In addition, many humanities and social science researchers are interested in changing phenomena portrayed in language.

While we acknowledge that textual resources do not represent all parts of society, with the socio-economically weak being significantly less represented, these resources nevertheless reach an impressive and unprecedented part of our society. By opening up modern and historical Swedish textual resources, social media included, we have enormous possibilities to study our world with reasonable effort in data collection and minimal interference with the objects we study.

Program description

The program will run over six years (2022–2027) and is funded by Riksbankens Jubileumsfond for a total of 33.5 million SEK. It constitutes a core language technology research team. In addition, we have researchers from (historical) linguistics, lexicography, analytical sociology, gender studies, and literary studies. The program comprises five subprojects: three are core language technology and NLP, one relates to reevaluating existing change hypotheses proposed by historical linguists, and the final one consists of four humanities and social science projects, including lexicography.

We will identify and eliminate linguistic barriers caused by language change to open up our textual accounts of the world to researchers from a wide range of fields: sociology, cultural studies, history, literature, journalism, and religion. We will also apply our change detection methodology directly to answer different humanities and social science (HSS) research questions in our application projects.

Historical Linguistics: There are many open questions in the burgeoning field of quantitative semantics that we cannot currently answer with existing computational methods: How do lexical change and semantic change interact? Why do different parts of the vocabulary change at different speeds? How does change spread throughout the lexicon? High-quality case studies of change often produce hypotheses, and we will provide tools to test and quantify these hypotheses using large-scale methods developed within this program. Our corpus-based studies will feed insights into our models, thus improving both the modeling and the theory of meaning, senses, and language change.

Lexicography: Using computational methods, we will advance our understanding of the semantic structures underlying textual data. We will integrate recent advances in computational linguistics into the lexicographic process, transforming it from manual lexicography into a semi-automatic, empirically based workflow. This work will be done together with the lexicographic group at the University of Gothenburg that develops the dictionary Svensk ordbok utgiven av Svenska Akademien (“The Contemporary Dictionary of the Swedish Academy”), and will directly improve their workflow.

Advancing Natural Language Processing and Machine Learning: We will extend the state of the art in lexical semantic change with respect to both theoretical and methodological aspects. In addition, we will adapt our methods to the needs of HSS. A large focus will be on synchronic variation as a complement to diachronic change, stemming in part from our work on sense-aware models and in part from the comparison across contemporary corpora. We will advance the state of the art in NLP in several ways:

  • Extend semantic modeling beyond the simplistic “one vector per word” or “one vector per sentence” to contextually and culturally defined concepts needed for HSS;
  • Develop sense-aware models that can model all parts of a word, to be able to answer what happened, how the changes relate to what we already know and when the change took place. We will apply change detection to Swedish, to individual concepts and the interplay in a semantic field, setting the state-of-the-art for Swedish, and significantly furthering the research field internationally;
  • Adapt methods for diachronic change to synchronic variations needed for many different humanities and social science studies, including the study of radicalisation, cultural transformation and cultural differences across social groups.
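
As a rough illustration of the contrast drawn in the first point above, the following Python sketch compares two simple ways of scoring change for a word between two time periods: collapsing each period into a single averaged vector (a prototype distance) versus comparing individual usage vectors pairwise (an average pairwise distance). Both measures appear in the lexical semantic change literature, but nothing here is the program's own method. The function names and the two-dimensional toy vectors are hypothetical and chosen only for illustration; real usage vectors would come from a contextualized language model.

```python
import math
from itertools import product


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Average a list of vectors component-wise (one vector per word and period)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def prototype_distance(usages_t1, usages_t2):
    """Compare the two periods through a single averaged vector each."""
    return cosine_distance(mean_vector(usages_t1), mean_vector(usages_t2))


def average_pairwise_distance(usages_t1, usages_t2):
    """Compare every usage in period 1 with every usage in period 2."""
    pairs = list(product(usages_t1, usages_t2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy usage vectors (assumption: 2 dimensions, 2 usages per period).
stable_t1 = [[1.0, 0.0], [0.9, 0.1]]
stable_t2 = [[1.0, 0.1], [0.95, 0.0]]
shifted_t1 = [[1.0, 0.0], [0.9, 0.1]]
shifted_t2 = [[0.0, 1.0], [0.1, 0.9]]

stable_score = average_pairwise_distance(stable_t1, stable_t2)
shifted_score = average_pairwise_distance(shifted_t1, shifted_t2)
print(f"stable word:  {stable_score:.3f}")
print(f"shifted word: {shifted_score:.3f}")
```

In this toy setup the word whose usages move to a different region of the vector space receives the higher score under both measures. Sense-aware models, as described above, would go further by grouping usages into senses before comparing periods.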

We will have four HSS projects within the program. Each collaboration partner brings research questions and data, and we collaborate on methods that incorporate expert knowledge and provide answers. These application projects give us a chance to conduct high-quality research that will benefit both parties, and to leave behind tools and methodology useful beyond the scope of the program.

We believe that putting a group of NLP experts in close collaboration with HSS experts, within the field of semantic change, has three benefits. First, it will lead to research results only achievable through collaboration across fields. Second, method development is radically improved by close collaboration with the fields in which the methods are needed. Third, our research results and methodology are disseminated to relevant fields and serve to set the state of the art in terms of research results and, perhaps more importantly, to further research methodology.

Our HSS projects investigate (1) radicalization of groups (focused on synchronic variation due to its rapid speed), (2) cultural differences over time (sense-aware diachronic change), (3) how rights, acknowledgement, and justice have changed over time in media, legislation, and politics (sense-aware diachronic change), and (4) how the phone, steamer, and electricity changed society, and how this was reflected in literature (diachronic change and synchronic variation).

The development of methods in NLP is essential to reach our main goal: to integrate our research with, and contribute to, state-of-the-art HSS research. The application projects are at least as important as the theoretical and methodological development, and are a crucial part of driving the methodological development. All parties bring cutting-edge research questions that we can answer under the umbrella of this program.