August Lohse

Ph.D. Student in Social Data Science at the University of Copenhagen


Dialectograms: Machine Learning Differences between Discursive Communities

Word embeddings provide an unsupervised way to understand differences in word usage between discursive communities. A number of recent papers have focused on identifying words that are used differently by two or more communities. But word embeddings are complex, high-dimensional spaces, and a focus on identifying differences captures only a fraction of their richness. Here, we take a step towards leveraging the richness of the full embedding space by using word embeddings to map out how words are used differently. Specifically, we describe the construction of dialectograms, an unsupervised way to visually explore the characteristic ways in which each community uses a focal word. Based on these dialectograms, we provide a new measure of the degree to which words are used differently, one that overcomes the tendency of existing measures to pick out low-frequency or polysemous words. We apply our methods to explore the discourses of two US political subreddits and show how they identify stark affective polarisation around politicians and political entities, differences in the assessment of proper political action, and disagreement about whether certain issues require political intervention at all.


This project is available as a preprint at:

Responsiveness of politicians to social media feedback

This is a main project of my Ph.D., in which I examine to what extent social media feedback, such as “likes”, can be said to drive the political agenda. The project is built on 8 years of Facebook data labeled using deep learning.

I am currently writing the results into a paper.

Ideological scaling of the Danish Parliament using word embeddings

Using transcripts from the Danish Parliament, I perform unsupervised ideological scaling of the Danish parties. For the project, I embed around 450,000 political speeches alongside tokens for political parties and parliamentary sessions, using the static embedding algorithm doc2vec. I then perform dimensionality reduction using PCA, reducing the embedding to the two dimensions used for ideological scaling.

The resulting two-dimensional space can be interpreted as the two most important dimensions in Danish politics and is based on what the parties are actually saying in Parliament. The parties are placed alongside the words in the same vector space, so their positions can be interpreted using the positions of the surrounding words.
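As a rough illustration of the dimensionality-reduction step only (not the project's actual code), the sketch below projects a set of embedding vectors onto their first two principal components using NumPy. The labels and the random vectors are hypothetical stand-ins for party and word embeddings; in the project these would come from the trained doc2vec model.

```python
import numpy as np


def pca_2d(vectors):
    """Project row vectors onto their first two principal components."""
    X = np.asarray(vectors, dtype=float)
    # PCA requires centered data: subtract the mean of each dimension.
    X_centered = X - X.mean(axis=0)
    # The rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:2].T


# Hypothetical toy data: random 50-dimensional vectors standing in for
# party tokens and words that share one embedding space.
labels = ["party_A", "party_B", "tax", "welfare"]
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(labels), 50))

coords = pca_2d(embeddings)
for label, (x, y) in zip(labels, coords):
    print(f"{label}: ({x:.2f}, {y:.2f})")
```

Because parties and words are projected with the same axes, a party's coordinates can be read against the words that land near it.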

The work was presented in 2021 at SODAS at the University of Copenhagen as part of the Data Discussion event series.

Computational ethnography at the People Meeting on Bornholm, Denmark

As part of my Ph.D., I work on integrating ethnographic fieldwork with computational and mixed qualitative/quantitative methods. As a part of this work, I am developing an app for ethnographers to store, access, and manage fieldnotes in a more structured way. This allows for after-the-fact NLP analysis of the notes, and it also lets ethnographers sort and access fieldnotes based on time, place, keyword, etc.

The app is being tested by 15-20 ethnographers on the 2022 DISTRACT expedition to the sunny island of Bornholm in Denmark, where the People Meeting is being held.

I tested an early version of the app on our 2021 expedition and am looking to improve it in 2022. Afterward, the results and the app itself will be presented in a publication.

Examining visual responsiveness in politicians’ everyday political communication

In this project, which is based on my thesis in political science, I explore to what extent politicians are visually responsive to changes in the political agenda in their everyday communication.

I study the Instagram images of Danish politicians before and after a very sudden re-emergence of #MeToo on the Danish political agenda, using the change as an exogenous shock in an event study. I find that the politicians react by increasing the number of women in their images by about 0.25 standard deviations in the first few weeks after #MeToo. Afterward, the number of women returns to normal levels for all politicians, regardless of gender, political affiliation, or experience on Instagram.
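To make the "standard deviations" unit concrete, the sketch below shows one simple way to express a before/after change in standardized units: the difference in means divided by the pre-period standard deviation. This is an illustration only, not the event-study estimation used in the project, and the counts are made-up toy numbers rather than project data.

```python
from statistics import mean, stdev


def standardized_change(pre, post):
    """Change in the mean from pre to post, in pre-period standard deviations."""
    return (mean(post) - mean(pre)) / stdev(pre)


# Hypothetical toy data: number of women per image before and after the event.
pre = [1, 2, 1, 3, 2, 1, 2, 2]
post = [2, 2, 1, 3, 3, 2, 2, 2]

effect = standardized_change(pre, post)
print(f"Standardized change: {effect:.2f} SD")
```

A full event study would typically estimate this change period by period, with controls, rather than from a single pre/post comparison.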

The project is in its final stages of preparation before being sent for peer review.