Can data scientists replace IR scholars?
The case of the UNGD and the text-as-data approach

Recently, a growing body of scholarship has employed data science tools in and for political research. This has resulted in an impressive body of interdisciplinary research projects that interweave cutting-edge computational perspectives with substantive questions on political processes, positions, and behavior. Nonetheless, while these new and sophisticated methods are essential for the advancement of the field, they also raise important questions regarding their limitations and the extent to which political behavior can be algorithmized. These challenges are even more pressing in International Relations (IR), where political processes are decentralized and usually less ordered, ritualized, or regulated than in domestic politics. This paper addresses these challenges by assessing the contribution of the text-as-data approach and natural language processing (NLP) methods to IR research, focusing specifically on the analysis of states’ speeches in the United Nations General Debate (UNGD). Despite their scientific guise, automated methods are heavily shaped by interpretive, analytical, and methodological choices that affect both research designs and findings, and they require thoughtful and meticulous reflection at every step of the research. In an attempt to foster a constructive dialogue between IR scholars and data scientists, this paper highlights fundamental analytical and methodological gaps that hinder the application and review of these methods and suggests practical solutions to achieve optimal collaboration between the fields.