### Detecting AI-Generated Scientific Writing
#### Introduction
AI companies have struggled to create tools that can reliably identify AI-generated text. However, researchers have developed a new method to estimate the use of large language models (LLMs) in scientific writing by analyzing the frequency of certain “excess words” that became more common during the LLM era (2023 and 2024). According to their findings, at least 10 percent of 2024 abstracts were processed with LLMs.
#### Research Inspiration
Researchers from Germany’s University of Tübingen and Northwestern University were inspired by studies that measured the impact of the Covid-19 pandemic by looking at excess deaths. They applied a similar approach to “excess word usage” after LLM writing tools became widely available in late 2022. Their study found that the appearance of LLMs led to a significant increase in the frequency of certain style words, which was unprecedented in both quality and quantity.
#### Methodology
##### Data Collection
The researchers analyzed 14 million paper abstracts published on [PubMed](https://pubmed.ncbi.nlm.nih.gov/) between 2010 and 2024. For each word, they tracked its relative frequency year by year and compared the frequency expected from pre-2023 trends with the frequency actually observed in 2023 and 2024.
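To make that comparison concrete, here is a minimal Python sketch of the counting step. It is not the authors' actual pipeline: it assumes the abstracts are already available as `(year, text)` pairs and uses a simple linear fit of the pre-2023 frequencies as the "expected" value, whereas the study has its own corpus handling and trend model.

```python
# Minimal sketch: track how often a word appears in abstracts per year, then
# compare a linear extrapolation of the pre-2023 trend with the frequency
# observed in a later year. Illustrative only, not the published pipeline.
from collections import defaultdict
import re

def yearly_frequency(abstracts, word):
    """Fraction of abstracts per year that contain `word` at least once."""
    hits, totals = defaultdict(int), defaultdict(int)
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    for year, text in abstracts:          # abstracts: iterable of (year, text)
        totals[year] += 1
        if pattern.search(text):
            hits[year] += 1
    return {y: hits[y] / totals[y] for y in totals}

def expected_frequency(freq_by_year, target_year, fit_years=range(2010, 2023)):
    """Linear extrapolation of the pre-2023 trend to `target_year`."""
    xs = [y for y in fit_years if y in freq_by_year]
    ys = [freq_by_year[y] for y in xs]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
            sum((x - x_mean) ** 2 for x in xs)
    intercept = y_mean - slope * x_mean
    return slope * target_year + intercept

# Hypothetical usage, assuming `abstracts` has been loaded elsewhere:
# freq = yearly_frequency(abstracts, "delves")
# excess_ratio = freq[2024] / expected_frequency(freq, 2024)
```

Dividing the observed 2024 frequency by the extrapolated one gives the kind of excess ratio discussed in the findings below.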
##### Findings
The study found that certain words, which were rare before 2023, surged in popularity after LLMs were introduced. For example, the word “delves” appeared 25 times more frequently in 2024 papers than expected. Words like “showcasing” and “underscores” increased ninefold. Other common words also saw notable increases: “potential” by 4.1 percentage points, “findings” by 2.7 percentage points, and “crucial” by 2.6 percentage points.
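The paragraph above mixes two different statistics, so a toy illustration may help keep them apart: the excess ratio (observed frequency divided by expected, the "25 times" figure) and the excess gap in percentage points (observed minus expected, the "4.1 percentage points" figure). The numbers below are hypothetical, not the study's values.

```python
# Toy illustration of the two summary statistics, assuming we already have
# expected and observed per-abstract frequencies for a word.
def excess_ratio(observed: float, expected: float) -> float:
    """How many times more often the word appears than the trend predicts."""
    return observed / expected

def excess_gap_pp(observed: float, expected: float) -> float:
    """Excess usage in percentage points (observed minus expected)."""
    return 100 * (observed - expected)

# Made-up frequencies for demonstration only:
print(excess_ratio(0.0050, 0.0002))   # a 25x jump, like "delves"
print(excess_gap_pp(0.089, 0.048))    # a ~4.1 pp gap, like "potential"
```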
#### Natural Language Evolution vs. LLM Influence
While language naturally evolves, the researchers noted that such massive and sudden increases in word usage were previously only seen for words related to major world health events, like “ebola” in 2015 and “coronavirus” during the Covid-19 pandemic. In the post-LLM period, hundreds of words saw sudden increases in scientific usage without any common link to world events. These words were mostly “style words” like verbs, adjectives, and adverbs.
#### Previous Findings and New Insights
The increased prevalence of words like “delve” in scientific papers has been noted before. However, previous studies relied on comparisons with human writing samples or predefined LLM markers. In this study, the pre-2023 abstracts served as an effective control group to show how vocabulary choice has changed in the post-LLM era.
### Identifying LLM Usage
#### Marker Words
By identifying hundreds of “marker words” that became more common in the post-LLM era, the researchers could spot telltale signs of LLM use. For example, a sentence from one abstract, with marker words shown in bold:

> “A **comprehensive** grasp of the **intricate interplay** between […] and […] is **pivotal** for effective therapeutic strategies.”
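As a rough illustration of how such highlighting can be automated, the sketch below wraps any marker word found in a sentence in bold. The `MARKER_WORDS` set here is a tiny hand-picked subset for demonstration only; the study works with hundreds of markers identified through the frequency analysis.

```python
# Sketch: bold any known marker word in a piece of text. The word list is a
# small illustrative subset, not the researchers' full marker set.
import re

MARKER_WORDS = {"comprehensive", "intricate", "interplay", "pivotal",
                "delves", "showcasing", "underscores", "crucial"}

def highlight_markers(text: str) -> str:
    def mark(match: re.Match) -> str:
        word = match.group(0)
        return f"**{word}**" if word.lower() in MARKER_WORDS else word
    return re.sub(r"[A-Za-z]+", mark, text)

sentence = ("A comprehensive grasp of the intricate interplay between X and Y "
            "is pivotal for effective therapeutic strategies.")
print(highlight_markers(sentence))
```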
#### Statistical Measures
After statistically analyzing how marker words appear across individual papers, the researchers estimate that at least 10 percent of post-2022 papers in the PubMed corpus were written with some LLM assistance. The true share could be higher, since the marker-word list cannot catch LLM-assisted abstracts that happen to avoid every identified marker.
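The intuition behind that lower bound can be sketched as follows: if a marker word appears in several percentage points more abstracts than the pre-2023 trend predicts, at least that share of abstracts must have been touched by an LLM, and the excess gaps of marker words that never appear in the same abstract can be added together. The snippet below illustrates this intuition with made-up numbers; it is a simplification, not the study's actual estimator.

```python
# Simplified illustration of the lower-bound intuition, NOT the study's
# estimator. Assumes the listed marker words occur in disjoint sets of
# abstracts, so their excess gaps (in percentage points) can be summed.
def lower_bound_llm_share(excess_gaps_pp):
    """Conservative floor on the share of LLM-assisted abstracts."""
    return sum(excess_gaps_pp)

# Hypothetical excess gaps for a few non-co-occurring marker words:
gaps_pp = [4.0, 3.0, 2.0, 1.5]
print(f"At least {lower_bound_llm_share(gaps_pp):.1f}% of abstracts affected")
```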
### Conclusion
This study provides a novel method for detecting LLM usage in scientific writing by analyzing changes in word frequency. The findings suggest a significant impact of LLMs on scientific vocabulary, offering a new perspective on the influence of AI in academic research.