


Using large language models to speed up or automate peer review ignores the fact that these models lack the unique perspectives of real reviewers. Credit: Peshkova/Getty

Time is a precious commodity for research scientists. As students, we develop skills for managing our time, which we hone throughout our careers as responsibilities change. We are often quick to adopt new technologies that lighten the load. From statistical software and digital typesetting to online literature searches and high-throughput data collection, technology helps us to do more, faster. Yet some aspects of our jobs seem resistant to automation: reading the literature, drafting manuscripts and engaging in peer review, for example.

Advances in generative artificial intelligence (AI) in the past few years hint at the possibility of automating these time-consuming tasks. Proposals for adopting AI into scientific processes have run the gamut from generating hypotheses and collecting data to performing analyses, writing papers, conducting review, detecting errors and evaluating the reliability of published work. Some have even suggested that AI could do PhD-level research or operate as a fully autonomous scientist.

As researchers who study how digital information technologies affect society, we find ourselves grappling with whether and how to adopt AI into our research workflows. We face the same time pressures as everyone else, but ethical concerns — and questions of accuracy — loom large. AI-use policies continue to shift as journals and funders struggle to keep pace. Prominent researchers, including AI experts, have already run into trouble.

These are all important, practical matters. But a deeper issue lurks beneath the surface. The reason these aspects of our jobs have been so challenging to automate is that they rely on something even more precious than our time: namely, our capacity for scientific decision-making. It is worth considering what we lose when we cede that — and our agency — to machines.

Peer review

Over the past decade, the number of published articles has grown much faster than the number of scientists. The peer-review system is strained to near breaking point. As editors, we find it harder than ever to secure willing referees; as authors, we watch our manuscripts languish in editorial limbo; as reviewers, we receive many more requests than we can possibly accept.


Evolutionary biologist Carl Bergstrom. Credit: Kris Tsujikawa

We (C.T.B. and J.B.-C.) have collaborated for the better part of a decade and have come to realize that we share an approach to writing referee reports. We begin with an initial read of the paper, taking notes along the way. With a sense of the full arc, we reread the paper, refining our notes and diving deeper: searching the literature, checking code, sketching diagrams or doing a bit of analysis, for example. Eventually, we cull, organize and expand our notes into the prose that constitutes the final report. The process — which is entirely uncompensated and mostly unrecognized — takes hours or even days.

Could a large language model (LLM) accelerate the process? Some engineers are trying to build entirely automated peer-review systems, but that goal strikes us as patently silly: it is peer review, after all, emphasis on ‘peer’.

Attempting a more moderate position, one Nature columnist suggested that reviewers first use auto-dictation tools to compile notes during an initial read and then feed those notes into an LLM to organize their feedback. The author says they can complete a review using this approach in 30–40 minutes, or even faster if the paper is obviously flawed.

But in our view, writing a good peer review is not rote work. Like any kind of critical analysis, it requires that we triage, rank and organize our unstructured thoughts. We perhaps start with praise, then raise a few of the most pressing issues that need to be addressed. From there, we enumerate quick fixes, minor concerns and points of confusion. The whole process constitutes a negotiation with ourselves over what is important enough to mention, so that we can negotiate with the editors and authors over what should be changed. We might discover that our original impressions were misguided; that some of our comments need to be revised or omitted; or that seemingly minor issues are, in fact, fundamental flaws. The process of summarizing and synthesizing helps us to engage more deeply with the manuscript.

Writing a peer review, then, or even going from initial notes to the final text, requires capacities that an LLM simply lacks: our unique perspective, training, values, ethics, domain expertise, understanding of editorial priorities and perceptions of the authors. Even if an AI tool could write a review, it would never be able to write your review — or any peer’s review, for that matter. If we cede this process to LLMs, we relinquish our agency to improve the scientific literature. As historian David McCullough expressed in a 2003 interview: “Writing is thinking. To write well is to think clearly. That’s why it’s so hard.”

AI science, automation and de-skilling

The implications extend well beyond peer review. Proponents of AI-assisted science argue that AI can increase productivity by freeing researchers from the drudgery of routine tasks. Where these tasks are entirely rote — formatting a bibliography, for example — we agree. But, more often than not, they include crucial aspects of doing science.


