Expanding Biomedical Research and Avoiding Hallucinations

By Chris Heckman '12 (Reference Librarian, Henry M. Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD)

Artificial intelligence technology is already being integrated into much of the basic office software we use. Microsoft's next suite of products will include AI tools, and other software vendors are hot on its heels.

I work as a medical librarian. I spend most of my time at work helping medical researchers and students locate research articles and data. The introduction of AI tools is likely to have a significant impact on both research and practice in the field of medicine generally, but I will focus here on its applications to my work navigating biomedical research literature.

Much of the coverage of AI has focused on text generators like ChatGPT or Google's Gemini AI, but publicly available tools like these are largely ineffective in navigating scientific literature. They weren't trained on the body of research literature that is locked behind publisher paywalls, and they can't interact effectively with subscription database interfaces.

Reliability is another issue. Large language models generate their responses probabilistically, weighing an astonishing number of variables, and the models themselves are updated over time. Consequently, if you request "the five most important articles on anticoagulants" one day and then make the same request the next, you're likely to get two different answers.

Further challenges are posed by "hallucinations," the tendency of text generators to occasionally invent false answers. This is inherent in the creative power of text generators; the same capacity for combining text in unexpected ways that allows ChatGPT to compose an extemporaneous poem on a topic of your choice also means that it might generate a plausible-looking citation to a study that doesn't actually exist.

The future of AI in scientific literature lies in tools developed specifically for navigating this body of research. There are already multiple startups focused on this project, and I have no doubt that the big scientific publishers are pursuing AI tools they can embed into their research databases.

The success of these projects will depend on their reliability. Can these AI tools consistently return accurate, useful responses to users with a minimum of hallucinations?

If so, they have the potential to radically streamline and simplify the literature review process for researchers, saving time for busy scientists and students while speeding up the pace of scientific advancement. If not, they will either fail to catch on or artificially distort the direction of future scientific research.

Users also have questions about potential racial and gender bias in AI-generated content. AI output is only as good as the data it is trained on, and if the input reflects systemic biases, so will the output.

Finally, many in my field have data privacy concerns about these tools. What kind of insights might data about frequent searches at a given hospital, or by a certain set of doctors, yield? How might the publisher who owns the database use the data? To whom might they sell it?

In my field, if reliable AI tools become available, they could become indispensable resources that medical librarians, students, and doctors use every day. On the other hand, if challenges related to reliability, bias, and data privacy aren't overcome, their applications may be much more limited, or they may have problematic effects.

I will be eagerly waiting to see what happens next.