Google’s NotebookLM, a tool that uses large language models to turn research papers into audio podcasts, can help make complex scientific topics more accessible but is still susceptible to errors, according to a recent study by University of Pittsburgh researchers.
Ian Flynn, a research assistant professor in the Department of Geology and Environmental Science at the Kenneth P. Dietrich School of Arts and Sciences, discussed the benefits and limitations of such AI tools in planetary sciences in an October paper published in Perspectives of Earth and Space Scientists, a journal from the American Geophysical Union (AGU). The paper was recently highlighted as a featured story on AGU’s social media platforms. Flynn said, “It was a great honor for the paper to receive this recognition.”
Flynn collaborated with Sean Peters, visiting assistant professor at Middlebury College in Vermont. Together they selected three previously published papers about volcanism on Mars and used NotebookLM to generate audio overviews. Google describes these overviews as “deep-dive discussions between AI hosts that provide in-depth summaries of the key topics in your uploaded sources.” The intention is for these summaries to objectively represent their source material rather than present subjective opinions from the AI hosts.
The three selected papers varied in format: one was a short letter-type paper with several figures and tables; another was a typical 29-page research article; and the third was a 23-page review paper.
Flynn and Peters found that NotebookLM produced engaging summaries using straightforward language and creative analogies that could aid education and improve accessibility. However, every overview also contained mistakes—often appearing at the end—such as unjustified extrapolations. For example, one summary inferred the presence of liquid water and possible life on Mars based solely on volcanic features described in an original paper, though neither conclusion appeared in that source.
Some errors were less obvious, making them harder for nonexperts to notice. To address this risk, Flynn and Peters recommended always consulting original research materials alongside any AI-generated content.
Despite its flaws, Flynn’s team saw value in NotebookLM’s ability to support learning about scientific topics and to make research findings accessible to wider audiences. Their study concluded: “Overall, NotebookLM’s generated audio overviews can be a useful tool for the planetary science community,” adding that such tools are unlikely to replace critical reading of original research material.