Auto-Slides: An Interactive Multi-Agent System for Creating and Customizing Research Presentations

Yuheng Yang¹, Wenjia Jiang¹, Yang Wang¹, Yiwei Wang², Chi Zhang¹,*

¹AGI Lab, Westlake University
²University of California at Merced

Contact: yangyuheng@westlake.edu.cn

Fig. 1. An overview of Auto-Slides' capabilities. Left: Users can generate complete academic presentation slides from an academic paper. Right: Users can iteratively revise the generated slides by providing high-level natural language instructions, enabling efficient and precise slide editing.

Generated Slides Examples

Learning to Be A Doctor: Searching for Effective Medical Agent Architectures

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Abstract

Creating academic presentations is a complex, time-consuming task that requires researchers to synthesize intricate paper content into clear, digestible slides. While Large Language Models (LLMs) have become widespread for content generation, they still struggle with handling multimodal academic content, ensuring factual accuracy, and offering precise interactive control. As a result, automated tools often fall short of professional standards. To address these challenges, we introduce Auto-Slides, a multi-agent system designed to transform academic papers into high-quality, customizable presentation slides. Drawing inspiration from cognitive science and instructional design, our system integrates content understanding, quality assurance, and interactive optimization to generate slides that are both accurate and pedagogically effective.

Method

Our approach leverages a multi-agent architecture to transform academic papers into presentation slides. The system operates through three distinct phases: content understanding and structuring, quality assurance and refinement, and generation with interactive optimization. The multi-agent architecture of Auto-Slides is detailed in Figure 2, while the interactive optimization workflow is illustrated in Figure 3.

Fig. 2. Overview of Auto-Slides. Our multi-agent architecture orchestrates the transformation of academic papers into presentation slides through a three-phase pipeline: (1) Content Understanding and Structuring, where the Parser and Planner Agents analyze the source material and design the slide structure in JSON format specifying content, figures, and tables for each slide. (2) Quality Assurance and Refinement, where the Verification and Adjustment Agents ensure content fidelity and completeness. (3) Generation and Interactive Optimization, where the Generator Agent produces the final presentation slides in LaTeX code format and the Editor Agent facilitates human-in-the-loop revisions via natural language dialogue.
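To make the hand-off between the Planner and Generator stages in Fig. 2 more concrete, here is a minimal Python sketch of a per-slide JSON plan and a naive conversion to a Beamer frame. It is an illustration under stated assumptions, not the Auto-Slides implementation: the `SlidePlan` class, its field names (`title`, `bullets`, `figures`, `tables`), and both helper functions are hypothetical, and the actual schema and generator may differ.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class SlidePlan:
    """Hypothetical per-slide plan: content, figures, and tables for one slide."""
    title: str
    bullets: list[str] = field(default_factory=list)
    figures: list[str] = field(default_factory=list)  # assumed: image file paths
    tables: list[str] = field(default_factory=list)   # assumed: table identifiers


def plan_to_json(slides: list[SlidePlan]) -> str:
    """Serialize the slide plan to JSON (assumed top-level key: "slides")."""
    return json.dumps({"slides": [asdict(s) for s in slides]}, indent=2)


def render_frame(slide: SlidePlan) -> str:
    """Render one slide as a Beamer frame.

    Simplifications: no LaTeX escaping of special characters,
    and tables are not rendered.
    """
    lines = [r"\begin{frame}{" + slide.title + "}"]
    if slide.bullets:
        lines.append(r"\begin{itemize}")
        lines += [r"  \item " + b for b in slide.bullets]
        lines.append(r"\end{itemize}")
    for fig in slide.figures:
        lines.append(r"\includegraphics[width=0.8\linewidth]{" + fig + "}")
    lines.append(r"\end{frame}")
    return "\n".join(lines)


if __name__ == "__main__":
    plan = [SlidePlan(title="Method",
                      bullets=["Parser and Planner Agents",
                               "Verification and Adjustment Agents"],
                      figures=["figures/overview.png"])]
    print(plan_to_json(plan))
    print(render_frame(plan[0]))
```

Keeping the plan as structured data before any LaTeX is emitted would let a verification step check each slide's content against the paper without parsing LaTeX, which is one plausible reason for an intermediate JSON representation.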

Fig. 3. Interactive optimization workflow. The user issues a natural language request to modify the presentation—for example, adding an explanatory slide on a specific concept. The Editor Agent interprets the request, locates the relevant slide segments, performs content retrieval and code-level modifications using a ReAct[41]-style process, compiles the updated LaTeX source, and returns the revised slide deck to the user.
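The "locate the relevant slide segments" and "code-level modification" steps of Fig. 3 can be sketched as plain string operations on Beamer source. The Python below is a simplified illustration, not the Editor Agent itself: the names `FRAME_RE`, `locate_frame`, and `insert_frame_after` are hypothetical, a keyword match stands in for the agent's LLM-driven interpretation and retrieval, and the compile step is omitted.

```python
import re

# Matches one Beamer frame, from \begin{frame} to the nearest \end{frame}.
FRAME_RE = re.compile(r"\\begin\{frame\}.*?\\end\{frame\}", re.DOTALL)


def locate_frame(latex: str, keyword: str):
    """Return the (start, end) span of the first frame mentioning keyword, or None."""
    for match in FRAME_RE.finditer(latex):
        if keyword.lower() in match.group(0).lower():
            return match.span()
    return None


def insert_frame_after(latex: str, keyword: str, new_frame: str) -> str:
    """Insert new_frame directly after the first frame that mentions keyword."""
    span = locate_frame(latex, keyword)
    if span is None:
        raise ValueError(f"no frame mentions {keyword!r}")
    _, end = span
    return latex[:end] + "\n\n" + new_frame + latex[end:]


if __name__ == "__main__":
    deck = (
        "\\begin{frame}{Intro}\nMotivation\n\\end{frame}\n\n"
        "\\begin{frame}{Method}\nThe editor uses a ReAct loop\n\\end{frame}\n"
    )
    extra = "\\begin{frame}{What is ReAct?}\nA short explanation\n\\end{frame}"
    print(insert_frame_after(deck, "react", extra))
```

In a full loop, the revised source would then be recompiled and the resulting deck returned to the user, with compilation errors fed back as observations for the next editing step.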

BibTeX

@article{yang2025autoslides,
  title={Auto-Slides: An Interactive Multi-Agent System for Creating and Customizing Research Presentations},
  author={Yang, Yuheng and Jiang, Wenjia and Wang, Yang and Wang, Yiwei and Zhang, Chi},
  journal={arXiv preprint arXiv:2509.11062},
  year={2025},
  note={AGI Lab, Westlake University; University of California at Merced; Corresponding author: Chi Zhang},
  url={https://auto-slides.github.io/},
  eprint={2509.11062},
  archivePrefix={arXiv},
  primaryClass={cs.AI}
}