🤩 Tired of chaotic structures and inaccurate references in AI-generated survey papers? SurveyForge is here to revolutionize your research experience!
🔥 News
2025.05: 🎉🎉 Congratulations: SurveyForge was accepted to the ACL-2025 main conference.
Introduction
Survey papers are vital in scientific research, especially with the rapid increase in research publications. Recently, researchers have started using LLMs to automate survey creation for improved efficiency. However, LLM-generated surveys often fall short compared to human-written ones, particularly in outline quality and citation accuracy. To address this, we introduce SurveyForge, which first creates an outline by analyzing the structure of human-written outlines and consulting domain-related articles. Then, using high-quality papers retrieved by our scholar navigation agent, SurveyForge can automatically generate and refine the content of the survey.
Moreover, to achieve a comprehensive evaluation, we construct SurveyBench, which includes 100 human-written survey papers for win-rate comparison and assesses AI-generated survey papers across three dimensions: reference, outline, and content quality.
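As a rough illustration of the win-rate comparison mentioned above, the sketch below scores a list of pairwise judgments between a generated survey and human-written surveys. The function name `win_rate`, the outcome labels, and the choice to count a tie as half a win are all assumptions made for this example; they are not taken from the SurveyBench code.

```python
from typing import Iterable

# Hypothetical scoring table: a tie counting as half a win is an assumption,
# not something SurveyBench is known to do.
_SCORES = {"win": 1.0, "tie": 0.5, "loss": 0.0}


def win_rate(outcomes: Iterable[str]) -> float:
    """Return the share of pairwise comparisons won by the generated survey.

    Each outcome is "win", "tie", or "loss" from the generated survey's
    point of view.
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("no comparisons to score")
    unknown = set(outcomes) - set(_SCORES)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    return sum(_SCORES[o] for o in outcomes) / len(outcomes)


# Three wins out of four comparisons.
judgments = ["win", "loss", "win", "win"]
print(win_rate(judgments))
```

In this sketch, the same function could be applied separately to judgments for each of the three dimensions (reference, outline, content quality) to get one win rate per dimension.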
🤔How to try out SurveyForge?
Due to the current limitations on API call frequency, please kindly send us an email or open an issue in the repository to inform us of the survey topic you intend to generate.
⏱️SurveyForge takes only about 10 minutes to generate a survey paper. Wait times may grow as the number of users increases, so submit your topic early!
🌟Don’t forget to click the STAR to track if your survey is ready!
Note: Our initial version currently supports survey generation only in the computer science domain, as we need to align with previous evaluation benchmarks in this field. We're actively working on expanding to other academic disciplines, and integration is already in progress. Thank you for your understanding and support!
Currently, SurveyBench consists of approximately 100 human-written survey papers across 10 distinct topics, carefully curated by doctoral-level researchers to ensure thematic consistency and academic rigor. The supported topics and the number of core references for each topic are as follows:
| Topics | # Reference |
| --- | --- |
| Multimodal Large Language Models | 912 |
| Evaluation of Large Language Models | 714 |
| 3D Object Detection in Autonomous Driving | 441 |
| Vision Transformers | 563 |
| Hallucination in Large Language Models | 500 |
| Generative Diffusion Models | 994 |
| 3D Gaussian Splatting | 330 |
| LLM-based Multi-Agent | 823 |
| Graph Neural Networks | 670 |
| Retrieval-Augmented Generation for Large Language Models | 608 |
More supported topics are coming soon!
🧑💻You can evaluate the survey by:
cd SurveyBench && python test.py --is_human_eval
Note: set is_human_eval to True to evaluate human-written surveys, and to False to evaluate generated surveys.
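For readers unfamiliar with boolean command-line switches, here is a minimal sketch of how a flag like `--is_human_eval` is commonly handled with Python's standard `argparse` module. It assumes the flag is a plain on/off switch (present means True, absent means False), which matches the command shown above but has not been checked against the actual `test.py`; `build_parser` and `describe_mode` are hypothetical names used only for this example.

```python
import argparse
from typing import Sequence


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the argument parsing in SurveyBench/test.py.
    parser = argparse.ArgumentParser(description="Illustrative evaluation CLI")
    # Assumption: the flag is a switch, so passing it sets the value to True
    # and omitting it leaves the value at False.
    parser.add_argument(
        "--is_human_eval",
        action="store_true",
        help="evaluate human-written surveys instead of generated ones",
    )
    return parser


def describe_mode(argv: Sequence[str]) -> str:
    """Return which kind of survey the given arguments would evaluate."""
    args = build_parser().parse_args(argv)
    return "human-written surveys" if args.is_human_eval else "generated surveys"


print(describe_mode(["--is_human_eval"]))  # flag present
print(describe_mode([]))                   # flag absent
```

Under this assumption, running the evaluation script without the flag would evaluate generated surveys. If the real script expects an explicit value instead, the invocation would differ.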
If you want to evaluate your method on SurveyBench, please follow the format:
Citation
@misc{yan2025surveyforgeoutlineheuristicsmemorydriven,
title={SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing},
author={Xiangchao Yan and Shiyang Feng and Jiakang Yuan and Renqiu Xia and Bin Wang and Bo Zhang and Lei Bai},
year={2025},
eprint={2503.04629},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.04629},
}
About
(ACL-2025 main conference) SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing