Wikipedia:WikiProject AI Cleanup/Research
From Wikipedia, the free encyclopedia
Research Newsletter
- Up to 5% of AI-generated content? (19 October 2024)
- WikiCrow writes gene articles better than humans (26 September 2024)
- Wikipedia, less trusted than ChatGPT? (4 September 2024)
- STORM paradigm and content retrieval (14 August 2024)
Events
Future
Past
- WikiNLP#2 at ACL 2025 (27 July – 1 August 2025)
- NLP for Wikipedia (EMNLP 2024) (16 November 2024)
- Bellagio 2024 (19–23 February 2024)
Publications
- Many relevant papers at User:Francishunger
- Once added, publications will be organized here by specific topic
Generation
- Skarlinski, Michael D.; Cox, Sam; Laurent, Jon M.; Braza, James D.; Hinks, Michaela; Hammerling, Michael J.; Ponnapati, Manvitha; Rodriques, Samuel G.; White, Andrew D. (2024), Language Agents Achieve Superhuman Synthesis of Scientific Knowledge, San Francisco, CA: FutureHouse
- Shao, Yijia; Jiang, Yucheng; Kanell, Theodore A.; Xu, Peter; Khattab, Omar; Lam, Monica S. (2024). "Assisting in Writing Wikipedia-like Articles from Scratch with Large Language Models". arXiv:2402.14207 [cs.CL].
- Zhang, Jiebin; Yu, Eugene J.; Chen, Qinyu; Xiong, Chenhao; Zhu, Dawei; Qian, Han; Song, Mingbo; Li, Xiaoguang; Liu, Qun; Li, Sujian (28 February 2024). "Retrieval-based Full-length Wikipedia Generation for Emergent Events". arXiv:2402.18264v1 [cs.CL].
- Li, Irene; Fabbri, Alex; Kawamura, Rina; Liu, Yixin; Tang, Xiangru; Tae, Jaesung; Shen, Chang; Ma, Sally; Mizutani, Tomoe; Radev, Dragomir (June 2022). "Surfer100: Generating Surveys From Web Resources, Wikipedia-style". In Calzolari, Nicoletta; Béchet, Frédéric; Blache, Philippe; Choukri, Khalid; Cieri, Christopher; Declerck, Thierry; Goggi, Sara; Isahara, Hitoshi; Maegaard, Bente; Mariani, Joseph; Mazo, Hélène; Odijk, Jan; Piperidis, Stelios (eds.). Proceedings of the Thirteenth Language Resources and Evaluation Conference. LREC 2022. Marseille, France: European Language Resources Association. pp. 5388–5392.
- Gao, Fan; Jiang, Hang; Yang, Rui; Zeng, Qingcheng; Lu, Jinghui; Blum, Moritz; Liu, Dairui; She, Tianwei; Jiang, Yuang; Li, Irene (2023). "Large Language Models on Wikipedia-Style Survey Generation: An Evaluation in NLP Concepts". arXiv:2308.10410 [cs.CL].
- Agarwal, Aditya; Mamidi, Radhika (2023). "Automatically Generating Hindi Wikipedia Pages using Wikidata as a Knowledge Graph: A Domain-Specific Template Sentences Approach" (PDF). Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing. International Conference Recent Advances in Natural Language Processing. Shoumen, Bulgaria: INCOMA Ltd. pp. 11–21. doi:10.26615/978-954-452-092-2_002. ISBN 978-954-452-092-2.
- Subramanian, Shivansh (7 June 2024). Grounded Content Automation: Generation and Verification of Wikipedia in Low-Resource languages (Thesis). IIIT Hyderabad.
- Mille, Simon; Pronesti, Massimiliano; Thomson, Craig; Lorandi, Michela; Fitzpatrick, Sophie; Huidrom, Rudali; Sabry, Mohammed; O'Riordan, Amy; Belz, Anya (September 2024). "Filling Gaps in Wikipedia: Leveraging Data-to-Text Generation to Improve Encyclopedic Coverage of Underrepresented Groups". In Mahamood, Saad; Le Minh, Nguyen; Ippolito, Daphne (eds.). Proceedings of the 17th International Natural Language Generation Conference: System Demonstrations. Tokyo, Japan: Association for Computational Linguistics. pp. 16–19.
Detection
- Petroni, Fabio; Broscheit, Samuel; Piktus, Aleksandra; Lewis, Patrick; Izacard, Gautier; Hosseini, Lucas; Dwivedi-Yu, Jane; Lomeli, Maria; Schick, Timo; Bevilacqua, Michele; Mazaré, Pierre-Emmanuel; Joulin, Armand; Grave, Edouard; Riedel, Sebastian (19 October 2023). "Improving Wikipedia verifiability with AI". Nature Machine Intelligence. 5 (10): 1142–1148. doi:10.1038/s42256-023-00726-1.
- Brooks, Creston; Eggert, Samuel; Peskoff, Denis (2024). "The Rise of AI-Generated Content in Wikipedia". arXiv:2410.08044 [cs.CL].
- Kobak, Dmitry; González-Márquez, Rita; Horvát, Emőke-Ágnes; Lause, Jan (2025). "Delving into LLM-assisted writing in biomedical publications through excess vocabulary". Science Advances. 11 (27): eadt3813. arXiv:2406.07016. Bibcode:2025SciA...11.3813K. doi:10.1126/sciadv.adt3813. PMC 12219543. PMID 40601754.
- Juzek, Tom S.; Ward, Zina B. (2024). "Why Does ChatGPT "Delve" So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models". arXiv:2412.11385 [cs.CL].
- Russell, Jenna; Karpinska, Marzena; Iyyer, Mohit (2025). "People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text". arXiv:2501.15654v2 [cs.CL].