Using AI Agents to Automate Black-Box Audits of Personalization Algorithms at Scale

Morosini, Alessandro; Cen, Sarah H.; Ilyas, Andrew; Driss, Hedi; Mądry, Aleksander; Podimata, Chara

[Submitted on 29 Jun 2026]

Abstract: Personalization algorithms determine what content users encounter on online platforms. Auditing these systems is difficult because independent auditors have only black-box access to the algorithms, while personalization depends on users' attributes, behavior, and evolving interaction histories. Existing auditing methods face a tradeoff: studies with real users capture realistic behavior but are costly and hard to control, whereas sock-puppet audits scale more easily but often rely on scripted behavior that limits realism. Beyond this, both approaches struggle to decouple user attributes from user behavior, limiting our ability to causally understand personalization. To address this gap, we introduce a framework for black-box audits of personalization algorithms using generative AI agents as behavioral engines for synthetic accounts. Each agent is instantiated with a fixed persona, grounded in demographic and political survey data, and interacts with a platform's content by reasoning about it and choosing actions. Because behavior is fixed within each persona while platform-visible signals such as age, gender, or location can be experimentally perturbed, our design enables counterfactual auditing of how platforms respond to user attributes. As a case study, we deploy 1,120 agents on X shortly after the 2024 U.S. election, spanning 14 personas and three counterfactual conditions, collecting over 200,000 content exposures. We find that X's algorithmic feed amplifies toxic, polarizing, political, and right-leaning content relative to the chronological feed, with amplification varying sharply by user ideology. Counterfactual analyses show that demographic signals affect content delivery in persona-dependent ways: pooled effects are largely null, while subgroup-level effects vary in direction and magnitude. Our work establishes GenAI-based agents as a new tool for algorithmic auditing.
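The counterfactual design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Account` fields, the `perturb`, `share`, and `effect` helpers, the persona names, and the toy exposure logs are all hypothetical and chosen only to show how a pooled effect can be null while per-persona effects differ in sign.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Account:
    persona_id: str  # determines agent behavior; held fixed across conditions
    age: int         # platform-visible signal (hypothetical field)
    gender: str      # platform-visible signal (hypothetical field)
    location: str    # platform-visible signal (hypothetical field)


def perturb(base: Account, **signals) -> Account:
    """Copy an account, changing only platform-visible signals."""
    if "persona_id" in signals:
        raise ValueError("behavior (persona) must stay fixed")
    return replace(base, **signals)


def share(flags: list[int]) -> float:
    """Fraction of exposures flagged (1) for some content category."""
    return sum(flags) / len(flags)


def effect(treated: list[int], control: list[int]) -> float:
    """Difference in flagged share between a counterfactual and its baseline."""
    return share(treated) - share(control)


# One baseline account and one counterfactual that differs in a single signal.
base = Account("persona_a", 30, "female", "Texas")
alt = perturb(base, gender="male")

# Toy exposure logs (1 = political item, 0 = other); invented for illustration,
# not data from the study.
logs = {
    "persona_a": {"control": [1, 0, 0, 0], "treated": [1, 1, 0, 0]},
    "persona_b": {"control": [1, 1, 0, 0], "treated": [1, 0, 0, 0]},
}

# Subgroup-level effects: one estimate per persona.
by_persona = {p: effect(c["treated"], c["control"]) for p, c in logs.items()}

# Pooled effect: all personas combined before differencing.
pooled = effect(
    [x for c in logs.values() for x in c["treated"]],
    [x for c in logs.values() for x in c["control"]],
)
```

In this toy setup the two personas respond in opposite directions, so the effects cancel when pooled. That mirrors the qualitative pattern the abstract reports, but says nothing about its actual magnitude.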

Comments:	43 pages, 10 figures
Subjects:	Computation and Language (cs.CL); Computers and Society (cs.CY); Machine Learning (cs.LG); Social and Information Networks (cs.SI)
Cite as:	arXiv:2606.30801 [cs.CL]
(or arXiv:2606.30801v1 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.2606.30801
