Aren't the central examples of founders in AI Safety the people who founded Anthropic, OpenAI, and arguably Deepmind? Right after that, Mechanize comes to mind.
I am not fully sure what you mean by founders, but it seems to me that the best organizations were founded by people who also wrote a lot, and generally developed a good model of the problems in parallel to running an organization. Even this isn't a great predictor. I don't really know what is. It seems like generally working in the space is just super high variance.
To be clear, overall I do think many more people should found organizations, but the arguments in this post seem really quite weak. The issue is really not that otherwise we "can't scale the AI Safety field". If anything it goes the other way around! If you just want to scale the AI safety field, go work at one of the existing big organizations like Anthropic, or Deepmind, or Far Labs or whatever. They can consume tons of talent, and you can probably work with them on capturing more talent (of course, I think the consequences of doing so for many of those orgs would be quite bad, but you don't seem to think so).
Also, to expand some more on your coverage of counterarguments:
> If outreach funnels attract a large number of low-caliber talent to AI safety, we can enforce high standards for research grants and second-stage programs like ARENA and MATS.
No, you can't, because the large set of people you are trying to "filter out" will now take an adversarial stance towards you as they are not getting the resources they think they deserve from the field. This reduces the signal-to-noise ratio of almost all channels of talent evaluation, and in the worst case produces quite agentic groups of people actively trying to worsen the judgement of the field in order to gain entry.
I happen to have written a lot about this just this week: Paranoia: A Beginner's Guide, for example, has an explanation of lemons markets that applies straightforwardly to grant evaluations and program applications.
This is a thing that has happened all over the place; see, for example, the pressure on elite universities to drop admission standards and continue grade inflation, coming from the many people who are now part of the university system but wouldn't have been in previous decades.
Summoning adversaries, especially ones who have built an identity around membership in your group, should be done very carefully. See also Tell people as early as possible it's not going to work out, which I also happen to have published this week.
> subsequently, frontier AI companies grew 2-3x/year, apparently unconcerned by dilution.
Yes, and this was, of course, quite bad for the world? I don't know, maybe you are trying to model AI safety as some kind of race between AI Safety and the labs, but I think this largely fails to capture the state of the field.
Like, again, man, do you really think the world would be at all different in terms of our progress on safety if everyone who works on whatever applied safety work is supposedly so scalable had just never done that work? Kimi K2 is basically as aligned, and as likely to be safe when scaled to superintelligence, as whatever Anthropic is cooking up today. The most you can say is that safety researchers have been succeeding at producing evidence about the difficulty of alignment. But even that progress has been enormously set back by all the safety researchers working at the frontier labs that this "scaling of the field" keeps shoveling talent into, which has pressured huge numbers of people to drastically understate the difficulty of and risks from AI.
> Many successful AI safety founders work in research-heavy roles (e.g., Buck Shlegeris, Beth Barnes, Adam Gleave, Dan Hendrycks, Marius Hobbhahn, Owain Evans, Ben Garfinkel, Eliezer Yudkowsky, Nate Soares) and the status ladder seems to reward technical prestige over building infrastructure.
I mean, and many of them don't! CEA has not been led by people with research experience for many years, and man, I would give so much to have ended up in a world that went differently. IMO Open Phil's community building has deeply suffered from a lack of situational awareness and strategic understanding of AI, and so massively dropped the ball. I think MATS's biggest problem is roughly that approximately no one on the staff is a great researcher themselves, or even attempts to do the kind of work you try to cultivate, which makes it much harder for you to steer the program.
Like, I am again all in favor of people starting more organizations, but man, we just need to understand that we don't have the forces of the market on our side. This means the premium on having organizations steered by people who have their own internal feedback loops and their own strategic map of the situation, which requires actively engaging with the core problems of the field, is much greater than it is in YC and the open market. The default outcome if you encourage young people to start an org in "AI Safety" is to just end up with someone making a bunch of vaguely safety-adjacent RL environments that get sold to big labs, which my guess is makes things largely worse (I am not confident in this, but I am pretty confident it doesn't make things much better).
And so what I am most excited about is people who do have good strategic takes starting organizations. To demonstrate that they have those takes, and to develop the necessary skills, they need to write and publish publicly (or at least receive mentorship for a substantial period of time from someone who does).