Trust but verify when using AI for fixing security flaws - DEV Community


AI might seem like a magic bullet for fixing security issues, but it's not that simple, warned Eugene Yan, a member of technical staff at Anthropic, during the newly inaugurated security track at AI Engineer World's Fair. The effectiveness of AI in finding and fixing flaws is doubling every five months, he said, pointing to Mozilla releasing a 423-patch bundle in April. This was more patches than were released in all of 2025.

But while agents are good at finding and fixing flaws, the human element is still needed, say many security professionals. This is both to check that the AI has done the work properly and to make sure that seemingly low-risk bugs can't be strung together to make a serious exploit that AI might not spot.

To fix this, Yan proposed a six-stage program. "We found that most teams converge in approximately these six steps, and a big chunk of my thoughts will be about these," he told the crowd.

First, a threat-finding stage identifies a potential flaw and passes it to stage two, a sandbox, to see whether proof-of-concept code can exploit the issue. The third stage is a discovery phase in which the sample is checked against past issues that may already have been fixed.

Stage four is independent verification, designed to filter out remaining false positives. In stage five, the results are triaged so that human checkers aren't flooded. In stage six, a patch is developed, and the code is kicked back to the discovery engine.
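As a rough illustration of how those six stages hang together, here is a minimal Python sketch. It is not Yan's implementation: the `Finding` fields, the 1-to-5 severity scale, the `review_budget` cap, and every callable passed in are hypothetical stand-ins for whatever a real team would plug in at each stage.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Finding:
    ident: str        # hypothetical stable ID used to match past issues
    description: str
    severity: int     # hypothetical scale: 1 (low) to 5 (critical)


def run_pipeline(
    candidates: Iterable[Finding],          # stage 1 output: potential flaws
    reproduce: Callable[[Finding], bool],   # stage 2: does a PoC work in the sandbox?
    known_fixed: Set[str],                  # stage 3: IDs of past, already-fixed issues
    verify: Callable[[Finding], bool],      # stage 4: independent second check
    review_budget: int,                     # stage 5: max findings sent to humans
    make_patch: Callable[[Finding], str],   # stage 6: propose a fix
) -> List[Tuple[Finding, str]]:
    # Stage 2: keep only flaws that actually reproduce in the sandbox.
    reproduced = [f for f in candidates if reproduce(f)]
    # Stage 3: drop anything that matches a past, already-fixed issue.
    fresh = [f for f in reproduced if f.ident not in known_fixed]
    # Stage 4: a separate verifier filters out remaining false positives.
    confirmed = [f for f in fresh if verify(f)]
    # Stage 5: triage by severity and cap the queue for human checkers.
    queue = sorted(confirmed, key=lambda f: f.severity, reverse=True)[:review_budget]
    # Stage 6: draft a patch per finding. Feeding the patched code back
    # into discovery is left to the caller in this sketch.
    return [(f, make_patch(f)) for f in queue]


# Toy run with stubbed stages (all data below is made up).
candidates = [
    Finding("A", "overflow in parser", 5),
    Finding("B", "does not reproduce", 2),
    Finding("C", "already fixed earlier", 4),
    Finding("D", "verbose error message", 1),
    Finding("E", "path traversal", 3),
]
patched = run_pipeline(
    candidates,
    reproduce=lambda f: f.ident != "B",
    known_fixed={"C"},
    verify=lambda f: True,
    review_budget=2,
    make_patch=lambda f: f"patch-for-{f.ident}",
)
print([(f.ident, p) for f, p in patched])
# prints [('A', 'patch-for-A'), ('E', 'patch-for-E')]
```

Note that every stage here judges one finding at a time, and the low-severity finding "D" falls off the queue at triage. That is the gap the human reviewers are meant to cover: nothing in this sketch would notice several low-risk bugs combining into a serious exploit.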

The end result, he argued, will be much more secure code that still maintains human oversight, while making the lives of security staff a lot easier. Of course, that may not remain the case forever if AI systems keep improving at the current rate.

Top comments (1)


nazar_boyko


The point about small bugs chaining into a real exploit is the whole article for me. The six stages each look at one flaw on its own (find it, sandbox it, verify it, patch it), which is exactly the shape that can't notice three harmless bugs lining up into something serious. So the human isn't just a checker bolted on at the end, they're the only stage actually looking across flaws instead of down at one. It makes me wonder if the missing piece is a correlation pass that reasons over the whole set of findings at once, and whether that's the part that stays hard to automate longest, since it needs a model of how the system composes rather than whether one bug reproduces. Did Yan touch on anything like analyzing the findings together, or does it stop at verifying each issue on its own?
