
GitHub Leverages AI for More Accurate Code Secret Scanning

Mar 26, 2025 2 min read


GitHub has launched Copilot secret scanning, an AI-powered feature integrated into GitHub Secret Protection that uses contextual analysis to significantly improve the detection of leaked passwords in code. This new approach addresses the shortcomings of traditional regular expression-based methods, which often miss varied password structures and generate numerous false positives.
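
To illustrate the gap the article describes, the sketch below contrasts a naive regex-based check with a context-aware decision. The pattern, file contents, and helper function are hypothetical and are not taken from GitHub's scanner.

```python
import re

# Hypothetical pattern in the spirit of traditional secret scanners:
# flag anything that looks like `password = "..."`.
PASSWORD_PATTERN = re.compile(r'password\s*[:=]\s*["\']([^"\']+)["\']', re.IGNORECASE)

def regex_scan(text: str) -> list[str]:
    """Return every candidate the pattern matches, with no notion of context."""
    return PASSWORD_PATTERN.findall(text)

# A test fixture and a production config line look identical to the regex.
test_line = 'password = "not-a-real-secret"  # unit-test placeholder'
config_line = 'password = "hunter2-prod-2024"'

print(regex_scan(test_line))    # ['not-a-real-secret'] -> likely a false positive
print(regex_scan(config_line))  # ['hunter2-prod-2024'] -> a real leak

# A context-aware scanner also weighs where the match occurs before alerting;
# this heuristic is purely illustrative.
def looks_like_test_context(path: str, line: str) -> bool:
    return "test" in path.lower() or "placeholder" in line.lower()
```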

According to a GitHub blog post detailing the development, the system now analyzes the usage and location of potential secrets to reduce irrelevant alerts and deliver more accurate notifications on issues critical to repository security. Sorin Moga, a senior software engineer at Sensis, commented on LinkedIn that this marks a new era in platform security, where AI not only assists in development but also safeguards code integrity.

A key challenge identified during the private preview of GitHub's AI-powered secret scanning was its struggle with unconventional file types and structures, highlighting the limitations of relying solely on the large language model's (LLM) initial training data. GitHub's initial approach involved "few-shot prompting" with GPT-3.5-Turbo, where the model was provided with examples to guide detection.
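
As a rough illustration of few-shot prompting for this task, a detector might be wired up as below. The prompt wording, labels, and examples are assumptions; GitHub has not published its actual prompts or model configuration.

```python
from openai import OpenAI  # official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical few-shot examples: labelled snippets steer the model's judgement.
FEW_SHOT_EXAMPLES = """\
Snippet: db_password = "s3cr3t-prod-9f2a"
Answer: SECRET

Snippet: password = os.environ["DB_PASSWORD"]
Answer: NOT_A_SECRET
"""

def classify_snippet(snippet: str) -> str:
    """Ask the model whether a code snippet contains a hardcoded password."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[
            {"role": "system",
             "content": "Label each code snippet as SECRET or NOT_A_SECRET."},
            {"role": "user",
             "content": f"{FEW_SHOT_EXAMPLES}\nSnippet: {snippet}\nAnswer:"},
        ],
    )
    return response.choices[0].message.content.strip()
```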

To address these early challenges, GitHub significantly enhanced its offline evaluation framework by incorporating feedback from private preview participants to diversify test cases and leveraging the GitHub Code Security team’s evaluation processes to build a more robust data collection pipeline. They even used GPT-4 to generate new test cases based on learnings from existing secret scanning alerts in open-source repositories. This improved evaluation allowed for better measurement of precision (reducing false positives) and recall (reducing false negatives).
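
Both metrics follow directly from a labelled evaluation set, as the short sketch below shows; the counts are invented and only illustrate the trade-off.

```python
def precision_recall(true_positives: int, false_positives: int,
                     false_negatives: int) -> tuple[float, float]:
    """Precision penalises noisy alerts; recall penalises missed secrets."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical before/after numbers for an evaluation run.
print(precision_recall(true_positives=90, false_positives=60, false_negatives=10))
# (0.60, 0.90) -> many irrelevant alerts
print(precision_recall(true_positives=88, false_positives=6, false_negatives=12))
# (0.94, 0.88) -> far fewer false positives at a small cost in recall
```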

GitHub experimented with various techniques to improve detection quality, including different LLMs (such as GPT-4 acting as a confirming scanner), repeated prompting ("voting"), and diverse prompting strategies. Ultimately, the team collaborated with Microsoft and adopted its MetaReflection technique, a form of offline reinforcement learning that blends Chain of Thought (CoT) and few-shot prompting to enhance precision.
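
The voting and confirming-scanner ideas can be sketched as follows. The function names, labels, and choice of three rounds are hypothetical, and the MetaReflection prompt-optimization step itself is not reproduced here.

```python
from collections import Counter
from typing import Callable

def vote(classify: Callable[[str], str], snippet: str, rounds: int = 3) -> str:
    """Repeated prompting ("voting"): ask the same model several times and
    keep the majority answer to smooth out unstable responses."""
    answers = [classify(snippet) for _ in range(rounds)]
    return Counter(answers).most_common(1)[0][0]

def scan_with_confirmation(snippet: str,
                           primary: Callable[[str], str],
                           confirmer: Callable[[str], str]) -> bool:
    """Raise an alert only if a second, stronger model (e.g. GPT-4 used as a
    confirming scanner) agrees with the primary model's SECRET verdict."""
    if vote(primary, snippet) != "SECRET":
        return False
    return confirmer(snippet) == "SECRET"
```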

As stated in the GitHub blog post:

We ultimately ended up using a combination of all these techniques and moved Copilot secret scanning into public preview, opening it widely to all GitHub Secret Protection customers.

To further validate these improvements and gain confidence for general availability, GitHub implemented a "mirror testing" framework. This involved testing prompt and filtering changes on a subset of repositories from the public preview. By rescanning these repositories with the latest improvements, GitHub could assess the impact on real alert volumes and false positive resolutions without affecting users.
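
A mirror test of this kind amounts to rescanning the same repositories with both the shipped scanner and the candidate changes and comparing the resulting alert sets. The outline below is a hypothetical sketch, not GitHub's infrastructure.

```python
def mirror_test(repositories, scan_current, scan_candidate):
    """Rescan a fixed subset of repositories with the current scanner and the
    candidate (new prompts/filters), without surfacing results to users, then
    compare alert volumes and which known false positives disappear."""
    report = []
    for repo in repositories:
        current_alerts = set(scan_current(repo))
        candidate_alerts = set(scan_candidate(repo))
        report.append({
            "repo": repo,
            "current_volume": len(current_alerts),
            "candidate_volume": len(candidate_alerts),
            "resolved_alerts": current_alerts - candidate_alerts,
            "new_alerts": candidate_alerts - current_alerts,
        })
    return report
```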

This testing revealed a significant drop in both overall detections and false positives, with minimal impact on finding actual passwords; in some cases, false positives fell by 94%. The blog post concludes:

This before-and-after comparison indicated that all the different changes we made during private and public preview led to increased precision without sacrificing recall, and that we were ready to provide a reliable and efficient detection mechanism to all GitHub Secret Protection customers.

The lessons learned during this development include prioritizing accuracy, using diverse test cases based on user feedback, managing resources effectively, and fostering collaboration. These learnings are also being applied to Copilot Autofix. Since the general availability launch, Copilot secret scanning has been part of security configurations, allowing users to manage which repositories are scanned.

About the Author

Steef-Jan Wiggers
