Design a responsible approach

Proactively identify potential risks of your application and define a system-level approach to build safe and responsible applications for users.

Get started

Define system-level policies

Determine what type of content your application should and should not generate.

Design for safety

Define your overall approach to implement risk mitigation techniques, considering technical and business tradeoffs.

Be transparent

Communicate your approach with artifacts like model cards.

Secure AI systems

Consider AI-specific security risks and remediation methods highlighted in the Secure AI Framework (SAIF).

Align your model

Align your model with your specific safety policies using prompting and tuning techniques.

Get started

Craft safer, more robust prompts

Use the power of LLMs to help craft safer prompt templates with the Model Alignment library.

Tune models for safety

Control model behavior by tuning your model to align with your safety and content policies.

Investigate model prompts

Build safe and helpful prompts through iterative improvement with the Learning Interpretability Tool (LIT).

Evaluate your model

Evaluate model risks on safety, fairness, and factual accuracy using our guidance and tooling.

Get started

LLM Comparator

Conduct side-by-side evaluations with LLM Comparator to qualitatively assess differences in responses between models, different prompts for the same model, or even different tunings of a model

Model evaluation guidelines

Learn about red teaming best practices and evaluate your model against academic benchmarks to assess harms around safety, fairness, and factuality.

Protect with safeguards

Filter your application's input and outputs, and protect users from undesirable outcomes.

Get started

SynthID Text

A tool for watermarking and detecting text generated by your model.

ShieldGemma

A series of content safety classifiers, built on Gemma 2, available in three sizes: 2B, 9B, 27B.

Agile classifiers

Create safety classifiers for your specific policies using parameter efficient tuning (PET) with relatively little training data

Checks AI Safety

Ensure AI safety compliance against your content policies with APIs and monitoring dashboards.

Text moderation service

Detect a list of safety attributes, including various potentially harmful categories and topics that may be considered sensitive with this Google Cloud Natural Language API available for free below a certain usage limit.

Perspective API

Identify "toxic" comments with this free Google Jigsaw API to mitigate online toxicity and ensure healthy dialogue.