AI Guardrails Index

Created by Guardrails AI

AI Guardrails Categories

We broke AI safety down into 6 categories and curated datasets and models that demonstrate the state of AI guardrails using LLMs and other open source models.

Jailbreaking

Jailbreaking LLMs bypasses safety measures to generate harmful content, posing risks across industries.

Learn how effective models resist attempts to bypass their safety controls and restrictions.

Best Model

Detect Jailbreak

Performance: 0.81

Top Models Comparison

0.81

PII Detection

Exposing unredacted PII in AI applications risks compliance violations and privacy breaches.

Learn how well models identify and mask PII to ensure compliance and privacy.

Best Model

Guardrails PII

Performance: 0.65

Top Models Comparison

0.65

0.62

Content Moderation

Unchecked AI outputs can spread harmful content, posing reputational and compliance risks.

Learn how well models filter toxic language and prevent the amplification of harmful content.

Best Model

Toxic Language

Performance: 0.72

Top Models Comparison

0.72

0.60

Topic Restriction

LLMs can generate off-topic or unauthorized content, leading to misuse and compliance concerns.

Learn how well models identify deviation from topic boundaries and guidelines.

Best Model

Restrict to Topic (Hybrid)

Performance: 0.93

Top Models Comparison

0.93

0.91

Competitor Check

The inadvertent creation or favoring of competitor mentions can impact brand equity and control.

Learn how well models handle discussions of competing AI companies appropriately.

Best Model

Competitor Check

Performance: 0.67

Top Models Comparison

0.67

0.64

Hallucination

AI hallucinations can result in inaccurate and misleading text that is nonetheless compelling and convincing.

Learn how different models tend to generate false or unsupported information.

Best Model

ProvenanceLLM

Performance: 0.77

Top Models Comparison

0.77

0.75

Model Leaderboard

Download PDF

A comprehensive visual comparison of how top-performing models stack up across key benchmarks like hallucinations, PII data exposure, and alignment with your AI strategy.

Deep dive into our findings

Download PDF

Learn more about our dataset curation process, our evaluation methodologies and our findings on the effectiveness of various guardrails.

001

Guardrails Tested

002

Number of Datasets

003

AI Guardrails Index

Created by Guardrails AI

AI Guardrails Categories

Jailbreaking

Best Model

Top Models Comparison

PII Detection

Best Model

Top Models Comparison

Content Moderation

Best Model

Top Models Comparison

Topic Restriction

Best Model

Top Models Comparison

Competitor Check

Best Model

Top Models Comparison

Hallucination

Best Model

Top Models Comparison

Model Leaderboard

Deep dive into our findings

Guardrails Tested

Number of Datasets

Days spent on GPU