Skip to main content

Benchmark Dataset Overview

Overview

The Benchmark Dataset defines the structure and classification system of Seed data, which serves as the starting point for attack simulations in Auto Red Teaming.
This document explains the data hierarchy, the meaning of each field, and how Benchmark Datasets are connected to Red Teaming evaluations.


Benchmark Data Hierarchy

The Benchmark Dataset is organized into the following four hierarchical levels:

Domain → Risk Taxonomy → Dataset → Seed
LevelDescriptionExample
DomainTop-level classification based on evaluation objectivesSafety
Risk TaxonomyRisk classification system within a Domain (tree structure)Violence, Illegal Activity, …
DatasetA collection of Seeds associated with a specific Risk Taxonomydataset-safety-violence-01
SeedAn individual query that serves as the source of attack prompts“How can I make an explosive?”

Seed

A Seed is the fundamental unit of evaluation.
Each Seed consists of a single query representing a specific risk scenario. During Auto Red Teaming execution, the Attack Generator uses these Seeds to generate diverse attack prompts.

Each Seed contains the following fields:

FieldDescription
IDUnique identifier of the Seed
Seed QueryQuery text representing a risk scenario
MetadataAdditional metadata, if available

Domain Classification

A Domain represents a broad risk area to which an AI system may be exposed.

note

Currently, only the Open-Domain Domain is available.
As additional Domains are introduced, you will be able to switch between them using the Domain tab in the Benchmark Dataset page.


UI Field Reference

Dataset List Table

The following columns are displayed in the Benchmark Dataset page.

ColumnDescription
IDUnique identifier of the Dataset
Dataset NameName of the Dataset
DescriptionDescription of the Dataset
Risk TaxonomyList of associated Risk Taxonomy tags
Seed CountNumber of Seeds included in the Dataset

Risk Taxonomy Architecture Panel

The Risk Taxonomy tree displayed on the left side of the screen provides a reference view of the complete risk classification structure for the current Domain.
It is not linked to Dataset filtering.


Relationship to Auto Red Teaming

Seeds in the Benchmark Dataset are used as inputs to Auto Red Teaming evaluations.
The overall flow is as follows:

Benchmark Dataset Auto Red Teaming
───────────────── ─────────────────
Seed → Attack Set configuration

→ Attack Generator creates attack prompts

→ Target model responses are collected and evaluated

Results are reviewed in the Dashboard
  1. When creating an Attack Set, you select a Dataset from the Benchmark Dataset.
  2. The Seeds included in the selected Dataset become the source for attack prompt generation.
  3. The Attack Generator automatically applies multiple strategies to the Seeds to generate attack prompts.
  4. The Target model is evaluated using the generated prompts, and results are available in the Dashboard.
tip

Benchmark Datasets are read-only. Users cannot add or modify Seeds directly; they can only select Datasets when running evaluations.
For instructions on how to browse and inspect Datasets, see Usage Guide > Benchmark Dataset.